
Table of Contents:
Business Architecture
Application Architecture
Information Architecture
System Architecture
Resources
Intro: This is to share discoveries made
in computer science. Hopefully you will find some information here of value in
designing better systems. If you react strongly to anything expressed here, please let me
know so I may learn from your feedback. These are only opinions, and like the nose on your
face everyone has one. However good one's opinion is, it is never good if no one wants to
follow, or is unable to do so. So I eat my own words with a grain of salt. Consistent
application of team skills & drive leads to success. Much of the info here is
out-of-date. Caveat Emptor.
Each subject area has its own subschema, and each area is locked to another area in a complex dance:
Dr. Benjamin Yen at The Hong Kong University of Science and Technology offers a course on "Information Technology in Supply Chain Management". The 6th lecture is on "The Importance of Time". The 8th lecture is on "Partner Relationships & due diligence".

The top layer (the metamodel) sets up the container for the next two layers (customer rules & data):
Knowledge Management is the responsible securing of a company's most valuable asset: Knowledge. Even in computer form it can come in many formats that are hard to tie together:
Knowledge has a certain life-cycle within an enterprise:

Note: Information becomes more distilled, abstract and strategic as it moves up the corporate pyramid, and more tactical and concrete as it moves down. Information is more descriptive on the left and more prescriptive on the right. Companies often lose the historic causal relationships among this data (traceability), and the accuracy of many assumptions goes unchallenged.
While at AT&T and the Ideon Group I built various incarnations of the above.
If you use un-integrated best-of-breed tools for portions of this you cannot attain the full synergy of having all this data tied together in an integrated repository.
If you are a CEO or CIO please consider setting me up as your Chief Knowledge Officer (CKO).
If you are a venture capitalist and willing to back me I am confident we can start a new company that will revolutionize corporate management processes and systems based on the above model. Detailed schemas and models are in my possession.
There is a paradigm shift that involves our way of thinking about Application Architecture. Here is the principle:
"Move that which varies to data; and that which is fixed to compiled optimized code."
In the case of a multi-enterprise digital marketplace 'that which varies' sometimes includes the composition of our data structures and also the business rules for long-running transaction commerce event processing (LRT CEP). Thus, both the metadata and the business rules need to be stored as data.
To restate more technically: If you permit metadata to vary you must define and constrain it to a metaschema (meta-metadata). To the extent possible, the designers need to step back and look at things from a "meta" level. [Find the DM "invariants" and the minimum customer-controlled "degrees of freedom" and the equations (i.e. structures+code) that make it run.]
This is not rocket science, just common sense. Give the customer & operations the controls they desire, while you control the scope/safety of their operations. Treat variable requirements to all the resources which Information Architecture provides, while targeting fixed requirements to all the tactics that System Architecture provides.
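As a minimal sketch of the principle, assuming a hypothetical rule table (the field names, operators and messages below are illustrative, not taken from any real system): the business rules live in data that an administrator could edit, while the small fixed set of operators stays in code.

```python
import operator

# Hypothetical rule rows: in a real system these would be stored in a database
# and maintained by an administrator, not hard-coded by a programmer.
RULES = [
    {"field": "quantity", "op": "ge", "value": 1,
     "message": "quantity must be at least 1"},
    {"field": "currency", "op": "in", "value": {"USD", "EUR"},
     "message": "unsupported currency"},
]

# The fixed part: a small, closed set of comparison operators kept in code.
OPERATORS = {
    "ge": operator.ge,
    "le": operator.le,
    "eq": operator.eq,
    "in": lambda actual, allowed: actual in allowed,
}


def validate(record, rules):
    """Return the message of every rule that the record violates."""
    errors = []
    for rule in rules:
        check = OPERATORS[rule["op"]]
        field = rule["field"]
        if field not in record or not check(record[field], rule["value"]):
            errors.append(rule["message"])
    return errors
```

Adding or changing a rule means editing a row, not recompiling; the operator table is the "scope/safety" limit on what a rule is allowed to do.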
A Data Dictionary has value to the customer, designers, engineers and maintainers. It is a vital piece of life cycle documentation. A repository expands upon the Data Dictionary to include all process & system artifacts and their relationships to each other.
For the reason why we need a repository please see me for a copy of the article "The Repository: A Modern Vision," by Philip A. Bernstein in the magazine "Database Programming and Design", issue of December, 1996, pp. 28-35. (Wish I had a URL for this.)
One standard of repository technology, the Open Information Model (OIM), is a set of metadata specifications to facilitate sharing and reuse in the application development and data warehousing domains. OIM is described in UML (Unified Modeling Language) and is organized in easy-to-use and easy-to-extend subject areas. The data model is based on industry standards such as UML, XML, and SQL. It has been reviewed by over 300 companies that are in the IT utility industry.
The Open Information Model is grouped into subject areas:
Info from former caretakers of OIM:
General info on Repositories:
I have had experience with many UML modeling tools while leading a 9-month in-depth evaluation. Material withheld.
StructureBuilder and TogetherSoft are unique in the way they keep model and code in sync simultaneously; in fact the "code" *is* the repository (except for geometry & UC), and it is under SCC for versioning and branching. They are also unique in that they generate not just the static Class Diagram but also the dynamic SEQUENCE diagram by code inspection.
My recommendation is:
Configuration Management Today Yellow Pages
Many methodologies in practice today use deductive reasoning to arrive at a solution that precisely matches the stated requirements, item-for-item. All too often, these solutions are fragile as requirements change (item-for-item). The problem comes from methodologies that define the metadata requirements and then use that to bind the metadata, business rules & data-flows at compile-time. In our modern world, however, metadata, business rules & data-flows are as flexible as data and customer whims. So the old methodologies can't hack it. Personally, I recommend methodologies that use induction and generalization to arrive at highly adaptable solutions. Catalysis combined with Design Patterns is a good alternative.
"You know you have achieved perfection in design,
not when you have nothing more to add,
but when you have nothing more to take away."
--Antoine de Saint-Exupéry
"Solve a concrete problem by solving a more general problem. The general problem has paradoxically a simpler solution."
-- George Polya, How to Solve It
"Everything should be made as simple as possible, but not simpler."
--Einstein
"Any intelligent fool can make things bigger and more complex...It takes a touch of genius--and a lot of courage--to move in the opposite direction."
--Einstein
This is also known as the "Law of Parsimony" or "Occam's razor": "Pluralitas non est ponenda sine necessitate"
"Entities should not be multiplied unnecessarily."
"If you have two theories which both explain the observed facts then you should use the simplest until more evidence comes along"
"Simplicity is the ultimate sophistication."
--Leonardo da Vinci
IMHO, many methodologies in practice today use deductive reasoning to arrive at a solution that precisely matches the stated requirements, item-for-item. All too often, these solutions are fragile as requirements change. Personally, I recommend methodologies that use induction and generalization to arrive at highly adaptable solutions.
"The Laws of Architecture From a Physicist's Perspective" by Nikos A. Salingaros. Enjoyable. Poses the concept that macro-structures are made possible due to well designed imbalances in the micro-structures.
How can we as experienced professionals transmit our knowledge and experience to others, particularly as detailed solutions to engineers, while keeping business leaders informed on the beneficial value of the recommendation? It helps when delivering such wisdom if we document how we arrived at our solution via the vehicle of business drivers. The actual steps one takes here consist of the following:
One communication device that has sprung up in the programming community for sharing precise design concepts is "patterns". The term "pattern" is from architect Christopher Alexander, who advanced them to design buildings & communities. A Pattern is a specific document format that contains sections that clearly state the business driver for the architectural solution.
Specifically:
For example, here is a simple pattern that all agree upon: Layered Architecture
One of the nice things about patterns is there already exists a body of well-written, proven material that we can point at and improve upon and adopt or add to.
Here is a short list of Architectural Patterns:
Here are the original "Gang of Four" (GoF) patterns restated as Non-Software Examples. Many people don't realize that the GoF patterns are only just a beginning. Hundreds more have been added, building on earlier work.
Here is a handy pattern called Facades as Distributed Components.
Here is a Pattern Language ("a structured collection of patterns that build on each other to transform needs and constraints into an architecture") that can serve as a starting point for understanding EJB & COM+: Distributed Component Design (PDF).
Here are some Pattern Libraries to explore. Find solutions to your problems here.
So, what are your Design Patterns? ...
Most robust software environments are based on "strong typing". A simple data change, such as a new attribute, impacts existing database, event, Java and template code and requires the intervention of a programmer to correct. This is because all metadata is bound at compile time. Unfortunately, the new multi-enterprise systems must be able to dynamically adapt to changing customer requirements on data content and structure (such as new XML/DTDs). Such fragility must be eliminated for systems to thrive. The Active Object Model is one solution.
AOM is based on accepted design patterns:
As you may have guessed, I have my own design that is a composite of the above patterns. It allows a non-programmer administrator to dynamically adjust the metadata, transforms, workflows and roles. It is the ideal integrated supply-chain solution.
In order to accommodate multi-enterprise knowledge exchange via XML I had to design and build a system that would not break when new unanticipated metadata and relationships (inheritance is-a, has-a, needs-a, etc.) are imported to the database. Of course this flies in the face of 2 decades of strongly-typed languages and DBMSs with static column names set at DDL-time. It is tough enough to make a non-fragile system that can handle dynamic metadata--try and get it to perform as well as a system with a schema known in advance. In order to do this I pushed Oracle8i to the limit using the new Object Oriented features and other features such as nested Table-Types, Index Organized Tables (IOT), Synthetic keys, controlled front-end compression, materialized views, etc. The result is a new "type" that can transform any static table into one also having dynamic RDF/TRIPLE-like properties.
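To make the idea of a static table with triple-like dynamic properties concrete, here is a generic sketch. It is not the Oracle8i design described above (which is not shown on this page); it uses SQLite from the Python standard library, and the table names `part` and `part_property` and the sample properties are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Static part: columns fixed at DDL time.
    CREATE TABLE part (
        part_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    -- Dynamic part: (subject, predicate, value) triples added at run time.
    CREATE TABLE part_property (
        part_id   INTEGER NOT NULL REFERENCES part(part_id),
        predicate TEXT NOT NULL,
        value     TEXT NOT NULL,
        PRIMARY KEY (part_id, predicate)
    );
""")


def set_property(part_id, predicate, value):
    """Attach or overwrite a dynamic property; no schema change is needed."""
    conn.execute(
        "INSERT OR REPLACE INTO part_property VALUES (?, ?, ?)",
        (part_id, predicate, str(value)),
    )


def load_part(part_id):
    """Return the static columns merged with whatever properties exist."""
    row = conn.execute(
        "SELECT name FROM part WHERE part_id = ?", (part_id,)
    ).fetchone()
    if row is None:
        return None
    record = {"part_id": part_id, "name": row[0]}
    for predicate, value in conn.execute(
        "SELECT predicate, value FROM part_property WHERE part_id = ?",
        (part_id,),
    ):
        record[predicate] = value  # dynamic values are stored as text
    return record


conn.execute("INSERT INTO part VALUES (1, 'bracket')")
set_property(1, "finish", "anodized")   # attribute unknown at DDL time
set_property(1, "length_in", 4)
```

A naive layout like this is easy to keep non-fragile; making it perform like a schema known in advance is the hard part the paragraph above refers to, and this sketch does not attempt it.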
Distinguish clearly between second, third and fourth generation methodologies. Swing your organization as rapidly as possible to fourth-generation techniques including fourth generation languages, application generators, prototyping, information engineering, flexible data bases, nonprocedural data-base languages, fourth-generation networks, low-maintenance software, support of user-driven computing, support of single-user computers and powerful decision-support systems.
Establish Information Centers to encourage, support and assure quality in user computing. Establish an Advanced Development Group to become skilled with software and techniques for faster application building. Wind down the conventional programming team until it does little more than maintenance. Rebuild systems which are expensive to maintain using software which gives low-cost maintenance. Encourage systems analysts to become expert with software with which they can build a complete application, not pass specifications to programmers. Use techniques which give one-person projects wherever possible. Ensure that thorough information engineering is done with computerized tools, and that sound data models are the basis for all future development. Build the capability to provide executive decision-support systems as fast as possible.
Hire the best quality developers, pay them well for excellence and build a high team morale. Motivate and reward them highly for high-speed high-quality completion of projects.
Ensure that strategic planning is done to ensure interoperability between systems, and ensure that users or Information Centers can extract and manipulate required information quickly using networks and data bases.
Provide services so that users can obtain the information they need (text and data) by themselves as much as possible, and can manipulate data without help from traditional programmers.
Understand that the traditional development life cycle needs fundamentally changing for many end-user requirements. The old standards and procedures are crippling for these requirements. You need a manual of DP development encompassing the new techniques.
Did you know you can develop many real-world business application systems 10 to 20 times faster than with conventional programming languages? RAD tools are under-utilized. Most people do not even know they exist. RAD combined with Joint Application Design methodology (JAD/RAD) is the fastest way to get the jump on your competition.
In the past I have worked with a number of 4GL RAD tools including PROMPT, BL/1, JdesignerPro and ViewPoint. I've also worked with a number of 3.5GLs, like dBase, R:Base, DataEase, Paradox, Access, etc. Through this I got real good at JAD/RAD and mentoring teams of executives with RAD tools.
Contrary to popular thinking, Magic's EC solutions prove that RAD code is reusable.
Warning: RAD tools have limitations on certain functions or features. If you hit one of these limitations you either live with it, or start afresh with a 3GL.
Old definition:
“Information Architecture is an integrated framework for defining, evolving or
maintaining existing information technology and acquiring new information
technology to achieve the Department’s strategic goals.”
-- Department of Energy
Pepsi Generation Definition:
“The individual who organizes the patterns inherent in data, making the complex clear”
“A person who creates the structure or map of information which allows others to find their personal paths to knowledge”
“The emerging 21st century professional occupation addressing the needs of the age focused upon clarity, human understanding, and the science of the organization of information”
--Richard Saul Wurman
The Information Architect is responsible for establishing information architecture strategy and content plans by creating information structures through content hierarchies, site maps, and navigational models that are aligned with the defined user experience and business requirements. The purpose is to create effective and intuitive user interfaces for the brand's web sites.
My definition:
Application Tuning requires a complete 360-degree picture of how the layers of the application stack work in concert with each other. This includes the following:
For example, a favorite trick of mine is to arrange extremely large parent-child tables with the parent table clustered on the public primary key and the child table clustered on the same migrated public foreign key. Thus when joining the tables for large output the query optimizer will perform a "sorted merge join" with pipelining: the first record comes quick, no hashing is needed to consume memory nor sorting to delay output. If the two tables are on different spindles the data will come as fast as the read/write head can deliver it in a sequential read. Even if this is impractical for your situation you may still get away with it using a cluster-indexed materialized view.
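The reason the clustered layout pays off can be shown with a small sketch of a merge join over two inputs that are already sorted on the same key. This is an illustration of the general technique, not of any particular optimizer's implementation; the tuple layout `(key, data)` is an assumption made for the example.

```python
def merge_join(parents, children):
    """Join two streams that are both sorted ascending on the same key.

    parents:  iterable of (key, parent_data), one row per key
    children: iterable of (key, child_data), possibly many rows per key

    Rows are yielded as soon as they match (pipelining): nothing is
    hashed, nothing is sorted, and each input is read once, in order.
    """
    children = iter(children)
    child = next(children, None)
    for parent_key, parent_data in parents:
        # Skip child rows whose parent is missing from the parent stream.
        while child is not None and child[0] < parent_key:
            child = next(children, None)
        # Emit every child row that belongs to this parent.
        while child is not None and child[0] == parent_key:
            yield parent_data, child[1]
            child = next(children, None)
```

Because both inputs are generators in spirit, the first joined row is available after reading one row from each side, which is the "first record comes quick" behavior described above.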
ERwin is the best of the "relational" DB modeling tools. It is strong on round-trip engineering with the ability to selectively pinpoint and combine design changes between databases, models and generation scripts. Its weak point is that it has not kept up with object-relational technology, including the new Oracle8i features. For that, switch to the new Oracle Designer, which models in UML.
The trick in designing an architecture for Data Warehouses is the management of the sheer volumes involved in the load process, and the creation of high-performance table/index structures for ad-hoc queries. You also have to think through the process of adding another dimension retroactively. Metadata is needed to guide you to the best performing structures, and also to fill in incomplete cubes. As most sites cannot afford a complete Data Warehouse from day one, you also have to plan its evolution around maximum utility and minimum throw-away at each step.

The whole topic of ontology confuses many people. Even the experts love to blow smoke on the topic. Well, let's get clear on what ontology really is to the IT community.
In philosophy an Ontology is defined as “a Systematic Account of Existence.” Huh??? you may say, and you are right, this has little meaning to computer geeks.

"Sometimes my whole world seems out of context."
(Cartoon commissioned by TimeØ for marketing collateral related to my work.)
In general (still non-IT) an Ontology includes the following items:
Thus an Ontology is the specification of grammar and word-combinations that make sense in your field of discussion--can be understood, exchanged and acted upon. Another important word that should be considered, but usually missed, is Epistemology, namely how mental knowledge and definitions match concrete reality and give you conceptual mastery over reality; methodologies for acquiring knowledge.
For a pure data-based system an ontology is simply an agreement on the structure of data and its meaning that can be shared with multiple parties.
It includes:
For example, once there is an agreement that “IN” represents an ‘inch’ within a particular field then the ontology ensures consistency to the rule and (unless otherwise defined) prohibits “In”, “INC.”, or any other code from possessing that meaning in that field. An IT ontology also includes an understanding on how to navigate for deeper knowledge, like from the PO to the line items to impact on inventory. It should also include the rules and language governing queries against the data such as SQL (Structured Query Language). Clear?
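A minimal sketch of the "IN" example, assuming a hypothetical field name `unit_of_measure`; the codes other than "IN" are invented for illustration. The agreed vocabulary is the only source of meaning for the field, so look-alike codes are rejected rather than guessed at.

```python
# Agreed vocabulary per field. Only "IN" comes from the text above;
# the field name and the other codes are illustrative assumptions.
ONTOLOGY = {
    "unit_of_measure": {"IN": "inch", "FT": "foot", "EA": "each"},
}


def meaning(field, code):
    """Return the agreed meaning of a code, or refuse if none was agreed."""
    vocabulary = ONTOLOGY[field]
    if code not in vocabulary:
        raise ValueError(f"{code!r} is not an agreed code for {field}")
    return vocabulary[code]
```

Here `meaning("unit_of_measure", "IN")` returns `"inch"`, while `"In"` or `"INC."` raises an error instead of silently acquiring that meaning.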
An ontology is a vocabulary of terms and relations rich enough to enable us to express knowledge and intention without semantic ambiguity. Ontologists strive for complete foundational vocabularies from which meaningful higher level concepts can be constructed. A formalized ontology is expressed using a formal, usually mathematical/logical, language. A reusable ontology is machine readable and expressed independently of any particular technology, e.g. relational, objects, rules ...
Ontologies exist at different levels of applicability:
Every company has its own ontology. In fact, this is the number one impediment to tight B2B integration, reintermediation, and, what HBR calls, "unbundling the corporation".
I specialize in solutions for ontological issues. This includes solutions for (1) multi-enterprise integration, and (2) for dynamic metadata (see the XML section for why this is now vital).
Ontology resources:
On the philosophical side (these two books are in opposition to each other):
The real value of XML is being missed by many.
There are several features of XML that need added emphasis:
Let me describe these two features in greater detail and connect them to benefits. If you had a rigidly fixed data structure to communicate internally between two programs and had to choose between a record-like structure or XML (under a single DTD), it would be more efficient to use a record-like structure since it typically would be more compact and directly map-able to a 3GL structure. No metadata needs to be passed because the metadata structure is fixed and is encoded into the application data structures. However, if the data structure is not fixed but included alternative structures and contextual variations (e.g. CC-PO & OAB-PO) then the record-like structure starts to fall apart with many variations and code complexities which distort the simple problem to be solved. By passing metadata along with the data you permit the program code to find its own path to the data, permitting contextual variations. Code is simpler, less fragile, more robust, more reusable. This benefits us internally because it reduces complexity in context-laden situations. It also benefits us externally in managing multiple customer DM formats and evolving requirements.
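A small sketch of code "finding its own path to the data". The two purchase-order documents and every element name in them are hypothetical; they simply stand in for two contextual variations of the same message. Because the tags travel with the data, one routine handles both without a separate record layout for each.

```python
import xml.etree.ElementTree as ET

# Two hypothetical purchase-order variants that differ in their payment
# element and in an optional trailing element.
PO_VARIANT_A = (
    "<po><number>1001</number>"
    "<payment><creditCard>4111</creditCard></payment></po>"
)
PO_VARIANT_B = (
    "<po><number>1002</number>"
    "<payment><account>ACME-7</account></payment>"
    "<terms>NET30</terms></po>"
)


def payment_reference(po_xml):
    """Return (po number, payment kind, payment value) for either variant."""
    root = ET.fromstring(po_xml)
    payment = root.find("payment")
    method = payment[0]  # whichever payment element this variant carries
    return root.findtext("number"), method.tag, method.text
```

The extra `<terms>` element in the second variant is simply ignored; a fixed record layout would have needed a new version to carry it.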
It is often stated that XML is a hierarchical data structure, but this overlooks the fact that there are more complex structures than hierarchical which XML also supports. For example, an XML purchase order may hypothetically support multiple credit cards. The data can be arranged with header info, followed by line items, and each line item may hold the credit card data. If a succeeding line item used the same credit card as previously used in a prior line item it does not need to replicate the CC data; instead it can 'point' to the prior data internally, referencing it by an internal arbitrary label or by navigational instructions. Similarly, any data structure within an XML document can reference any other data structure within another XML document, even across machine boundaries. This mechanism permits cyclic and self-referential structures (such as genealogy data with divorce, etc., or, in the context of a digital marketplace, a bill-of-materials parts explosion). XML permits directed graph relationships, making possible artificial intelligence semantic networks used in generalized knowledgebases.
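The credit-card example can be sketched as follows. The document, the `id`/`ref` attribute names and the `holder` attribute are assumptions made for illustration (in DTD-based XML this role is typically played by ID and IDREF attributes); the resolution here is done by hand, since ElementTree does not validate or resolve references itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order: the second line item points back at the
# credit card declared on the first instead of repeating it.
PO = """
<po>
  <line sku="A1"><creditCard id="cc1" holder="J. Smith"/></line>
  <line sku="B2"><creditCard ref="cc1"/></line>
</po>
""".strip()


def resolve_cards(po_xml):
    """Return (sku, card holder) per line item, following internal references."""
    root = ET.fromstring(po_xml)
    # Index every labelled credit card by its internal label.
    by_id = {
        card.get("id"): card
        for card in root.iter("creditCard")
        if card.get("id") is not None
    }
    resolved = []
    for line in root.iter("line"):
        card = line.find("creditCard")
        target = by_id[card.get("ref")] if card.get("ref") else card
        resolved.append((line.get("sku"), target.get("holder")))
    return resolved
```

Since a reference may point anywhere in the document, the same mechanism yields graphs rather than trees, which is what makes a parts explosion or a semantic network expressible.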
One of the major flaws of relational databases is that while they *can* store complex relational structures they *cannot* fetch and put complex structures in a single I/O but must move all data in and out via multiple table navigations (or large highly-redundant square shaped views). As one expert noted: "I do not disassemble my car each night and reassemble it in the morning so why should I do that with my data?" The ability to locally wrapper a RDBMS and deliver data in meaningful XML structures makes robust scalable Business Objects a reality. Databases like Poet, ODI eXcelon, and the coming enhancements from Oracle (IFS) and Microsoft will make complex database navigational code a thing of the past.
So here is the PROBLEM:
Most programming languages, RDBMSs and IDLs are "strongly typed" and "statically bound", i.e. data structures are defined at compile time and variation over alternative data structures is not permitted. This does not conform to the requirements of a B2B Digital Marketplace! The requirements of the Digital Marketplace case fight against strong typing. Information liquidity requires that more than one type of P.O. (for example) be able to flow through the event. With multiple buyers, multiple sellers and multiple rules of engagement we require a way to accommodate greater flexibility and heterogeneity in our data types, and thus new mechanisms (data & design patterns) to handle this requirement.
This stands as an obstacle to Intellectual Property reusability.
How does XML help?
Rather than having "Clean Interfaces", you can have loosely typed "Dirty Interfaces" that are 'Content-Open but Purpose-Specific'. Rather than associating *one* DTD with each event type the event needs to support a *family* of related DTD types.
Just passing the data around as Attribute-Value pairs or <tag>value</tag> is NOT enough: Components must know the <tag> names. Different-but-similar data structures must be *constrained* to use the same <tag> names and similar structures.
The W3C recognizes this need and is stepping up to the challenge. They are laying the groundwork for Taxonomies that are linked to Vocabularies which can provide the needed consistency. Their work is based on solid AI, IA and ontological concepts. See W3C Metadata Activity Statement for example.
If mapped to a relational database (for speed), the Taxon, Attribute, and Infon structures and services proposed in the Information Architecture whitepaper meld neatly into this goal.
I propose "using XML for action oriented content", namely, Long-Running Transaction content like RFI, PO, logistics, capacity scheduling, etc. that may vary in structure from customer to customer--even on the same DM. LRT data is like a snowball rolling downhill--it collects new data as it rolls along and you cannot predict the course it will take.
Also, use XML as the basis for a robust reusability architecture such that data content changes do not break every piece of code, event, UI and DB in their path.
Do you want to stay on top of breaking Oracle developments (like XML and 8i partitioning) and begin working with their software as it becomes available? For this OTN offers 3 Technology Tracks. They are a tremendous value at $200 each--full-use, single-user licenses which purchased individually would cost more than $5,000 per track. With this you get an initial shipment of the product CDs in your track, plus automatic shipment of updates for 12 months as major releases become available (like an MSDN subscription). Developer licensed Technology Tracks give you the right to develop, but not deploy, applications using the products in the track on a single CPU.
The Internet Servers Technology Track offers Oracle's leading data server and application server for cross-platform application development. Included:
The Internet Tools Technology Track offers Oracle's modeling, 4GL and Java development tools for Internet and enterprise application development. Included:
The Business Intelligence Tools Technology Track offers Oracle's integrated ad hoc query, reporting, data analysis and data mart tools for decision analysis and data warehouse applications. Included:
Oracle8 "REF"erences are not recommended. By way of background see Robert J. Muller's "Database Design for Smarties; using UML for data modeling" page 369.
Very large transactional data volume (VLTDV): Here is a whitepaper on Oracle's new ability to handle large-scale data loading. Quotes (from pages 14-15):
"With appropriate mapping [of partitions to companies]... partitions can be backed up and recovered independently from other partitions of the same table or index... enables operations like data uploads... to proceed without affecting online application functionality."
"Transportable tablespaces are a new feature of Oracle8i that allows a user to move or copy a subset of one database to another... orders of magnitude faster than either export/import or unload/load because it involves only copying data files and integrating metadata... can be used to distribute an updated parts catalog [!] ..."
If you have to push the upper end of the availability envelope (+99.9% online) then get the book "Oracle 24X7 Tips & Techniques" (1,005 pages). Here is a synopsis by the publisher.
Why is distributed transaction rollback so hard? In the old days Oracle took too much RAM per user-connection. (As you know, commit/rollback happens inside a "connection".) They fixed that by moving much of it to RAM in the client's Oracle API. The problem is (and it is not easy to solve) a connection opened in one client box cannot be accessed from a client in another box. The new TP monitors overcome this restriction by pooling connections in a central or distributed resource. They also serialize the incoming async SQL. This adds yet another layer and performance constraint on the system, but is required to employ n-tier asynchronous processing.
However, does the Oracle8i global transaction save the day? Read on...
A.k.a "Distributed execution environment" and "distributed process transactions" (*not* the same thing as "distributed database transactions"). The business goal here is to be able to take a single UI-I/O transaction and distribute its work among multiple concurrent components. If an error occurs in any component the application should be able to roll back the work of all the components.
If you look at some parts of Oracle's website you would get the impression they can handle it. Example. However, Oracle8's use of the above terms contradicts the following quotes: "The optional XA feature asynchronous XA calls is not supported." and "In an Oracle system, once a thread has been started and establishes a connection, only that thread can use that connection. No other thread can make a call on that connection."
The Oracle representatives said they think what we need here is not directly supported. Thus be warned.
UPDATE: With Oracle's new distributed cache technology the above problem may now go away.
ACCIDENTAL FAX EXPOSES ORACLE: A confidential Oracle document accidentally faxed to The Wall Street Journal gave a sneak peek at information such as the database company's 20 largest customers for the fiscal period ended Aug. 31. While not product specific, the document outlined the revenue received, payment terms and discounts for each customer. The discounts ranged from 42% up to 94%, the WSJ reported. Database specialist Richard Finklestein indicated the deep discounts might have been to keep accounts from going to rival Microsoft. Story.
New commercial off-the-shelf (COTS) tools exist that generate the Java code for Domain Objects over relational databases, eliminating code complexity with a simple-to-maintain, yet feature rich, debugged solution. Among the features are ACID logical transaction commit/rollback, distributed caching, navigation, synchronized buffering, and mapping relational tables to Java objects in a simple declarative fashion (e.g. TopLink). If used, it could eliminate a whole swath of hard-to-maintain code, developed via heroic efforts.
Normalization vs. GUIDs is a hot religious topic. DBAs need to understand the issues of data stability as seen from the OO perspective. Numerous experts have written on why you need to make the move from normalized databases with "intelligent keys" (in 5th normal form) to tables based on object identifiers (GUIDs). See me for a copy.
DM Review magazine published a two-part article on this topic titled "Primary Key Reengineering Projects: The Problem" and "Primary Key Reengineering Projects: The Solution".
[Once, in my past life, someone (not me) made a typo on an intelligent primary key that was not caught in time. The correction cost $$$,$$$.00 because it cascaded to so many tables and no one could tell if the related records meant the 'intended' value or the 'expressed' value. (E.g. PKey:"80286", Desc:"Pentium"). They had to phone, fax, mail, etc. everyone who placed an order.]
Similarly, if you need a product classification taxonomy then go to the UN/SPSC open coding system or Thomas Register for classifying goods and services. See also SC4 and STEP.
Frequently used data found in one table may be copied into a second table for increased performance. For example, product catalog data may be copied into the order line-item records in order to eliminate the need to go back to the product table in the future.
Benefit:
Risk:
Use only where all of the following conditions are met:
It is particularly useful to copy information into the "driving" record of a query. (This requires an understanding of the DBMS query optimizer.) The type of information to put in the driving record include:
Implementation Impact:
To implement this enhancement requires modifications to schema, triggers, events, Java code and SQL code, and replication of data. Complete regression testing is needed, along with pre-test modification of existing test data to conform to the new schema.
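The product-catalog example can be sketched with SQLite from the Python standard library. The table and column names and the sample data are hypothetical, and the copy is made by an insert routine here rather than by the triggers mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        description TEXT,
        unit_price  REAL
    );
    CREATE TABLE order_line (
        order_id    INTEGER,
        line_no     INTEGER,
        product_id  INTEGER,
        quantity    INTEGER,
        description TEXT,   -- copied from product at insert time
        unit_price  REAL,   -- copied from product at insert time
        PRIMARY KEY (order_id, line_no)
    );
    INSERT INTO product VALUES (7, 'Hex bolt', 0.25);
""")


def add_line(order_id, line_no, product_id, quantity):
    """Insert an order line, copying the catalog data it will need later."""
    conn.execute(
        """
        INSERT INTO order_line
        SELECT ?, ?, product_id, ?, description, unit_price
        FROM product
        WHERE product_id = ?
        """,
        (order_id, line_no, quantity, product_id),
    )


add_line(100, 1, 7, 40)
# A later catalog change does not touch the copy held by the order line.
conn.execute("UPDATE product SET unit_price = 0.30 WHERE product_id = 7")
row = conn.execute(
    "SELECT description, unit_price, quantity FROM order_line WHERE order_id = 100"
).fetchone()
```

Reading the order line now needs no join back to `product`. Note that the copy no longer follows the catalog once it is made, so whether that is acceptable has to be decided per column.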
Sqlsniffer will analyze the functionality of a specified application at a SQL call level. This is better than blindly monitoring database performance.
Do you work with a VLDB requiring varying degrees of response-time?
One suite of products you might want to consider in this space is from e-zdata.net. Their product works via triggers to intercept record-not-found events. It then looks at a secondary storage site (e.g. tape) for the data. One of the advantages of their approach is that existing code does not need to be revised, and the administrator is free to tune the system as desired on a record-level basis. This approach, which internally modifies the RDBMS software, stands in contrast to approaches that use a hardware-only solution.
This does not mean that I trust this product. If anyone has first-hand experience with this product please contact me.
ODBC and JDBC should not be used for large batch data loading. ODBC usually implements a record-at-a-time protocol and is too slow. There are provisions in the ODBC standard for the movement of blocks of records, but this is rarely implemented and must be supported by both client and server. Even using Oracle's native API, performance will be too slow for record-at-a-time updates.
When it comes to actually moving massive amounts of data into an Oracle database it is best to use Oracle's own tools, like SQL*Loader. (There are reasons for this but I'll spare you the details. Every good RDBMS offers a loader that loads faster than its API normally permits.)
An outstanding need is a high-speed tool that validates the data in its serial format (on disk or in-memory via pipe filter) prior to loading into Oracle. Another need is a fast way to determine the 'deltas' when the customer gives you a complete re-dump of their data. See me for a solution to this problem.
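One common way to find the deltas between two complete dumps is to index each dump by record key and compare a digest of each record. The sketch below is not the solution alluded to above; it is a minimal illustration that assumes pipe-delimited text records with the key in the first field, and all function names are hypothetical.

```python
import hashlib


def index_dump(lines, key_field=0, sep="|"):
    """Map each record key to a digest of the whole record."""
    index = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key = line.split(sep)[key_field]
        index[key] = hashlib.sha1(line.encode("utf-8")).hexdigest()
    return index


def deltas(old_lines, new_lines):
    """Return (inserted, updated, deleted) key lists between two full dumps."""
    old, new = index_dump(old_lines), index_dump(new_lines)
    inserted = sorted(k for k in new if k not in old)
    deleted = sorted(k for k in old if k not in new)
    updated = sorted(k for k in new if k in old and new[k] != old[k])
    return inserted, updated, deleted
```

Only the keys and digests of the previous dump need to be retained between runs, not the dump itself. For dumps too large to index in memory, the same comparison can be done as a merge over two key-sorted files.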
The new Oracle8i direct path load API is a set of OCI interfaces that provides an application access to the direct path load engine in the Oracle server. Previously, the only client of the direct path load engine has been Oracle's SQL*Loader utility. Now, it is possible for ISVs to develop applications which can take advantage of the performance benefits provided by the direct load engine, including parallel load. There is no longer a need to write SQL*Loader control and data files in order to access the direct path load engine.
CardoNet has a technology to "push" catalog data on a scheduled basis, and dynamically "pull" data, like inventory availability, in real-enough-time. They offer software for both the suppliers and the market maker. The supplier-side software uses JDBC to hook into the supplier's database. It converts and translates the data to XML and sends only changes (deltas) to the market-maker's central catalog. Its weakness is that it can handle only one DTD and offers no natural language parsing. (Note: a single XML/DTD can read multiple relational input tables and update multiple output tables.)
Their tag line is: "The only company focused exclusively on market makers".
Note: CardoNet keeps a second copy of the database at the customer's site in order to determine the deltas to transmit.
It is possible to integrate our remote databases into our database in real-time while still offering optimum performance.
If they are using Oracle and we are using Oracle then there should be no problem. We can create synonyms of their tables or views in our DB and they will 'appear' local. Oracle can break-down any query so it can be optimally attacked at both sites with minimum computational and communication resources.
If security is a concern with an external enterprise they can redefine their table through a SQL "view" that allows them to control access privileges, and also add a trigger to keep a shadow log of all adjustments. Often it also is possible to map a customer's DB format into the local schema using a "view" that you create that permits reuse of local code.
This gets better if we are using Oracle8i. A new facility, Net8, gives encrypted DB-to-DB data communications over the internet.
Alternatively, if the data is read-only, a snapshot can be made and refreshed on a periodic basis.
The problem of poor quality data for batch loading or remote data integration requires a lot of forethought. Here are some questions to keep in mind:
Suggestion: referential integrity and OID equivalence need to be maintained over the life of the database for later data mining and reporting (e.g. how many orders for Product_ID=325342?). However, the referential integrity constraints placed into the Oracle schema may be removed if it has been proved that the integrity is being maintained algorithmically by the programs. Removing DB RI can speed up the system and make data loading easier. Periodically, RI can be reinstalled and then immediately removed as a later test of integrity.
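A periodic integrity test does not have to reinstall the constraint; an orphan query gives the same answer. The following is a minimal sketch in SQLite rather than Oracle, with hypothetical table names, that counts child rows pointing at a missing parent while no foreign-key constraint is declared.

```python
import sqlite3


def build_demo():
    """Toy tables with no declared foreign key; one order references a missing product."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE product (product_id INTEGER PRIMARY KEY);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product_id INTEGER);
        INSERT INTO product VALUES (325342);
        INSERT INTO orders VALUES (1, 325342), (2, 325342), (3, 999);
    """)
    return db


def orphan_count(db, child, fk, parent, pk):
    """Count child rows whose foreign key has no matching parent row.

    Table and column names are interpolated directly into the SQL, so they
    must come from trusted code, never from user input.
    """
    sql = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    )
    return db.execute(sql).fetchone()[0]
```

A result of zero is the same evidence that successfully re-enabling the constraint would provide, without taking the locks that enabling it may require.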
Problem: As once-tightly integrated systems partition into loosely-coupled systems onto different boxes certain information must remain replicated in each partition, specifically user profile/privilege info. How does one replicate profile info over many related systems, hardware and software, internal and external, and is there an open standard for this which 3rd party component OEMs can use?
One emerging marketplace answer seems to be LDAP/CIM, Common Information Model.
The authority of this shared-profile-access-security standard is the DMTF User & Security Working Group.
Here is a B2B Trading Hub based on this technology:
The meta-schema of CIM is very powerful and generic.
A mention of TimeØ's Information Architecture was made in the Internet World, June 14, 1999 article Detailed Database Can Be Key to E-Commerce Success. (In the interview I detailed Perot's TimeØ distributed asynchronous system with business objects, two-tier rendering, etc. However, the reporter was specifically interested in the topic of product representation. So I briefed him on taxonomies, knowledge representation, inheritable metadata structures, and translation ontologies. This was what he was hoping to hear: it was the topic of his article, he understood it completely, he had researched it in college, and it was similar to what he was hearing from the other leading-edge companies he sought out. You can tell he was fishing for an unusual angle. After all the technical material, he quotes the philosophical connection to epistemologies because the topic interested him and he needed an intriguing way to tie his article together on a human angle.) The article makes a statement on the unique value of TimeØ's future intellectual assets. Quote: "The asset they create--this structured data or data input system on which the rest of their business is built--is often the only thing they own that cannot be duplicated easily."
Basic Layers:
Layers within layers within layers.
| Client | Remote Applications | Browser, Java applet, Legacy system |
| | Communications | HTTP, ftp, telnet, 3270, EDI, VPN, fax |
| | Access | Security, Certification, Firewall, Encryption, Session Closure, Connection Context |
| Presentation | Native Presentation | Layout XML to: HTML, Character, Record layout, EDI (via Templates+Rules), Validation |
| | Generic Presentation | Semantic XML to Layout XML |
| Transaction | Application Context | State Management |
| | Workflow Processing | Commerce Events, State Transition Table, Workflow, Job Initiation, Temporal & Priority Control |
| | Application Logic | Custom application-specific code goes here |
| Data | OO Data Object | Metadata markup, DB Data ↔ Semantic XML |
| | Semantic Navigation | Ontology, Context, KQML, Semantic Undo |
| | Integrity & Validation | Data Dictionary, Referential Integrity |
| | Location Transparency | Heterogeneous navigation, Scale Redirection, Legacy Redirection |
| | Transaction Monitor | Stored Procedures, ACID |
| | Data View | SQL, Dataset |
| | Physical Data Stores | Index/Cluster Architecture, Bloom Filter, Encryption, RAID |
Note: The Legacy Integration Layer is at the top, not the bottom. Why? This permits nomadic operation; data goes through translation layers, state and workflow are preserved (sometimes with blocking, buffering & aggregation), and the legacy device is treated as a generator of events--with appropriate response & recovery to incoherent behavior.
This is an outstanding article on scalability.
A central issue is architectural style or scalability paradigm.
At the large-grain distributed component level, are you:
Each of these paradigms can support a reliable architecture, but that does not mean that it is scalable.
These terms are used with a special meaning beyond their normal connotations:
SERVERLET-STYLE (not exactly servlet, but similar in concept):
Partition along vertical applications and inside them bind horizontal layers together in the same component to the extent possible. App-components can be parallelized, distributed and load-balanced against incoming demand, not horizontal speeds & feeds. Distributed DB transactions are not a major issue in this one environment. The idea here is that every user I/O event goes through one and only one chunk of linked code tailored to that event. A private input queue and SQL*Net are the only permitted communication going on here as components do not talk to each other; except possibly a temporary shared data heap.
DISTRIBUTED CALL-RETURN:
Vertical and horizontal component partitioning is used. When one component calls another it must suspend and keep its memory, thread and network connection, and spin new threads for incoming events. Chains of suspended resources "stack up" till the system resource utilization starts to slow down the application and then they *really* stack-up.
OBJECT-ORIENTED ORB:
This is a variation of distributed call-return. Only now you add a profusion of in-memory "instances" with "identity" that must be located in a namespace, and a tight-fisted approach to accessing data. Thus Individual.FirstName() and Individual.LastName() are two network calls. This is *very* bad for performance. A facade is required to serialize data.
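The cost of fine-grained remote accessors, and the facade that avoids it, can be illustrated with a toy model. In this sketch a counter stands in for the network; the class and function names are hypothetical and nothing here is a real ORB API.

```python
from dataclasses import dataclass


class CountingChannel:
    """Stands in for the network and counts each simulated round trip."""

    def __init__(self):
        self.round_trips = 0

    def call(self, fn, *args):
        self.round_trips += 1
        return fn(*args)


@dataclass(frozen=True)
class IndividualData:
    """Plain value object that crosses the wire in one piece."""
    first_name: str
    last_name: str


class IndividualServer:
    """Remote object exposing fine-grained getters plus a coarse-grained facade."""

    def __init__(self, first, last):
        self._first, self._last = first, last

    def first_name(self):
        return self._first

    def last_name(self):
        return self._last

    def snapshot(self):
        return IndividualData(self._first, self._last)


def full_name_chatty(channel, server):
    """One round trip per property."""
    return f"{channel.call(server.first_name)} {channel.call(server.last_name)}"


def full_name_facade(channel, server):
    """One round trip for the whole record."""
    data = channel.call(server.snapshot)
    return f"{data.first_name} {data.last_name}"
```

The facade halves the round trips for two properties, and the gap widens with every property added. The price is that the caller receives a copy of the data rather than a live object, which is exactly the departure from encapsulation noted above.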
MOBILE AGENT BASED:
This is the same as OO ORB, but it eliminates the multiple network calls between objects by allowing two communicating objects to physically move into the same CPU for the duration of their coupling. ObjectSpace Voyager is the best example of this with MOM, ORB, transactions, persistence, federated namespace & broker, time semantics, etc.
WORKFLOW-BASED:
Components are stateless and data flows unidirectionally. If user demand exceeds system capacity the excess remains shunted to a disk queue and does not consume system resources. Messages are large and pass-by-value (e.g. XML). A separation in the component of workflow code from functional code enables reusability without fragile data dependencies.
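A minimal sketch of the workflow style, under simplifying assumptions: an in-memory deque stands in for the disk queue, dictionaries stand in for XML messages, and the stage names are hypothetical. The routing code (`run_pipeline`) knows nothing about what the stages do, and the stages keep no state between messages.

```python
import copy
from collections import deque


def validate(msg):
    """Functional code: mark the order valid or not."""
    msg["valid"] = msg["qty"] > 0
    return msg


def price(msg):
    """Functional code: compute the order total."""
    msg["total"] = msg["qty"] * msg["unit_price"] if msg["valid"] else 0
    return msg


def run_pipeline(messages, stages, capacity=2):
    """Workflow code: push each message one way through the stages.

    Messages are deep-copied on entry (pass-by-value). At most `capacity`
    messages are in flight at a time; the rest wait in the backlog, which
    stands in for a disk queue.
    """
    backlog = deque(copy.deepcopy(m) for m in messages)
    done = []
    while backlog:
        in_flight = [backlog.popleft() for _ in range(min(capacity, len(backlog)))]
        for msg in in_flight:
            for stage in stages:
                msg = stage(msg)
            done.append(msg)
    return done
```

Because nothing waits on a reply, a waiting message costs only queue space, in contrast to the suspended threads and connections of the call-return styles.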
Again, each of these paradigms can support asynchronous message oriented middleware, but only "workflow" effectively uses it to the full. The others require system resources to be tied up waiting for a call to return. Exception: EJB and CICS can flush a suspended component to the disk, and all paradigms can use virtual memory, but this is sub-optimal.
Scalability is not the only concern. Time-to-market is strongly influenced by IDEs with wizards & COTS components, and system "qualities" are quickly enabled by COTS application server support frameworks.
A Hybrid solution is also possible. IBM recommends using Workflow for boundary & control components and ORB for domain objects.
Additional information on alternative architectures and why they are important:
For extremely large-scale distributed applications asynchronous messaging is essential. However, most designers implement MOM the wrong way. Jim Gray, the world's authority on MOM, has a paper that states Queues are Databases. This has profound implications around the integration of messaging, state management, and database design.
Application workflow queues need to be combined with DB I/O and made transactional in the event of external failure. They are in Oracle Advanced Queue/Workflow, a part of their integration server. Another benefit: business rules do not require additional re-queuing when they are implemented via SQL on the queue.
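The idea of a queue that lives in the database and shares a transaction with the application's own writes can be sketched as below. This uses SQLite rather than Oracle Advanced Queuing, and the table and function names are hypothetical. The dequeue and the result write commit together or not at all.

```python
import sqlite3


def make_db():
    """Queue and result tables live in the same database."""
    # isolation_level=None puts the connection in autocommit mode so the
    # code below can issue BEGIN / COMMIT / ROLLBACK explicitly.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.executescript("""
        CREATE TABLE work_queue (id INTEGER PRIMARY KEY, payload TEXT);
        CREATE TABLE processed (id INTEGER PRIMARY KEY, payload TEXT);
    """)
    return db


def enqueue(db, payload):
    db.execute("INSERT INTO work_queue (payload) VALUES (?)", (payload,))


def process_next(db, handler):
    """Dequeue one message and store its result in a single transaction.

    Returns False if the queue is empty. If the handler raises, the
    transaction is rolled back and the message stays on the queue.
    """
    db.execute("BEGIN IMMEDIATE")
    try:
        row = db.execute(
            "SELECT id, payload FROM work_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            db.execute("COMMIT")
            return False
        db.execute("DELETE FROM work_queue WHERE id = ?", (row[0],))
        db.execute("INSERT INTO processed VALUES (?, ?)", (row[0], handler(row[1])))
        db.execute("COMMIT")
        return True
    except Exception:
        db.execute("ROLLBACK")
        raise
```

Since the queue is just a table, a business rule such as "serve priority customers first" is a change to the `SELECT`, not a separate re-queuing step.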
At Perot Systems I had to establish a framework for nonrepudiation using "shadow tables", and long-running transactions across multiple connections in an n-tier message-oriented-middleware architecture. This got me into triggers implementing the timestamp protocol so as to add optimistic locking features and versioned data not available in 8i at the time. Coupled with a dispatcher that logs incoming requests, the transaction# timestamps provided a way to achieve semantic "undo" for long-running transactions across all tables.
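The general shape of a shadow table driven by a trigger can be sketched as follows. This is not the Perot framework, only a minimal illustration in SQLite with hypothetical table names. It assumes each transaction number updates a given row at most once, so the newest shadow row for that record is its before-image.

```python
import sqlite3


def make_db():
    """Every update to account first copies the old row into account_shadow."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, txn INTEGER);
        CREATE TABLE account_shadow (id INTEGER, balance INTEGER, txn INTEGER);
        CREATE TRIGGER account_keep_history BEFORE UPDATE ON account
        BEGIN
            INSERT INTO account_shadow VALUES (OLD.id, OLD.balance, OLD.txn);
        END;
    """)
    return db


def undo(db, txn):
    """Restore every row last written by transaction `txn` to its prior image.

    Returns the number of rows restored. The restoring update is itself
    logged by the trigger, so the undo leaves its own audit trail.
    """
    rows = db.execute(
        "SELECT a.id, s.balance, s.txn FROM account a "
        "JOIN account_shadow s ON s.id = a.id "
        "WHERE a.txn = ? "
        "AND s.rowid = (SELECT MAX(rowid) FROM account_shadow WHERE id = a.id)",
        (txn,),
    ).fetchall()
    for row_id, balance, prior_txn in rows:
        db.execute(
            "UPDATE account SET balance = ?, txn = ? WHERE id = ?",
            (balance, prior_txn, row_id),
        )
    return len(rows)
```

Because nothing is ever deleted from the shadow table, it also serves as the audit log that nonrepudiation requires.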
Microsoft's .NET Framework is fantastic. Based on my very limited experience, I've seen a four-fold improvement in productivity over J2EE.
OO Guru Bertrand Meyer states The Significance of .NET this way: "At the center of the .NET framework is an object model, called the VOS (Virtual Object System); at the center of that object model is a type system. This already sets .NET apart from the other component models available, which are organized around a programming language (Java), an application interconnection model (CORBA), a wiring model (COM). An O-O enthusiast like me is entitled to see .NET as the latest vindication of the observation that, in today's software world, there is no going around the O-O model [not the component model] as a foundation for any serious software engineering work."
What makes the .NET Framework so special? A synergistic combination of many factors:
Programming languages are compiled to a common Microsoft Intermediate Language (MSIL) bytecode.
A Common Language Specification (CLS) specifies minimum requirements for language interoperability. Yet it also permits a language to ignore the rules for tricks localized to the language.
A Common Type System (CTS) defines all primitive data types across all languages and also how they are to be composed as non-primitive types, such as structs, enumerations, arrays, classes, etc.
Inheritance and refinement across different languages is possible.
Metadata and schema information is represented in a dynamic meta-model and not bound at compile time. They actually use the Active Object Model!
Metadata and schema information can be dynamically deduced from the data or database, or it can be loaded from XML Schema or saved as XML Schema.
A Wizard can generate strongly-typed classes from the schema if desired, for compile-time binding.
The nature of the DBMS or data store is hidden from the application. All data is converted into XML format under an XML Schema definition.
Rather than a record-at-a-time, or a table-at-a-time, an entire subschema subset of data can be loaded at one time in one transient connection. This creates a "Disconnected DataSet". It does not leave expensive database Connections hanging.
Disconnected DataSets can be rapidly moved in-bulk to distributed systems so fine-grain access and modification of DB data properties does not consume round-trips (or transactions).
Disconnected DataSets can be returned to the database. The DataSet keeps a before and after image. This permits only altered data to be returned to the DB. It also permits a check against the 'before' data to ensure that the data in the DB was not changed by another process. Long-lived optimistic transactional locking is thus enforced by the .NET Framework without holding locks in the DBMS. (This also works for XML stores and other stores that are not as sophisticated as a DBMS.)
Doing a complete 180°, .NET is 100% stateful (not stateless like COM+, which makes DB round-trips for each UI I/O).
Uses two-tier rendering, so a presentation format can be designed once and made automatically available in multiple client UI formats & protocols.
The ASP.NET Base Classes for visual controls are extensible.
Wizards and Form Properties make connecting to DataSets a snap.
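The before-and-after image check described for disconnected DataSets can be illustrated independently of .NET. The sketch below is Python over SQLite and does not use the ADO.NET API; the dictionary layout and function names are hypothetical. The update succeeds only if the row still matches the image that was read.

```python
import sqlite3


class StaleDataError(Exception):
    """Raised when the row changed in the database after it was read."""


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO customer VALUES (1, 'Ann')")
    db.commit()
    return db


def load(db, row_id):
    """Read a row and keep both a 'before' and an editable 'after' image."""
    (name,) = db.execute("SELECT name FROM customer WHERE id = ?", (row_id,)).fetchone()
    return {"id": row_id, "before": {"name": name}, "after": {"name": name}}


def save(db, ds):
    """Write back only if changed, and only if the row still matches 'before'."""
    if ds["after"] == ds["before"]:
        return 0  # nothing altered, nothing sent
    cur = db.execute(
        "UPDATE customer SET name = ? WHERE id = ? AND name = ?",
        (ds["after"]["name"], ds["id"], ds["before"]["name"]),
    )
    if cur.rowcount == 0:
        db.rollback()
        raise StaleDataError(f"customer {ds['id']} was changed by another process")
    db.commit()
    ds["before"] = dict(ds["after"])  # the saved image becomes the new baseline
    return cur.rowcount
```

No lock is held between `load` and `save`, so the user can take as long as they like; the cost is that the second writer must handle the conflict.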
This diagram captures the relationships within the ADO.NET object model. Usually just a hierarchical model is presented, but as you can see below the actual relationships are really quite rich.
BTW, a worthy add-on to Visual Studio is Rational XDE, which provides immediate two-way conversion between UML and C#.
Microsoft .NET Enterprise Servers:
Non-.NET Servers:
[Warning: most of the information here is over a year old. Corrections are welcomed.]
Here is where I say some negative things about EJB. Please don't get me wrong. The features provided by EJB are definitely needed. If you were not to use EJB then you would have to invent these features on your own and likely you are not able to do a better job in less time with less money. However, that does not mean there are no better alternatives out there--I just don't know what they are--at least for distributed Java components under an ORB with an OO connectivity/invocation paradigm. It may be that even at its best, at this present time, there are issues to address.
Here's the sales pitch: "EJB provides vendor-neutral access to a variety of infrastructure services, such as distributed communication services, naming and directory services, transaction services, messaging services, data access and persistence services, and resource-sharing services." See the "What is it & What's in it for me" pdf whitepaper from the Seybold Group. Whether it can achieve these goals in reality is an open issue.
COTS EJB (not the programmer) manages automatic distribution, load balancing, dynamic relocation, replication, failover, security, communication, transaction integrity, persistence, JNDI naming & lifecycle management (i.e. how do I find it & if it's asleep, wake it), caching, pooling, concurrency.
One thing you should know about EJB is that it is an enterprise component framework, not an inter-enterprise component integration framework. The application servers provided by BEA, IBM, etc. cannot integrate with each other, nor can they integrate with themselves when hosted by different companies. The assumption is that each company manages its own server within isolated jurisdictions. JMS provides a way for loose integration but not while extending those characteristics that make EJB a robust architecture.
The IBM SanFrancisco project contains over 1100 prebuilt and pretested extendable Java components of both common and domain-specific business entities & processes, including Order Management & Warehouse Mgmt. IBM plans to complete porting them all into EJB and use them as the basis of an industry-supported Application Architecture.
From what I can figure out, it is free--you pay for support and/or the privilege to contribute to its evolution.
Here are details on its Order Management component.
Just because you may know everything there is to know about J2EE and OOAD does not mean you know jack about designing a good robust application. Get the book and learn from the mistakes of others:
The EJB spec states that if a transaction rolls back, the stateful session beans do not automatically roll back; only the database does. Hook methods are provided to enable this but it has to be manually coded to save and recall prior settings. There is no magic here.
In all, the whole issue of "transitive closure" of "state" is a muddled message in regard to (1) temporary eviction from memory and (2) rollbacks. "Transitive closure" here means all things referenced by an object and all things referenced by those objects, ad nauseam. "State" includes DB values, temporary variables, and open resources such as open files, socket descriptors, and DB connections.
Another, much smaller, area of concern is that EJB does not allow "loopback calls" (if A calls B, B can't call A).
A major advantage of EJB over COM+ is the presence of stateful RAM-resident components. Thus EJB can effectively use increased RAM for increased speed, while Microsoft goes back to the DBMS for each pass. Microsoft does not push statefulness because they have a vested interest in PCs, which characteristically are RAM poor.
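What "manually coded" means in practice can be sketched outside of Java. The hook names below loosely mirror the `afterBegin` and `afterCompletion(committed)` callbacks of EJB's `SessionSynchronization` interface, but this is an illustrative Python model with a hypothetical `CartSession` class, not EJB code.

```python
import copy


class CartSession:
    """Stateful component that must restore its own fields on rollback."""

    def __init__(self):
        self.items = []
        self._saved = None

    def after_begin(self):
        """Hook: a transaction has started, so snapshot the conversational state."""
        self._saved = copy.deepcopy(self.items)

    def after_completion(self, committed):
        """Hook: the transaction ended; on rollback, put the snapshot back."""
        if not committed and self._saved is not None:
            self.items = self._saved
        self._saved = None


def run_in_transaction(session, work):
    """Stand-in for the container: call the hooks around a unit of work.

    Returns True if the work committed, False if it rolled back.
    """
    session.after_begin()
    try:
        work(session)
    except Exception:
        session.after_completion(False)
        return False
    session.after_completion(True)
    return True
```

The snapshot covers only plain in-memory fields. Open files, sockets and connections held by the component cannot be restored by copying, which is part of the "transitive closure" problem.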
EJB caches stateful entity beans so the disk is not hit repeatedly. However, caching is the domain of the DBMS. Moving it to EJB causes duplication of locking, transaction, life-cycle, etc. services.
WebLogic once stated they immediately execute a SQL UPDATE for every EJB call that updates a field. If that is still the case we need to make sure we update all fields in one call, or else performance will suffer.
Some interesting published stats on scalability of WebLogic using clustering:
- A single WebLogic server (running on a 6-way Unix processor) simultaneously served 50,000 active clients, executing over 2500 EJB round-trip method invocations per second.
- Remote Method Invocation (RMI) benchmarks have shown that the throughput of a WebLogic cluster servicing 10,000 active client applications scales linearly up to 10 single-processor PCs, providing a maximum of 7942 round-trip method invocations per second.
- A WebLogic cluster of 12 servers running on three 4-way PC servers was able to serve 2675 dynamic Web pages per second, or 231 million pages per day (a load 38% greater than what the Internet's busiest site, Yahoo!, reported for December 98). In this benchmark, WebLogic provided linear scaling, averaging 223 pages per second per CPU.
(Please note: There are no database I/O costs included in each of the above results, and hence they should not be used for capacity planning. Nevertheless, they do prove out the performance and scalability of Java, WebLogic, and our clustering solution.)
These numbers hide something.
The first bullet was for a non-distributed SMP (symmetric multiprocessor, shared memory) box. On the second bullet, the number of processors increased from 6 to 10 but the client count went down from 50,000 to 10,000. This is what you would expect when you move from a SMP to a fully distributed system.
My take: Roughly, a $5m, 16-cpu SMP forms an upper threshold on performance. The jump to performance above the high-end SMP is a large distributed system (e.g. 40+ cpu), where, paradoxically, the hardware costs go down (they are off-the-shelf PCs) but the software costs go up because custom software architectures are needed for locality, recovery, etc. that slice into the application logic (like TX) and take a long time to design/develop/debug. EJB was born for this.
As stated, the benchmarks above do not have database I/O. That also means they do not use JTS. Also, the beans are likely stateless. Little of this matches most applications, and the performance impact of stateful & transactional distributed processing is major. OK, OK, the emperor's new clothes are indeed pretty, but can anyone send me new numbers we can use?
Sun's RMI takes 2899 microsec for what a revised RMI API does in 57 microsec, and its latency is 40 times higher than RPC. Report.
Related topic at: "Performance Evaluation of Popular Distributed Object Technologies for Java"
EJB scalable communication middleware remains a central issue. Here is what BEA Weblogic says about their RMI implementation: "Sun's reference implementation of RMI uses multiple sockets. WebLogic's implementation uses a single, multiplexed, asynchronous, bi-directional connection for RMI, JDBC, and Events. With WebLogic RMI, the complicated virtual circuit that is maintained by the reference implementation is replaced by a single connection to the WebLogic Server. WebLogic RMI serialization, offers a significant performance gain, even for one-time use of a remote class. Unlike the reference implementation, with WebLogic RMI there is no performance penalty for co-located objects that are defined as remote. References to co-located "remote" objects are resolved as direct references to the actual implementation object, rather than to the generated proxies." Source.
At the heart of WebLogic's communication is what they call "RichSocket". It's a multiplex socket clocked at 7942 RMI round-trip methods/sec on a 10 PC LAN (no DB). They did not give an EJB benchmark on a LAN (ha!). In a non-distributed 4-CPU SMP they got 3910 EJB round-trip methods/sec.
The distributed objects model is deficient as a way to achieve scalability. All attempts to fix the problems above result in new singletons that limit scalability or facades that are non-OO in nature. In part, this may be viewed as an impedance mismatch between OO and scalability via pipelining. Pipelining requires that data must 'flow', OO requires that data must 'stay put' (i.e. encapsulation). However, most data has a point of ingress and egress. This may be viewed as an impedance mismatch between OO and scalability via statelessness (objects have state).
OO was designed for reusability and maintainability, not scalability. Only mobile objects (aka ObjectSpace Voyager), or selective elimination of some precious OO principles around hardware boundaries can overcome this problem. (Note: Embedding internal knowledge about record boundaries into parameter selection is a modest OO violation. Passing large aggregates of data, some of which might or might not be needed by the caller, is another violation. Loose typing of parameters (which we need to do) is another violation (in some minds).)
These individuals have a real grasp on computer technology and its future. It is a privilege to sit at their feet and learn.
When applied to multi-enterprise supply-chain integration a number of tough issues come to the fore: epistemology, ontology, semantics, integrity, normalization, OO impedance mismatch, transactional ACID constraints, long-lived transaction compensation scenarios, workflow & dataflow, metadata standards, repositories, availability, scalability, recoverability, nonrepudiation, authentication, translation, internationalization, dimensional folding, etc. At the same time the convergence of many technologies causes a deeper look into how to put things together.
I have composed a number of whitepapers on these or related topics, but unfortunately many are not available for general distribution.
The stuff listed here is deep and theoretical and forms IMHO the ultimate expressions in their field. It may be of great value to take the time to learn and apply them. Be warned, they are very hard to Grok.
Wolfram's book "A New Kind of Science" is finally out. The book presents new discoveries connecting cellular automata and processes found everywhere.
The W3C is making good progress and agreement on the Semantic Web to replace the current internet. Watch RDF & TRIPLE.
James Martin's latest book predicts that AI will learn from the internet and we humans will allow it to run our corporations in a competitive race for the best AI "master".
On the horizon, the convergence of genetic engineering and computer science is underway. Both are variations on "information" that will eventually converge. The end result will redefine who we are and what it means to be "human". Read Stephen Hawking's "The Universe in a Nutshell", chapter 6.
There seem to be two types of productive meetings:
In every meeting you attend it is important to recognize which type of meeting you're in. The type of people you will find in a planning meeting likely do not have the patience or desire to work out the solution in the meeting. There also are different types of personalities at work in each type of meeting so choose invitees carefully. Keep pointy-haired bosses and Dilberts in separate corners to do their best work. Issues meetings often throw both types of people together and the tendencies of each type cause the meeting to feel out of control. Guidelines need to be set.
"The first responsibility of a leader is to define reality. The last is to say thank you."
-- Max DePree
"Hold yourself responsible for a higher standard than anybody else expects of you. Never excuse yourself. Never pity yourself. Be a hard master to yourself - and be lenient to everybody else."
-- Henry Ward Beecher
"This above all: to thine own self be true, And it must follow, as the night the day, Thou canst not then be false to any man."
-- William Shakespeare
"The Law of Integrity: There are two basic types of leadership in business today, transactional and transformational. Transactional leadership is the ability to direct people, manage resources, and get the job done. But transformational leadership, the most important form of leadership today, is the ability to motivate, inspire, and bring people to higher levels of performance."
--Brian Tracy, Laws of Business Success
"When it comes to commitment, there are really only four types of people:
1. Cop-outs: People who have no goals and do not commit.
2. Holdouts: People who don't know if they can reach their goals, so they are afraid to commit.
3. Dropouts: People who start toward a goal but quit when the going gets tough.
4. All-outs: People who set goals, commit to them, and pay the price to reach them.
What kind of person are you?"
--John C. Maxwell, The 21 Indispensable Qualities of a Leader
"Never confuse movement with action."
-- Ernest Hemingway
"It is better to know some of the questions than all of the answers."
-- James Thurber
It is better to debate a question without settling it, than to settle a question without debating it.
-- Joseph Joubert
"For every complex problem there is an answer that is clear, simple, and wrong."
-- H L Mencken
A mission organization wanted to send helpers to Dr. David Livingstone, the missionary to Africa, so its leader wrote, "Have you found a good road to where you are? If so, we want to send other men to join you." Livingstone replied, "If you have men who will come 'only' if they know there is a good road, I don't want them. I want men who will come even if there is no road at all."
"The reasonable man adapts himself to the world; the unreasonable man persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man."
-- George Bernard Shaw, Man and Superman
The plan is nothing, the planning is everything.
-- Sir Winston Churchill
"The salvation of mankind lies only in making everything the concern of all."
-- Alexander Solzhenitsyn
"If you don't stand for something, you'll fall for anything."
-- Dr. Martin Luther King
"Those who can make you believe absurdities can make you commit atrocities."
-- Voltaire
You are the way and the wayfarers.
And when one of you falls down he falls for those behind him, a caution against the stumbling stone.
Ay, and he falls for those ahead of him, who though faster and surer of foot, yet removed not the stumbling stone.
-- Kahlil Gibran
If you have built castles in the air, your work need not be lost; that is where they should be. Now build the foundations under them.
-- Henry David Thoreau
"modus in rebus"--moderation in all things -- Horace.
"To live in the presence of great truths and eternal laws, to be led by permanent ideals - that is what keeps a man patient when the world ignores him, and calm and unspoiled when the world praises him."
-- Honore De Balzac
Self-confidence is important. Confidence in others is essential.
-- William A. Schreyer, CEO, Merrill Lynch
"People who look down upon other people don't end up being looked up to."
-- Robert Half
"The hottest places in Hell are reserved for those who, in time of moral crisis, maintain their neutrality."
-- Dante, "The Inferno"
"When a man knows he is to be hanged in a fortnight it concentrates his mind wonderfully."
-- Samuel Johnson
"Pray that success will not come any faster than you are able to endure it."
-- Elbert Hubbard
"Twenty years from now you will be more disappointed by the things that you didn't do than by the ones you did do. So throw off the bowlines. Sail away from the safe harbor. Catch the trade winds in your sails. Explore. Dream. Discover."
-- Mark Twain
"You can't depend on your judgment when your imagination is out of focus."
-- Mark Twain
In Japan there is a delicacy called fugu. It is raw pufferfish flesh, an animal that happens to be one of the most poisonous in the world. Every year some 100 people become ill after eating fugu. More than half of them die. And yet there is a simple saying: "The only man who is crazier than the man who eats fugu is the one who does not eat fugu."
Most papers in Computer Science describe how their author learned what someone else already knew.
-- Peter Landin, Circa 1967
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to produce bigger and better idiots. So far, the universe is winning."
-- Rick Cook, "The Wizardry Compiled"
Metcalfe's Law states that the total value of a network is proportional to the square of the number of subscribers, while the value to each subscriber is proportional to the number of subscribers. The law describes why it is essential that everyone have access to a single network instead of subscribing to isolated networks.
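To make the arithmetic concrete, here is a minimal sketch of the law in Python. The function names and the subscriber counts are made up for illustration, and the constant of proportionality is assumed to be 1; it only compares one shared network against the same subscribers split across isolated networks.

```python
from typing import Sequence


def network_value(subscribers: int) -> int:
    """Total value of one network: the square of its subscriber count
    (constant of proportionality assumed to be 1)."""
    return subscribers ** 2


def value_per_subscriber(subscribers: int) -> int:
    """Value seen by each subscriber: the number of subscribers."""
    return subscribers


def total_value(networks: Sequence[int]) -> int:
    """Combined value of several networks that cannot reach each other."""
    return sum(network_value(n) for n in networks)


# Hypothetical figures: 1000 subscribers, split four ways or all on one network.
isolated = [250, 250, 250, 250]
merged = [sum(isolated)]

print(total_value(isolated))  # 250000
print(total_value(merged))    # 1000000
```

Under these assumptions the single network is worth four times as much as the four isolated ones, even though the number of subscribers is the same.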
George Gilder's Telecosm Law #10: The Law of Instantaneous Information
This law is a commandment to save time: the companies that save their clients time will profit in the telecosm. Time to market, turnaround time, disk seek and rotate time, time to retirement, network delay time, memory access time all reduce to two key metrics: the speed of light and the span of life. A physical limit and a biological limit, these are the governing scarcities of the information age.
Peter Cochrane (1996) estimates the brain to have a processing power of around 1000 million-million operations per second (one petaops) and a memory of 10 terabytes. If current trends continue, computers could have these capabilities by 2047. Such computers could be on-body personal assistants able to recall everything one reads, hears, and sees.
As Edsger Dijkstra pointed out in his Turing Award lecture, the ideas we can express, and even think, are greatly influenced by the languages we use. In order to achieve our software-building ambitions, we therefore need languages that support abstraction and factoring of algorithms to form "intellectually manageable programs".
"The most important kind of time is the time that it takes to acquire and retain your customers. That's what 'time zero' is all about: How fast are you retaining customers that you already have? How fast are you grabbing new ones?"
-- Mark Teflian, President of TimeØ
"The central event of the 20th century is the overthrow of matter. In technology, economics, and the politics of nations, wealth -- in the form of physical resources -- has been losing value and significance. The powers of mind are everywhere ascendant over the brute force of things."
-- Magna Carta for the Knowledge Age
by Esther Dyson, George Gilder, George Keyworth, and Alvin Toffler
In case you wanted to know: what really is a computer, and how does it work? Here is the answer.