Posted by: wjanry()
Compiled by: (2000-03-07 20:29:41), on-site mail
Integrating Object and Relational Technologies
Abstract
Object-oriented and object-based technology is the predominant approach used to build mainstream business systems. Effective use of object technology has demonstrated the ability to provide organizations with high-quality systems that are extremely understandable, maintainable and scalable.
Relational database management systems are the most prevalent implementation of back-end physical data stores for business applications. They are frequently selected for persistent object storage and retrieval in object-relational systems.
An object-oriented design may resemble a relational design (e.g. the classes in the object model may be reflected in some fashion as entities in the relational schema). Usually this occurs only during the early conceptual phases of system development. However, as the design and implementation of the system mature, the goals of object-oriented and relational database development diverge. This divergence arises because the objective of relational database development is to normalize data, whereas the goal of object-oriented design is to model the problem domain as real-world objects. The result is an object-relational chasm.
The utilization of a scalable and flexible mapping framework is a critical success factor for object-relational systems. The object-relational chasm must be spanned. The bridge must be strong and flexible. The bi-directional flow of data must be fast and reliable.
This paper presents a proven architecture for implementing object-relational systems. It discusses a conceptual application architecture and details essential object-relational mapping characteristics.
The Need for Object-Relational Integration
Relational Databases
Relational databases are a proven, mature and stable technology. They are and will continue to be the primary data storage technology for a variety of reasons.
A safe and easy decision. Relational technology has been used and tested for a number of years and is consequently well understood.
Current investment. Organizations have made significant investments in relational technology. They have purchased systems and software, trained their staff, and deployed mission-critical systems.
Entrenched legacy systems. Legacy systems contain critical corporate data which resides in relational databases and must be used by new systems.
Simplicity and elegance. The relational model is a simple model which has a sound mathematical foundation.
Demonstrated industrial-strength success. Relational systems are used to implement OLTP systems. These systems are optimized for performance and transaction processing, and utilize the database's built-in integrity constraints.
Tools. A wide variety of tools exist to design, implement and maintain relational database management systems, including query and report-writing capabilities.
What Creates the Need for Object-Relational Integration?
Relational Database Design
While object-oriented and relational design share some common characteristics, fundamental differences make seamless integration a challenge. The data models and computational models are different [Bancilhon 3].
The relational model is composed of entities and relations. An entity may be a physical table or a logical projection of several tables, also known as a view. Figure 1 illustrates LINEITEM and PRODUCT tables and the various relationships between them. A relational model has the following elements:
Figure 1, A Relational Model
An entity has columns. Each column is identified by a name and a type. In the example, the LINEITEM table has Description, Id, Number, Order_Id, and Quantity columns.
An entity has records or rows. Each row represents a unique tuple of information which typically represents an object's persistent data.
Each entity has a primary key made up of one or more columns. The primary key uniquely identifies each record (e.g. Id is the primary key for the LINEITEM table).
Support for relations is vendor specific. The example illustrates the logical model and the relation between the PRODUCT and LINEITEM tables. In the physical model, relations are typically implemented using foreign key / primary key references. If one entity relates to another, it will contain columns which are foreign keys. Foreign key columns contain data which can relate specific records in the entity to the related entity.
Relations have multiplicity (also known as cardinality). Common cardinalities are one to one (1:1), one to many (1:m), many to one (m:1), and many to many (m:n). In the example, LINEITEM has a 1:1 relationship with PRODUCT and PRODUCT has a 0:m relationship with LINEITEM.
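The elements listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the LINEITEM column names come from the text, but the PRODUCT columns (Name, Price) and the Product_Id foreign key column are assumptions, since Figure 1 is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE PRODUCT (
    Id    INTEGER PRIMARY KEY,   -- primary key
    Name  TEXT NOT NULL,         -- hypothetical column
    Price REAL NOT NULL          -- hypothetical column
);
CREATE TABLE LINEITEM (
    Id          INTEGER PRIMARY KEY,
    Number      INTEGER,
    Description TEXT,
    Quantity    INTEGER,
    Order_Id    INTEGER,
    Product_Id  INTEGER NOT NULL REFERENCES PRODUCT(Id)  -- assumed foreign key
);
""")

conn.execute("INSERT INTO PRODUCT VALUES (1, 'Widget', 2.5)")
conn.execute("INSERT INTO LINEITEM VALUES (10, 1, 'Blue widget', 4, 100, 1)")
conn.execute("INSERT INTO LINEITEM VALUES (11, 2, 'Red widget', 1, 100, 1)")

# Navigating the relation means joining on the foreign key / primary key pair.
rows = conn.execute("""
    SELECT l.Id, p.Name
    FROM LINEITEM l JOIN PRODUCT p ON p.Id = l.Product_Id
    ORDER BY l.Id
""").fetchall()

# A line item that points at a missing product is rejected by the database.
try:
    conn.execute("INSERT INTO LINEITEM VALUES (12, 3, 'Orphan', 1, 100, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Each LINEITEM row refers to exactly one PRODUCT row, while a PRODUCT row may be referenced by zero or more LINEITEM rows, which matches the 1:1 and 0:m multiplicities described above.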
Object Model Design
An object model contains classes (this is an oversimplification; for a detailed description of an object model consult any of the fine texts available on object-oriented analysis and design). Classes represent definitional information. In a running system, individual instances of classes exist as objects. A class contains structure, represented as attributes, and behavior, represented as methods. The object model also contains relationships among classes. Figure 2 illustrates a simple object model. Some of its important aspects are:
Figure 2, An Object Model
The LineItem class has attributes and behavior. LineItem implements the behavior (method) calculateTotal.
Relationships in an object model are explicitly represented as attributes. The LineItem class has an attribute product. The product attribute contains an instance of the Product class (or one of its subclasses). There is no relationship navigation through primary/foreign keys as required in relational databases.
The object model supports inheritance. A class can inherit data and behavior from another class (e.g. SoftwareProduct and HardwareProduct inherit attributes and methods from the Product class).
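A minimal Python sketch of the object model described above follows. The class names come from the text; the individual attributes (price, license_key, weight_kg) are assumptions, since Figure 2 is not reproduced here, and calculate_total stands in for the calculateTotal method.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    price: float  # hypothetical attribute


@dataclass
class SoftwareProduct(Product):
    license_key: str = ""  # hypothetical attribute


@dataclass
class HardwareProduct(Product):
    weight_kg: float = 0.0  # hypothetical attribute


@dataclass
class LineItem:
    number: int
    quantity: int
    product: Product  # the relationship is a direct object reference, not a key

    def calculate_total(self) -> float:
        # Behavior lives with the data; navigation uses the attribute itself.
        return self.quantity * self.product.price


item = LineItem(number=1, quantity=4, product=SoftwareProduct("Editor", 2.5))
total = item.calculate_total()
```

Note that item.product can hold any subclass of Product, and no key lookup is involved in reaching it.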
Application Architectures
Software development professionals have crossed the object-relational chasm. Through analysis and experience, some have come to understand the problem. They have engineered reusable solutions that allow them to easily traverse the chasm. To better understand successful architectures, a review of less optimal solutions is instructive. The motto of most who have crossed is "damn the torpedoes, full steam ahead!" They have pieced together haphazard solutions which become legacy applications as soon as they are released.
Relational Legacy Architecture
Figure 3, Relational legacy architecture
The model for these relational legacy applications is illustrated in figure 3. The application is composed of two major components - the graphical user interface and the database. This application is not object-oriented. Often, much of the application is automatically generated by a tool. Whether code generated or hand coded, these applications tightly bind the user interface directly to the database. SQL queries are scattered throughout and critical business logic cannot be distinguished from GUI navigation source code. The architecture can be characterized as extremely brittle. Changes made to the database have immediate effect and propagate failures indiscriminately throughout the application.
Classic 3-Tier Architecture
Figure 4, Tightly coupled 3-Tier architecture
Figure 4 illustrates an incrementally better approach. It recognizes the value of the business object model. The user interface does not have any direct interaction with the database; instead, it interacts with the business object model, which is responsible for object persistence. This additional layer of abstraction slightly improves the quality of the architecture, but does little to address the brittle nature of the application. The GUI is tightly coupled to the object model. The impact of change to the object model cannot be easily assessed or controlled. When objects change, the user interface will be directly impacted. The same condition exists between the business object model and database.
Application developers spend over 30% of their time implementing relational database access in object-oriented applications. If the object-relational interface is not correctly implemented, the investment is lost. Implementing an object-relational framework captures this investment. The object-relational framework can be reused in subsequent applications, reducing the object-relational implementation cost to less than 10% of the total implementation costs. The most important cost to consider when implementing any system is maintenance. Over 60% of the total costs of a system over its entire life-cycle can be attributed to maintenance. A poorly implemented object-relational system is both a technical and financial maintenance nightmare.
Component-Based 3-Tier Architecture
Successful object-oriented systems are characterized by a component-based architecture with clearly defined graphical user interface (GUI), business object model and physical data store layers. This conceptual model has been popularized as the 3-tiered application architecture.
Figure 5, 3-Tier Architecture
Visual integrated development environments (IDEs) and object-oriented analysis and design tools are commonly used for system construction. These highly productive tools allow application developers to rapidly construct graphical user interface components and bind them to business objects. Event-based frameworks (e.g. Smalltalk's Model/View/Controller or MVC) are often used to couple the graphical user interface to the business object model. MVC, and other similar frameworks, provide facilities to loosely couple the GUI with the business object model. The GUI and business object model are developed as functionally independent components and their common interface is very compact.
The GUI and business object model are developed in a homogeneous environment using object-oriented languages and tools. Consistent design and implementation practices are followed. Applications implemented using object-oriented (or even object-aware) physical data stores further leverage a homogeneous object-oriented environment. The semantics for persistence are consistent with the system's implementation and object-relational translation is not required.
The majority of business applications utilize relational technology as a physical data store. It is important to encapsulate the relational database in an object-oriented application, otherwise object-relational translation code will proliferate throughout the system. If the data store is sufficiently wrapped, the impact on the object model is minimized and a more robust and scalable solution is attained.
In summary, a well-designed 3-tier architecture has the following characteristics:
Clearly defined, loosely coupled tiers/layers.
Internally, each layer is partitioned into functionally oriented components.
The interface between the layers is well defined, well documented and compact.
A reusable framework is implemented to couple the GUI to the business object model. The framework employs an event-based approach which loosely couples the business object model to the GUI.
A reusable framework is implemented to couple the business object model to the physical data store. The framework provides facilities which utilize object-relational mapping information specified during development to automatically translate objects into requests which can be served by relational database servers.
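The event-based coupling between the GUI and the business object model can be sketched as a simple change-notification scheme. This is an illustrative Python sketch, not Smalltalk's MVC; the names OrderLine and QuantityLabel are hypothetical.

```python
from typing import Callable, List


class OrderLine:
    """Business object that announces changes without knowing who listens."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str, object], None]] = []
        self._quantity = 0

    def subscribe(self, listener: Callable[[str, object], None]) -> None:
        self._listeners.append(listener)

    @property
    def quantity(self) -> int:
        return self._quantity

    @quantity.setter
    def quantity(self, value: int) -> None:
        self._quantity = value
        # Broadcast the change; the model never references a GUI class.
        for listener in self._listeners:
            listener("quantity", value)


class QuantityLabel:
    """Stand-in for a GUI widget; it depends only on the event signature."""

    def __init__(self) -> None:
        self.text = ""

    def on_change(self, attribute: str, value: object) -> None:
        self.text = f"{attribute} = {value}"


line = OrderLine()
label = QuantityLabel()
line.subscribe(label.on_change)
line.quantity = 3
```

The only contract shared by the two layers is the (attribute, value) callback, which keeps their common interface compact.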
The subsequent sections of this document will further explore the interface and integration issues between the object model and the physical data store.
The Object-Relational Framework
Many solutions exist which let applications directly access relational data. Solutions even exist for non-object-oriented languages, but the interface is typically reflected as an application programming interface (API). The database APIs come in standard flavors (e.g. ODBC) and proprietary flavors (native bindings to specific databases). The APIs provide data manipulation language (DML) pass-through services which allow applications to access raw relational data. In object-oriented applications, the data must undergo object-relational translation prior to being used by the application. This requires a considerable amount of application code to translate raw database API results into application objects. The role of the object-relational framework is to generically encapsulate the physical data store and to provide appropriate object translation services.
Some of the essential characteristics of an object-relational framework are:
Performance: Close consideration must be given to decomposing objects into data and composing objects from data. In systems where data throughput is high and critical, this is often the Achilles heel of an inadequately designed access layer.
Minimize design compromises: A familiar pattern to object technologists who have built systems which utilize relational databases is to adjust the object model to facilitate storage into relational systems, and to alter the relational model for easier storage of objects. While minor adjustments are often needed, a well-designed access layer minimizes both object and relational model design degradation.
Extensibility: The access layer is a white-box framework which allows application developers to extend the framework if certain functionality is desired in the framework. Typically, an access layer will support 65-85% of an application's data storage requirements without extension. If the access layer is not designed as an extensible framework, achieving the remaining 15-35% of an application's data storage requirements can be very difficult and costly.
Documentation: The access layer is both a black-box component and a white-box framework. The API of the black-box component must be clearly defined, well documented, and easily understood. As previously mentioned, the access layer is designed to be extended. An extensible framework must be very thoroughly documented. Classes which are intended to be subclassed must be identified. The characteristics of each relevant class's protocol must be specified (e.g. public, private, protected, final, …). Moreover, a substantial portion of the access layer framework's design must be exposed and documented to facilitate extensibility.
Support for common object-relational mappings: An access layer should provide support for some basic object-relational mappings without extension. These object-relational mappings are discussed further in a subsequent section of this document.
Persistence Interfaces: In an object-oriented application, the business object model captures semantic knowledge of the problem domain. Developers should manipulate and interact with objects without having to worry too much about the data storage and retrieval details. A small, well-defined set of persistence interfaces (save, delete, find) should be provided to application developers.
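The persistence interfaces named above (save, delete, find) might be presented to application developers roughly as follows. This is a sketch under stated assumptions: PersistenceBroker, InMemoryBroker and Customer are hypothetical names, and the in-memory implementation only stands in for a relational back end so that the interface can be exercised.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class PersistenceBroker(ABC):
    """Hypothetical black-box API seen by application developers."""

    @abstractmethod
    def save(self, obj): ...

    @abstractmethod
    def delete(self, obj): ...

    @abstractmethod
    def find(self, cls, **criteria): ...


class InMemoryBroker(PersistenceBroker):
    """Stand-in implementation; a real one would talk to the database."""

    def __init__(self):
        self._store = {}

    def save(self, obj):
        self._store[(type(obj), obj.id)] = obj

    def delete(self, obj):
        self._store.pop((type(obj), obj.id), None)

    def find(self, cls, **criteria):
        return [
            obj for (klass, _), obj in self._store.items()
            if klass is cls
            and all(getattr(obj, name) == value for name, value in criteria.items())
        ]


@dataclass
class Customer:  # hypothetical business class
    id: int
    name: str


broker = InMemoryBroker()
broker.save(Customer(1, "Ada"))
broker.save(Customer(2, "Grace"))
found = broker.find(Customer, name="Grace")
broker.delete(found[0])
remaining = broker.find(Customer)
```

Because application code depends only on the abstract interface, the storage mechanism behind it can be replaced or extended by subclassing.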
Common Object-Relational Services
Common patterns are emerging for object-relational applications. IT professionals who have repeatedly crossed the chasm are beginning to understand and recognize certain structures and behaviors which successful object-relational applications exhibit. These structures and behaviors have been formalized by the high-level CORBA Services specifications.
The CORBA service specifications which are applicable and useful to consider for object-relational mapping are:
Persistence
Query
Transactions
Concurrency
Relationships
The following sections will use these categories to structure a discussion of common object-relational services. The reader is encouraged to reference the appropriate CORBA specifications for further details.
Persistence
Persistence is a term used to describe how objects utilize a secondary storage medium to maintain their state across discrete sessions. Persistence provides the ability for a user to save objects in one session and access them in a later session. When they are subsequently accessed, their state (e.g. attributes) will be exactly the same as it was in the previous session. In multi-user systems, this may not be the case since other users may access and modify the same objects. Persistence is interrelated with other services discussed in this section. The separate consideration of relationships, concurrency and the others is intentional (and consistent with CORBA's decomposition of the services).
Examples of specific services provided by persistence are:
Data source connection management: Object-relational applications must initiate a connection to the physical data source. Relational database systems typically require identification of the server and database. The specifics of connection management tend to be database vendor specific, and the framework must accordingly be designed in a flexible, accommodating manner.
Object retrieval: When objects are restored from the database, data is retrieved from the database and translated into objects. This process involves extracting data from database-specific structures retrieved from the data source, marshaling the data from database types into the appropriate object types and/or classes, creating the appropriate object, and setting the specific object attributes.
Object storage: The process of object storage mirrors object retrieval. The values of the appropriate attributes are extracted from the object, a database-specific structure is created with the attribute values (this may be a SQL string, stored procedure, or special remote procedure call), and the structure is submitted to the database.
Object deletion: Objects that are deleted from within a system must have their associated data deleted from the relational database. Object deletion requires that appropriate information be extracted from the object, a deletion request be constructed (this may be a SQL string, stored procedure, or special remote procedure call), and the request submitted to the database.
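The four services above can be illustrated with a small hand-written mapper over Python's built-in sqlite3 module. This is a minimal sketch, not a framework: ProductMapper and the PRODUCT columns are hypothetical, and a real access layer would drive these steps from mapping metadata rather than hard-coded SQL.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Product:
    id: int
    name: str
    price: float


class ProductMapper:
    """Hypothetical mapper translating Product objects to and from rows."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def save(self, product: Product) -> None:
        # Storage: extract attribute values and submit them as a SQL request.
        self.conn.execute(
            "INSERT OR REPLACE INTO PRODUCT (Id, Name, Price) VALUES (?, ?, ?)",
            (product.id, product.name, product.price),
        )

    def find_unique(self, product_id: int):
        # Retrieval: fetch the row, marshal column values, build the object.
        row = self.conn.execute(
            "SELECT Id, Name, Price FROM PRODUCT WHERE Id = ?", (product_id,)
        ).fetchone()
        if row is None:
            return None
        return Product(id=int(row[0]), name=str(row[1]), price=float(row[2]))

    def delete(self, product: Product) -> None:
        # Deletion: only the identifying information is needed.
        self.conn.execute("DELETE FROM PRODUCT WHERE Id = ?", (product.id,))


# Connection management: here an in-memory database stands in for a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (Id INTEGER PRIMARY KEY, Name TEXT, Price REAL)")

mapper = ProductMapper(conn)
mapper.save(Product(1, "Widget", 2.5))
restored = mapper.find_unique(1)
mapper.delete(restored)
after_delete = mapper.find_unique(1)
```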
Query
Persistent object storage is of little use without a mechanism to search for and retrieve specific objects. Query facilities allow applications to interrogate and retrieve objects based on a variety of criteria. The basic query operations provided by an object-relational mapping framework are find and find unique. The find unique operation will retrieve a specific object and find will return a collection of objects based on query criteria.
Data store query facilities vary significantly. Simple file-based data stores may implement rigid home-grown query operations, while relational systems provide a flexible data manipulation language. Object-relational mapping frameworks extend the relational query model to make it object-centric rather than data-centric. Pass-through mechanisms are also implemented to leverage relational query flexibility and vendor-specific extensions (e.g. stored procedures).
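One way to make queries object-centric is to let callers state criteria in terms of attribute names and have the framework translate them into SQL. The sketch below assumes a hypothetical attribute-to-column map and a PRODUCT table; for brevity it returns raw rows rather than composed objects.

```python
import sqlite3

# Hypothetical mapping from object attribute names to column names.
COLUMN_FOR = {"name": "Name", "price": "Price"}


def build_query(**criteria):
    """Translate attribute criteria into a parameterized SELECT statement."""
    clauses = [f"{COLUMN_FOR[attribute]} = ?" for attribute in criteria]
    sql = "SELECT Id, Name, Price FROM PRODUCT"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, list(criteria.values())


def find(conn, **criteria):
    """Return every row matching the criteria."""
    sql, params = build_query(**criteria)
    return conn.execute(sql + " ORDER BY Id", params).fetchall()


def find_unique(conn, **criteria):
    """Return exactly one row, or fail if the criteria are not selective."""
    rows = find(conn, **criteria)
    if len(rows) != 1:
        raise LookupError(f"expected one match, found {len(rows)}")
    return rows[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (Id INTEGER PRIMARY KEY, Name TEXT, Price REAL)")
conn.execute("INSERT INTO PRODUCT VALUES (1, 'Widget', 2.5)")
conn.execute("INSERT INTO PRODUCT VALUES (2, 'Gadget', 9.0)")
conn.execute("INSERT INTO PRODUCT VALUES (3, 'Widget', 4.0)")

widgets = find(conn, name="Widget")
gadget = find_unique(conn, name="Gadget")
```

A pass-through mechanism would sit beside these functions and accept raw SQL or a stored procedure call when the generated queries are not flexible enough.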
Transactions
Object-relational mapping frameworks which are used in industrial-strength applications must provide transaction support. Transactional support enables the application developer to define an atomic unit of work. The operations within a transaction either all execute successfully or the transaction fails as a whole. Object-relational frameworks at a minimum should provide a relational database-like commit/rollback transaction facility. Designing object-relational frameworks for a multi-user environment presents many challenges and deserves careful thought.
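The commit/rollback behavior described above can be sketched with sqlite3, whose connection object acts as a context manager that commits on success and rolls back when an exception escapes the block. The ORDERS and LINEITEM tables and the save_order function are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ORDERS (Id INTEGER PRIMARY KEY);
CREATE TABLE LINEITEM (
    Id       INTEGER PRIMARY KEY,
    Order_Id INTEGER NOT NULL,
    Quantity INTEGER NOT NULL CHECK (Quantity > 0)
);
""")


def save_order(conn, order_id, quantities):
    """Store an order and its line items as one atomic unit of work."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute("INSERT INTO ORDERS (Id) VALUES (?)", (order_id,))
        for quantity in quantities:
            conn.execute(
                "INSERT INTO LINEITEM (Order_Id, Quantity) VALUES (?, ?)",
                (order_id, quantity),
            )


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


save_order(conn, 1, [2, 3])          # succeeds and is committed

try:
    save_order(conn, 2, [1, 0])      # the second line item violates the CHECK
    failed = False
except sqlite3.IntegrityError:
    failed = True

orders_after = count(conn, "ORDERS")
items_after = count(conn, "LINEITEM")
```

The failed call leaves no trace: neither order 2 nor its first, valid line item remains in the database.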
Concurrency
Multi-user object-oriented systems must control concurrent access to objects. When an object is accessed simultaneously by many users, the system must provide a mechanism to ensure modifications to the object in the persistent store occur in a predictable and controlled manner. Object-relational frameworks may implement pessimistic and/or optimistic concurrency controls.
Pessimistic concurrency control requires that the application developer specify their intent when the object is retrieved from the data store (e.g. read only, write lock, …). If objects are locked, other users may block when accessing the object and wait for the lock to be relinquished. Pessimistic concurrency should be used and implemented with caution as it is possible to create deadlock situations.
Optimistic concurrency assumes that it is unlikely that the same object will be simultaneously accessed. Concurrency issues are detected when the modifications are saved to the database. Typically, if the object has been modified by another user since its retrieval, an error will be returned to the application indicating failure of the modify operation. It is the application's responsibility to detect and handle the error. This calls for the framework to cache the values of objects as they were retrieved and compare them against the database.
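A minimal sketch of optimistic concurrency follows, assuming the value read at retrieval time is cached and compared against the database when saving. StaleObjectError, read_quantity and save_quantity are hypothetical names; many frameworks compare a version or timestamp column instead of the data values themselves.

```python
import sqlite3


class StaleObjectError(Exception):
    """Raised when the row changed after the object was retrieved."""


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LINEITEM (Id INTEGER PRIMARY KEY, Quantity INTEGER)")
conn.execute("INSERT INTO LINEITEM VALUES (1, 4)")
conn.commit()


def read_quantity(conn, item_id):
    return conn.execute(
        "SELECT Quantity FROM LINEITEM WHERE Id = ?", (item_id,)
    ).fetchone()[0]


def save_quantity(conn, item_id, cached_quantity, new_quantity):
    # The UPDATE only matches if the row still holds the cached value.
    cursor = conn.execute(
        "UPDATE LINEITEM SET Quantity = ? WHERE Id = ? AND Quantity = ?",
        (new_quantity, item_id, cached_quantity),
    )
    if cursor.rowcount != 1:
        conn.rollback()
        raise StaleObjectError(f"line item {item_id} was modified by another user")
    conn.commit()


# Two users retrieve the same object, then both try to modify it.
seen_by_a = read_quantity(conn, 1)
seen_by_b = read_quantity(conn, 1)

save_quantity(conn, 1, seen_by_a, 5)      # first writer succeeds

try:
    save_quantity(conn, 1, seen_by_b, 7)  # second writer holds a stale value
    conflict = False
except StaleObjectError:
    conflict = True

final_quantity = read_quantity(conn, 1)
```

No locks are held between retrieval and save; the conflict surfaces as an error that the application must handle, for example by reloading the object and retrying.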
Relationships
Objects have relationships to other objects. An Order object has many LineItem objects. A Book object has many Chapter objects. An Employee object belongs to exactly one Company object. In relational systems, relations between entities are implemented using foreign key / primary key references. In object-oriented systems relations are usually explicitly implemented through attributes. If an Order object has LineItems, then Order will contain an attribute named lineItems. The lineItems attribute of Order will contain many LineItem objects.
The relationship aspects of an object-relational framework are interdependent with the persistence, transaction, and query services. When an object is stored, retrieved, transacted, or queried, consideration must be given to its related objects.
While it is conceptually advantageous to consider common object-relational services separately, their object-relational framework implementations will be codependent.
Common Object-Relational Translations
Common object-relational translation patterns exist which span the object-relational chasm. The translations account for common object model designs, common relational schemas, and approaches for bi-directional translation. The subsequent sections discuss the common object-relational translations.
Simple Class/Table Translations
Tables are used to store an object's persistent attributes. The table which contains an object's data is said to be mapped to the object's class. Following are examples of basic class to table mappings:
Figure 6, One class maps to two tables
Figure 7, Two classes map to a single table
One class maps to one or more tables. A single class can map to several tables. When an object is restored from the database, the object-relational framework must compose the object by building a join of the tables. The decision to map a class to more than one table should be made judiciously, as it has performance implications for both the object-relational framework and the relational database system. Additionally, updates may be impossible.
One table maps to one or more classes. This is also referred to as an embedded class. Embedded class maps improve the performance of object persistence at the cost of extensibility, and they violate the principles of normalization. The embedded class is dependent on the parent class. Its unique identity is established via its relationship to the parent class.
The attributes of a class can map to one or more columns in a table. Sometimes a combination of more than one column is even used to calculate the value of the attribute. In general, for each persistent business class the following information must be specified:
Mapped Tables. The most basic information that the object-relational framework must maintain is information pertaining to which tables a class is mapped to.
Primitive Attributes. The term primitive attribute is used to denote an attribute of a class which maps to a column of a table. The primitive attribute will represent scalar data types.
Reference Attributes. Reference attributes represent relationships to other classes. If the Order class has a one to many relationship with the LineItem class, then the Order class can implement an attribute lineItems which can be used to manage the collection of related LineItem objects.
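The mapping information listed above (mapped tables, primitive attributes, reference attributes) can be captured as plain metadata from which a framework generates SQL. In the sketch below, ClassMap, the Customer class, and the CUSTOMER and ADDRESS tables are all hypothetical; the example shows one class mapped to two tables, so restoring an object requires a join.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClassMap:
    """Hypothetical mapping metadata for one persistent business class."""

    class_name: str
    tables: List[str]                       # mapped tables
    primitive_attributes: Dict[str, str]    # attribute -> "TABLE.Column"
    reference_attributes: Dict[str, str] = field(default_factory=dict)  # attribute -> related class
    join_condition: str = ""                # needed when several tables are mapped

    def select_sql(self) -> str:
        columns = ", ".join(
            f"{column} AS {attribute}"
            for attribute, column in self.primitive_attributes.items()
        )
        sql = f"SELECT {columns} FROM {', '.join(self.tables)}"
        if self.join_condition:
            sql += f" WHERE {self.join_condition}"
        return sql


customer_map = ClassMap(
    class_name="Customer",
    tables=["CUSTOMER", "ADDRESS"],
    primitive_attributes={"name": "CUSTOMER.Name", "city": "ADDRESS.City"},
    reference_attributes={"orders": "Order"},
    join_condition="ADDRESS.Customer_Id = CUSTOMER.Id",
)

generated = customer_map.select_sql()
```

Keeping the mapping as data, rather than scattering it through hand-written SQL, is what allows the translation code to be written once and reused across classes.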
Relationships
The classes in an object model can be related to one another through aggregation or association. The cardinality of the relationship can be [1,1], [1,m], [m,1], or [m,n]. Tables are related to one another using the same cardinalities. As previously stated, the relationships between objects are explicitly implemented through reference attributes, and tables are related to each other using foreign key associations. An object-relational framework maps relationships between objects using the foreign keys of the corresponding tables. The approach can have a significant impact on the performance and flexibility of the application. Examples of common relationship maps are:
1:1 relationship map: A one to one relationship between classes can be represented in a variety of ways within the database. From a performance perspective, the optimal approach is to implement a 1:1 mapping in a single table using an embedded class map since there are fewer tables to navigate. From a database design perspective, it is more desirable to implement a separate table with foreign keys to two other tables which contain data for objects on each side of the relationship (also known as a lookup table). This is a costly approach since three tables must be joined to traverse the relationship. Perhaps the most prevalent technique is to represent the 1:1 relationship using two tables where one table has a foreign key reference to the second.
1:m relationship map: A 1:m (one to many) relationship is typically represented in one of two ways. A lookup table approach similar to the 1:1 relationship map can be used. The lookup table contains foreign key references to tables which map to the classes on each side of the relationship. Relationship traversal requires that the three tables be joined. The 1:m relationship can also be represented using two tables. Here, the table on the many side of the relationship will have a foreign key reference to the table on the other side. Traversing the relationship is simpler and quicker since only two tables are involved.
m:n relationship map: The m:n (many to many) relationship is represented using a lookup table. The lookup table contains foreign key references to tables which map to the classes on each side of the relationship. Relationship traversal requires that the three tables be joined.
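The difference between a two-table 1:m map and a lookup-table m:n map can be seen in the queries needed to traverse them. The sketch below uses sqlite3; the ORDERS/LINEITEM pair follows the Order example in the text, while CATEGORY and the PRODUCT_CATEGORY lookup table are assumptions introduced only to show an m:n relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ORDERS (Id INTEGER PRIMARY KEY);
CREATE TABLE LINEITEM (
    Id          INTEGER PRIMARY KEY,
    Order_Id    INTEGER REFERENCES ORDERS(Id),   -- foreign key on the many side
    Description TEXT
);
CREATE TABLE PRODUCT  (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CATEGORY (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE PRODUCT_CATEGORY (                  -- lookup table for m:n
    Product_Id  INTEGER REFERENCES PRODUCT(Id),
    Category_Id INTEGER REFERENCES CATEGORY(Id),
    PRIMARY KEY (Product_Id, Category_Id)
);
INSERT INTO ORDERS VALUES (100);
INSERT INTO LINEITEM VALUES (1, 100, 'Blue widget');
INSERT INTO LINEITEM VALUES (2, 100, 'Red widget');
INSERT INTO PRODUCT VALUES (1, 'Widget');
INSERT INTO CATEGORY VALUES (1, 'Tools');
INSERT INTO CATEGORY VALUES (2, 'Gifts');
INSERT INTO PRODUCT_CATEGORY VALUES (1, 1);
INSERT INTO PRODUCT_CATEGORY VALUES (1, 2);
""")


def line_items_for(conn, order_id):
    """1:m traversal: the many-side table alone answers the question."""
    cursor = conn.execute(
        "SELECT Description FROM LINEITEM WHERE Order_Id = ? ORDER BY Id",
        (order_id,),
    )
    return [row[0] for row in cursor]


def categories_for(conn, product_id):
    """m:n traversal: three tables must be joined through the lookup table."""
    cursor = conn.execute(
        """
        SELECT c.Name
        FROM PRODUCT p
        JOIN PRODUCT_CATEGORY pc ON pc.Product_Id = p.Id
        JOIN CATEGORY c ON c.Id = pc.Category_Id
        WHERE p.Id = ?
        ORDER BY c.Id
        """,
        (product_id,),
    )
    return [row[0] for row in cursor]


order_lines = line_items_for(conn, 100)
product_categories = categories_for(conn, 1)
```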
Inheritance
Figure 8, Collapsed inheritance
An essential aspect of object technology is inheritance. Inheritance allows data and behavior to be reused and modified by a class's subclasses. Relational database management systems do not support inheritance. Entities cannot inherit from other entities. An object-relational framework can employ several techniques to map inheritance to a relational schema. The various types of inheritance mappings are:
Distributed inheritance map: Class inheritance hierarchies are represented with distributed inheritance maps by mapping each concrete subclass to a separate table. All inherited attributes are replicated in each table. This approach is useful when the superclasses have fewer attributes than the subclasses. Distributed inheritance maps are difficult to implement when heterogeneous collections of objects are retrieved.
Collapsed inheritance map: Inheritance hierarchies can be represented in a single relational table. Each record in the table uses attributes pertinent to one subclass, while the other attributes are kept null. Each is distinguished by a type field which identifies the object represented.
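A collapsed inheritance map can be sketched as a function that reads the type field of a row and instantiates the matching subclass. The class names come from the earlier object model; the column names (Type, Name, License_Key, Weight_Kg) and the type codes are assumptions, and rows are shown as dictionaries to keep the example independent of any database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    name: str


@dataclass
class SoftwareProduct(Product):
    license_key: Optional[str] = None  # hypothetical attribute


@dataclass
class HardwareProduct(Product):
    weight_kg: Optional[float] = None  # hypothetical attribute


def from_collapsed_row(row: dict) -> Product:
    """Build the right subclass from one row of the single PRODUCT table."""
    kind = row["Type"]  # the type field distinguishes the records
    if kind == "SOFTWARE":
        return SoftwareProduct(row["Name"], row["License_Key"])
    if kind == "HARDWARE":
        return HardwareProduct(row["Name"], row["Weight_Kg"])
    raise ValueError(f"unknown product type: {kind!r}")


# Columns that do not apply to a subclass are kept null (None).
rows = [
    {"Type": "SOFTWARE", "Name": "Editor", "License_Key": "ABC-1", "Weight_Kg": None},
    {"Type": "HARDWARE", "Name": "Router", "License_Key": None, "Weight_Kg": 1.2},
]
products = [from_collapsed_row(row) for row in rows]
```

Because every subclass lives in one table, a heterogeneous collection is retrieved with a single query, which is the case that distributed inheritance maps handle poorly.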
System Architectures
Figure 9, System Architectures
A system architecture, among other things, specifies how a system is implemented with respect to platform, network, and other off-the-shelf components. An object-relational framework enables a variety of system architectures. In fact, an object-relational framework used in conjunction with a sound application architecture facilitates system architecture changes.
Figure 9 illustrates a (simplified) spectrum of system architectures. From left to right, there is a progression from fat to thin clients. Conversely, the server becomes progressively more complex (you can't have your cake and eat it too). Some of the common system architectures and configurations are discussed below.
Client A (Traditional Client/Server). Well implemented client/server systems should exhibit all of the characteristics discussed in the application architecture section of this document. The GUI and the business model are coupled through a compact, tightly specified interface. The business model and the data store are coupled together using an object-relational framework. Client/server applications exhibit sophisticated user interfaces with rich functionality. The client/server architecture also lends itself to rapid application development, integrated development environments and fourth generation languages.
Client B (Distributed Architecture). Distributed architectures (also known as n-tier) have high cost and high return. They are inherently complex. System implementers must consider many issues including use of distributed object systems (e.g. CORBA and DCOM), distributed transactions, and integration of heterogeneous systems. The same application architecture principles apply to distributed systems. The primary benefit of distributed architectures is that they are highly scalable and configurable.
Client C (Thin Client). The thin client architecture is typically implemented as a WWW-based application. The users will interact with the application using a web browser such as Navigator, Mosaic, or Internet Explorer. The GUI can be composed of a combination of HTML, Java, and ActiveX controls. The GUI logic and flow is controlled by client side applets, controls, and scripting languages. The critical processing will occur on the server. The server is implemented as a combination of off-the-shelf components and the business object server. The business object server is described in the previous distributed architecture discussion. Object-relational mapping is encapsulated by the business object server. The thin client approach has many benefits which include minimal client hardware requirements, minimal client software requirements, and enablement of automated client software distribution.
Conclusion
Building object-relational systems presents many challenges to IT organizations. To maintain their competitive advantages they must deploy new systems at an accelerating rate. Use of a pre-built object-relational framework allows an organization to leverage reuse and focus on addressing the business problem domain issues. They need not become consumed by the complexity of implementing their own object-relational mapping technology. An object-relational framework encompasses many years of lessons learned by industry professionals. The mistakes have been made and corrected. The performance bottlenecks have been identified and optimized.
The object-relational chasm can be traversed. Systems can be architected without sacrificing performance, productivity, integrity or concurrency. Objects and relational databases can successfully be integrated to leverage the strengths of both technologies. When the technology is used effectively, the resulting architecture is robust and scalable.
-- ※ Source: 月光软件站 http://www.moon-soft.com. [FROM: 202.96.184.41]