Posted by: wjanry()
Compiled by: (2000-03-07 20:29:41), on-site message

Integrating Object and Relational Technologies

Abstract

Object-oriented and object-based technology is the predominant approac
h used to build mainstream business systems. Effective use of object t
echnology has demonstrated the ability to provide organizations with h
igh-quality systems that are extremely understandable, maintainable an
d scaleable.

Relational database management systems are the most prevalent implemen
tation of back-end physical data stores for business applications. The
y are frequently selected for persistent object storage and retrieval
in object-relational systems.

An object-oriented design may resemble a relational design (e.g. the
classes in the object model may be reflected in some fashion as
entities in the relational schema). Usually this occurs only during
the early conceptual phases of system development. However, as the
design and implementation of the system mature, the goals of
object-oriented and relational database development diverge. The
divergence arises because the objective of relational database
development is to normalize data, whereas the goal of object-oriented
design is to model the problem domain as real-world objects. The
result is an object-relational chasm.

The utilization of a scaleable and flexible mapping framework is a cri
tical success factor for object-relational systems. The object-relatio
nal chasm must be spanned. The bridge must be strong and flexible. The
bi-directional flow of data must be fast and reliable.

This paper presents a proven architecture for implementing object-rela
tional systems. It discusses a conceptual application architecture and
details essential object-relational mapping characteristics.

The Need for Object-Relational Integration

Relational Databases

Relational databases are a proven, mature and stable technology. They
are and will continue to be the primary data storage technology for a
variety of reasons.

They are a safe and easy decision. It is a technology that has been us
ed and tested for a number of years and is consequently well-understoo
d.
Current Investment. Organizations have made significant investments in
relational technology. They have purchased systems and software, trai
ned their staff, and deployed mission critical systems.
Entrenched legacy systems. Legacy systems contain critical corporate d
ata which resides in relational databases and must be used by new syst
ems.
Simplicity and elegance. The relational model is a simple model which
has a sound mathematical foundation.
Demonstrated industrial-strength success. Relational systems are used
to implement OLTP systems. These systems are optimized for performance
and transaction processing, and utilize the database's built-in integ
rity constraints.
Tools. A wide variety of tools exist to design, implement and maintain
relational database management systems, including query and
report-writing capabilities.

What Creates the Need for Object-Relational Integration?

Relational Database Design

While object-oriented and relational design share some common
characteristics, fundamental differences make seamless integration a
challenge. The data models and computational models are different
[Bancilhon 3].

The relational model is composed of entities and relations. An entity
may be a physical table or a logical projection of several tables also
known as a view. Figure 1 illustrates LINEITEM and PRODUCT tables and
the various relationships between them. A relational model has the fo
llowing elements:

Figure 1, A Relational Model

An entity has columns. Each column is identified by a name and a type.
In the example, the LINEITEM table has Description, Id, Number, Order
_Id, and Quantity columns.
An entity has records or rows. Each row represents a unique tuple of i
nformation which typically represents an object's persistent data.
Each entity has a primary key made up of one or more columns. The
primary key uniquely identifies each record (e.g. Id is the primary
key for the LINEITEM table).

Support for relations is vendor specific. The example illustrates the
logical model and the relation between the PRODUCT and LINEITEM tables
. In the physical model relations are typically implemented using fore
ign key / primary key references. If one entity relates to another, it
will contain columns which are foreign keys. Foreign key columns cont
ain data which can relate specific records in the entity to the relate
d entity.
Relations have multiplicity (also known as cardinality). Common
cardinalities are one to one (1:1), one to many (1:m), many to one
(m:1), and many to many (m:n). In the example, each LINEITEM relates
to exactly one PRODUCT (1:1), while a PRODUCT relates to zero or more
LINEITEM records (0:m).
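The relational elements above can be sketched with Python's standard
sqlite3 module. This is a minimal illustration, not the schema from
Figure 1: the LINEITEM column names come from the text, while the
column types, the PRODUCT columns, and the Product_Id foreign key
column are assumptions made for the example.

```python
import sqlite3

# In-memory database; all types and the Product_Id column are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    CREATE TABLE PRODUCT (
        Id          INTEGER PRIMARY KEY,
        Description TEXT NOT NULL
    );
    CREATE TABLE LINEITEM (
        Id          INTEGER PRIMARY KEY,
        Number      INTEGER NOT NULL,
        Order_Id    INTEGER NOT NULL,
        Quantity    INTEGER NOT NULL,
        Description TEXT,
        Product_Id  INTEGER NOT NULL REFERENCES PRODUCT(Id)
    );
""")
conn.execute("INSERT INTO PRODUCT VALUES (1, 'Keyboard')")
conn.execute("INSERT INTO LINEITEM VALUES (10, 1, 500, 2, 'Two keyboards', 1)")
conn.execute("INSERT INTO LINEITEM VALUES (11, 2, 501, 1, 'One keyboard', 1)")

# Navigating the relation means joining on the foreign key / primary key pair.
rows = conn.execute("""
    SELECT L.Id, L.Quantity, P.Description
    FROM LINEITEM L JOIN PRODUCT P ON P.Id = L.Product_Id
    ORDER BY L.Id
""").fetchall()
print(rows)
```

Both line items point at the same product row, which is the 0:m side
of the relation seen from PRODUCT.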

Object Model Design

An object model contains classes (this is an oversimplification; for a
detailed description of an object model consult any of the fine texts
available on object-oriented analysis and design). Classes represent
definitional information. In a running system, individual instances of
classes exist as objects. A class contains structure, represented as
attributes, and behavior, represented as methods. The object model
also contains relationships among classes. Figure 2 illustrates a
simple object model. Some of its important aspects are:

Figure 2, An Object Model

The LineItem class has attributes and behavior. LineItem implements th
e behavior (method) calculateTotal.
Relationships in an object model are explicitly represented as attribu
tes. The LineItem class has an attribute product. The product attribut
e contains an instance of the Product class (or one of its subclasses)
. There is no relationship navigation through primary/foreign keys as
required in relational databases.
The object model supports inheritance. A class can inherit data and
behavior from another class (e.g. SoftwareProduct and HardwareProduct
inherit attributes and methods from the Product class).
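The same aspects can be sketched as Python classes. The class names
and the calculateTotal behavior (spelled calculate_total here, in
Python style) come from the text; the price_cents, version and
weight_kg attributes are assumptions added so the example has
something to compute.

```python
from dataclasses import dataclass


@dataclass
class Product:
    description: str
    price_cents: int  # assumed attribute, not named in the paper


@dataclass
class SoftwareProduct(Product):
    version: str = "1.0"  # assumed subclass-specific attribute


@dataclass
class HardwareProduct(Product):
    weight_kg: float = 0.0  # assumed subclass-specific attribute


@dataclass
class LineItem:
    number: int
    quantity: int
    product: Product  # a direct object reference, not a foreign key value

    def calculate_total(self) -> int:
        # Navigation goes through the attribute; no key lookup is involved.
        return self.quantity * self.product.price_cents


item = LineItem(number=1, quantity=3,
                product=SoftwareProduct("Editor", 250, "2.1"))
total = item.calculate_total()
print(total)
```

The product attribute accepts any Product subclass, which is the
polymorphism a relational schema has no direct counterpart for.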

Application Architectures
Software development professionals have crossed the object-relational
chasm. Through analysis and experience, some have come to understand
the problem. They have engineered reusable solutions that allow them
to easily traverse the chasm. To better understand successful
architectures, a review of less optimal solutions is useful. The motto
of most who have crossed is "damn the torpedoes, full steam ahead!"
They have pieced together haphazard solutions which become legacy
applications as soon as they are released.

Relational Legacy Architecture

Figure 3, Relational legacy architecture

The model for these relational legacy applications is illustrated in
Figure 3. The application is composed of two major components: the
graphical user interface and the database. This application is not
object-oriented. Often, much of the application is automatically
generated by a tool. Whether generated or hand coded, these
applications tightly bind the user interface directly to the database.
SQL queries are scattered throughout, and critical business logic
cannot be distinguished from GUI navigation source code. The
architecture can be characterized as extremely brittle. Changes made
to the database take immediate effect and propagate failures
indiscriminately throughout the application.

Classic 3-Tier Architecture

Figure 4, Tightly coupled 3-Tier architecture

Figure 4 illustrates an incrementally better approach. It recognizes
the value of the business object model. The user interface does not
interact directly with the database; instead, it interacts with the
business object model, which is responsible for object persistence.
This additional layer of abstraction slightly improves the quality of
the architecture, but does little to address the brittle nature of the
application. The GUI is tightly coupled to the object model. The
impact of a change to the object model cannot be easily assessed or
controlled. When objects change, the user interface is directly
affected. The same condition exists between the business object model
and the database.

Application developers spend over 30% of their time implementing
relational database access in object-oriented applications. If the
object-relational interface is not correctly implemented, the
investment is lost. Implementing an object-relational framework
captures this investment. The object-relational framework can be
reused in subsequent applications, reducing the object-relational
implementation cost to less than 10% of the total implementation
costs. The most important cost to consider when implementing any
system is maintenance. Over 60% of the total costs of a system over
its entire life-cycle can be attributed to maintenance. A poorly
implemented object-relational system is both a technical and financial
maintenance nightmare.

Component-Based 3-Tier Architecture
Successful object-oriented systems are characterized by a component-ba
sed architecture with clearly defined graphical user interface (GUI),
business object model and physical data store layers. This conceptual
model has been popularized as the 3-tiered application architecture.

Figure 5, 3-Tier Architecture

Visual integrated development environments (IDEs) and object-oriented
analysis and design tools are commonly used for system construction. T
hese highly productive tools allow application developers to rapidly c
onstruct graphical user interface components and bind them to business
objects. Event-based frameworks (e.g. Smalltalk's Model/View/Controll
er or MVC) are often used to couple the graphical user interface to th
e business object model. MVC, and other similar frameworks, provide fa
cilities to loosely couple the GUI with the business object model. The
GUI and Business Object Model are developed as functionally independe
nt components and their common interface is very compact.

The GUI and business object model are developed in a homogeneous envir
onment using object-oriented languages and tools. Consistent design an
d implementation practices are followed. Applications implemented usin
g object-oriented (or even object-aware) physical data stores further
leverage a homogeneous object-oriented environment. The semantics for
persistence are consistent with the systems implementation and object-
relational translation is not required.

The majority of business applications utilize relational technology as
a physical data store. It is important to encapsulate the relational
database in an object-oriented application; otherwise object-relational
translation code will proliferate throughout the system. If the data
store is sufficiently wrapped, the impact on the object model is
minimized and a more robust and scaleable solution is attained.

In summary, a well designed 3-tier architecture has the following char
acteristics:

Clearly defined, loosely coupled tiers/layers.
Internally, each layer is partitioned into a functionally oriented com
ponent.
The interface between the layers is well defined, well documented and
compact.
A reusable framework is implemented to couple the GUI to the business
object model. The framework employs an event-based approach which loos
ely couples the business object model to the GUI.
A reusable framework is implemented to couple the business object mode
l to the physical data store. The framework provides facilities which
utilize object-relational mapping information specified during develop
ment to automatically translate objects into requests which can be ser
ved by relational database servers.

The subsequent sections of this document will further explore the
interface and integration issues between the object model and the
physical data store.

The Object-Relational Framework

Many solutions exist which let applications directly access relational
data. Solutions even exist for non-object-oriented languages, but the
interface is typically reflected as an application programming
interface (API). The database APIs come in standard flavors (e.g.
ODBC) and proprietary flavors (native bindings to specific databases).
The APIs provide data manipulation language (DML) pass-through
services which allow applications to access raw relational data. In
object-oriented applications, the data must undergo object-relational
translation prior to being used by the application. This requires a
considerable amount of application code to translate raw database API
results into application objects. The role of the object-relational
framework is to generically encapsulate the physical data store and to
provide appropriate object translation services.

Some of the essential characteristics of an object-relational framewor
k are:

Performance: Close consideration must be given towards decomposing obj
ects into data and composing objects from data. In systems where data
through-put is high and critical, this is often an Achilles heel of an
inadequately designed access layer.
Minimize design compromises: A familiar pattern to object
technologists who have built systems which utilize relational
databases is to adjust the object model to facilitate storage into
relational systems, and to alter the relational model for easier
storage of objects. While minor adjustments are often needed, a
well-designed access layer minimizes both object and relational model
design degradation.
Extensibility: The access layer is a white-box framework which allows
application developers to extend the framework if certain
functionality is desired in the framework. Typically, an access layer
will support without extension 65-85% of an application's data storage
requirements. If the access layer is not designed as an extensible
framework, achieving the remaining 15-35% of an application's data
storage requirements can be very difficult and costly.
Documentation: The access layer is both a black-box component and a
white-box framework. The API of the black-box component must be
clearly defined, well documented, and easily understood. As previously
mentioned, the access layer is designed to be extended. An extensible
framework must be very thoroughly documented. Classes which are
intended to be subclassed must be identified. The characteristics of
each relevant class's protocol must be specified (e.g. public,
private, protected, final, …). Moreover, a substantial portion of the
access layer framework's design must be exposed and documented to
facilitate extensibility.
Support for common object-relational mappings: An access layer should
provide support for some basic object-relational mappings without exte
nsion. These object-relational mappings are discussed further in a sub
sequent section of this document.
Persistence Interfaces: In an object-oriented application, the
business object model captures semantic knowledge of the problem
domain. Developers should manipulate and interact with objects without
having to worry too much about data storage and retrieval details. A
well-defined subset of persistence interfaces (save, delete, find)
should be provided to application developers.

Common Object-Relational Services
Common patterns are emerging for object-relational applications. IT pr
ofessionals who have repeatedly crossed the chasm are beginning to und
erstand and recognize certain structures and behaviors which successfu
l object-relational applications exhibit. These structures and behavio
rs have been formalized by the high-level CORBA Services specification
s.

The CORBA service specifications which are applicable and useful to co
nsider for object-relational mapping are:

Persistence
Query
Transactions
Concurrency
Relationships

The following sections will use these categories to structure a
discussion of common object-relational services. The reader is
encouraged to reference the appropriate CORBA specifications for
further details.

Persistence
Persistence is a term used to describe how objects utilize a secondary
storage medium to maintain their state across discrete sessions.
Persistence provides the ability for a user to save objects in one
session and access them in a later session. When they are subsequently
accessed, their state (e.g. attributes) will be exactly the same as it
was in the previous session. In multi-user systems, this may not be
the case since other users may access and modify the same objects.
Persistence is interrelated with other services discussed in this
section. The separate consideration of relationships, concurrency and
the other services is intentional (and consistent with CORBA's
decomposition of the services).

Examples of specific services provided by persistence are:

Data source connection management: Object-relational applications must
initiate a connection to the physical data source. Relational database
systems typically require identification of the server and database.
The specifics of connection management tend to be database vendor
specific, and the framework must accordingly be designed in a flexible,
accommodating manner.
Object retrieval: When objects are restored from the database, data is
retrieved from the database and translated into objects. This process
involves extracting data from database specific structures retrieved
from the data source, marshaling the data from database types into the
appropriate object types and/or classes, creation of the appropriate
object, and setting the specific object attributes.
Object storage: The process of object storage mirrors object retrieval
. The values of the appropriate attributes are extracted from the obje
ct, a database specific structure is created with the attribute values
(this may be a SQL string, stored procedure, or special remote proced
ure call), and the structure is submitted to the database.
Object deletion: Objects that are deleted from within a system must
have their associated data deleted from the relational database.
Object deletion requires that appropriate information be extracted
from the object, a deletion request be constructed (this may be a SQL
string, stored procedure, or special remote procedure call), and the
request be submitted to the database.
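The storage, retrieval and deletion steps listed above can be sketched
as a small access-layer class. This is a minimal illustration using
Python's sqlite3 module; ProductStore and its method names are
hypothetical and are not taken from any particular framework.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Product:
    id: int
    description: str


class ProductStore:
    """Hypothetical access-layer class for one business class."""

    def __init__(self, conn):
        # Connection management is left to the caller in this sketch.
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS PRODUCT "
                     "(Id INTEGER PRIMARY KEY, Description TEXT)")

    def save(self, product):
        # Object storage: extract attribute values, build a request, submit it.
        self.conn.execute(
            "INSERT OR REPLACE INTO PRODUCT (Id, Description) VALUES (?, ?)",
            (product.id, product.description))

    def find_unique(self, product_id):
        # Object retrieval: fetch the row, then create and populate the object.
        row = self.conn.execute(
            "SELECT Id, Description FROM PRODUCT WHERE Id = ?",
            (product_id,)).fetchone()
        return None if row is None else Product(id=row[0], description=row[1])

    def delete(self, product):
        # Object deletion: extract the identifying value, submit the request.
        self.conn.execute("DELETE FROM PRODUCT WHERE Id = ?", (product.id,))


store = ProductStore(sqlite3.connect(":memory:"))
store.save(Product(1, "Keyboard"))
loaded = store.find_unique(1)
store.delete(loaded)
after_delete = store.find_unique(1)
```

Application code only sees save, find_unique and delete; the SQL stays
inside the store.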

Query
Persistent object storage is of little use without a mechanism to
search for and retrieve specific objects. Query facilities allow
applications to interrogate and retrieve objects based on a variety of
criteria. The basic query operations provided by an object-relational
mapping framework are find and find unique. The find unique operation
retrieves a specific object, and find returns a collection of objects
based on query criteria.

Data store query facilities vary significantly. Simple file-based data
stores may implement rigid home-grown query operations, while relatio
nal systems provide a flexible data manipulation language. Object-rela
tional mapping frameworks extend the relational query model to make it
object-centric rather than data centric. Pass-through mechanisms are
also implemented to leverage relational query flexibility and vendor-s
pecific extensions (e.g. stored-procedures).
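An object-centric find and find unique can be sketched as follows. The
criteria are expressed in terms of attribute names and translated to
column names through a mapping; the COLUMNS dictionary, the function
signatures and the table layout are assumptions made for this example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class LineItem:
    id: int
    order_id: int
    quantity: int


# Attribute name -> column name (hypothetical mapping information).
COLUMNS = {"id": "Id", "order_id": "Order_Id", "quantity": "Quantity"}


def find(conn, **criteria):
    """Return all LineItem objects whose attributes equal the given values."""
    # Only names present in COLUMNS reach the SQL text; values are bound.
    where = " AND ".join(f"{COLUMNS[attr]} = ?" for attr in criteria)
    sql = "SELECT Id, Order_Id, Quantity FROM LINEITEM"
    if where:
        sql += " WHERE " + where
    sql += " ORDER BY Id"
    rows = conn.execute(sql, tuple(criteria.values())).fetchall()
    return [LineItem(*row) for row in rows]


def find_unique(conn, **criteria):
    """Return the single matching object, or None if nothing matches."""
    matches = find(conn, **criteria)
    if len(matches) > 1:
        raise LookupError("criteria matched more than one object")
    return matches[0] if matches else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LINEITEM "
             "(Id INTEGER PRIMARY KEY, Order_Id INTEGER, Quantity INTEGER)")
conn.executemany("INSERT INTO LINEITEM VALUES (?, ?, ?)",
                 [(10, 500, 2), (11, 500, 1), (12, 501, 4)])
order_500 = find(conn, order_id=500)
item_12 = find_unique(conn, id=12)
```

A pass-through mechanism would sit beside these functions and accept
raw SQL for cases the attribute-based criteria cannot express.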

Transactions
Object-relational mapping frameworks which are used in
industrial-strength applications must provide transaction support.
Transactional support enables the application developer to define an
atomic unit of work. The operations within a transaction either all
execute successfully or the transaction fails as a whole. At a
minimum, object-relational frameworks should provide a relational
database-like commit/rollback transaction facility. Designing
object-relational frameworks for a multi-user environment presents
many challenges and deserves careful thought.
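A commit/rollback unit of work can be sketched with sqlite3, whose
connection object acts as a context manager that commits when the
block succeeds and rolls back when an exception escapes it. The
save_all function and the table layout are assumptions for the
example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LINEITEM "
             "(Id INTEGER PRIMARY KEY, Quantity INTEGER NOT NULL)")
conn.commit()


def save_all(conn, items):
    """Store every (id, quantity) pair, or none of them."""
    try:
        with conn:  # commit on success, rollback if an exception escapes
            for item_id, quantity in items:
                conn.execute("INSERT INTO LINEITEM VALUES (?, ?)",
                             (item_id, quantity))
        return True
    except sqlite3.IntegrityError:
        return False


ok = save_all(conn, [(1, 5), (2, 3)])
# Id 1 already exists, so the whole unit fails and Id 3 is not kept.
failed = save_all(conn, [(3, 7), (1, 9)])
count = conn.execute("SELECT COUNT(*) FROM LINEITEM").fetchone()[0]
```

After the second call the table still holds only the two rows from the
first unit of work.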

Concurrency
Multi-user object-oriented systems must control concurrent access to
objects. When an object is accessed simultaneously by many users, the
system must provide a mechanism to ensure that modifications to the
object in the persistent store occur in a predictable and controlled
manner. Object-relational frameworks may implement pessimistic and/or
optimistic concurrency controls.

Pessimistic concurrency control requires that the application develope
r specify their intent when the object is retrieved from the data stor
e (e.g. read only, write lock, …). If objects are locked, other users
may block when accessing the object and wait for the lock to be relin
quished. Pessimistic concurrency should be used and implemented with c
aution as it is possible to create dead-lock situations.

Optimistic concurrency assumes that it is unlikely that the same
object will be simultaneously accessed. Concurrency issues are
detected when the modifications are saved to the database. Typically,
if the object has been modified by another user since its retrieval,
an error will be returned to the application indicating failure of the
modify operation. It is the application's responsibility to detect and
handle the error. This calls for the framework to cache the retrieved
values of objects and compare them against the database at save time.
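The optimistic check can be sketched by caching the value read at
retrieval and repeating it in the WHERE clause of the update. If no
row is updated, someone else changed the record first. The dictionary
representation, the StaleObjectError name and the helper functions are
assumptions for this example; a version or timestamp column is a
common alternative to comparing values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LINEITEM "
             "(Id INTEGER PRIMARY KEY, Quantity INTEGER NOT NULL)")
conn.execute("INSERT INTO LINEITEM VALUES (1, 5)")
conn.commit()


class StaleObjectError(Exception):
    """Raised when the row changed after it was read (hypothetical name)."""


def load(conn, item_id):
    quantity = conn.execute("SELECT Quantity FROM LINEITEM WHERE Id = ?",
                            (item_id,)).fetchone()[0]
    # Keep the value as read so it can be compared at save time.
    return {"id": item_id, "quantity": quantity, "_original": quantity}


def save(conn, item):
    cur = conn.execute(
        "UPDATE LINEITEM SET Quantity = ? WHERE Id = ? AND Quantity = ?",
        (item["quantity"], item["id"], item["_original"]))
    if cur.rowcount == 0:  # nothing matched: the cached value is stale
        conn.rollback()
        raise StaleObjectError(f"LINEITEM {item['id']} was modified")
    conn.commit()
    item["_original"] = item["quantity"]


first = load(conn, 1)
second = load(conn, 1)  # a second user reads the same object
first["quantity"] = 6
save(conn, first)
second["quantity"] = 9
try:
    save(conn, second)
    conflict = False
except StaleObjectError:
    conflict = True
final_quantity = conn.execute(
    "SELECT Quantity FROM LINEITEM WHERE Id = 1").fetchone()[0]
```

The second save is rejected and the application decides whether to
reload, merge or report the conflict.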

Relationships
Objects have relationships to other objects. An Order object has many
LineItem objects. A Book object has many Chapter objects. An Employee
object belongs to exactly one Company object. In relational systems,
relations between entities are implemented using foreign key / primary
key references. In object-oriented systems, relations are usually
explicitly implemented through attributes. If an Order object has
LineItems, then Order will contain an attribute named lineItems. The
lineItems attribute of Order will contain many LineItem objects.

The relationship aspects of an object-relational framework are interde
pendent with the persistence, transaction, and query services. When an
object is stored, retrieved, transacted, or queried consideration mus
t be given to its related objects.

While it is conceptually advantageous to consider common object-relati
onal services separately, their object-relational framework implementa
tions will be codependent.

Common Object-Relational Translations
Common object-relational translation patterns exist which span the obj
ect-relational chasm. The translations account for common object model
designs, common relational schemas, and approaches for bi-directional
translation. The subsequent sections discuss the common object-relati
onal translations.

Simple Class/Table Translations
Tables are used to store an object's persistent attributes. The table
which contains an object's data is said to be mapped to the object's c
lass. Following are examples of basic class to table mappings:

Figure 6, One class maps to two tables

Figure 7, Two classes map to a single table

One class maps to one or more tables. A single class can map to severa
l tables. When an object is restored from the database, the object-rel
ational framework must compose the object by building a join of the ta
bles. The decision to map a class to more than one table should be mad
e judiciously, as it has performance implications for both the object-
relational framework and the relational database system. Additionally,
updates may be impossible.

One table maps to one or more classes. This is also referred to as an
embedded class. Embedded class maps improve performance of object
persistence at the cost of extensibility and violate the principles of
normalization. The embedded class is dependent on the parent class.
Its unique identity is established via its relationship to the parent
class.

The attributes of a class can map to one or more columns in a table.
Sometimes a combination of more than one column is used to calculate
the value of the attribute. In general, for each persistent business
class the following information must be specified:

Mapped Tables. The most basic information that the object-relational f
ramework must maintain is information pertaining to which tables a cla
ss is mapped.
Primitive Attributes. The term primitive attribute is used to denote a
n attribute of a class which maps to a column of a table. The primitiv
e attribute will represent scalar data types.
Reference Attributes. Reference attributes represent relationships to
other classes. If the Order class has a one to many relationship with
the LineItem class, then the Order class can implement an attribute
lineItems which can be used to manage the collection of related
LineItem objects.
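The three kinds of mapping information can be held in a small metadata
structure that the framework consults when it builds requests. The
sketch below is one possible shape; ClassMap, ReferenceAttribute, the
ORDERS table name and the Customer column are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceAttribute:
    name: str          # attribute on the owning class
    target_class: str  # class of the related objects
    foreign_key: str   # column in the related table pointing at the owner
    many: bool = True


@dataclass
class ClassMap:
    class_name: str
    tables: list                                      # mapped tables
    primitives: dict = field(default_factory=dict)    # attribute -> column
    references: list = field(default_factory=list)    # reference attributes

    def select_sql(self):
        # Build the column list from the primitive attribute mappings.
        columns = ", ".join(f"{col} AS {attr}"
                            for attr, col in self.primitives.items())
        return f"SELECT {columns} FROM {', '.join(self.tables)}"


order_map = ClassMap(
    class_name="Order",
    tables=["ORDERS"],
    primitives={"id": "ORDERS.Id", "customer": "ORDERS.Customer"},
    references=[ReferenceAttribute("lineItems", "LineItem",
                                   "LINEITEM.Order_Id")],
)
sql = order_map.select_sql()
print(sql)
```

Because the mapping is data rather than code, a change of column name
touches the map and not the business classes.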

Relationships
The classes in an object model can be related to one another through
aggregation or association. The cardinality of the relationship can be
[1,1], [1,m], [m,1] or [m,n]. Tables are related to one another using
the same cardinalities. As previously stated, the relationships
between objects are explicitly implemented through reference
attributes, and tables are related to each other using foreign key
associations. An object-relational framework maps relationships
between objects using the foreign keys of the corresponding tables.
The approach can have a significant impact on the performance and
flexibility of the application. Examples of common relationship maps
are:

1:1 relationship map: A one to one relationship between classes can be
represented in a variety of ways within the database. From a
performance perspective, the optimal approach is to implement a 1:1
mapping in a single table using an embedded class map since there are
fewer tables to navigate. From a database design perspective, it is
more desirable to implement a separate table with foreign keys to two
other tables which contain data for objects on each side of the
relationship (also known as a lookup table). This is a costly approach
since three tables must be joined to traverse the relationship.
Perhaps the most prevalent technique is to represent the 1:1
relationship using two tables where one table has a foreign key
reference to the second.
1:m relationship map: A 1:m (one to many) relationship is typically
represented in one of two ways. A lookup table approach similar to the
1:1 relationship map can be used. The lookup table contains foreign
key references to tables which map to the classes on each side of the
relationship. Relationship traversal requires that the three tables be
joined. The 1:m relationship can also be represented using two tables.
Here, the table on the many side of the relationship will have a
foreign key reference to the table on the other side. Traversing the
relationship is simpler and quicker since only two tables are
involved.
m:n relationship map: The m:n (many to many) relationship is represent
ed using a lookup table. The lookup table contains foreign key referen
ces to tables which map to the classes on each side of the relationshi
p. Relationship traversal requires that the three tables be joined.
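The two-table 1:m map and the lookup-table m:n map can be compared
side by side in a short sketch. The ORDERS, SUPPLIER and
PRODUCT_SUPPLIER tables are hypothetical and exist only to give the
example an m:n relationship to traverse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDERS   (Id INTEGER PRIMARY KEY);
    CREATE TABLE LINEITEM (Id INTEGER PRIMARY KEY,
                           Order_Id INTEGER REFERENCES ORDERS(Id));
    CREATE TABLE PRODUCT  (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE SUPPLIER (Id INTEGER PRIMARY KEY, Name TEXT);
    -- Lookup table for the m:n relationship (hypothetical).
    CREATE TABLE PRODUCT_SUPPLIER (
        Product_Id  INTEGER REFERENCES PRODUCT(Id),
        Supplier_Id INTEGER REFERENCES SUPPLIER(Id),
        PRIMARY KEY (Product_Id, Supplier_Id));
    INSERT INTO ORDERS VALUES (500);
    INSERT INTO LINEITEM VALUES (10, 500), (11, 500);
    INSERT INTO PRODUCT VALUES (1, 'Keyboard');
    INSERT INTO SUPPLIER VALUES (7, 'Acme'), (8, 'Globex');
    INSERT INTO PRODUCT_SUPPLIER VALUES (1, 7), (1, 8);
""")


class Order:
    def __init__(self, order_id, line_item_ids):
        self.id = order_id
        self.lineItems = line_item_ids  # reference attribute (ids only here)


def load_order(conn, order_id):
    # 1:m with two tables: the many side carries the foreign key.
    rows = conn.execute(
        "SELECT Id FROM LINEITEM WHERE Order_Id = ? ORDER BY Id",
        (order_id,)).fetchall()
    return Order(order_id, [r[0] for r in rows])


def suppliers_of(conn, product_id):
    # m:n: three tables are joined through the lookup table.
    rows = conn.execute("""
        SELECT S.Name FROM PRODUCT P
        JOIN PRODUCT_SUPPLIER PS ON PS.Product_Id = P.Id
        JOIN SUPPLIER S ON S.Id = PS.Supplier_Id
        WHERE P.Id = ? ORDER BY S.Name""", (product_id,)).fetchall()
    return [r[0] for r in rows]


order = load_order(conn, 500)
names = suppliers_of(conn, 1)
```

The 1:m traversal touches one table beyond the owner, while the m:n
traversal needs the three-way join the text describes.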

Inheritance

Figure 8, Collapsed inheritance

An essential aspect of object technology is inheritance. Inheritance a
llows data and behavior to be reused and modified by a class's subclas
ses. Relational database management systems do not support inheritance
. Entities cannot inherit from other entities. An object-relational fr
amework can employ several techniques to map inheritance to a relation
al schema. The various types of inheritance mappings are:

Distributed inheritance map: Class inheritance hierarchies are represe
nted with distributed inheritance maps by mapping each concrete subcla
ss to a separate table. All inherited attributes are replicated in eac
h table. This approach is useful when the superclasses have fewer attr
ibutes than the subclasses. Distributed inheritance maps are difficult
to implement when heterogeneous collections of objects are retrieved.

Collapsed inheritance map: Inheritance hierarchies can be represented
in a single relational table. Each record in the table uses the
attributes pertinent to one subclass, while the other attributes are
kept null. Each record is distinguished by a type field which
identifies the class of the object represented.
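A collapsed inheritance map can be sketched as a single PRODUCT table
with a type column that selects the class to instantiate. The class
names come from the text; the Type codes "SW" and "HW" and the Version
and Weight columns are assumptions for the example.

```python
import sqlite3


class Product:
    def __init__(self, id, description):
        self.id, self.description = id, description


class SoftwareProduct(Product):
    def __init__(self, id, description, version):
        super().__init__(id, description)
        self.version = version


class HardwareProduct(Product):
    def __init__(self, id, description, weight):
        super().__init__(id, description)
        self.weight = weight


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE PRODUCT (
    Id INTEGER PRIMARY KEY, Type TEXT NOT NULL, Description TEXT,
    Version TEXT,   -- used only by SoftwareProduct rows
    Weight  REAL    -- used only by HardwareProduct rows
)""")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?, ?)", [
    (1, "SW", "Editor", "2.1", None),
    (2, "HW", "Keyboard", None, 0.8),
])


def load_products(conn):
    """Build a heterogeneous collection from the single table."""
    products = []
    for id_, type_, desc, version, weight in conn.execute(
            "SELECT Id, Type, Description, Version, Weight "
            "FROM PRODUCT ORDER BY Id"):
        if type_ == "SW":
            products.append(SoftwareProduct(id_, desc, version))
        elif type_ == "HW":
            products.append(HardwareProduct(id_, desc, weight))
        else:
            products.append(Product(id_, desc))
    return products


products = load_products(conn)
```

One query returns objects of several classes, which is the case the
text notes is difficult for distributed inheritance maps.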

System Architectures

Figure 9, System Architectures

A system architecture, among other things, specifies how a system is i
mplemented with respect to platform, network, and other off-the-shelf
components. An object-relational framework enables a variety of system
architectures. In fact, an object-relational framework used in conjun
ction with a sound application architecture facilitates system archite
cture changes.

Figure 9 illustrates a (simplified) spectrum of system architectures.
From left to right, there is a progression from fat to thin clients. C
onversely, the server becomes progressively more complex (you can't ha
ve your cake and eat it too). Some of the common system architectures
and configurations are discussed below.

Client A (Traditional Client/Server). Well implemented client/server
systems should exhibit all of the characteristics discussed in the
application architecture section of this document. The GUI and the
business model are coupled through a compact, tightly specified
interface. The business model and the data store are coupled together
using an object-relational framework. Client/server applications
exhibit sophisticated user interfaces with rich functionality. The
client/server architecture also lends itself to rapid application
development with integrated development environments and fourth
generation languages.
Client B (Distributed Architecture). Distributed architectures (also k
nown as n-tier) have high cost and high return. They are inherently co
mplex. System implementers must consider many issues including use of
distributed object systems (e.g. CORBA and DCOM), distributed transact
ions, and integration of heterogeneous systems. The same application a
rchitecture principles apply to distributed systems. The primary benef
it of distributed architectures is that they are highly scaleable and
configurable.
Client C (Thin Client). The thin client architecture is typically
implemented as a WWW-based application. The users will interact with
the application using a web browser such as Navigator, Mosaic, or
Internet Explorer. The GUI can be composed of a combination of HTML,
Java, and ActiveX controls. The GUI logic and flow are controlled by
client side applets, controls, and scripting languages. The critical
processing will occur on the server. The server is implemented as a
combination of off-the-shelf components and the business object
server. The business object server is described in the previous
distributed architecture discussion. Object-relational mapping is
encapsulated by the business object server. The thin client approach
has many benefits which include minimal client hardware requirements,
minimal client software requirements, and enablement of automated
client software distribution.

Conclusion
Building object-relational systems presents many challenges to IT
organizations. To maintain their competitive advantages they must
deploy new systems at an accelerating rate. Use of a pre-built
object-relational framework allows an organization to leverage reuse
and focus on addressing the business problem domain issues. They need
not become consumed by the complexity of implementing their own
object-relational mapping technology. An object-relational framework
encompasses many years of lessons learned by industry professionals.
The mistakes have been made and corrected. The performance bottlenecks
have been identified and optimized.

The object-relational chasm can be traversed. Systems can be
architected without sacrificing performance, productivity, integrity
or concurrency. Objects and relational databases can successfully be
integrated to leverage the strengths of both technologies. When the
technology is used effectively, the resulting architecture is robust
and scaleable.

--
※ Source: 月光软件站 http://www.moon-soft.com [FROM: 202.96.184.41]
