Components and Objects Together

Posted by: wjanry()
Compiled by: majorsun (2000-03-08 14:00:47), on-site message

Components and Objects Together May 1999

The differences between the emerging component-based development and long-standing object-oriented development are often unclear. Find out how separate these concepts really are.

by Clemens Szyperski

Components are on the upswing; objects have been around for some time. It is understandable, but not helpful, to see object-oriented programming sold in new clothes by simply calling objects “components.” The emerging component-based approaches and tools combine objects and components in ways that show they are really separate concepts. In this article, I will examine some key differences between objects and components to clarify these muddy waters. In particular, you’ll see that approaches based on visual assembly tools really assemble objects, not components, but they create components when saving the finished assembly.

Why Components?

What is the rationale behind component software? Or rather, what is it that components should be? Traditionally, closed solutions with proprietary interfaces addressed most customers’ needs. Heavyweights such as operating systems and database engines are among the few examples of components that did reach high levels of maturity. Large software systems manufacturers often configure delivered solutions by combining modules in a client-specific way. However, the interfaces between such modules tend to be proprietary, at most open to highly specialized independent software vendors (ISVs) that specifically produce further modules for such systems. In many cases, these modules are fused together during a linking step and are no longer distinguishable in deployed solutions.

Attempts to create low-level connection standards or wiring standards are either product-driven or standard-driven. The Microsoft standards, resting on COM, have always been product-driven and are thus incremental, evolutionary, and, to a degree, legacy-laden by nature.

Standard-driven approaches usually originate in industry consortia. The prime example here is the effort of the Object Management Group (OMG). However, the OMG hasn’t contributed much in the component world and is now falling back on JavaSoft’s Enterprise JavaBeans standards for components, although it’s attempting a CORBA Beans generalization. The EJB standard still has a long way to go; so far it is not implementation language-neutral, and bridging standards to Java-external services and components are only emerging.

At first, it might surprise you that component software is largely pushed by desktop- and Internet-based solutions. On second thought, this should not surprise you at all. Component software is a complex technology to master, and viable, component-based solutions will only evolve if the benefits are clear. Traditional enterprise computing has many benefits, but these benefits all depend on enterprises willing to evolve substantially.

In the desktop and Internet worlds, the situation is different. Centralized control over what information is processed when and where is not an option in these worlds. Instead, content (such as web pages or documents) arrives at a user’s machine and needs to be processed there and then. With a rapidly exploding variety of content types, and open coding standards such as XML, monolithic applications have long reached their limits. Beyond the flexibility of component software is its capability to dynamically grow to address changing needs.

What a Component Is and Is Not

The separate existence and mobility of components, as witnessed by Java applets or ActiveX components, can make components look similar to objects. People often use the words “component” and “object” interchangeably. In addition, they use constructions such as “component object.” Objects are said to be instances of classes or clones of prototype objects. Objects and components both make their services available through interfaces. Language designers add further irritation by discussing namespaces, modules, packages, and so on. I will try to unfold, explain, and justify these terms. Next, I’ll browse the key terms with brief explanations, relating them to each other. Based on this, I’ll then look at a refined component definition. Finally, I’ll shed some light on the fine line between component-based programming and component assembly.

Terms and Concepts

Components. A component’s characteristic properties are that it is a unit of independent deployment; it is a unit of third-party composition; and it has no persistent state.

These properties have several implications. For a component to be independently deployable, it needs to be well-separated from its environment and from other components. A component therefore encapsulates its constituent features. Also, since it is a unit of deployment, you never partially deploy a component.

If a third party needs to compose a component with other components, the component must be self-contained. (A third party is one that you cannot expect to access the construction details of all the components involved.) Also, the component needs to come with clear specifications of what it provides and what it requires. In other words, a component needs to encapsulate its implementation and interact with its environment through well-defined interfaces and platform assumptions only. It’s also generally useful to minimize hard-wired dependencies in favor of externally configurable providers.

Finally, you cannot distinguish a component without any persistent state from copies of itself. (Exceptions to this rule are attributes not contributing to the component’s functionality, such as serial numbers used for accounting.) Without state, a component can be loaded into and activated in a particular system, but in any given process, there will be at most one copy of a particular component. So, while it is useful to ask whether a particular component is available or not, it isn’t useful to ask about the number of copies of that component. (Note that a component may simultaneously exist in different versions. However, these are not copies of a component, but rather different components related to each other by a versioning scheme.)

In many current approaches, components are heavyweights. For example, a database server could be a component. If there is only one database maintained by this class of server, then it is easy to confuse the instance with the concept. For example, you might see the database server together with the database as a component with persistent state. According to the definition described previously, this instance of the database concept is not a component. Instead, the static database server program is a component, and it supports a single instance: the database object. This separation of the immutable plan from the mutable instances is the key to avoiding massive maintenance problems. If components could be mutable, that is, have state, then no two installations of the same component would have the same properties. The differentiation of components and objects is thus fundamentally about differentiating between static properties that hold for a particular configuration and dynamic properties of any particular computational scenario. Drawing this line carefully is essential to curbing manageability, configurability, and version control problems.

Objects. The notions of instantiation, identity, and encapsulation lead to the notion of objects. In contrast to the properties characterizing components, an object’s characteristic properties are that it is a unit of instantiation (it has a unique identity); it has state that can be persistent; and it encapsulates its state and behavior.

Again, several object properties follow directly. Since an object is a unit of instantiation, it cannot be partially instantiated. Since an object has individual state, it also needs a unique identity to identify the object, despite state changes, for its lifetime. Consider the apocryphal story about George Washington’s axe, which had five new handles and four new axe-heads, but was still George Washington’s axe. This is typical of real-life objects: nothing but their abstract identity remains stable over time.
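
The axe story can be restated in a few lines of Java. This is only a sketch: the class Axe and its fields are hypothetical names chosen for illustration, and Java's reference comparison (==) stands in for the abstract notion of object identity.

```java
// Hypothetical example class; Axe and its fields are illustrative only.
final class Axe {
    private String handle;
    private String head;

    Axe(String handle, String head) {
        this.handle = handle;
        this.head = head;
    }

    void replaceHandle(String newHandle) { this.handle = newHandle; }
    void replaceHead(String newHead) { this.head = newHead; }

    String describe() { return handle + " handle, " + head + " head"; }

    public static void main(String[] args) {
        Axe washingtons = new Axe("hickory", "iron");
        Axe sameAxe = washingtons;                   // second reference, same object
        Axe lookalike = new Axe("hickory", "iron");  // equal state, different identity

        washingtons.replaceHandle("ash");
        washingtons.replaceHead("steel");

        // Identity survives every state change ...
        System.out.println(sameAxe == washingtons);   // true
        // ... and equal initial state never implied shared identity.
        System.out.println(lookalike == washingtons); // false
        System.out.println(sameAxe.describe());       // ash handle, steel head
    }
}
```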

Since objects get instantiated, you need a construction plan that describes the new object’s state space, initial state, and behavior before the object can exist. Such a plan may be explicitly available and is then called a class. Alternatively, it may be implicitly available in the form of an object that already exists, that is close to the object to be created, and that can be cloned. Such a preexisting object is called a prototype object.

Whether using classes or prototype objects, the newly instantiated object needs to be set to an initial state. The initial state needs to be a valid state of the constructed object, but it may also depend on parameters specified by the client asking for the new object. The code that is required to control object creation and initialization could be a static procedure, usually called a constructor. Alternatively, it can be an object of its own, usually called an object factory, or factory for short.
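
The three creation styles just described (constructor, factory object, prototype cloning) can be put side by side in a minimal Java sketch. The Shape class is hypothetical, and java.util.function.Supplier is used here merely as one convenient way to represent a factory object.

```java
import java.util.function.Supplier;

// Hypothetical Shape class used only to contrast the three creation styles.
final class Shape implements Cloneable {
    private final String kind;
    private int size;

    // Constructor: a static procedure that establishes a valid initial state.
    Shape(String kind, int size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        this.kind = kind;
        this.size = size;
    }

    String kind() { return kind; }
    int size() { return size; }
    void resize(int newSize) { this.size = newSize; }

    // Prototype style: a new object is derived from an existing one.
    @Override
    public Shape clone() {
        try {
            return (Shape) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Shape is Cloneable
        }
    }

    public static void main(String[] args) {
        Shape fromClass = new Shape("circle", 10);

        // Factory: the creation logic is packaged as an object of its own.
        Supplier<Shape> unitSquares = () -> new Shape("square", 1);
        Shape fromFactory = unitSquares.get();

        Shape prototype = new Shape("triangle", 3);
        Shape fromPrototype = prototype.clone();
        fromPrototype.resize(7); // the clone has its own state

        System.out.println(fromClass.kind() + " " + fromFactory.kind() + " "
                + prototype.size() + " " + fromPrototype.size()); // circle square 3 7
    }
}
```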

Object References and Persistent Objects

The object’s identity is usually captured by an object reference. Most programming languages do not explicitly support object references; language-level references hold unique references to objects (usually their addresses in memory), but there is no direct high-level support to manipulate the reference as such. (Languages like C provide low-level address manipulation facilities.) Distinguishing between an object (a triple of identity, state, and implementing class) and an object reference (just holding the identity) is important when considering persistence. As I’ll describe later, almost all so-called persistence schemes just preserve an object’s state and class, but not its absolute identity. An exception is CORBA, which defines interoperable object references (IORs) as stable entities (which are really objects). Storing an IOR makes the pure object identity persist.

Components and Objects

Typically, a component comes to life through objects and therefore would normally contain one or more classes or immutable prototype objects. In addition, it might contain a set of immutable objects that capture default initial state and other component resources. However, there is no need for a component to contain only classes or any classes at all. A component could contain traditional procedures and even have global (static) variables; or it may be realized in its entirety using a functional programming approach, an assembly language, or any other approach. Objects created in a component, or references to such objects, can become visible to the component’s clients, usually other components. If only objects become visible to clients, there is no way to tell whether or not a component is purely object-oriented inside.

A component may contain multiple classes, but a class is necessarily confined to a single component; partial deployment of a class wouldn’t normally make sense. Just as classes can depend on other classes (inheritance), components can depend on other components (import). The superclasses of a class do not necessarily need to reside in the same component as the class. Where a class has a superclass in another component, the inheritance relation crosses component boundaries. Whether or not inheritance across components is a good thing is the focus of heated debate. The theoretical reasoning behind this clash is interesting and close to the essence of component orientation, but it’s beyond the scope of this article.

Modules

Components are rather close to modules, as introduced by modular languages in the early 1980s. The most popular modular languages are Modula-2 and Ada. In Ada, modules are called packages, but the concepts are almost identical. An important hallmark of modular approaches is the support of separate compilation, including the ability to properly type-check across module boundaries.

With the introduction of the Eiffel language, the claim was that a class is a better module. This seemed justified based on the early ideas that modules would each implement one abstract data type (ADT). After all, you can look at a class as implementing an ADT, with the additional properties of inheritance and polymorphism. However, modules can be used, and always have been used, to package multiple entities, such as ADTs or indeed classes, into one unit. Also, modules do not have a concept of instantiation, while classes do. (In module-less languages, this leads to the construction of static classes that essentially serve as simple modules.)

Recent language designs, such as Oberon, Modula-3, and Component Pascal, keep the modules and classes separate. (In Java, a package is somewhat weaker than a module and mostly serves namespace control purposes.) Also, a module can contain multiple classes. Where classes inherit from each other, they can do so across module boundaries. You can see modules as minimal components. Even modules that do not contain any classes can function as components.

Nevertheless, module concepts don’t normally support one aspect of full-fledged components. There are no persistent immutable resources that come with a module, beyond what has been hardwired as constants in the code. Resources parameterize a component. Replacing these resources lets you version a component without needing to recompile; localization is an example. Modification of resources may look like a form of a mutable component state. Since components are not supposed to modify their own resources (or their code!), this distinction remains useful: resources fall into the same category as the compiled code that forms part of a component.
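
As a rough illustration of resources parameterizing a component, the sketch below loads its user-visible text from property-file content instead of hardwiring it as constants. The Greeter class and the key "greeting" are assumptions made for this example; in a real deployment the text would live in files shipped with the component rather than in string literals.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical component: Greeter and the key "greeting" are illustrative.
final class Greeter {
    private final Properties resources = new Properties();

    // Resources are read once at load time and never modified afterwards.
    Greeter(String resourceText) {
        try {
            resources.load(new StringReader(resourceText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String greet(String name) {
        return resources.getProperty("greeting", "Hello") + ", " + name + "!";
    }

    public static void main(String[] args) {
        // Same compiled code, two different resource sets (localization).
        Greeter english = new Greeter("greeting=Hello\n");
        Greeter german = new Greeter("greeting=Guten Tag\n");

        System.out.println(english.greet("Ada")); // Hello, Ada!
        System.out.println(german.greet("Ada"));  // Guten Tag, Ada!
    }
}
```

Swapping the resource text changes the behavior visible to users without touching or recompiling the class, which is the sense in which resources belong with the compiled code rather than with mutable state.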

Component technology unavoidably leads to modular solutions. The software engineering benefits can thus justify initial investment into component technology, even if you don’t foresee component markets.

It is possible to go beyond the technical level of reducing components to better modules. To do so, it is helpful to define components differently.

Component: A Definition

“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.” (Workshop on Component-Oriented Programming, ECOOP, 1996.)

This definition covers the characteristic properties of components I’ve discussed. It covers technical aspects such as independence, contractual interfaces, and composition, and also market-related aspects such as third parties and deployment. It is the unique property of components, not only of software components, to combine technical and market aspects. A purely technical interpretation of this view maps this component concept back to that of modules, as illustrated in the following definition: A component is a set of simultaneously deployed atomic components. An atomic component is a module plus a set of resources.

This distinction of components and atomic components caters to the fact that most atomic components are not deployed individually, although they could be. Instead, atomic components normally belong to a set of components, and a typical deployment will cover the entire set.

Atomic components are the elementary units of deployment, versioning, and replacement; although it’s not usually done, individual deployment is possible. A module is thus an atomic component with no separate resources. (Java packages are not modules; the atomic units of deployment in Java are class files. A single package is compiled into many class files, one per class.)

A module is a set of classes and possibly non-object-oriented constructs, such as procedures or functions. Modules may statically require the presence of other modules in order to work. Hence, you can only deploy a module if all the modules it depends on are available. The dependency graph must be acyclic or else a group of modules in a cyclic dependency relation would always require simultaneous deployment, violating the defining property of modules.
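
The acyclicity requirement can be checked mechanically. The following sketch uses a depth-first topological sort, which is one standard technique and not something the definition above prescribes; it either yields an order in which every module follows the modules it requires, or reports a cycle. The class name DeploymentOrder and the module names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DeploymentOrder {
    // Returns the modules so that each one appears after everything it requires.
    // Throws IllegalStateException if the dependency graph contains a cycle.
    static List<String> order(Map<String, List<String>> requires) {
        List<String> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String module : requires.keySet()) {
            visit(module, requires, done, inProgress, result);
        }
        return result;
    }

    private static void visit(String module, Map<String, List<String>> requires,
                              Set<String> done, Set<String> inProgress,
                              List<String> result) {
        if (done.contains(module)) return;
        if (!inProgress.add(module)) {
            throw new IllegalStateException("cyclic dependency involving " + module);
        }
        for (String dep : requires.getOrDefault(module, Collections.<String>emptyList())) {
            visit(dep, requires, done, inProgress, result);
        }
        inProgress.remove(module);
        done.add(module);
        result.add(module); // all dependencies are already in the list
    }

    public static void main(String[] args) {
        Map<String, List<String>> requires = new LinkedHashMap<>();
        requires.put("editor", Arrays.asList("text", "files"));
        requires.put("text", Arrays.asList("files"));
        requires.put("files", Collections.<String>emptyList());
        System.out.println(order(requires)); // [files, text, editor]
    }
}
```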

A resource is a frozen collection of typed items. The resource concept could include code resources to subsume modules. The point here is that there are resources besides the ones generated by a compiler compiling a module or package. In a pure object approach, resources are serialized immutable objects. They’re immutable because components have no persistent identity. You cannot distinguish between duplicates.

Interfaces

A component’s interfaces define its access points. These points let a component’s clients, usually components themselves, access the component’s services. Normally, a component has multiple interfaces corresponding to different access points. Each access point may provide a different service, catering to different client needs. It’s important to emphasize the interface specifications’ contractual nature. Since the component and its clients are developed in mutual ignorance, the standardized contract must form a common ground for successful interaction.

What nontechnical aspects do contractual interfaces need to obey to be successful? First, keep the economy of scale in mind. Some of a component’s services may be less popular than others, but if none are popular and the particular combination of offered services is not popular either, the component has no market. In such a case, the overhead cost of casting a particular solution into a component form may not be justified.

Notice, however, that individual adaptations of component systems can lead to developing components that have no market. In this situation, component system extensions should build on what the system provides, and the easiest way of achieving this may be to develop the extension in component form. In this case, the economic argument applies indirectly: while the extending component itself is not viable, the resulting combination with the extended component system is.

Second, you must avoid undue market fragmentation, as it threatens the viability of components. You must also minimize redundant introductions of similar interfaces. In a market economy, such a minimization is usually the result of either early standardization efforts in a market segment or fierce eliminating competition. In the former case, the danger is suboptimality due to committee design; in the latter case, it is suboptimality due to the nontechnical nature of market forces.

Third, to maximize the reach of an interface specification, and of components implementing this interface, you need common media to publicize and advertise interfaces and components. If nothing else, this requires a small number of widely accepted unique naming schemes. Just as ISBN (International Standard Book Number) is a worldwide and unique naming scheme to identify any published book, developers need a similar scheme to refer abstractly to interfaces by name. Like an ISBN, a component identifier is not required to carry any meaning. An ISBN consists of a country code, a publisher code, a publisher-assigned serial number, and a checking digit. While it reveals the book’s publisher, it does not code the book’s contents. The book title may hint at the meaning, but it’s not guaranteed to be unique.
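
One way to obtain identifiers that are unique but carry no meaning is to use universally unique identifiers, much as COM identifies interfaces by globally unique IDs. The sketch below is an assumption-laden illustration, not a description of any real registry: InterfaceRegistry is a hypothetical class, the UUID value is made up, and java.lang.Runnable merely stands in for a published interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class InterfaceRegistry {
    private final Map<UUID, Class<?>> byId = new HashMap<>();

    // Registers an interface under an identifier that carries no meaning of its own.
    void register(UUID id, Class<?> iface) {
        if (!iface.isInterface()) {
            throw new IllegalArgumentException(iface + " is not an interface");
        }
        if (byId.putIfAbsent(id, iface) != null) {
            throw new IllegalStateException("identifier already in use: " + id);
        }
    }

    // Returns the registered interface, or null if the identifier is unknown.
    Class<?> lookup(UUID id) {
        return byId.get(id);
    }

    public static void main(String[] args) {
        // Made-up identifier, fixed once when the interface is published.
        UUID publishedId = UUID.fromString("3f2b8c1e-9a4d-4c7b-8e21-5d6f7a8b9c0d");

        InterfaceRegistry registry = new InterfaceRegistry();
        registry.register(publishedId, Runnable.class); // stand-in interface

        System.out.println(registry.lookup(publishedId).getName()); // java.lang.Runnable
    }
}
```

Like an ISBN, the identifier says nothing about what the interface does; a human-readable name can be attached separately, but only the identifier needs to be unique.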

Explicit Context Dependencies

Besides specifying provided interfaces, the previous definition of components also requires components to specify their needs. That is, the definition requires specification of what the deployment environment will need to provide, so that the components can function. These needs are called context dependencies, referring to the context of composition and deployment. If there were only one software component world, it would suffice to enumerate required interfaces of other components to specify all context dependencies. For example, a mail-merge component would specify that it needs a file system interface. Note that with today’s components, even this list of required interfaces is not normally available. The emphasis is usually just on provided interfaces.

In reality, several component worlds coexist, compete, and conflict with each other. At least three major worlds are now emerging, based on OMG’s CORBA, Sun’s Java, and Microsoft’s COM. In addition, component worlds are fragmented by the various computing and networking platforms. This is not likely to change soon. Just as the market has so far tolerated a surprising multitude of operating systems, there will be room for multiple component worlds. Where multiple worlds share markets, a component’s context dependencies specification must include its required interfaces and the component world (or worlds) for which it has been prepared.
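
A context dependency specification of this kind can be pictured as a small manifest that a deployment tool checks against the target environment. The sketch below is hypothetical throughout: ComponentManifest, its fields, and the interface names are invented for illustration and do not correspond to any real component model's metadata format.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical manifest; all names are illustrative.
final class ComponentManifest {
    final String world;          // component world, e.g. "Java", "COM", "CORBA"
    final Set<String> provides;  // provided interfaces
    final Set<String> requires;  // required interfaces (context dependencies)

    ComponentManifest(String world, Set<String> provides, Set<String> requires) {
        this.world = world;
        this.provides = provides;
        this.requires = requires;
    }

    // Lists what the environment fails to supply; an empty result means deployable.
    Set<String> unmet(String environmentWorld, Set<String> availableInterfaces) {
        Set<String> missing = new TreeSet<>();
        if (!world.equals(environmentWorld)) {
            missing.add("world:" + world);
        }
        for (String needed : requires) {
            if (!availableInterfaces.contains(needed)) {
                missing.add(needed);
            }
        }
        return missing;
    }

    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        ComponentManifest mailMerge = new ComponentManifest(
                "Java", setOf("MailMerge"), setOf("FileSystem", "AddressBook"));

        System.out.println(mailMerge.unmet("Java", setOf("FileSystem")));
        // [AddressBook]
        System.out.println(mailMerge.unmet("COM", setOf("FileSystem", "AddressBook")));
        // [world:Java]
    }
}
```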

There will, of course, also be secondary markets for cross-component-world integration. In analogy, consider the thriving market for power-plug adapters for electrical devices. Thus, bridging solutions, such as the OMG’s COM and CORBA Interworking standard, mitigate chasms.

Component Weight

Obviously, a component is most useful if it offers the right set of interfaces and has no restricting context dependencies; that is, if it can perform in all component worlds and requires no interface beyond those whose availability is guaranteed by the different component worlds. However, few components, if any, would be able to perform under such weak environmental guarantees. Technically, a component could come with all required software bundled in, but that would clearly defeat the purpose of using components in the first place. Note that part of the environmental requirements is the machine on which the component can execute. In the case of a virtual machine, such as the Java Virtual Machine, this is a straightforward part of the component world specification. On native code platforms, a mechanism such as Apple’s fat binaries (which pack multiple binaries into one file) would still allow a component to run everywhere.

Instead of constructing a self-sufficient component with everything built in, a component designer may opt for maximal reuse. Although maximizing reuse has many advantages, it has one substantial disadvantage: the explosion of context dependencies. If designs of components were, after release, frozen for all time, and if all deployment environments were the same, this would not pose a problem. However, as components evolve, and different environments provide different configurations and version mixes, it becomes a showstopper to have a large number of context dependencies. To summarize: maximizing reuse minimizes use. In practice, component designers have to strive for a balance.

Component-Based Programming vs. Component Assembly

Component technology is sometimes used as a synonym for “visual assembly” of pre-fabricated components. Indeed, for relatively simple applications, wiring components is surprisingly productive. For example, JavaSoft’s BeanBox lets a user connect beans visually and displays such connections as pieces of pipework: plumbing instead of programming.

It is useful to take a look behind the scenes. When wiring or plumbing components, the visual assembly tool registers event listeners with event sources. For example, if the assembly of a button and a text field should clear the text field whenever the button is pressed, then the button is the event source of the “button pressed” event and the text field is listening for this event. While details are of no importance here, it is clear that this assembly process is not primarily about components. The button and the text field are instances, that is, objects, not components. (When adding the first object of a kind, an assembly tool may need to locate an appropriate component.)
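
What an assembly tool does behind the scenes for the button and text field example amounts to roughly the following. The Button and TextField classes here are simplified, hypothetical stand-ins rather than real toolkit or JavaBeans classes, and a plain Runnable plays the role of the event listener.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for visual beans; not real toolkit classes.
final class Wiring {
    static final class Button {
        private final List<Runnable> pressListeners = new ArrayList<>();

        void addPressListener(Runnable listener) { pressListeners.add(listener); }

        // Fires the "button pressed" event to every registered listener.
        void press() {
            for (Runnable listener : pressListeners) listener.run();
        }
    }

    static final class TextField {
        private String text = "";

        void setText(String newText) { text = newText; }
        String getText() { return text; }
        void clear() { text = ""; }
    }

    public static void main(String[] args) {
        // Two instances (objects), not two components.
        Button button = new Button();
        TextField field = new TextField();

        // The "wire": register the text field's clear action with the event source.
        button.addPressListener(field::clear);

        field.setText("draft");
        button.press();
        System.out.println("[" + field.getText() + "]"); // []
    }
}
```

Note that the wire connects one particular button object to one particular text field object, which is why the process is assembly of instances.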

However, there is a problem with this analysis. If the assembled objects are saved and distributed as a new component, how can this be explained? The key is to realize that it is not the graph of particular assembled objects that is saved. Instead, the saved information suffices to generate a new graph of objects that happens to have the same topology (and, to a degree, the same state) as the originally assembled graph of objects. However, the newly generated graph and the original graph will not share common objects: the object identities are all different.

You should then view the stored graph as persistent state but not as persistent objects. Therefore, what seems to be assembly at the instance rather than the class level, and thus fundamentally different, is a matter of convenience. In fact, there is no difference in outcome between this approach of assembling a component out of subcomponents and a traditional programmatic implementation that hard codes the assembly. Visual assembly tools are free to not save object graphs, but to generate code that when executed creates the required objects and establishes their interconnections. The main difference is the degree of flexibility in theory. You can easily modify the saved object graph at run time of the deployed component, while the generated code would be harder to modify. This line is much finer than it may seem; the real question is whether components with self-modifying code are desirable. Usually they are not, since the resulting management problems immediately outweigh the possible advantages of flexibility.

It is interesting that persistent objects, in the precise sense of the term, are only supported in two contexts: object-oriented databases, still restricted to a small niche of the database market, and CORBA-based objects. In these approaches, object identity is preserved when storing objects. However, for the same reason, you cannot use these when you intend to save state and topology but not identity. You would need an expensive deep copy of the saved graph to effectively undo the initial effort of saving the universal identities of the involved objects.

On the other hand, neither of the two primary component approaches, COM and JavaBeans, immediately supports persistent objects. Instead, they only emphasize saving the state and topology of a graph of objects. The Java terminology is “object serialization.” While object graph serialization would be more precise, this is much better than the COM use of the term persistence in a context where object identity is not preserved. Indeed, saving and loading again an object graph using serialization (or COM’s persistence mechanisms) is equivalent to a deep copy of the object graph. (Many systems use this equivalence to implement deep copying.)
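
The equivalence between a serialization round trip and a deep copy is easy to observe with standard Java object serialization. In this sketch the Node class and the roundTrip helper are hypothetical; ObjectOutputStream and ObjectInputStream are the standard java.io classes. A two-node cycle is written to a byte array and read back: the topology and state survive, the identities do not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class GraphCopy {
    // Hypothetical node type forming a small object graph.
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        Node next;

        Node(String label) { this.label = label; }
    }

    // Serializes the graph reachable from root and reads it back again.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // a cycle: topology that must be preserved

        Node copy = roundTrip(a);

        System.out.println(copy != a);                     // true: new identity
        System.out.println(copy.next != b);                // true: new identity
        System.out.println(copy.label + copy.next.label);  // ab: same state
        System.out.println(copy.next.next == copy);        // true: same topology
    }
}
```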

While it might seem like a major disadvantage of these approaches compared to CORBA, note that persistent identity is a heavyweight concept that you can always add where needed. For example, COM supports a standard mechanism called monikers, objects that resolve to other objects. You can use a moniker to carry a stable unique identifier (a surrogate) and the information needed to locate that particular instance. The resulting construct is about as heavyweight as the standard CORBA Object References. Java does not yet offer a standard like COM monikers, but you could add one easily.

Component Objects

Components carry instances that act at run time as prescribed by their generating component. In the simplest case, a component is a class and the carried instances are objects of that class. However, most components (whether COM or JavaBeans) will consist of many classes. A JavaBean is externally represented by a single class and thus is a single kind of object representing all possible instantiations or uses of that component. A COM component is more flexible. It can present itself to clients as an arbitrary collection of objects whose clients only see sets of unrelated interfaces. In JavaBeans or CORBA, multiple interfaces are ultimately merged into one implementing class. This prevents proper handling of important cases such as components that support multiple versions of an interface, where the exact implementation of a particular method shared by all these versions needs to depend on the version of the interface the client is using. The OMG’s current CORBA Components proposal promises to fix this problem.
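
The versioning problem can be made concrete in Java. If one class implemented both interface versions below, it could supply only a single check(String) method, so both versions would necessarily behave identically. The sketch avoids this by handing out a separate facet object per interface version, loosely in the spirit of the COM style described above. All names (SpellComponent, SpellCheckerV1, SpellCheckerV2) and the two contracts are assumptions invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical component with two versions of the same interface.
final class SpellComponent {
    // Version 1 contract (assumed): lookup is case-sensitive.
    interface SpellCheckerV1 { boolean check(String word); }

    // Version 2 contract (assumed): lookup ignores case.
    interface SpellCheckerV2 { boolean check(String word); }

    private final Set<String> dictionary =
            new HashSet<>(Arrays.asList("component", "object"));

    // One facet object per interface version; a single class implementing
    // both interfaces could provide only one check(String) method.
    private final SpellCheckerV1 v1 = word -> dictionary.contains(word);
    private final SpellCheckerV2 v2 =
            word -> dictionary.contains(word.toLowerCase(Locale.ROOT));

    SpellCheckerV1 asV1() { return v1; }
    SpellCheckerV2 asV2() { return v2; }

    public static void main(String[] args) {
        SpellComponent component = new SpellComponent();
        System.out.println(component.asV1().check("Object")); // false
        System.out.println(component.asV2().check("Object")); // true
    }
}
```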

Mobile Components vs. Mobile Objects

Surprisingly, mobile components and objects are just as orthogonal as regular components and objects. As demonstrated by the Java applet and ActiveX approaches, it is useful to merely ship a component to a site and then start from fresh state and context at the receiving end. Likewise, it is possible to have mobile objects in an environment that isn’t component-based at all. For example, Modula-3 Network Objects can travel the network, but do not carry their implementation with them. Instead, the environment expects all required code to already be available everywhere. It is also possible to support both mobile objects and mobile components. For example, a mobile agent (a mobile autonomous object) that travels the Internet to gather information should be accompanied by its supporting components. A recent example is Java Aglets (agent applets).

The Ultimate Difference

While components capture the static nature of a software fragment, objects capture its dynamic nature. Simply treating everything as dynamic can eliminate this distinction. However, it is a time-proven principle of software engineering to try and strengthen the static description of systems as much as possible. You can always superimpose dynamics where needed. Modern facilities such as meta-programming and just-in-time compilation simplify this soft treatment of the boundary between static and dynamic. Nevertheless, it’s advisable to explicitly capture as many static properties of a design or architecture as possible. This is the role of components and architectures that assign components their place. The role of objects is to capture the dynamic nature of the arising systems built out of components. Component objects are objects carried by identified components. Thus, both components and objects together will enable the construction of next-generation software.

Blackbox vs. Whitebox Abstractions and Reuse

Blackbox vs. whitebox abstraction refers to the visibility of an implementation behind its interface. Ideally, a blackbox’s clients don’t know any details beyond the interface and its specification. For a whitebox, the interface may still enforce encapsulation and limit what clients can do (although implementation inheritance allows for substantial interference). However, the whitebox implementation is available and you can study it to better understand what the box does. (Some authors further distinguish between whiteboxes and glassboxes, where a whitebox lets you manipulate the implementation, and a glassbox merely lets you study the implementation.)

Blackbox reuse refers to reusing an implementation without relying on anything but its interface and specification. For example, typical application programming interfaces (APIs) reveal no implementation details. Building on such an API is thus blackbox reuse of the API’s implementation. In contrast, whitebox reuse refers to using a software fragment, through its interfaces, while relying on the understanding you gained from studying the actual implementation. Most class libraries and application frameworks are delivered in source form and application developers study a class implementation to understand what a subclass can or must do.

There are serious problems with whitebox reuse across components, since whitebox reuse renders it unlikely that the reused software can be replaced by a new release. Such a replacement will likely break some of the reusing clients, as these depend on implementation details that may have changed in the new release.
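
A small Java sketch shows how such a break can happen. Everything here is hypothetical: BagV1 and BagV2 stand for two releases of the same library class with the same interface, and the two counting subclasses are the same client code compiled against each release. The client relied on a detail learned from the release 1 source, namely that addAll calls add for every element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical library and client classes; every name is illustrative.
final class WhiteboxReuse {
    // Release 1: addAll happens to be implemented by calling add per element.
    static class BagV1 {
        final List<String> items = new ArrayList<>();
        void add(String s) { items.add(s); }
        void addAll(List<String> all) { for (String s : all) add(s); }
    }

    // Release 2: same interface and specification, different implementation.
    static class BagV2 {
        final List<String> items = new ArrayList<>();
        void add(String s) { items.add(s); }
        void addAll(List<String> all) { items.addAll(all); }
    }

    // Client written after reading the release 1 source: it counts insertions
    // by overriding add only, assuming addAll goes through add.
    static class CountingBagOnV1 extends BagV1 {
        int added;
        @Override void add(String s) { added++; super.add(s); }
    }

    // The same client code rebuilt against release 2.
    static class CountingBagOnV2 extends BagV2 {
        int added;
        @Override void add(String s) { added++; super.add(s); }
    }

    public static void main(String[] args) {
        CountingBagOnV1 oldBag = new CountingBagOnV1();
        oldBag.addAll(Arrays.asList("a", "b"));

        CountingBagOnV2 newBag = new CountingBagOnV2();
        newBag.addAll(Arrays.asList("a", "b"));

        // Both bags hold two items, but the count is silently wrong on release 2.
        System.out.println(oldBag.added + " " + newBag.added); // 2 0
    }
}
```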

--
※ Source: 月光软件站, http://www.moon-soft.com [FROM: 202.96.184.41]
