From: wjanry()
Compiled by: majorsun (2000-03-08 14:00:47), site mail
Components and Objects Together May 1999
The differences between the emerging component-based development and long-standing object-oriented development are often unclear. Find out how separate these concepts really are.
by Clemens Szyperski
Components are on the upswing; objects have been around for some time. It is understandable, but not helpful, to see object-oriented programming sold in new clothes by simply calling objects “components.” The emerging component-based approaches and tools combine objects and components in ways that show they are really separate concepts. In this article, I will examine some key differences between objects and components to clarify these muddy waters. In particular, you’ll see that approaches based on visual assembly tools really assemble objects, not components, but they create components when saving the finished assembly.
Why Components?
What is the rationale behind component software? Or rather, what is it that components should be? Traditionally, closed solutions with proprietary interfaces addressed most customers’ needs. Heavyweights such as operating systems and database engines are among the few examples of components that did reach high levels of maturity. Large software systems manufacturers often configure delivered solutions by combining modules in a client-specific way. However, the interfaces between such modules tend to be proprietary, at most open to highly specialized independent software vendors (ISVs) that specifically produce further modules for such systems. In many cases, these modules are fused together during a linking step and are no longer distinguishable in deployed solutions.
Attempts to create low-level connection standards or wiring standards are either product-driven or standard-driven. The Microsoft standards, resting on COM, have always been product-driven and are thus incremental, evolutionary, and, to a degree, legacy-laden by nature.
Standard-driven approaches usually originate in industry consortia. The prime example here is the effort of the Object Management Group (OMG). However, the OMG hasn’t contributed much in the component world and is now falling back on JavaSoft’s Enterprise JavaBeans standards for components, although it’s attempting a CORBA Beans generalization. The EJB standard still has a long way to go; so far it is not implementation language-neutral, and bridging standards to Java-external services and components are only emerging.
At first, it might surprise you that component software is largely pushed by desktop- and Internet-based solutions. On second thought, this should not surprise you at all. Component software is a complex technology to master, and viable, component-based solutions will only evolve if the benefits are clear. Traditional enterprise computing has many benefits, but these benefits all depend on enterprises being willing to evolve substantially.
In the desktop and Internet worlds, the situation is different. Centralized control over what information is processed when and where is not an option in these worlds. Instead, content (such as web pages or documents) arrives at a user’s machine and needs to be processed there and then. With a rapidly exploding variety of content types, and open coding standards such as XML, monolithic applications have long reached their limits. Beyond the flexibility of component software is its capability to dynamically grow to address changing needs.
What a Component Is and Is Not
The separate existence and mobility of components, as witnessed by Java applets or ActiveX components, can make components look similar to objects. People often use the words “component” and “object” interchangeably. In addition, they use constructions such as “component object.” Objects are said to be instances of classes or clones of prototype objects. Objects and components both make their services available through interfaces. Language designers add further irritation by discussing namespaces, modules, packages, and so on. I will try to unfold, explain, and justify these terms. Next, I’ll browse the key terms with brief explanations, relating them to each other. Based on this, I’ll then look at a refined component definition. Finally, I’ll shed some light on the fine line between component-based programming and component assembly.
Terms and Concepts
Components. A component’s characteristic properties are that it is a unit of independent deployment; a unit of third-party composition; and it has no persistent state.
These properties have several implications. For a component to be independently deployable, it needs to be well-separated from its environment and from other components. A component therefore encapsulates its constituent features. Also, since it is a unit of deployment, you never partially deploy a component.
If a third party needs to compose a component with other components, the component must be self-contained. (A third party is one that you cannot expect to access the construction details of all the components involved.) Also, the component needs to come with clear specifications of what it provides and what it requires. In other words, a component needs to encapsulate its implementation and interact with its environment through well-defined interfaces and platform assumptions only. It’s also generally useful to minimize hard-wired dependencies in favor of externally configurable providers.
Finally, you cannot distinguish a component without any persistent state from copies of its own. (Exceptions to this rule are attributes not contributing to the component’s functionality, such as serial numbers used for accounting.) Without state, a component can be loaded into and activated in a particular system, but in any given process, there will be at most one copy of a particular component. So, while it is useful to ask whether a particular component is available or not, it isn’t useful to ask about the number of copies of that component. (Note that a component may simultaneously exist in different versions. However, these are not copies of a component, but rather different components related to each other by a versioning scheme.)
In many current approaches, components are heavyweights. For example, a database server could be a component. If there is only one database maintained by this class of server, then it is easy to confuse the instance with the concept. For example, you might see the database server together with the database as a component with persistent state. According to the definition described previously, this instance of the database concept is not a component. Instead, the static database server program is a component, and it supports a single instance: the database object. This separation of the immutable plan from the mutable instances is the key to avoiding massive maintenance problems. If components could be mutable, that is, have state, then no two installations of the same component would have the same properties. The differentiation of components and objects is thus fundamentally about differentiating between static properties that hold for a particular configuration and dynamic properties of any particular computational scenario. Drawing this line carefully is essential to curbing manageability, configurability, and version control problems.
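As a rough illustration, and assuming that a Python module can stand in for a component (an analogy chosen for this sketch, not something the article prescribes), the asymmetry between the single loaded plan and its many instances looks like this:

```python
import importlib
import json
import sys

# Loading the "component" a second time yields the very same module:
# within one process there is at most one copy of it.
again = importlib.import_module("json")
same_component = again is json and sys.modules["json"] is json

# Objects created from the component are separate instances,
# each with its own identity.
first = json.JSONDecoder()
second = json.JSONDecoder()
distinct_objects = first is not second
```

Asking whether the module is available makes sense; asking how many copies of it are loaded does not. Counting decoder objects, by contrast, is perfectly meaningful.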
Objects. The notions of instantiation, identity, and encapsulation lead to the notion of objects. In contrast to the properties characterizing components, an object’s characteristic properties are that it is a unit of instantiation (it has a unique identity); it has state that can be persistent; and it encapsulates its state and behavior.
Again, several object properties follow directly. Since an object is a unit of instantiation, it cannot be partially instantiated. Since an object has individual state, it also needs a unique identity to identify the object, despite state changes, for its lifetime. Consider the apocryphal story about George Washington’s axe, which had five new handles and four new axe-heads but was still George Washington’s axe. This is typical of real-life objects: nothing but their abstract identity remains stable over time.
Since objects get instantiated, you need a construction plan that describes the new object’s state space, initial state, and behavior before the object can exist. Such a plan may be explicitly available and is then called a class. Alternatively, it may be implicitly available in the form of an object that already exists, that is close to the object to be created, and can be cloned. You’ll call such a preexisting object a prototype object.
Whether using classes or prototype objects, the newly instantiated object needs to be set to an initial state. The initial state needs to be a valid state of the constructed object, but it may also depend on parameters specified by the client asking for the new object. The code that is required to control object creation and initialization could be a static procedure, usually called a constructor. Alternatively, it can be an object of its own, usually called an object factory, or factory for short.
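The three routes to a new object described above (class with constructor, prototype cloning, and factory object) can be sketched in Python as follows. The names TextField and TextFieldFactory are hypothetical and chosen only for illustration.

```python
import copy

class TextField:
    """A class: an explicit construction plan for new objects."""
    def __init__(self, text="", width=20):
        # The constructor puts the new object into a valid initial state,
        # optionally shaped by parameters supplied by the client.
        if width <= 0:
            raise ValueError("width must be positive")
        self.text = text
        self.width = width

class TextFieldFactory:
    """A factory: creation logic packaged as an object of its own."""
    def __init__(self, default_width):
        self.default_width = default_width

    def create(self, text=""):
        return TextField(text, self.default_width)

# Instantiation from a class through its constructor.
from_class = TextField("hello", width=10)

# Instantiation from a prototype: clone an existing object that is
# close to the one wanted, then adjust the copy.
prototype = TextField("template", width=40)
from_prototype = copy.deepcopy(prototype)
from_prototype.text = "cloned"

# Instantiation through a factory object.
factory = TextFieldFactory(default_width=30)
from_factory = factory.create("made by factory")
```

In every case the result is a fresh object with its own identity; the clone shares the prototype's initial state but not the prototype itself.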
Object References and Persistent Objects
The object’s identity is usually captured by an object reference. Most programming languages do not explicitly support object references; language-level references hold unique references of objects (usually their addresses in memory), but there is no direct high-level support to manipulate the reference as such. (Languages like C provide low-level address manipulation facilities.) Distinguishing between an object (a triple definition of identity, state, and implementing class) and an object reference (just holding the identity) is important when considering persistence. As I’ll describe later, almost all so-called persistence schemes just preserve an object’s state and class, but not its absolute identity. An exception is CORBA, which defines interoperable object references (IORs) as stable entities (which are really objects). Storing an IOR makes the pure object identity persist.
Components and Objects
Typically, a component comes to life through objects and therefore would normally contain one or more classes or immutable prototype objects. In addition, it might contain a set of immutable objects that capture default initial state and other component resources. However, there is no need for a component to contain only classes or any classes at all. A component could contain traditional procedures and even have global (static) variables; or it may be realized in its entirety using a functional programming approach, an assembly language, or any other approach. Objects created in a component, or references to such objects, can become visible to the component’s clients, usually other components. If only objects become visible to clients, there is no way to tell whether or not a component is purely object-oriented inside.
A component may contain multiple classes, but a class is necessarily confined to a single component; partial deployment of a class wouldn’t normally make sense. Just as classes can depend on other classes (inheritance), components can depend on other components (import). The superclasses of a class do not necessarily need to reside in the same component as the class. Where a class has a superclass in another component, the inheritance relation crosses component boundaries. Whether or not inheritance across components is a good thing is the focus of heated debate. The theoretical reasoning behind this clash is interesting and close to the essence of component orientation, but it’s beyond the scope of this article.
Modules
Components are rather close to modules, as introduced by modular languages in the early 1980s. The most popular modular languages are Modula-2 and Ada. In Ada, modules are called packages, but the concepts are almost identical. An important hallmark of modular approaches is the support of separate compilation, including the ability to properly type-check across module boundaries.
With the introduction of the Eiffel language, the claim was that a class is a better module. This seemed justified based on the early ideas that modules would each implement one abstract data type (ADT). After all, you can look at a class as implementing an ADT, with the additional properties of inheritance and polymorphism. However, modules can be used, and always have been used, to package multiple entities, such as ADTs or indeed classes, into one unit. Also, modules do not have a concept of instantiation, while classes do. (In module-less languages, this leads to the construction of static classes that essentially serve as simple modules.)
Recent language designs, such as Oberon, Modula-3, and Component Pascal, keep the modules and classes separate. (In Java, a package is somewhat weaker than a module and mostly serves namespace control purposes.) Also, a module can contain multiple classes. Where classes inherit from each other, they can do so across module boundaries. You can see modules as minimal components. Even modules that do not contain any classes can function as components.
Nevertheless, module concepts don’t normally support one aspect of full-fledged components. There are no persistent immutable resources that come with a module, beyond what has been hardwired as constants in the code. Resources parameterize a component. Replacing these resources lets you version a component without needing to recompile; localization is an example. Modification of resources may look like a form of a mutable component state. Since components are not supposed to modify their own resources (or their code!), this distinction remains useful: resources fall into the same category as the compiled code that forms part of a component.
Component technology unavoidably leads to modular solutions. The software engineering benefits can thus justify initial investment into component technology, even if you don’t foresee component markets.
It is possible to go beyond the technical level of reducing components to better modules. To do so, it is helpful to define components differently.
Component: A Definition
“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.” (Workshop on Component-Oriented Programming, ECOOP, 1996.)
This definition covers the characteristic properties of components I’ve discussed. It covers technical aspects such as independence, contractual interfaces, and composition, and also market-related aspects such as third parties and deployment. It is the unique property of components, not only of software components, to combine technical and market aspects. A purely technical interpretation of this view maps this component concept back to that of modules, as illustrated in the following definition: A component is a set of simultaneously deployed atomic components. An atomic component is a module plus a set of resources.
This distinction of components and atomic components caters to the fact that most atomic components are not deployed individually, although they could be. Instead, atomic components normally belong to a set of components, and a typical deployment will cover the entire set.
Atomic components are the elementary units of deployment, versioning, and replacement; although it’s not usually done, individual deployment is possible. A module is thus an atomic component with no separate resources. (Java packages are not modules, but the atomic units of deployment in Java are class files. A single package is compiled into many class files, one per class.)
A module is a set of classes and possibly non-object-oriented constructs, such as procedures or functions. Modules may statically require the presence of other modules in order to work. Hence, you can only deploy a module if all the modules it depends on are available. The dependency graph must be acyclic or else a group of modules in a cyclic dependency relation would always require simultaneous deployment, violating the defining property of modules.
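The acyclicity requirement above can be checked mechanically. The following is a minimal sketch using the standard-library graphlib module (Python 3.9 or later); the function name deployment_order and the module names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

def deployment_order(requires):
    """Order modules so that each one follows everything it requires.

    `requires` maps a module name to the set of module names it
    depends on. Raises ValueError if the dependencies are cyclic.
    """
    try:
        return list(TopologicalSorter(requires).static_order())
    except CycleError as err:
        # The detected cycle is the second element of the error's args.
        raise ValueError(f"cyclic module dependencies: {err.args[1]}") from err

# Hypothetical modules: Editor needs Text and Files, Text needs Files.
requires = {
    "Editor": {"Text", "Files"},
    "Text": {"Files"},
    "Files": set(),
}
order = deployment_order(requires)
```

With an acyclic graph, every module can be deployed once its prerequisites are in place. A cycle leaves no valid order, which is exactly the situation where a group of modules could only be deployed together.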
A resource is a frozen collection of typed items. The resource concept could include code resources to subsume modules. The point here is that there are resources besides the ones generated by a compiler compiling a module or package. In a pure object approach, resources are serialized immutable objects. They’re immutable because components have no persistent identity. You cannot distinguish between duplicates.
Interfaces
A component’s interfaces define its access points. These points let a component’s clients, usually components themselves, access the component’s services. Normally, a component has multiple interfaces corresponding to different access points. Each access point may provide a different service, catering to different client needs. It’s important to emphasize the interface specifications’ contractual nature. Since the component and its clients are developed in mutual ignorance, the standardized contract must form a common ground for successful interaction.
What nontechnical aspects do contractual interfaces need to obey to be successful? First, keep the economy of scale in mind. Some of a component’s services may be less popular than others, but if none are popular and the particular combination of offered services is not either, the component has no market. In such a case, the overhead cost of casting a particular solution into a component form may not be justified.
Notice, however, that individual adaptations of component systems can lead to developing components that have no market. In this situation, component system extensions should build on what the system provides, and the easiest way of achieving this may be to develop the extension in component form. In this case, the economic argument applies indirectly: while the extending component itself is not viable, the resulting combination with the extended component system is.
Second, you must avoid undue market fragmentation, as it threatens the viability of components. You must also minimize redundant introductions of similar interfaces. In a market economy, such a minimization is usually the result of either early standardization efforts in a market segment or fierce eliminating competition. In the former case, the danger is suboptimality due to committee design; in the latter case, it is suboptimality due to the nontechnical nature of market forces.
Third, to maximize the reach of an interface specification, and of components implementing this interface, you need common media to publicize and advertise interfaces and components. If nothing else, this requires a small number of widely accepted unique naming schemes. Just as ISBN (International Standard Book Number) is a worldwide and unique naming scheme to identify any published book, developers need a similar scheme to refer abstractly to interfaces by name. Like an ISBN, a component identifier is not required to carry any meaning. An ISBN consists of a country code, a publisher code, a publisher-assigned serial number, and a check digit. While it reveals the book’s publisher, it does not code the book’s contents. The book title may hint at the meaning, but it’s not guaranteed to be unique.
Explicit Context Dependencies
Besides specifying provided interfaces, the previous definition of components also requires components to specify their needs. That is, the definition requires specification of what the deployment environment will need to provide, so that the components can function. These needs are called context dependencies, referring to the context of composition and deployment. If there were only one software component world, it would suffice to enumerate required interfaces of other components to specify all context dependencies. For example, a mail-merge component would specify that it needs a file system interface. Note that with today’s components, even this list of required interfaces is not normally available. The emphasis is usually just on provided interfaces.
In reality, several component worlds coexist, compete, and conflict with each other. At least three major worlds are now emerging, based on OMG’s CORBA, Sun’s Java, and Microsoft’s COM. In addition, component worlds are fragmented by the various computing and networking platforms. This is not likely to change soon. Just as the market has so far tolerated a surprising multitude of operating systems, there will be room for multiple component worlds. Where multiple worlds share markets, a component’s context dependencies specification must include its required interfaces and the component world (or worlds) for which it has been prepared.
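A context-dependency specification of this kind can be made explicit and checked before deployment. The sketch below is one possible shape for such a check, not an existing API: ComponentSpec, unmet_dependencies, and the interface names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ComponentSpec:
    """Hypothetical description of what a component offers and needs."""
    name: str
    world: str                        # e.g. "COM", "CORBA", "Java"
    provides: frozenset = frozenset()
    requires: frozenset = frozenset()

def unmet_dependencies(spec, world, installed):
    """List what the deployment context fails to provide for `spec`."""
    problems = []
    if spec.world != world:
        problems.append(f"world:{spec.world}")
    offered = set()
    for other in installed:
        offered |= other.provides
    for interface in sorted(spec.requires - offered):
        problems.append(f"interface:{interface}")
    return problems

file_system = ComponentSpec(
    "FileSystem", "COM", provides=frozenset({"IFileSystem"}))
mail_merge = ComponentSpec(
    "MailMerge", "COM",
    provides=frozenset({"IMailMerge"}),
    requires=frozenset({"IFileSystem"}))

ok = unmet_dependencies(mail_merge, "COM", [file_system])
broken = unmet_dependencies(mail_merge, "CORBA", [])
```

The mail-merge component deploys cleanly into a COM context that offers a file system interface, and is rejected in a different world where nothing provides that interface.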
There will, of course, also be secondary markets for cross-component-world integration. In analogy, consider the thriving market for power-plug adapters for electrical devices. Thus, bridging solutions, such as the OMG’s COM and CORBA Interworking standard, mitigate chasms.
Component Weight
Obviously, a component is most useful if it offers the right set of interfaces and has no restricting context dependencies; that is, if it can perform in all component worlds and requires no interface beyond those whose availability is guaranteed by the different component worlds. However, few components, if any, would be able to perform under such weak environmental guarantees. Technically, a component could come with all required software bundled in, but that would clearly defeat the purpose of using components in the first place. Note that part of the environmental requirements is the machine on which the component can execute. In the case of a virtual machine, such as the Java Virtual Machine, this is a straightforward part of the component world specification. On native code platforms, a mechanism such as Apple’s fat binaries (which pack multiple binaries into one file) would still allow a component to run everywhere.
Instead of constructing a self-sufficient component with everything built in, a component designer may opt for maximal reuse. Although maximizing reuse has many advantages, it has one substantial disadvantage: the explosion of context dependencies. If designs of components were, after release, frozen for all time, and if all deployment environments were the same, this would not pose a problem. However, as components evolve, and different environments provide different configurations and version mixes, it becomes a showstopper to have a large number of context dependencies. To summarize: maximizing reuse minimizes use. In practice, component designers have to strive for a balance.
Component-Based Programming vs. Component Assembly
Component technology is sometimes used as a synonym for “visual assembly” of pre-fabricated components. Indeed, for relatively simple applications, wiring components is surprisingly productive. For example, JavaSoft’s BeanBox lets a user connect beans visually and displays such connections as pieces of pipework: plumbing instead of programming.
It is useful to take a look behind the scenes. When wiring or plumbing components, the visual assembly tool registers event listeners with event sources. For example, if the assembly of a button and a text field should clear the text field whenever the button is pressed, then the button is the event source of the “button pressed” event and the text field is listening for this event. While details are of no importance here, it is clear that this assembly process is not primarily about components. The button and the text field are instances, that is, objects, not components. (When adding the first object of a kind, an assembly tool may need to locate an appropriate component.)
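What the assembly tool does behind the scenes amounts to a few lines of ordinary object wiring. The sketch below uses hypothetical Button and TextField classes rather than any real toolkit or the BeanBox itself.

```python
class Button:
    """Event source: keeps a list of listeners for the press event."""
    def __init__(self):
        self._listeners = []

    def add_press_listener(self, listener):
        self._listeners.append(listener)

    def press(self):
        # Fire the "button pressed" event to every registered listener.
        for listener in self._listeners:
            listener()

class TextField:
    """Event listener target with some mutable state."""
    def __init__(self, text=""):
        self.text = text

    def clear(self):
        self.text = ""

# The "assembly" step: two instances are created and connected.
button = Button()
field = TextField("some text")
button.add_press_listener(field.clear)

# Pressing the button now clears the text field.
button.press()
```

Everything wired here is an instance. The classes, and the component that ships them, are only consulted when the objects are created.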
However, there is a problem with this analysis. If the assembled objects are saved and distributed as a new component, how can this be explained? The key is to realize that it is not the graph of particular assembled objects that is saved. Instead, the saved information suffices to generate a new graph of objects that happens to have the same topology (and, to a degree, the same state) as the originally assembled graph of objects. However, the newly generated graph and the original graph will not share common objects: the object identities are all different.
You should then view the stored graph as persistent state but not as persistent objects. Therefore, what seems to be assembly at the instance rather than the class level, and thus fundamentally different, is a matter of convenience. In fact, there is no difference in outcome between this approach of assembling a component out of subcomponents and a traditional programmatic implementation that hard codes the assembly. Visual assembly tools are free to not save object graphs, but to generate code that when executed creates the required objects and establishes their interconnections. The main difference is the degree of flexibility in theory. You can easily modify the saved object graph at run time of the deployed component, while the generated code would be harder to modify. This line is much finer than it may seem; the real question is whether components with self-modifying code are desirable. Usually they are not, since the resulting management problems immediately outweigh the possible advantages of flexibility.
It is interesting that persistent objects, in the precise sense of the term, are only supported in two contexts: object-oriented databases, still restricted to a small niche of the database market, and CORBA-based objects. In these approaches, object identity is preserved when storing objects. However, for the same reason, you cannot use these when you intend to save state and topology but not identity. You would need an expensive deep copy of the saved graph to effectively undo the initial effort of saving the universal identities of the involved objects.
On the other hand, neither of the two primary component approaches, COM and JavaBeans, immediately supports persistent objects. Instead, they only emphasize saving the state and topology of a graph of objects. The Java terminology is “object serialization.” While object graph serialization would be more precise, this is much better than the COM use of the term persistence in a context where object identity is not preserved. Indeed, saving and reloading an object graph using serialization (or COM’s persistence mechanisms) is equivalent to a deep copy of the object graph. (Many systems use this equivalence to implement deep copying.)
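The equivalence between a save-and-load round trip and a deep copy can be seen in a small sketch. It uses a hand-written JSON format instead of Java serialization or COM persistence, and it assumes that every peer of a saved node is itself in the saved list; Node, save_graph, and load_graph are hypothetical names.

```python
import json

class Node:
    def __init__(self, label):
        self.label = label
        self.peers = []

def save_graph(nodes):
    """Serialize state (labels) and topology (edges by index), not identity."""
    index = {id(node): i for i, node in enumerate(nodes)}
    return json.dumps({
        "labels": [node.label for node in nodes],
        "edges": [[index[id(node)], index[id(peer)]]
                  for node in nodes for peer in node.peers],
    })

def load_graph(data):
    """Build a fresh graph of new objects from the saved description."""
    saved = json.loads(data)
    nodes = [Node(label) for label in saved["labels"]]
    for source, target in saved["edges"]:
        nodes[source].peers.append(nodes[target])
    return nodes

# A small cyclic graph: the button and the text field know each other.
button, field = Node("button"), Node("text field")
button.peers.append(field)
field.peers.append(button)

restored = load_graph(save_graph([button, field]))
```

The restored graph has the same labels and the same connections, yet none of its objects is one of the originals. That is the behavior of a deep copy, and it is why the stored form is better described as persistent state than as persistent objects.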
While it might seem like a major disadvantage of these approaches compared to CORBA, note that persistent identity is a heavyweight concept that you can always add where needed. For example, COM supports a standard mechanism called monikers, objects that resolve to other objects. You can use a moniker to carry a stable unique identifier (a surrogate) and the information needed to locate that particular instance. The resulting construct is about as heavyweight as the standard CORBA Object References. Java does not yet offer a standard like COM monikers, but you could add one easily.
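To give a feel for how little is needed to add persistent identity where it is wanted, here is a minimal sketch of a surrogate-based reference. It is not COM's moniker interface and not a CORBA IOR; PersistentRef and Registry are hypothetical names, and the in-memory dictionary stands in for whatever locator a real system would use.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class PersistentRef:
    """Hypothetical surrogate: a stable identifier that can be stored as text."""
    key: str

class Registry:
    """Hypothetical resolver mapping stored identifiers back to live objects."""
    def __init__(self):
        self._objects = {}

    def register(self, obj):
        ref = PersistentRef(str(uuid.uuid4()))
        self._objects[ref.key] = obj
        return ref

    def resolve(self, ref):
        return self._objects[ref.key]

registry = Registry()
account = {"owner": "example", "balance": 100}
ref = registry.register(account)

# Only the key needs to be written to storage; a reference rebuilt
# from it later resolves to the same object, not to a copy.
stored_key = ref.key
resolved = registry.resolve(PersistentRef(stored_key))
```

In contrast to a serialization round trip, resolving the stored reference yields the original object itself.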
Component Objects
Components carry instances that act at run time as prescribed by their generating component. In the simplest case, a component is a class and the carried instances are objects of that class. However, most components (whether COM or JavaBeans) will consist of many classes. A JavaBean is externally represented by a single class and thus is a single kind of object representing all possible instantiations or uses of that component. A COM component is more flexible. It can present itself to clients as an arbitrary collection of objects whose clients only see sets of unrelated interfaces. In JavaBeans or CORBA, multiple interfaces are ultimately merged into one implementing class. This prevents proper handling of important cases such as components that support multiple versions of an interface, where the exact implementation of a particular method shared by all these versions needs to depend on the version of the interface the client is using. The OMG’s current CORBA Components proposal promises to fix this problem.
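The versioning case can be sketched by letting a component hand out a separate object per interface version, so that a method with the same name behaves according to the contract the client asked for. This is only an illustration of the idea: the class names, the interface names, and query_interface are hypothetical and do not reproduce COM's actual API.

```python
class _SpellCheckV1:
    """Version 1 of the contract: matching is case-sensitive."""
    def __init__(self, words):
        self._words = words

    def check(self, word):
        return word in self._words

class _SpellCheckV2:
    """Version 2 of the contract: matching ignores case."""
    def __init__(self, words):
        self._lowered = frozenset(w.lower() for w in words)

    def check(self, word):
        return word.lower() in self._lowered

class SpellCheckComponent:
    """Presents itself to clients as a collection of interface objects."""
    def __init__(self, words):
        words = frozenset(words)
        self._interfaces = {
            "ISpellCheck/1": _SpellCheckV1(words),
            "ISpellCheck/2": _SpellCheckV2(words),
        }

    def query_interface(self, name):
        """Return the object implementing `name`, or None if unsupported."""
        return self._interfaces.get(name)

component = SpellCheckComponent(["Component", "object"])
v1 = component.query_interface("ISpellCheck/1")
v2 = component.query_interface("ISpellCheck/2")
```

Had both versions been merged into one implementing class, a single check method would have had to pick one behavior for all clients.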
Mobile Components vs. Mobile Objects
Surprisingly, mobile components and objects are just as orthogonal as regular components and objects. As demonstrated by the Java applet and ActiveX approaches, it is useful to merely ship a component to a site and then start from fresh state and context at the receiving end. Likewise, it is possible to have mobile objects in an environment that isn’t component-based at all. For example, Modula-3 Network Objects can travel the network, but do not carry their implementation with them. Instead, the environment expects all required code to already be available everywhere. It is also possible to support both mobile objects and mobile components. For example, a mobile agent (a mobile autonomous object) that travels the Internet to gather information should be accompanied by its supporting components. A recent example is Java Aglets (agent applets).
The Ultimate Difference
While components capture the static nature of a software fragment, objects capture its dynamic nature. Simply treating everything as dynamic can eliminate this distinction. However, it is a time-proven principle of software engineering to try and strengthen the static description of systems as much as possible. You can always superimpose dynamics where needed. Modern facilities such as meta-programming and just-in-time compilation simplify this soft treatment of the boundary between static and dynamic. Nevertheless, it’s advisable to explicitly capture as many static properties of a design or architecture as possible. This is the role of components and architectures that assign components their place. The role of objects is to capture the dynamic nature of the arising systems built out of components. Component objects are objects carried by identified components. Thus, both components and objects together will enable the construction of next-generation software.
Blackbox vs. Whitebox Abstractions and Reuse
Blackbox vs. whitebox abstraction refers to the visibility of an implementation behind its interface. Ideally, a blackbox’s clients don’t know any details beyond the interface and its specification. For a whitebox, the interface may still enforce encapsulation and limit what clients can do (although implementation inheritance allows for substantial interference). However, the whitebox implementation is available and you can study it to better understand what the box does. (Some authors further distinguish between whiteboxes and glassboxes, where a whitebox lets you manipulate the implementation, and a glassbox merely lets you study the implementation.)
Blackbox reuse refers to reusing an implementation without relying on anything but its interface and specification. For example, typical application programming interfaces (APIs) reveal no implementation details. Building on such an API is thus blackbox reuse of the API’s implementation. In contrast, whitebox reuse refers to using a software fragment, through its interfaces, while relying on the understanding you gained from studying the actual implementation. Most class libraries and application frameworks are delivered in source form and application developers study a class implementation to understand what a subclass can or must do.
There are serious problems with whitebox reuse across components, since whitebox reuse renders it unlikely that the reused software can be replaced by a new release. Such a replacement will likely break some of the reusing clients, as these depend on implementation details that may have changed in the new release.
-- ※ Source: 月光软件站 http://www.moon-soft.com [FROM: 202.96.184.41]