The state of Java application middleware, Part 1

Find out what 'Java middleware' really means, what products fit into this category, and how middleware may impact your efforts as a developer -- and as a business

Summary
Nearly everyone has heard the term middleware, but relatively few actually have a full understanding of its meaning and significance. In this article, author Cliff Berg reveals the current state of Java middleware. He compares the many categories of products, explains the various features, and discusses how -- and to what degree -- these features actually are supported. The author details new APIs, including Enterprise JavaBeans, the Java Transaction API, and JDBC2, in the context of current and anticipated Java middleware. Finally, he explains the role of application servers and compares them with other kinds of Java middleware.

This, the first article in a two-part series, includes a Middleware Glossary. Part 2 of this series will focus on Enterprise JavaBeans. (8,500 words)

By Cliff Berg

Client/server is dead. That's the buzz now that newer Internet-based technologies are flourishing. But those new technologies are merely the natural evolution of earlier approaches, implemented with newer, more open protocols and designed to provide greater scalability, manageability, and diversity.

The magnitude of this evolution is astounding. Most of the major client/server vendors have modernized their products and now direct their marketing dollars into three-tiered technologies. In most cases, the newer products are Java-centric and Internet-protocol-centric. For example, I identified at least 46 Java middleware products at last count. Two years ago it would have been hard to come up with half that number.

This is the first of a two-part series of articles dedicated to explaining general-purpose Java middleware in its current forms. In this first article, I'll examine the features of current products and explain why these features are important. In the second part, Anil Hemrajani will examine Enterprise JavaBeans (EJB) and show how the current generation of Java middleware products relate to and support this important component standard.

Background
First of all, let's define Java middleware. The term encompasses application servers like BEA WebLogic, messaging products like Active Software's ActiveWorks and Push Technologies's SpiritWAVE, and hybrid products that build on a DBMS legacy and add server-based Java object execution features. I could have focused on a more narrow segment, such as application servers, but that would have been unfair to the many products that don't fit this category precisely but nevertheless should be considered for multitier applications. Further, even among application servers there is quite a spectrum, including those that are primarily servlet servers as well as those that are ORB-based or OODB-based. Drawing a line between all these products proves increasingly difficult. The unifying feature, however, is that they all attempt to solve the multitier application deployment problem by using Java and Internet technologies.

The business case to use Java in middleware is compelling; among the advantages offered by Java middleware are the following:

The goal of middleware is to centralize software infrastructure and its deployment. Client/server originates from an era of integration within a single department. Organizations now commonly attempt integration across departmental boundaries -- even from one organization to another. The Internet -- which entices businesses with its ability to serve as a global network that lets departments and partners interconnect efficiently and quickly -- has generated the demand for this integration.

Java provides a lingua franca to easily interconnect data and applications across organizational boundaries: In a distributed global environment, in which you have no control over what technology choices your partners may make, smart companies choose open and platform-neutral standards. Companies cannot anticipate who will become their customers, partners, or subsidiaries two years down the road, so it isn't always possible to plan for a common infrastructure with one's partners. In this uncertain situation, the best decision may be to use the most universal and adaptable technologies possible.

Java lets you reduce the number of programming languages and platforms your staff must understand. Why? Because Java is now deployed in contexts as diverse as Internet browsers, stored procedures within databases, business objects within middleware products, and client-side applications.

At the age of three, however, Java technology is still somewhat immature, and this is true of the products discussed in this article. On the other hand, we may now be in an era when products never truly reach maturity, because the underlying technologies on which they're based change so rapidly. In fact, I've found significant problems with virtually every middleware product I've used, including supposedly mature products that have been on the market for a few years and have recently come out with significant new features. The point is, by the time a vendor manages to fix problems, new features have been added. The cycle for adding new features is now much shorter than it has ever been, and so products don't have enough time to become stable before they include the next major feature set. This may be something we have to get used to, and learning the strengths and weaknesses of the products we choose is an important part of any application design and prototype cycle.

Goals for middleware
The EJB middleware component standard was developed with the following goals:

The focus of the EJB standard is therefore on creating an interoperability standard for Java middleware, shielding programmers from having to deal with many of the difficult issues that arise when developing distributed applications. This allows software developers to concentrate on business logic instead of writing sophisticated homegrown infrastructure and tools. As a result, businesses can put most of their educational resources into training staff in business processes, which typically is what provides the greatest payoff.

To the list above, I add the following additional goals for enterprise-class Java middleware. These product features are needed in the long term in order to successfully run and maintain a middleware-based environment:

Components and features of Java middleware platforms
The fastest-growing category of Java middleware today is probably application servers. However, it is essential to realize how wide a variety of application servers (and other kinds of middleware products) exists. Distinctions among Java middleware product categories today are represented not by a line but by a vast middleware continuum. I will now discuss the features of Java middleware, based on my own comparison work, which encompasses every class of Java middleware product I know about.

Object, component, and container models
Application components must adhere to some runtime deployment model, which specifies how the component communicates with its environment; (possibly) how it is installed, started, stopped, and called; and how it accesses services important for its environment. Popular Java-centric server-component runtime and container models include RMI, EJB, CORBA, DCOM, servlet, JSP (Java Server Pages), and Java stored procedures. In addition, the component models can be expressed in a variety of underlying languages, including Java, IDL, C++, and many others.

There is overlap among the various component models. For example, RMI is a trivial component model with very basic facilities for object activation and location, and is primarily a remote invocation standard, whereas EJB leverages RMI and specifies RMI as its primary object invocation model. EJB also supports CORBA. In fact, none of these models is exclusive, and many Java application servers support most or all of the models above.

Many Java middleware servers run multiple business-object instances (which the CORBA world now calls servants) within a single Java virtual machine (JVM). The type-safety of the Java language allows a single JVM process to service requests from multiple clients and use program data structures and class loaders to keep client data separate. As long as servants don't employ their own native methods, it isn't possible for one servant to bring other servants down if it crashes (unless it encounters a bug in the JVM itself), or access data that is private to other classes. A properly designed object server will protect its private objects and prevent errant objects from accessing what belongs to other objects.

However, data declared static in a Java class can be shared among clients within the same JVM if the clients use the same class loader, so rules need to be defined to dictate when a separate JVM (or the equivalent of a separate JVM using memory-partitioning techniques) or separate class loader is needed to give a client its own static data space. Such rules have been specified for applets, but not for other execution environments. Sun's Java Web Server uses a single JVM for all servlets and a separate class name space for servlets with a different code base. EJB circumvents the issue by prohibiting nonfinal static data.
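The hazard described above can be seen in a few lines. The following is only a minimal sketch: OrderServant and its fields are hypothetical names invented for illustration, and both servants are assumed to be loaded by the same class loader in one JVM.

```java
// Hypothetical servant class; the names are illustrative, not from any product.
class OrderServant {
    // Nonfinal static: one copy per class loader, visible to every client
    // whose servant was loaded by that loader.
    static int requestsSeen = 0;

    // Final static constant: harmless to share among clients.
    static final int MAX_ITEMS = 100;

    // Instance field: private to the servant that serves one client.
    private int itemsInCart = 0;

    int addItem() {
        requestsSeen++;                 // leaks activity across clients
        if (itemsInCart < MAX_ITEMS) {
            itemsInCart++;
        }
        return itemsInCart;
    }
}

class StaticSharingDemo {
    public static void main(String[] args) {
        OrderServant forClientA = new OrderServant();
        OrderServant forClientB = new OrderServant();
        forClientA.addItem();
        forClientA.addItem();
        int cartB = forClientB.addItem();
        // Client B's cart holds only its own item, but the static counter
        // has been incremented by both clients.
        System.out.println("client B cart = " + cartB);
        System.out.println("shared counter = " + OrderServant.requestsSeen);
    }
}
```

Each client's cart stays separate because it lives in an instance field, while the static counter reflects calls from both clients. A server that wanted to isolate the counter would have to load OrderServant through a separate class loader (or JVM) per client.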

Performance can be increased if objects are inactivated or passivated when not in use, freeing up resources such as database connections. For this reason, many servers passivate and reactivate objects as appropriate. Similarly, some products keep frequently created objects in a pool or cache in an initialized state and ready for immediate use. The object container must manage passivation and reactivation as well as the pooled resources affected by passivation.
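As a rough illustration of passivation and reactivation, the sketch below shows a container that releases a servant's resources when told the servant is idle and reacquires them on the next call. Passivatable, AccountServant, and TinyContainer are hypothetical names, and the boolean flag merely stands in for a real database connection; actual products decide when to passivate using their own policies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lifecycle contract between a container and its servants.
interface Passivatable {
    void activate();   // reacquire resources before servicing calls
    void passivate();  // release resources while idle
}

class AccountServant implements Passivatable {
    boolean connected = false;  // stands in for a held database connection

    public void activate() { connected = true; }
    public void passivate() { connected = false; }

    String balance() {
        if (!connected) {
            throw new IllegalStateException("servant is passive");
        }
        return "balance-ok";
    }
}

class TinyContainer {
    private final Map<String, AccountServant> servants = new HashMap<>();
    private final Map<String, Boolean> active = new HashMap<>();

    void deploy(String id, AccountServant servant) {
        servants.put(id, servant);
        active.put(id, false);      // servants start out passive
    }

    // Reactivates the servant on demand, then forwards the call.
    String invokeBalance(String id) {
        AccountServant servant = servants.get(id);
        if (servant == null) {
            throw new IllegalArgumentException("unknown servant: " + id);
        }
        if (!active.get(id)) {
            servant.activate();
            active.put(id, true);
        }
        return servant.balance();
    }

    // Called when the container decides the servant has been idle too long.
    void passivateIdle(String id) {
        AccountServant servant = servants.get(id);
        if (servant != null && active.get(id)) {
            servant.passivate();
            active.put(id, false);
        }
    }
}
```

The client never sees the lifecycle calls; it simply invokes the servant through the container.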

EJB compatibility (version)
The EJB model already is becoming universally supported. Nearly every middleware vendor has promised to support it and many already do. Moreover, the Object Management Group (OMG) has incorporated a mapping to EJB as part of the proposed CORBA Component Specification. It's hard to imagine that even Microsoft, the lone and steadfast holdout, won't eventually yield and provide EJB containers for DCOM.

While different EJB-compatible middleware can deploy and operate the same application components (as long as those components use only standard required EJB features), there is still a great deal of variation among EJB-compliant servers. For one thing, the EJB specification itself is evolving. An important question when evaluating Java middleware products, therefore, is: Does the server support the latest version of EJB, or does it support only an earlier version? Another key question is: Is the product EJB-centric, or is EJB support included only in the product's value-added features (and thus unavailable when EJB services or APIs are used)? And finally: Which optional EJB features are included (for example, entity beans and container-managed persistence)?

Sometimes, major products can surprise you in terms of what they support. For example, IBM advertised its WebSphere application server as an EJB application server, but Version 1 supported EJB entity beans, not session beans. This is interesting because EJB requires session beans for compatibility, whereas the more complex entity beans are optional. (The latest release of WebSphere, Version 2, includes support for session beans.)

Java execution machine
The execution environment for a Java application is provided by a runtime facility that is able to either directly execute Java bytecode or convert Java bytecode to an executable form and then execute it. The most common type of execution environment is a JVM, which interprets Java bytecode at runtime, usually caching some bytecode operations (by converting them into equivalent "quick ops") as well as virtual method bindings so that they don't have to be repeated. Another form of execution machine is the just-in-time (JIT) compiler, which compiles bytecode to native code once, as needed, and (nonpersistently) caches the compiled code in memory. Sun's HotSpot JVM is a third kind of execution machine; it performs runtime analysis of code execution and performs selective inlining based on this runtime analysis. A fourth variety is Oracle's stored-procedure JVM, which provides native compilation and persistent caching of the native binary compiled code, resulting in extremely high execution speed.

Most application servers, such as BEA WebLogic, come bundled with their own Java execution machine. In many cases, the JVM is replaceable, but at the user's risk, since the server itself may be written in Java and therefore tested only with the provided JVM. Over time, expect middleware as well as other kinds of Java environments to rely on each platform's native JVM (almost an oxymoron!), since JVMs now are delivered as standard components on many platforms, including Mac OS, Solaris, Windows NT, and Windows 98.

Java version compatibility
During the Java Business Expo in New York City this past December, Sun announced the Java 2 platform (aka JDK 1.2). JDK 1.02 is largely history now for most platforms, so I am most concerned with Java versions 1.1 and 2. Version 2 is considered the "enterprise" platform and contains important API and framework additions to allow open middleware to be written.

Generally, application servers execute with a single JVM or a set of identical pooled JVMs per server machine, so all deployed applications must be compatible with that JVM and its version of Java. If multiple JVM versions are needed, it may be possible to run a separate application server instance for each version. In addition, a JVM instance must be compatible with any protocol stub classes that need to reside on the clients it will be talking to. Since such stub classes are nowadays typically object protocol stubs (such as RMI or CORBA stubs, which are generated automatically and compiled as part of the servant skeleton code generation), and since the servant skeleton code runs in the same JVM as the servant, this implies that both the client and the server must be JVM-compatible, even though they run in separate JVMs. This isn't a requirement, just a fact of life. Java 2 JVMs are generally backward compatible with JDK 1.1 JVMs, so this shouldn't be a problem.

As of this writing, many application servers already support Java 2. Movement to the Java 2 platform will be much more rapid than the movement from 1.02 to 1.1; there is no AWT compatibility issue to hinder the move; the features of Java 2 are important for middleware platforms; and Java 1.2 has been available in beta form for more than a year (and most application server vendors have been working to support it).

Resource connections
A value-added feature for middleware products is the range of data sources to which they can connect. Implementing reliable, high-throughput, distributed, transaction-capable data source drivers is no small task. Many products come with adapters or drivers that let applications connect to particular data sources (such as specific databases, CICS, 3270 (screen scraping), MOM products, SAP, and DCOM bridges) as well as Internet service protocols (such as SMTP). Connecting with a general-purpose JDBC or ODBC driver often isn't a viable solution if several data sources need to be integrated into a single transaction or if high throughput is required.

An important part of the EJB specification concerns resource pooling. The resource management and transaction management APIs are closely related because resources are transactional objects. Most middleware products provide connection-pooling features because establishing a connection to a data resource is an expensive operation. Current products providing connection pools usually do it in one of two ways:

  1. Integrate connection pooling into data source drivers
  2. Integrate connection pooling into a data access API
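Whichever of the two places the pool lives, the core mechanism is the same: hand out an idle connection if one exists, and create a new one only when necessary. The following is a minimal sketch of that mechanism, not any vendor's implementation; SimplePool and its methods are hypothetical names, and a generic type parameter stands in for a real connection class.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical pool; T stands in for an expensive-to-create connection.
class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxIdle;
    private int created = 0;

    SimplePool(Supplier<T> factory, int maxIdle) {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    // Reuses an idle resource if one is available; otherwise creates one.
    synchronized T acquire() {
        T pooled = idle.pollFirst();
        if (pooled != null) {
            return pooled;
        }
        created++;
        return factory.get();
    }

    // Returns a resource to the pool, keeping at most maxIdle of them.
    synchronized void release(T resource) {
        if (idle.size() < maxIdle) {
            idle.addFirst(resource);
        }
        // Otherwise the resource is dropped; a real pool would close it here.
    }

    synchronized int createdCount() {
        return created;
    }
}
```

A caller brackets each unit of work with acquire() and release(); the expensive creation step then runs only when the pool is empty.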

The Java Database Connectivity (JDBC) Version 2 Specification and the Java Transaction API (JTA) Specification together define an API for JDBC drivers to interconnect with a transaction manager and a connection pool manager. Because the specification is brand new, I know of no commercial products that implement it; in fact, as of this writing JTA is only in release 0.93. However, these APIs certainly will be implemented by Java-centric middleware products in the near future.

Some products provide object-to-data-source mapping, allowing a non-object-oriented data source such as a relational database to appear as if it were a persistent object-oriented source, such as an object database. These "mapping" products can work in various ways, but usually involve post-processing of application classes to replace object references (such as the getfield JVM instruction) with calls into the mapping layer. The products usually require you to use either their drivers or a restricted set of third-party drivers, which may be available for only a limited set of platforms and DBMSs.

Transaction management
One of the primary reasons to use Web-based middleware is to build applications that access data sources from different -- perhaps geographically distant -- parts of an organization, or at least to have the flexibility to do so. (The alternative is to install expensive leased lines and use a client/server solution, but this has given way to the much more cost-effective approach of using the Internet and simple and accessible Web protocols.)

The ability to cost-effectively tie together multiple data sources from any and all parts of an organization -- for example, a call center in Houston and, let's say, an order fulfillment center in Wichita -- has increased the importance of the requirement to create distributed transactions using data sources of different kinds. The Gartner Group now uses the term zero latency business to describe businesses that are able to complete a business process end-to-end in one stroke, rather than dividing into queued steps and allowing the entire process to complete over a period of time. To do this, the entire business process must be handled as a single software transaction, even if it accesses multiple databases from different vendors, possibly in different geographic locations. Further, any transactional system needs to be as recoverable and reliable as if one were dealing with a single database.

Because this requirement has only recently become important, true support for distributed transactions, while claimed by many, is actually quite spotty. This situation will change when vendors begin to implement the JDBC 2 Standard Extension and Java Transaction API specifications. These specifications provide a framework for building pure Java drivers, and for Java-based transactional resource managers to interface with the driver. In the meantime, if a product advertises both JDBC compatibility and distributed transactions, the claim may be true, but exercise healthy skepticism.

Most products today allow multiple data sources to be accessed as part of a distributed transaction, but only if all data sources are the same kind and use the same driver. The reason, according to some middleware vendors, is that many of the distributed transaction C libraries supplied by database vendors appear to be non-reentrant or otherwise inadequate at this time. Therefore, most products claiming distributed transaction support actually provide it reliably only for multiple data sources from the same kind of database, so that the middleware vendor can utilize the database's native (and proprietary) distributed transaction capability.

This is different from, and much more restrictive than, using the data source's public distributed transaction interface (assuming it has one) to coordinate with other kinds of databases. The ability to coordinate multiple databases is lacking in many products at this time. Those that do provide true distributed transaction support across multiple databases, such as BEA WebLogic, typically accomplish it by using a multiprocess architecture and shared-memory message queues to work around the aforementioned non-reentrancy in the database vendor libraries.

Middleware products generally connect to their transaction management services at the driver level, so to access the middleware's transaction manager you currently have to use the middleware's database drivers. This means that if the vendor doesn't have a driver for every data source you want to talk to, you're out of luck. This situation will change rapidly as soon as the JTA and JTS specifications are finalized. Already some vendors (such as Gemstone, which uses drivers developed by Intersolv) provide JDBC drivers that track the upcoming standards. Such drivers connect to most databases and interface to the middleware's transaction management service in a way that will become plug-replaceable as the standard is adopted by more and more driver and middleware vendors.

Adding the checks and logic for handling distributed transactions to an application requires advanced programming expertise. An organization should not need advanced programming skills just to maintain its business logic. Therefore, there has been an effort to separate transactional logic -- a vital part of a distributed application -- from business logic. To this end, EJB-compliant application servers deal with transactions in a declarative manner, and programmers are discouraged from referencing transactions explicitly in their code. Enterprise beans, if deployed in their default manner, aren't even permitted to perform transactional operations -- for example, a commit() on a data source connection will cause an exception to be thrown. The commit is performed by the middleware that allocated the thread in which the remote call is being serviced. It calls commit() on the current global transaction, rather than on the individual data source. This is transparent to the application.
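To make the declarative idea concrete, here is a minimal sketch of a container that brackets a business method with a transaction so the business code never calls commit() itself. GlobalTransaction and TransactionalContainer are hypothetical stand-ins invented for this illustration; they are not the EJB or JTA interfaces, and the transaction here only records what was asked of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the middleware's global transaction.
class GlobalTransaction {
    final List<String> log = new ArrayList<>();

    void begin()    { log.add("begin"); }
    void commit()   { log.add("commit"); }
    void rollback() { log.add("rollback"); }
}

// Hypothetical container that owns transaction demarcation.
class TransactionalContainer {
    private final GlobalTransaction tx;

    TransactionalContainer(GlobalTransaction tx) {
        this.tx = tx;
    }

    // Runs the business method inside a transaction: commit on normal
    // return, rollback if the business method throws.
    <R> R invoke(Supplier<R> businessMethod) {
        tx.begin();
        R result;
        try {
            result = businessMethod.get();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
        tx.commit();
        return result;
    }
}
```

The business method passed to invoke() contains only business logic; whether it runs in a transaction at all becomes a property of how it is deployed rather than of how it is written.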

An important aspect of using distributed transactions in an application is that the underlying mechanisms should be reliable. First of all, you assume the data sources are reliable, and that if one goes down before a two-phase commit completes, the downed system will be able to recover itself and the lost transaction, bringing it back in sync with all the other systems. You also assume, however, that the transaction system itself is reliable, and that if it fails, you won't, for example, have "stuck queues" to be restarted and databases to be fixed by hand, as is the case with a legacy transactional workflow product I'm familiar with.
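For readers unfamiliar with the two-phase commit mentioned above, the sketch below shows its basic shape: every data source is first asked to prepare, and only if all of them vote yes does the coordinator tell them to commit. Participant, RecordingParticipant, and TwoPhaseCoordinator are hypothetical names, and the sketch deliberately omits the logging and recovery that make a real transaction manager reliable, which is exactly the part discussed above.

```java
import java.util.List;

// Hypothetical view of a data source taking part in a global transaction.
interface Participant {
    boolean prepare();  // phase 1: vote yes (true) or no (false)
    void commit();      // phase 2, if everyone voted yes
    void rollback();    // phase 2, if anyone voted no
}

// Test double that records the outcome it was told to apply.
class RecordingParticipant implements Participant {
    private final boolean vote;
    String outcome = "pending";

    RecordingParticipant(boolean vote) {
        this.vote = vote;
    }

    public boolean prepare() { return vote; }
    public void commit()     { outcome = "committed"; }
    public void rollback()   { outcome = "rolled back"; }
}

class TwoPhaseCoordinator {
    // Returns true if the transaction committed on all participants.
    static boolean run(List<? extends Participant> participants) {
        // Phase 1: collect votes, stopping at the first "no".
        boolean allYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allYes = false;
                break;
            }
        }
        // Phase 2: apply the same decision everywhere.
        for (Participant p : participants) {
            if (allYes) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allYes;
    }
}
```

The window between the two phases is where a crash leaves a participant prepared but undecided, which is why recoverability of both the data sources and the coordinator matters.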

CORBA support
Most application servers provide CORBA support, either by providing their own ORB, as in the case of BEA M3, or by bundling a third-party ORB such as Visibroker or Orbix. In many cases, the ORBs provided are derived from other ORBs; Oracle's ORB, for example, is derived from Visigenic's.

CORBA support is a matter of degree, however. There are many standard CORBA services, the most important of which is the CORBA COS Naming Service. If a product provides COS Naming, it can publish persistent objects that other ORBs can find in a standard way. Objects published with COS Naming can be accessed using JNDI, via the COS Naming JNDI provider implementation.

Notification, a new and important service, defines a messaging service with quality-of-service levels. Provider implementations are likely to be developed to access this service using the Java Message Service (JMS). The JMS allows you to choose quality-of-service levels if the underlying message service provides them.

Support for the new Portable Object Adapter (POA) Specification and Java binding also is an important capability for CORBA applications, as it provides a vendor-neutral standard for object persistence and activation. At this time, few vendors provide POA support. Most likely EJB applications will not be exposed to POA particulars, and POA merely will represent an implementation service behind the middleware covers. Still, it does present the possibility of portable object containers, which Sun has said it is exploring for future EJB enhancements.

The initial release of Java 2 did not include support for transporting RMI calls with the IIOP protocol. However, this support is now available as an early access technology and will likely be in the next JDK release. It is an important integration feature for linking CORBA and Java systems. For middleware to fully support RMI over IIOP, development tools will need to support RMI clients accessing CORBA servants (generation of required RMI-compliant IDL interfaces and accompanying RMI interfaces); and they will need to support CORBA clients and RMI over IIOP (generation of IDL for RMI servants).

Finally, few people realize that there is no RMI standard for propagating transaction context from one transaction manager to another, whereas IIOP provides such a standard. If you are building an enterprise-class system, you might therefore be advised to choose middleware that includes an ORB.

Service and object directories
To use a remote service of any kind, you have to know how to find it. Most middleware incorporates a directory service of some kind, allowing an application component or client to find other objects or services. The CORBA COS Naming and Trading services are examples of directory services, as are RMI's Registry and Activation Daemon.

Some products provide their own variations and value-added features for object lookup, usually related to load balancing, as will be discussed below. In addition, different directory services utilize different object naming schemes and protocols. One well-known naming scheme is the URL, and many ORBs allow applications to register objects using a URL; in fact, CORBA defines two standards -- iiop and giopref -- for naming objects using URLs. In this case, iiop uses an "iiop:" URL protocol scheme, while giopref represents a remote object reference in string form.

Load balancing and replication
Presently, load balancing is most commonly achieved at the point of object or service location. This is true even of Web servers, for which DNS-based load balancers and IP redirectors balance load by mapping an incoming request for a specific server machine (specified by domain name or by IP address) to one of a set of servers in an IP pool.

Application servers present a more complex requirement because the service is manifested by multiple object servants, potentially replicated on multiple machines, instead of a single Web server per machine. Furthermore, while traditional HTTP connections are logically stateless, object connections may not be. An object locator therefore needs to distinguish between stateless and stateful object connections when making reconnection decisions. Object lookup is often initiated in a Web server-based component such as an ISAPI or NSAPI module, a servlet, or server-side JavaScript.

Object load balancing can be implemented during object lookup in several ways:

  1. Each client can have an object lookup proxy that scans a domain for a particular type of object; listeners in the domain reply if they can supply an object of that type
  2. A centralized (but replicated for redundancy) load-balancing lookup service can broadcast up-to-date load information to clients; clients then make decisions based on the broadcast information
  3. A centralized load-balancing lookup service can be made available for clients to contact
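The third approach in the list above, a central lookup service that clients contact, can be sketched as follows. The least-loaded policy, the class name LoadBalancingLookup, and the host names are all assumptions made for illustration; actual products use their own policies and load metrics.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical central lookup service that picks a server for each client.
class LoadBalancingLookup {
    // Host name -> most recently known number of active requests.
    private final Map<String, Integer> loadByHost = new LinkedHashMap<>();

    // Servers (or a monitor) report their current load here.
    synchronized void reportLoad(String host, int activeRequests) {
        loadByHost.put(host, activeRequests);
    }

    // Returns the least-loaded host; ties go to the host registered first.
    synchronized String locate() {
        String best = null;
        int bestLoad = 0;
        for (Map.Entry<String, Integer> entry : loadByHost.entrySet()) {
            if (best == null || entry.getValue() < bestLoad) {
                best = entry.getKey();
                bestLoad = entry.getValue();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no servers registered");
        }
        // Count the client just assigned so that consecutive lookups
        // spread out before the next load report arrives.
        loadByHost.put(best, bestLoad + 1);
        return best;
    }
}
```

In an EJB setting, logic of this kind could sit behind the name service or behind the Home object's create() call, so the client receives a reference to whichever server was chosen.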

Object load-balancing systems can use a variety of policies, algorithms, and information to make decisions. Products I've studied use the following techniques:

In the future, expect quality of service to be increasingly important as a load-balancing criterion.

Presently, load balancing is treated by EJB as a value-added feature. A natural place for load balancing to occur is in the Home interface create() call. This call is used by clients to obtain an instance against which they can begin making remote calls. Alternatively, load balancing also can be implemented in the name service that finds the Home object.

The new POA specification provides a standard means of implementing load balancing by performing invocation redirection. This isn't widely implemented yet, but will likely replace current schemes in the near future. For beans invoked through an ORB, this is a standard mechanism that could be employed in future products.

Clusters and federation
Clustering refers to the ability to define a collection of servers that provide a common set of replicated services. A cluster can be implemented at several levels, including the communication level, the middleware level, or the application level. A cluster may be dedicated to servicing a local group of clients, allowing clients to access the cluster for most requests and thereby minimizing the load on central systems outside the cluster.

Clients may even maintain continuous network connections with cluster servants without impacting scalability. In this role, the cluster serves as a concentrator and requests that can't be satisfied locally are forwarded to another cluster using a small number of cluster-to-cluster connections. Messaging systems are often tiered in this way.

Federation refers to the ability to link clusters or service domains together to form a larger whole, with each cluster handling a set of local services. If a request is for a service beyond the local domain, it is automatically forwarded to the appropriate service domain. Directory services use federation to link directory domains together in a controlled way. Federation is supported by LDAP 3 and is generally transparent to application code. There are proprietary implementations of federated object servers such as Visibroker's Location service, which uses OSAgents to forward requests between linked subnetworks to find a desired object type.
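The forwarding behavior that defines federation can be sketched in a few lines. ServiceDomain and its methods are hypothetical names, not the API of LDAP or of any ORB; the sketch assumes each domain knows its own services and a list of linked peer domains, and it tracks visited domains so that mutually linked domains do not forward a request in circles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical service domain that answers locally or forwards to peers.
class ServiceDomain {
    final String name;
    private final Map<String, String> localServices = new HashMap<>();
    private final List<ServiceDomain> peers = new ArrayList<>();

    ServiceDomain(String name) {
        this.name = name;
    }

    void register(String service, String endpoint) {
        localServices.put(service, endpoint);
    }

    void linkTo(ServiceDomain peer) {
        peers.add(peer);
    }

    // Entry point used by clients of this domain.
    Optional<String> resolve(String service) {
        return resolve(service, new HashSet<>());
    }

    private Optional<String> resolve(String service, Set<ServiceDomain> visited) {
        if (!visited.add(this)) {
            return Optional.empty();        // already asked this domain
        }
        String local = localServices.get(service);
        if (local != null) {
            return Optional.of(local);      // satisfied within the domain
        }
        for (ServiceDomain peer : peers) {  // otherwise forward the request
            Optional<String> remote = peer.resolve(service, visited);
            if (remote.isPresent()) {
                return remote;
            }
        }
        return Optional.empty();
    }
}
```

The client asks only its local domain; whether the answer came from that domain or from a linked one is invisible to it.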

Failure detection and failover
Many products provide failover capability; i.e., they detect system component failures and dynamically substitute a replacement component for the failed component. The degree to which this is automatic and transparent to the client varies. For example, a servant-object failure can be handled in the client stub or it can be handled on the server. If the client application and application-specific servants do not receive any exceptions due to a failure and the failure is completely handled by the middleware components, the failover capability is considered transparent.

Transparent failover is important, since the exceptions that can be thrown don't usually contain enough information to allow the client to decide what to do in a platform-independent manner; further, it makes sense that if failover is supported, it be automatic, since the kinds of errors a platform can produce as well as the corrective actions required tend to be inherently product-specific.
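Failover handled in the client stub might look roughly like the sketch below. QuoteService, ReplicaDownException, and FailoverQuoteStub are hypothetical names, and the replicas are plain local objects standing in for remote servants. The sketch also assumes the call is stateless, so that replaying it on another replica is safe; a stateful servant would need its state replicated first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical remote interface seen by the client application.
interface QuoteService {
    String quote(String symbol);
}

// Hypothetical unchecked exception signalling that a replica is unreachable.
class ReplicaDownException extends RuntimeException {
    ReplicaDownException(String message) {
        super(message);
    }
}

// Client-side stub that hides replica failures from the application.
class FailoverQuoteStub implements QuoteService {
    private final List<QuoteService> replicas;

    FailoverQuoteStub(List<QuoteService> replicas) {
        this.replicas = new ArrayList<>(replicas);
    }

    public String quote(String symbol) {
        ReplicaDownException last = null;
        for (QuoteService replica : replicas) {
            try {
                return replica.quote(symbol);   // first healthy replica wins
            } catch (ReplicaDownException e) {
                last = e;                       // try the next replica
            }
        }
        // Only when every replica has failed does the client see an error.
        if (last == null) {
            last = new ReplicaDownException("no replicas configured");
        }
        throw last;
    }
}
```

Because the stub implements the same interface as the servant, the client code is identical whether or not failover is in play, which is what makes the failover transparent.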

Failure can occur outside the realm of application servant objects. For example, the system hosting the directory service could fail, or the system containing the servant object or containing a client's session state could fail. To protect against this, many products provide redundancy in these components and implement a replicated client state model. Thus, if any one system fails, there is a means to recover. In the context of EJB, this can be provided by distributing the state information for a container-managed or bean-managed entity bean. The problem, however, is that there might be state information created since the last transaction commit, and the server may not be able to recover that state information in a failover, in which case it should retry the current operation. Even so, it's always possible that the current transaction may be aborted due to an unsuccessful failover attempt, and the application must be prepared to retry an aborted operation.

Types of persistence supported
The term persistence is very ambiguous and is used in many different ways. In the context of CORBA, a persistent object is one that can be reactivated at any time and that knows how to re-establish its own internal state. Another concept is session persistence, which generally means that a client has reserved for itself some form of session state on a server, which is maintained even if connections are destroyed and re-established.

For the EJB model, there are two kinds of servant objects: session objects and entity objects. I'll look at these in detail in Part 2 of this series, but briefly, an entity object is somewhat analogous to CORBA's persistent object, in the sense that it can be looked up and reactivated by any client at any time. A session object -- which may or may not be stateful -- lacks this ability. (As it turns out, the naming of these object types might be confusing in some applications, because entity objects can be employed to implement session persistence, and session objects can be used to implement stateless data access services for backend data sources. The primary intended pattern of use of entity beans, however, is for encapsulation of arbitrary persistent data sources.)

These aren't the only models. Further, the way in which state is maintained varies among servers and among protocols. Web servers provide state mechanisms that rely on client-based storage, called "cookies." Java-capable Web servers that support servlets can maintain session state in the servlet itself, if designed to do so. This issue also is related to object activation, which I'll discuss below.

The EJB model further distinguishes between bean-managed state and container-managed state. Servers that support container-managed state implement state persistence automatically, using technology that is the choice of the server. In that case, a declarative means is used to tell the server at deployment time which fields in a servant should be persisted. The server might even provide a mechanism for controlling the mapping of persistent entity fields to external data sources. This is an area for considerable value-added features.
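To make the idea of declaring persistent fields concrete, here is a toy sketch in plain Java. It does not use the EJB API; the Account class, the field list, and the store and load methods are all hypothetical stand-ins for what a server and its deployment tooling would provide. The point is only that the servant contains no persistence code and that the list of fields lives outside it.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ContainerManagedStateSketch {

    /** An application servant that contains no persistence code of its own. */
    static final class Account {
        String id;
        long balanceCents;
        String cachedDisplayName; // not declared persistent, so never stored
    }

    /** Copies the declared fields out of the bean; a real server would write them to its store. */
    static Map<String, Object> store(Object bean, List<String> persistentFields)
            throws ReflectiveOperationException {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String name : persistentFields) {
            Field field = bean.getClass().getDeclaredField(name);
            field.setAccessible(true);
            row.put(name, field.get(bean));
        }
        return row;
    }

    /** Restores previously stored values into a fresh bean instance. */
    static void load(Object bean, Map<String, Object> row) throws ReflectiveOperationException {
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            Field field = bean.getClass().getDeclaredField(entry.getKey());
            field.setAccessible(true);
            field.set(bean, entry.getValue());
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // Stand-in for the deployment-time declaration of persistent fields.
        List<String> declared = Arrays.asList("id", "balanceCents");

        Account original = new Account();
        original.id = "A-1";
        original.balanceCents = 2500;
        original.cachedDisplayName = "temp";

        Map<String, Object> row = store(original, declared);
        Account restored = new Account();
        load(restored, row);
        System.out.println(row + " -> " + restored.id + ", " + restored.balanceCents);
    }
}
```

A real server would map each declared field to a column or other external data source rather than to an in-memory map, which is where the value-added mapping features mentioned above come in.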

Some products build on a persistent object heritage and take this one step further. For example, the GemStone/J application server (which now supports Java 2) provides a persistent-object JVM that automatically stores those object member variables that are designated as persistent. More commonly, the application server supporting container-managed persistence implements it with automatic and transparent storage of members as columns in a separate relational database.

Connection models
An important issue for client design is whether the application-server client components support multithreading. Many don't. For example, if the client invokes a remote operation in a main thread, but the user clicks on a control that causes another operation to be initiated in the UI thread, what happens? Either the second call will block until the first one is done, or both calls will be multiplexed on one connection and proceed to the server in parallel. Many middleware client components don't support multithreading at all -- each client JVM can make only one call at a time to the given server!

Once a request is received by the server, a server-side component is invoked to process it. If this component is a persistent application servant, accessible from multiple clients, it either must be designed to be re-entrant so that each remote call doesn't clobber other concurrent calls, or the middleware must synchronize access to the component. EJB entity beans are automatically synchronized by EJB servers by imposing transaction isolation on their state. Enterprise beans aren't permitted to use thread-synchronization primitives or multithreading directly.
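The second option, in which the middleware rather than the servant synchronizes access, can be sketched with a dynamic proxy from java.lang.reflect. This is an illustration only and not how any particular server does it (EJB servers, as noted, rely on transaction isolation instead). The Counter interface, UnsafeCounter class, and serialize method are hypothetical names invented for the example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

final class SerializedServantSketch {

    /** Remote interface of a hypothetical servant. */
    interface Counter {
        int increment();
    }

    /** Not re-entrant: the read-modify-write below can lose updates under concurrent calls. */
    static final class UnsafeCounter implements Counter {
        private int value;

        @Override
        public int increment() {
            int seen = value;
            Thread.yield(); // widen the race window for demonstration purposes
            value = seen + 1;
            return value;
        }
    }

    /** Wraps a servant so that only one call at a time runs inside it. */
    static <T> T serialize(Class<T> iface, T servant) {
        Object lock = new Object();
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (self, method, args) -> {
                    synchronized (lock) {
                        try {
                            return method.invoke(servant, args);
                        } catch (InvocationTargetException e) {
                            throw e.getCause(); // surface the servant's own exception
                        }
                    }
                });
        return iface.cast(proxy);
    }

    public static void main(String[] args) throws InterruptedException {
        Counter counter = serialize(Counter.class, new UnsafeCounter());
        Thread[] clients = new Thread[4];
        for (int i = 0; i < clients.length; i++) {
            clients[i] = new Thread(() -> {
                for (int call = 0; call < 1000; call++) {
                    counter.increment();
                }
            });
            clients[i].start();
        }
        for (Thread client : clients) {
            client.join();
        }
        System.out.println("calls recorded: " + (counter.increment() - 1));
    }
}
```

The servant stays free of synchronization primitives, which is the same division of labor the EJB rules enforce; the cost is that concurrent calls to one servant instance are processed strictly one at a time.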

To minimize the use of communication resources between client and server, some products multiplex all supported protocols onto a single communication connection with each client. When this occurs, the implementation should make sure it allocates more connections when the load exceeds a configurable threshold. It seems that most products fail to do this. Without such a safeguard this single connection can bottleneck in some server-to-server applications. Multiplexing all logical connections onto a single communication connection also is a means of simplifying the required firewall configuration, since at most only a few ports need to be opened for the server. Alternatively, a proxy can be employed to do the multiplexing; the Orbix Wonderwall, for example, employs this approach.
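The safeguard described above, allocating another connection once the load on the existing ones passes a configurable threshold, might look roughly like the following. This is a simplified sketch: Connection and Multiplexer are hypothetical classes, "load" is taken to mean the number of calls in flight, and no real sockets are opened.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiplexingSketch {

    /** Stand-in for one physical connection carrying several logical calls. */
    static final class Connection {
        final int id;
        int activeCalls;

        Connection(int id) {
            this.id = id;
        }
    }

    static final class Multiplexer {
        private final int maxCallsPerConnection; // the configurable threshold
        private final List<Connection> connections = new ArrayList<>();

        Multiplexer(int maxCallsPerConnection) {
            if (maxCallsPerConnection < 1) {
                throw new IllegalArgumentException("threshold must be at least 1");
            }
            this.maxCallsPerConnection = maxCallsPerConnection;
            connections.add(new Connection(0)); // start with a single shared connection
        }

        /** Picks the least-loaded connection, opening another only when all are at the threshold. */
        synchronized Connection acquire() {
            Connection best = connections.get(0);
            for (Connection candidate : connections) {
                if (candidate.activeCalls < best.activeCalls) {
                    best = candidate;
                }
            }
            if (best.activeCalls >= maxCallsPerConnection) {
                best = new Connection(connections.size());
                connections.add(best);
            }
            best.activeCalls++;
            return best;
        }

        /** Marks one logical call on the given connection as finished. */
        synchronized void release(Connection connection) {
            connection.activeCalls--;
        }

        synchronized int connectionCount() {
            return connections.size();
        }
    }

    public static void main(String[] args) {
        Multiplexer mux = new Multiplexer(2);
        mux.acquire();
        mux.acquire();
        System.out.println("connections under light load: " + mux.connectionCount());
        mux.acquire(); // third concurrent call exceeds the threshold
        System.out.println("connections under heavier load: " + mux.connectionCount());
    }
}
```

Under light load everything still travels over one connection, so the firewall-friendly property is preserved; extra connections appear only when the single channel would otherwise become the bottleneck.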

Application and business logic deployment
Multitier applications have distributed components that must be upgraded simultaneously. The components can be installed on clients; they can be Enterprise JavaBeans; they can be SQLJ stored procedures and database schemas; or they can be any other form in which business rules and data can be defined. Regardless of what form the business logic components take, a distributed system needs a strategy for deploying upgrades and modifications, so that current users don't experience errors -- or intolerable downtime -- as a result of the upgrade.

This isn't an easy problem to solve. The thin-client revolution standardized clients to some extent, and for many applications deploying business rules and code to clients is no longer necessary. However, even if this is the case, the middle tier must still be kept synchronized with data and other tiers. Many application server products attempt to address the deployment problem, including BEA WebLogic's ZAC/JRunner for client deployment and Jaguar with Castanet. If you're using IBM Visual Age and Tivoli, you can use Tivoli Beans to add support for Tivoli's Install and Synchronize feature.

The problem of deploying and installing middleware components to multiple geographically separated servers is still a separate issue from simply getting the code to the right server. An organization that needs remote-servant upgrade and installation capability should make sure that middleware administration and business logic deployment features are scriptable, or provide remote deployment features.

Authentication and encryption
The propagation of user authentication from one middleware component to another is an important issue. Products from a single vendor often don't have this problem. For example, the Oracle 8 drivers know how to transparently obtain a user's identity from the Oracle Application Server, and they trust that identity for database access. Products from different vendors often require you to enter database passwords in the application server configuration file, or -- worse -- to embed user IDs and passwords in the business logic.

These products need a way to cross-configure trust relationships so that one (a database, for example) trusts identities authenticated by another. Future releases of the JDK will likely address authentication and privilege domains more fully, alleviating this situation.

Some products have already implemented the JDK 1.2 keystore (a repository for private keys and certificates). This is a change from last year, when most SSL-enabled products came with their own keystore, requiring the user to separately manage keys and certificates for each product. This is one of the more important features of the Java 2 platform.

User directories and key management
A user directory is a database containing user names and information about them. LDAP is quickly becoming the universal directory standard and is implemented by most directory servers today. An LDAP directory can be accessed by the Java Naming and Directory Interface (JNDI) API. JNDI can be used to store user identifiers, roles, group membership, and access rights. It also can be used to store user certificates for verifying digital signatures and for sending encrypted messages to certificate owners. Most Java middleware uses the JNDI API to store user and service directory information in a directory server of the organization's choice.

In addition, products that use digital certificates (most do) must provide a means to:

Since the server product is responsible for storing its own server certificate, it is desirable for the server to allow its own certificate to be stored in a central repository, rather than in a private file of its own where it can possibly be stolen. Many servers now use key management products from third parties such as Phaos Technology to centralize and add security to the server key management process.

Access control
User permissions can be granted at the group level, the user level, or the role level. Permissions for access to data resources can be granted at the data source level or data source-pool level. The EJB model also permits permissions to be granted at the method level. Permissions may need to be granted for access to other resources such as message queues, services, or message types. Some messaging systems allow access control to be specified for individual messages as well. Some products, such as GemStone/J, allow you to provide access control to fine-grained persistent objects -- even objects that aren't individually indexed by a directory.

Access control should ideally be declarative and use a single authentication mechanism, indigenous to each platform. Currently it is often necessary to embed usernames and passwords in applications. Some products let you declare this information in a configuration file, but that isn't ideal because it means one administrator has access to all passwords. Preferably, a Principal (note that Identity is deprecated and is replaced by Principal in Java 2) should propagate from one middleware component to the next without awareness by the application code. Access control for global roles and users should be assigned by administrators for realms whose resources are being accessed.
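As a rough illustration of declarative, role-based access control keyed on a Principal, consider the sketch below. Only java.security.Principal is a real platform type here; UserPrincipal, AccessPolicy, and the method and role names are hypothetical, and in a real product the declarations would come from an administrator's configuration rather than from code.

```java
import java.security.Principal;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DeclarativeAccessSketch {

    /** Minimal Principal; a real one would be produced by the platform's authentication step. */
    static final class UserPrincipal implements Principal {
        private final String name;

        UserPrincipal(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }
    }

    /** Holds the declarations an administrator would normally supply at deployment time. */
    static final class AccessPolicy {
        private final Map<String, Set<String>> rolesByUser = new HashMap<>();
        private final Map<String, Set<String>> rolesByMethod = new HashMap<>();

        void assignRoles(String user, String... roles) {
            rolesByUser.put(user, new HashSet<>(Arrays.asList(roles)));
        }

        void permit(String method, String... roles) {
            rolesByMethod.put(method, new HashSet<>(Arrays.asList(roles)));
        }

        /** True if the caller holds at least one role permitted to invoke the method. */
        boolean isAllowed(Principal caller, String method) {
            Set<String> held =
                    rolesByUser.getOrDefault(caller.getName(), Collections.<String>emptySet());
            Set<String> needed =
                    rolesByMethod.getOrDefault(method, Collections.<String>emptySet());
            for (String role : needed) {
                if (held.contains(role)) {
                    return true;
                }
            }
            return false; // methods with no declaration are denied
        }
    }

    public static void main(String[] args) {
        AccessPolicy policy = new AccessPolicy();
        policy.assignRoles("alice", "teller");
        policy.permit("Account.withdraw", "teller", "manager");

        System.out.println(policy.isAllowed(new UserPrincipal("alice"), "Account.withdraw"));
        System.out.println(policy.isAllowed(new UserPrincipal("bob"), "Account.withdraw"));
    }
}
```

The business method itself never sees a password or a role check; the middleware would consult the policy before dispatching the call, using the Principal that propagated with the request.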

Dealing with network security configurations
Proxy firewalls, which are installed in most networks, filter traffic based on the protocol being passed, the port, and possibly the content of the data itself. There's a trend to build new protocols within HTTP so that the content, while defined by a MIME type, appears to the firewall as merely HTTP traffic and so is passed without any special reconfiguration. For other protocols, it is generally necessary to configure the proxy so that the protocol is allowed to pass on a specified port. There are at least two reasons why this may not be practical. One is that the protocol and associated server software isn't sufficiently trusted, in terms of its reliability and lack of vulnerability to external attack. Another is that the protocol may require multiple ports -- perhaps a variable number.

One solution provided by some application server vendors to relieve the firewall issue is an HTTP proxy, which wraps requests in HTTP at one end and unwraps them at the other end. This wrapping can sometimes be inefficient and, since HTTP is a purely one-way (client-initiated) model, it makes some functions difficult to achieve, such as RMI or CORBA callbacks. If the protocol requires multiple ports, some products provide a proxy that bundles all requests into a single client connection -- which is conveyed on a single configurable port -- and unbundles them at the server end. These gateway components also sometimes can provide other functions, including request forwarding. Don't assume the existence of a firewall gateway feature, however, because there are many major products (especially among messaging products) that don't provide them, and writing your own presents some security risk.
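The wrap-and-unwrap step performed by such an HTTP proxy can be sketched at the byte level. The following is a deliberately minimal illustration, not a usable tunnel: the wrap and unwrap method names and the /tunnel path are assumptions, the payload is treated as opaque bytes, and the unwrapping side simply trusts the message instead of validating the headers.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class HttpTunnelSketch {

    /** Wraps an opaque request in an HTTP/1.0 POST so a proxy firewall sees ordinary HTTP. */
    static byte[] wrap(String path, byte[] payload) {
        String header = "POST " + path + " HTTP/1.0\r\n"
                + "Content-Type: application/octet-stream\r\n"
                + "Content-Length: " + payload.length + "\r\n"
                + "\r\n";
        byte[] head = header.getBytes(StandardCharsets.US_ASCII);
        byte[] message = Arrays.copyOf(head, head.length + payload.length);
        System.arraycopy(payload, 0, message, head.length, payload.length);
        return message;
    }

    /** Recovers the payload at the far end by skipping everything up to the blank line. */
    static byte[] unwrap(byte[] message) {
        for (int i = 0; i + 3 < message.length; i++) {
            if (message[i] == '\r' && message[i + 1] == '\n'
                    && message[i + 2] == '\r' && message[i + 3] == '\n') {
                return Arrays.copyOfRange(message, i + 4, message.length);
            }
        }
        throw new IllegalArgumentException("no HTTP header terminator found");
    }

    public static void main(String[] args) {
        byte[] request = {0x01, 0x02, 0x7f, 0x00}; // pretend this is a marshalled remote call
        byte[] onTheWire = wrap("/tunnel", request);
        byte[] received = unwrap(onTheWire);
        System.out.println("round trip intact: " + Arrays.equals(request, received));
    }
}
```

Every call pays for the extra header bytes, and because the client must always send the POST first, the server has no way to initiate a callback on its own, which is the limitation noted above.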

Management tools
The convenience and flexibility of administration is an area in which middleware products greatly differ. Some otherwise robust mainstream products allow you to view what is happening, but to change anything you have to edit configuration files and restart. This is partly a reflection of the newness of these products. On the other hand, many Java middleware products have had fairly sophisticated management capabilities from the beginning. Interactive management features of various products include:

One of the more interesting Java management consoles I've seen is employed by Active Software's ActiveWorks messaging server. It graphically depicts in real time the flow of messages through a network. This helps to diagnose operational problems within a system by allowing an administrator to visually grasp what is happening -- or what should be happening. At first I was skeptical of the usefulness of this, as it looked gimmicky, but it in fact proved invaluable.

Administrators shouldn't have to understand an application to administer it. Therefore, monitoring features are important so that changes can be made dynamically based on load, without impacting application operability. Further, it should be possible to move components from one server to another -- perhaps dynamically -- without breaking references, including persistent object references. This is why it's so important to separate business logic from infrastructure, and to use consistent standards-based location- and user-directory services throughout all components.

Object schema evolution
Stateful objects, such as EJB entity beans and objects stored in object databases, present an additional lifecycle problem: if an object's definition is modified, the state of all objects of that type may potentially need to be updated as well. Traditionally, this is accomplished by unloading all objects, performing the upgrade to the schema or object definitions, and then reloading them using a custom program or a tool that implements the required state changes. Some products provide a declarative mechanism that doesn't require reloading, such as a script that runs against the existing database. This can save unnecessary work and complexity during maintenance cycles because effort can focus on modifying only the things that need to change, instead of unloading and reloading everything.
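The difference between an in-place upgrade and an unload-and-reload cycle can be illustrated with a small sketch. Everything here is hypothetical: stored objects are modeled as simple maps, the "schemaVersion" field and the particular change (renaming "phone" to "workPhone" and adding "email") are invented for the example, and no real object database API is involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SchemaEvolutionSketch {

    /**
     * Upgrades one stored record from schema version 1 to 2 in place.
     * Version 2 renames "phone" to "workPhone" and adds an "email" field.
     * Returns true if the record was changed.
     */
    static boolean upgradeToV2(Map<String, Object> record) {
        if (!Integer.valueOf(1).equals(record.get("schemaVersion"))) {
            return false; // already upgraded, or not a version this script understands
        }
        record.put("workPhone", record.remove("phone"));
        record.put("email", "");
        record.put("schemaVersion", 2);
        return true;
    }

    /** Runs the upgrade across the whole store and reports how many records changed. */
    static int upgradeAll(Iterable<Map<String, Object>> store) {
        int changed = 0;
        for (Map<String, Object> record : store) {
            if (upgradeToV2(record)) {
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> store = new ArrayList<>();

        Map<String, Object> oldRecord = new HashMap<>();
        oldRecord.put("schemaVersion", 1);
        oldRecord.put("name", "Pat");
        oldRecord.put("phone", "555-0100");
        store.add(oldRecord);

        Map<String, Object> newRecord = new HashMap<>();
        newRecord.put("schemaVersion", 2);
        newRecord.put("name", "Lee");
        newRecord.put("workPhone", "555-0101");
        newRecord.put("email", "lee@example.com");
        store.add(newRecord);

        System.out.println("records upgraded: " + upgradeAll(store));
    }
}
```

Because the script checks the version before touching a record, it can be run repeatedly and only ever modifies records that still need the change.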

Development tools
Most Java middleware comes with a development environment for creating applications. This isn't strictly necessary, thanks to Java component standards like JavaBeans, which make it possible to share components across development environments. The EJB standard requires a significant amount of automatic code generation on the part of an application-server development environment. This can be accomplished with a command-line tool or via an IDE.

The most common kind of component library provided for Web application servers is a presentation library. Since users can be expected to use different kinds of client platforms, including browsers and JVMs of different brands, and the presentation model for a thin client is very different from that of a Java client, it is necessary to isolate business logic and session state from dependence on the presentation and presentation state.

Presentation components tend to be highly proprietary and are often bundled with other services such as session state and object lookup, and can therefore lock you into an application server. An example is Netscape Application Server's App Logic component, which encourages developers to create monolithic "app logics" (Netscape's term) that are completely dependent on NAS technology -- not a very open approach.

A development environment is usually provided to encourage developers to use value-added features. For example, if a database connection can be created simply by selecting a JDBC control from a toolbar, programmers will probably do it. Value-added features should be used with caution, however, as they can lead to a dependence on the vendor's component library. A solution to this is to encapsulate dependence on them, thereby obtaining their benefits, but isolating the dependence to a well-defined and replaceable layer.
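One way to encapsulate such a dependence is a thin adapter layer. In the sketch below, AcmeSessionContext is a made-up stand-in for a vendor's proprietary class (it is not any real product's API), SessionStore is the neutral interface the business logic sees, and AcmeSessionStore is the only class that would need rewriting if the vendor changed.

```java
import java.util.HashMap;
import java.util.Map;

final class VendorIsolationSketch {

    /** The only session abstraction that business logic is allowed to see. */
    interface SessionStore {
        void put(String key, String value);

        String get(String key);
    }

    /** Hypothetical stand-in for a vendor's proprietary session class. */
    static final class AcmeSessionContext {
        private final Map<String, String> attributes = new HashMap<>();

        void setAttr(String name, String value) {
            attributes.put(name, value);
        }

        String attr(String name) {
            return attributes.get(name);
        }
    }

    /** The single, replaceable adapter that knows about the vendor API. */
    static final class AcmeSessionStore implements SessionStore {
        private final AcmeSessionContext context;

        AcmeSessionStore(AcmeSessionContext context) {
            this.context = context;
        }

        @Override
        public void put(String key, String value) {
            context.setAttr(key, value);
        }

        @Override
        public String get(String key) {
            return context.attr(key);
        }
    }

    /** Business logic written purely against the neutral interface. */
    static String greet(SessionStore session) {
        String user = session.get("user");
        return user == null ? "Welcome, guest" : "Welcome back, " + user;
    }

    public static void main(String[] args) {
        SessionStore session = new AcmeSessionStore(new AcmeSessionContext());
        System.out.println(greet(session));
        session.put("user", "pat");
        System.out.println(greet(session));
    }
}
```

The application still gets the vendor's feature, but a move to another server means writing one new adapter rather than touching every business method.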

A recent trend is for application server IDEs to support object design tools such as Rational Rose. This makes it possible to integrate a system-object design with the OODBMS schema and business objects. New companies (such as Inline Software) are being formed to dedicate themselves to pursuing the integration of multitier-Java, object-oriented design processes.

Platforms
In choosing Java middleware, you must consider the platforms for which it is available. But wait! -- doesn't Java run anywhere? Not all Java middleware is pure Java, nor should it be. The fact that developers can deploy Java business objects doesn't mean the server itself is written in Java. Some application servers are indeed pure Java applications, but most aren't.

Reliability
The year 1999 should prove to be a watershed year for Java. It is crucial that Java prove itself in the enterprise, and the enterprise Java platform -- Java 2 -- is finally here. New and improved JVMs are on the way as well. Whether these JVMs can handle a high-volume operation and hundreds or thousands of threads remains to be seen (but this issue's cover story -- "The Volano Report: Which Java platform is fastest, most scalable?" -- certainly sheds some light on the issue!). Sun has had a year to test its own JDK 1.2 JVM and will be releasing HotSpot in the first half of 1999. Java developers no doubt are hoping that past problems with the reliability of first-generation JITs will become history as these components reach the next level of maturity. I plan to do substantial measuring of my own over the next year to test these new core platforms.

Apart from underlying JVMs, it is critical that services be reliable as well. Many products have designs that make it possible to deploy a system with no single point of failure. One common failure point is the object lookup service, and some products don't yet support distributed and replicated object directories or object-state databases, making them vulnerable to failure. The durability of transaction logs used by OTMs is an issue as well, as is persistent storage used by message-based products. Some Java-capable but non-Java products that have severe reliability issues in this arena are being rewritten in Java in hopes of improving reliability.

A current listing of Java middleware products can be found at the Digital Focus Java middleware information Web site. (See the Resources section of this article.)

Conclusion
The diversity of Java middleware products and the overlap of their features make them difficult to categorize. The most important category is EJB application servers, but object databases, messaging middleware, Web application servers, and many other categories comprise a set of choices appropriate for different applications. Organizations face the challenge of unifying the way applications that use these products are developed and managed.

EJB and accompanying standards provide a framework for building open standards-based, transactional, and multitier Java applications. Still, many areas need attention, including areas that now use vendor-specific solutions, such as load balancing and authentication. Access control needs to be fully declarative and propagate to data sources as well as software components, and it needs to permit separate administrative realms. Moreover, presentation-standards issues need to be addressed, as does code deployment. Finally, pure Java plug-replaceable database drivers that support distributed transactions need to be implemented.

In short, there's still a ways to go. You can build scalable, platform-neutral, multitier applications, but only if you avoid the many product-specific features available. Portability should improve this year as the industry adopts new standards, including the EJB standard in particular. Next month Anil Hemrajani will examine EJB up close, and see how it's implemented within products at present.

For an explanation of middleware terms used in this article, see the Middleware Glossary.

About the author
Cliff Berg is VP and CTO of Digital Focus, a leading Java and Internet technology system integrator. Cliff is the author of the book Advanced Java Development For Enterprise Applications (Prentice Hall, 1998). He also wrote the Java Q&A column in Dr. Dobb's Journal for its first two years and still contributes to it.

Contributing to this article was Anil Hemrajani, president and CEO of Divya Inc., a Java and Internet-technology consulting firm. Anil has authored numerous articles on Java, contributed to books on Java technology, and lectures frequently at Java conferences.

Resources



Middleware Glossary

Activation
Bringing an executable component into a live state, after which it can respond to invocations
Application server
A server program that allows the installation of application-specific software components in such a way that they can be remotely invoked, usually by some form of remote object method call.
Bean-managed persistence
When an Enterprise JavaBean performs its own long-term state management
Bytecode
In the context of Java, bytecode is the platform-independent executable program code
Class loader
A Java class that serves the function of retrieving other Java classes and loading them into memory
Clustering
Aggregating multiple servers together to form a service pool of some kind, usually for achieving redundancy or improving performance
Component standard
A definition of how software components cooperate, and in particular the roles and interfaces of each. In the context of Java middleware, component standards usually include specifications of the middleware interfaces exposed to the components, and the component interfaces required by the middleware
Concentrator
A facility for aggregating requests onto a single or small number of channels, for efficiency.
Container-managed persistence
When an Enterprise JavaBean server manages a bean's long-term state.
CORBA
Standard maintained by the Object Management Group (OMG), called the Common Object Request Broker Architecture.
COS Naming
CORBA standard for object directories.
Data source
This is the term used by the JTA and JDBC specifications to refer to a persistent repository of data. It usually represents a database. It also may refer to an object that makes database connections available (i.e., a driver).
DCOM
Microsoft's Distributed Component Object Model.
DNS
Domain Name System, the Internet standard for looking up machine IP addresses by logical name.
EJB
Enterprise JavaBeans.
Enterprise JavaBeans
A server component standard developed by Sun Microsystems
Entity bean
An Enterprise JavaBean that maintains state across sessions, and may be looked up in an object directory by its key value
Failover
The ability to respond resiliently to a component failure by switching to another component
IDL
Interface Definition Language, CORBA's syntax for defining object remote interfaces
IIOP
Internet Inter-ORB Protocol, CORBA's wire protocol for transmitting remote object method invocations
ISAPI
Microsoft's C++ API for coding application extensions for its Internet Information Server
Java Naming and Directory Interface
The Java standard API for accessing directory services, such as LDAP, COS Naming, and others
Java Transaction API
Java API for coding client-demarcated transactions, and for building transactional data source drivers
JDBC2
Newly released extensions to the JDBC API
JNDI
Java Naming and Directory Interface
JTA
Java Transaction API
JTS
The Java Transaction Service, which is the Java binding for the CORBA Transaction Service. Provides a way for middleware vendors to build interoperable transactional middleware
JVM
Java virtual machine
Keystore
A repository for private keys and certificates
LDAP
Lightweight Directory Access Protocol, a protocol for directory services, derived from X.500
Messaging middleware
Middleware that supports a publish-and-subscribe or broadcast metaphor
Middleware
Software that runs on a server, and acts as either an application processing gateway or a routing bridge between remote clients and data sources or other servers, or any combination of these
MOM
message-oriented middleware
NSAPI
Netscape's C language API for adding application extensions to their Web servers
OMG
Object Management Group, an organization that defines and promotes object-oriented programming standards
OODB
object-oriented database
OODBMS
object-oriented database management system
ORB
object request broker, the primary message routing component in a CORBA product
Passivate
To place an object in a dormant state when it is not being accessed, such that it can later be returned to an active and usable state
Persistence
Maintaining state over a long time, especially across sessions
POA
portable object adapter
Pooling
Maintaining a collection of objects, servers, connections, or other resources for ready access, so that one does not need to be created anew each time one is needed
Portable Object Adapter
A new CORBA standard for defining object lifecycle and activation
Quality of service
The set of characteristics of a connection that include response time, reliability and error rate, throughput, and other measures that impact usability
RMI
Remote Method Invocation, the Java standard technology for building distributed objects whose methods can be invoked remotely across a network
RMI over IIOP
Using the CORBA IIOP wire protocol from an RMI API
Servant
Loosely, an object that exposes a remote interface so that it can be called from a remote client. Has a very specific meaning in the context of the POA standard
Servlet
An application extension to a Java Web server
Session bean
An Enterprise JavaBean that does not maintain its state from one session to the next. Appears to the client as if the bean were created just for that client
Skeleton
A server-side software component that serves to relay remote calls from a client to the methods of a servant running in a server. Usually a skeleton is automatically generated by a special compiler
SQLJ
An extended Java syntax for embedding SQL-like commands in a Java program
Stub
A client-side software component that serves to forward remote calls to a remote server, and receive the subsequent responses. Usually automatically generated by a special compiler
Three-tier
An architecture in which a remote client accesses remote data sources via an intervening server
Transaction manager
A software component that coordinates the separate transactions of multiple data sources, so that they behave as a single unified transaction. Requires data source drivers that can participate in this kind of coordination. Also usually provides the ability to monitor transactions and provide statistics
Transactional
When an operation has the property that it either completes, or if it does not complete due to a failure, it either undoes its own effects or has the ability to complete at a later time when the failure is repaired


(c) Copyright 1999 Web Publishing Inc., an IDG Communications company

Feedback: jweditors@javaworld.com
Technical difficulties: webmaster@javaworld.com
URL: http://www.javaworld.com/jw-03-1999/jw-03-middleware_p.html
Last modified: Thursday, July 01, 1999