Remote Method Invocation: Java for Distributed Systems

Matthew H. Pinner

November 05, 2002

Abstract

The Java programming language was built with simplicity and networking in mind. These traits, when coupled with its object-oriented nature, make for easy distribution of applications. Remote Method Invocation (RMI) is the workhorse of any distributed system in Java. RMI, being part of the core platform, is leveraged within other Java APIs. This makes available an entire suite of tools useful for a distributed system.

Through comparison to Remote Procedure Calls, and the distribution of a matrix multiplication operation, we will see how one distributes an application using RMI. We will also review some clear advantages and disadvantages of the Java solution.

Introduction

Large projects are always managed more easily when broken down into smaller tasks. Work can then be scheduled and split among the available resources. Through the ages, distributed efforts have accomplished more than the best individual could alone. The Egyptian pyramids are a good example of an ancient distributed system. No one pharaoh could have built a pyramid by himself. The work was divided among a multitude of slaves. When this idea is applied to computing, many machines can often collaborate to find a timely solution to even the most computationally intensive problems.

Traditional methods of distributed programming are tedious and error-prone, but distributing an application over a network of computers is a simple task with the Java Programming Language [6]. Java Remote Method Invocation (RMI) allows one to write distributed objects using Java. This paper will describe the advantages of using RMI over other tools for distributed programming. Then we’ll review the process and code of distributing an application with RMI.

Advantages of Distribution with RMI

A natural way to work on a distributed problem is to place an object on another machine, send it a message, and get a response as if the object lived on your own machine. This distribution transparency is exactly what Remote Method Invocation provides. Interfaces are heavily utilized. When a remote object is created, you mask its implementation by passing around an interface. When a client acquires a remote object handle, it really gets a local stub that talks across the network. An RMI-based system can also utilize many of Java’s other features. For example, RMI provides a mechanism to activate classes remotely.

RMI can be thought of as remote procedure calls (RPC) for Java, but distribution with RMI has many benefits over traditional RPC systems. RMI is a purely Java system, and therefore maintains Java’s object-oriented nature. RPC and other methods have striven for language neutrality. This forces them to support only the common features: they provide only the functionality that exists on all the targeted platforms. To use these techniques, objects often need to be flattened into primitive C data types. The developer can no longer use design patterns or any of the other advantages of the object-oriented paradigm.

Remote Procedure Calls (RPC) and the Common Object Request Broker Architecture (CORBA) do much to abstract the socket programming, but neither has the object portability of Java. To support multiple operating systems, these tools must work for the lowest common denominator, supporting only primitive data types.

Our computing demands are continually growing and exceed our current hardware capabilities. Given the cost of high-end parallel machines, distributed systems make sense as a more cost-effective alternative. Blue Gene, IBM’s latest multi-million dollar project in parallel computation, crams 32^3 (32,768) processors into one 40 by 40 foot system [1]. Even this is relatively small when compared to the 4 million users to which SETI has been distributing its computations [8]. Distributed solutions have many benefits over traditional parallel methods that are not obvious at first. Parallel systems rely on homogeneous hardware and a static network architecture. That setup is not tolerant of point failures and is hard to maintain. A distributed system is based on autonomous machines. Distributed systems control many loosely coupled heterogeneous computers as a coherent system. Such systems become very robust and fault tolerant because key components can be redundant and are seamlessly interchangeable.

At the lowest level, a distributed application relies on coordinating processes that are running on multiple machines. Programming that many sockets can be quite complex and time consuming, and one needs to make careful considerations for data portability. Luckily, there are many tools available to encapsulate the data translation and network functionality.

Java was originally designed to be simple and system architecture neutral. It facilitates the creation of distributed systems in many ways. Its development was geared toward the networked application. Multithreaded network operations and network safe objects were available from the first release of Java. These were heavily utilized in version 1.1 to create Java Remote Method Invocation (RMI). RMI supports true object-oriented designs over different hosts and serves as a perfect tool for distributed data processing.

The distributed-object model provides the highest degree of distribution transparency. The state of a remote object resides on its server machine, and only on that machine can the object be cloned. Access to an object can be controlled through the use of monitors. If the state of an object changes infrequently and there are many reads for each write, the object can be cached on all clients to speed up multiple reads. On each read, the monitors will ask the object’s server whether there were any critical updates. Only then will a new request for the full state go over the wire.
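The read-mostly caching idea above can be sketched as follows. This is a minimal, single-process illustration rather than RMI code: the names Snapshot, VersionedSource, InMemorySource and CachingClient are hypothetical, and the "server" is an in-memory stand-in. It assumes the server can answer a cheap version check, so the full state is transferred only when the version has changed. The synchronized methods play the role of the monitors.

```java
class ReadMostlyCacheSketch {
    /** Immutable copy of the object's state, tagged with the version it was read at. */
    static final class Snapshot {
        final long version;
        final String state;

        Snapshot(long version, String state) {
            this.version = version;
            this.state = state;
        }
    }

    /** Hypothetical server-side contract: a cheap version check and a costly full fetch. */
    interface VersionedSource {
        long version();
        Snapshot fetch();
    }

    /** Stand-in for the server; in a real system these calls would cross the network. */
    static final class InMemorySource implements VersionedSource {
        private long version = 1;
        private String state;
        private int fetches = 0;

        InMemorySource(String initial) {
            this.state = initial;
        }

        synchronized void write(String newState) {
            state = newState;
            version++;
        }

        @Override
        public synchronized long version() {
            return version;
        }

        @Override
        public synchronized Snapshot fetch() {
            fetches++;
            return new Snapshot(version, state);
        }

        synchronized int fetchCount() {
            return fetches;
        }
    }

    /** Client-side cache; synchronized makes each instance a monitor. */
    static final class CachingClient {
        private final VersionedSource source;
        private Snapshot cached; // null until the first read

        CachingClient(VersionedSource source) {
            this.source = source;
        }

        synchronized String read() {
            // Fetch the full state only if nothing is cached or the version moved on.
            if (cached == null || cached.version != source.version()) {
                cached = source.fetch();
            }
            return cached.state;
        }
    }

    public static void main(String[] args) {
        InMemorySource server = new InMemorySource("v1");
        CachingClient client = new CachingClient(server);
        client.read();
        client.read();
        server.write("v2");
        // prints "v2 after 2 fetches"
        System.out.println(client.read() + " after " + server.fetchCount() + " fetches");
    }
}
```

Returning the version together with the state in one Snapshot avoids pairing a stale version number with newer state if a write lands between the check and the fetch.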

Agents

A single software agent can operate on both the client and server sides of a system. This degree of encapsulation is important for making your code thread-safe. Agents can coordinate and cache to efficiently maintain mutually exclusive access to your remote object. With RMI you design agents as a further abstraction of the remote objects.

“Some agents can monitor their progress towards achieving their goals at a higher level than just successful execution of methods, like an object [4].”

Agents are designed with a functional goal. This functionality is only available through Java because of its dynamic class loading abilities. They can download the code to achieve their goal, if it isn’t available.

Dynamic class loading is a distinguishing feature of Java, made possible because its bytecode is interpreted by a virtual machine instead of being compiled and linked ahead of time. This means the linking stage happens during runtime. When a remote object is sent an object it has no definition for, it will look to the incoming object’s codebase and download the definition from there.

Objects are sent as byte streams and converted with Java’s serialization mechanism. Primitive types have standard byte stream formats, and objects referenced by attributes are automatically serialized and annotated with a codebase. The object-oriented approach allows for rigid type checking. Just about anything, except resources such as file descriptors or socket connections, can be made into a serialized type. The java.io.Serializable interface is important for our RMI objects because of these object exchanges and the persistence of the data type. It allows an object to be converted into, and reconstructed from, a sequence of bytes. This byte stream is then a natural fit for Java’s I/O and networking packages. A NotSerializableException will be thrown if the object is not serializable. To make an object serializable, its class needs to implement java.io.Serializable, and all of its non-transient attributes need to be serializable as well.
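As a minimal sketch of the serialization round trip described above, the following example flattens an object into bytes and rebuilds it, then shows the NotSerializableException case. The classes Point and Broken and the helper names are hypothetical and exist only for illustration. RMI performs the equivalent steps internally when it marshals arguments.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationSketch {
    /** Hypothetical value type; every field is itself serializable. */
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Implements Serializable but holds a field whose class does not. */
    static class Broken implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object handle = new Object();
    }

    /** Flattens an object graph into a byte array. */
    static byte[] toBytes(Object o) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds an object graph from a byte array. */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns false when serialization fails with NotSerializableException. */
    static boolean isSerializable(Object o) {
        try {
            toBytes(o);
            return true;
        } catch (UncheckedIOException e) {
            if (e.getCause() instanceof NotSerializableException) {
                return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) fromBytes(toBytes(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);         // prints "3,4"
        System.out.println(isSerializable(new Broken()));  // prints "false"
    }
}
```

Marking the handle field transient would let Broken serialize, at the cost of the field being null after reconstruction.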

The suite of tools provided with the Java Development Kit (JDK) makes working with RMI simple and fast. The RMI registry is at the core, and it serves as a naming and lookup service for remote objects. The registry runs as a separate Java runtime environment on the remote object’s host. It listens on port 1099 by default, but a different port can be chosen when it is started. Once the registry is running, you can begin to create and manipulate remote objects.

The first step is to write your remote interface. This is merely the prototype for your remote object that you desire to expose to remote invocation. The interface must extend java.rmi.Remote.

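    import java.rmi.Remote;
    import java.rmi.RemoteException;

    public interface ComputationEngine extends Remote {
        // Every remote method must declare RemoteException.
        Object computejob(Job j) throws RemoteException;
    }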
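The interface above refers to a Job type that this paper does not define. One plausible shape, offered only as an assumption, is a serializable interface with a single compute() method, so that a job can be shipped to the engine and run there. The MatrixMultiplyJob class below is likewise hypothetical and follows the matrix multiplication example mentioned in the abstract.

```java
import java.io.Serializable;

/** Hypothetical contract for a unit of work; Serializable so it can travel to the engine. */
interface Job extends Serializable {
    Object compute();
}

/** Hypothetical job that multiplies two dense matrices. */
class MatrixMultiplyJob implements Job {
    private static final long serialVersionUID = 1L;

    private final double[][] a;
    private final double[][] b;

    MatrixMultiplyJob(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("inner dimensions must match");
        }
        this.a = a;
        this.b = b;
    }

    /** Returns the product as a double[][] with a.length rows and b[0].length columns. */
    @Override
    public Object compute() {
        int rows = a.length;
        int inner = b.length;
        int cols = b[0].length;
        double[][] c = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int k = 0; k < inner; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }
}
```

To distribute a large multiplication, a client could split the left matrix into row blocks and submit one such job per block to different engines. That partitioning is left out of this sketch.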

You now write an object implementation that implements the remote interface and extends java.rmi.server.UnicastRemoteObject. This object’s implementation is usually completed with a main function that registers the object with the RMI registry. Objects register themselves by name using the java.rmi.Naming class. We name our object “CoMpUtEr” in this example.

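    import java.rmi.Naming;
    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;

    public class ComputationEngineImpl extends UnicastRemoteObject implements ComputationEngine {
        // The superclass constructor exports the object and may throw RemoteException.
        public ComputationEngineImpl() throws RemoteException {
            super();
        }

        public Object computejob(Job j) throws RemoteException {
            return j.compute();
        }

        public static void main(String[] argv) throws Exception {
            ComputationEngineImpl computer = new ComputationEngineImpl();
            // Register the object with the local RMI registry by name.
            Naming.rebind("CoMpUtEr", computer);
        }
    }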

Once these classes are completed and compiled, you can move on to the RMI compiler (rmic). The RMI compiler is another helpful tool that simplifies the task of creating your distributed system. This is a code generation program that creates the stubs and skeletons for your remote object. The stub is the client’s mechanism for interacting with the remote object. It implements your remote interface and contains all the serialization code for marshalling method parameters. When an object is requested from the registry, the stub for that object is returned to the client. On the server, the skeleton is called upon to extract the serialized parameters and pass them to your object implementation.
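To make the division of labor between stub and skeleton concrete, here is a minimal single-process sketch. It is not the code rmic generates: it uses java.lang.reflect.Proxy as the stub, a plain method call as the "wire", and the hypothetical names Adder, stubFor, marshal and dispatch. The stub marshals the method name and arguments into bytes, and the skeleton side unmarshals them, calls the real object, and marshals the result back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class StubSkeletonSketch {
    /** Hypothetical remote-style interface used only for this illustration. */
    interface Adder {
        int add(int a, int b);
    }

    /** Serializes the given values, in order, into one byte array. */
    static byte[] marshal(Object... values) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            for (Object v : values) {
                out.writeObject(v);
            }
        }
        return buf.toByteArray();
    }

    /** Skeleton side: unmarshal the request, invoke the real object, marshal the result. */
    static byte[] dispatch(Object target, Class<?> iface, byte[] request) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(request))) {
            String name = (String) in.readObject();
            Class<?>[] types = (Class<?>[]) in.readObject();
            Object[] args = (Object[]) in.readObject();
            Object result = iface.getMethod(name, types).invoke(target, args);
            return marshal(result);
        }
    }

    /** Stub side: a proxy that turns each interface call into a marshalled request. */
    static <T> T stubFor(Class<T> iface, Object target) {
        InvocationHandler handler = (proxy, method, args) -> {
            byte[] request = marshal(method.getName(), method.getParameterTypes(), args);
            // A real stub would send the request over a socket; here it is a direct call.
            byte[] reply = dispatch(target, iface, request);
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(reply))) {
                return in.readObject();
            }
        };
        return iface.cast(Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }

    public static void main(String[] args) {
        Adder real = (a, b) -> a + b;
        Adder stub = stubFor(Adder.class, real);
        System.out.println(stub.add(2, 3)); // prints "5"
    }
}
```

The caller holds only an Adder reference and cannot tell that its arguments were serialized along the way, which is the transparency the stub is meant to provide.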

After the stubs and skeletons are created your object can be registered. Our remote implementation of the object can then be fetched through a call like this:

    String objectName = "rmi://objectHost.mines.edu/CoMpUtEr";
    ComputationEngine myObject = (ComputationEngine) Naming.lookup(objectName);
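The pieces above can be tied together in a single-JVM sketch that starts a registry, binds an engine, looks it up by URL, and invokes it over the loopback interface. Several things here are assumptions made for illustration: the nested copies of Job and ComputationEngine, the SumJob class, the helper names, and the use of LocateRegistry.createRegistry in place of a separately started registry process. On JDK 5 and later, stubs are generated at run time, so this sketch does not run the RMI compiler.

```java
import java.io.IOException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class LoopbackRmiDemo {
    /** Hypothetical job contract, repeated here so the sketch is self-contained. */
    public interface Job extends Serializable {
        Object compute();
    }

    public interface ComputationEngine extends Remote {
        Object computejob(Job j) throws RemoteException;
    }

    public static class EngineImpl extends UnicastRemoteObject implements ComputationEngine {
        private static final long serialVersionUID = 1L;

        public EngineImpl() throws RemoteException {
            super(); // exports this object on an anonymous port
        }

        @Override
        public Object computejob(Job j) throws RemoteException {
            return j.compute();
        }
    }

    /** Hypothetical job that sums a list of integers. */
    public static class SumJob implements Job {
        private static final long serialVersionUID = 1L;
        private final int[] values;

        public SumJob(int... values) {
            this.values = values.clone();
        }

        @Override
        public Object compute() {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        }
    }

    /** Asks the operating system for a currently unused TCP port. */
    static int freePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs server and client in one JVM and returns the remote call's result. */
    static Object runOnce(int port) {
        try {
            // Advertise the loopback address in stubs so the demo does not depend on DNS.
            System.setProperty("java.rmi.server.hostname", "127.0.0.1");
            Registry registry = LocateRegistry.createRegistry(port);
            EngineImpl engine = new EngineImpl();
            try {
                registry.rebind("CoMpUtEr", engine);
                // Client side: fetch the stub by URL and call through it.
                String url = "rmi://127.0.0.1:" + port + "/CoMpUtEr";
                ComputationEngine stub = (ComputationEngine) Naming.lookup(url);
                return stub.computejob(new SumJob(1, 2, 3));
            } finally {
                // Unexport both objects so the JVM can exit.
                UnicastRemoteObject.unexportObject(engine, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce(freePort()));
    }
}
```

In a real deployment the server and client would run on different hosts, the URL would name the server host as in the lookup call above, and the job classes would need to be available to the server, either on its classpath or through a codebase.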

Conclusion

Although Java is not the newest technology, new developments continually make systems easier to distribute. Java enables anyone to quickly distribute an application without having to worry about the details of portable data exchange or underlying network protocols. Much of the work has already been completed by Sun. You merely need to add your application-specific code to their already vast system.

References

  1. “Blue Gene: A vision for protein science using a petaflop supercomputer”
  2. Eckel, Bruce. “Thinking in Java” Prentice Hall (1998).
  3. Farley, Jim. “Java Distributed Computing” O’Reilly (1998).
  4. Flanagan et al. “Java Enterprise in a Nutshell” O’Reilly (1999).
  5. Henry, Kevin. “Distributed Computation with Java Remote Method Invocation” http://www.acm.org/crossroads/xrds65/ovp65.html
  6. “The Source for Java™ Technology”
  7. Tanenbaum and van Steen. “Distributed Systems: Principles and Paradigms” Prentice Hall (2002).
  8. “SETI Institute”
  9. Walsh, Linda. “The Java Enterprise CD Bookshelf” O’Reilly (2001).