Simple Object Access Protocol (SOAP)

SOAP is an XML-based messaging protocol. It defines a set of rules for structuring messages. These messages can be used for simple one-way messaging but are particularly useful for performing RPC-style (Remote Procedure Call) request-response dialogues. SOAP is not tied to any particular transport protocol, though HTTP is popular. Nor is it tied to any particular operating system or programming language, so in theory the clients and servers in these dialogues can run on any platform and be written in any language, as long as they can formulate and understand SOAP messages. As such, it is an important building block for developing distributed applications that exploit functionality published as services over an intranet or the internet.

Let’s look at an example. Imagine you have a very simple corporate database that holds a table specifying employee reference number, name and telephone number. You want to offer a service that enables other systems in your company to do a lookup on this data. The service should return a name and telephone number (a two element array of strings) for a given employee reference number (an integer). Here is a Java-style prototype for the service:
String[] getEmployeeDetails ( int employeeNumber );

The SOAP developer’s approach to such a problem is to encapsulate the database request logic for the service in a method (or function) in C, VB, Java or another language, then set up a process that listens for requests to the service. These requests are in SOAP format and contain the service name and any required parameters. As mentioned, the transport layer might be HTTP, though it could just as easily be SMTP or something else. The listener process, which for simplicity is typically written in the same language as the service method, decodes the incoming SOAP request and transforms it into an invocation of the method. It then takes the result of the method call, encodes it into a SOAP response message and sends it back to the requester. Conceptually, this arrangement looks like the following:

While there are many different specific architectures possible for implementing this arrangement, for the purposes of illustration we will summarise one specific possibility.

Let’s say the database system is Oracle. The developer writes the service method in Java and connects to the database using an Oracle implementation of JDBC. The listener process is a Java Servlet running within a Servlet Engine such as Tomcat. The servlet has access to some Java classes capable of decoding and encoding SOAP messages (such as Apache SOAP for Java) and listens for those messages as HTTP POST requests. The transport is HTTP over TCP/IP. The client is an Excel spreadsheet. It uses a VB Macro, which in turn exploits the Microsoft SOAP Toolkit to encode a SOAP request and decode the response received. Here is a schematic of what that specific implementation looks like:

Note that on the client side the VB Macro relies on both the Microsoft SOAP Toolkit (the SOAP DLLs) and an HTTP Connector interface. Such HTTP Connector DLLs are typically already installed as a part of Internet Explorer. On the server side you will notice that the SOAP package relies on an XML parser to parse the SOAP messages. In the case of Apache SOAP for Java this will be Xerces.
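To make the listener's decode, invoke and encode cycle a little more concrete, here is a minimal sketch in plain Java. It is not how Apache SOAP or a servlet would actually be wired up: the class name, the `urn:example:employee-service` namespace, the sample employee row and the in-memory map standing in for the database are all assumptions made for illustration, and the `<item>` elements are a simplified stand-in for SOAP's real array encoding.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SoapListenerSketch {
    // SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical namespace identifying the employee service.
    static final String SVC_NS = "urn:example:employee-service";

    // Stand-in for the corporate database (made-up sample row).
    static final Map<Integer, String[]> EMPLOYEES =
            Map.of(1001, new String[] {"Jane Example", "555-0100"});

    // The service method from the prototype; returns null for an unknown employee.
    static String[] getEmployeeDetails(int employeeNumber) {
        return EMPLOYEES.get(employeeNumber);
    }

    // Escape the characters that are significant in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap body content in a SOAP envelope.
    static String envelope(String bodyContent) {
        return "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\"><soap:Body>"
                + bodyContent + "</soap:Body></soap:Envelope>";
    }

    // Build a SOAP Fault response.
    static String fault(String code, String message) {
        return envelope("<soap:Fault><faultcode>soap:" + code + "</faultcode><faultstring>"
                + escape(message) + "</faultstring></soap:Fault>");
    }

    // Decode the request, invoke the service method, encode the response.
    static String handle(String requestXml) {
        int employeeNumber;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(requestXml.getBytes(StandardCharsets.UTF_8)));
            Element body = (Element) doc.getElementsByTagNameNS(SOAP_NS, "Body").item(0);
            Element call = (Element) body
                    .getElementsByTagNameNS(SVC_NS, "getEmployeeDetails").item(0);
            Element param = (Element) call.getElementsByTagName("employeeNumber").item(0);
            employeeNumber = Integer.parseInt(param.getTextContent().trim());
        } catch (Exception e) {
            // Any parsing problem is reported back to the caller as a fault.
            return fault("Client", "Malformed request");
        }
        String[] details = getEmployeeDetails(employeeNumber);
        if (details == null) {
            return fault("Client", "Unknown employee " + employeeNumber);
        }
        return envelope("<e:getEmployeeDetailsResponse xmlns:e=\"" + SVC_NS + "\"><return>"
                + "<item>" + escape(details[0]) + "</item>"
                + "<item>" + escape(details[1]) + "</item>"
                + "</return></e:getEmployeeDetailsResponse>");
    }

    public static void main(String[] args) {
        // What a client such as the VB Macro would send in the HTTP POST body.
        String request = envelope("<e:getEmployeeDetails xmlns:e=\"" + SVC_NS + "\">"
                + "<employeeNumber>1001</employeeNumber></e:getEmployeeDetails>");
        System.out.println(handle(request));
    }
}
```

In the arrangement described above, the body of `handle` would sit inside the servlet's HTTP POST handler, with the SOAP package doing the parsing and encoding that is hand-rolled here.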

There are of course many other ways to go about building such a service without using SOAP. One obvious way is to allow your clients direct access to a stored procedure in the database via ODBC or JDBC. Here are a few reasons why you might want to choose a SOAP-based solution instead:
With a stored procedure solution you will not be able to send or receive rich data structures as parameters or return values. This is a result of the nature of relational database procedure calls. The parameters to such calls are limited to a list of values of primitive type (integer, float, string etc.). The same goes for the data returned. In our simple example of corporate telephone numbers this is not an issue. We send an employee number (an integer) and receive a name and telephone number (a pair of strings). But what if your service needs to provide the employee’s usual telephone number plus a list of other telephone numbers which are valid during certain periods? This would be the case if your database tracks changing phone numbers of employees as they go on business trips. Now your service must return a complex type of the form:
class EmployeeContactDetail {
    String employeeName;
    String phoneNumber;
    TemporaryPhoneNumber[] tempPhoneNumber;
}

Where the user-defined type TemporaryPhoneNumber is defined as:
class TemporaryPhoneNumber {
    int startDate; // Julian date
    int endDate; // Julian date
    String phoneNumber;
}

Note that there can be any number (zero or more) temporary phone number records for the employee in question. The prototype for your service now looks like this:
EmployeeContactDetail getEmployeeDetails ( int employeeNumber );
Now, an ODBC or JDBC approach obliges you to flatten complex data structures into a fixed list of primitive values. But in our case this is impossible since there is an unknown number of TemporaryPhoneNumber records. SOAP, on the other hand, is sufficiently powerful to allow you to encode data structures of any level of complexity.
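As a rough illustration of why a variable number of records is not a problem for an XML-based encoding, here is a small sketch that serialises an EmployeeContactDetail into nested elements, one `<tempPhoneNumber>` element per record. The class name, the element names and the sample values are assumptions for illustration; a real SOAP toolkit would generate the encoding for you and would follow the SOAP encoding rules more closely.

```java
import java.util.List;

public class ContactDetailEncodingSketch {
    static class TemporaryPhoneNumber {
        final int startDate; // Julian date
        final int endDate;   // Julian date
        final String phoneNumber;

        TemporaryPhoneNumber(int startDate, int endDate, String phoneNumber) {
            this.startDate = startDate;
            this.endDate = endDate;
            this.phoneNumber = phoneNumber;
        }
    }

    static class EmployeeContactDetail {
        final String employeeName;
        final String phoneNumber;
        final List<TemporaryPhoneNumber> tempPhoneNumber; // zero or more records

        EmployeeContactDetail(String employeeName, String phoneNumber,
                              List<TemporaryPhoneNumber> tempPhoneNumber) {
            this.employeeName = employeeName;
            this.phoneNumber = phoneNumber;
            this.tempPhoneNumber = tempPhoneNumber;
        }
    }

    // Escape the characters that are significant in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Render a single element with escaped text content.
    static String element(String name, String text) {
        return "<" + name + ">" + escape(text) + "</" + name + ">";
    }

    // Encode the structure; the repeated element carries the variable-length list.
    static String encode(EmployeeContactDetail detail) {
        StringBuilder xml = new StringBuilder("<return>");
        xml.append(element("employeeName", detail.employeeName));
        xml.append(element("phoneNumber", detail.phoneNumber));
        for (TemporaryPhoneNumber temp : detail.tempPhoneNumber) {
            xml.append("<tempPhoneNumber>")
               .append(element("startDate", Integer.toString(temp.startDate)))
               .append(element("endDate", Integer.toString(temp.endDate)))
               .append(element("phoneNumber", temp.phoneNumber))
               .append("</tempPhoneNumber>");
        }
        return xml.append("</return>").toString();
    }

    public static void main(String[] args) {
        // Made-up sample data: one employee with two temporary numbers.
        EmployeeContactDetail detail = new EmployeeContactDetail(
                "Jane Example", "555-0100",
                List.of(new TemporaryPhoneNumber(2451545, 2451552, "555-0101"),
                        new TemporaryPhoneNumber(2451560, 2451563, "555-0102")));
        System.out.println(encode(detail));
    }
}
```

The same `encode` method handles an employee with no temporary numbers and one with many, which is exactly what a fixed list of primitive parameters cannot express.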

You may have an n-tier architecture where some of your business logic is coded outside the database and the services you intend to write need access to that business logic. With a stored procedure solution, your choices are limited to rewriting that logic in SQL (always an unattractive proposition and, in any case, not always possible depending upon the precision requirements of the business logic calculations) or creating some kind of openserver-style solution where the calculations are handed off by the stored procedure to a calculation engine which incorporates your business logic code. This is a piece of work but perhaps a good choice if your business logic is not written in Java. If it is written in Java then all you would need to do (conceptually) in the SOAP-based solution is include the business logic as a Jar between the Service Method Jar and the JDBC Jar in the above diagram. Importantly, you’ll notice that it is not SOAP that empowers us to do this but the fact that we are using a servlet engine. There is nothing to stop us from simply writing a servlet to encapsulate our business logic, which will then in turn take care of the database access and calculations. So why involve SOAP in this case? The fact is that otherwise you have the choice of 1) building one servlet per service or 2) building a generic servlet but inventing your own custom method identification and parameter encoding scheme. If you choose SOAP, on the other hand, not only has all the method identification and parameter encoding work been done for you, but the protocol is a W3C standard, so your clients will not have to learn your custom protocol. This is important when considering offering services over the internet.
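To give a feel for the "generic servlet" option, here is a small sketch of the method identification half of the problem: a registry that maps a service name taken from the incoming request to the Java code that implements it. The class name, the service names and the single integer parameter are all hypothetical. With SOAP, the service name and parameters arrive in a standard envelope, so a dispatcher like this is the only piece left to write, and SOAP packages typically supply it as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ServiceDispatchSketch {
    // Registry: service name as carried in the request -> implementing method.
    private final Map<String, Function<Integer, String>> services = new HashMap<>();

    void register(String serviceName, Function<Integer, String> method) {
        services.put(serviceName, method);
    }

    // Look up the named service and invoke it with the decoded parameter.
    String dispatch(String serviceName, int argument) {
        Function<Integer, String> method = services.get(serviceName);
        if (method == null) {
            throw new IllegalArgumentException("No such service: " + serviceName);
        }
        return method.apply(argument);
    }

    public static void main(String[] args) {
        ServiceDispatchSketch dispatcher = new ServiceDispatchSketch();
        // Hypothetical services; real ones would call the business logic Jar.
        dispatcher.register("getEmployeeName", n -> "name of employee " + n);
        dispatcher.register("getEmployeePhone", n -> "phone of employee " + n);
        System.out.println(dispatcher.dispatch("getEmployeePhone", 1001));
    }
}
```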

JDBC is only valid for Java clients. If you have a mix of Java and non-Java clients then you will have an inconsistent method of access to your services. You may even have more than one database on the back end (and these databases may even be of different types) and you want to insulate the client from this fact. Once again, a servlet-only solution (without using SOAP) will get you around these problems but will be less attractive than a SOAP-based solution for the reasons given above.

Okay, so what about CORBA? It is true that CORBA will address all of these issues. You can have complex data types, clients and servers can employ any mix of languages and platforms, you can reuse the business logic layer of your n-tier architecture and you can insulate the client from back end architectural details. So why should we introduce SOAP? Here are a couple of reasons:
CORBA requires you to compile and distribute client stubs for each type of client that you have. This is not always practical particularly when you have many platform and language combinations or when you want to offer services to anonymous clients over the internet.

If you are developing web services (see below), IIOP (CORBA’s transport protocol) is not particularly firewall friendly. So if you want to offer services to clients over the internet, while it will not be impossible with CORBA, you will have to overcome some firewall-related obstacles.
Importantly though, SOAP is an XML-based protocol and consequently rather verbose. CORBA over IIOP will beat it for performance, as marshalling and demarshalling in CORBA are more efficient and there is less data on the wire. That said, there is one significant advantage to SOAP being XML-based: it is human readable and writable. This means you can easily read and manipulate the messages that are going over the wire, which is extremely useful when debugging.

In this first section of SOAP Basics we considered an example of how you might use SOAP in the context of a corporate intranet. We saw that when you need to communicate in complex data structures or across a variety of platforms it can be a good choice (or at least as good as CORBA). But where SOAP really shines is as the message protocol for web services. To understand why this is so, you will need to have an idea of what is meant by the terms Web Services and the Service Web, which we will describe in the next section. [Nicholas Quaine]
