Languages

Java template for WSDL-first web services using CXF (for Maven2 and Eclipse)

October 23rd, 2008 / Joe on Computing

This took me a while to put together so I thought I’d post it. I wanted the simplest possible template for building a web service in Java. I wanted it to be JAX-WS compliant, so I used the CXF open source implementation which is not only compliant, but also flexible and fast. I also wanted the template to be WSDL first, meaning that I should be able to edit the WSDL by hand to maintain total control over the service contract, then from that, generate Java code to make it easy to fill in the implementation.  (I consider that to be an important part of web service best practices. Doing it the other way - automatically generating WSDL from code - is simpler, but results in messy, sometimes incorrect WSDL that limits your ability to change web service implementations later.) Furthermore, I didn’t want to edit any generated code. I wanted to be able to fill in the implementation details by inheriting from a generated class or implementing a generated interface. Finally, I wanted to take advantage of Maven to build the project, but also be able to work on it in Eclipse, taking advantage of its Web Tools Platform (WTP) to allow synchronization with a live application server. Here’s the result in just under 300 lines of code. (Or you can cut to the chase and just download the zip file and follow the instructions at the end of this posting.)

First, here is the trade.xsd schema file containing the input and output datatypes used by the web services:

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema targetNamespace="http://com.joemo.schema.trade" xmlns="http://com.joemo.schema.trade"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema">

	<!-- web service input types -->

	<xsd:element name="trade">
		<xsd:complexType>
			<xsd:sequence>
				<xsd:element name="security" type="xsd:string" minOccurs="1" maxOccurs="1" />
				<xsd:element name="quantity" type="xsd:integer" minOccurs="1" maxOccurs="1" />
				<!-- note the use of "unbounded"; comments can occur multiple times -->
				<xsd:element name="comments" type="comment" minOccurs="1" maxOccurs="unbounded" />
			</xsd:sequence>
		</xsd:complexType>
	</xsd:element>

	<xsd:complexType name="comment">
		<xsd:sequence>
			<xsd:element name="message" type="xsd:string" minOccurs="1" maxOccurs="1" />
			<xsd:element name="author" type="xsd:string" minOccurs="1" maxOccurs="1" />
		</xsd:sequence>
	</xsd:complexType>

	<!-- web service output types -->

	<xsd:element name="status">
		<xsd:complexType>
			<xsd:sequence>
				<xsd:element name="id" type="xsd:string" minOccurs="1" maxOccurs="1" />
				<xsd:element name="message" type="xsd:string" minOccurs="1" maxOccurs="1" />
			</xsd:sequence>
		</xsd:complexType>
	</xsd:element>

</xsd:schema>

Next, we need the trade.wsdl file which imports the schema file and completes the WSDL definition:

<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions targetNamespace="http://com.joemo.schema.tradeservice"
	xmlns="http://com.joemo.schema.tradeservice"
	xmlns:tr="http://com.joemo.schema.trade"
	xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
	xmlns:wsdlsoap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

	<wsdl:types>
		<xsd:schema targetNamespace="http://com.joemo.schema.tradeservice">
			<xsd:import namespace="http://com.joemo.schema.trade" schemaLocation="trade.xsd" />
		</xsd:schema>
	</wsdl:types>

	<wsdl:message name="tradeInput">
		<wsdl:part name="trade" element="tr:trade" />
	</wsdl:message>

	<wsdl:message name="tradeOutput">
		<wsdl:part name="status" element="tr:status" />
	</wsdl:message>

	<wsdl:portType name="TradeService">
		<wsdl:operation name="book">
			<wsdl:input message="tradeInput" />
			<wsdl:output message="tradeOutput" />
		</wsdl:operation>
	</wsdl:portType>

	<wsdl:binding name="TradeServiceHTTPBinding" type="TradeService">
		<wsdlsoap:binding style="document"
			transport="http://schemas.xmlsoap.org/soap/http" />
		<wsdl:operation name="book">
			<wsdlsoap:operation soapAction="" />
			<wsdl:input>
				<wsdlsoap:body use="literal" />
			</wsdl:input>
			<wsdl:output>
				<wsdlsoap:body use="literal" />
			</wsdl:output>
		</wsdl:operation>
	</wsdl:binding>

	<wsdl:service name="TradeServicePorts">
		<wsdl:port binding="TradeServiceHTTPBinding" name="TradeService">
			<wsdlsoap:address
				location="http://localhost:9084/tradeService/TradeServicePorts" />
		</wsdl:port>
	</wsdl:service>

</wsdl:definitions>

Now we need a Maven project file that will take this WSDL and generate the Java code. Here’s what the pom.xml file looks like. It’s long and messy but it does a lot. It specifies all the dependencies and the compiler level, includes the rule to generate Java code from WSDL whenever necessary, and includes Jetty and WTP support for testing and running the web services in different environments.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.joemo</groupId>
	<artifactId>ws-example</artifactId>
	<packaging>war</packaging>
	<version>0.1</version>
	<name>ws-example</name>
	<url>http://maven.apache.org</url>
	<properties>
		<cxf.version>2.1</cxf.version>
		<spring.version>2.5</spring.version>
	</properties>
	<dependencies>
		<dependency>
			<groupId>org.apache.cxf</groupId>
			<artifactId>cxf-rt-core</artifactId>
			<version>${cxf.version}</version>
		</dependency>
		<dependency>
			<groupId>org.apache.cxf</groupId>
			<artifactId>cxf-rt-frontend-jaxws</artifactId>
			<version>${cxf.version}</version>
		</dependency>
		<dependency>
			<groupId>org.apache.cxf</groupId>
			<artifactId>cxf-rt-transports-http</artifactId>
			<version>${cxf.version}</version>
		</dependency>
		<dependency>
			<groupId>org.apache.cxf</groupId>
			<artifactId>cxf-common-utilities</artifactId>
			<version>${cxf.version}</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-core</artifactId>
			<version>${spring.version}</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-beans</artifactId>
			<version>${spring.version}</version>
		</dependency>
		<dependency>
			<groupId>junit</groupId>
			<artifactId>junit</artifactId>
			<version>4.4</version>
			<scope>test</scope>
		</dependency>
	</dependencies>
	<build>
		<plugins>
			<!-- Use Java 5 -->
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<configuration>
					<source>1.5</source>
					<target>1.5</target>
				</configuration>
			</plugin>

			<!-- CXF WSDL-to-Java code generation -->
			<plugin>
				<groupId>org.apache.cxf</groupId>
				<artifactId>cxf-codegen-plugin</artifactId>
				<version>2.0.6</version>
				<executions>
					<execution>
						<id>generate-sources</id>
						<phase>generate-sources</phase>
						<configuration>
							<sourceRoot>${basedir}/target/generated/src/main/java</sourceRoot>
							<wsdlOptions>
								<wsdlOption>
									<wsdl>src/main/resources/trade.wsdl</wsdl>
								</wsdlOption>
							</wsdlOptions>
						</configuration>
						<goals>
							<goal>wsdl2java</goal>
						</goals>
					</execution>
				</executions>
			</plugin>
			<!-- Jetty support for testing -->
			<plugin>
				<groupId>org.mortbay.jetty</groupId>
				<artifactId>maven-jetty-plugin</artifactId>
			</plugin>
		</plugins>
		<!-- Eclipse WTP support -->
		<pluginManagement>
			<plugins>
				<plugin>
					<artifactId>maven-eclipse-plugin</artifactId>
					<configuration>
						<wtpversion>2.0</wtpversion>
						<wtpapplicationxml>true</wtpapplicationxml>
						<wtpmanifest>true</wtpmanifest>
						<downloadSources>true</downloadSources>
						<downloadJavadocs>true</downloadJavadocs>
						<projectNameTemplate>[artifactId]-[version]</projectNameTemplate>
						<manifest>${basedir}/src/main/resources/META-INF/MANIFEST.MF</manifest>
					</configuration>
				</plugin>
			</plugins>
		</pluginManagement>
	</build>
</project>

Among other things, the rules in this pom.xml file will generate a Java interface called TradeService (based on the names in the WSDL file). The only code we will have to write is the implementation of this interface. Although this generation is done automatically by any Maven commands that need it (e.g. mvn package or mvn install) you might want to force it to be done sooner rather than later, so that you can refresh your Eclipse project with the generated code, enabling Eclipse to recognize the interface that you’re trying to implement. You can do this using the commands:

mvn generate-sources
mvn eclipse:clean eclipse:eclipse

This generates Java code from the WSDL, then regenerates the Eclipse project files, after which you should be able to refresh the project in Eclipse. If you see errors about libraries not being found, you may need to configure Eclipse to know about your Maven repository, i.e. select Eclipse / Window / Preferences / Java / Build Path / Classpath Variables, then enter the appropriate settings, e.g.

Name: M2_REPO
Path: C:/Documents and Settings/MyAccount/.m2/repository

Once the project is properly configured in Eclipse, you can fill in the implementation:

package com.joemo.service;

import trade.schema.joemo.com.Comment;
import trade.schema.joemo.com.Status;
import trade.schema.joemo.com.Trade;
import tradeservice.schema.joemo.com.TradeService;

public class TradeServiceImpl implements TradeService {

	public Status book(Trade trade) {
		System.out.print ("Booking security ");
		System.out.print (trade.getSecurity());
		System.out.print (", quantity ");
		System.out.print (trade.getQuantity());
		System.out.println();
		if (trade.getComments() != null) {
			System.out.println ("Comments:");
			for (Comment c : trade.getComments()) {
				System.out.print (c.getAuthor());
				System.out.print (": ");
				System.out.print (c.getMessage());
				System.out.println();
			}
		}
		Status s = new Status();
		s.setId("12345");
		s.setMessage("ok");
		return s;
	}

}

We are almost done. We still need a web.xml file which will direct SOAP requests to the CXF infrastructure:

<?xml version="1.0" encoding="UTF-8"?>
<web-app id="WebApp_9" version="2.4" xmlns="http://java.sun.com/xml/ns/j2ee"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd">
	<context-param>
		<param-name>contextConfigLocation</param-name>
		<param-value>classpath:appContext.xml</param-value>
	</context-param>
	<listener>
		<listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
	</listener>
	<servlet>
		<servlet-name>dispatcher</servlet-name>
		<servlet-class>org.apache.cxf.transport.servlet.CXFServlet</servlet-class>
		<load-on-startup>1</load-on-startup>
	</servlet>
	<servlet-mapping>
		<servlet-name>dispatcher</servlet-name>
		<url-pattern>/*</url-pattern>
	</servlet-mapping>
</web-app>

Finally, we need the appContext.xml file, which is the Spring configuration file loaded by CXF that defines the web service endpoint:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context"
	xmlns:jee="http://www.springframework.org/schema/jee" xmlns:jaxws="http://cxf.apache.org/jaxws"
	xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/jee http://www.springframework.org/schema/jee/spring-jee-2.5.xsd
http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-2.5.xsd
http://cxf.apache.org/jaxws http://cxf.apache.org/schemas/jaxws.xsd"
	default-dependency-check="none" default-lazy-init="false">

	<!-- Load the needed resources that are present in the cxf* jars -->
	<import resource="classpath:META-INF/cxf/cxf.xml" />
	<import resource="classpath:META-INF/cxf/cxf-extension-soap.xml" />
	<import resource="classpath:META-INF/cxf/cxf-servlet.xml" />

	<!-- Hook up the web service -->
	<jaxws:endpoint id="ws-example" implementor="com.joemo.service.TradeServiceImpl"
		address="/ws-example" />

</beans>

That’s everything. There are six files, only one of which contains any Java code. You need to make sure you put each file in the right place:

ws-example/pom.xml
ws-example/src/main/resources/trade.xsd
ws-example/src/main/resources/trade.wsdl
ws-example/src/main/resources/appContext.xml
ws-example/src/main/webapp/WEB-INF/web.xml
ws-example/src/main/java/com/joemo/service/TradeServiceImpl.java

A zip file of this example is available for download here. To build and run it, you will need Maven to be installed on your development system. Unzip the file, and in the directory containing the pom.xml file, run the command:

mvn jetty:run

That will generate the Java code from the WSDL, build the example, and run the web service in the Jetty container. You should be able to visit the URL http://localhost:8080/ws-example/ws-example?wsdl from a web browser and see the WSDL for the web service, test the web service using SoapUI, and so on.
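
If you'd rather poke at the service without SoapUI, here is a rough sketch of what a request for the book operation might look like, assembled by hand in plain Java. The class and method names (TradeRequestSketch, buildBookRequest) are made up for illustration and are not part of the template; the namespace comes from trade.xsd above. Because the schema does not set elementFormDefault, the child elements are left unqualified, while the top-level trade element carries the tr prefix.

```java
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: these names are hypothetical, not part of the template.
class TradeRequestSketch {

    // Namespaces: the SOAP 1.1 envelope, and the target namespace of trade.xsd.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String TRADE_NS = "http://com.joemo.schema.trade";

    // Escape characters that cannot appear verbatim in XML element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Build a document/literal SOAP request for the "book" operation.
    // Each comment is a {message, author} pair, in the order the schema requires.
    static String buildBookRequest(String security, long quantity, List<String[]> comments) {
        StringBuilder sb = new StringBuilder();
        sb.append("<soapenv:Envelope xmlns:soapenv=\"").append(SOAP_NS).append("\"")
          .append(" xmlns:tr=\"").append(TRADE_NS).append("\">")
          .append("<soapenv:Body><tr:trade>")
          .append("<security>").append(escape(security)).append("</security>")
          .append("<quantity>").append(quantity).append("</quantity>");
        for (String[] c : comments) {
            sb.append("<comments>")
              .append("<message>").append(escape(c[0])).append("</message>")
              .append("<author>").append(escape(c[1])).append("</author>")
              .append("</comments>");
        }
        sb.append("</tr:trade></soapenv:Body></soapenv:Envelope>");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String[]> comments =
            Collections.singletonList(new String[] {"Buy & hold", "joe"});
        System.out.println(buildBookRequest("IBM", 100, comments));
    }
}
```

Assuming the service is running under Jetty as described, POSTing the resulting string to http://localhost:8080/ws-example/ws-example with a Content-Type of text/xml should come back with a status element.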

Alternatively you can run the command:

mvn eclipse:eclipse

and follow the directions from my earlier blog entry to run the example using Eclipse WTP, which will allow you to edit the code while keeping it synchronized with a live application server.

Good luck! If you encounter any problems using this template, please email me or post a comment so that I can look into it and revise the instructions if necessary.

A maze of twisty little Java web service standards, all alike

October 22nd, 2008 / Joe on Computing

It’s almost impossible to keep up with all the fractal-like Java standards related to web services. As fast as each can be learned, Sun invents another, and a dozen open source implementations appear. For my own sanity I tried to create a rough map of some of them. I’ll try to avoid making recommendations; my main objective is to sketch out how they fit together.

First, it’s important to understand that there are three main players with implementations of these standards: Sun, the Apache foundation, and Codehaus. There are many other open source implementations as well, but these are the three 800 pound gorillas, for a total of 2400 pounds, or almost exactly one metric tonne (for our international audience).

Second, keep in mind that there are three important APIs which are inter-related: JAX-WS, JAXB, and StAX. Once you understand how these fit together, everything else falls into place more easily.

JAX-WS

Let’s begin our journey with the latest Sun standard for creating and consuming web services: JAX-WS, which stands for Java API for XML Web Services. Version 2.0 of this standard was finalized in 2006. You can ignore JAX-RPC, since JAX-WS replaces it.

There are three noteworthy implementations of JAX-WS. The first is from Sun, and is called JAX-WS RI for the JAX-WS Reference Implementation (they always had a way with names). The second and third are both from the Apache Group and are called Axis2 and CXF. You can ignore Axis1, XFire, and Celtix, since they are all obsolete. There is also a web service framework called Spring-WS, but it’s not JAX-WS compliant.

So if you are creating web services in Java, the first order of business is to choose an implementation to work with, and unless you have a reason not to, you should probably stick to one that complies with JAX-WS, which means either JAX-WS RI, Axis2, or CXF.

Related to these is an open source project from Sun called Web Services Interoperability Technologies (WSIT), previously known as Project Tango. This is an implementation of several web service standards (WS-SecurityPolicy, WS-ReliableMessaging, and so on). Metro is an open source web service stack which is a combination of JAX-WS RI and WSIT (so it’s actually a reasonable fourth option).

JAX-WS is oriented around SOAP web services, but many programmers are now using the REST approach. Sun is coming out with the JAX-RS API to support that, but it’s not quite ready yet.

JAXB

Web service development requires mapping between XML and Java objects. JAXB is the Sun API for that (also referred to as JAXB2 since the latest version is the important one). There are two noteworthy implementations: JAXB-RI (Sun’s reference implementation) and JaxMe (the unfortunately named contribution from Apache). JaxMe is in the incubation stage and is not formally part of Apache yet. There are many other interesting and popular XML/Java mapping frameworks, but most of them are not compliant with JAXB. Examples include Castor (from Codehaus), JiBX (a spectacularly fast open source implementation), and XMLBeans (a flexible implementation from Apache).

A recurring source of confusion is that in the past, Sun was less clear about the distinction between APIs and reference implementations, so people would take JAXB to mean both, and you would often see online articles like “Which is better: JAXB or JiBX?” But today developers should always try to use the JAXB API, which will enable a choice of compliant implementations such as JAXB-RI or JaxMe with minimal or no code changes.

StAX

For Java code that needs to read and write large XML documents quickly without necessarily mapping them to objects, there is the Streaming API for XML (StAX). There are several implementations of this API too. There is the Sun Java Streaming XML Parser called SJSXP (another snappy name from Sun), the Woodstox open source implementation which is excellent, and the StAX reference implementation from Codehaus which is referred to simply as StAX (unfortunately perpetuating the confusion between APIs and implementations). Xerces is a streaming XML processing library which used to be part of the Apache project, and work was underway to make it StAX compliant, but that was dropped.
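
To give a feel for the API, here is a minimal sketch that uses the javax.xml.stream classes to pull the text of every author element out of a document, one event at a time, without building any objects for the rest of it. The class name StaxSketch and the sample document are made up for illustration; whichever StAX implementation is found on the classpath (the one bundled with the JDK by default) does the actual parsing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch only: the class name and sample data are hypothetical.
class StaxSketch {

    // Walk the document event by event and collect the text of each <author> element.
    static List<String> authors(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "author".equals(reader.getLocalName())) {
                        // getElementText() reads up to the matching end tag.
                        result.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so that callers do not have to deal with a checked exception.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<comments><comment><author>joe</author></comment>"
                   + "<comment><author>ann</author></comment></comments>";
        System.out.println(authors(xml));
    }
}
```

Swapping in Woodstox or another compliant implementation should not require touching this code, which is the point of coding against the API rather than a particular parser.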

Putting it all together

Web services need to process XML, sometimes mapping it to and from Java objects (e.g. for creating proxy objects and an RPC-like experience), and sometimes processing it directly (e.g. for streaming results when high performance is needed). Therefore Sun designed the JAX-WS API to rely on the JAXB API, which makes perfect sense; any JAX-WS compliant web service implementation should therefore be able to use any JAXB compliant mapping library. Other relationships between these APIs are up to individual implementations. For example, JAX-WS RI supports the StAX API, so you can use any StAX implementation for streaming. CXF also supports the StAX API, as well as a host of Java/XML mapping options including JAXB (allowing the use of any JAXB compliant implementation), XMLBeans, Castor, and JiBX. Yes, they are heroes.

So if you get confused, just ask yourself clarifying questions like: Does this Java web service library support JAX-WS? Which JAXB compliant Java/XML mapping implementation shall I use? Which is better for processing streaming XML? Woodstox or the StAX reference implementation?

If you’re still confused, then just accept the recommendations I promised not to make: Use CXF for your web services (which complies with JAX-WS), JAXB-RI for your Java/XML mapping (which complies with JAXB), and Woodstox for streaming (which complies with StAX).

Is Eclipse collapsing under its own weight?

October 20th, 2008 / Joe on Computing

Maybe Eclipse’s black-hole-like splash screen is more appropriate than its designers realize. Eclipse’s open architecture has enabled the creation of countless useful plugins, and that’s helped maintain its position as the leading Java IDE. But as plugins compound upon plugins, bugs and compatibility issues have been surfacing increasingly frequently, and I’m starting to get the sense the Eclipse developers have lost control.

I spent considerable time downloading Ganymede today for only one reason: I wanted to try the new JAX-WS WSDL First project wizard. It would be hard to come up with a geekier, more obscure name than that, but in essence the feature promised to allow me to create a web service with the click of a button, following various best practices. With great anticipation (okay, I’m exaggerating; with vague hopefulness) I downloaded Ganymede, and decided to try the wizard with the CXF web service library. Here’s the error I got:

Error instantiating builder 'org.eclipse.stp.sc.annvalidator'.
Plug-in org.eclipse.stp.sc.annvalidator was unable to load class org.eclipse.stp.sc.annvalidator.builder.AnnValidator.

What? Did they even test this? And who is Ann Validator anyway? A quick Google search turned up incomprehensible articles with titles like “AnnValidator missing from Ganymede Update”. The gist of the articles was basically “oh yeah, we should clean up our releases better”.

I’m probably oversimplifying the Eclipse point of view, but I don’t really care. The point is, now I have to research an obscure problem in order to perform something that should be the most basic of activities - creating a simple web service, using a fresh download of a supposedly mature IDE.

There is only one thing keeping me from switching to NetBeans immediately and recommending that everyone dump Eclipse: I’m still waiting for a good Perforce plugin. Okay, two things: I’m also addicted to Max Uermann’s Goto File plugin.

Spring Madness - .NET Calling

September 19th, 2008 / Tales from a Trading Desk

Today was one of those days when you regret ever venturing down the Java road due to the madness imposed on development from the infrastructure teams within an investment bank. Setting up my development environment with Eclipse, source control and Tomcat was a painful and brain-numbing experience. At least by day end I had hopefully contributed to the bloat of Spring configuration. Tomorrow I’m hoping a colleague will show me the secret to getting IntelliJ set up with Tomcat - ever since I used IntelliJ at a recently merged European investment bank I’ve found it increasingly hard to return to Eclipse.

One thing that is clear from today is that C# is in my view a nicer language given the 3.0 feature set - LINQ etc.

      

Disco - Another map-reduce framework

September 5th, 2008 / Tech notes
Highscalability.com discusses Disco, a map-reduce framework, here.

PureMVC

August 13th, 2008

Cross-language implementation of the MVC meta-pattern - PureMVC. Supported languages:

  • ActionScript 2
  • ActionScript 3
  • C#
  • ColdFusion
  • haXe
  • Java
  • Perl
  • PHP
  • Python
  • Ruby

Practical PLT Part 5: A Parser Generator

July 29th, 2008

At this point we’ve developed a complete, though very simple, interpreter and runtime environment.  To some degree, this interpreter is all we need.  It can express any function, it can easily be extended, it has a simple parser, and it has a working garbage collector.  What more could we ask for?  Life is good.

Life is good, that is, until we ask somebody else to use it.  Their first response is probably an annoyance at the syntax — it’s a little surprising to have to write (+ 1 1) after grade school has burned 1 + 1 into our minds — and, truth be told, the S-Expression syntax isn’t the most natural or compact for every problem.

Luckily we can, with a little work, entirely eliminate this complaint by creating a parser generator — a function that takes as input a grammar (a formal description of a language), and produces as output a parser for that language.

Slimmer, trimmer messaging from Google

July 15th, 2008 / Joe on Computing

Google’s Protocol Buffers offer lightweight, language-independent object serialization. I love the design, especially as I’m increasingly seeing enterprise networks clogged with hordes of oversized XML messages. Protocol buffer bindings are available for C++, Java, and Python, but not C# yet. Once there is support for .NET, I think this could be a really interesting technology for financial applications.

Practical PLT Part 4: A Garbage Collector

June 24th, 2008

In the previous articles of this series we’ve seen how to create a simple Scheme interpreter, then how to write an S-Expression parser to feed the interpreter, and in the last article we saw how to most conveniently bind C++ functions to our interpreter with the strategic application of template classes and a “meta-circular” code-generator that built up a vital part of our interpreter (using our interpreter to do it).

Though we’ve come a long way, there’s still one glaring hole in our interpreter design: we’ve left the allocate<T> function unspecified.  This function is meant to allocate a garbage collected value of type T, which implies that we’ve got to write a garbage collector.  In this article, that’s exactly what we’ll do.

Developing Web applications with Maven and Eclipse: You *can* have it all

June 1st, 2008 / Joe on Computing

When developing applications using Eclipse or a similar IDE, you quickly get used to being able to test your software immediately after making a change. Plugins like MyEclipseIDE enable that kind of instant edit/compile/test cycle for web applications as well.

But if you’re building web applications with Maven, it’s not so easy. Maven is a fantastic tool for building applications and managing dependencies, but it lends itself to a more batch-oriented mode of operation in which you build and deploy war files from the command line. I found myself wishing I could have the best of both worlds; building my web applications using Maven, but with a seamless edit/compile/test cycle when using Eclipse. Then I discovered the WTP (Web Tools Platform) project. Hallelujah!

Groovy: Java++ by being Java--

May 8th, 2008 / davber does IT
What? Yet another dynamic scripting language for the JVM? Are you not fed up with the Java-based, and rarely used, implementations of the hyped languages Ruby and Python? Sit down and let me explain. Groovy is actually not a brand new language, but rather an extension of Java. The difference between this extension and that of, say [...]

Practical PLT Part 2: An S-Expression Parser

May 1st, 2008

In the last article we saw how to build a simple interpreter.  We quickly ran into the problem that complex program terms (like the definition of the factorial function) were incredibly difficult to construct manually.  In this article we’ll address that problem by first establishing a shorthand for program terms: S-Expressions.  Then we’ll see how we can mechanically translate textual S-Expressions into the linked Pair structure that our interpreter requires.

Comet: Kaazing

April 30th, 2008 / Tales from a Trading Desk

I recently came across Kaazing. Unfortunately Kaazing’s web site doesn’t have a lot of technical information, but thanks to the power of Google you’ll soon find out that “new type of Virtual Machine(VM) and an Enterprise Comet Bus” refers to the Chi VM. Similar to Lightstreamer, Kaazing is built on Java technology. Here are a few questions I’d like answered:

  • Anyone know of any banks using Kaazing for trading system solutions?
  • Is there any data to compare Kaazing features/performance against Lightstreamer/Liberator?
  • Does the Kaazing SDK come with client side support for Flex and Silverlight?
  • How do I get on the beta program given that it is closed? ;)

Practical PLT Part 1: A Scheme Interpreter

April 22nd, 2008

“What?! You’re going to create another programming language? It’s going to be too hard for new people to learn, and there’s just too much complication involved! Interpreters, type-checkers, compilers, parsers, garbage collectors: these things are too difficult for average programmers to understand. And what’s the real business value anyway? You shouldn’t reinvent the wheel! Have you taken a look at Jimmy’s XML configuration system? He says that his files let you just wire up components, and surely that’s simpler than a new language.”

This series of articles on practical programming language theory (PLT) will be an answer to the above refrain. In this first article, I’ll show just how simple it is to make an interpreter for a basic functional language using any modern C++ compiler.

New market-surveillance application

April 22nd, 2008

Built on the Apama platform, it’s called the Detica Market Surveillance Accelerator. It is designed to sift through all trades booked in a system for bad and/or illegal behavior, to detect errors, and to aid in the development of a more proactive support model.

In light of the Société Générale debacle, I expect to see more of these in development. Apama strikes me as a near-ideal match for this kind of application.

The Rise of App Engine

April 11th, 2008 / Listen to Me!!!
Google released AppEngine, and there were a bunch of reviews comparing it with Amazon EC2. They are not quite the same. From the perspective of a manager who is trying to get a web project done, probably yes: you can achieve the same goal with both...

But the real difference is in the level at which these are operating. AppEngine, Heroku et al. operate, I would say, at a service level. EC2 basically lets you write what the heck ever code you want, and provides an abstraction at the machine level. Instead of running on the bare metal, you would run on EC2 and get all the goodies that come with the classic "Add another layer of abstraction" rule.

The importance of AppEngine is the productivity gain that it would bring to a very focused but large set of applications. Yes, there is no support for batch processing, so a lot of cleanup jobs and the like cannot really be run on the platform as of now. But it is still in its initial stages, and it is too early to comment. I don't see it ever brimming with additional features though. If you have noticed, Google doesn't try to win the market with the feature set of its products. Example: Yahoo Mail and Yahoo Messenger kick GMail and GTalk's butt respectively in terms of feature set. Google wins instead on ease of use and on attacking the problems that matter (like spam, being able to find what you wrote, and storage size).

Also, I really don't see them ever supporting my current favorite web framework, Grails. Why? Grails is tied to Hibernate, which I don't see running on a non-relational database like the one Google uses as the data store of AppEngine. That is not to say they are not working on a "Write your Grails, we port it to something else and run on AppEngine" virtual machine.

There have been free services that ran JSP/ASP based web applications out there for a while. But, those were more of a "write your app, run on our server" kind of model. App Engine is essentially a scalability/availability mantra for the masses. Google has enforced the scalability best practices by restricting the developers to use a subset of features which lend themselves to scaling.

On an unrelated note, I could not find a way to get either Blogger or FeedBurner to offer tag-specific feeds. I had to jump through hoops to achieve it. More on that in a later blog. Stay tuned...

TSSJS 2008 Day Three - Synopsis

March 28th, 2008 / Listen to Me!!!
Session II – eBay Market Place Architecture – Randy Shoup

What happened to Session I’s coverage? Let’s just blame it on the night before :-)

We all know eBay doesn’t do transactions. Well, at least no client side transactions. Their databases run on auto commit. One way they deal with it is by carefully ordering database writes (example: inserting the slave record and then inserting the master to ensure a consistent master). They also have reconciliation jobs that go through the database and cleanse it periodically.

Strategies for scalability used by eBay
1. Partition
2. Asynchrony
3. Automate Everything
4. Remember Everything Fails

1. Partition

Obviously, they don’t use sessions and they don’t cache business objects (surprisingly). As expected, they use URL rewriting and cookies to track the user. If the data they have to keep about the user is larger than will fit in these two schemes, they use a scratch database. Since they don’t cache business / user related data, they do hammer the databases for all their queries. To handle this situation, they use a custom sharding solution over their ORM and partition their database based on functional divides in the application.

Search: Search queries come to an aggregator, which is actually a scatter-gather (from Enterprise Integration Patterns). This component forwards the search requests to individual nodes, each responsible for indexing and searching just a part of the entire data space. The nodes then return their results to the aggregator, which aggregates (duh!) and displays them.
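
To make the scatter-gather idea concrete, here is a minimal sketch in Java. It is not eBay's code; the SearchAggregator class and the idea of modeling each index node as a plain function are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical aggregator: each "node" is modeled as a function from a query
// to the hits found in that node's slice of the data.
class SearchAggregator {
    private final List<Function<String, List<String>>> nodes;

    SearchAggregator(List<Function<String, List<String>>> nodes) {
        this.nodes = nodes;
    }

    List<String> search(String query) {
        // Scatter: submit the query to every node without waiting in between.
        List<CompletableFuture<List<String>>> pending = nodes.stream()
                .map(node -> CompletableFuture.supplyAsync(() -> node.apply(query)))
                .collect(Collectors.toList());

        // Gather: wait for each partial result and merge them in node order.
        List<String> merged = new ArrayList<>();
        for (CompletableFuture<List<String>> partial : pending) {
            merged.addAll(partial.join());
        }
        return merged;
    }
}
```

A real aggregator would also rank the merged hits and put a timeout on slow nodes; both are left out here to keep the pattern visible.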

2. Asynchrony

The really hard part of messaging systems is guaranteeing exactly-once delivery. If you loosen this restriction, it is a lot easier to scale. They deal with duplicate events by modeling event processing to be idempotent. They deal with out-of-order events by making the consumer, once it receives an event, go to a service that returns the latest state.
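
Here is a rough sketch of what such a consumer could look like. All names (IdempotentConsumer, onEvent, the latestState lookup) are made up for the example, and the in-memory set of seen event ids is an assumption; a real system would keep that in durable storage.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical consumer that tolerates duplicate and out-of-order events.
class IdempotentConsumer {
    private final Set<String> seenEventIds = ConcurrentHashMap.newKeySet();
    private final Map<String, String> view = new ConcurrentHashMap<>();
    // Stand-in for the service that returns the latest state of an item.
    // It must not return null, because ConcurrentHashMap rejects null values.
    private final Function<String, String> latestState;

    IdempotentConsumer(Function<String, String> latestState) {
        this.latestState = latestState;
    }

    // Returns true if the event was applied, false if it was a duplicate.
    boolean onEvent(String eventId, String itemId) {
        // add() returns false when the id was already recorded, so a
        // redelivered event changes nothing.
        if (!seenEventIds.add(eventId)) {
            return false;
        }
        // The event is only a trigger: the consumer fetches the current state
        // instead of trusting the payload, so arrival order stops mattering.
        view.put(itemId, latestState.apply(itemId));
        return true;
    }

    String stateOf(String itemId) {
        return view.get(itemId);
    }
}
```

The point of the design is that processing the same event twice, or processing an old event late, leaves the consumer's view in the same state as processing everything exactly once and in order.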

3. Automate Everything

Part of that is adaptive configuration: the consumers dynamically adjust to meet the SLA by changing parameters like event polling size, number of threads, etc. The adaptive configuration also adapts to changes in the number of consumer instances.

He gives an example of an adaptive search experience. They have a feedback loop: in an offline way, they analyze the feedback, create metadata out of it and feed it to the system, which uses it to change its behavior. Perturbation is the idea that 90% of the time they recommend the optimal option, and 10% of the time they recommend new options B, C, D, etc., so that if D becomes popular, it will become the dominant recommended value. They also overweight the negative feedback so that the oscillations are dampened. Pretty slick.
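
The perturbation part can be sketched in a few lines of Java. This is my own illustration, not something shown in the talk: the class name, the exploreRate parameter and the uniform pick among alternatives are all assumptions, and the feedback loop that promotes a popular alternative is not modeled.

```java
import java.util.List;
import java.util.Random;

// Hypothetical recommender: mostly returns the current optimal choice, but
// occasionally tries an alternative so that new options get a chance.
class PerturbingRecommender {
    private final String optimal;
    private final List<String> alternatives;
    private final double exploreRate; // e.g. 0.10 for the 90/10 split
    private final Random random;

    PerturbingRecommender(String optimal, List<String> alternatives,
                          double exploreRate, Random random) {
        this.optimal = optimal;
        this.alternatives = alternatives;
        this.exploreRate = exploreRate;
        this.random = random;
    }

    String recommend() {
        // nextDouble() is uniform in [0, 1), so this branch is taken with
        // probability (1 - exploreRate).
        if (alternatives.isEmpty() || random.nextDouble() >= exploreRate) {
            return optimal;
        }
        // Otherwise pick one of the alternatives uniformly at random.
        return alternatives.get(random.nextInt(alternatives.size()));
    }
}
```

With exploreRate set to 0.10, roughly one recommendation in ten goes to B, C or D, which is what lets a newly popular option show up in the feedback data at all.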

4. Remember Everything Fails

Some of the failure patterns used are failure detection, rollback and graceful degradation. Applications log to a message bus and they have listeners that automate the failure detection. It also gives them historical data, which is used for capacity planning. They get about 1.5TB of log messages every day :-) grep that.

Code rollout/rollback: They have a policy: NO changes to the site that cannot be undone. Each feature has a rollout plan. And there is a monster rollout plan for the 2 weeks. There is an automated tool that rolls out the dependencies in reverse dependency order. The automated tool also does rollbacks.

Here is a cool feature. Every feature has an on and off state. It allows them to turn features off rather than redeploying code that lacks that feature. This allows them to deploy features off and then start them later. They are decoupling code deployment from feature deployment. From a developer perspective, they check for feature availability. To blow my own horn a bit, I have built features in the past which can be turned on and off at runtime. I know what you’re thinking (don’t freak out, I don’t): this is similar to OSGi. Nope, not quite. OSGi is about deploying services and controlling their existence. I would say this may be similar from an implementation perspective, but the intent is quite different.
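
For the curious, the "check for feature availability" idea boils down to something like the sketch below. The FeatureSwitches and ItemPage classes and the "new-checkout" feature name are hypothetical, and defaulting unknown features to off is my assumption, chosen so that code can ship dark.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical runtime registry of feature states.
class FeatureSwitches {
    private final Map<String, Boolean> states = new ConcurrentHashMap<>();

    // Unknown features count as off, so newly deployed code stays dark
    // until someone switches it on.
    boolean isOn(String feature) {
        return states.getOrDefault(feature, false);
    }

    void turnOn(String feature) {
        states.put(feature, true);
    }

    void turnOff(String feature) {
        states.put(feature, false);
    }
}

// Hypothetical caller: the code for both paths is deployed, and the switch
// decides at runtime which one the user sees.
class ItemPage {
    private final FeatureSwitches switches;

    ItemPage(FeatureSwitches switches) {
        this.switches = switches;
    }

    String render() {
        return switches.isOn("new-checkout") ? "new checkout" : "classic checkout";
    }
}
```

Turning the feature off again is a call to turnOff, not a redeploy, which is what makes the rollback story so cheap.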

When a resource fails and it is not critical, it is safely ignored. If a critical service fails, they go to an async mode (and do the processing later) or do failover. When a service does come back up, you don’t want all clients hitting it at once. They have a phased way of letting clients hit it.

Overall, this talk was very informative. Randy went through a lot of concepts in great detail at a very high pace. I felt there was so much more information left that the talk could have gone for one more hour at the same pace.

Session III – The Busy Java Developer’s Guide to Scala – Ted Neward

A pure functional language has no side effects. But you knew that already. This talk focused on giving a Scala intro to a Java developer. Again, I am not going to cover much of this talk as you can read about Scala yourself.

Lunch Keynote Panel: Patrick Linskey, Ted Neward and others with Eugene Ciurana as the mike boy ;-)

The conversation went towards the overabundance of frameworks in Java. One good point made was that you should never sit down to write a framework. You build an application and then extract a framework. In that case, YAGNI will be inherent in that effort. One of the very few web frameworks that was built like that is Rails.

Another point is, it has to be usable before it is reusable. Good one.

In answer to whether, given the appearance of terse, free-form languages (where syntax does not matter much), Java will allow syntax to be optional, it was pointed out that if you try to shoehorn additional features into the language, it will collapse under its own weight. The main thing in Java is the platform and the APIs and frameworks that are available. Additional languages that run on the Java platform but support a whole new set of features will see the light.

Session IV: Map Reduce – Why does it Matter – Eugene Ciurana

We looked at Map Reduce and worked through implementation scenarios in various business domains and the problems or hurdles that we would face in arriving at the solutions using Map Reduce. The audience got to participate and overall, it was a good exercise. (There, I can talk in bullshit business lingo too :-) )

I'm sure you've seen this already, but it is too cool not to point out
http://members.aol.com/matt999h/bullshit.htm

TSSJS 2008 Day Two - Synopsis

March 27th, 2008 / Listen to Me!!!
Session I - Concurrency: Past and Present – Brian Goetz

I have heard Brian’s talks in the past in the No Fluff Just Stuff conferences, and they have always been enlightening.

Brian recommends these papers about concurrency:
"Coping with Parallelism" – Treiber, 1986
"Why Threads Are a Bad Idea" – Ousterhout, 1995
"The Problem with Threads" – Edward Lee, 2006

The talk was a bit more targeted at junior developers than I had hoped. Brian dealt with deadlocks and deadlock avoidance mechanisms. Brian talked about STM (Software Transactional Memory) and why he thinks it is not the silver bullet. He did not mention the reasons. He says he is thinking of presenting them in a talk at JavaOne.

Session II- Performance Puzzlers: Kirk Pepperdine & Brian Goetz

What followed were general commandments of performance (at least in this day and age). Release memory variables soon. Perform benchmarks on a dedicated machine.

One interesting point that was made was that in a benchmark comparing a system using an OR mapper with a similar one that didn’t, the one with the OR mapper performed better. The reason was that the concurrent garbage collector was essentially stealing processor cycles to clean up all the objects created by the OR mapper, and that ended up throttling the system and keeping it from flooding the database with requests. Without the OR mapper, the database was thrashing, leading to worse performance. So, even though both the direct JDBC implementation and the ORM implementation sent a similar number of requests to the database, the ORM implementation spread them out instead of flooding the database. Interesting, but this is one of those things that happens by accident.

Session III- Implementing an ESB solution using Mule – Ross Mason

Services can be categorized as being in one of three layers.
Task based – represents a business task – example: buy a product – little reuse
Entity Based – represents a sub task – example: bill a customer – some reuse
Utility based – represents an atomic independent service – example: credit card processing – most reuse

Reusability of a service usually depends on which layer the service is classified into, with most reuse coming from the lowest (as expected).

Mule provides support for creating functional test cases which, I think, are more useful than testing one single class in isolation. More about unit testing vs functional testing in a later post.

The Mule Expression framework is a way to evaluate expressions on a Mule message. This makes content based routing easier. The expressions can be in XPath, Groovy and a whole lot of other types.

Lunch Keynote: Why the Next Five Years will be about languages – Ted Neward

Ted built a case for language oriented programming. Object oriented programming is not the pinnacle in the evolution of languages. I seriously doubt if we will ever reach the pinnacle :-). One of the required features of a language will be tool support. These days it is a lot easier to get tool support for new languages. And many of these languages can be made to run on the same platform that we currently run on. In the near future, we will see a lot more languages crop up in our mainstream development. Overall, the keynote was great.

Session IV: You got Your Ruby in my Java – Chris Nelson

Chris gave an introduction of Ruby. This talk is basically a Ruby introduction to the Java developer. I am not going to list the content of the talk as you can google it up and read about Ruby. It is nice that we as an industry are finally breaking the language stalemate that we hit and are willing to look at other languages to improve productivity.


Fire Side Chat: Concurrent Programming with Java and Erlang:

This was the first fireside chat that I’ve been to. The closest I’ve been to a fire side chat is viewing a chat on Google video :-). This free form of discussion with the right participants can really touch on various topics. Something like the discussion on TSS without the flame wars :-).

Fire Side Chat: Mission Critical deployment at Leap Frog Systems - Eugene Ciurana

I have to admit that I don't exactly remember the name of the presentation. But this is by far the best one I've attended at TSSJS. The crown jewel, if you will. To give an analogy, if you've read Patterns of Enterprise Application Architecture by Martin Fowler, it is the embodiment of years (if not decades) of rich software development expertise condensed in a book. This talk was the embodiment of years of practical, I repeat, practical software development experience condensed in a 1 hr talk. This was a fire side chat and not a typical presentation. We, the audience, got to ask all sorts of questions about Leap Frog's infrastructure and, more importantly, the design decisions that went into selecting the software solutions. It was like discussing the battle plans and reviewing the war with a war general.

No matter how many subject matter experts you talk to, it is the experience in the field and in the trenches that really teaches you and really counts.

TSSJS 2008 Day One - Synopsis

March 26th, 2008 / Listen to Me!!!
Key Note: Neal Ford, Thoughtworks

Neal started with an assault on Ruby on Rails developers (and later justified the reason why Ruby on Rails developers brag). I think the reason why they brag so much is because it is so productive. Thankfully, on the Java side, we have Grails to the rescue. Anybody who hasn’t taken a look at Grails absolutely should.

Next, he builds a case for DSLs. One of the strong points of DSLs is that the context is implicit. You don’t have to keep reminding the runtime of the context:

Example: Consider the following java code

Car car = new Car();
car.setColor(Color.BLACK);
car.setTransmission(Transmission.MANUAL);
car.setPrice(30000);

Now take a look at this in a dsl that I just made up

create Car:car
color:black
transmission:manual
price:30000

Now that is much clearer, as we don’t have to keep repeating the context, which in this case is car.

One thing you have to give to Neal Ford is, he knows how to give presentations. In his presentation, he normally uses just one sentence per slide, sometimes just one word. And lots of pictures. These pictures are typically not from the computer science domain. These are usually analogies from the real world that help the listener in understanding the underlying concepts. Another technique I noticed him use: when he wants the audience to concentrate on his talk and not on the screen, he leaves the screen blank.

Also, a template for your presentation is not needed. Leave it dark. It works magic with the audience. Templates and light colored backgrounds distract the listener from what you want to show. This style of presentation is also visible in Steve Jobs’s presentations. He never uses a template.

Ok, that was an unintended tangent. Now back to the talk: As a programmer, it adds real value to write programs that are close to how the user talks about the problem. It is a huge advantage if the users can read what the programmer produces.

The final point was that the development stack of the future would be a structured programming language at the bottom. A dynamic programming language on top of it to improve development productivity and then a DSL on top to stay close to the problem space.

Tactical Design by Glenn Vanderburg:

Presentation tip: Try to build on key aspects of prior presentations (the ones you like) so that the audience get a transitionary feel.

His talk concentrated on how to improve the design skills of poor designers and make them good designers. I think it is obvious that DRY and sticking to one level of abstraction throughout the method is a good way to improve design. This is analogous to saying “use Design Patterns”. It is also ironic that he took the same topics and repeated them over and over. Overall, the talk was sub mediocre. Luckily, I had my laptop with me and caught up on some Grails reference documentation.

p.s: Lenovo sucks. But having a built in upgraded battery pack that gives you 6 hours of alive time rocks.

Also, if you're in a talk that you are not sure of, sit at the back so that you can sneak out if the talk sucks. (Also iterated by Hani in one of the previous years of TSSJS)

Self Scaling Java Based Cloud Architectures – Jinesh Varia, Evangelist at Amazon

I can say I hate talks given by evangelists as most of them tend to be biased. The talk sounded like a product presentation and was less about giving insight. It takes a visionary to be able to give insight. I thought I could stick around as I wanted to know more about Amazon Web Services. Basically it was a vendor talk, and it did not delve into the methodology or mindset that leads to cloud computing.

After 30 minutes, I could not bear it anymore and snuck out.

Next stop: Building REST-ful web services with the JAX-RS API – Mark Hansen

This talk turned into a ‘show and tell’ and I snuck out as soon as I typed the title.

Speed: Kirk Pepperdine

This was the third talk I went to in an hour. You can’t go wrong with a talk about the JVM :-). The talk covered improvements in garbage collection times and algorithms. One of the cool things that Java 7 (Sun JVM) will do is look for local references and allocate them in registers, so that they never see RAM.

After the talk, I caught up with Kirk and asked him a few questions about when you would use volatile (for primitives) vs when you would use AtomicBoolean (and other atomic classes). More on this in a separate blog.
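
In the meantime, here is a small sketch of the usual rule of thumb. It is my own illustration with a made-up Worker class, not Kirk's answer: volatile is enough when threads only need to see the latest value of a flag, while AtomicBoolean is what you reach for when a check and an update have to happen as one atomic step.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical worker showing both kinds of flag side by side.
class Worker {
    // Plain visibility: one thread writes the flag, other threads just read it.
    private volatile boolean running = true;

    // Check-then-act: compareAndSet makes "if not yet initialized, mark it
    // initialized" a single atomic operation.
    private final AtomicBoolean initialized = new AtomicBoolean(false);

    void stop() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }

    // Returns true for exactly one caller, even if many threads race here.
    boolean initOnce() {
        return initialized.compareAndSet(false, true);
    }
}
```

Doing the initOnce check with a volatile boolean would need an if followed by an assignment, and two threads could both pass the if before either one writes.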

Designing for scalability: Patrick Linskey

This talk was about horizontal vs vertical scalability, running into contention because of shared state, and partitioning the application at various levels to achieve scalability.

An application can be partitioned in multiple ways. Some of which include:

Partitioning along application bottlenecks.

Partitioning along data set “fault lines”. Example: geographic collocation so that only relevant information is close to where it is being used. This is an example of a stateful service being partitioned for scalability.

Also using asynchronous execution to increase scalability. Nothing ground breaking. One good point made was that asynchronous architectures inherently support throttling so that the system can catch up on processing in off peak hours.


Boldly Go where the java language has never gone before – Geert Bevin

This is an interesting frame of mind: you do not need to leave the Java language. The language can be instrumented or transformed into a different execution platform so that the developer skills can be reused. Geert gives the example of Terracotta, which performs dynamic byte code manipulation of classes to provide clustering without the developer having to use a vendor specific API.

The next example Geert touches on is GWT. GWT lets the developer code in Java but generates JavaScript under the covers in production. I would think the intent of GWT is nice: free the developer from having to worry about JavaScript incompatibilities between different browsers. The implementation is based on Java, which I increasingly think is the wrong tool to create web based/GUI applications. The Java syntax is too buttoned up. Grails, on the other hand, is a wonderful tool to create web applications. I think a Grails implementation of GWT would really shine.

The next example is Android. This is my favorite. Android, as you might already know, lets the developer write code in Java, but it gets compiled into an executable that gets interpreted by the Dalvik virtual machine on the cell phone. Although the developer is coding in Java, like in GWT, what gets executed has nothing to do with the JVM. I have not done any serious Android development. My take is that UI development is better left to templates. I think it is just a matter of time before a UI templating framework is released on top of Android.

The scalability pitfalls of the realtime web- Jonas Jacobi & John Fallows

The presenters built a case for an event driven server, which is basically reverse Ajax, and then started listing the pitfalls and the solutions (or hacks, depending on your perspective) to these pitfalls. Snoozefest. And yes, I got out as soon as I could.

There was only one other presentation that was remotely interesting, which was a use case on SCA.

Next Generation Payment Systems using SCA

My take on SCA is that it is trying to make the same mistake that EJBs did with the “thou shalt give thy Java interface to the client”. I am a big proponent of simple and non language specific contracts between multiple parties in a distributed platform. When you expose a webservice (even REST based ones) you are essentially creating a contract with the end developer about the nature of the data interchange or service invocation. The problem with a language specific contract is that apart from being tied to the language, you now get into headaches with versioning. Most programming languages that we currently have do not work well with versioning. So much so that these days, we have had to build specifications like OSGi to handle multiple versions of an object existing in the same virtual machine.

The beginning of Microsoft’s collaborative work with the Eclipse Foundation

March 19th, 2008 / Development in a Blink

Bruce Kyle posts details.