Performance

20-million messages per second?

September 24th, 2008

At a client site I was privy to an impressive IBM presentation on their WebSphere MQ Low Latency Messaging platform (LLM), with some pretty exorbitant claims: 8 million messages per second on Gigabit Ethernet, and 21 million per second on InfiniBand(!).

I’ll be interested to see if it stands up to scrutiny.

Improving Performance of XML Serializers in .Net

June 23rd, 2008

1. Problem background

The .NET infrastructure makes it very easy to define and use strongly typed wrappers (XML serializers) to read from and write to an XML document. The default implementation of this support comes with a performance penalty that your program pays at run time. By default, the .NET infrastructure generates a serializer class on the fly, compiles it with the C# compiler into a temporary assembly, loads that assembly, and then uses the generated class. The startup cost of doing so is obvious and may not be acceptable in many scenarios. In addition, if the compiler is not available in production (not installed, or blocked from running by policy), the application will simply fail. This article shows how to address the problem by explicitly generating serialization assemblies and shipping them along with your application.

2. Creating sample project

Let's start by creating a sample project that uses XML serialization and demonstrates the problem. In Visual Studio go to File/New/Project…, select Visual C#/Windows/Console Application. Enter “XmlSerializerLab” as the project name and hit OK.

Now that we have generated the default console template, let's add some very simple code that creates and uses the XML serialization infrastructure. Below are the modifications to the generated Program.cs, with explanations:

   1:  using System;
   2:  using System.IO;
   3:  using System.Xml.Serialization;
   4:   
   5:   
   6:  namespace XmlSerializerLab
   7:  {
   8:   [XmlRootAttribute("MyRoot")]
   9:   public class MyXmlRoot
  10:   {
  11:   }
  12:   
  13:   class Program
  14:   {
  15:     static void Main(string[] args)
  16:     {
  17:       // create strongly typed content
  18:       MyXmlRoot root = new MyXmlRoot();
  19:   
  20:       // create serializer
  21:       var serializer = new XmlSerializer(
  22:         typeof(MyXmlRoot));
  23:   
  24:       // serialize
  25:       var ms = new MemoryStream();
  26:       serializer.Serialize(ms, root);
  27:   
  28:       // verify serialized XML
  29:       ms.Position = 0;
  30:       Console.WriteLine(
  31:         new StreamReader(ms).ReadToEnd());
  32:      }
  33:    }
  34:  }

 

Lines 2-3 import the namespaces for stream support and the XML serializer.

Lines 8-11 define the simplest possible class to represent our XML document. We only represent the top-level element, which should suffice for our test.

Line 18 creates an instance of the strongly typed class that we will serialize using the framework.

Lines 20-26 create the serializer and serialize our instance into a memory stream.

Lines 28-31 print the content of the memory stream to standard output so we can eyeball the resulting XML.

If you run your application now you should see something similar to:

[Image: console window showing the serialized XML output]

3. Verifying the default implementation

Now we can add some code to verify that the .NET runtime tries to use pre-generated serializers before generating and compiling them on the fly. By default, the runtime looks for the serializer in “AssemblyName.XmlSerializers.dll”, where “AssemblyName” is the name of the assembly that contains the actual class being serialized (MyXmlRoot in our example). To verify this, we will listen to the application domain’s AssemblyResolve event and print every assembly that the .NET loader tried to locate but could not find.

   1:  using System;
   2:  using System.IO;
   3:  using System.Xml.Serialization;
   4:   
   5:   
   6:  namespace XmlSerializerLab
   7:  {
   8:   [XmlRootAttribute("MyRoot")]
   9:   public class MyXmlRoot
  10:   {
  11:   }
  12:   
  13:   class Program
  14:   {
  15:     static void Main(string[] args)
  16:     {
  17:       AppDomain.
  18:         CurrentDomain.AssemblyResolve +=
  19:         (sender, e) =>
  20:       {
  21:         Console.WriteLine("Not found: {0}",
  22:           e.Name);
  23:         return null;
  24:       };
  25:       
  26:       // create strongly typed content
  27:       MyXmlRoot root = new MyXmlRoot();
  28:   
  29:       // create serializer
  30:       var serializer = new XmlSerializer(
  31:         typeof(MyXmlRoot));
  32:   
  33:       // serialize
  34:       var ms = new MemoryStream();
  35:       serializer.Serialize(ms, root);
  36:   
  37:       // verify serialized XML
  38:       ms.Position = 0;
  39:       Console.WriteLine(
  40:         new StreamReader(ms).ReadToEnd());
  41:      }
  42:    }
  43:  }

 

Lines 17-24 subscribe to the AssemblyResolve event and print the name of the assembly being located to the console. If you run the application now, you will see that the .NET loader tried to load the serialization assembly for the MyXmlRoot class twice. First it attempted to locate the assembly by its full display name (XmlSerializerLab.XmlSerializers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null), and then by its simple name alone (XmlSerializerLab.XmlSerializers).
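The naming convention described above can also be checked directly, without waiting for a failed load. The following is only a minimal sketch: SerializerAssemblyProbe, ExpectedPath and Exists are hypothetical names made up for illustration, and the sketch assumes the serializer assembly would sit in the same folder as the assembly that owns the type, which is only one of the places the loader may look.

```csharp
using System;
using System.IO;
using System.Reflection;

namespace XmlSerializerLab
{
  // Hypothetical helper, not part of the .NET framework.
  public static class SerializerAssemblyProbe
  {
    // Builds the conventional file path "<AssemblyName>.XmlSerializers.dll"
    // next to the assembly that defines the given type (an assumption).
    public static string ExpectedPath(Type type)
    {
      Assembly owner = type.Assembly;

      // Location can be empty (for example for in-memory assemblies),
      // so fall back to the application base directory in that case.
      string dir = string.IsNullOrEmpty(owner.Location)
        ? AppDomain.CurrentDomain.BaseDirectory
        : Path.GetDirectoryName(owner.Location);

      string file = owner.GetName().Name + ".XmlSerializers.dll";
      return Path.Combine(dir, file);
    }

    // True if a file with the conventional name exists at that path.
    public static bool Exists(Type type)
    {
      return File.Exists(ExpectedPath(type));
    }
  }
}
```

Calling SerializerAssemblyProbe.Exists(typeof(MyXmlRoot)) at the top of Main gives a quick yes/no answer about whether a pre-generated serializer file is present for the sample project.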

4. Solution

As I mentioned at the very beginning of the article, the solution is really simple: we need to pre-generate serialization assemblies at build time and ship them with our application. Please note that Visual Studio has a project setting called “Generate serialization assembly” in project properties/Build tab/Output section. Contrary to what its name implies, it won’t do you any good unless your project contains Web service proxies. I’m not sure if this is by design or simply a bug, but for everything else, adding an explicit post-build script (sgen /a:"$(TargetPath)" /force) will do it. Make sure that the “sgen” utility is in the PATH or that its full path is given explicitly. Alternatively, you can run the command manually from the Visual Studio command prompt. Start the VS command prompt, change the active folder to our project’s bin/Debug and run the following:

[Image: sgen command run from the Visual Studio command prompt]

If you examine the Debug folder now, you will find that “XmlSerializerLab.XmlSerializers.dll” was created. Run your application now. You will see that no assembly resolution events are fired by the .NET loader and that startup time is considerably faster.
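To put a number on the startup difference, one option is to time the first construction and use of the serializer with and without the pre-generated assembly in place. This is a rough sketch rather than a proper benchmark: SerializerTiming and MeasureFirstUse are hypothetical names, and the measurement is only meaningful for the first call in a process, since XmlSerializer caches the serializer it creates for a given type when this constructor is used.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

namespace XmlSerializerLab
{
  // Hypothetical helper for a quick, informal measurement.
  public static class SerializerTiming
  {
    // Measures how long it takes to create an XmlSerializer for the
    // given type and serialize one instance into a memory stream.
    public static TimeSpan MeasureFirstUse(Type type, object instance)
    {
      Stopwatch watch = Stopwatch.StartNew();

      // This is the step that either loads a pre-generated serializer
      // assembly or generates and compiles one on the fly.
      var serializer = new XmlSerializer(type);

      using (var ms = new MemoryStream())
      {
        serializer.Serialize(ms, instance);
      }

      watch.Stop();
      return watch.Elapsed;
    }
  }
}
```

For the sample project, printing SerializerTiming.MeasureFirstUse(typeof(MyXmlRoot), new MyXmlRoot()) once before and once after running sgen should make the difference visible. Expect the numbers to vary from run to run and from machine to machine.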

Not-So-Hidden Latency

March 19th, 2008

I had a meeting this morning with Al Moore, one of the founders of Fixnetix, a provider of ultra-low-latency market data and connectivity. It immediately brought to mind a conversation I had with Tom Groenfeldt earlier this week about hidden latency. It continues to baffle me that financial institutions will spend millions shaving microseconds off their data handling processes by optimizing their code and implementing CEP solutions, and then, after all is said and done, hook this newly optimized codebase up to something like Reuters to receive their data, a feed whose own latency is milliseconds higher than that of a low-latency data provider. Why not pocket that money, save those man-hours and just switch data providers? Or better yet, do both?

It’s a classic case of not seeing the forest for the trees. Optimizing a system requires looking at the entire system - not just diving into a piece of it. You might very well shave more latency off of your architecture by changing data providers or removing that one extra switch from your network architecture than spending man-years optimizing your event processing software. Financial institutions need to remember to focus on the not-so-hidden latency before diving into a search for hidden latency.

CEP code fragment: Leapfrog queueing in Apama monitorscript

February 7th, 2008

The idea is to respond to incoming price events, but always process the latest event, skipping intermediate events. Here’s how you can express that in Apama’s MonitorScript language.

You can stage processing of an external event by caching an internal version of the event and enqueuing a ‘process-this-price’ event (putting it at the back of the queue). As more 3Y prices come in, they overwrite the cached event and enqueue more ‘process-this-price’ events. When the first ‘process-this-price’ event is reached, the latest price is removed from the cache and the internal version is routed as the ‘real’ price event. Subsequent ‘process-this-price’ events result in no events, because the 3Y price was already removed from the cache. Here’s the code: Read the rest of this entry »
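The MonitorScript listing itself sits behind the link above and is not reproduced here. As a rough illustration of the same leapfrog idea outside Apama, here is a small single-threaded C# sketch. The LeapfrogQueue class and its members are hypothetical names, and a plain in-memory queue stands in for the correlator's event queue, so this shows the caching logic only and is not the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of leapfrog queueing; not thread-safe.
public sealed class LeapfrogQueue<TKey, TValue>
{
  // Latest cached value per key (for example per instrument such as "3Y").
  private readonly Dictionary<TKey, TValue> latest =
    new Dictionary<TKey, TValue>();

  // One 'process-this-price' token per incoming event.
  private readonly Queue<TKey> tokens = new Queue<TKey>();

  // Incoming event: overwrite the cached value and enqueue a token.
  public void Publish(TKey key, TValue value)
  {
    latest[key] = value;
    tokens.Enqueue(key);
  }

  // Handles the next token. Returns true with the latest cached value
  // if the cache still holds one for that key, and removes it. Returns
  // false if the queue is empty or an earlier token already consumed
  // the cached value.
  public bool TryProcessNext(out TValue value)
  {
    value = default(TValue);
    if (tokens.Count == 0)
    {
      return false;
    }

    TKey key = tokens.Dequeue();
    if (!latest.TryGetValue(key, out value))
    {
      return false;
    }

    latest.Remove(key);
    return true;
  }

  // Number of tokens still waiting to be handled.
  public int PendingTokens
  {
    get { return tokens.Count; }
  }
}
```

If three 3Y prices are published before any token is handled, the first call to TryProcessNext returns the third price and the next two calls return false, which is the skipping behaviour described above.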

CUDA: GPU architecture for NVidia cards

February 4th, 2008

This morning fellow 49'er Doug and I spoke with a colleague at Citibank who spent Friday at an absolutely free boot camp for NVIDIA's GPU development architecture. Wish we’d known about it beforehand, if nothing else for the free lunch. :)

There’s a lot of activity around building development architectures to take advantage of the compute power on Graphics Processing Units. Read the rest of this entry »

A Profiler Worth Mentioning

June 5th, 2007

After spending days trying to get Rational’s Quantify and Compuware’s DevPartner Studio to play nicely with our multi-threaded Windows application (with no success to show whatsoever), I’m happy to report that our search for a .NET 2.0-compatible performance profiler is at a tentative end. On the advice of an esteemed colleague (let’s call him “Luke”), we downloaded a trial version of JetBrains’ dotTrace 3.0, and in a matter of minutes it proved itself to be far more performant and user-friendly than either of the two highly touted packages above.

The trial version (good for 10 days) can be found here.