Thinking twice about .NET events in high-performance scenarios

Yesterday I asked a question on Stack Overflow about using .NET events in high-performance code. My question was, basically, whether it makes sense to do so, or whether it’s more appropriate to use some alternative strategy for providing “event-like” behavior that results in less GC pressure.

If you’re interested in the details of the question, you can always visit the above link. But in essence, my question stemmed from the realization that your typical event-based code involves:

  • A delegate most likely of type EventHandler<TEventArgs>, where TEventArgs derives from EventArgs
  • An object of the abovementioned TEventArgs type, which must be instantiated for every “event”

That last part is really the crux of my question. In a high-performance context, is it really the right thing to do to create a new object on every event, if events are happening extremely rapidly?

The voice of reason inevitably steps in to answer questions like this, and yesterday it came in the form of (among a few others) the ever-reliable Hans Passant, an SO user I don’t know personally but whom I’ve come to greatly respect for his consistently matter-of-fact and logical answers. He said:

[The TEventArgs] objects are always gen #0 objects. They don’t stick around long enough to ever get promoted. Both allocating and garbage collecting them is dirt cheap. This is just not a real problem, don’t chase that ghost.

100% sound, reasonable advice. But I chased the ghost anyway. And I’m posting this to share my findings with the world.

(The short answer, if you don’t feel like following me too far down this detail-laden road, is that Hans was right. It… barely… matters. But I still think the information I’m about to share is, at the very least, interesting.)

First of all, here’s what I did. I wrote a little demo program comprising the following pieces:

  1. An object that raises events at a ridiculously high rate (basically, as fast as the machine and the CLR can manage)
  2. A method that tracks GC collections and memory allocation while said object is doing its thing
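
For the second piece, a tracking method might look something like the sketch below. This is not the original demo code: GcMonitor, Reset, and Report are hypothetical names, and the report format is only modeled on the program output. The sketch relies on GC.CollectionCount and GC.GetTotalMemory from the base class library.

```csharp
using System;

// Hypothetical helper (an assumption, not the original demo code).
static class GcMonitor
{
    // Collection counts recorded at the start of a test run.
    static int _gen0, _gen1, _gen2;

    // Remember the current counts so later reports show only this run.
    public static void Reset()
    {
        _gen0 = GC.CollectionCount(0);
        _gen1 = GC.CollectionCount(1);
        _gen2 = GC.CollectionCount(2);
    }

    // Describe the collections since Reset() plus the current heap size.
    public static string Report(long eventsRaised)
    {
        // false = do not force a collection just to take the measurement
        long bytes = GC.GetTotalMemory(false);

        return string.Format(
            "{0} events raised; {1} total memory used.{2}" +
            "{3} gen 0 collections{2}" +
            "{4} gen 1 collections{2}" +
            "{5} gen 2 collections",
            eventsRaised,
            bytes,
            Environment.NewLine,
            GC.CollectionCount(0) - _gen0,
            GC.CollectionCount(1) - _gen1,
            GC.CollectionCount(2) - _gen2);
    }
}
```

A test harness would call GcMonitor.Reset() before starting a data source and print GcMonitor.Report(count) every million events or so. One thing to keep in mind when reading such numbers: as far as I know, a collection of a higher generation also collects the younger ones, so the gen 0 count includes every gen 1 and gen 2 collection.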

Here’s the base type I used for each of the different “strategies” for raising events:

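using System;
using System.Threading;

abstract class CrazyDataSource : IDisposable
{
    // Signaled by Dispose to tell the background loop to exit.
    readonly ManualResetEvent _stop;

    public CrazyDataSource()
    {
        _stop = new ManualResetEvent(false);
    }

    public void StartReceivingDataLikeCrazy()
    {
        // Background thread, so it never keeps the process alive.
        var t = new Thread(ReceiveDataLikeCrazy);
        t.IsBackground = true;
        t.Start();
    }

    public void Dispose()
    {
        // Only signals the loop to stop; the wait handle is left open
        // because the background thread may still be polling it.
        _stop.Set();
    }

    // Each event-raising strategy supplies its own implementation.
    protected abstract void OnDataReceived(int data);

    private void ReceiveDataLikeCrazy()
    {
        int data = 0;
        // WaitOne(0) polls the stop signal without blocking.
        while (!_stop.WaitOne(0))
        {
            OnDataReceived(data++);
        }
    }
}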

As you can see, the above type calls OnDataReceived literally as fast as it possibly can (on a single thread). This is definitely an exaggerated scenario; virtually no real-world event will be raised at this frequency. (I work at an algorithmic trading firm that trades thousands of products and whose software subscribes to real-time market data, and our market data definitely doesn’t come in that fast.) But sometimes taking an exaggerated approach is the best way to highlight real differences.

Now, to actually test the performance difference between the various approaches to providing event-like behavior, I wrote three classes deriving from CrazyDataSource: one that raises events the “normal” way, one that raises events whose signature doesn’t match the traditional EventHandler<TEventArgs> pattern, and one that doesn’t raise .NET events at all and instead uses a virtual method to signal (for example) the receipt of data. And of course I wrote a little program to test the GC pressure exerted by each.
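
To make the three strategies concrete, here is a minimal sketch of what such classes could look like. It is an illustration under assumptions, not the original demo code: every type and member name below is hypothetical, Action<int> is just one possible “unconventional” delegate type, and DataSource is a simplified, synchronous stand-in for CrazyDataSource so that the block compiles on its own.

```csharp
using System;

// Simplified stand-in for CrazyDataSource (assumption: no background thread).
abstract class DataSource
{
    protected abstract void OnDataReceived(int data);

    // Feeds the values 0..count-1 through OnDataReceived synchronously.
    public void Pump(int count)
    {
        for (int i = 0; i < count; i++)
            OnDataReceived(i);
    }
}

// Hypothetical event args type for the conventional approach.
class DataReceivedEventArgs : EventArgs
{
    public readonly int Data;
    public DataReceivedEventArgs(int data) { Data = data; }
}

// 1. Conventional: EventHandler<TEventArgs>, one heap object per raised event.
class ConventionalSource : DataSource
{
    public event EventHandler<DataReceivedEventArgs> DataReceived;

    protected override void OnDataReceived(int data)
    {
        EventHandler<DataReceivedEventArgs> handler = DataReceived;
        if (handler != null)
            handler(this, new DataReceivedEventArgs(data));
    }
}

// 2. Unconventional: the delegate takes the int directly, so raising
//    the event does not need to create an args object.
class UnconventionalSource : DataSource
{
    public event Action<int> DataReceived;

    protected override void OnDataReceived(int data)
    {
        Action<int> handler = DataReceived;
        if (handler != null)
            handler(data);
    }
}

// 3. No event at all: consumers subclass and override HandleData.
class VirtualMethodSource : DataSource
{
    protected override void OnDataReceived(int data)
    {
        HandleData(data);
    }

    protected virtual void HandleData(int data) { }
}
```

The trade-off in this sketch is flexibility: the first two variants let any number of subscribers attach at run time, while the third requires a subclass for each consumer. Only the first creates a new object every time an event is raised with a subscriber attached.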

Now here’s the interesting part. Take a look at the output of an example run of said program:


Press Enter to start testing the 'conventional' event-based approach.
Press Esc any time to stop.

1000000 events raised; 321772 total memory used.
23 gen 0 collections
21 gen 1 collections
0 gen 2 collections

2000000 events raised; 264428 total memory used.
46 gen 0 collections
44 gen 1 collections
0 gen 2 collections

3000000 events raised; 207084 total memory used.
69 gen 0 collections
67 gen 1 collections
0 gen 2 collections

Finished. Cleaning up...
Done. Press Enter to continue.

Press Enter to start testing the 'unconventional' event-based approach.
Press Esc any time to stop.

1000000 events raised; 205660 total memory used.
0 gen 0 collections
0 gen 1 collections
0 gen 2 collections

2000000 events raised; 213852 total memory used.
0 gen 0 collections
0 gen 1 collections
0 gen 2 collections

3000000 events raised; 213852 total memory used.
0 gen 0 collections
0 gen 1 collections
0 gen 2 collections

Finished. Cleaning up...
Done. Press Enter to continue.

Press Enter to start testing the virtual method approach.
Press Esc any time to stop.

1000000 events raised; 205776 total memory used.
0 gen 0 collections
0 gen 1 collections
0 gen 2 collections

2000000 events raised; 213968 total memory used.
0 gen 0 collections
0 gen 1 collections
0 gen 2 collections

3000000 events raised; 213968 total memory used.
0 gen 0 collections
0 gen 1 collections
0 gen 2 collections

Finished. Press Enter to quit.

Now that, to me, is pretty interesting. The conventional event-based approach actually did result in not only plenty of garbage collections, but even generation 1 collections! This is contrary to what Hans had indicated, and it surprised me as well (how were those little objects making it all the way to gen #1?).

In contrast, using the “unconventional” event-based approach created seemingly no garbage at all. And neither did the virtual method approach. This makes sense; if you don’t instantiate objects, there isn’t going to be anything to collect. Still, the contrast is quite striking.

Now, before you get all excited and decide to forsake events forever, let me just state the obvious: this was a ridiculously unrealistic test. Any code you may deal with that raises a million events in like 1–2 seconds is likely to be… well, in desperate need of fixing. Or scrapping. So I am definitely not planning on digging through all my company’s code and refactoring the events out of everything.

But I am glad that I conducted this test, as it will inform the way I design high-performance components going forward. In the context of real-time trading, where garbage collections are one’s worst enemy, it’s definitely good to know as much as possible about where memory is being allocated in your code, so that, should the need arise, you at least know where to look when it comes time to prune. (I’m not saying that stripping out events would be my first order of business, mind you. Not by a long shot.)
