Comparing two WorkerChain implementations

In my last post I gave my rationale for designing a WorkerChain class that would provide a simple API for executing actions atomically and asynchronously — but in sequence with respect to one another. (The concept is a bit involved; to fully understand it, take a look at the original post.)

My (First) Implementation

My first crack at the implementation of this class (full source code posted here) might be described as follows:

The chain consists of a collection of workers. This collection may be visualized as a team of individuals in a line, each of whom is standing next to a switch labeled “Get To Work.” Each worker has a single task assigned to him. When his “Get To Work” switch is flipped on, he immediately switches it off and gets started on his task. Once he’s finished, he does two things:

  1. He flips his neighbor’s switch to “On” (neighbor here meaning the worker who comes after him on the team). If his neighbor’s switch was already on, he leaves it on.
  2. If his own switch was switched on while he was working, he restarts: that is, he switches it off and gets started right back up on his task.

The code I wrote to implement this concept wasn’t all that complicated, really. The workers were implemented as the ThreadedWorker class, which spawns its own thread in its constructor and uses an AutoResetEvent as its signal to get to work. On completion, the worker raises an event: WorkCompleted.

So really the meat of this class is in these two methods:

public void GetToWork() {
    // flip this worker's "Get To Work" switch (if it's already on, it stays on)
    _actionRequestedEvent.Set();
}

// this loop is running on a dedicated thread
private void Work() {
    // the AutoResetEvent flips the switch back off the moment it's observed,
    // so a signal that arrives mid-task simply makes the loop run again --
    // the "restart" behavior described in point 2 above
    while (_actionRequestedEvent.WaitOne()) {
        try {
            _workItem.Invoke();

        } finally {
            // signal completion (the chain wires this to the next worker)
            OnWorkCompleted();
        }
    }
}
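For context, the rest of the ThreadedWorker class is roughly the following shape. Treat this as a sketch rather than the exact source (which is in the linked post): the background-thread setting, the event signature, and any member names not mentioned above are my own guesses, and the two methods from above are repeated so the sketch stands on its own.

using System;
using System.Threading;

public class ThreadedWorker {
    private readonly Action _workItem;
    private readonly AutoResetEvent _actionRequestedEvent = new AutoResetEvent(false);
    private readonly Thread _thread;

    public event EventHandler WorkCompleted;

    public ThreadedWorker(Action workItem) {
        _workItem = workItem;
        // the worker spawns its own thread in its constructor
        _thread = new Thread(Work) { IsBackground = true };
        _thread.Start();
    }

    public void GetToWork() {
        _actionRequestedEvent.Set();
    }

    private void Work() {
        while (_actionRequestedEvent.WaitOne()) {
            try {
                _workItem.Invoke();
            } finally {
                OnWorkCompleted();
            }
        }
    }

    private void OnWorkCompleted() {
        EventHandler handler = WorkCompleted;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}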

The way the chain works is this: in its constructor, the ThreadedWorkerChain class initializes an internal array of ThreadedWorker objects and adds a handler to each worker’s WorkCompleted event that triggers the next worker’s GetToWork method.

The chain exposes a single public method: StartWork. All it does is call GetToWork on the first worker in the chain.
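To make that wiring concrete, here is a minimal sketch of the chain, assuming the ThreadedWorker shape sketched above; the constructor signature and member names (beyond StartWork, GetToWork, and WorkCompleted) are my guesses rather than the actual source:

using System;
using System.Linq;

public class ThreadedWorkerChain {
    private readonly ThreadedWorker[] _workers;

    public ThreadedWorkerChain(params Action[] workItems) {
        _workers = workItems.Select(item => new ThreadedWorker(item)).ToArray();

        // when worker i finishes, flip worker i+1's "Get To Work" switch
        for (int i = 0; i < _workers.Length - 1; i++) {
            ThreadedWorker next = _workers[i + 1];
            _workers[i].WorkCompleted += (sender, e) => next.GetToWork();
        }
    }

    public void StartWork() {
        // flip the first worker's switch; the WorkCompleted handlers do the rest
        _workers[0].GetToWork();
    }
}

A call like new ThreadedWorkerChain(a, b, c).StartWork() then runs a, b, and c in sequence, each on its own dedicated thread.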

With me so far? OK, good.

Aaronaught’s Implementation

I posted a question about this idea on Stack Overflow (a ridiculously addictive site… if you’re a huge nerd like myself) and, to my pleasant surprise, got some excellent suggestions. The most fully fledged suggestion — by which I mean a full freaking implementation (no kidding!) — came from the user Aaronaught, who (probably correctly) remarked that my solution seemed overly complex.

Here’s how Aaronaught’s implementation might be described (note that his terms “pipeline” and “stage” roughly correspond to my own terms “chain” and “worker” — but not exactly, as I will explain):

The pipeline consists of a collection of stages. Each stage may be visualized as a little machine that produces and stores what I’ll call “workers” (even though “worker” here doesn’t really mean what it meant in my version — here the “worker” is really the action itself). Each machine has a button labeled “Execute.” When the button is pressed, the machine does the following:

  1. If the machine doesn’t contain any workers, it produces one and immediately sets it to work.
  2. If the machine already contains one worker, it produces another worker that is programmed to get to work immediately once the first worker is finished.
  3. At any given time the machine can store no more than two workers. So if the machine already contains two workers, pressing the button does nothing.
  4. Once a worker is finished, it self-destructs (the machine consequently contains one less worker).

I think I’ve done Aaronaught’s idea justice in the above description. Obviously, it’s only an analogy; but for the purpose of comparing our two implementations in principle, I think it’s pretty accurate.

Now let’s take a look at the code, complete with comments (added by yours truly) explaining how the code works in relation to my somewhat ridiculous description above:

private bool CanExecute()
{
    if (readyEvent.WaitOne(0, true))   // machine is empty --
        return true;                   // produce a worker right away
    if (!queuedEvent.WaitOne(0, true)) // is the queue slot already taken?
        return false;                  // yep -- a second worker exists; do nothing
    readyEvent.WaitOne();              // nope -- the check above just claimed the
                                       // slot; wait for the current worker to finish
    queuedEvent.Set();                 // then free the slot up again
    return true;
}

public bool Execute()
{
    if (!CanExecute()) // machine already contains two workers
        return false;
    action();          // the worker does its work
    readyEvent.Set();  // worker self-destructs (waking the queued worker, if any)
    return true;
}
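One thing the excerpt doesn’t show is how readyEvent and queuedEvent are declared. For the two methods to behave as described, both presumably need to be auto-reset events that start out signaled: readyEvent signaled means “the machine is empty,” and queuedEvent signaled means “the queue slot is free.” Here is the whole stage as I understand it, with the declarations filled in as an assumption on my part and the two methods repeated from above so the sketch is complete:

using System;
using System.Threading;

public class Stage
{
    private readonly Action action;

    // signaled = the machine is empty (no worker currently running)
    private readonly AutoResetEvent readyEvent = new AutoResetEvent(true);

    // signaled = the queue slot is free (no second worker waiting)
    private readonly AutoResetEvent queuedEvent = new AutoResetEvent(true);

    public Stage(Action action)
    {
        this.action = action;
    }

    private bool CanExecute()
    {
        if (readyEvent.WaitOne(0, true))
            return true;
        if (!queuedEvent.WaitOne(0, true))
            return false;
        readyEvent.WaitOne();
        queuedEvent.Set();
        return true;
    }

    public bool Execute()
    {
        if (!CanExecute())
            return false;
        action();
        readyEvent.Set();
        return true;
    }
}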

That gives you an idea of how the stage “machines” (my term, not Aaronaught’s) work. The pipeline, quite simply, enumerates over the stages and executes each one in sequence, cutting out as soon as Execute returns false (in terms of the analogy, as soon as pressing the button does nothing):

public void Start(int index)
{
    foreach (Stage currentStage in stages.Skip(index + 1))
        if (!currentStage.Execute())
            break;
}

Now, since one of my requirements for this chain idea was for it to be non-blocking, I would simply wrap that Start method in a call to ThreadPool.QueueUserWorkItem. But otherwise, it definitely gets the job done.
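For the record, that wrapper is a one-liner. Something like the following could be added to the pipeline class (the StartAsync name is mine, not Aaronaught’s):

// inside the pipeline class, alongside Start(int index)
public void StartAsync(int index)
{
    // hand the blocking Start call off to a thread-pool thread
    // so the caller returns immediately
    ThreadPool.QueueUserWorkItem(state => Start(index));
}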

Conclusion?

I haven’t really reached a conclusion yet, actually. Preliminary testing demonstrates that both of these approaches work. In my opinion, Aaronaught’s is cleaner and simpler. On the other hand, I feel that the two implementations are different enough that it’s not immediately obvious which one is “better.” There is also a tiny issue* with Aaronaught’s version, which I mentioned in a comment on his answer to the question I posted on Stack Overflow (for patient readers only!).

Sadly, I’ve also been having trouble coming up with a good way to compare these implementations quantitatively. (There’s always profiling, but that can be quite tricky with threads, especially when wait handles are involved.)

Any ideas on this matter of comparing implementations would be most welcome. In the meantime, my tremendous thanks to Aaronaught, as well as the others who responded to my question, for providing such thoughtful input.

My next post is going to be about something completely different: my first open-source project!

*”Issue” might not really be the word; it’s certainly not a bug — just a sort of gray area where it’s unclear whether the requirements are met perfectly.
