Monthly Archives: February 2011

Sometimes the best strategy is to retreat

This post has moved to my new location, philosopherdeveloper.com:

http://philosopherdeveloper.com/posts/sometimes-the-best-strategy-is-to-retreat.html


Lady Gaga needs a grammar lesson

So, I was lis— my wife was listening to the song “Bad Romance” this morning and I heard these lyrics:

I want your love and I want your revenge
You and me could write a bad romance

I had heard these lines countless times before, and yet… it just struck me for the first time this morning:

You and me could write?

It’s very strange to defend a guess (and a bonus: yes, static constructors get called when using reflection)

First of all: I fully intend to respond to the questions a couple of readers raised in response to my recent post about agile development. The truth is that when I wrote it, I was fully aware that I was giving a very high-level view of what agile development is without providing a lot of explanation about how it’s actually implemented (hence the common concern, “So is agile really just about not planning anything?”—doesn’t sound right, does it?). But I’m not quite ready to delve into an in-depth response just yet! That will require some time, and I’ve been quite busy at work.

The only thing I have to write about today is that I find it very odd when people defend their positions, when their positions are clearly just guesses.

Here’s an example. This morning on Stack Overflow, a user asked if a type’s static constructor would be called upon field access using reflection. I thought it was a pretty good question, albeit fairly trivial to test (I don’t know why the user didn’t just check and see for himself).

The correct answer is yes, field access using reflection will call a type’s static constructor. This is verifiable, and I included example code in my answer.
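For anyone who wants to check this for themselves, here is a minimal sketch of such a test. This is not the code from my Stack Overflow answer; the Value field and the value 42 are made up purely for illustration.

```csharp
using System;
using System.Reflection;

class TestClass
{
    // Hypothetical field, assigned only inside the static constructor.
    public static int Value;

    static TestClass()
    {
        Console.WriteLine("I am in TestClass's static constructor.");
        Value = 42;
    }
}

class Program
{
    public static void Main(string[] args)
    {
        // Looking up the field only reads metadata; no static state is touched yet.
        FieldInfo field = typeof(TestClass).GetField("Value");
        Console.WriteLine("Got the FieldInfo.");

        // Reading the static field through reflection is a field access,
        // so the runtime runs the static constructor before returning the value.
        object value = field.GetValue(null);
        Console.WriteLine("Value = " + value);
    }
}
```

If the claim holds, the static constructor's message should show up after "Got the FieldInfo." and before the value is printed, and the value should be the one assigned in the static constructor rather than the default of 0.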

Curiously, another user answered with this:

If the value is set in the static constructor, it’s ONLY set when first accessed which won’t include being accessed via reflection. It isn’t initialized at runtime automatically.

This is factually wrong. So apparently the user was just guessing at the correct answer.

OK, that happens. But then another user pointed out that the answer appeared to be incorrect, citing my answer, to which the user responded with another guess:

Actually I am correct. In Dan’s answer he uses typeof(TestClass) which in fact calls the Static Constructor.

Again, factually wrong! We can clearly see this with the following very short code:

using System;

class TestClass
{
    static TestClass()
    {
        Console.WriteLine("I am in TestClass's static constructor.");
    }
}

class Program
{
    public static void Main(string[] args)
    {
        Type testClassType = typeof(TestClass);
        Console.WriteLine("OK, we're past the typeof part.");
    }
}

Output:

OK, we're past the typeof part.

So the typeof operator does not cause a type’s static constructor to be invoked. Why would this user (1) state a guess as if it’s a fact, and then (2) defend that guess with another guess, also wrong?

I would like to believe this is an isolated case, but actually I believe that people do this all the time. In fact I tend to think that a whole lot of the controversial issues that people discuss in our society are actually matters of fact which simply have not been definitively established yet (e.g., will government policy X or Y be more effective at reducing crime?).

The human inclination to vehemently defend guesses is one which I have to believe has served our species well in the past (I should hope so, otherwise I fail to see why we do it so readily). But I still just don’t quite get it.

A concrete example of how Control.Invoke can cause deadlock

I’ve written a number of times in this blog about deadlock, but I don’t know that I’ve ever given a concrete example illustrating how it can actually occur in code.

The unfortunate truth is that it’s easier to stumble upon than you might think. What I will give shortly is an example using some very common tools many .NET developers should be familiar with: the Windows Forms API and a simple WaitHandle.

But first, a refresher on what deadlock is.

[Image: a mother and son negotiating over a broken vase]

Deadlock happens when two parties (threads, processes, or people) are each waiting on the other, so that neither can proceed. In the cartoon above, the boy will not reveal what happened (obvious though it may be) until his mother promises not to be mad. His mother, in turn, can’t make any such promise until she knows what happened.

In a Windows Forms application, it is disturbingly easy to fall into the trap of writing code that results in a deadlock. I find that this generally happens with developers who, to borrow one of my mom’s expressions, “know just enough to be dangerous” about multithreading. These are the kinds of developers who know about the lock statement, about threads and mutexes and wait handles, etc., but who aren’t exactly sure how to wield these tools in a totally safe way.

In all honesty, I have to count myself among these developers, since I can’t say I am so good at multithreading that I never make mistakes in this realm. (To be brutally honest, I don’t think anyone is that good… but I’ve talked about that before.)

So, how can deadlock happen so easily in a Windows Forms app? Simple: through a single careless use of the Control.Invoke method.

To understand this, it’s necessary to first understand how Windows Forms works. It’s pretty straightforward. At its core, Windows Forms consists of a message pump:

[Image: the Windows message pump]

A Windows application is little more than a queue of messages being sent to a window. Every action performed by the user (moving the mouse, hitting a key on the keyboard, clicking on a button, etc.) pushes a “message” into a queue, while the application’s message loop continuously pops these “messages” from the queue and executes whatever code a programmer has associated with each message.

(In Windows Forms, these messages are exposed in the form of .NET events: MouseMove, Click, etc.)

Now, a crucial point that many rookie developers miss is that this queue is processed by a single thread, typically referred to as the “UI thread” (you may also hear “foreground thread,” though in .NET that term formally describes any thread whose IsBackground property is false, which is a different thing). It needs to be this way because, like probably 95% of software components in existence, the controls in the System.Windows.Forms namespace were not designed to be thread-safe. And so calls to methods that affect UI controls (adding to a ListBox, setting the text of a TextBox, etc.) must be made from the UI thread; otherwise, it will be pandemonium.

So whenever one wants to update the UI from a background thread (by the way, in my personal opinion, you should just never do this at all—this almost always indicates overly tight coupling or an otherwise problematic design—but that’s a subject for another post altogether), the most common way to do so is by calling Control.Invoke.

This is what that does:

  1. Pushes a new message onto the message queue
  2. Waits for that message to be processed

Uh oh… there is a scary word there: “Waits.” Any scenario where your program is waiting is one where deadlock may creep in, if you aren’t careful.

And so here at last is that concrete example I promised (you can grab this in full project form from a new repository I’ve created on GitHub for the sole purpose of hosting code I share on this blog):

Imports System.Threading

Public Class DeadlockExampleForm

    Private _unlocked As New ManualResetEvent(False)

    Private Sub BigButton_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles BigButton.Click
        Dim t As New Thread(AddressOf Deadlock)
        t.Start()

        Me.Text = "Entering Deadlock"

        ' This line will wait forever.
        _unlocked.WaitOne()
    End Sub

    Private Sub Deadlock()
        ' Invoke is a blocking call. This thread will not continue past this
        ' line until the UI thread processes the message, which cannot happen
        ' until the Click event handler above returns. The handler, in turn,
        ' will not return until _unlocked is signalled below.
        Invoke(New Action(AddressOf NotifyComplete))

        ' This line will never be reached.
        _unlocked.Set()
    End Sub

    Private Sub NotifyComplete()
        Me.Text = "Escaped Deadlock"
    End Sub

End Class

See how easy that is? That’s seriously all that needs to happen for deadlock to strike a Windows Forms application: you call Invoke from a background thread, and from the UI thread you wait on something that happens after that Invoke call on the background thread. Now the background thread is waiting (because the message pushed by the Invoke call is not yet processed), and the UI thread is waiting (because the background thread hasn’t done whatever the UI thread is waiting for). Disaster.

Full disclosure: yes, I am familiar with this problem because it has bitten me in the past. Don’t let it bite you!
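For contrast, here is a rough sketch of one way the same form can avoid the trap, assuming all you need is for the UI update to happen eventually: Control.BeginInvoke posts the message and returns immediately instead of waiting. The form, button, and method names below are hypothetical stand-ins for the example above, written in C# and built in code rather than with the designer.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical counterpart to DeadlockExampleForm.
public class DeadlockFreeForm : Form
{
    private readonly ManualResetEvent _unlocked = new ManualResetEvent(false);
    private readonly Button _bigButton = new Button { Text = "Go", Dock = DockStyle.Fill };

    public DeadlockFreeForm()
    {
        _bigButton.Click += BigButton_Click;
        Controls.Add(_bigButton);
    }

    private void BigButton_Click(object sender, EventArgs e)
    {
        new Thread(Worker).Start();

        Text = "Waiting on worker";

        // This still blocks the UI thread, but only until Worker calls Set(),
        // which it can now do because it no longer waits on the UI thread.
        _unlocked.WaitOne();
    }

    private void Worker()
    {
        // BeginInvoke queues the call for the UI thread and returns right away.
        BeginInvoke(new Action(NotifyComplete));

        // This line is now reached, releasing the Click handler.
        _unlocked.Set();
    }

    private void NotifyComplete()
    {
        // Runs on the UI thread once the Click handler has returned.
        Text = "Escaped Deadlock";
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new DeadlockFreeForm());
    }
}
```

This only removes the circular wait. Blocking the UI thread inside an event handler is still something I would avoid in real code, since the window is frozen for as long as the wait lasts.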

Everything agile

I’ve mentioned probably about a hundred times that I have recently joined ThoughtWorks. What I haven’t done is discuss what ThoughtWorks, as a company, actually does.

So, the company is one of the driving forces for agile software development in the industry. We* provide consulting and training to organizations that want to adopt agile practices, and we also develop (“deliver”) software to organizations using an agile approach. (There is also ThoughtWorks Studios, a division of the company which develops products for agile development.)

What the heck is agile development?

I’m glad you asked!

The “traditional” way of developing software involves a lot of planning: establishing clearly defined requirements, drafting feature and technical specifications, designing test cases, prioritizing features, etc. In this way, a traditional software development methodology such as RUP (the Rational Unified Process) is highly focused on predicting the challenges and needs of a software project.

[Image: a man thinking, “What will happen?”]

The “agile” way of developing software is in many ways a response to the faults of traditional processes that leaders within the software industry have observed over the past several decades. Agile development represents a fundamentally different approach to handling change within a software project: whereas a traditional approach attempts to prevent costly changes by anticipating requirements, an agile approach strives to minimize the negative impacts of changes by adapting quickly.

The fact that agile software development typically involves less up-front planning than traditional development has resulted in a lot of misinformation out there about what it means to be agile. Some detractors think of the term “agile” as meaning “unstructured”, “disorganized”, or even “risky”:

[Image: men getting right to work]

A better way of understanding what it means to be agile is this: agile is all about adapting to change. As I said, traditional practices tend to try to plan an entire software project from beginning to end. In this scenario, change is very expensive because, with so much effort invested in planning, every little change requires additional effort towards adjusting schedules, budgets, requirements, etc. The agile philosophy is that change is inevitable, and so rather than fight against it, it makes more sense to actually expect it (and adapt to it).

So you might think of “traditional” as representing a more predictive approach to software, and “agile” as representing a more adaptive approach.

[Image: the traditional-agile continuum]

Now here’s the crucial part about what I just said: when you expect something to happen, it doesn’t make much sense to plan for it not to happen.

That might sound a little too… obvious (or even tautological). Let me put it differently, using an analogy.

Let’s say I go to the horse races. I’ve done a lot of research, and according to what I’ve learned, I strongly suspect that Seabiscuit is going to win. This actually isn’t much of a leap of faith on my part; Seabiscuit nearly always wins. So it just makes sense for me to assume, or anyway, to guess, that Seabiscuit will win again. And yet, when it comes time to place my money on a horse, suppose I put it on War Admiral instead.

This doesn’t make sense, right? I expect one thing to happen, and yet I act as if something else is going to happen.

[Image: a guy with an umbrella who thinks it will be sunny]

It doesn’t make sense, and yet this is what traditional development is. These days nobody honestly expects that changes won’t happen. They will happen, and they are generally anticipated with fear, because developers know just how much additional work they can cause. Agile development is about saying, “Hey, wait a minute, why are we building up these huge specifications and making these great big plans when we know it’s going to hurt like crazy when (not if, when) they change?”

To put it another way: agile development is about not spending $1,000,000 on a house where you know (or are relatively sure) an earthquake is going to hit and obliterate it sometime in the next few months.

I was just thinking about all this today during a team meeting with some of my classmates from school. We were working on an assignment consisting of roughly four parts, and the conversation kept coming back to the “big picture”: how we were going to structure the assignment overall. There would be a lot of back-and-forth on areas that were subjective, unclear, or complex. There was much speculation and discussion of the unknowable (e.g., what would have happened if the plan had been X).

That is, there was a lot of mental gear-turning that was premature because the ideas being explored were not yet fully in focus.

[Image: gears turning toward an unknown end]

This is exactly what the agile approach is meant to avoid: wasting effort. After all, this is why change is so expensive in the first place: it undoes much of the work that has been done. If that work hadn’t been done in the first place, change wouldn’t seem so scary.

And so at this meeting, as we were going on and on in our conversation about this or that future aspect of our project, we eventually arrived at a simple realization: Why don’t we take an agile approach to this? Rather than plan the entire thing from start to finish, we can just get started on the parts we know about now, and as we move forward a clearer picture of those parts we don’t know as much about should start to form, making it easier for us to work out decisions in those areas when we come to them.

After this realization, I think the bigger realization came to me (well, came to me again—I have actually thought about this before, but it seems every now and then I’m reminded of it): an “agile methodology” is not just a software development methodology. It is really just a way of approaching any problem: by expecting change, and planning to adapt to it, rather than planning for one fixed set of circumstances and praying those circumstances don’t change.

*It feels a bit weird to say “we” to refer to a company I’ve just joined, but… gotta start some time, right?

Help! I’ve been straw-manned!

I think we’ve all been straw-manned at one point or another.

[Image: a straw man]

For those unfamiliar with the term, a straw man is basically a false representation of an opposing opinion, which one attacks in order to present the illusion of having defeated said opponent in an argument. I remember one of my philosophy professors in college explained the concept using the example of a political debate where one candidate is unable to attend and so a straw man is brought in as a substitute. This allows the candidate in attendance to make unfair or flat-out inaccurate characterizations of his opponent’s platform: “My opponent likes to pretend that…”, “My opponent would have you believe that…”, etc.

This is how the term straw man is generally understood and presented: as a mean-spirited, aggressive tactic for undermining another person’s viewpoint. But I think it’s actually not so different from the much more common, much less malevolent scenario where two people simply misunderstand each other.

Case in point: many moons ago, a user on Stack Overflow posed a seemingly harmless question asking users to list some confusingly-named methods in the .NET BCL—cases where, just by looking at a method, a developer might think it does one thing, when in fact it does another.

In other words, mislabeled stuff.

[Image: a confusing Microsoft/Apple logo]

Now that's confusing!

There were plenty of great answers, which prompted a lot of good discussion (e.g., what should you call such and such a method, etc.); however, there was also a pretty heated argument about one answer in particular: my answer.

I said that the DateTime.Add method (and all the DateTime methods whose names start with the word “Add”) has a misleading name. You know, because the DateTime type is immutable, these methods actually return new values, yada yada yada.

For example:

DateTime d = DateTime.Now;
d.AddHours(1.0);           // This does nothing meaningful.
d = d.AddHours(1.0);       // THIS does something meaningful.

I suggested that the word “Add” implies mutability, primarily based on the precedent of such other methods as ArrayList.Add, List<T>.Add, and so on.

What I would have done is give these methods names starting with the word “Plus”: DateTime.Plus, etc. I feel that this word doesn’t imply mutability, just as the phrase “x plus 5” does not imply any modification of the value stored at x.

To be clear: I realize this is 100% subjective. Someone else could see the word “Add” and not think it implies mutability, which undercuts the premise of my answer completely for that individual. I won’t dispute that.

But a funny thing happened with that answer: to my surprise, not only did some users disagree with me, they really disagreed with me. And it all started with a comment by a user who shall remain nameless (unless you click on the link above and look at the comments yourself, where you can clearly see his name):

I can’t say that I ever really thought twice about this. DateTime is a value type; it really doesn’t matter what its methods are called, none of them could possibly be mutators.

Never mind that this conflates the concepts of value types and immutability (don’t get me wrong; I believe the user in question has these concepts very straight in his head, but it is a misunderstanding shared by quite a few .NET developers); the real problem occurred a few comments later:

Consider if C# had date literals like VB and you could write #5/14/2010#.AddDays(1). What would you expect this to do? I maintain that if people get this wrong, it’s their fault, not the Framework’s fault for having a “confusing” method name. If I say “take your date of birth and add 1 month”, that doesn’t imply that you should (or could) physically change the date on which you were born.

It’s about here in the discussion that a straw man started to enter the picture. Honestly I felt a bit like our exchange was beginning to look like this:

[Image: a Red Delicious apple with a sticker on it that says "Gala"]

Gala

Me: Boy, that’s a misleading caption on that picture of an apple.
Him: Why? It’s obviously a Red Delicious.
Me: Yeah, but… the caption says Gala.
Him: So what? It’s deep red; the skin is shiny and smooth; it is in every way like a Red Delicious!
Me: If that’s really your argument, the caption could just as well say “Orange” and you would still think it’s not misleading.
Him: For it to say “Orange” would make no sense. What are you talking about?
Me: I know it wouldn’t; that’s my point: it’s a bad name.
Him: I’m sorry, but you’d have to be stupid to think that’s a Gala.
Me: I never said I thought it was a Gala. I said it was a misleading caption.

Now, I don’t believe this user had any particularly strong interest in arguing with me just for the sake of it (that would be kind of silly). Rather, this is what I think happened. When I said that the DateTime.Add method (and its kin) had a confusing or misleading name, it reminded this particular user of other developers he’s interacted with, who may not understand the difference between value types and reference types, or why value types should generally be immutable, and thus why you should not call DateTime.Add and expect the instance to be modified. And so as this user became increasingly agitated (or so it seemed to me), he was really arguing, not with me, but with those other developers of whom I reminded him simply by commenting on what I felt (and still feel) is a misleading name.

That is to say, the straw man in this case was perhaps not an imaginary figure, but in fact other people. And maybe the whole reason for the debate that ensued was not that the other user wanted to argue with me, but simply that the topic brought up associations in his mind, to which he had a strong negative reaction.

Or maybe I’m just wrong; I don’t know. But I still think those methods have misleading names.

The Promise of PostSharp

So, it’s my first day at ThoughtWorks post-orientation.

As I have not yet been assigned to a project, I’ve spent my day on “the beach”: an open area in the office where consultants are free (and encouraged) to explore new technologies, write in their blogs—basically, whatever they want.

I’ve decided to do both of the above. The technology I’m exploring specifically is PostSharp, which looks like a really cool way to concisely eliminate a lot of boilerplate code and potentially do some really cool things at compile time.

[Image: PostSharp logo]

I’ll let you all know how things go with that.

Garbage, garbage, everywhere

Today I want to respond to a comment that was left on a previous post with regard to the following LINQ query:

For Each prod As Product In Products.Where(Function(p) p.IsQuoting).OrderBy(Function(p) p.Expiration).ThenBy(Function(p) p.Strike)
    prod.SendQuotes()
Next

What’s wrong with the above code? In most cases, nothing, really. It gets the job done in a fairly obvious way, in a somewhat readable line of code.

That’s most cases. But most cases don’t take performance into consideration; and you may recall, in the post in question, I was specifically talking about performance.

In order to understand the performance impact of that code, it’s necessary to first understand garbage.

Back in the early days of computer programming, whenever a developer allocated some memory in code, he or she was responsible for also deallocating it.

This actually made for some pretty lean code. Imagine it like a society in which every citizen always deals with his or her own trash: rather than have collectors who come through on a regular basis, this society’s members all dispose of their own trash in one way or another. In an ideal world, this society would enjoy significant savings as a result of not needing to invest in the infrastructure to deal with trash on a large scale.

However, in the real world, such a society would almost certainly have a huge trash problem, since not every member of society is a responsible citizen. In software terms, not every program always deallocates its memory perfectly. This often results in what is referred to as a memory leak.

Developers using platforms like Java (JVM) or .NET (CLR) have this fanciful notion in their heads: “Memory leaks can’t happen on my platform, because my language is garbage collected!” This is not entirely true to begin with (in .NET, at least, memory leaks are very possible and potentially very costly); but more importantly, it’s just a bad attitude to have. When you believe that you can just do whatever the heck you want because your language is garbage collected, chances are you’re going to end up writing some pretty inefficient code.

For those of you who may not know what garbage collection is in software, it’s pretty much what it sounds like: a process whereby a system will “clean up” the memory allocated by a program after that memory is no longer in use, without requiring software developers to do this themselves. So a language with garbage collection (e.g., C#) is like a society in which individual citizens do not concern themselves with disposing of their own waste. In other words, it’s like our society, which does have garbage collectors who handle everyone’s garbage.

Now consider this: whether a society expects its individual citizens to deal with their own trash or expects garbage collectors to do it, there’s still garbage to be dealt with. And that is, no matter how you spin it, a cost.

What the heck does this have to do with that LINQ query? I’m glad you asked. Let’s look at that first line again:

For Each prod As Product In Products.Where(Function(p) p.IsQuoting).OrderBy(Function(p) p.Expiration).ThenBy(Function(p) p.Strike)

How much garbage does this code create? I will count the throwaway objects:

  1. An enumerator to iterate the results of the Where call.
  2. A delegate object to execute p.IsQuoting on every iteration.
  3. Another enumerator to provide access to the result of OrderBy.
  4. Another delegate object to call p.Expiration for each iteration.
  5. Another enumerator to iterate over the results of ThenBy. This enumerator will need to store its own intermediate collection of Product objects, since it has to sort them and iterate over them in sorted order based on p.Expiration followed by p.Strike.
  6. Another delegate object to call p.Strike for every iteration.

See that? That’s 6 objects, not including the intermediate collection of Product objects requiring sorting prior to iteration in the For Each loop.

To put this in perspective: the above code could be written without requiring any new object allocations, with the possible exception of the intermediate collection (depending on whether or not the sort from OrderBy/ThenBy could be in-place).

To me this is sort of like this: say you have a bowl of pasta for lunch every day. You could reuse the same fork and bowl from your kitchen drawer every day, or you could use a new throwaway paper bowl and plastic fork each time you eat this meal.

Which would you choose?


For the record, here’s how you could do the same work with no garbage:

' Initialization code somewhere--
' Let's say this ProductComparer class encompasses the OrderBy/ThenBy logic.

' I am not counting this as garbage, because the cost would be incurred ONCE,
' as opposed to every time the below code executes.
Dim comparer As New ProductComparer
Products.Sort(comparer)

' Now here's the clean, garbage-free code.
' (OK OK, there is still possibly an enumerator object...
' then again, there might not be; e.g., if Products is a List(Of Product),
' then its enumerator will actually be a value type which does not create
' garbage.)
For Each prod As Product In Products
    If prod.IsQuoting Then
        prod.SendQuotes()
    End If
Next
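The ProductComparer mentioned in the comments is left to the imagination above. Here is a minimal sketch of what it might look like, written in C# and assuming that Product exposes IsQuoting, Expiration, and Strike as a bool, a DateTime, and a decimal respectively (those types are my assumption for this example, as is the stand-in Product class itself).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Product type used in the post.
class Product
{
    public bool IsQuoting { get; set; }
    public DateTime Expiration { get; set; }
    public decimal Strike { get; set; }

    public void SendQuotes()
    {
        // Placeholder body; the real method would send quotes somewhere.
        Console.WriteLine("Sending quotes for strike " + Strike);
    }
}

// Compares by Expiration first and falls back to Strike on a tie,
// mirroring OrderBy(p.Expiration).ThenBy(p.Strike).
// Assumes neither argument is null.
class ProductComparer : IComparer<Product>
{
    public int Compare(Product x, Product y)
    {
        int byExpiration = x.Expiration.CompareTo(y.Expiration);
        return byExpiration != 0 ? byExpiration : x.Strike.CompareTo(y.Strike);
    }
}

class Program
{
    public static void Main(string[] args)
    {
        var products = new List<Product>
        {
            new Product { IsQuoting = true,  Expiration = new DateTime(2011, 3, 18), Strike = 50m },
            new Product { IsQuoting = false, Expiration = new DateTime(2011, 2, 18), Strike = 45m },
            new Product { IsQuoting = true,  Expiration = new DateTime(2011, 2, 18), Strike = 40m },
        };

        // One-time setup: a single comparer object and an in-place sort.
        var comparer = new ProductComparer();
        products.Sort(comparer);

        // Iterating a List<Product> directly uses its struct enumerator.
        foreach (Product prod in products)
        {
            if (prod.IsQuoting)
            {
                prod.SendQuotes();
            }
        }
    }
}
```

One difference worth knowing about: List<T>.Sort is not a stable sort, whereas OrderBy/ThenBy is, so products with identical Expiration and Strike values may come out in a different relative order than they would from the LINQ query.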