Eliza 1 - Stepwise refinement - aka "How to eat an elephant"

by Ravi Bhavnani


In this version of Eliza, we introduce an important concept of problem solving known as "stepwise refinement".

Stepwise refinement is the process of breaking down a large problem into many smaller problems. This process is repeated until we end up with problems whose solutions are arrived at easily, and can be expressed in a few lines of code. In other words, we refine the solution to our original problem over a number of steps.

Stepwise refinement was first suggested by Dr Niklaus Wirth (co-inventor of the Pascal programming language), in 1971. Dr Wirth is a legendary figure in the field of Computer Science and is the author of the seminal book Data Structures + Algorithms = Programs . I had the privilege of meeting Dr Wirth at his office at the ETH University in Zurich, Switzerland in the summer of 1986.

With Dr Niklaus Wirth, @ ETH

The large problem that we'll attempt to refine (or decompose) into smaller problems is this C# method that is at the heart of Eliza:

    /// <summary> 
    /// Gets the response to a user's input. 
    /// </summary> 
    /// <param name="input">The user's input.</param>
    /// <param name="conversationHasEnded">Flag: the conversation has ended.</param>
    /// <returns>Eliza's response.</returns>
    public string GetResponse
        (string input,
         out bool conversationHasEnded)
    {
        ...
    }

First attempt at breaking down the large problem

At its core, Eliza works by:

  • Checking if the user wants to end the conversation.
    If so, it responds with an appropriate "goodbye" response.

  • Otherwise, Eliza checks if what you say matches one or more patterns of text that it knows.
    If it can't find a matching pattern, it responds with a generic response like "Please go on." or "Interesting...".

  • Otherwise, Eliza generates the appropriate response for the pattern.

So we can rewrite the logic of GetResponse() (in "pseudo-code") as:

    public string GetResponse
        (string input,
         out bool conversationHasEnded)
    {
        // Return "good bye" if the user ended the conversation
        conversationHasEnded = does_input_imply_end_of_conversation(input);
        if (conversationHasEnded) {
            return get_good_bye_response();
        }

        // Find matching pattern - if none, return a generic response
        pattern = find_matching_pattern_for_input (input)
        if (pattern == null) {
            return get_generic_response();
        }

        // Return an appropriate response for the pattern
        return generate_response_for_pattern(pattern, input);
    }

Notice how we decomposed GetResponse() into 5 smaller problems:

  1. does_input_imply_end_of_conversation(input)
  2. get_good_bye_response()
  3. find_matching_pattern_for_input (input)
  4. get_generic_response()
  5. generate_response_for_pattern(pattern, input)

Tackling the "low hanging fruit"

In software development, the phrase "low hanging fruit" refers to problems that are easily solved, and have no (or very few) dependencies on other pieces of work. In other words, it's quite easy to pick low hanging fruit off a tree, but getting at the coconuts at the top of a tree requires considerable more effort.

Of the 5 problems we've identified, note that #1, #2 and #4 are completely independent of #3 and #5, and are pretty easy to solve. They are our "low hanging fruit". So let's quickly write these methods:

    bool DoesInputImplyEndOfConversation
        (string input);

    string GetGoodByeResponse();

    string GetGenericResponse();

STOP!  DON'T READ ANY FURTHER.
GO WRITE THESE 3 METHODS, THEN COME BACK TO THIS POSITION IN THIS PAGE.

Access modifers and static members

Now that you've written these methods, let's ask ourselves 2 questions about each method:

  1. What should its access modifier be?
  2. Should the method be static or not?

If you don't know what (1) means, go back to Step 0 of this journey and read the section on "Information hiding".

And here's the difference between static and non-static members (i.e. methods, properties and variables):

  • A static member of a class is defined by the compiler only once for the entire class. ALL instances of the class SHARE the same (one and only one) static member.

  • A non-static member is defined for EACH INSTANCE of the class and isn't shared across all instances of the class. Each instance member is completely independent of another instance's member.

Why do we care about static vs. not static? Because defining a single shared (static) member saves memory, especially if we end up constructing thousands of instances of a class.

A simple rule to help decide whether or not to make something static is:

"If the item CAN be shared across all instances of the class, it should be static."

NOW THAT YOU KNOW HOW TO SELECT AN ACCESS MODIFIER AND DECIDE IF A MEMBER SHOULD OR SHOULDN'T BE STATIC, MODIFY THE 3 METHODS YOU WROTE (IF NECESSARY).

Tackling the harder problems by stubbing the solutions

Now that we've tackled the "low hanging fruit", it's time to turn our attention to the remaining harder problems, i.e.:

  • (3) find_matching_pattern_for_input (input)
  • (5) generate_response_for_pattern(pattern, input)

Both these problems are difficult, and their solutions are complex. So instead of solving them, let's "stub" (fake) the solutions for now. A "stub" is a "fake" or "mock" solution that's a temporary stand-in for the real solution.

If you look at problems #3 and #5, you'll notice they have something in common. #3 returns a "pattern" (whatever that is) to the caller, and #5 consumes this pattern. But what is a pattern? We don't know the answer to that as yet.

For now, let's just say a pattern is some kind of object. A "stub" for this object will simply be an empty object with no properties and no behavior (i.e. no methods).

It looks like this:

    // A pattern.
    class Pattern
    {
        // Stubbed for now.  We'll figure out what it contains and how it
        // behaves later.
    }

Now that we've defined a Pattern (OK, we cheated by just stubbing it), let's stub the solution to problem #3 and #5 by writing the (stubbed) methods.

    Pattern FindMatchingPatternForInput
        (string input)
    {
        // Stubbed for now.  We'll figure out what it does later.
        // For now just return a Pattern.
        return new Pattern();
    }

    string GenerateResponseForPattern
        (Pattern pattern,
         string input)
    {
        // Stubbed for now.
        return string.Format("You said \"{0}\".", input);
    }

We're done!

Amazing! We just finished writing Eliza!

OK, so we cheated by stubbing the hardest problems (hee hee hee). But we now have a MUCH better idea of how Eliza works. All we have to do is break up the remaining unsolved hard problems into many smaller ones, and attack each of them (by possibly temporarily stubbing some of them as we work our way through them).

As we proceed with each successive step of stepwise refinement, we'll come closer to the final, fully working version of Eliza. Along the way, we'll discover:

A complex piece of software is nothing but a very large number of very simple moving parts.
 

Most of the content at this site is copyright © Ravi Bhavnani.
Questions or comments?  Send mail to ravib@ravib.com