
A blog about teaching Programming to non-CompSci students by Tim Love (Cambridge University Engineering Department). I do not speak on behalf of the university, the department, or even the IT group I belong to.

Wednesday 13 July 2011

C++ function problems

This page shows some of the students' imaginative attempts at solving one part of the 1st year Michaelmas computing course. The task is one that's been on the course for years, but this year we spoonfed them less. The students would have the same difficulties in many other languages, though C++ provides rather more challenges than Matlab (for example) would.

How we introduce functions

We explain functions gradually, using a mixture of explanation and practical work. The explanations include animations, diagrams showing functions as boxes with inputs and outputs, and sections entitled

We tell them that they have to do 3 things when writing functions: write a prototype, write the function, and call the function. The practical work begins with a gentle learning curve, as follows:

  • We show them is_even (a function that determines whether an integer is even) and give them a program that uses the function to see which integers in the range 1 to 10 are even. Then we ask them to produce a similar program to identify multiples of 3. This requires them to modify the code trivially, but we insist that they change the name of the routine (so they have to change the call and prototype too).
  • We get them to write a times-table program using a timesby7 function that we provide (the program will be similar to the above program)
  • We get them to write a program with just a main function, then get them to restructure it without changing its output so that it has main and another function.
  • Then we get them to write bigger programs with functions written from scratch
  • Then we get them to use library functions.
Common problems include
  • Thinking that the prototype
         int fun(int number);
    
    means that they have to call fun with a parameter called "number". We could get them to write the prototype as
         int fun(int);
    
    but that's not considered good style
  • Thinking that the prototype
         
         int fun(int number);
    
    calls the routine. We could ask them to prove to us that it does - by adding a cout call to the function.
  • Thinking that if a function prints out the answer, that's the same as returning it.
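
The three misconceptions above can be seen side by side in a minimal sketch. The names triple, print_triple and demo are hypothetical, invented for this illustration; they are not from the course handout.

```cpp
#include <cassert>
#include <iostream>

// Prototypes. Nothing runs here, and the parameter name "number"
// is only a label for the reader: callers may pass any int expression.
int triple(int number);
void print_triple(int number);

// Definition: computes the answer and RETURNS it to the caller.
int triple(int number)
{
    return 3 * number;
}

// Definition: PRINTS the answer. The caller gets nothing back to store.
void print_triple(int number)
{
    std::cout << 3 * number << std::endl;
}

// Calls: the argument is called x, not number.
void demo()
{
    int x = 4;
    int y = triple(x);   // y now holds the returned value
    assert(y == 12);
    print_triple(x);     // shows 12 on screen, but there is nothing to assign
}
```

Writing `int y = print_triple(x);` would not compile, which is one way of showing a student that printing and returning are different things.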

Playing with dice

About halfway through their Michaelmas term work, when they've already used functions, we give the 1st years the following code:

int RollDie()
{
   int randomNumber, die;

   randomNumber = random();
   die = 1 + (randomNumber % 6);
   return die;
}
We say

Each time the function random() is called, it will return a random positive integer. Work out what the ... function does and how it works.


One student didn't know how to search for "%" on a web-page, hence couldn't find where we'd described what the "%" operator did. I worry about students' webskills sometimes.
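
For reference, "%" gives the remainder after integer division, so randomNumber % 6 is always one of 0 to 5 when randomNumber is not negative. The sketch below isolates that step; DieFromRandom is a hypothetical helper written for this illustration, not part of the handout.

```cpp
#include <cassert>

// Maps any non-negative integer onto a die face in the range 1..6.
int DieFromRandom(int randomNumber)
{
    assert(randomNumber >= 0);          // the mapping below assumes a non-negative input
    int remainder = randomNumber % 6;   // always one of 0, 1, 2, 3, 4, 5
    return 1 + remainder;               // shift the range to 1..6
}
```

For example, an input of 13 leaves a remainder of 1, so the function returns 2.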

Then later in the handout we say

You've already seen the RollDie function that simulates the rolling of a single die. Copy it into your new file. Now write a function called Roll2Dice to simulate the rolling of 2 dice (call RollDie twice and return the sum of the answers). Before going any further, test it. If it doesn't work, neither will your full program! Here's a main function you could use

  int main() {
     srandom(time(0));
     cout << "Roll2Dice returns "  << Roll2Dice()  << endl;
  }
You'll need to add prototypes for RollDie and Roll2Dice too.


This task contrasts with last year's work where, after we gave them the code for RollDie(), we gave them the code for a routine with the prototype int RollManyDice(int M) (though we didn't provide the prototype or the final return ... line of the function). We made the change because we'd rather students programmed something simple themselves than merely typed in more complex code that they don't understand.

Here's a list of solutions that students have tried

  1. Several start by writing this prototype
       bool Roll2Dice()
    
    because the first function introduced to them returns a bool.
  2. A few start by writing this prototype
       int Roll2Dice(int RollDie(),int RollDie() )
    
    because RollDie is "needed" by Roll2Dice, I presume.
  3. Some do this
    int Roll2Dice()
    {
       int randomNumber, die;
    
       randomNumber = random();
       die = 1 + (randomNumber % 12);
       return die;
    }
    
    (returning an integer from 1 to 12) or this
    int Roll2Dice()
    {
       int randomNumber, die;
    
       randomNumber = random();
       die = 2 + (randomNumber % 11);
       return die;
    }
    
    (returning an integer in the range 2 to 12, all the outcomes equally likely) or this
    int Roll2Dice()
    {
       int randomNumber, die;
    
       randomNumber = random();
       die = 1 + (randomNumber % 6);
       return 2*die;
    }
    
    (i.e. rolling a die and doubling the outcome). I think these examples illustrate that common sense suffers when students are struggling with C++.
  4. Quite a few people start by writing a new function to simulate the 2nd die.
    int RollDie2()
    {
       int randomNumber2, die2;
    
       randomNumber2 = random();
       die2 = 1 + (randomNumber2 % 6);
       return die2;
    }
    

    Some then try doing

      int total=die+die2;
    
    later in their program rather than calling the functions, not realising that die (in RollDie) and die2 (in RollDie2) are unavailable. At this point some students create global variables die and die2 while still creating the local instances of die and die2 - which silences the compiler but isn't the correct solution.

    Others write a Roll2Dice() function that calls RollDie() and RollDie2() to get the correct answer. Perhaps the existence of 2 dice makes them think they need 2 functions - I suspect they wouldn't write 2 functions to calculate the square roots of 2 numbers, or write 10 functions to roll 10 dice.

  5. The next is one of the most common solutions: it doesn't call the provided RollDie function at all.
    int Roll2Dice()
    {
       int randomNumber, die, randomNumber2, die2;
    
       randomNumber = random();
       die = 1 + (randomNumber % 6);
       randomNumber2 = random();
       die2 = 1 + (randomNumber2 % 6);
    
       return die+die2;
    }
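
To see why the attempts in item 3 are not equivalent to rolling 2 dice, it may help to count outcomes exactly rather than rely on random numbers. The sketch below is an illustration only; the function names TwoDiceWays and DoubledDieWays are assumptions made for it.

```cpp
#include <array>
#include <cassert>

// ways[t] = how many of the 36 equally likely (die1, die2) pairs add up to t.
std::array<int, 13> TwoDiceWays()
{
    std::array<int, 13> ways{};   // all zero; indices 0 and 1 are never used
    for (int die1 = 1; die1 <= 6; ++die1)
        for (int die2 = 1; die2 <= 6; ++die2)
            ways[die1 + die2] += 1;
    assert(ways[7] == 6);         // six of the 36 pairs sum to 7
    return ways;
}

// ways[t] = how many of the 6 faces give t when a single roll is doubled.
std::array<int, 13> DoubledDieWays()
{
    std::array<int, 13> ways{};
    for (int die = 1; die <= 6; ++die)
        ways[2 * die] += 1;       // only the even totals 2, 4, ..., 12 can occur
    return ways;
}
```

With two real dice a total of 7 is six times as likely as a total of 2, whereas 2 + (randomNumber % 11) treats all eleven totals alike and doubling a single die can never produce an odd total.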
    

Conclusions

I was hoping for
   int Roll2Dice() {
      int die1=RollDie();
      int die2=RollDie();
      int sum=die1+die2;
      return sum;
   }
or even just
   int Roll2Dice() {
      return RollDie() + RollDie();
   }
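
Put together with the prototypes, the pieces might look like the sketch below. Two things in it are assumptions rather than course code: std::rand() is swapped in for random() so that the sketch does not depend on a POSIX system, and the body of RollManyDice is a guess at what a function with last year's prototype could look like, included only to show that one RollDie serves any number of dice.

```cpp
#include <cassert>
#include <cstdlib>   // std::rand

// Prototypes: one for each function defined below.
int RollDie();
int Roll2Dice();
int RollManyDice(int M);   // prototype from last year's handout; body is hypothetical

int RollDie()
{
    // std::rand() is a portable stand-in for random() (an assumption of this sketch).
    return 1 + (std::rand() % 6);
}

int Roll2Dice()
{
    return RollDie() + RollDie();   // the same function, called twice
}

int RollManyDice(int M)
{
    assert(M >= 0);
    int sum = 0;
    for (int i = 0; i < M; ++i)
        sum = sum + RollDie();      // still only one RollDie function
    return sum;
}
```

With this stand-in, the handout's main function would seed the generator with std::srand(std::time(0)) (from <ctime>) instead of srandom(time(0)).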

It's easy enough in a handout to explain how to write correct code, but this year we didn't want to tell them exactly what to type. Just about everything that we didn't dictate to them produced errors that revealed a lack of understanding. I think it would be counter-productive to anticipate and correct these misunderstandings by putting a list of what not to do in the handout - it would confuse them. Besides, it's useful to have these conceptual errors exposed as early as possible as long as demonstrator help is available.

Some of the solutions above are correct and the students often understand what they've written, so there's a case for letting them get on with it, but they're going to have bigger problems later if these conceptual hurdles aren't tackled now. (I once looked at a IIB project student's final program. It barely used functions. By factorising repeated code I reduced the line-count to 30% of the original. Worrying).

Some students are clearly just guessing as they go along, looking for any lines of code that look as if they should be copied. It would help if they revised earlier work, or traced their finger along the locus of control, explaining it line-by-line. Others start with a reasonable idea of what to do but make small mistakes that lead to bigger ones as they try to silence the compiler at all costs. It would help if they could identify run-time errors as soon as possible, but iterative development is something they only slowly learn, and besides, not all of them know what results to expect.

Understanding functions remains a problem. We introduce functions by analogy with mathematical functions, but in C++ they can see inside the black-box that is the function, and once they do, they find it hard to treat the function like a black-box ever again (it becomes a physical thing occupying space, rather than a concept). As an educational aid it helps to have an editor that collapses functions.

Frequencies

We get the students to run routines like Roll2Dice() and record the outcomes. They find

   int outcome=Roll2Dice();
   frequency[outcome]=frequency[outcome]+1;

hard to understand, which isn't so surprising given that after 6 hours of practicals

  • a few students still don't know how to add 1 to a simple variable.
  • more than a few students have "no idea" how to write a line that "creates an array called frequency big enough to store 6 integers". I left one such student to read the documentation for a few minutes, but when I returned to him he was none the wiser. The Arrays section of the doc might be sub-optimal, but it can't be that bad - it's much the same as last year's.
Even those who do understand arrays have trouble with the code quoted above, though they're happy with
   int outcome=Roll2Dice();
   if(outcome==1)
      frequency[1]=frequency[1]+1;
   if(outcome==2)
      frequency[2]=frequency[2]+1;
   ...
I've tried to spell out a 2-page explanation of the shorter version as a Frequently Asked Question but some people don't understand that either. When the penny drops they sometimes remark "that's clever". Then they try to be too clever and do
    frequency[Roll2Dice()]=frequency[Roll2Dice()]+1;

wondering why it fails (they're calling Roll2Dice twice, so the RHS and LHS may refer to different array elements). Having fixed it they put the line in a loop. A few students do this

int tries=0;
while(tries<100) {
   int outcome=Roll2Dice();
   frequency[outcome]=frequency[outcome]+1;
   frequency[outcome]=0;
   tries=tries+1;
}

Why is frequency[outcome]=0; there? Well, one student said that it was in an earlier loop so they thought they'd better put it in this loop too.
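
For comparison, here is a minimal sketch of a loop that avoids both problems: the array is zeroed once before the loop, and Roll2Dice is called once per iteration. The function name CountRolls and the array size of 13 (so that totals 2 to 12 can be used directly as indices) are assumptions of this sketch, as is the use of std::rand() in place of random().

```cpp
#include <array>
#include <cassert>
#include <cstdlib>   // std::rand

// std::rand() is a portable stand-in for random() (an assumption of this sketch).
int RollDie()   { return 1 + (std::rand() % 6); }
int Roll2Dice() { return RollDie() + RollDie(); }

// Rolls 2 dice "tries" times and returns how often each total occurred.
std::array<int, 13> CountRolls(int tries)
{
    std::array<int, 13> frequency{};     // zeroed once, before the loop starts
    for (int i = 0; i < tries; ++i) {
        int outcome = Roll2Dice();       // one call, stored, then used twice
        assert(outcome >= 2 && outcome <= 12);
        frequency[outcome] = frequency[outcome] + 1;
    }
    return frequency;
}
```

Whatever the individual counts turn out to be, they should add up to the number of tries, which gives students a result they can check without knowing what the distribution ought to look like.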

In short, there are still many indications that a non-trivial minority of students are just fumbling blindly through. If anything, the changes to the course this year make it easier for demonstrators to identify the students with severe problems - it's harder for students to bluff their way through.

According to "Validating an instructor rating scale for the difficulty of CS1 test items in C++" (Lulis and Freedman, JCSC 27, 2 (December 2011)) "faculty members disagree amongst themselves as to the difficulty level of questions involving functions", much more so than for questions involving other topics.