Teaching Programming at University: 2011

Wednesday 13 July 2011

C++ function problems

This page shows some of the students' imaginative attempts at solving one part of the 1st year Michaelmas computing course. The task is one that's been on the course for years, but this year we spoonfed them less. The students would have the same difficulties in many other languages, though C++ provides rather more challenges than Matlab (for example) would.

How we introduce functions

We explain functions gradually, using a mixture of explanation and practical work. The explanations include animations, diagrams showing functions as boxes with inputs and outputs, and sections entitled

We tell them that they have to do 3 things when writing functions: write a prototype, write the function, and call the function. The practical work begins with a gentle learning curve as follows

We show them is_even (a function that determines whether an integer is even) and give them a program that uses the function to see which integers in the range 1 to 10 are even. Then we ask them to produce a similar program to identify multiples of 3. This requires them to modify the code trivially, but we insist that they change the name of the routine (so they have to change the call and prototype too).
We get them to write a times-table program using a timesby7 function that we provide (the program will be similar to the above program)
We get them to write a program with just a main function, then get them to restructure it without changing its output so that it has main and another function.
Then we get them to write bigger programs with functions written from scratch
Then we get them to use library functions.

Common problems include

Thinking that the prototype
```
     int fun(int number);
```
means that they have to call fun with a parameter called "number". We could get them to write the prototype as
```
     int fun(int);
```
but that's not considered good style.
Thinking that the prototype
```
     int fun(int number);
```
calls the routine. We could ask them to prove to us that it does - by adding a cout call to the function.
Thinking that if a function prints out the answer, that's the same as returning it.

Playing with dice

About halfway through their Michaelmas term work, when they've already used functions, we give the 1st years the following code

int RollDie()
{
   int randomNumber, die;

   randomNumber = random();
   die = 1 + (randomNumber % 6);
   return die;
}

We say

Each time the function random() is called, it will return a random positive integer. Work out what the ... function does and how it works.

One student didn't know how to search for "%" on a web-page, hence couldn't find where we'd described what the "%" operator did. I worry about students' webskills sometimes.

Then later in the handout we say

You've already seen the RollDie function that simulates the rolling of a single die. Copy it into your new file. Now write a function called Roll2Dice to simulate the rolling of 2 dice (call RollDie twice and return the sum of the answers). Before going any further, test it. If it doesn't work, neither will your full program! Here's a main function you could use

  int main() {
     srandom(time(0));
     cout << "Roll2Dice returns "  << Roll2Dice()  << endl;
  }

You'll need to add prototypes for RollDie and Roll2Dice too.

This task contrasts with last year's work where after we gave them the code for RollDie() we gave them the code for a routine with the prototype int RollManyDice(int M) (though we didn't provide the prototype or the final return ... line of the function). We made the change because we'd rather students programmed something simple themselves than merely type in more complex code that they don't understand.

Here's a list of solutions that students have tried

Several start by writing this prototype
```
   bool Roll2Dice()
```
because the first function introduced to them returns a bool.
A few start by writing this prototype
```
   int Roll2Dice(int RollDie(),int RollDie() )
```
because RollDie is "needed" by Roll2Dice, I presume.

Some do this

int Roll2Dice()
{
   int randomNumber, die;

   randomNumber = random();
   die = 1 + (randomNumber % 12);
   return die;
}

(returning an integer from 1 to 12) or this

int Roll2Dice()
{
   int randomNumber, die;

   randomNumber = random();
   die = 2 + (randomNumber % 11);
   return die;
}

(returning an integer in the range 2 to 12, all the outcomes equally likely) or this

int Roll2Dice()
{
   int randomNumber, die;

   randomNumber = random();
   die = 1 + (randomNumber % 6);
   return 2*die;
}

(i.e. rolling a die and doubling the outcome). I think these examples illustrate that common sense suffers when students are struggling with C++.

Quite a few people start by writing a new function to simulate the 2nd die.
```
int RollDie2()
{
   int randomNumber2, die2;

   randomNumber2 = random();
   die2 = 1 + (randomNumber2 % 6);
   return die2;
}
```
Some then try doing
```
  int total=die+die2;
```
later in their program rather than calling the functions, not realising that die (in RollDie) and die2 (in RollDie2) are unavailable. At this point some students create global variables die and die2 while still creating the local instances of die and die2 - which silences the compiler but isn't the correct solution.

Others write a Roll2Dice() function that calls RollDie() and RollDie2() to get the correct answer. Perhaps the existence of 2 dice makes them think they need 2 functions - I suspect they wouldn't write 2 functions to calculate the square roots of 2 numbers, or write 10 functions to roll 10 dice.

The next is one of the most common solutions, not calling the provided RollDie function at all.

int Roll2Dice()
{
   int randomNumber, die, randomNumber2, die2;

   randomNumber = random();
   die = 1 + (randomNumber % 6);
   randomNumber2 = random();
   die2 = 1 + (randomNumber2 % 6);

   return die+die2;
}

Conclusions

I was hoping for

int Roll2Dice() {
   int die1=RollDie();
   int die2=RollDie();
   int sum=die1+die2;
   return sum;
}

or even just

int Roll2Dice() {
   return RollDie() + RollDie();
}

It's easy enough in a handout to explain how to write correct code, but this year we didn't want to tell them exactly what to type. Just about everything that we didn't dictate to them produced errors that revealed a lack of understanding. I think it would be counter-productive to anticipate and correct these misunderstandings by putting a list of what not to do in the handout - it would confuse them. Besides, it's useful to have these conceptual errors exposed as early as possible as long as demonstrator help is available.

Some of the solutions above are correct and the students often understand what they've written, so there's a case for letting them get on with it, but they're going to have bigger problems later if these conceptual hurdles aren't tackled now. (I once looked at a IIB project student's final program. It barely used functions. By factorising repeated code I reduced the line-count to 30% of the original. Worrying).

Some students are clearly just guessing as they go along, looking for any lines of code that look as if they should be copied. It would help if they revised earlier work, or traced their finger along the locus of control, explaining it line-by-line. Others start with a reasonable idea of what to do but make small mistakes that lead to bigger ones as they try to silence the compiler at all costs. It would help if they could identify run-time errors as soon as possible, but iterative development is something they only slowly learn, and besides, not all of them know what results to expect.

Understanding functions remains a problem. We introduce functions by analogy with mathematical functions, but in C++ they can see inside the black-box that is the function, and once they do, they find it hard to treat the function like a black-box ever again (it becomes a physical thing occupying space, rather than a concept). As an educational aid it helps to have an editor that collapses functions.

Frequencies

We get the students to run routines like Roll2Dice() and record the outcomes. They find

   int outcome=Roll2Dice();
   frequency[outcome]=frequency[outcome]+1;

hard to understand, which isn't so surprising given that after 6 hours of practicals

a few students still don't know how to add 1 to a simple variable.
more than a few students have "no idea" how to write a line that "creates an array called frequency big enough to store 6 integers". I left one such student to read the documentation for a few minutes, but when I returned to him he was none the wiser. The Arrays section of the doc might be sub-optimal, but it can't be that bad - it's much the same as last year's.

Even those who do understand arrays have trouble with the code quoted above, though they're happy with

   int outcome=Roll2Dice();
   if(outcome==1)
      frequency[1]=frequency[1]+1;
   if(outcome==2)
      frequency[2]=frequency[2]+1;
   ...

I've tried to spell out a 2-page explanation of the shorter version as a Frequently Asked Question but some people don't understand that either. When the penny drops they sometimes remark "that's clever". Then they try to be too clever and do

    frequency[Roll2Dice()]=frequency[Roll2Dice()]+1;

wondering why it fails (they're calling Roll2Dice twice, so the RHS and LHS may refer to different array elements). Having fixed it they put the line in a loop. A few students do this

int tries=0;
while(tries<100) {
   int outcome=Roll2Dice();
   frequency[outcome]=frequency[outcome]+1;
   frequency[outcome]=0;
   tries=tries+1;
}

Why is frequency[outcome]=0; there? Well, one student said that it was in an earlier loop so they thought they'd better put it in this loop too.

In short, there are still many indications that a non-trivial minority of students are just fumbling blindly through. If anything, the changes to the course this year make it easier for demonstrators to identify the students with severe problems - it's harder for students to bluff their way through.

According to "Validating an instructor rating scale for the difficulty of CS1 test items in C++" (Lulis and Freedman, JCSC 27, 2 (December 2011)) "faculty members disagree amongst themselves as to the difficulty level of questions involving functions", much more so than for questions involving other topics.

Thursday 2 June 2011

Online help systems

On 29 June 2011 I'm attending a UCISA symposium on "Advisory and IT Support" about "Producing a service desk good practice guide - Measuring the service desk". A parallel session that I'd also have liked to attend was from University of Lincoln - "A project, led by ICT, focused on changing the ICT and Estates service by embedding a new culture across both departments that delivers excellent, consistent service, underpinned by a robust framework of technology, processes, learning, development and support". As preparation I thought I'd summarise our experiences.

Changes in user skills and expectations regarding the WWW offer challenges and opportunities for information providers, but it's far from being merely a technological issue that can be solved by a new piece of software. As Lincoln has discovered, culture (amongst staff as well as amongst users) and processes matter, and need to be understood. MIT have produced a report on their attempt to update their system - see the Nercomp Hermes presentation.pdf. They say that a culture of "Knowledge Centered Support" involves developing "knowledge as part of problem-solving process - When you solve a user’s problem, document it". Inhibitions to this included

Wiki markup (or browser support for WYSIWYG editors)
Time pressure (for frontline support, primarily)
Different self-imposed “standards” for publication
Unease with public information, even if there’s nothing inherently confidential
"Ownership": if I write about it, I might have to support it

Their list of "Lessons Learned" included

"It's harder to get contributors than we thought"
"Did not set up a tracking mechanism up front. Can't tell who's looking at what."
"Get buy-in from decision makers to make "executive" decisions setting expectations for internal IS&T groups to contribute information"
"Need to be clear about what information goes where, e.g. website versus knowledge base"
"Ongoing maintenance is required to keep content fresh."
"There's a need for an advocacy role"

Their project had a definite end date, finite resources and a special one-time allocation of funds, which helped to force things forward.

Many of these findings chime with my observations - see the Searching, Culture, Distributed Authorship, and Solutions sections below. My department has had an online help system since 1994, several authors having produced pages. Our oldest page was last modified at "1995-01-06 12:41:20 GMT". A 1997 version of our front page is still online

Oct 1997 (traffic 33k pages/day)

Aug 2010 (traffic 201k pages/day)

Apart from the house styling, you'll see that little has changed on the surface - even in 2010 it was still described as the "hypermedia help system"! In 1996 I wrote a little review of the help system where I mentioned that

"We are encouraging (not very successfully) admin and teaching staff to maintain their own material."
"We have regular users of our system who still don't use e-mail let alone the help system, so personalised user education is still necessary."
"[people are] Over-using brute-force searches"

We had a help-search facility but I don't know what it was - Google hadn't really taken off by then. Maybe we already used swish, an earlier version of swish-e, a public-domain indexing facility.

Keyword searching wasn't the only option for users - we offered "task-based" and "subject-based" trees of links, and pages had at their foot a list of related pages so that users could browse around. In those days there were several sites (e.g. Yahoo) that maintained a hierarchy of links to pages so that people could browse as an alternative to word-searching when they wanted to find something out. Even in 1996 however, people preferred brute-force searches though their search terms were often more hopeful than precise.

Between 1996 and 2003 the amount of material grew, as did the variety of types (PHP, movies and databases appeared). Our dependence on the help system as a front-line service grew too - in our introductory letter to new undergraduates we wrote "The department has an extensive help system ... which has answers to many questions people ask about the Engineering Department computer system. Please look at this first and if you cannot find an answer there consult the Department's Computer Operators". The success of Google meant that more than ever, users word-searched for information rather than browsed through hierarchies. Google ranked pages in a way that satisfied most users. Customised site-specific searches could be set up using Google, but there were difficulties using Google to look for local information because some of it was domain and/or password protected.

The growth in local material hadn't matched growth globally. Many of the documents we wrote in the early years had been superseded by documents elsewhere. Because of the increased performance of the internet there was not even a speed advantage to having local documentation.

Searching

By 2003 it was clear that university establishments might have special requirements when it comes to searching. The Search facilities for UK HE web sites paper (written by a Cambridge webmaster) dates back to 2003 and lists some useful alternatives.

Troubles that people searching our site have are that

It can be hard to think up useful search terms for questions involving a lack of understanding rather than a simple lack of information
Many queries involve generic computing terms (e.g. "open", "windows", "word") which makes ranking more difficult.
Coverage will be patchy with some common questions not covered while other obscure topics might be covered in a depth that swamps the results of searches. If a user fails to find information in their first help-search (which is likely) they'll be wary of using it again. At least Google comes up with something even if it's not locally relevant.

A 2007 consultant's report about the University site noted that local searches still posed problems - "Most of the [users'] complaints about the site fell into three areas:" the first-mentioned being "inadequate search facility: It was generally felt that an external google search yielded more appropriate and better presented results than the search function on the existing University website". The report went on to say that "As a minimum Google search should be implemented across site content .... Adoption of a University-wide meta-tagging is a prerequisite and a major editorial undertaking that should be done as part of the initial content rework". However, this recommendation seems not to have been adopted by the University.

Culture

The department's help system has a rather elite target audience who are science-literate, but can be naive computer-wise. Amongst the opinions about the Help System are these -

That it in some sense belongs to the Computer staff (it does)
That it in some sense belongs to a small subset of Computer Officers (it doesn't. Any CO or operator can contribute)
That pages have to be written in HTML using the current house style (they should. Example pages are provided)
That authors will get mailed if a page is wrong (they will if links go bad, or if mistakes are noticed in a popular page)
That it's old-fashioned static HTML (most of it is static HTML)
That material is hard to find

Some of these beliefs inhibit page production

If a person thinks that they're not allowed to write pages, they might add them to their research group's web site or to their personal pages
An author might rather not write a page at all than have to maintain pages long-term.
If the help system aims to replace work done by people, those people will be out of a job (or at least will have to do less pleasant work)

Some of these beliefs inhibit users trying the help system

For the 50% of students who use Facebook at least once a day, the help system will look old
The un-Googly search looks unfriendly

Though the skills that web users employ to further their hobbies aren't always used in their academic work, the gap between the help system and other information systems has widened recently.

Distributed Authorship

From the start, pages in the help system were owned and looked after by many people, though a small number of people write most of the pages. Initially a few central pages were owned by webadmin and the rest were in folders owned by individuals, making for easy management and identification of ownership. Each page mentioned its author, so bug reports could easily be directed to the right person, and (except for the top level) folders didn't contain files with a variety of owners.

There are disadvantages to this (e.g. when people leave, their pages need to be moved) but when we tried having more central pages authored by a role rather than an individual, mail to that role-name was left unanswered. It can take over a year for an incorrect sentence to be removed from a page, even with reminders.

Multiple authorship introduces other problems too - the standard of the HTML varies widely, and also when an author produces a new page they need to tell other authors to link to it.

Solutions

In 2009 a student created a pilot system based on Wordpress blog software, hoping to leave behind some of the above-mentioned beliefs. It shares many design ideas with MIT, though we were unaware of MIT's plans at the time.

Comments can be added by anyone to pages
A WYSIWYG editor and form-based input means fewer errors and easier authoring
Pages can be drafted so that someone else can authorise them.
There's more automated page- and link-checking
It's a blog, and blogs aren't old fashioned - they're fun (User 2.0).
When a new page is created it appears immediately in other pages' "related links" lists
Authors can add a comment to other pages, mentioning their new page. Better still, each page has an auto-generated "Related Pages" section at the end that lists pages with related tags.
A Wiki-style option is possible, letting authors reversibly change other authors' pages
A WYSIWYG editor will guarantee more consistent (but not necessarily better) HTML
The system automatically records authorship
Line managers can list all the pages written by particular authors, along with modification dates.

"The Corporate Blogging Book", Debbie Weil (Piatkus, 2006) looks at issues relating to the introduction of blogs into an e-mail-literate workplace. It mentions inhibitions

If bosses don't blog, why should the employees?
Some users and management think that time will be wasted (it will, if the resulting pages aren't used and advertised by staff)
People who are confident enough writers to post e-mail have doubts about producing web pages (because of larger audience, and uncertainty about etiquette)

The book also mentions advantages, some of which haven't yet been mentioned

RSS feeds help reduce bulk mailing
Less distance between "us and them" - students and staff

For more details, see the student's Final Report

Meanwhile, in October 2010 we gave the old material a new front page

Oct 2010

Wednesday 27 April 2011

Teaching O-O

We teach using Matlab and C++ so we could teach programming in an Object-Oriented way. Post-2008, Matlab's O-O support has become neater and has more features (listeners, something similar to Java's final, etc.), but we don't talk about that aspect of Matlab. Undergraduates define their own C++ classes in the 1st year, but only in an optional 3rd year course do they really get introduced to O-O. There's not really enough time to push multi-paradigm programming at them. They would end up forgetting how to write loops, or whether arrays begin at 0, and they'd more often do things like getting programs to print the final results by putting the output code into a destructor, or using objects merely to modularise code.

In a way you might think engineering suits O-O - re-usable components and interfaces are common to both fields - but engineers also know about K.I.S.S. and over-engineering.

In "Python for Teaching Introductory Programming: A Quantitative Evaluation" (Jayal et al; Italics V.10.1 Feb 2011) they say they "found four experimental studies that compare object oriented approach with the traditional procedural approach". One "by Reges (2006) has found significant gains in student satisfaction and enrolment after replacing the object oriented programming first curriculum with a procedural approach". The other 3 studies found no significant differences. Jayal et al. found that students who started with Python then did Java performed better than students who did Java all the way through the course.

In Back to Basics in CS1 and CS2, Stuart Reges says "Our new version of CS1 looks a lot like a 1980's course taught in Pascal. We have gone back to procedural style programming. I was motivated to do this after attempting and failing to teach a broad range of introductory students at the University of Arizona using an 'objects early' approach. I found that my best students did just fine in the new approach, but the broad range of midlevel students struggled with the object concept". The course begins by using Java with a lot of public static methods. He writes "Our switch to static methods has allowed us to bring back the problem solving aspects of the course that we thought were so important in the 1980's" adding that "even though Java is not an ideal choice for our CS1, we continue to use it because of its payoff in our CS2 course."

If that reasoning applies in a Computer Science course it applies even more so to Engineering, I'd have thought. So should O-O stand for "Objects-Overrated"?

Thursday 14 April 2011

"Coders at Work" (edited by Peter Seibel, Springer-Verlag, 2009)

"Coders at Work" (edited by Peter Seibel, Springer-Verlag, 2009) has interviews with several famous programmers. Most use Emacs and debug using print statements. Most try to understand code in several ways - bottom-up, top-down, following the effects of a user action (e.g. deleting a character in an editor), looking at data structures, etc. Here are some quotes that might be of use

C++

"C++ is just an abomination", Zawinski, p.10
"[C++'s] syntax is terrible and totally inconsistent and the error messages, at least from gcc, are ridiculous", Fitzpatrick, p.63
"I don't like C++; it doesn't feel right", Armstrong, p.224
"given the kinds of goals that I have in programming, I think the decision [for C++] to be backwards-compatible with C is a fatal flaw ... C fundamentally has a corrupt type system", Steele, p.355
"Google is C++, strictly C++. It's no big deal programming in C++, but I don't like it ... by and large I think it's a bad language", Thompson, p.475

C

"C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine", Allen, p.502
"the biggest security problem to befall modern computers is C", Cosell, p.557
"one of the most important revolutions in programming languages was the use of pointers in the C language", Knuth, p.585

Debugging

"I love strace. Strace, I don't think I could live without", Fitzpatrick, p.79
"I think an hour of [team] code reading is worth two weeks of QA", Crockford, p.103
"we have found fuzz testing to be more productive than almost any other kind of testing", Eich, p.138
"the most important [debug] tools for me are still my eyes and my brain. I print out the code involved and read it very carefully ... So long as I can put print statements in the code, and I can read it thoroughly, I can usually find the bugs", Bloch, p.189-190
"most of my subjects have found that the hardest bugs to track down are in concurrent code", Seibel, p.xiii
"the first thing I will try is dropping in print statements to see if it will help me, even though that is probably the least effective for dealing with a complicated bug. But it does such a good job of grabbing the simple bugs that it's worth a try", Steele, p.365
"I don't know of anybody who [uses print statements] if they have the choice of using a good debugger", Ingalls, p.405

Gender

"Recently I realized what was probably the root cause of [the glass ceiling]: computer science had emerged between 1960 and 1970. And it mostly came out of the engineering schools ... And the engineering schools were mostly all men in that period", Allen, p.510
"A lot of people think it's the games and the nerdiness of sitting in front of a computer all day [that make computer science unappealing to women]. It's going to be interesting how these new social networks online will have an effect", Allen, p.513
"the conventional wisdom at the time ... said that women made good programmers because they pay attention to details ... today ... they're great on teams because they like to collaborate", Allen, p.507, 509

Miscellaneous

"We installed some buttons on the computer, because you could do that, at the time and one was a panic button. When the program appeared to loop one could just press the panic button", Allen, p.488
"I remember reading books about languages that I had no way to run and writing programs on paper for languages that I'd only read about", Zawinski, p.2
"I despise [perl]. It's a horrible language", Zawinski, p.11
"One of the jokes we made at Netscape a lot was 'We're absolutely 100 percent committed to quality. We're going to ship the highest-quality product we can on March 31st'", Zawinski, p.34
"On [Google's] top six or seven languages, there's a really strict style guide", Fitzpatrick, p.72
"I had a friend who had some iptables rule that on connection to certain IP addresses between certain hours of the day would redirect to a 'You should be working' page", Fitzpatrick, p.73
"most of the security problems that we've seen in operating systems over the last few years are a consequence of ++. In my programming style now I don't use ++ anymore, ever", Crockford, p.106
"I think threads are an atrocious programming model", Crockford, p.121
"I'm not an object-oriented, design-patterns guy", Eich, p.138
"I can't even remember which [UML] components are supposed to be round or square", Bloch, p.181
"the best existing multithreaded building blocks are in Java", Bloch, p.198
"I think the lack of reusability comes in object-oriented languages, not in functional languages. Because the problem with object-oriented languages is they've got all this implicit environment that they carry around with them", Armstrong, p.213
"with very difficult problems I quite often start right by writing the documentation", Armstrong, p.231
"From what I've seen of programmers, they're either good at all languages or good at none", Armstrong, p.235
"I don't think software is fractal ... I think the things that happen when systems get large are qualitatively different from the things that happen as systems go from being small to medium size", Deutsch, p.421
"garbage collection fights cache coherency massively", Thompson, p.472

About the programmers

Jamie Zawinski (Xemacs, Netscape, Mozilla)
Brad Fitzpatrick (LiveJournal, memcached, Google)
Douglas Crockford (Yahoo! Invented JSON)
Brendan Eich (Netscape, Mozilla, Invented JavaScript)
Joshua Bloch (Chief Java Architect at Google)
Joe Armstrong (invented Erlang)
Guy Steele (Lisp)
Dan Ingalls (Smalltalk)
L Peter Deutsch (Ghostscript)
Ken Thompson (Unix)
Fran Allen (IBM, compilers)
Bernie Cosell (PDP-1)
Donald Knuth (TeX, algorithms)

Monday 28 March 2011

Teaching Aids

The evaluation of new courses isn't always easy, especially when technology's involved. There's the

Hawthorne Effect - students work harder when they know evaluation is happening
Novelty Effect - some students like playing with new tech (the effect soon wears off)
??? Effect - students are affected by the increased motivation and interest of the staff

It's likely that the course needed an overhaul anyway, so even a less innovative re-write might have been popular.

I went to a talk recently about the effect new tech might have on education. One person pointed out that years ago lap-tops were going to revolutionize education, but it hasn't really happened. Before that there was computer-aided-education (too often rote-learning). I think the web has changed things, more by evolution than revolution - information and help's more readily available. For computing there are Forums where sensible questions get quick and informative answers. There are Web pages covering issues that students might get stuck on - pointers, O-O, etc. Students no longer need to struggle on without documentation. Meanwhile however, the method of teaching programming doesn't seem to have changed much. Textbooks have become more varied (more jokes, more games) but the mainstream books haven't changed much. For C++ we suggest that students get Deitel and Deitel. The text (1600 pages of it) is multicolored, there are asides in the margin and an associated Web site, but the pattern remains of telling students about a concept then getting them to write a program that uses the concept. Some online books like Introduction to Programming Using Java (David J. Eck) have embedded Applets, which can be useful.

I've tried a few things

and a lecturer's written MetaCard animations. I've rewritten the 1st year C++ course and put it online. Though it's traditional in structure, it has links to supplementary material that students wouldn't follow were the links on paper. And we offer short and long versions of the document.

For other subjects I think the availability of Apps (simulations, periodic tables, movies, animations) has assisted greatly. Not so in HE Computer Science. Facilities like Scratch are great, but have their limits. So what's next?

E-books - Most student work is still submitted on paper, and many handouts are on paper. A local department thinks it would be cost-effective to supply each student with an e-book loaded with course PDFs. The killer app is a program that makes freehand PDF annotation easy.
Tablets/Slates - In group activities with computers, it's typical for one person to hog the mouse. Tablets make the work more collaborative
GPS - Field trips, etc benefit from cameras with GPS
Smartphones - Students seem more likely to do something (e.g. provide feedback) if there's an App for it

but perhaps collaborative tools will make the most difference in the end. Google Docs can be used as a collaborative document tool: it's online, with version control and change tracking - and it's free.

Tuesday 8 February 2011

Legacy Code

"Since most computer science majors will face the horror of being assigned to a legacy system at least once in their career, computer science education should prepare them in a more scientific fashion by perhaps incorporating code discovery techniques in a software engineering course ... [they] may come away with a greater appreciation of the need for traditional, methodical software engineering techniques, as well as styles for surveying in the messy real world of legacy code" (Mark Meyer and Kevin Mastropaolo, JCSC 25,6 (June 2010)).

Many experienced programmers try to understand code in several ways - bottom-up, top-down, following the effects of a user action (e.g. deleting a character in an editor), looking at data structures, etc. Though there's no substitute for experience, some books and web pages exist.

"Working Effectively with Legacy Code", M.C. Feathers, Prentice Hall

Working Effectively With Legacy Code - a 12 page PDF article where he says

The general legacy management strategy is: 
 
1. Identify change points 
2. Find an inflection point 
3. Cover the inflection point 
    a. Break external dependencies 
    b. Break internal dependencies 
    c. Write tests 
4. Make changes 
5. Refactor the covered code.
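
As a rough illustration of steps 3a and 3c, here's a minimal C++ sketch. It isn't taken from Feathers' article: the names (Clock, FixedClock, late_fee) and the fee rule are hypothetical, and it assumes a legacy routine that originally read the system clock directly, which made it awkward to test. Passing the clock in through an interface breaks that external dependency, and the tests then record what the code currently does before anything is changed.

```cpp
#include <cassert>

// Hypothetical seam: the legacy routine used to read the system clock itself.
// Taking the dependency through an interface lets a test substitute it.
class Clock {
public:
    virtual ~Clock() {}
    virtual int day_of_month() const = 0;
};

// Test double that always reports the same day.
class FixedClock : public Clock {
public:
    explicit FixedClock(int day) : day_(day) {}
    int day_of_month() const override { return day_; }
private:
    int day_;
};

// The "legacy" routine, unchanged apart from receiving the clock as a parameter.
// Invented rule: no fee up to day 14, then 50 pence per extra day.
int late_fee(const Clock& clock) {
    int day = clock.day_of_month();
    if (day <= 14) {
        return 0;
    }
    return (day - 14) * 50;
}

// Characterization tests: they pin down the existing behaviour so that
// later changes (step 4) and refactoring (step 5) can be checked against it.
void run_characterization_tests() {
    assert(late_fee(FixedClock(1)) == 0);
    assert(late_fee(FixedClock(14)) == 0);
    assert(late_fee(FixedClock(15)) == 50);
    assert(late_fee(FixedClock(20)) == 300);
}
```

Calling run_characterization_tests() from main before and after each change gives a quick check that the covered behaviour hasn't shifted.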

working with legacy code - slides of a talk

Another option is to use something like doxygen which as well as generating HTML or LaTeX files from a set of documented source files can also extract the code structure from undocumented source files by means of "include dependency" graphs, inheritance diagrams, and collaboration diagrams.

We ran a project for students many years ago where students were given code that they had to build on. The project's remained popular. We've upgraded the sample GUI code, but the rest of the code has barely changed in a decade, and we don't intend to change it. In some senses, the older it gets the better. Some students feel the need to overhaul the code (replacing error codes by exceptions, etc) and usually break it.

If you're setting up such a project you might find it hard to acquire suitable legacy. You may need to write retro-style code yourself.

Monday 17 January 2011

Working in Teams

Having written about student projects it's natural to talk about working in teams because that's how much of our project work is done - students like it that way, and the lessons learnt are useful beyond software-writing - beyond academia too. Again, I'll start with a checklist

Team size - We use 2, 3, or 6 (the 6 being split into 3 pairs)
Who chooses teams? - We sometimes impose and sometimes let the students choose. Both options have pros and cons. With a yearly intake of 300+, letting students choose their own teams can be time-consuming, and self-selected teams might be rather unbalanced. There may be non-academic reasons for controlling team-selection (a mix of genders and backgrounds, for example)
Within a team, who does what? - Sometimes there are clear roles within a project. Sometimes the tasks are much less clearly delineated. For marking purposes staff need to know who did what.
Is there a team leader? - Specifying that there should be one enforces a structure onto the team
Planning overhead - What team-related milestones and deliverables are requested? An initial presentation? Gantt charts? Milestones? Some individuals who wouldn't plan a private project can see the point of planning when part of a team.

I like the idea of psychologically-profiling team-members at the start with a quick quiz, to see if the results are predictors of team dynamics. We don't do that, but one project does ask teams to complete a form at the end showing what percentage of the total workload each member did.

The same project also attempts to assess the time taken on the project and how much staff time was used (staff time is estimated to cost up to 250 pounds/hour)

Though students in general like working in teams, it can be a wounding experience. If team-members clash, how interventionist should staff be? Some staff involve themselves closely with teams and pro-actively influence dynamics. People at other sites sometimes offer mediation services. If staff leave groups to sink or swim, some will sink, and some innocent passengers will go down with the ship. We've had some generally popular team-based courses scoring un-exceptionally on end-of-year surveys because a few disgruntled individuals gave the courses a zero rating. Reports include comments like "It quickly became apparent XXX did not have the self-discipline or dedication to be an effective team manager ... His tendency to speak to others in an authoritative manner despite the lack of useful input/output from himself created dissatisfaction amongst other team members."

By the end of a project, team dynamics can get out of hand. After 4 weeks, a team member was late for a final session. One of his team members growled "[He] does less damage when he's not here. I don't want him touching the code ever again."

Teams can also have problems if one person is repeatedly absent. For this reason we're tough on attendance.

We use online forums to encourage teamwork (and to give us a chance to identify teamwork problems)

According to "Pair Programming Illuminated" by Williams, L. and Kessler, R. (Addison-Wesley), students who work in pairs enjoy the work more, are more confident, and get better grades. Students typically ask half as many questions. The article "Pair-programming helps female computer science students" suggests that pairing is particularly beneficial for women. On the other hand, there are courses (e.g. at the University of Washington) that have given up pair programming and have found that students get better grades (though the improvements may not be attributable to the teamwork decision).

In "Journal of Computing Sciences in Colleges" (Volume 28 Issue 2, December 2012) Tom Rishel describes a course where 5 teams simultaneously did some projects that had 5 development stages, each team working on a different project for each stage.
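
One simple way to arrange such a timetable is a cyclic rotation, sketched below in C++. This is only an assumption about how it could be done, not necessarily the scheme the course used: the function names are hypothetical, and teams, stages and projects are just numbered from 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical rotation: in a given stage, each team moves on to the next
// project in a fixed cycle, so project = (team + stage) mod n.
int project_for(int team, int stage, int n) {
    return (team + stage) % n;
}

// Checks the two properties we want from the timetable:
//  - within any one stage, no two teams share a project
//  - over all the stages, every team works on every project once
void check_rotation(int n) {
    for (int fixed = 0; fixed < n; fixed = fixed + 1) {
        std::vector<bool> seen_in_stage(n, false);
        std::vector<bool> seen_by_team(n, false);
        for (int k = 0; k < n; k = k + 1) {
            seen_in_stage[project_for(k, fixed, n)] = true;  // team k, stage "fixed"
            seen_by_team[project_for(fixed, k, n)] = true;   // team "fixed", stage k
        }
        for (int p = 0; p < n; p = p + 1) {
            assert(seen_in_stage[p]);
            assert(seen_by_team[p]);
        }
    }
}
```

With n set to 5, team 4 would start on project 4 and pick up project 0 in the next stage, inheriting whatever the previous team left behind.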

Friday 14 January 2011

Gender

Though the percentage of female students in the university is about 50%, the Engineering department figure is more like 25% (which is better than the UK engineering average of 14%). You'd have thought that once people have gone into Engineering, they might as well do Computing (in for a penny in for a pound) but in our 3rd year software project the figure's below 10% (0% some years). It's not just us of course; at Harvard computer science is the most gender-skewed subject, with women comprising only 13 percent of undergraduate CS majors. The proportion of female CS majors is similar at some of Harvard’s peer institutions - Princeton is 19 percent, and Stanford is 14 percent.

The problem goes back a long way. According to Fran Allen the ratio in computing might be engineering's fault - "Recently I realized what was probably the root cause of [the glass ceiling]: computer science had emerged between 1960 and 1970. And it mostly came out of the engineering schools ... And the engineering schools were mostly all men in that period"

4 issues are commonly mentioned in the documentation. As the authors point out, dealing with these points will improve the popularity of the subject in general as well as increase the female intake.

Lifestyle -
- "A lot of people think it's the games and the nerdiness of sitting in front of a computer all day [that makes computer science unappealing to women]. It's going to be interesting how these new social networks online will have an effect" (Fran Allen)
- "[women] attach their interest in computing to other arenas, to a social context that's more people-oriented. We refer to this as computing with a purpose as opposed to programming for programming's sake or a totally technology-centric focus. But the curriculum and culture does not acknowledge this interdisciplinary, contextual orientation toward computer science." (Jane Margolis and Allan Fisher, "Unlocking the Clubhouse: Women in Computing")
At engineering, we might not be too badly off in this respect: computing is only a means to an end, and we have students working on Design, Medical software, Green Technology, 3rd World technology, Teaching Aids, etc.
Career Options - It is sometimes thought that computing jobs reward those prepared to obsessionally work long hours. There are many 9-5 computing jobs nowadays, and many uses of computing in the humanities. I've been involved with computing projects about garden design, poetry, etc.
Role Models - our head of department is female. The IT group I'm in has about 20% females. It's difficult getting female staff/p-grads involved with introductory computing - there aren't many of them in the first place, and they don't want to spend their teaching time on introductory courses as role models.
Pre-university qualifications/experience -
- "We also found because of early socialization in schools and at home, and a sort of early claiming of the computer as a boy's toy, that girls who wanted to major in computer science and got into one of the top computer science departments in the country actually came in with less hands-on experience. Although there was absolutely no difference in ability, there was a difference in experience, which then led to a difference in confidence during the program." ("Unlocking the Clubhouse: Women in Computing", Jane Margolis and Allan Fisher)
The choices made at school can be restrictive. At our department there are people in touch with developments in pre-univ education, and we run OutReach courses, bringing schoolchildren into the department. Elsewhere, girl-only summer schools are run.

Some other places have attempted remedies.

Several CS professors [at Harvard] indicated that encouraging more women to study the subject was among their top priorities for the future. "It’s something that we talk about a lot," said Associate Dean for Computer Science and Engineering J. Gregory Morrisett. "We are coordinating with a bunch of departments around the world and are trying a lot of different things in the hopes that we will uncover some of the issues and correct for them." (from The Harvard Crimson)
They improved the situation at Carnegie-Mellon -
- "There's been an attempt to teach computing in a more interdisciplinary way. Also, the university accounted for the different levels of experience - one of our findings being that women came in with different levels of experience, but there was no difference in ability."
- "A new set of courses was introduced in the first year, allowing everyone to self-select where they wanted to be according to their experience, and then everyone would be at a similar level by the second year. That means you wouldn't have students with little experience sitting next to someone that's been hacking their whole life and then get really discouraged."

Monday 10 January 2011

Which language?

A long time ago many students arrived at our department with some programming experience thanks to home micros running BASIC, and schools equipped with BBC micros. Then skills fell away. More recently, with free Linux being available, there's been a recovery in programming skills, but only amongst the computer literate, meaning that the distribution of our intake's computing skills is more bi-polar than ever.

Meanwhile, the use of computing in engineering has increased enormously, and new, computing-based areas of engineering have emerged. The invention of the WWW and cheap, small processors has led to the use of GPS and intelligent sensors in civil engineering projects. Google Maps and Google Apps enrich projects as well as aid communication between workers. Computer simulations and CAD continue to replace their predecessors.

The curriculum has slowly changed. In 1997 we moved from teaching Pascal as a first language to teaching C++ (well, C really). Later, there were compulsory computing questions in the 1st year maths examination. In 2003 we offered the MDP disc so that students could run a C++ compiler on their home machines. In 2008 we introduced a C++ summer project. In 2010 first years used Lego MindStorms in week 1, programming using Matlab.

Some of these initiatives aimed to inspire students or offer opportunities for those interested in computing. Other initiatives sought to improve the skills of the less capable students, or encourage a more self-taught approach, but we still introduce programming concepts formally using a compiled language and a mixture of practicals and lectures. The increased freedom of choice means that students can more easily avoid programming, thus magnifying the bi-polar distribution of our intake's skills. Attempts have also been made to introduce programming into more engineering courses. In recognition of the need for proficiency in more than one language, Matlab/Octave is now taught (a vehicle for teaching about algorithms) to 2nd years.

Though the C++ course has always received above average results in student feedback, exam performance suggests that many people don't revise the subject. Also, feedback from staff and older students hasn't been so encouraging - it's said that the course doesn't prepare people for the kind of programming they'll later need. Programming is used increasingly across all engineering disciplines, and C++ isn't the "Swiss army knife" that students want.

C++ was never intended as a "teaching language" but it has withstood the test of time. It's used to write anything from operating systems to embedded programs in hardware. It can be used to introduce all the programming paradigms that have emerged over the years - procedural, object-oriented, generic, etc. It's used here in 2nd and 3rd year practicals (interfacing with low-level hardware and electronics), and a 3rd year software project. Though it's not used in very many 4th year projects, knowledge of C++ is useful for projects that involve Java, C# or Objective-C. Its use has become more focussed over the years as alternatives have emerged. In 2009 we adjusted the C++ course to exploit the teaching potential of the web technologies now available.

The new version separates the reference and lab-instruction aspects, so that during the practicals students are only told about what they need for the practicals. They are encouraged to experiment as they read, and to look details up online - web-access is assumed.
Because it's a web document we can add some interactive tests and offer students novice/expert choices. This also saves 10,000 sheets of paper. We've taken care that the web pages print out well, in case students want paper versions.
Some other departments in Cambridge and in the States leave students to their own devices. Without going that far, we'd like to make students more self-reliant and more able to continue work at home. To that end we've dropped the graphics in this course.
The theory in particular (but also the exercises in the practicals) used to emphasise Numerical Analysis. This emphasis has been reduced in favour of problem solving, but the need to retain examinable material is a constraint.
The impression that demonstrators gained in the past was that some students never recovered from the first impression that computing was hard. Though the material covered is nearly the same as before (casts have been dropped, C++ strings are used more, and classes are used in an exercise rather than just appearing in an appendix), the initial learning curve is shallower and the emphasis is on using C++ to solve problems. No C++ history is mentioned. There are no comprehensive lists of available features.
Initially a reduced set of C++ facilities is taught. C++ is an old language. Though new, safer features and notation have been developed to supersede older ones, the old ones are still legal. We introduce the newer or easier approaches first (especially if they're more like the maths or Matlab/Octave equivalents), then other notations later.
- We introduce i=i+1 long before i++
- We introduce and, or and not rather than &&, || and ! for logical operations
- We use braces around the body of while, for and if code, even if the body is only one line of code. This eliminates some common bugs.
- We use while loops before for loops. Experience has shown that many students never understand for loops, forgetting the notation entirely, thinking that the condition is an "until" rather than "while" condition, or thinking that
```
   for (int i=0; i<4;i++)
       cout <<i; 
```
  is somehow equivalent to
```
   int i=0; 
   i<4; 
   i++; 
   cout <<i;
```
Once the concepts are clear, the more common (albeit more obscure) alternatives are mentioned.
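
To make the correspondence concrete, here's a small C++ sketch that isn't from the course notes (the function names are made up for illustration). It writes the loop above in the style introduced first (while, braces, i = i + 1) and in the compact for form, and checks that the two print the same thing. It also shows and/not, which are standard C++ alternative spellings of && and !, though some compilers only accept them in standards-conforming mode.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Style introduced first: while loop, braces, i = i + 1.
void print_with_while(std::ostream& out) {
    int i = 0;              // initialisation: done once, before the loop starts
    while (i < 4) {         // the loop carries on WHILE this is true (not "until")
        out << i;           // body
        i = i + 1;          // increment: done after the body, every time round
    }
}

// Compact form mentioned later: same three parts, same order of events.
void print_with_for(std::ostream& out) {
    for (int i = 0; i < 4; i++) {
        out << i;
    }
}

// "and" and "not" used in place of && and !
bool is_even_between_1_and_10(int n) {
    return n >= 1 and n <= 10 and not (n % 2 == 1);
}

// Runs both loops into string buffers and checks that the output matches.
void check_loops_agree() {
    std::ostringstream from_while;
    std::ostringstream from_for;
    print_with_while(from_while);
    print_with_for(from_for);
    assert(from_while.str() == from_for.str());
}
```

Calling print_with_while(std::cout) prints 0123, whereas the mistaken reading of the for loop as four separate statements would print a single digit.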

The result's an online C++ course. While I was at it I converted it to Python, trying to change as little as possible - see the Python course

Python is popular within the University, even for "Scientific Computing". I can see its role growing within the Engineering Department too. We're now in a more polyglot, mash-up age. Matlab for example has about 2.5M lines of C, 1M lines of Java, 0.5M lines of Fortran and about 2.5M lines of its own scripting code. But C++ (the Latin of programming languages) remains: though the first official reference guide for C++ was published over 25 years ago, it's used in modern applications by Halo, Amazon, Google, Mathworks, Microsoft, Apple, etc. According to the October 2010 figures from the TIOBE index, Java and the C family remain the 2 most popular languages.

Before C++ becomes too complacent though, here are some quotes from "Coders at Work" (edited by Peter Seibel, Springer-Verlag, 2009), which has interviews with several famous programmers.

"C++ is just an abomination", Zawinski, p.10
"[C++'s] syntax is terrible and totally inconsistent and the error messages, at least from gcc, are ridiculous", Fitzpatrick, p.63
"I don't like C++; it doesn't feel right", Armstrong, p.224
"given the kinds of goals that I have in programming, I think the decision [for C++] to be backwards-compatible with C is a fatal flaw ... C fundamentally has a corrupt type system", Steele, p.355
"Google is C++, strictly C++. It's no big deal programming in C++, but I don't like it ... by and large I think it's a bad language", Thompson, p.475

And here's another quote - "We sold our programming soul when we began teaching the pedagogically unsound and intellectually ugly languages C and C++ to beginners" - David Gries (IEEE Computer, October 2006, p.81).

Tuesday 4 January 2011

Open-ended computing projects

We run a 3rd year project for 20+ students (teams of 3) which lasts 4 weeks and lets them choose how to fulfill the task. To make it more fun we modify the requirements a week before the deadline, the hope being that if their work is well structured, the modifications won't be too onerous.

We also offer 4th year projects for individuals which last half their final year where there's considerable scope for flexibility - within reason they can come up with their own project titles. Here are 3 examples -

Other places (computing departments in particular) offer more project coursework - e.g.

iPod Touch development (podcasted lectures are online)
Producing add-ons for free-ware (GIMP, Mozilla, etc)

Such projects often appeal to students (and in retrospect are thought to be very useful). Students mention projects in CVs and they're a popular interview topic. They bring together many issues at the heart of Software Engineering but present difficulties to staff

Project acceptance - how carefully should the student's suggestions be assessed?
Evaluation - how can dissimilar projects be compared fairly?
Copying - how much copying is there in an adaptation? Should suspicions be raised if the source of a very similar product appears during (or just after) the student's project?
Staff Workload - if the students have freedom of language, platform, etc, help is going to be hard to provide.
Student non-productive work - if the student is trying to contribute to a big project there might be required procedures, packaging, version control, documentation, etc that take up too much time

In our situation we would like to bring some aspect of engineering into the project (for years 1 and 2 anyway), but WebApps are so prevalent nowadays that one could almost consider them as examples of basic programming. One option might be to use Google App Engine as a platform. The online documentation tells people how to download the software and has a staged example which is easy to work through (Java and Python are supported). The resulting web-page + database back-end can be run on the user's machine. Students could then work on their own ideas and upload the result to Google's cloud. A set of standard project titles could be offered for the less ambitious.
