Thursday, April 12, 2007

Talk about software maintenance

The December 2006 issue of IEEE Transactions on Software Engineering published a very interesting study conducted by researchers at Carnegie Mellon University. It explores how software developers approach typical software maintenance tasks, and whether any observable behaviour patterns can serve as cues for improving existing theory and tools so that developers can perform such tasks more efficiently.

The whole experiment in a nutshell consists of the following building blocks:

  • A group of above-average Java developers as test subjects
  • A simple paint application written in Java, with hundreds of lines of source code
  • A consistent set of pre-prepared typical maintenance tasks (defect fixes and minor feature enhancements) for each developer
  • A video recording program running in the background to record how the developers performed their tasks
  • An artificial periodic interruption mechanism that simulates the real-life interruptions that happen in any software organization

Based on this study, the researchers concluded that developers followed this behaviour pattern during the experiment:

  • Search – Developer explores cues in environment to choose a sufficiently relevant node to start comprehending
  • Relate – Developer explores cues in environment to decide whether to navigate a dependency, return to a previously visited node, or stop relating. If the node is relevant the developer collects it.
  • Collect – Developer uses some form of memory, external or otherwise, to remember what was found

For each task, every developer went through a search-relate-collect cycle until they had collected enough information to attempt a solution. If they ran into further uncertainty during the implementation, they repeated the cycle to collect whatever additional information they needed to move forward.

Along with the behaviour pattern, the study also suggested some improvements to existing tools and IDEs that would help developers search, relate, and collect more efficiently. What really caught my eye, though, is that the research was conducted entirely on the traditional way of developing software, without any agile elements. That makes it perfect study material for finding out how an agile approach can eliminate, or at least minimize, some of the difficulties observed in the study, as well as what we should look out for in an agile project to ensure high software maintainability.

Any successful commercial or open-source system goes through a relatively rapid greenfield development stage, followed by a much longer and sometimes more challenging brownfield maintenance phase. Customer satisfaction (for commercial software) and community endorsement (for open-source software) both depend heavily on how well and how easily the system is maintained. According to this research, improving software maintainability pretty much boils down to helping developers search, relate, and collect information in hundreds of thousands or millions of lines of source code. Let’s talk about each aspect in more detail, and about how an agile approach helps in many different ways:

I. Helping Developers Search More Effectively

A. The study found that method naming played a significant role in helping, or preventing, developers from searching and understanding code effectively. Agile practices such as Continuous Refactoring, Short Methods, and Self-Descriptive Method Names can really help you here.
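To make this concrete, here is a minimal sketch of what short methods with self-descriptive names might look like. The class and method names are hypothetical and do not come from the paint application used in the study; they only illustrate the idea.

```java
// Hypothetical example: none of these names come from the study's paint application.
final class PaintCanvas {
    private final int width;
    private final int height;

    PaintCanvas(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // Before refactoring, this logic might have lived in a method such as
    // check(int a, int b), a name that gives a searching developer no cue at all.
    boolean containsPoint(int x, int y) {
        return isWithinHorizontalBounds(x) && isWithinVerticalBounds(y);
    }

    // Each helper is short enough that its name is a complete summary of its body.
    private boolean isWithinHorizontalBounds(int x) {
        return x >= 0 && x < width;
    }

    private boolean isWithinVerticalBounds(int y) {
        return y >= 0 && y < height;
    }
}
```

A developer searching for "where do we decide whether a click lands on the canvas" can find containsPoint by name alone, without reading any method bodies.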

B. The researchers also noticed that an effective way of communicating the intent and purpose of the code really helps developers pinpoint the best starting point for their search and reduces the scope of the search they have to perform. They suggested that well-written, up-to-date documentation is a sound solution for this purpose. In the agile realm, though, I think a well-written test suite is an even better choice: tests don’t just describe the intent and purpose of the code, they are also coded and compiled against it, so they can be verified automatically and are kept up to date.
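As a rough illustration of tests acting as documentation, consider the sketch below. Everything in it is an assumption made for the example: StrokeHistory is a hypothetical class, and the plain-Java check helper stands in for a real test framework such as JUnit so that the snippet stays self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical class under test; not taken from the study's paint application.
final class StrokeHistory {
    private final Deque<String> strokes = new ArrayDeque<>();

    void add(String stroke) {
        strokes.push(stroke);
    }

    // Removes the most recent stroke; returns false when there is nothing to undo.
    boolean undo() {
        if (strokes.isEmpty()) {
            return false;
        }
        strokes.pop();
        return true;
    }

    int size() {
        return strokes.size();
    }
}

// Plain-Java checks standing in for a test framework.
// The method names state the intended behaviour in full sentences.
final class StrokeHistorySpec {
    static void undoRemovesOnlyTheMostRecentStroke() {
        StrokeHistory history = new StrokeHistory();
        history.add("line");
        history.add("circle");
        check(history.undo(), "undo should succeed when strokes exist");
        check(history.size() == 1, "exactly one stroke should remain");
    }

    static void undoOnEmptyHistoryChangesNothing() {
        StrokeHistory history = new StrokeHistory();
        check(!history.undo(), "undo should report that there was nothing to undo");
        check(history.size() == 0, "history should stay empty");
    }

    static void runAll() {
        undoRemovesOnlyTheMostRecentStroke();
        undoOnEmptyHistoryChangesNothing();
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

Someone who has never seen StrokeHistory can read the two test names and know what undo is supposed to do. Unlike a design document, these statements fail loudly the moment the code stops agreeing with them.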

II. Helping Developers Relate Information More Effectively

A. Once developers find the cues, they need to comprehend them and decide whether they are relevant to the task at hand. It becomes obvious that the more readable the source code is, the more effectively the developer can relate information. Agile principles such as Keep It Simple, Stupid, Just Enough Design, and Continuous Refactoring can greatly enhance the readability of your code and thus make this task a lot easier.

III. Helping Developers Collect Information More Effectively

A. Collecting information seems to be all about how to store, categorize, share, and maybe annotate the information that was located in the previous steps. It sounds like a job for the IDE, which is probably true, just as the researchers suggested in the report; our tool vendors do have a lot of room to improve and innovate. However, if you look at this issue from a different perspective, agile methodology might do just enough to make this task easy enough that the existing IDE, or your own limited memory, can handle it without a major technological or biological advance.

Imagine a typical scenario: after hours of searching and relating, you finally manage to comprehend an extremely complicated algorithm. A lot of information needs to be collected and maybe stored in a document, since it is so complex that you just don’t trust your own memory, and you would probably like to share it with your colleagues as well. Instead of doing that, how about applying what you have learnt directly to the source code, using refactoring to reduce the complexity and increase the readability? That effectively combines the documentation and the source code in one unified form. Together with the practices described in the previous two sections, you might be able to completely remove the need to collect information on paper: when the code is simple and self-descriptive, all you need to do is search and relate, and the rest is in plain sight, with no need to collect (at least not in a document).
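Here is a small sketch of writing hard-won understanding back into the code instead of into a document. The leap-year rule is only a stand-in for whatever dense logic you have just deciphered, and the class and method names are hypothetical.

```java
// Hypothetical illustration; the leap-year rule stands in for any dense logic.
final class CalendarRules {
    // Before: correct, but the reader has to reconstruct the rule from the
    // operators, and would probably jot it down somewhere after working it out.
    static boolean check(int y) {
        return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    }

    // After: the notes that would have gone into a document are now the names
    // of the method and its local variables.
    static boolean isLeapYear(int year) {
        boolean divisibleByFour = year % 4 == 0;
        boolean isCenturyYear = year % 100 == 0;
        boolean divisibleByFourHundred = year % 400 == 0;
        return (divisibleByFour && !isCenturyYear) || divisibleByFourHundred;
    }
}
```

Both methods compute the same result, but only the second one leaves the next maintainer with nothing to collect.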