Thursday, August 31, 2006

What can we learn from the Open Source community?

The most common complaints I heard from developers during my consulting career run along the lines of “The job is boring”, “The job is too stressful”, or “The manager does not know what he/she is doing”. When I hear these a lot from a team, that team is probably demoralized, which in most cases leads inevitably to project failure. On the other side, the common complaints coming from managers usually include “The developers are not self-motivated/managed”, “The developers are too picky”, or “The developers are too greedy”. As you can see, both sides are unhappy with each other, and in most cases I have observed, the trust between managers and developers eventually breaks down completely, resulting in further demoralization that ultimately contributes to the failure of the project, or can even bring down a company.

Now let's take a look at some of the characteristics of a successful open source team. In my opinion, almost every successful and popular open source project has the following characteristics: first, the team delivers high-quality software that largely satisfies end users' requirements; second, it consists of a group of highly skilled and self-motivated developers; and finally (this is the sweetest part for a corporation), they don't get paid to do any of it, but rather do it for FREE. Obviously, developers like these are the perfect kind of employees, a dream come true, if I may say so, for many corporations out there, so my next question is: how can a company acquire this kind of employee? If I were running a recruiting company, I would probably tell you that I can find someone like that for you; since I don't, I am going to tell you the truth. It all starts with building the company culture. In my opinion, two major contributing factors form the solid foundation of any successful open source team and their formidable power of devotion.

The Learning Edge

Recent research in psychology shows that when a person is in a constant and comfortable learning zone without too much pressure, he/she has a much higher chance of entering a state called “Flow”. Flow, as described in Flow: The Psychology of Optimal Experience by Mihaly Csikszentmihalyi, is a state of altered consciousness in which our ability to concentrate and perform is enormously enhanced. The research also suggests that in this state a great deal of self-satisfaction and appreciation is generated for the individual; in other words, a person in this state can be satisfied more easily, without much stimulation from the external environment. Open source projects reflect this theory perfectly: most team members are not being paid to work, yet they still devote a large chunk of their time and energy. Why? Because this way they can achieve satisfaction without any reward in material form. We know from Maslow's hierarchy of human needs that once the basic needs are met, further investment such as increased salary or benefits suffers from the law of Marginal Efficiency of Investment, so it cannot be used as a motivation tool forever. Sooner or later the company has to resort to other means of motivating people, and the Learning Edge is the cheapest, and a proven highly effective, way of doing that. For more information on the Learning Edge management philosophy, see “The Learning Edge” article published in the June 2006 issue of Communications of the ACM.

Direct Feedback from the End Users

This is something else that is usually missing in many corporations but plays a very important role in the open source community, and it also contributes to the high level of self-satisfaction and motivation demonstrated by some of the most successful open source teams in the field. A good manager knows to openly express his/her appreciation to the team to boost morale and team spirit, but just like salary, this kind of appreciation from a single manager, or even a couple of them, also suffers from the law of Marginal Efficiency of Investment and eventually becomes ineffective, or too expensive to be effective. However, if you open a direct communication channel between the developers and the end users, you open the door to an infinite appreciation program (if the software quality is good and it serves the end users' requirements) as well as an equally infinite monitoring and inspection program (if the software suffers from quality issues or fails to meet the end users' requirements). Special care needs to be taken to remove the distraction and any other negative side effects this direct channel can create, but overall I think the benefits greatly outweigh the shortcomings.

To summarize: to be competitive as a technology company in this post-dot-com era, you cannot rely only on the traditional management toolkits and philosophies; you also need to learn from the open source community in order to maintain a reasonable salary level, retain your talent, and, most importantly, survive and thrive.

Monday, August 21, 2006

Deadlock problem in MySQL JConnector 3.1

Recently I found a very interesting but also annoying database deadlock problem in my current project. For performance reasons we use different MySQL storage engines for different types of tables: the InnoDB engine for OLTP-type tables, the MyISAM engine where high data throughput is needed and a certain level of data inconsistency can be tolerated, and the MEMORY engine for transient data or tables with extreme performance requirements.
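As a purely illustrative sketch of this per-table engine selection, the snippet below maps workload categories to the engine name used in MySQL's CREATE TABLE ... ENGINE clause. The category names and table names are made up for illustration; they are not taken from the actual project:

```java
// Illustrative only: choose a MySQL storage engine per table based on its
// workload profile, mirroring the mixed-engine deployment described above.
public class EngineChooser {

    // Pick a storage engine for a given (hypothetical) workload category.
    public static String engineFor(String workload) {
        switch (workload) {
            case "oltp":      return "InnoDB"; // transactional tables
            case "highWrite": return "MyISAM"; // high throughput, relaxed consistency
            case "transient": return "MEMORY"; // transient / extreme-performance tables
            default:          return "InnoDB"; // safe default
        }
    }

    // Build the ENGINE clause appended to a CREATE TABLE statement.
    public static String ddlSuffix(String workload) {
        return " ENGINE=" + engineFor(workload);
    }

    public static void main(String[] args) {
        // Hypothetical tables, one per workload category.
        System.out.println("CREATE TABLE orders (...)" + ddlSuffix("oltp"));
        System.out.println("CREATE TABLE click_log (...)" + ddlSuffix("highWrite"));
        System.out.println("CREATE TABLE session_cache (...)" + ddlSuffix("transient"));
    }
}
```

The point of centralizing the choice like this is simply that the engine-per-table decision is a deliberate, documented policy rather than something scattered across ad hoc DDL scripts.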

Just a couple of weeks ago, during system stability testing, we discovered occasional deadlocks in the MySQL JConnector 3.1 driver when a large number of InnoDB and MyISAM tables coexist in the same database. According to the MySQL bug base, this problem is scheduled to be fixed in JConnector 5. Unfortunately, for many different reasons, both technical and political, we do not have the luxury of upgrading to JConnector 5, at least not in this project, but we also desperately need this mixed-engine deployment feature from MySQL to meet our performance requirements. It became obvious a couple of weeks ago that I had to fix this problem for the project to be released successfully. After downloading the JConnector source code, and using a combination of simple JVM dumps and JProfiler, I finally pinpointed the problem in two classes: com.mysql.jdbc.Connection and com.mysql.jdbc.ServerPreparedStatement. The original implementation of the Connection.prepareStatement method, where the PreparedStatement gets created, uses only a simple synchronized keyword at the method level as its semaphore. Within this method, the Connection class makes many calls to ServerPreparedStatement to open, close, and check the state of the statement. Based on my observation and investigation, the realClose method on ServerPreparedStatement seems to be the one that causes all the problems. Looking at realClose, I quickly realized that in this method ServerPreparedStatement makes a callback to the Connection class, which creates a classic cross-reference scenario (the breeding ground of the notorious deadlock problem). After that I also examined how ServerPreparedStatement uses its semaphores, and it became obvious to me that the original implementation of the locking mechanism in this part of the code is rather naïve and not well thought through.
The original implementation of ServerPreparedStatement.realClose acquires three different semaphores, in the order statement.this, connection.this, connection.mutex. If you now look at the Connection.prepareStatement implementation mentioned before, at first it seems to use only connection.this as its semaphore, but with a little digging I realized that before calling prepareStatement the current thread has already acquired connection.mutex as a semaphore. In other words, Connection.prepareStatement locks in the sequence connection.mutex then connection.this, while ServerPreparedStatement.realClose locks in the sequence connection.this then connection.mutex. Now everything is clear: it is a classic lock-ordering problem, and the solution is simply to make the sequence of semaphore acquisition the same in both Connection.prepareStatement and ServerPreparedStatement.realClose by adding an explicit locking order:

synchronized (getMutex()) {
    synchronized (this) {

to Connection.prepareStatement implementation and switching the locking sequence:

synchronized (this.connection.getMutex()) {
    synchronized (this.connection) {

in ServerPreparedStatement.realClose implementation.
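The general rule behind this fix, acquire shared monitors in one global order on every code path, can be sketched with a minimal, self-contained example. This is not the JConnector code itself; the two method names and the two lock objects only mimic the scenario discussed above:

```java
// Minimal sketch of the lock-ordering rule applied in the fix above:
// every code path acquires mutex first, then conn, never the reverse.
public class LockOrderDemo {
    private static final Object mutex = new Object(); // stands in for connection.mutex
    private static final Object conn  = new Object(); // stands in for the Connection monitor

    // Both methods honor the same global order: mutex -> conn.
    static void prepareStatement() {
        synchronized (mutex) {
            synchronized (conn) {
                // ... create the prepared statement ...
            }
        }
    }

    static void realClose() {
        synchronized (mutex) {       // same order as prepareStatement
            synchronized (conn) {
                // ... close the statement, call back into the connection ...
            }
        }
    }

    // Hammer both paths concurrently; with a consistent order this always finishes.
    public static boolean runDemo() {
        Thread a = new Thread(() -> { for (int i = 0; i < 10000; i++) prepareStatement(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 10000; i++) realClose(); });
        a.start(); b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            return false;
        }
        return !a.isAlive() && !b.isAlive(); // true means no deadlock occurred
    }

    public static void main(String[] args) {
        System.out.println(runDemo() ? "finished without deadlock" : "deadlocked");
    }
}
```

If realClose instead took conn before mutex, the same two threads could each grab one monitor and wait forever for the other, which is exactly the cross-reference deadlock described above.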

Although this fix did resolve all the problems we had in our project, according to the MySQL official bug base this deadlock problem might have a much broader set of causes and a wider impact, and can therefore only be fully addressed in version 5. So before you apply this fix, make sure you fully understand the implications and the causes of your particular problem, and I also strongly recommend running a thorough stability test and a profiling session after applying it. Nevertheless, I hope this post sheds some light on the problem and its solution, and can provide some help.

Wednesday, August 16, 2006

How to select a right Agile methodology for your project - Part 3

In my last two posts we talked about XP as well as a little bit about RUP, and today I will explore the SCRUM methodology. Relatively speaking, SCRUM is still the new kid on the block compared to the other two, which is also why you can easily observe the similarities among them: SCRUM does borrow ideas from both XP and RUP. Some even went one step further and announced SCRUM “as a management wrapper for Extreme Programming” (see the Control Chaos website for more details on this theory), with a similar claim made for RUP as well (follow this link for the full article). Although I don't totally agree with this management wrapper theory, I do agree that SCRUM learned from both XP and RUP, through their successes and failures, and tried to come up with a better process.

Since most people agree that SCRUM is more comparable to XP than to RUP, and since at its core RUP is just a set of guidelines, as I mentioned in part one, here I am going to focus on comparing SCRUM with XP and show the implications and hidden differences between them that you need to watch out for when implementing SCRUM in your project. As a methodology, SCRUM is more about management than development. Compared to XP, SCRUM does not prescribe or enforce any coding guidelines or practices, such as the test-first development, pair programming, and refactoring that XP requires; however, in the process management aspect SCRUM is far more rigid than XP, with hard, fixed requirements on team size, meeting frequency, meeting length, iteration length, and so on. But before we jump into the details, I want to point out the first and foremost difference between SCRUM and XP, which is also the reason why I don't agree with ADM's “using SCRUM as a management wrapper for XP” theory. If you remember what we talked about in part two, the one and only reason XP is considered so extreme is that it allows customers to change their requirements at any time they want, even in the middle of an iteration. In order to make the whole process more manageable, since manageability is the most frightening part of XP, at least for traditional managers, SCRUM took this cornerstone of XP out of its process. Now people can argue that they can use SCRUM as a management wrapper for XP, since SCRUM does not have its own development principles and the XP development style fits it pretty well, but can you still call it XP? I don't think so.
It is like people who find white-water rafting too extreme and unmanageable as a sport, so to make it more manageable they take the 'white-water' part out, and then claim that calm-water rafting is a management wrapper for white-water rafting. You can still apply all the precautions and best practices that white-water rafting requires, but can you still call it an extreme sport?

Alright, that's enough about the management wrapper theory; now let's take a look at some core SCRUM rules, as well as the reasons and implications behind them:

  • Fixed Backlog Items (Requirement) for each Sprint (Iteration)

    • Reasons

Although SCRUM does acknowledge the changing nature of requirements during the process, in order to increase manageability, and because it lacks development principles and guidelines, SCRUM is neither designed for nor capable of handling frequent requirement changes in the middle of a sprint, as we mentioned before.

    • Implications

As soon as the sprint kicks off, all sprint backlog items become frozen. New backlog items only get added to the product backlog item pool, and are picked up no earlier than the start of the next sprint. Because of the relative inflexibility of the requirements, when implementing SCRUM the development team should always pick the high-priority items first, since they are most likely to be the most stable requirements. Also, thanks to this stability, when using SCRUM on a simple project excessive testing and a high level of unit-test coverage might not be necessary (high test coverage, above 80%, is still recommended if the project is complex or has a long predicted field life, in other words when refactoring is required for an evolving architecture).

  • Fixed Sprint (Iteration) Length – 30 days

    • Reasons

First of all, it is a widely accepted optimal iteration length for a seven-person team in the RUP community (see The Rational Unified Process Made Easy for more information on recommended iteration lengths). Secondly, in SCRUM each Sprint has a relatively formal planning and review process (usually taking a day or two), so a very short Sprint would simply create too much overhead from this ceremony. Last but not least, because SCRUM does not enforce high test coverage, changes of implementation and architecture are relatively expensive; the process was therefore designed to shield the team from outside changes for a fixed period of time, allowing the team to focus on what they are doing and get into their flow, maximizing productivity while minimizing overhead.

    • Implications

If you are working on a project that needs very short iterations, perhaps because of a large degree of uncertainty and unpredictability in the requirements, or simply because of a super-aggressive deadline, then before you try to customize your iteration length you should really consider XP or AUP, which are tailored for short iterations, or even downsize your team to fit a shorter sprint (I usually don't recommend shortening your sprint to less than 2 weeks with SCRUM). But if you insist on changing the iteration length, remember the following consequences as well as some recommended solutions:

Consequence: Overhead from the formal planning/review ceremony
Recommended solution: Batch the planning and review activities for multiple sprints into one session

Consequence: Overhead from more dynamic requirements and implementation
Recommended solution: Increase test coverage, and have the SCRUM Master do a better job of shielding the team from outside 'noise'

  • Max Team Size - 7 people

    • Reasons

Firstly, because SCRUM does not require pair programming (here is an excellent article I read a couple of years ago that explains why pairing is a good idea), knowledge does not spread as naturally as it does in XP. To overcome this problem SCRUM uses a close-quarters open working space as well as a mandatory daily meeting to help speed up knowledge transfer; once you increase the team size beyond 7, this approach becomes insufficient to serve its purpose. Secondly, as in XP, due to the decentralized team structure that SCRUM adopts, once you have a large team too many communication paths get created, and research shows that once team size exceeds 12 the communication overhead starts growing exponentially.
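The communication-path growth mentioned above follows from simple counting: a team of n people has n*(n-1)/2 pairwise channels. A quick sketch:

```java
// Count the pairwise communication paths in a team of n people.
public class CommPaths {
    // Each of the n members can talk to n-1 others; divide by 2
    // because each channel is shared by two people.
    static int paths(int n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 7, 12}) {
            System.out.println(n + " people -> " + paths(n) + " paths");
        }
    }
}
```

So going from 7 to 12 people more than triples the number of channels (21 to 66), which is why the daily-meeting and open-workspace mechanisms stop scaling.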

    • Implications

SCRUM does not have any problem handling a team smaller than 7, but you need to be very careful when considering increasing your team size beyond that. Here are some possible consequences of doing so, and my recommended resolutions:

Consequence: Inefficiency caused by the large number of communication paths
Recommended solution: Assign a team lead who is not the SCRUM Master, since the SM has to deal with blocking issues and shield the team from the rest of the world. The team lead needs to be equally involved in, and focused on, development to reduce the communication paths effectively.

Consequence: Knowledge becomes isolated to its implementer
Recommended solution: Promote pair design and pair programming, and conduct code reviews if necessary. In addition, high test coverage as well as simple design will help prevent accidental breakage of the existing implementation by a modifier who lacks the appropriate knowledge.

In summary, SCRUM is much more manageable, at least in the traditional sense, compared to XP. Add the fact that in SCRUM all the traditional managerial positions and paychecks can easily be kept on as chickens (observers/contributors), avoiding the pain of converting people from architect to developer, and that's why it remains, so far, the favorite choice for any project wishing to transition from a more traditional waterfall model to a relatively more agile approach.


Huh... this turned out to be another lengthy post; I hope you found the reading not too bumpy and the information useful.

Sunday, August 06, 2006

Thoughts after reading “Componentization: The Visitor Example” by Bertrand Meyer and Karine Arnout

This article was published on page 23 of the July 2006 issue of IEEE Computer magazine. My impression after reading it is that the authors went on and on about how componentization is so much better than design patterns, and that therefore, wherever possible, we should somehow componentize all design patterns at every possible level. They argued that a pattern “only provides the description of a solution but not the solution itself” and that “Patterns such as Visitor or Bridge is a good idea carried halfway through. If it is that good, we argue, why should we ever use it again just as a design guideline? Someone should have turned it into an off-the-shelf component, a process we call componentization”.
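For readers who have not met the pattern under discussion, here is a minimal Visitor sketch in Java (the shape classes are my own illustration, not taken from the article). Note that it is a design-level recipe that each project re-implements by hand, which is precisely the property the authors object to:

```java
// Minimal, illustrative Visitor: a design-level recipe, re-implemented
// per project rather than shipped as an off-the-shelf component.
interface Shape {
    <R> R accept(ShapeVisitor<R> v);
}

class Circle implements Shape {
    final double radius;
    Circle(double r) { radius = r; }
    public <R> R accept(ShapeVisitor<R> v) { return v.visit(this); }
}

class Square implements Shape {
    final double side;
    Square(double s) { side = s; }
    public <R> R accept(ShapeVisitor<R> v) { return v.visit(this); }
}

// New operations are added as visitors, without touching the Shape classes.
interface ShapeVisitor<R> {
    R visit(Circle c);
    R visit(Square s);
}

class AreaVisitor implements ShapeVisitor<Double> {
    public Double visit(Circle c) { return Math.PI * c.radius * c.radius; }
    public Double visit(Square s) { return s.side * s.side; }
}

public class VisitorDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.0), new Square(2.0) };
        AreaVisitor area = new AreaVisitor();
        for (Shape s : shapes) {
            System.out.println(s.accept(area)); // area of each shape
        }
    }
}
```

The double-dispatch plumbing (accept/visit) is what every team writes again from scratch; what the pattern gives you is the shape of this code, not the code itself.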

Despite the enthusiasm the authors have for their theory and product, the conclusion and the reasoning behind it are rather naive. First of all, they used a very generalized, high-level definition of design pattern to start their argument: “A design pattern is an architectural solution to a frequently encountered software design situation”. At first glance, this definition makes a design pattern seem similar to a component, which Wikipedia defines as “A system element offering a predefined service and able to communicate with other components.” But if you take a closer look, you will see the key difference: although both aim at reusability, a design pattern is created to address design and communication issues, rather than being an off-the-shelf solution that addresses issues in a problem domain, which is what a component is designed to do. Now you can see that their first argument, that a design pattern only describes a solution but is not the solution itself, is not a weakness of design patterns; on the contrary, it is the very nature of design patterns. On this weakly established basis, the authors then use examples to try to convince the reader how much better components could be, which inevitably leads them to another mistake: they picked some classic design patterns, such as Visitor and Bridge, to serve as their examples. If before they were merely comparing an orange with an apple (at least those are similar in size, and both are fruit), now they are comparing a building with bricks, since they are no longer comparing componentization with some conceptualized definition of design pattern, but with concrete design patterns in the object-oriented realm. If you are familiar with the evolution of software design, you will recognize the different levels of encapsulation/abstraction that exist in this domain:

  • Level 0 – No encapsulation
  • Level 1 – Macro or function level encapsulation
  • Level 2 – Class level encapsulation (usually referred to as the Object Oriented design philosophy)
  • Level 3 – Component level encapsulation
  • Level 4 – Service level encapsulation

As you can see, design patterns and components live on different planes of encapsulation/abstraction, so comparing them does not make any sense; it is like comparing a service with a component. By the same argument, a component is just the description of a part of a solution (from the service point of view), not the solution itself, and therefore not good enough; so why bother creating components at all, when we should just focus on making ready, plug-and-play services, a process we might call “servicization”.


To summarize: I am not trying to say that design patterns are perfect and that we should use them at all costs. On the contrary, I agree with some of the harshest criticism out there against design patterns (you can have a peek at it here), but componentization is definitely not the solution to these problems.