Probability, Velocity and Re-Estimating User Stories

When working with user stories, story points and SCRUM, it is not uncommon to come to the end of a sprint and realize that a smaller story took longer to complete than a larger story. While this is an excellent reminder that story points reflect complexity, not effort, it is often misinterpreted as an indication that remaining stories need to be re-estimated. Basic probability distribution shows us that over a large enough sample, all three point stories in the product backlog will take about the same time to complete. A team’s desire to re-estimate stories in the product backlog based on a few stories deviating from the norm is a mistake.

Not All Stories are Created Equal

If you were to roll a pair of dice a million times, you would see the majority rolls sum up to seven. If you graphed the data, you would see a bell curve shape. User stories are no different.

Over a large enough sample, you will start to see the same bell curve shape in the time it took to complete those stories. Some three point stories might take two hours to complete while others take twelve; but most take seven. At the same time, some of your five point stories will take ten hours to complete and some twenty four; but most take eighteen. Consider the overlap in the following diagram:

The overlap between the time to complete some of the three point stories and some of the five point stories should not be mistaken for a need to re-estimate any stories. Sometimes a three point story will take as long to complete as a five point story.

Commitment Driven Planning

One technique from which all SCRUM teams could benefit is that of Commitment Driven Planning. From Cohn’s Agile Estimating and Planning

A commitment-driven approach is an alternative way to plan an iteration. Commitment-driven iteration planning involves many of the same steps as velocity-driven iteration planning. However, rather than creating an iteration plan that uses the yesterday’s weather idea to determine how many story points or ideal days should be planned into the current iteration, the team is asked to add stories to the iteration one by one until they can commit to completing no more.

So what happens when a three point story is tasked out and comes to ten hours, and a five point story is tasked out and comes to nine? Often times the first inclination of the team is to inflate the three point story. Especially interesting is that almost no one suggests to shrink the estimate of the five point story. Again, we should expect that some three point stories will take the same amount of time to complete as some five point stories, but in the end this will even out.

Velocity is The Great Equalizer

Again from Mike Cohn’s Agile Estimating and Planning

What’s happened here is that velocity is the great equalizer. Because the estimate for each feature is made relative to the estimates for other features, it does not matter if our estimates are correct, a little incorrect, or a lot incorrect. What matters is that they are consistent. We cannot simply roll a die and assign that number as the estimate to a feature. However, as long as we are consistent with our estimates, measuring velocity over the first few iterations will allow us to hone in on a reliable schedule.

As the team complete more and more sprints, developers will come across three point stories that are a little larger than average and five point stories that are a little smaller than average. Between our bell curve distributions and velocity acting as the great equalizer, over time, the bumps in the estimations will smooth out and the desire to re-estimate stories will fade.

Do You Make This Common Mistake When Estimating?

There’s a common mistake that many software developers make when estimating projects. Here’s how you can avoid falling into this trap.

When estimating a project using the Planning Poker method, many developers like to use a baseline estimate for a given task. For example, many developers use CRUD, the creating, displaying, editing and deleting of a Model as their baseline estimate. Once they’ve got their baseline in mind, it makes it easier to estimate other stories that are more domain specific, or so it seems.

Baseline Estimates are Broken

When you’re using a task like CRUD as a baseline for your estimations, you can easily skew the estimations of the other stories in the project. Let’s say we’re using a 3 point baseline for CRUD stories.

  1. As a user I should be able to upload a profile photo – 2
  2. As a user I should be able to CRUD movies I’ve seen – 3
  3. As a user I should be able to send a private message to another user – 5
  4. As a user I should be able to create a trivia quiz for a movie that I’ve seen – 8

In this first example, the stories are probably estimated fairly well and compared to each other, the complexity is quite relative. What happens if we add a few more easy stories or a few more difficult stories?

  1. As a user I should be able to see a contact e-mail on the home page – 1
  2. As a user I should be able download the menu as a PDF - 1
  3. As a user I should be able to read a privacy policy – 2
  4. As a user I should be able to CRUD restaurant reviews – 3

With easier stories added, the CRUD story is definitely the most complex, but compared to the others it is significantly more complex than seeing a link on a page or viewing some text.

  1. As a user I should be able to CRUD portfolio photos – 3
  2. As a user I should be able to signup for a paid recurring account – 8
  3. As a user I should be able to make connections with other users via Facebook – 13
  4. As a user I should be able to send a rocket to the moon – !

When the other stories become much more complex, the CRUD task is again shown to be significantly different in complexity, this time in the other direction. In this group of stories its hard to imagine that managing portfolio photos is only two orders of magnitude away from a recurring payment e-commerce system.

Relative Complexity Works

Its important to estimate a group of stories so that the complexity of each story is relative to the next. In Mike Cohn’s Agile Estimating and Planning, he describes a method of estimating stories called “Analogy”.

When estimating by analogy, the estimator compares the story being estimated with one or more other stories. If the story is twice the size, it is given an estimate twice as large.

When you apply this technique to your estimation process, you will have a more coherent set of estimates. From this you will be more likely to determine an accurate estimated velocity and you will have a better overall sense for the scope of the project.

Hurdles to Estimating by Analogy

  • Estimating all of your stories one by one makes it difficult to estimate a relative complexity as you only have the previously estimated stories with which to compare your current story. A group of especially simple or complex stories could be waiting towards the end of the sessions.
  • Estimators who are new to the agile estimation process might have difficultly estimating a story without a point of reference.
  • Estimating with a baseline can be a difficult habit to break since it provides a convenience and familiarity to the estimator.
  • Two seemingly similar projects may have a greater than expected difference in total number of story points, even though their velocities are relatively the same.