“We need to get better at estimating,” an experienced member of a Scrum development team once told me. “Management is getting concerned that we keep coming up short on our commitment.”
“Really?” I responded. “What have you been committing to?”
“Thirty story points,” she said. “We get there about 50% of the time. In a couple of recent sprints, we’ve even exceeded thirty points, but the last sprint marked the third time this quarter that we fell short of our target.”
“Why was your forecast off target, do you think?” I asked.
“Well, things come out of left field occasionally. You know, stuff that couldn’t be anticipated in sprint planning,” she answered.
“So, why are you estimating effort if you can’t predict what will happen in the sprint?” I said.
Now, at this point I want to make clear that I am not one of those who say that development teams should not estimate effort. For me, the ability to estimate independently is an important part of the autonomy of teams. What I was trying to do in this particular conversation was to get the team member to consider why teams estimate in the first place, and what “commitment” means in that context.
It is not just that those doing the work are the only ones able to estimate it well, although I believe that to be true. I recently participated in a large multi-team project in which effort was estimated by a centralized systems analysis unit. The development teams were involved in estimating, but it was the systems analysts’ forecast that was communicated to the customer. The result was massively inflated customer expectations, and retrospection revealed that the teams’ estimates were four to five times larger, and thus far more accurate, than those of the analysts.
Why Estimate Effort?
So, development teams make the best and most accurate estimates. That’s all well and good, but it still doesn’t answer the question of why estimates are necessary in the first place.
Scrum is a “pull” system in that it balances demand with the team’s capacity to do work by giving the team control over how much work is brought into the sprint. No one can tell a development team how to do its work, or how much work it must do. Effort is estimated – in story points, in the number of stories or in person-hours – so that a judgement can be made about how much work can be pulled in from the product backlog in order to meet the sprint goal.
When either team members or their managers start to fret about the team’s commitment in terms of its target velocity, it is a sure sign of old muscle memory kicking in. After all, the velocity is a forecast, not a commitment. The team’s commitment is not to its velocity but to the goal that was agreed upon in the sprint planning. And, in any agile environment, commitment means that, given the information we have right now, the resources we have at our disposal and our judgement about our capacity to do work, we think we can get to a certain point by the end of the sprint, and we are going to do everything in our power to get there.
Therefore, a commitment can never be equated to a guarantee. It is akin to the moment a football team walks onto the pitch: the team is committed to winning the game, but there are too many variables in play (not the least of which is the opposing team!) for victory to be guaranteed from the start.
When managers ask me questions about the target velocity of the teams, my first response is to ask them why it should be any concern of theirs. They typically mumble a reply about making sure the team is working to plan, or being efficient. A brief chat normally follows in which I point out that management’s concern should be with business outcomes and with making sure the teams have the environments and tools they need to deliver them, not with the team’s tasks.
Target Velocity and the Sprint Goal
So, at this point we’ve established that target velocity is a concern of the development team and no one else (OK, the product owner needs to know about velocities so he or she can make trade-offs in the ordering of the product backlog, but that’s another story entirely). Estimates simply need to be accurate enough for the team to confidently identify the items at the top of the product backlog that can be worked on to meet the sprint goal.
A good product owner will give the developers extra wriggle room by articulating the sprint goal to customers and stakeholders independently of the product backlog items (PBIs) or the stories that might constitute it. By giving the upcoming increment a named state, or by describing the expected value in a few sentences, the product owner can work with the team if a mid-sprint descoping of the forecast is needed, without compromising the sprint goal.
All of this means that while the team’s estimates do not demand the same precision as task estimation in waterfall-based planning, they do still need to be accurate. Actual velocities that go up and down like a roller coaster help no one, and some level of predictability is needed. I’ve seen teams agonize over what a 13-point story should look like, and have even been shown handbooks in which reference PBIs are described for each number in the Fibonacci series. In my experience, this type of overly complicated approach never works.
“Smaller” is More Predictable
There is only one way to improve the estimate of effort for PBIs: breaking them down into smaller PBIs. Think for a moment about the values most often used in planning poker, for example: 1, 2, 3, 5, 8, 13, 20, 40 and 100. They loosely follow the Fibonacci sequence, in which each number is the sum of the previous two.
This raises the question: why “20” instead of “21”? Simply put, because “21” would be too precise. In effect, we are saying that anything more than 13 can be considered large, and will probably need to be decomposed.
The larger numbers in the sequence and, more specifically, the larger gaps between them, reflect a corresponding level of uncertainty. Suppose the team thinks a PBI is bigger than an 8, but is probably not big enough to be a 13. In that case, it might “precisely” be a 9, 10, 11 or 12. However, they can only categorize the PBI as either an 8 or a 13. Essentially, the larger a PBI is, the greater its possible variation from its predicted effort. On the other hand, if all the stories are, say, a 5 or smaller, then it is only when a PBI sits between a 3 and a 5 that you begin to see significant variance.
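The effect of those widening gaps can be sketched in a few lines of Python. The scale values come from the article; the rounding function and its names are purely illustrative, a sketch of how snapping a “true” effort to the nearest scale value produces ever larger worst-case errors as items get bigger:

```python
# Illustrative sketch only: how the gaps in a planning-poker scale
# translate into estimation uncertainty. The scale values are from
# the article; the function names are hypothetical.

POKER_SCALE = [1, 2, 3, 5, 8, 13, 20, 40, 100]

def poker_round(effort: float) -> int:
    """Snap a 'true' effort value to the nearest value on the scale."""
    return min(POKER_SCALE, key=lambda v: abs(v - effort))

def max_rounding_error(index: int) -> float:
    """Worst-case absolute error for an item landing between
    scale value `index` and the next one: half the gap."""
    return (POKER_SCALE[index + 1] - POKER_SCALE[index]) / 2

# An item between an 8 and a 13 can be off by at most 2.5 points,
# while one between 40 and 100 can be off by as much as 30 points.
for i in range(len(POKER_SCALE) - 1):
    lo, hi = POKER_SCALE[i], POKER_SCALE[i + 1]
    print(f"between {lo:>3} and {hi:>3}: "
          f"worst-case error {max_rounding_error(i):>4.1f} points")
```

Running this shows the worst-case error climbing from 0.5 points at the small end of the scale to 30 points at the large end, which is exactly why decomposing large PBIs makes the sprint forecast more reliable.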
It doesn’t really matter whether the team uses stories and story points or not, as long as the underlying goal is the same: to break the PBIs down to the smallest workable size and fit a number of them into the sprint. In my view, this is the only way a team can reliably improve the accuracy of its estimates.