Thursday, August 29, 2013

Is your burn-up chart holding back your thinking?

This week I'm going to take some shots at something that's actually one of my favorite Agile project metrics - the burn-up chart.

There's a lot to like about burn-up charts.  They're compact, but very information rich.  They convey several key concepts visually and in a way that's easy to understand.  They make it highly intuitive to understand the likely end date, and the uncertainty around that likely date.

So why am I picking on them?  Because (like many things, especially metrics) they can be used badly or in the wrong mindset, and in my experience often are.  Wrongly approached, burn-up charts can be reflective of some problematic thinking that can keep your team from being successful. 

I don't think these problems are the result of bad intention.  Rather, I think they stem from not thinking about how "what the burn-up chart shows" relates to "how we expect a typical Agile project to run."  My goal is to point out some of these "impedance mismatches," and present some ideas on how to make sure you and your burn-up chart are on the same page.

What's a burn-up chart, anyways?  

Simply put, a burn-up chart is a plot over time of work completed, and a plot over time of the "target" amount of work we need to complete to achieve a goal.  By looking at the level and slope of the "completed" line, we can understand when we expect it to intercept the "goal."

The concept of a burn-up chart can be applied to a wide variety of situations where "work" is achieved towards a "goal."  Burn-up charts can be done with many kinds of "work achieved" (story points completed?  stories completed?  defects closed?  tasks complete?), can be plotted on multiple timescales (daily?  hourly?  monthly?), and can be used for a variety of goals (stories burning up for a single iteration?  tasks burning up to complete a given story?  stories burning up for a release?)

For purposes of this article, however, I'm going to restrict myself to the most frequently used type of burn-up chart, which is a "release burn-up."  The "unit of time" for this chart is generally sprints/iterations, and the "unit of work" is generally measured in story points (or whatever the team's estimation unit of choice may be).  The "goal" line represents the total estimated work that's "in scope" for the current release.  The "burn up" line represents the number of story points that are completed each iteration.  It looks something like this:



To make it more useful, we can add a "trend line" for the work completed, which allows us to visualize where the intercept will happen:



Because our velocity isn't completely smooth, we can also extrapolate "worst case" and "best case" velocity estimates, to show us the range of uncertainty around our possible dates:



And there we have it.  It's fairly simple to draw.  And it answers a whole host of potential questions around "how is the team doing" at once.  You can see at a glance the "when do we expect to finish?"  The "range of uncertainty" is reasonably easy to understand. You can clearly see whether the team's velocity is relatively steady or jumping around.
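
The trend-line arithmetic behind that projection is simple enough to sketch.  The Python below is a minimal illustration, not how any particular tool does it: the function names are made up for this example, the velocity history is hypothetical, and using the fastest and slowest observed iterations as the "best case" and "worst case" is just one assumption about how to draw the range.

```python
import math


def iterations_to_finish(scope, completed, velocity):
    """Whole iterations still needed to burn up from `completed` to `scope`."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    remaining = max(scope - completed, 0)
    return math.ceil(remaining / velocity)


def project_finish(scope, velocities):
    """Projected finishing iteration for best, average, and worst observed velocity."""
    done = sum(velocities)
    current_iteration = len(velocities)
    # Assumption: best/worst case are simply the fastest/slowest iterations so far.
    cases = {
        "best": max(velocities),
        "average": done / current_iteration,
        "worst": min(velocities),
    }
    return {
        name: current_iteration + iterations_to_finish(scope, done, velocity)
        for name, velocity in cases.items()
    }


# Hypothetical history: 200 points in scope, three iterations completed so far.
projection = project_finish(200, [18, 22, 20])
print(projection)  # {'best': 10, 'average': 10, 'worst': 11}
```

With only three data points the range is narrow; a longer, bumpier history would spread the best and worst cases further apart.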

You can also answer a number of what-if questions easily - how much would we need to cut if we need to be ready for production by date X?  Draw a vertical line on the date, and you can read from the projected trend lines the best/middle/worst case estimates on how much we can get done, so we know how big the cuts are.  Will we be ready by date Y?  Is it in the range of uncertainty?  If not, probably not without making changes.  If it is, we can see whether it's an aggressive or conservative target.
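
The "draw a vertical line" question can be worked the same way in code.  This is a rough sketch under the same assumptions as a hand-drawn chart: straight-line extrapolation, a hypothetical velocity history, and made-up function names.

```python
def points_done_by(target_iteration, velocities, velocity):
    """Points we expect to have completed by the end of `target_iteration`."""
    done_so_far = sum(velocities)
    iterations_left = max(target_iteration - len(velocities), 0)
    return done_so_far + velocity * iterations_left


def cuts_needed(scope, target_iteration, velocities):
    """Points to cut from `scope` to finish by `target_iteration`, per case."""
    # Assumption: best/worst case are the fastest/slowest iterations observed.
    cases = {
        "best": max(velocities),
        "average": sum(velocities) / len(velocities),
        "worst": min(velocities),
    }
    return {
        name: max(scope - points_done_by(target_iteration, velocities, v), 0)
        for name, v in cases.items()
    }


# Hypothetical: 200 points in scope, three iterations done, must ship after iteration 8.
cuts = cuts_needed(scope=200, target_iteration=8, velocities=[18, 22, 20])
print(cuts)  # {'best': 30, 'average': 40.0, 'worst': 50}
```

A result of zero in every case means the date sits outside the range of uncertainty on the comfortable side; a large gap between best and worst tells you how aggressive the target is.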

So, yeah, it's a pretty good chart.  It's so commonly used that most Agile tracking tools will build a burn-up chart for you.  Here are a few examples:

ThoughtWorks Mingle
Atlassian Jira/GreenHopper
Rally

So, enough about what's good about burn-ups.  This post is supposed to be about what can go wrong with them, so let's get into that.

Burn-ups can block discovery

Let's say we're in the early stages of a project.  We did a quick discovery exercise, and found 200 points worth of user stories.  We did a velocity projection and estimated the team could do about 20 points of work per iteration.  Our first two iterations roughly bore that prediction out.  Our burn-up chart looks something like this:



Great. It looks like we'll be done around iteration 10.

Fast forward to iteration 9.  We should be in the home stretch.  But we're not - if we look at the actual backlog, we see that there are still 3 more iterations of work!  What happened?  Let's look at the burn-up chart again:



Ah.  The scope moved over time, as new stories were added to the project.  This means we have more work, and so we need more time to finish.

The two words most project managers (especially "traditional" project managers) are probably thinking about right now are "scope creep."  Clearly, the team should have "held the line" on new scope - if we wanted to add something new, we should have taken something out.  This project was "mismanaged."

And, in my opinion, if that's what they think, they're probably wrong.  What I described in the last paragraph is completely non-Agile thinking.

The notion that all the work "ought to be known" at the start of a project comes from "traditional" waterfall thinking, where we have an extensive "planning" phase up front that's supposed to uncover every requirement and task in the project before we get started.  Deviations from that plan are expected to be infrequent, and must be managed via a strong "change control" process to keep from materially impacting the schedule unnecessarily.

In Agile projects, we do away with the long, drawn-out planning phase that's expected to find everything we're going to need to do.  And in doing so, we need to do away with the assumption that all the work we need to do to accomplish the goal is known before we start.

While we might do a discovery phase up front, we should EXPECT that phase is imperfect and will miss a few things that need to be done.  Because we're collaborating with customers, we should EXPECT some amount of re-work or re-imagining of features will happen over the course of the project.  Those changes aren't "scope creep" - they're the norm of an Agile project.

Expecting that the team should "hold the line" on the scope we knew about at the start of the project, and that any "discovered" stories (either from customer collaboration or new discovery) means we need to remove something else to make room for it, is following a plan over responding to change.

So, what's the solution?

The simplest thing you can do to address this issue is start with the assumption that your scope line will be UPWARD SLOPING, not flat.  We don't know exactly which stories we'll uncover, but we KNOW we'll find some.  Assuming our scope line will be flat unless "something unexpected happens" sets the wrong expectations, both for the team and the customer.

The exact slope of the line will vary depending on the project, and factors like how unknown the problem space is, how expert the team is in the domain, how many different stakeholders we're trying to satisfy, etc.  I find a good rule of thumb to be that a "typical" team will spend about 20% of its time working on "newly discovered stuff," leaving 80% for the "known at the start" work.  So my sample team with a velocity of 20 should expect to discover about 4 points of stories per iteration.
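
To see what that slope does to the projected end date, here's a minimal sketch using the numbers from this example (200 points of initial scope, a velocity of 20, and 4 points of discovery per iteration).  The function name is hypothetical, and the model assumes discovery continues at a constant rate from the first iteration to the last, which real projects won't follow exactly.

```python
def crossing_iteration(initial_scope, velocity, discovery_per_iteration=0.0):
    """Iteration at which completed work catches a linearly growing scope line."""
    # Each iteration closes `velocity` points while scope grows by the discovery
    # rate, so the gap between the two lines shrinks by the difference.
    net_burn = velocity - discovery_per_iteration
    if net_burn <= 0:
        raise ValueError("scope grows as fast as work completes; the lines never cross")
    return initial_scope / net_burn


velocity = 20
discovery = 0.20 * velocity  # the 20% rule of thumb: 4 points per iteration

flat = crossing_iteration(200, velocity)
sloped = crossing_iteration(200, velocity, discovery)
print(flat, sloped)  # 10.0 12.5
```

Under this simple linear model, the sloped scope line pushes the crossing point out by roughly two and a half iterations compared with the flat one.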

Some people may be howling that allowing this discovery factor is an open invitation to scope creep.  I don't think it is.  Just assuming the line to be upward sloping isn't the same as saying we'll stop thinking about newly discovered stories and just accept everything into the project.  Regardless of how we build the metrics, we need to have regular conversations with our product owner and only accept changes we genuinely want to deliver.

Also, assuming a "discovery rate" can make this a somewhat scientific process.  Assigning a "budget" for discovery helps us have productive conversations early in the project.  We tell our product owner their "budget" for story discovery is 4 points per iteration.  We then track it - in iteration 1, we added 6 points.  In iteration 2, we added 3 points.  In iteration 3, we added 2 points.  Great - we're on track.  Or, if we're going into iteration 4 and we've been adding 8 points an iteration, not 4, we have to have a conversation.  Was our "discovery" expectation too conservative, and we need to change it (and so change our expected end date)?  Or is it that we had a major correction early in the project, and we don't expect it to recur?  The point here is that by tracking our discovery against a planned "budget," we can validate our assumptions.  We can even do it right in the burn-up chart:



In this example, the green line is how we expect scope to grow over time with discovery, so we should expect to be finished in Iteration 12 (and not Iteration 10, where we'd have projected to finish if we "held the line").  We can see from the trend in the Current Scope line that we're upward sloping, but roughly sticking to our discovery "budget."
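
If you'd rather check the budget with numbers than by eye, the bookkeeping is a running total.  This is a small sketch with a hypothetical function name; the inputs are the per-iteration additions and the 4-point budget used in the example above.

```python
def discovery_report(added_per_iteration, budget_per_iteration):
    """Rows of (iteration, added, cumulative added, cumulative budget, variance)."""
    rows = []
    cumulative = 0
    for iteration, added in enumerate(added_per_iteration, start=1):
        cumulative += added
        allowed = budget_per_iteration * iteration
        # Positive variance means we've discovered more than we budgeted for.
        rows.append((iteration, added, cumulative, allowed, cumulative - allowed))
    return rows


# Additions of 6, 3, and 2 points against a budget of 4 per iteration.
on_track = discovery_report([6, 3, 2], budget_per_iteration=4)
# Adding 8 points every iteration against the same budget.
over_budget = discovery_report([8, 8, 8], budget_per_iteration=4)

for row in on_track:
    print(row)
```

In the first case the team ends one point under its cumulative budget of 12; in the second it's 12 points over, which is the cue for the "was our assumption wrong?" conversation.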

One problem with this approach is that (to my knowledge) there aren't any automated tools that make it easy to account for an "expected discovery rate" - the burn-up targets tend to be horizontal lines (if anyone knows of a tool that DOES make this easy, would love to hear about it in the comments).  So if you want to track this, you may need to "roll your own" chart to do so. 

Burn-ups can cause complacence

Having talked about burn-up charts leading to thinking that potentially caused us to exclude things we ought to do, I want to talk about the opposite problem - burn-ups causing us to do work we don't need to do.

Again, by design, the initial "discovery" for Agile projects trades complete accuracy for speed - we're willing to tolerate some false positives/negatives in exchange for being able to get started writing useful code faster.   This means we'll miss some things, but it also means we'll potentially pick up some stuff we think is high value at the time that we'll later learn isn't actually necessary.

Most teams are pretty good at reviewing stories we recently discovered and vetting them for "is this really part of the project?"  But teams can be less careful about reviewing the things that are already "on the plan" and "in scope" for the project.

The burn-up chart can contribute to this lack of introspection.  If we had 100 points of work at the start of the project, and we're projecting to finish all 100 points within the expected timeframe people are asking for, then we'll just keep churning through those 100 points.  They're part of the "baseline" (which on a burn-up chart is literally a line).  The fact that we have a scope line with a fixed value as the "target" mentally causes us to treat that amount of work differently from "new" work, because we've already baked it into our metrics.


A good team should (and will) regularly review and "prune" the backlog - just as we expect to discover new work over time, we also expect that we will discover existing work isn't necessary.  We should remove those items, even if our burn-up chart "looks good" for getting the work done.

Some of you may be noticing that in the previous section, I argued the scope should be expected to grow over time as we learn things, and here I'm arguing some of the initial scope should be expected to be removed over time.  Can't we just assume these effects "cancel out" and go back to the "flat" projected scope line?

While the two effects do partially cancel each other out, the magnitudes aren't necessarily the same.  In my experience, the "20% growth" assumption is what I've seen as the "net" effect - we discover new work faster than we remove work.

Again, the goal is to be deliberate about scope changes.  The team should be diligent on removing unneeded items, just as they should be diligent about only accepting necessary change.  There's nothing "special" about the work that was "in the baseline" - unnecessary work is unnecessary, whether it's "already in the metrics" or not. 

Burn-ups can encourage infrequent delivery

There are times when the right way to do a project is to spend several months building a cohesive product, and then release it together.  Sometimes you're building a brand new application that doesn't deliver value until it can support certain likely usage scenarios end-to-end.  Sometimes we're doing a major restructuring of a module that needs to change all at once.  Sometimes external circumstances around how we deliver necessitate a single large release (e.g. something that needs to deploy in parallel with a third party's application that we integrate with).

But these situations do NOT apply to all projects.  On many projects (probably MOST projects) we don't need to "build up" 6-12 months of work to have something of value to release to the marketplace.  And if we don't need to wait to do a big release, we probably shouldn't - one of the key goals of Agile is to deliver value to production rapidly, and begin to capture that value quickly.

Having valuable code that could be making us money "waiting" on a release is throwing money away.  We could be getting value from it, but we're not, because it's "part of the release" that happens six months from now. 

The continuous delivery movement has a lot to say on techniques/processes around the "how" of getting code to production quickly.  But I'm talking about metrics today, so I won't go into those.

The reason I'm seeing a potential problem with burn-up charts is that, once you're familiar with burn-up charts and grow to like them, you may start structuring your projects in ways that produce a good burn-up chart, rather than alternate structures that don't work well with burn-ups but may deliver value more quickly.

The key insight you can get from a burn-up chart is a projection of when a given scope of work is likely to be complete in the (relatively distant) future.  If your burn-up chart tracks points once per iteration, it's close to worthless for giving us meaningful insight on how long it will take to complete a scope of work that's only an iteration-or-two worth of effort.

You can address some of this by plotting points more frequently (you might plot your "work complete" in days rather than iterations).  But in some cases, that's fitting the data to make the chart work, as opposed to thinking the chart is a metric that gives us useful insight that we can make actionable decisions on.  The real question is whether thinking of our project as relatively slowly "burning up" to a large goal is the right way to think about our project at all.

I see this thinking trap frequently on "new to Agile" teams.  They are coming from a world where a nine month project is considered "short" and the executive team is used to looking at a Gantt chart.  When they switch to Agile, they have slightly shorter projects (say, four months), but they don't have a Gantt chart anymore.  But we can show the execs a burn-up chart, and explain it to them, and we can make them relatively satisfied that they still have good insight into how the project is going.  But then the burn-up becomes a self-fulfilling obligation - we need to have burn-up charts because management expects burn-up charts.  And, subtly, we become mired in our thinking that the "right way" to think about projects is by burning up over time to a goal.

So, what do we do about this?  The first thing is to consider whether, like any other metric, a burn-up chart is actually appropriate to your project.  What is it going to tell you?  Does it fit the way we're structuring the project to deliver value?  Let's be OK having the conversation that a burn-up chart isn't appropriate to our project.

One of the most productive teams I was ever a part of didn't bother with burn-up charts.  We'd simply agree "once these 4 stories are done, we can go to production with them," and make that happen - we'd deploy every week or two.

Burn-ups can cause missed expectations

Burn-up charts are great at predicting the "development complete" date for a given project.  Ideally, that's also the date we could "go live" with the new project - just click the button at the end of the last iteration and go.

In the real world, this is rarely the case.  First (as discussed in the last section), if you're a project that has a significant burn-up chart, you probably aren't sufficiently comfortable with your deployment process that you could just "push the button and go." 

There may be a genuine need for other processes to take place between "we're done writing code" and "we're able to go to production."  Maybe your application requires regulatory review and approval.  Maybe your process is to have a beta testing phase for final feedback before you flip the switch in production.  Maybe you need to coordinate the launch of the product with a marketing campaign.  Maybe you need to bring a swarm of tech writers in to put together the final "user manual" after the app is done.  Maybe you have a major outage that has to happen for the move to production to do a data migration into the new system.

Some of these activities might be able to be done "as you go" during your development iterations, but probably not all of them.  Which means there are likely activities that take place AFTER your development is complete, but BEFORE you're in production (and delivering actual business value).

By design, a burn-up chart doesn't show these things.  What it shows is how we're progressing towards the "goal," where the goal is generally "we have completed all the stories."

The way we get an expectation miss is if we're reporting our burn-up charts out to a wider audience, and they're taking back the message of "the development and testing for features will be complete at the end of iteration 9 on Aug 1st," and conflating that with "we will be in production Aug 1st."  If that's not the case, you need to be careful about the messaging around your burn-up.  Which probably means "don't communicate ONLY the burn-up chart."

Some teams "account for" this by putting in placeholder stories and assigning them to "iterations" near the end of the project, so this non-development work will show in the burn-up.  I think this is often problematic, and is twisting the process to fit the chart.  Assigning "points" to "do the marketing for the release" is implicitly claiming the marketing activities are measurable on the same scale as the development work, and will have the same velocity.  But if the development team moves faster than we expect, the marketing doesn't magically take less time, and we'd be in trouble if our chart assumed we would. 

While I'm not a fan of them for many purposes around tracking software delivery, the much maligned Gantt chart can actually be your friend here - tracking dependencies and "cascading" dates appropriately is exactly what a Gantt chart is good at.  If you're putting together a "showcase" for your metrics, a slide with a name like "From Development Complete to In Production" showing the dependencies and timeline starting with your current best "development complete" date and ending with the date you're live in production can help everyone understand what those activities are, how long they'll take, and when the project is actually going to be done.
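
The "cascading" a Gantt chart does is just date arithmetic over dependencies.  Here's a rough sketch of it: the activity names and durations are invented for illustration, durations are in calendar days (not points), and there's no cycle detection, so it assumes the dependencies are well-formed.

```python
from datetime import date, timedelta


def schedule(dev_complete, activities):
    """Finish date of every activity, cascading from the development-complete date.

    `activities` maps a name to (duration_in_days, [names it depends on]).
    "dev_complete" is the implicit starting point everything hangs from.
    """
    finish = {"dev_complete": dev_complete}

    def resolve(name):
        if name not in finish:
            duration, depends_on = activities[name]
            # An activity starts once its latest dependency has finished.
            start = max(resolve(dep) for dep in depends_on)
            finish[name] = start + timedelta(days=duration)
        return finish[name]

    for name in activities:
        resolve(name)
    return finish


# Hypothetical post-development activities and durations.
activities = {
    "beta_test": (14, ["dev_complete"]),
    "user_manual": (10, ["dev_complete"]),
    "data_migration": (2, ["beta_test"]),
    "go_live": (0, ["data_migration", "user_manual"]),
}

plan = schedule(date(2013, 8, 1), activities)
early = schedule(date(2013, 7, 25), activities)
print(plan["go_live"], early["go_live"])  # 2013-08-17 2013-08-10
```

Notice that finishing development a week early moves go-live a week earlier, but the 16-day tail doesn't shrink.  That's the behavior a burn-up with placeholder stories gets wrong.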

Wrapping up

As I said at the beginning, I generally like burn-up charts.  They're incredibly useful tools, and for certain kinds of projects are an ideal way to communicate progress succinctly.

That said, I hope I've at least raised your awareness that (like all tools) burn-ups aren't perfect.  If we think about them the wrong way we can drive undesirable behavior, set incorrect expectations, or miss opportunities.  

Thursday, August 22, 2013

As a User, I want to be represented in User Stories

I'm a big fan of user stories for tracking upcoming work.  They keep us focused on the problem, not the solution.  They're good at reminding us why each chunk of functionality is valuable.  Done well, they're easy to prioritize and plan with.

But one of the big issues I have with them is that it's really easy to lose the "user" in the "user story."  I've seen way too many projects with several dozen stories that all begin "As a user, I want...."  The placeholder "user," used over and over again, is usually indicative that we've stopped thinking about the actual users at all, and we're just putting it in for the sake of form.

Which is a shame, because thinking about who we're delivering value for is really important for us to write good stories.  So I'm going to talk about a few ways to break out of the rut of constant "as a user..." stories.

Users are not actors

One mistake I see frequently, especially with people transitioning from a use case heavy background to user stories, is to mistake the "As a..." clause of a user story as a place to identify an actor - the individual who is PERFORMING an interaction with a system.  But user stories are much more about stakeholders - the individual who gets VALUE from a feature. 

In many cases, these are the same, but they need not be.  Even though a "user" of the system might be the one interacting with it, something is happening that delivers value for someone else.  That's the person who "wants" the feature we're providing.  (By the way, I think the term "user story" is a slight misnomer for this reason). 

For example, let's say we're building an e-commerce portal, and we have a prospective feature to validate that a credit card number is valid (e.g. by running a checksum) on the client before we submit the credit card transaction for approval to the credit card processor.  This is a useful thing to do because many credit card processors charge by the request.  If we can figure out on our own that the credit card number is bad BEFORE we submit the request, we save money - we don't need to incur the expense of a transaction to tell us the credit card didn't go through.
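
For the curious, the checksum used on payment card numbers is the Luhn algorithm.  Here's a minimal sketch of it; I'm assuming Luhn is the check this feature would use, the function name is made up, and the 12-19 digit length range is an assumption for illustration.  A real client-side version would be written in JavaScript in the browser; Python is used here only to show the logic.

```python
def luhn_valid(card_number):
    """Return True if `card_number` passes the Luhn checksum."""
    cleaned = card_number.replace(" ", "").replace("-", "")
    # Assumed length range; reject anything that isn't plain ASCII digits.
    if not (cleaned.isascii() and cleaned.isdigit() and 12 <= len(cleaned) <= 19):
        return False

    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True  (a widely used test number)
print(luhn_valid("4111 1111 1111 1112"))  # False (last digit miskeyed)
```

The check only catches malformed numbers such as typos.  It says nothing about whether the card exists or has funds, so the processor call is still needed for numbers that pass.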

I could write a story for this like "As a user, I want to know that I entered a valid credit card number before I'm allowed to submit my order, so that I can fix the card number if it's wrong."

The problem here is that, while the "user" (a customer in this case) is indeed the one who entered the number, they probably couldn't care less about having an intermediate check between "I clicked submit" and "the transaction was sent to the processor."  Heck, they might not even notice it.  From the "As a user" perspective, this story is close to valueless.

However, the feature is NOT valueless.  Given miskeyed credit card numbers are a major cause of credit card transaction failures, doing an internal check here can cut down a lot of our "waste" of paying for failed credit card transactions.  Let's re-write the story from the perspective of the person who actually gets value from this feature.

"As the accounts payable manager, I want credit card numbers pre-checked for validity before they're sent to the credit card processor, so that I don't pay transaction fees for transactions I know will fail."

Suddenly, we've transformed the same feature from a "why would they care?" into a much more compelling story that clearly relates to business value.  And it's a lot easier to prioritize this against other stories.  If we only have 10 failed credit card transactions per day, this might still not be high value.  If we have 10,000 failed transactions a day, this might pay for itself in a week.
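
The prioritization math here is back-of-the-envelope stuff.  In the sketch below, the per-request fee and the cost of building the feature are purely hypothetical numbers picked for illustration, and it assumes every failed transaction counted is one the pre-check would have caught.

```python
def payback_days(build_cost_cents, preventable_failures_per_day, fee_cents):
    """Days until avoided transaction fees cover the cost of building the check."""
    daily_savings_cents = preventable_failures_per_day * fee_cents
    if daily_savings_cents <= 0:
        raise ValueError("no savings, so the feature never pays for itself")
    return build_cost_cents / daily_savings_cents


# Hypothetical inputs: a $5,000 build cost and a $0.10 fee per processor request.
BUILD_COST_CENTS = 500_000
FEE_CENTS = 10

low_volume = payback_days(BUILD_COST_CENTS, 10, FEE_CENTS)
high_volume = payback_days(BUILD_COST_CENTS, 10_000, FEE_CENTS)
print(low_volume, high_volume)  # 5000.0 5.0
```

With these made-up numbers, the low-volume site waits well over a decade to break even while the high-volume site gets there in under a week.  Same feature, very different priority.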

"Accounts payable manager" might not be a "role" in the system.  Heck, the accounts payable team might never use our system directly to do their jobs.  And they're certainly not the ones who are actually entering the credit card numbers. But they're the ones who care about this feature - they're the ones paying the cost of not having it, and the ones who benefit from it being in place.  Their perspective is the one that's important when evaluating this story.

A good rule of thumb when putting together the "As a..." clause of a user story is to think "If we took this feature out of scope, who would be first in line to pay me to put it back?"

Users are not system roles

Another way we can lose track of who we're delivering value for is to simply think of individuals in terms of what their "system role" is.  We could think about every feature targeted at someone in a "user" level role as "As a user...", and have the features that are only for administrators written as "as an administrator."

This makes some sense from a technical perspective - we're delivering a feature that's intended for everyone whose role in the system is "user."  And since we're not intending to restrict the feature to only a subset of users, everyone who's a "user" gets the benefit.

The problem with this approach is that it treats everyone in a user role as being identical.  That's rarely the case.  For example, think of an online auction site.  Everyone who has an account (other than the admins) is a "user" - none are different from a perspective of what they have the ability to do.  Anyone can search items, bid on an item, sell items.

But think of the richness that's obfuscated if we just think of everyone as "users."  There are lots of subcategories of users that have different perspectives, and different needs.  You have power sellers - factory outlets selling many copies of the same item in bulk.  You have high-dollar-item sellers, who need lush descriptions and photos to make the item look good.  You have online yard sellers, trying to get rid of a few things as cheaply as possible to clean out their basement.  You have targeted buyers who want one and only one thing, and want to find it fast.  You have browsers who want to compare multiple items of the same general category (such as laptops from 10 different sellers).  You even have your day traders, who buy and sell the same items frequently to try and arbitrage short-term market fluctuations.

Each of these "users" might have access to the same tools, but thinking about which one wants a feature can help bring clarity.

"As a user, I want to create a template of an item that I want to auction, so that I can later list that item quickly" vs. "As a power seller, I want to create a template listing of an item I have in stock, so that I can create multiple individual auctions quickly."

With the first story, we might wonder why the templating is useful - couldn't we just create the listing directly?  The rationale is much more clear when we think about the context of a power seller - the template helps because they can create the "same" auction multiple times.  And thinking through that lens clarifies how this needs to work - I clearly need to be able to re-use the template.  I may need to be able to change specifics with each new listing (like a serial number).  Also, we can probably assess the priority better- how many of our users actually fit the "power seller" profile?  Is it worth building this feature primarily for their use?

Users are people

Your users are not faceless anonymous masses.  They are (for the most part) human beings.  They are people trying to live real lives.  In many cases, their interaction with whatever you're building is a small part of their busy life.  Often, the thing they're trying to use your system to accomplish is a means to an end (I'm buying this book online so I'll have something to read when I go on my beach vacation).

The problem here, of course, is that every user is different.  Trying to write user stories that really capture each user's individual goals, nuance, and perspective is almost impossibly hard for any system that has more than a handful of users.

One technique to try and capture some of this "real user" feel without being able to model real users is to use personas.  For those unfamiliar, a "persona" is a somewhat fictionalized biographical sketch of a "typical" user, with more realistic goals.  While it's possible to go overboard with them, using personas can bring valuable insight into our implementation of a user story.

Many projects create user personas as part of an inception/exploration phase at the start of the project.  It can be very powerful to map stories back to personas.  If there's no persona that clearly would use a feature, that's telling us something.  Are we missing a segment of our expected userbase in our personas?  Or are we devising a feature that no one we think will use the product actually wants? 

Let's enhance our user story from the last section by tying it to a persona.  Back when we started the project, we anticipated we'd have power sellers, so we built a persona for them.

Meet Linda.  Linda runs her own distressed asset disposal business out of her garage.  She has an IT inventory management background, and has contacts with a number of local and regional technology distributors.  Linda spends all morning calling around looking for large cancelled orders, unopened returns, and other unwanted tech in bulk.  She buys pallet quantities for cheap prices, and sells them off online.  The margins aren't great, and she has limited space, so her business is all about speed and volume.  She opens dozens of auctions a day, and the less time she needs to spend setting up and maintaining her auctions, the more time she can spend finding great deals.

Now let's try our user story again, this time with the persona as the user.  "As Linda, I want to create a template listing of an item I have in stock, so that I can create multiple individual auctions quickly."

We can almost picture this in our mind.  We can see Linda sitting in her dusty garage with a laptop, next to a shrink-wrapped pile of LCD monitors trying to set up her auction template.  Hey - I bet she wants to be able to pull down the manufacturer's stock description into her auction template, rather than type out a description herself.  Hey - I bet she wants some way to track how many of these she has.  Hey - I bet when she creates auctions, she's going to want to auto-relist items that don't sell.  Hey - you know what would be cool?  If she could just scan the barcode on the manufacturer's box and pull all the specs up automatically....

Some of these thoughts are probably different user stories - we don't want to suddenly drag the mechanism to create and auto-relist auctions into a story to create a template.  Even the barcode scanning idea is probably its own feature that we can add after we build the basic template concept.  But keeping Linda and her problems in mind as we build each feature will probably guide dozens of small decisions we make over the course of each story.

It's possible to go overboard with personas.  Sometimes we build so many that we lose track - who was "John" again?  It's possible to build personas that aren't accurately reflective of the real user community, and so could steer us in the wrong direction (what if 95% of our power sellers aren't like Linda - they're like Charles who works for the distributor and is more interested in recovering the most value rather than turning items at speed?)  That said, writing stories to a reasonable set of identifiable personas is a powerful way to keep your team focused on solving real people's real problems.
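
Mapping stories back to personas, as described above, is easy to spot-check if your backlog can be exported as text.  This is a rough sketch, assuming stories follow the "As a ..., I want ..." form; the function names and the regular expression are made up for this example, and real backlogs will have phrasing it misses.

```python
import re
from collections import Counter

# Captures whoever the story is written for, dropping a leading "a", "an", or "the".
STORY_PATTERN = re.compile(r"^As (?:an? |the )?(?P<who>.+?), I want", re.IGNORECASE)


def story_owner(story):
    """Return the lower-cased 'As a ...' subject of a story, or None if it doesn't parse."""
    match = STORY_PATTERN.match(story.strip())
    return match.group("who").lower() if match else None


def audit(stories, personas):
    """Count stories per owner and list the ones that map to no known persona."""
    known = {persona.lower() for persona in personas}
    owners = Counter(story_owner(story) for story in stories)
    unmapped = [story for story in stories if story_owner(story) not in known]
    return owners, unmapped


stories = [
    "As a user, I want to know that I entered a valid credit card number before I'm allowed to submit my order",
    "As the accounts payable manager, I want credit card numbers pre-checked for validity",
    "As a user, I want to create a template of an item that I want to auction",
    "As Linda, I want to create a template listing of an item I have in stock",
]
owners, unmapped = audit(stories, personas=["Linda", "accounts payable manager"])
print(owners["user"], len(unmapped))  # 2 2
```

A long `unmapped` list isn't automatically a problem, but it's a prompt to ask the questions above: are we missing a persona, or writing features nobody we've identified actually wants?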

Wrapping up

Software is built to solve problems for people.  Keeping the team focused on who they're building software for, what they want, and why they want it is the key insight behind user stories as a technique, and why they're so powerful.

I don't imagine changing a single word in a user story will take your stories from good to great in a single stroke.  The "As an X..." isn't so important in itself.  Masses of stories reading "As a user..." are a symptom of a problem, not the problem itself.

What's important is the thinking behind our stories.  Are we really thinking about who we're delivering value for?  Are we clearly thinking about our users as something other than a uniform monolith?  Can we see their faces and breathe their problems?  Because that's what makes us great builders of software.