Thursday, September 5, 2013

Should you re-estimate work?

Hi, folks.  This week, I'm potentially making a "land war in Asia" level mistake by weighing in on one of the "holy wars" around Agile projects.  Specifically, should you re-estimate stories once you've estimated them the first time?

This might strike some as a crazy thing to have a discussion about.  Why wouldn't you update your estimates as you learn more?   Our estimates should be as "good" as they can possibly be.  If we've learned something new that influences "how big" a given chunk of work will be, we should update our estimates.  Shouldn't we?  

Actually, maybe we shouldn't.  There are some reasonable arguments made that we SHOULDN'T strive for keeping our estimates as "good" as possible.  Estimation effort is potentially costly.  Estimates will never be perfectly accurate.  And there are some reasonable questions around whether updating estimates actually makes your project predictions more accurate or not. 

This week, I want to take a look at why we have estimates, what the schools of thought on re-estimation are, and how I'd recommend approaching re-estimation.

Why have estimates in the first place?

That's actually not a rhetorical question.  While "you have to estimate everything!" is deeply ingrained in most software professionals by now, it's not completely obvious that we HAVE to have estimates to build software.

As folks in the "lean development" camp will point out, work can flow just fine independent of a "schedule."  (I recommend reading some of Mary and Tom Poppendieck's work on lean software if you want to explore this further).  Just let the team figure out the most valuable stuff, have them work on it, and release to production.  If you're able to deliver work continuously, you don't necessarily need a great master schedule of "when will you be done?" to steer your team.  (I mentioned this from a different angle last week).

"Should you estimate at all?" is a topic for another time.  What I want to point out is that estimates are not per se valuable.  They are not an end in themselves.  They are a tool, used to answer a question.  "Having good estimates" is not the goal.  "Being able to understand the project" is.

Also, estimates are costly.  Every minute your team spends estimating software is time spent by your team on activities that do NOT result in any valuable working tested software being built.  Estimation time takes the team away from delivering value to do something else.  If the estimates provide valuable insight, they might be worth the investment, but they're not free.  All else equal, we should try to minimize the time spent doing tasks that don't result in building valuable software.

On most Agile teams, a common practice is to do fairly lightweight "relative size" estimation of work.  We then use velocity (the team's speed at delivering over time) to project how fast the team can deliver work.  Mike Cohn's book Agile Estimating and Planning is a great reference on this.  He also has a video of a presentation on the topic that's a good intro if you're unfamiliar.

For the rest of this blog post, I'm going to assume you're familiar with relative size estimation and velocity planning.

Getting back to my not-quite-rhetorical question at the top of this section, I'm going to assert that we estimate on Agile projects to allow us to answer two related-but-not-identical questions:
  • How much time do we think it will take to accomplish a given large scope of work? (the MACRO question)
  • Which specific pieces of work is it reasonable for us to take on in the near future? (the MICRO question).
The macro question is the "big picture" question - when is the new website likely to be ready?  It's also effectively the "how much does it cost" question - cost can be projected as the "run rate" cost for the team ($X per week) multiplied by time.  This is the "looking forward six months, where will we be?" question.  (As I talked about last week, there's a question of whether we should be thinking about our work in such long chunks, but that's a different discussion).
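To make the macro question concrete, here's a small sketch of the arithmetic behind it.  All the numbers (scope, velocity, run rate) are hypothetical, and this assumes the simple linear velocity model described above:

```python
import math

# Hypothetical numbers, purely to illustrate the "macro" projection.
remaining_points = 200       # estimated scope still to deliver
velocity = 20                # observed points completed per iteration
weeks_per_iteration = 2
run_rate_per_week = 30_000   # assumed team cost, dollars per week

iterations_left = math.ceil(remaining_points / velocity)   # 10 iterations
weeks_left = iterations_left * weeks_per_iteration         # 20 weeks
projected_cost = weeks_left * run_rate_per_week            # $600,000

print(f"~{weeks_left} weeks to go, roughly ${projected_cost:,}")
```

Nothing sophisticated is happening here - which is rather the point: the macro answer is only as good as the scope total and velocity you feed it.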

The micro question is the "what can we do right now" question.  It often boils down to "which stories off our backlog do we think can fit in the next n-week iteration?"  If we know our velocity is 20 points, we can in theory count the points on the work already "in" the iteration, and decide whether "one more story" will fit.
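The micro check is even simpler.  Story names and point values below are made up; the logic is just "does the candidate story fit under our velocity?":

```python
# Hypothetical iteration plan: will one more story fit?
velocity = 20
committed = [("login page", 4), ("password reset", 8), ("audit log", 4)]
candidate = ("export to CSV", 4)

points_in = sum(points for _, points in committed)   # 16 points committed
fits = points_in + candidate[1] <= velocity          # 16 + 4 <= 20 -> True
print(f"'{candidate[0]}' fits this iteration: {fits}")
```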

Why would we potentially want to re-estimate?

Again, for purposes of this blog post, I will posit a project that has a relatively large scope of work that needs to be released together.  The project team did a discovery/inception process, and found a number of user stories.  They did a relative-size estimation on those stories, projected a velocity, and produced a burn-up chart to project a likely delivery date.  At the start of the project, this was the best information the team had.

As the project wears on, however, the team learns new things.  Assumptions made when the original estimates were put together might no longer be true.  The team might have learned that integrating with an outside system they thought would be easy is actually a nightmare, and 20 stories need to touch that system.  They might have changed their architectural approach in a way that makes some stories easier, and some stories harder.

When the team learns these things, they are faced with a question - do we update our estimates based on our new knowledge?  Or do we leave our estimates "as-is?"

The case for re-estimation

Some teams think the answer is obvious - our estimates should be the best they can be, to give us the most "accurate" picture of the project possible.  If we learned something new that impacts how long a story will take, we should re-estimate the story.

Doing frequent re-estimation will tend to give this team a smoother velocity over time (because stories are always as "right sized" as the team can make them before playing them, so we don't get "bumps" due to a story being bigger or smaller than its estimate).  This team will be better able to answer the "Micro" question - this team can use their velocity much more accurately as a check on "can these stories fit into a 2 week iteration?" 

A team that re-estimates frequently believes their long-term macro estimates are more believable because they've "baked in" their best knowledge.  However, their long term estimate is more likely to fluctuate iteration-to-iteration, even if velocity is steady, because the number of points in the release will fluctuate as estimates change (hopefully around a "steady" middle point, but there will be some variation).

The net belief is that re-estimating makes our micro planning better, and makes our macro estimates no worse and in some ways better.  So while re-estimating involves investing more effort in our estimation process, it's worthwhile.  

Philosophically, re-estimators will argue that the "don't re-estimate" crowd is tolerating bad information.  We know that software estimation and planning is an inherently imprecise exercise.  When we are presented with an opportunity to improve the information we can give others about the project, we should do so. 

The case against re-estimation

The opposite school of thought is that you should not change your estimates from what you thought initially, even if you've learned more.  Teams that follow this approach would likely bring up the following points:

First, estimates are ESTIMATES.  They're not intended to be perfect.  As long as they are ON AVERAGE correct (roughly same number above or below), from a macro perspective, those inaccuracies will even out over the course of the release.  Re-estimating (they argue) creates an illusion of accuracy on an inherently inaccurate exercise.  We know there will be unexpected bumps no matter what we do, so let's not worry too much about attempting to smooth them out.

Second, they will point out that, while IN THEORY teams that re-estimate will improve estimates in both directions, in practice people tend to re-estimate UP more than they re-estimate down. 

This risk of "net estimation up" usually (in my experience) comes from an asymmetric application of risk in estimates.  Let's say we have a story that was estimated as a 2-point story.  From past experience, some stories similar to this one were "real" 2's, but some were more like 4's.  It might be a 2, it might be a 4.  Let's make it a 4 to be safe.  Now consider a story estimated as an 8-point story.  We know some similar stories were "real" 8's, some were really only 4's.  Let's leave it as an 8 to be safe.  Even without ill will, the natural inclination in both cases is for uncertainty to ratchet more stories up than it ratchets down.
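Here's a deliberately tiny toy illustration of that asymmetry (not a model of any real team): risky small stories get rounded up "to be safe," while risky large stories are left alone "to be safe," so the backlog total only ever ratchets up:

```python
# Toy backlog: original estimates, and which stories "feel risky."
backlog = [2, 2, 4, 8, 8]
uncertain = [True, False, False, True, False]

def reestimate_to_be_safe(estimate, risky):
    # A risky 2 becomes a 4 "to be safe"; a risky 8 stays an 8 "to be safe."
    if risky and estimate <= 4:
        return estimate * 2
    return estimate

after = [reestimate_to_be_safe(e, r) for e, r in zip(backlog, uncertain)]
print(sum(backlog), "->", sum(after))   # 24 -> 26: the total only goes up
```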

In practice, this means a team that re-estimates frequently will have its total estimate of the same backlog ratchet up over time.  They will also have their velocity ratchet up (since over time they'll be doing more and more stories that have had "slightly larger" estimates applied.)  The end result may be the same in terms of time taken, but the metrics will be harder to read.  The "don't re-estimate" school will argue that re-estimating actually hinders our understanding of the "macro" question.  Our "scale" for "what does a 2-point story mean?" will change over time, so simple linear assumptions around velocity and scope won't work properly.  

A generalization of this argument is a belief that re-estimation inherently causes "drift" in our estimation scale.  We began the project with a number of stories all estimated together with a consistently low level of information.  As we go, we have some stories retaining the "relatively little information" estimates and other stories with "more information" estimates.  Does a 2-point story estimated with more information really contain the same amount of work as a 2-point story estimated with less information?  The suspicion is that a "mixed" backlog containing some stories that are re-estimated and some that aren't has an "apples and oranges" problem that makes it hard to apply a single "velocity" number consistently. 

The "don't re-estimate" folks will agree that if we don't change our estimates, deciding what stories we take into an iteration will be a less mathematically consistent exercise - if there are three stories available that are all estimated as 2 points, the team might feel comfortable picking up one of them (which is currently believed to be a "real" 2), but not feel comfortable with a different story (which is a 2, but has a lot of "known issues" that likely make it bigger).  The "don't re-estimate" school sometimes argues this is a benefit and not a problem - it means choosing stories for an iteration has to be a conversation with the team, and not a math problem for the project manager.  If nothing else, they'd argue that the time it takes to talk about "this 2-point story, not that 2-point story" is probably less than the time we'd spend re-estimating stories (which we do on stories that might turn out not to make a difference in the story selection exercise).

The net belief is that re-estimating isn't a high-value investment for the amount of micro predictability it potentially brings, and potentially actually makes our macro predictability worse. 

Philosophically, the "don't re-estimate" school believes estimates are inherently imperfect, and that trying to tinker with them might be well intentioned but actually introduces more uncertainty than it removes.

To re-estimate or not to re-estimate?

I don't think either extreme position here is completely right.  However, my sympathies are closer to the "don't re-estimate" crowd.  I think changes in scale and "ratcheting up" are a real (not a hypothetical) risk.  I also believe the "macro" question of "when is the release done?" is of considerably higher value to the project team than the "micro" question of what fits in the next iteration.

Also, in my experience, quite a lot of the "always re-estimate!" teams I encounter are teams that are either new to Agile, or teams whose management don't completely trust the teams.  In both cases, re-estimating is done not for predictability, but for "accounting" reasons.  The team re-estimates stories because they are afraid that if they pull in a story that's "bigger than it looks" from its estimate, it will cause their metrics to show a drop in "productivity" and someone will overreact to a perceived problem.  "If we don't get 20 points done this iteration, the development manager is going to yell at us for 'missing our velocity target.'"  This is solving the wrong problem - the issue is really an EXPECTATION problem about what estimates should mean.  Investing significant "no value add" time to try and make your estimates look "accurate" isn't going to solve that underlying expectation issue.

That said, I think the "never re-estimate ever" position is too extreme.  There are times when we've genuinely discovered that the way we're going to solve a problem is nothing like what we'd assumed initially and will require a radically different amount of work.  Our estimate is for what's effectively a different story than the one we're actually going to do.  Never accounting for that work in our project plan hurts us at both the micro and macro scales - if the project will genuinely take longer (or shorter!), let's say so.

I have two rules I'd recommend for "when do we re-estimate?"

First, I recommend when we do our initial estimates of a story, we record any key estimating assumptions we are making that the team agrees are key to driving us to choose which "bucket" to put the story in.  e.g. "This report can be built entirely on data that's already in RDB.  No additional data sources need to be built for this story."  My rule of thumb is we should only really consider a story for re-estimation if at least one key estimation assumption is violated - if the story differs in a SIGNIFICANT way from what we thought at the time of the initial estimation.

Second, I recommend a "two-bucket" rule.  If we're using "Powers of 2" for story points, don't spend time re-estimating a 2-point story that we think MIGHT be a 4-point story.  Only talk about it if we think it's at least POTENTIALLY an 8.  Don't spend time on the 4 that might be a 2 - only talk about the 4's where we could make a case for them being a 1.  This doesn't mean we can't decide to only move the story one bucket at the end.  Rather, our filter on "does this story really need to be re-estimated?" should be "it's so far out of whack that it MIGHT be MORE THAN ONE bucket off."

The purpose of the two-bucket rule is to keep us from arguing around the edges - "Is this a large 2 or a small 4?" isn't a high-value thing to get right (and is the situation most likely to lead to "ratcheting up" for risk).  We only want to talk about the ones that we think are BADLY mis-categorized.  Those are both the ones that are likely to have a major issue, and the ones that are potentially the biggest issues for our "macro" predictability.
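If it helps, the two-bucket filter can be stated as a one-liner.  This sketch assumes a "Powers of 2" scale; the bucket list and function name are my own invention:

```python
# The "two-bucket" filter on a hypothetical Powers-of-2 point scale:
# only discuss re-estimation when the suspected size is MORE than one
# bucket away from the current estimate.
BUCKETS = [1, 2, 4, 8, 16]

def worth_reestimating(current, suspected):
    """True only if the suspected estimate is at least two buckets off."""
    gap = abs(BUCKETS.index(suspected) - BUCKETS.index(current))
    return gap >= 2

print(worth_reestimating(2, 4))   # "large 2 or small 4?" -> False, skip it
print(worth_reestimating(2, 8))   # might be two buckets off -> True, talk
```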

Here's how I see this working.  We do our initial estimates.  As we're pulling together details for the "up soon" user stories, the BA is regularly reviewing progress with the devs, QA, and product owner.  As part of the conversation about the story, we should at least look at the estimate and assumptions.

"...So that's the story.  It's estimated at a 2.  Anyone want to holler about the estimate?"
"Hmm...I know it's a 2, but I'm wondering if it's maybe a 4?  There's a few tasks that might be big, and that assumption about data sources is totally wrong."
"OK.  Do you think it's 'maybe a 2, maybe a 4,' or 'definitely a 4, maybe an 8'?"
"There's no way it's an 8."
"OK, then let's leave it at a 2 and move on." 
"Fair enough."

By focusing your re-estimation effort on the clear outliers, you can hopefully avoid getting mired in a lot of debate about things that don't significantly improve your predictability. 

Thanks to many colleagues and now-ex-colleagues at ThoughtWorks who've provided feedback on earlier versions of this rant...

Thursday, August 29, 2013

Is your burn-up chart holding back your thinking?

This week I'm going to take some shots at something that's actually one of my favorite Agile project metrics - the burn-up chart.

There's a lot to like about burn-up charts.  They're compact, but very information rich.  They convey several key concepts visually and in a way that's easy to understand.  They make it highly intuitive to understand the likely end date, and the uncertainty around that likely date.

So why am I picking on them?  Because (like many things, especially metrics) they can be used badly or in the wrong mindset, and in my experience often are.  Wrongly approached, burn-up charts can be reflective of some problematic thinking that can keep your team from being successful. 

I don't think these problems are the result of bad intention.  Rather, I think they stem from not thinking about how "what the burn-up chart shows" relates to "how we expect a typical Agile project to run."  My goal is to point out some of these "impedance mismatches," and present some ideas on how to make sure you and your burn-up chart are on the same page.

What's a burn-up chart, anyways?  

Simply put, a burn-up chart is a plot over time of work completed, and a plot over time of the "target" amount of work we need to complete to achieve a goal.  By looking at the level and slope of the "completed" line, we can understand when we expect it to intercept the "goal." 

The concept of a burn-up chart can be applied to a wide variety of situations where "work" is achieved towards a "goal."  Burn-up charts can be done with many kinds of "work achieved" (story points completed?  stories completed?  defects closed?  tasks complete?), can be plotted on multiple timescales (daily?  hourly?  monthly?), and can be used for a variety of goals (stories burning up for a single iteration?  tasks burning up to complete a given story?  stories burning up for a release?)

For purposes of this article, however, I'm going to restrict myself to the most frequently used type of burn-up chart, which is a "release burn-up."  The "unit of time" for this chart is generally sprints/iterations, and the "unit of work" is generally measured in story points (or whatever the team's estimation unit of choice may be).  The "goal" line represents the total estimated work that's "in scope" for the current release.  The "burn up" line represents the number of story points that are completed each iteration.  It looks something like this:

To make it more useful, we can add a "trend line" for the work completed, which allows us to visualize where the intercept will happen:

Because our velocity isn't completely smooth, we can also extrapolate "worst case" and "best case" velocity estimates, to show us the range of uncertainty around our possible dates:

And there we have it.  It's fairly simple to draw.  And it answers a whole host of potential questions around "how is the team doing" at once.  You can see at a glance the "when do we expect to finish?"  The "range of uncertainty" is reasonably easy to understand. You can clearly see whether the team's velocity is relatively steady or jumping around.
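The intercept math behind those trend lines is straightforward.  Here's a sketch with hypothetical numbers, using the team's best, average, and worst observed velocities to bound the projection:

```python
import math

# Hypothetical burn-up state: project the intercept under three velocities.
scope = 200                  # current "goal" line, in points
done = 60                    # points completed so far
velocities = [18, 22, 20]    # observed velocity in recent iterations

remaining = scope - done     # 140 points to go
projections = {
    "best": math.ceil(remaining / max(velocities)),
    "expected": math.ceil(remaining / (sum(velocities) / len(velocities))),
    "worst": math.ceil(remaining / min(velocities)),
}
print(projections)   # iterations remaining under each assumption
```

The spread between "best" and "worst" is exactly the range of uncertainty the chart shows visually.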

You can also answer a number of what-if questions easily - how much would we need to cut if we need to be ready for production by date X?  Draw a vertical line on the date, and you can read from the projected trend lines the best/middle/worst case estimates on how much we can get done, so we know how big the cuts are.  Will we be ready by date Y?  Is it in the range of uncertainty?  If not, probably not without making changes.  If it is, we can see whether it's an aggressive or conservative target.

So, yeah, it's a pretty good chart.  It's so commonly used that most Agile tracking tools will build a burn-up chart for you.  Here's a few examples:

ThoughtWorks Mingle
Atlassian Jira/GreenHopper

So, enough about what's good about burn-ups.  This post is supposed to be about what can go wrong with them, so let's get into that.

Burn-ups can block discovery

Let's say we're in the early stages of a project.  We did a quick discovery exercise, and found 200 points worth of user stories.  We did a velocity projection and estimated the team could do about 20 points of work per iteration.  Our first two iterations roughly bore that prediction out.  Our burn-up chart looks something like this:

Great. It looks like we'll be done around iteration 10.

Fast forward to iteration 9.  We should be in the home stretch.  But we're not - if we look at the actual backlog, we see that there are still 3 more iterations of work!  What happened?  Let's look at the burn-up chart again:

Ah.  The scope moved over time, as new stories were added to the project.  This means we have more work, and so we need more time to finish.

The two words most project managers (especially "traditional" project managers) are probably thinking about right now are "scope creep."  Clearly, the team should have "held the line" on new scope - if we wanted to add something new, we should have taken something out.  This project was "mismanaged."

And, in my opinion, if that's what they think, they're probably wrong.  What I described in the last paragraph is completely non-Agile thinking.

The notion that all the work "ought to be known" at the start of a project comes from "traditional" waterfall thinking, where we have an extensive "planning" phase up front that's supposed to uncover every requirement and task in the project before we get started.  Deviations from that plan are expected to be infrequent, and must be managed via a strong "change control" process to keep from materially impacting the schedule unnecessarily.

In Agile projects, we do away with the long, drawn-out planning phase that's expected to find everything we're going to need to do.  And in doing so, we need to do away with the assumption that all the work we need to do to accomplish the goal is known before we start.

While we might do a discovery phase up front, we should EXPECT that phase is imperfect and will miss a few things that need to be done.  Because we're collaborating with customers, we should EXPECT some amount of re-work or re-imagining of features will happen over the course of the project.  Those changes aren't "scope creep" - they're the norm of an Agile project.

Expecting that the team should "hold the line" on the scope we knew about at the start of the project, and that any "discovered" stories (either from customer collaboration or new discovery) means we need to remove something else to make room for it, is following a plan over responding to change.

So, what's the solution?

The simplest thing you can do to address this issue is start with the assumption that your scope line will be UPWARD SLOPING, not flat.  We don't know exactly which stories we'll uncover, but we KNOW we'll find some.  Assuming our scope line will be flat unless "something unexpected happens" sets the wrong expectations, both for the team and the customer.

The exact slope of the line will vary depending on the project, and factors like how unknown the problem space is, how expert the team is in the domain, how many different stakeholders we're trying to satisfy, etc.  I find a good rule of thumb to be that a "typical" team will spend about 20% of its time working on "newly discovered stuff," leaving 80% for the "known at the start" work.  So my sample team with a velocity of 20 should expect to discover about 4 points of stories per iteration.

Some people may be howling that allowing this discovery factor is an open invitation to scope creep.  I don't think it is.  Just assuming the line to be upward sloping isn't the same as saying we'll stop thinking about newly discovered stories and just accept everything into the project.  Regardless of how we build the metrics, we need to have regular conversations with our product owner and only accept changes we genuinely want to deliver.

Also, assuming a "discovery rate" can make this a somewhat scientific process.  Assigning a "budget" for discovery helps us have productive conversations early in the project.  We tell our product owner their "budget" for story discovery is 4 points per iteration.  We then track it - in iteration 1, we added 6 points.  In iteration 2, we added 3 points; in iteration 3, we added 2 points.  Great - we're on track.  Or, if we're going into iteration 4 and we've been adding 8 points an iteration, not 4, we have to have a conversation.  Was our "discovery" expectation too conservative, and we need to change it (and so change our expected end date)?  Or is it that we had a major correction early in the project, and we don't expect it to recur?  The point here is that by tracking our discovery against a planned "budget," we can validate our assumptions.  We can even do it right in the burn-up chart:

In this example, the green line is how we expect scope to grow over time with discovery, so we should expect to be finished in Iteration 12 (and not Iteration 10, where we'd have projected to finish if we "held the line").  We can see from the trend in the Current Scope line that we're upward sloping, but roughly sticking to our discovery "budget."
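Tracking against the budget is simple enough to sketch in a few lines.  The numbers come from the hypothetical above (a 4-points-per-iteration discovery budget); the status strings are my own:

```python
# Track cumulative story discovery against the planned "budget."
budget_per_iteration = 4
discovered = [6, 3, 2]   # points of new stories found in iterations 1-3

cumulative = sum(discovered)                       # 11 points discovered
allowed = budget_per_iteration * len(discovered)   # 12 points budgeted
status = "on track" if cumulative <= allowed else "over budget - talk!"
print(f"{cumulative} discovered vs {allowed} budgeted: {status}")
```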

One problem with this approach is that (to my knowledge) there aren't any automated tools that make it easy to account for an "expected discovery rate" - the burn-up targets tend to be horizontal lines (if anyone knows of a tool that DOES make this easy, would love to hear about it in the comments).  So if you want to track this, you may need to "roll your own" chart to do so. 

Burn-ups can cause complacence

Having talked about burn-up charts leading to thinking that potentially caused us to exclude things we ought to do, I want to talk about the opposite problem - burn-ups causing us to do work we don't need to do.

Again, by design, the initial "discovery" for Agile projects trades complete accuracy for speed - we're willing to tolerate some false positives/negatives in exchange for being able to get started writing useful code faster.   This means we'll miss some things, but it also means we'll potentially pick up some stuff we think is high value at the time that we'll later learn isn't actually necessary.

Most teams are pretty good at reviewing stories we recently discovered and vetting them for "is this really part of the project?"  But teams can be less careful about reviewing the things that are already "on the plan" and "in scope" for the project.

The burn-up chart can contribute to this lack of introspection.  If we had 100 points of work at the start of the project, and we're projecting to finish all 100 points within the expected timeframe people are asking for, then we'll just keep churning through those 100 points.  They're part of the "baseline" (which on a burn-up chart is literally a line).  The fact that we have a scope line with a fixed value as the "target" mentally causes us to treat that amount of work differently from "new" work, because we've already baked it into our metrics.

A good team should (and will) regularly review and "prune" the backlog - just as we expect to discover new work over time, we also expect that we will discover existing work isn't necessary.  We should remove those items, even if our burn-up chart "looks good" for getting the work done. 

Some of you may be noticing that in the previous section, I argued the scope should be expected to grow over time as we learn things, and here I'm arguing some of the initial scope should be expected to be removed over time.  Can't we just assume these effects "cancel out" and go back to the "flat" projected scope line?

While the two effects do partially cancel each other out, the magnitudes aren't necessarily the same.  In my experience, the "20% growth" assumption is what I've seen as the "net" effect - we discover new work faster than we remove work.

Again, the goal is to be deliberate about scope changes.  The team should be diligent on removing unneeded items, just as they should be diligent about only accepting necessary change.  There's nothing "special" about the work that was "in the baseline" - unnecessary work is unnecessary, whether it's "already in the metrics" or not. 

Burn-ups can encourage infrequent delivery

There are times when the right way to do a project is to spend several months building a cohesive product, and then release it together.  Sometimes you're building a brand new application that doesn't deliver value until it can support certain likely usage scenarios end-to-end.  Sometimes we're doing a major restructuring of a module that needs to change all at once.  Sometimes external circumstances around how we deliver necessitate a single large release (e.g. something that needs to deploy in parallel with a third party's application that we integrate with).

But these situations do NOT apply to all projects.  On many projects (probably MOST projects) we don't need to "build up" 6-12 months of work to have something of value to release to the marketplace.  And if we don't need to wait to do a big release, we probably shouldn't - one of the key goals of Agile is to deliver value to production rapidly, and begin to capture that value quickly.

Having valuable code that could be making us money "waiting" on a release is throwing money away.  We could be getting value from it, but we're not, because it's "part of the release" that happens six months from now. 

The continuous delivery movement has a lot to say on techniques/processes around the "how" of getting code to production quickly.  But I'm talking about metrics today, so I won't go into those.

The reason I see a potential problem with burn-up charts is that, once you're familiar with burn-up charts and grow to like them, you may start structuring your projects in ways that produce a good burn-up chart, rather than alternate structures that don't work well with burn-ups but may deliver value more quickly.

The key insight you can get from a burn-up chart is a projection of when a given scope of work is likely to be complete in the (relatively distant) future.  If your burn-up chart tracks points once per iteration, it's close to worthless for giving us meaningful insight on how long it will take to complete a scope of work that's only an iteration-or-two worth of effort.

You can address some of this by plotting points more frequently (you might plot your "work complete" in days rather than iterations).  But in some cases, that's fitting the data to make the chart work, rather than choosing the chart because it gives us insight we can act on.  The real question is whether thinking of our project as relatively slowly "burning up" to a large goal is the right way to think about our project at all.

I see this thinking trap frequently on "new to Agile" teams.  They are coming from a world where a nine month project is considered "short" and the executive team is used to looking at a Gantt chart.  When they switch to Agile, they have slightly shorter projects (say, four months), but they don't have a Gantt chart anymore.  But we can show the execs a burn-up chart, and explain it to them, and we can make them relatively satisfied that they still have good insight into how the project is going.  But then the burn-up becomes a self-fulfilling obligation - we need to have burn-up charts because management expects burn-up charts.  And, subtly, we become mired in our thinking that the "right way" to think about projects is by burning up over time to a goal.

So, what do we do about this?  The first thing is to consider whether, like any other metric, a burn-up chart is actually appropriate to your project.  What is it going to tell you?  Does it fit the way we're structuring the project to deliver value?  Let's be OK having the conversation that a burn-up chart isn't appropriate to our project.

One of the most productive teams I was ever a part of didn't bother with burn-up charts.  We'd simply agree "once these 4 stories are done, we can go to production with them," and make that happen - we'd deploy every week or two.

Burn-ups can cause missed expectations

Burn-up charts are great at predicting the "development complete" date for a given project.  Ideally, that's also the date we could "go live" with the new project - just click the button at the end of the last iteration and go.

In the real world, this is rarely the case.  First (as discussed in the last section), if your project needs a significant burn-up chart, you probably aren't sufficiently comfortable with your deployment process to just "push the button and go."

There may be a genuine need for other processes that take place between "we're done writing code" and "we're able to go to production."  Maybe your application requires regulatory review and approval.  Maybe your process is to have a beta testing phase for final feedback before you flip the switch in production.  Maybe you need to coordinate the launch of the product with a marketing campaign.  Maybe you need to bring in a swarm of tech writers to put together the final "user manual" after the app is done.  Maybe the move to production requires a major outage to do a data migration into the new system.

Some of these activities might be able to be done "as you go" during your development iterations, but probably not all of them.  Which means there are likely activities that take place AFTER your development is complete, but BEFORE you're in production (and delivering actual business value).

By design, a burn-up chart doesn't show these things.  What it shows is how we're progressing towards the "goal," where the goal is generally "we have completed all the stories."

We get an expectation miss if we're reporting our burn-up charts out to a wider audience, and they're taking back the message of "the development and testing for features will be complete at the end of iteration 9 on Aug 1st" and conflating that with "we will be in production Aug 1st."  If those two dates aren't the same, you need to be careful about the messaging around your burn-up.  Which probably means "don't communicate ONLY the burn-up chart."

Some teams "account for" this by putting in placeholder stories and assigning them to "iterations" near the end of the project, so this non-development work will show in the burn-up.  I think this is often problematic, and is twisting the process to fit the chart.  Assigning "points" to "do the marketing for the release" is implicitly claiming the marketing activities are measurable on the same scale as the development work, and will proceed at the same velocity.  But if the development team moves faster than we expect, the marketing doesn't magically take less time, and we'd be in trouble if our chart assumed it would.

While I'm not a fan of them for many purposes around tracking software delivery, the much maligned Gantt chart can actually be your friend here - tracking dependencies and "cascading" dates appropriately is exactly what a Gantt chart is good at.  If you're putting together a "showcase" for your metrics, a slide with a name like "From Development Complete to In Production" showing the dependencies and timeline starting with your current best "development complete" date and ending with the date you're live in production can help everyone understand what those activities are, how long they'll take, and when the project is actually going to be done.

Wrapping up

As I said at the beginning, I generally like burn-up charts.  They're incredibly useful tools, and for certain kinds of projects are an ideal way to communicate progress succinctly.

That said, I hope I've at least raised your awareness that (like all tools) burn-ups aren't perfect.  If we think about them the wrong way we can drive undesirable behavior, set incorrect expectations, or miss opportunities.  

Thursday, August 22, 2013

As a User, I want to be represented in User Stories

I'm a big fan of user stories for tracking upcoming work.  They keep us focused on the problem, not the solution.  They're good at reminding us why each chunk of functionality is valuable.  Done well, they're easy to prioritize and plan with.

But one of the big issues I have with them is that it's really easy to lose the "user" in the "user story."  I've seen way too many projects with several dozen stories that all begin "As a user, I want...."  The placeholder "user," used over and over again, is usually indicative that we've stopped thinking about the actual users at all, and we're just putting it in for the sake of form.

Which is a shame, because thinking about who we're delivering value for is really important for us to write good stories.  So I'm going to talk about a few ways to break out of the rut of constant "as a user..." stories.

Users are not actors

One mistake I see frequently, especially with people transitioning from a use case heavy background to user stories, is to mistake the "As a..." clause of a user story as a place to identify an actor - the individual who is PERFORMING an interaction with a system.  But user stories are much more about stakeholders - the individual who gets VALUE from a feature. 

In many cases, these are the same, but they need not be.  Even though a "user" of the system might be the one interacting with it, something is happening that delivers value for someone else.  That's the person who "wants" the feature we're providing.  (By the way, I think the term "user story" is a slight misnomer for this reason). 

For example, let's say we're building an e-commerce portal, and we have a prospective feature to validate that a credit card number is a valid number (e.g. by running a checksum) on the client before we submit the credit card transaction to the credit card processor for approval.  This is a useful thing to do because many credit card processors charge by the request.  If we can figure out on our own that the credit card number is bad BEFORE we submit the request, we save money - we don't need to incur the expense of a transaction to tell us the credit card didn't go through.
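As an aside, the checksum in question is typically the Luhn algorithm.  A quick illustrative sketch (my example, not part of the original story):

```python
def luhn_valid(card_number: str) -> bool:
    """Luhn checksum: double every second digit from the right
    (subtracting 9 when doubling overflows); a well-formed number's
    digits sum to a multiple of 10."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# The standard test number "4111111111111111" passes; changing any one
# digit makes it fail, which catches most miskeyed numbers before we
# pay the processor for a doomed transaction.
```

The check is cheap enough to run in the browser on every keystroke, which is exactly why it saves the business money in this story.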

I could write a story for this like "As a user, I want to know that I entered a valid credit card number before I'm allowed to submit my order, so that I can fix the card number if it's wrong."

The problem here is that, while the "user" (a customer in this case) is indeed the one who entered the number, they probably couldn't care less about having an intermediate check between "I clicked submit" and "the transaction was sent to the processor."  Heck, they might not even notice it.  From the "As a user" perspective, this story is close to valueless.

However, the feature is NOT valueless.  Given miskeyed credit card numbers are a major cause of credit card transaction failures, doing an internal check here can cut down a lot of our "waste" of paying for failed credit card transactions.  Let's re-write the story from the perspective of the person who actually gets value from this feature.

"As the accounts payable manager, I want credit card numbers pre-checked for validity before they're sent to the credit card processor, so that I don't pay transaction fees for transactions I know will fail."

Suddenly, we've transformed the same feature from a "why would they care?" into a much more compelling story that clearly relates to business value.  And it's a lot easier to prioritize this against other stories.  If we only have 10 failed credit card transactions per day, this might still not be high value.  If we have 10,000 failed transactions a day, this might pay for itself in a week.

 "Accounts payable manager" might not be a "role" in the system.  Heck, the accounts payable team might never use our system directly to do their jobs.  And they're certainly not the ones who are actually entering the credit card numbers. But they're the ones who care about this feature - they're the ones paying the cost of not having it, and the ones who benefit from it being in place.  Their perspective is the one that's important when evaluating this story.

A good rule of thumb when putting together the "As a..." clause of a user story is to think "If we took this feature out of scope, who would be first in line to pay me to put it back?"

Users are not system roles

Another way we can lose track of who we're delivering value for is to simply think of individuals in terms of what their "system role" is.  We could think about every feature targeted by someone in a "user" level role as "As a user...", and have the features that are only for administrators written as "as an administrator."

This makes some sense from a technical perspective - we're delivering a feature that's intended for everyone whose role in the system is "user."  And since we're not intending to restrict the feature to only a subset of users, everyone who's a "user" gets the benefit.

The problem with this approach is that it treats everyone in a user role as being identical.  That's rarely the case.  For example, think of an online auction site.  Everyone who has an account (other than the admins) is a "user" - none are different from a perspective of what they have the ability to do.  Anyone can search items, bid on an item, sell items.

But think of the richness that's obfuscated if we just think of everyone as "users."  There are lots of subcategories of users that have different perspectives, and different needs.  You have power sellers - factory outlets selling many copies of the same item in bulk.  You have high-dollar-item sellers, who need lush descriptions and photos to make the item look good.  You have online yard sellers, trying to get rid of a few things as cheaply as possible to clean out their basement.  You have targeted buyers who want one and only one thing, and want to find it fast.  You have browsers who want to compare multiple items of the same general category (such as laptops from 10 different sellers).  You even have your day traders, who buy and sell the same items frequently to try and arbitrage short-term market fluctuations.

Each of these "users" might have access to the same tools, but thinking about which one wants a feature can help bring clarity.

"As a user, I want to create a template of an item that I want to auction, so that I can later list that item quickly" vs. "As a power seller, I want to create a template listing of an item I have in stock, so that I can create multiple individual auctions quickly."

With the first story, we might wonder why the templating is useful - couldn't we just create the listing directly?  The rationale is much more clear when we think about the context of a power seller - the template helps because they can create the "same" auction multiple times.  And thinking through that lens clarifies how this needs to work - I clearly need to be able to re-use the template.  I may need to be able to change specifics with each new listing (like a serial number).  Also, we can probably assess the priority better - how many of our users actually fit the "power seller" profile?  Is it worth building this feature primarily for their use?

Users are people

Your users are not faceless anonymous masses.  They are (for the most part) human beings.  They are people trying to live real lives.  In many cases, their interaction with whatever you're building is a small part of their busy life.  Often, the thing they're trying to use your system to accomplish is a means to an end (I'm buying this book online so I'll have something to read when I go on my beach vacation).

The problem here, of course, is that every user is different.  Trying to write user stories that really capture each user's individual goals, nuance, and perspective is almost impossibly hard for any system that has more than a handful of users.

One technique to try and capture some of this "real user" feel without being able to model real users is to use personas.  For those unfamiliar, a "persona" is a somewhat fictionalized biographical sketch of a "typical" user, with more realistic goals.  While it's possible to go overboard with them, using personas can bring valuable insight into our implementation of a user story.

Many projects create user personas as part of an inception/exploration phase at the start of the project.  It can be very powerful to map stories back to personas.  If there's no persona that clearly would use a feature, that's telling us something.  Are we missing a segment of our expected userbase in our personas?  Or are we devising a feature that no one we think will use the product actually wants? 

Let's enhance our user story from the last section by tying it to a persona.  Back when we started the project, we anticipated we'd have power users, so we built a persona for them.

Meet Linda.  Linda runs her own distressed asset disposal business out of her garage.  She has an IT inventory management background, and has contacts with a number of local and regional technology distributors.  Linda spends all morning calling around looking for large cancelled orders, unopened returns, and other unwanted tech in bulk.  She buys pallet quantities for cheap prices, and sells them off online.  The margins aren't great, and she has limited space, so her business is all about speed and volume.  She opens dozens of auctions a day, and the less time she needs to spend setting up and maintaining her auctions, the more time she can spend finding great deals.

Now let's try our user story again, this time with the persona as the user.  "As Linda, I want to create a template listing of an item I have in stock, so that I can create multiple individual auctions quickly."

We can almost picture this in our mind.  We can see Linda sitting in her dusty garage with a laptop, next to a shrink-wrapped pile of LCD monitors, trying to set up her auction template.  Hey - I bet she wants to be able to pull down the manufacturer's stock description into her auction template, rather than type out a description herself.  Hey - I bet she wants some way to track how many of these she has.  Hey - I bet when she creates auctions, she's going to want to auto-relist items that don't sell.  Hey - you know what would be cool?  If she could just scan the barcode on the manufacturer's box and pull all the specs up automatically....

Some of these thoughts are probably different user stories - we don't want to suddenly drag the mechanism to create and auto-relist auctions into a story to create a template.  Even the barcode scanning idea is probably its own feature that we can add after we build the basic template concept.  But keeping Linda and her problems in mind as we build each feature will probably guide dozens of small decisions we make over the course of each story.

It's possible to go overboard with personas.  Sometimes we build so many that we lose track - who was "John" again?  It's possible to build personas that aren't accurately reflective of the real user community, and so could steer us in the wrong direction (what if 95% of our power sellers aren't like Linda - they're like Charles who works for the distributor and is more interested in recovering the most value rather than turning items at speed?)  That said, writing stories to a reasonable set of identifiable personas is a powerful way to keep your team focused on solving real people's real problems.

Wrapping up

Software is built to solve problems for people.  Keeping the team focused on who they're building software for, what they want, and why they want it is the key insight behind user stories as a technique, and why they're so powerful.

I don't imagine changing a single word in a user story will take your stories from good to great in a single stroke.  The "As an X..." clause isn't so important in itself.  Masses of stories reading "As a user..." are a symptom of a problem, not the problem itself.

What's important is the thinking behind our stories.  Are we really thinking about who we're delivering value for?  Are we clearly thinking about our users as something other than a uniform monolith?  Can we see their faces and breathe their problems?  Because that's what makes us great builders of software. 

Thursday, June 13, 2013

Measure your Non-Functional Requirements

Like most people who work in the software industry, I often hate working with non-functional requirements.   They're the ugly stepchild of software requirements - forgotten as soon as they're created, difficult to manage, and generally never rearing their heads until the end of the project, when they become a stick to beat the development team with and a cause of late-breaking deployment delays.

I'd like to change that.  I believe it's possible for us to work effectively with non-functional requirements, make them visible, and do more to make sure they're met than simply crossing our fingers. We just need to plan for them. 

For purposes of this post, I'm going to define Non-Functional Requirements (now sometimes called Cross-Functional Requirements, often abbreviated NFR's) as follows:  NFR's are the set of "things your team is required to deliver" AND the set of "conditions your team must make true" over and above the delivery of working, tested code that implements user-visible features.

Examples of non-functional requirements are documentation requirements ("We need to produce a user guide"), performance requirements ("All page load times need to be 2 seconds or less under production load"), and a variety of "ilities" like supportability, accessibility, deployability, etc. ("All major events need to be logged to a common file," "Every page and message needs to be I18n compliant," etc.)

On "traditional" waterfall projects, the typical handling of NFR's is that in the planning phase, we make a list of all of the NFR's.  Then we record them on a spreadsheet.  Then we hand it to the architect to "keep in mind" when planning out the project.  Then we stop thinking about them until the late-game pre-deployment testing, when we cross our fingers and see if they're actually met.  Then, when they aren't (and trust me, at least some of them aren't), we argue about whether we live with it or slip the date.

On "agile" projects, the process is largely the same up until we record them on the spreadsheet.  Then we....well...we're not sure.  In theory, some of these become "coding standards" that we make people aware of (but often don't enforce).  Maybe we'll remember to cross our fingers and test before we deploy.  Maybe we'll wait until someone complains after we go live to say "oh, yeah...we should fix that."  Regrettably, since they fit poorly into our "focused on features" development process, they're easily ignored. 

The real problem with non-functional requirements is that they tend to play by different rules than "everything else" we're deciding to build.  I see two key differences. 

First, non-functional requirements have a "cost visibility" problem.  It's hard to estimate "how much" a requirement like quick page load times will cost the project, because it (generally) increases with the number of pages we build.  It's hard for us to give customers visibility into tradeoffs like "how many other features will I need to cut in order to include ADA compliance?"

Second, most non-functional requirements have a "focus" problem.  On an Agile team, we're constantly looking at some kind of board with all the "in flight" functional features for the current iteration/sprint/cycle.  But because (most) non-functional requirements span all features, they're never visible on the board - they're "universal acceptance criteria" for all stories/features.  And like everything else that's "boilerplate" to each item, we stop thinking about them. 

So, what do we do about non-functional requirements?  I've had some success using a two-part approach to handling non-functional requirements.

First, we need to separate non-functionals into "playable requirements" and "standards."  Some requirements are genuinely "do once and then you're done" items that look and act just like User Stories/Features.  An example would be "we need to build a pre-production staging environment with data copied regularly from production."  That's a thing we could choose to build in, say, Iteration 3, and once it's done we don't have to build it again. I tend to treat these like "standard" requirements - estimate them, assign them an iteration, and run through the normal process.

Then, we have the requirements left that are NOT "one and done."  They're the ones that span all stories.  For these requirements, I focus on having a METRIC.  I have a conversation with the team (INCLUDING the product owner) where I ask them "how do we want to ensure, over the course of the project, that this requirement is met?"

For example, let's say our requirement is that "all page load times will be less than 2 seconds under production load."  Great - how do we ensure we meet that requirement?  There are a number of ways we could in theory ensure that.  At one end of the spectrum, we could say "OK, we'll build a dedicated, prod-like load test environment, on a clone of prod hardware, along with a dedicated pool of load generation machines.  We'll also build a set of load test scripts that test every major function of the system.  Every check-in that passes CI gets auto-deployed to this environment and load tested." 

That's probably the most robust possible testing we could have to meet the load requirement.  Unfortunately, it's also expensive and time-consuming to build.  Is this requirement worth it?  If not, what might we do that's less than this?  Maybe we'll just build some JMeter scripts that run in CI to collect directional performance metrics we'll watch over time - it won't tell us DEFINITIVELY we'll perform under load, but it will tell us if we're getting worse/slipping.  Riskier, but cheaper.  Maybe we'll periodically have the testing team step through some key manual scenarios with a stopwatch and measure times.  Or maybe we'll choose NOT to invest in load testing at all - we'll accept the risk that we might not meet this requirement, because the load numbers are expected to be small and the technology we've chosen for this project is the same technology we're using elsewhere at far greater volumes, so we think the risk is very small.

The point is we have these discussions as a team, and INCLUDE the product owner.  This gets us over the "cost visibility" problem - if you want the team to have extremely high confidence they meet this requirement, what will that cost us?  Are we willing to invest that much?  Or are we willing to take on more risk, for a cheaper cost?

Once we've decided on how we'll measure our compliance, we need to make it "part of the plan."  Let's say our plan for performance testing was for a tester to manually go through the application once an iteration with a stopwatch and measure response times in the QA environment.  Great!  We put a card on our wall for "Iteration 4 performance test," and (whenever in I4 we feel it's appropriate) have a person do that test and publish the results.  If they look good, we've got continued reassurance that we're in compliance.  By publishing them, we remind the team it's a "focus", so we remember we need to be thinking about that for every story.  If we find an issue, we add a card to the next iteration to investigate the performance issue to get us back on track.  

You can have similar conversations around things like a user guide.  How will the team produce this document?  One option is to say we won't do anything during development.  Instead, we'll engage a tech writer at the end of the project to look at the app and write the guide.  That would work, but it means we'll have a gap at the end of the project between "code is done" and "we're in production." 

Another approach would be to build this up over time - with every story, we agree that someone (maybe the analyst, maybe the developer, maybe the tester) will update the user guide to cover whatever new thing we built.  Thus, we build up a guide over time. 

This is great in theory, but again, how will we ensure the team's doing this?  Are we going to periodically review the user guide?  Is the user guide going to be presented/reviewed in our regular demo meetings?  Are we going to add this to the testers' checklist for "what do I need to validate before I sign off on a feature?"  The goal is to make our decision explicit, make sure everyone understands the level of investment we expect, and how we're going to demonstrate our compliance.

Having explicit conversations around our investment in non-functional requirements, and setting explicit metrics has the ability to turn "non-functional requirements" from vague semi-forgotten dictums into explicit common-sense cost/benefit tradeoffs made between the product owner and the development team.  We can take them from items we don't think about until the end of the project to something that's constantly visible.  And we can take them from a cause of massive heartburn and late schedule slips into a source of pride and confidence for the team.   

Thursday, June 6, 2013

Stop expecting your customers to know how to solve their problems

One of the most important things a business analyst needs to understand is this:
Your users are not (in most cases) skilled application designers.

Your users are people trying to check a bank balance, or order Season Three of The Wire on DVD, or add a new employee to the payroll system.  Most of them are not technologists.  Seems pretty obvious, right?

So why am I bringing it up?  Because very often, business analysts don't recognize the implications of this fact.  Your users are good at finding problems with your system.  They're good at evaluating potential solutions.  They're good at telling you an implemented solution solved their issue.  But what they shouldn't be expected to be good at is determining exactly what that solution should look like.

A business analyst needs to be an analyst, not a short-order cook taking tickets.

Let's consider an example.  A user of our grocery delivery website has a fixed food budget each week, and it's important for him not to exceed that budget when ordering.  However, we don't show the user any information in the shopping path about how big their order currently is, so our user doesn't know if he can afford the T-bone this week, or needs to settle for burgers.  We do keep track of the current value of the order, but it's on the "View shopping cart" page, so our user has to keep flipping back and forth between the shopping page and the cart page to keep tabs on his order.

Sounds good so far, right?  There's a real problem here, and it's probably one that can be solved with technology.

The problem is that this isn't usually how the problem is presented to us.  Very often, our users (in the process of being frustrated by something) will envision an idea that could solve the problem for them.  And so what we get from the user will be a "feature request" that looks something like this: "When I go to add a new item to my shopping cart, I want a popup that says 'This will make your total order $X.  Are you sure you want to add this item?'"

The wrong approach to this feature request is to ask "OK, what color do you want that popup?"  Then add the request to the backlog and build it.  The user asked for it!  It's a user requirement!

A better approach is to start with the request, and work with the customer to understand the reason for that request.  "OK, so help me understand how this popup makes life easier for you." "Well, I have a fixed budget, and I need to know if the item I'm adding is going to put me over that budget."  "OK, and you need something new because you don't have a way to see that today?" "Right - the only place I can get the current order total is on the Shopping Cart page, and it's a pain to keep flipping back and forth to a different page.  I need to keep track of this while I'm shopping." 

Aha.  Now we have the most important piece of a user story - the goal the user is trying to accomplish.

The feature injection approach to requirements is really useful here.  You start by "hunting the value," then building the features you actually want to implement off of that.  To borrow one of their techniques, I might write a user story for my customer's request "value first" in this case - "In order to keep my order within my set budget, as a shopper I need a way to keep track of my order value from within the shopping path."

Now that we have the goal, we can leverage our skilled development team to come up with a range of ideas on how to meet that goal.  Instead of showing everyone using the site a popup every time they try to add an item, what about just showing a running total price in the upper right corner under the shopping cart icon?  What about flashing a notification after each added item in the lower right like "Added 12 oranges for $5.68.  Order total is $96.78."?  What about allowing the user to expand the shopping cart contents from within the shopping path to see what's currently in there?  Now that I have some possible solutions, I can circle back with the user, and we can evaluate the best way to solve their problem. 
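One of those alternatives, the inline "Added..." notification with a running total, can be sketched in a few lines (my illustration; names and formatting are assumptions, not from the post).  The point is that the cart itself carries the information the user needs, so no interrupting popup is required:

```python
class Cart:
    """Shopping cart that surfaces a running total on every add,
    so the total is visible from within the shopping path."""

    def __init__(self):
        self.items = []  # (name, quantity, unit_price) tuples

    def total(self):
        return sum(qty * price for _, qty, price in self.items)

    def add(self, name, quantity, unit_price):
        """Add a line item and return the inline notification text."""
        self.items.append((name, quantity, unit_price))
        line_cost = quantity * unit_price
        return (f"Added {quantity} {name} for ${line_cost:.2f}.  "
                f"Order total is ${self.total():.2f}.")
```

Which of these options actually solves the shopper's problem best is exactly the conversation to have with the user once the goal is understood.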

So, why do so many projects seem to have an issue with this?  My suspicion is that it's related to the deference that "user requirements" (more on why I hate this term in later weeks) are given in the industry today.  The notion is that there are certain "requirements" the system needs to have, and if we want to uncover it, we just ask the users, and eventually they'll tell us what the system "needs" to do.  In this case, we have a "feature request" that came directly from a user.  It must be a user requirement!  We'll add it to the list, and build it.  What could go wrong?

We need to avoid mistaking "what the users need to accomplish their goals" for "what the users' best design for the system looks like."  Users are not great system designers.  That's OK.

Users are really good at feeling pain, and feeling its absence.  Users need to be valued and listened to.  User feedback on your application needs to be welcomed and acted on.  But that doesn't mean we should blindly expect them to design a software application.  Translating from user pain to user goals to effective solutions that allow users to meet their goals is your job and your team's job.  Expecting users to do that translation into solutions for you isn't valuing your users.  It's abdicating your responsibility, and hiding behind "Hey, I'm just doing what the users asked me to."

Thursday, May 16, 2013

INVESTing In User Stories Is Hard

If you're reading this blog, I'm hoping you're somewhat familiar with the concept of user stories (if not, here's a good place to start). 

A key concept around user stories is that they should ideally embody six properties, which are usually represented by the acronym INVEST.  What's NOT often talked about is that creating stories that follow all six of the guidelines in the INVEST acronym simultaneously is actually pretty hard.  The various "generally good" properties can often be in tension, and getting stories to follow one can often mean trading off another.

I don't think enough time is spent thinking about why following all the tenets of INVEST is difficult.  I don't think teams always recognize clearly that they ARE making tradeoffs, and that they could make different ones.  And I don't think all teams communicate well about what those tradeoffs are, why they've made the choices they have, and what might cause them to reconsider.

This week, I want to talk about the most important tensions I see in INVEST.  I want to raise awareness that some of these tradeoffs are hard, and present some ways of thinking about how to make the right tradeoffs for your project.

What's INVEST anyways?  

For those of you who aren't familiar or need a refresher, INVEST represents six properties that "ideal" user stories should have.  They are:
  • INDEPENDENT.  Good stories should not have implicit dependencies on other stories.  This is important because it gives us one of the key advantages of Agile - that our product owner should be free to choose the order in which to play the stories based on business value.  When stories have dependencies, there's an implicit sequencing - you can't play story D until after A, B, and C.  
  • NEGOTIABLE.  Stories should describe the business problem rather than a specific implementation.  This allows negotiation/back-and-forth between the skilled development team and the product owner on what the solution might look like.  Non-negotiable stories lead to "because I said so" development, where the team does what they're told without having input.  It's also a leading cause of "it's what I asked for but not what I need."
  • VALUABLE.  Every story should have a clear tie to why it delivers business value.  If a story doesn't deliver meaningful business value, why should we spend time on it?  This is often expressed in the "so that..." or "in order to..." clause of the one-sentence story formulation.  Stories that don't express value clearly are likely to be deprioritized by the product owner (or, if we actually do them, frustrate the product owner by doing something they don't care about).  Related - "valuable" stories are our method for having "thin threads through the system" - end-to-end slices of useful end-user functionality.  If we don't focus on keeping stories valuable, we can wind up splitting "build the database for X" from "build the UI for X" into different stories, neither of which is actually useful to an end-user. 
  • ESTIMABLE.  Stories should be sufficiently specific that the development team can estimate the relative complexity of implementing that story.  Stories that can't be estimated are difficult to fit into iterations,  because they take away our ability to determine what's "reasonable" in a given timeframe.
  • SMALL.  Stories should be as small as reasonably possible.  A single story that takes 4 weeks to implement is not only difficult to fit into a single iteration, but it also reduces our ability to track our own progress.  It's much easier to track done/not done than it is to track "I'm 45% complete with this story."  Keeping stories small makes planning easier, prioritization easier, execution easier.
  • TESTABLE.  It should be obvious to everyone on the team (including the product owner) when a story is complete, and the team can move on to another story.  This means we need agreed to, testable conditions of completeness.  This is often expressed in terms of acceptance criteria (e.g. an "I will know I am done when...." list, a set of "Given/When/Then" criteria, or any other agreed-on formulation).  
I believe there is a wealth of ways these items can come into conflict with each other, but here are some of the more common tensions I've seen.  

Small vs. Valuable

The single most common tension I see on teams is trading off Small vs. Valuable.   

"Valuable" stories have a clear tie to something that delivers clear business value.  i.e. we know why the story makes the user's life easier.  "Valuable" is often the Product Owner's domain on the team - the Product Owner is often the person articulating the business value, and needs to understand it to help the team prioritize effectively.

"Small" stories are in a working size that's effective for the team.  As an upper bound, stories should be no larger than can fit in a single iteration.  Most effective Agile teams use stories significantly smaller than that, which helps maintain a healthy workflow (having several smaller stories finish over the course of the iteration is easier to track than one story that's "worked on" all iteration and finishes at the end.)

The tension arises because, with a little creative thinking, any story can almost always be split into smaller pieces (i.e. there's no "smallest POSSIBLE size").  Teams can generally choose the granularity they want.  However, the smaller we split the story, the harder it is to maintain a clear grasp on how each individual piece delivers business value.  If we split far enough, we'll pass the point where our Product Owner can understand why a given piece of work is important to deliver value, and we won't be able to prioritize effectively.

For example, let's say we have a story like "As a customer, I want to view my shopping cart online prior to checkout, so that I know what I'm buying."  But that story is estimated at six weeks.  Maybe we break it up into "view order basics," "view items," "view item quantities," "view total price for each item," "view order subtotal," "view order taxes," and "view estimated shipping charges."  OK, still reasonably clear how all those items are valuable.  Then "view estimated shipping charges" has to break down into "view shipping charges for ground items only," "view shipping charges for air-eligible items," "view shipping charges when there are hazardous items," "view shipping charges for international items..."

At some point, we pass the threshold where the set of stories we break down into are so far removed from the thing the product owner actually wants ("I want to see my shopping cart contents before checkout") that it becomes difficult-to-impossible for us to clearly see the value proposition for each one.  Asking the product owner to prioritize the 12 stories related to seeing the shopping cart against the 15 stories for checkout and the 29 stories for selecting items is a very difficult task.  On the other hand, asking the development team to work on a project that has only 3 huge stories (because they're the things the product owner cares about) is likely to have serious planning and workflow issues.

How do we reduce this kind of tension?  First, if nothing else, be aware that the finer we carve stories into pieces, the harder each is to prioritize (and vice versa).  Second, most teams I've seen succeed target a certain story size.  If you're one of those teams, talk about how that's working in your retrospectives.  Don't be afraid to suggest your stories are too large to execute (or so small that you can't get a clear prioritization).  Third, ensure that WHEN you split stories, the team is focused on ensuring that each sub-story is valuable (e.g. do NOT split "build the UI" from "build the business logic").  Fourth, consider how you parallelize stories - is a story that's "too big" something that could have two developers (or two pairs if you're pairing) work on different pieces at the same time?  And if so, does that make more sense than splitting into two smaller stories we need to torture to express as "valuable"?

I think a bigger issue here is whether User Stories are in fact really the right unit for BOTH prioritization and workflow, but that's my article for next week, so stay tuned.

Small vs. Negotiable

A central principle for most Agile teams is to defer decision making until the "last responsible moment" - the moment when we have to make the decision in order to avoid future pain.  Having negotiable stories is the embodiment of this principle - we deliberately elect to defer deciding on implementation details until we've come to the point where we can't reasonably proceed without those details.

As noted above, we can in principle split stories ad infinitum if we wish, and often will do so to ensure stories are small enough to fit our "standard" granularity.  And (as we just discussed) one thing we want to ensure when splitting stories is to keep each of the substories valuable.  It's very easy, however, for this to mean that we lose negotiability - as we split a story into smaller pieces, we start making decisions about how we're going to implement the feature, and those decisions get baked into the resulting stories.

For example, we might have a story like "As a returning customer, I want to be recognized during a future purchase, so that I won't have to re-enter all my information."  If that story is too large, we might break it into stories like "As a returning customer, I want to provide a username and password when I check out, so that my previously used information can be retrieved," and "as a returning customer who logged in successfully, I want to select a previously used shipping address during checkout, so I don't have to re-enter the data," and "as a new customer, I want to be prompted to optionally register an account and provide a password during checkout, so that I can be recognized in the future."

All these sub-stories are potentially small, and all seem valuable.  But notice that we have a lot more decisions made about how the feature will be implemented than we had in the original story.  We're using a username and password (as opposed to, say, storing the information in a cookie on the user's machine for later retrieval).  We added a step to the checkout process to choose a password (as opposed to, say, sending an e-mail with the option to register post-purchase).  We have the ability to select from multiple previously used shipping addresses (as opposed to just the most recent one).

All these decisions might be correct, and this might be the best implementation.  But by breaking down the story this way, we've clearly implied a number of details of the implementation, and so reduced the negotiability of the story.  Creative ideas the developers might have (for example, using a third-party ID tool instead of rolling our own) may be squeezed out.

How can we reduce some of this tension?  Again, a big piece is awareness - understand that when we have a story that needs to be broken down, we might need to make certain decisions.  Second, make sure the development team is involved in those decisions - if our choice of stories is going to restrict their available options, let's make sure that the way we're breaking things down is what the development team thinks is correct (as opposed to being "the only way the analyst who broke the story down could think might work").

Independent vs. Small

Sorry, Small.  I know I'm picking on you, but you're hard to do well. 

As mentioned above with "Small vs. Negotiable," one common artifact of the process of breaking a story into smaller, valuable pieces is making certain technical decisions up front, thus reducing the overall negotiability of a feature.  A related common practice is to break a given story up into smaller stories that need to be done in a required order to be meaningful, which means they're no longer independently prioritizable. 

For example, if we're building what we've decided is a three-step registration wizard, we might break it up into the logical steps the user goes through, like "Provide personal contact details in the wizard," "Provide address details in the wizard," "Provide payment details in the wizard," "Check the data collected by the wizard for errors," and "Create the profile based on wizard data."  As noted above, this breakdown is less negotiable, but let's say we talked it over with our developers, and we all agree this is the right way to do registration. 

Now we might have a different problem, which is that the stories are assuming they need to be done in order.  If the only way to get to the "Payment Details" is from a button on the "Provide personal contact details," then logically we need to do contact details first.  And we probably can't do "check for errors" or "submit" until we've collected all the data. 

The problem with creating "flows" of stories like this is twofold.  First, we've reduced our ability to reconsider pieces of the flow - if we decide later "You know what? Storing payment data is a security risk we don't want to take, and it's not really hard to re-enter it, so let's cut that story," then we have to re-work (at least) the "check for errors" and "submit" stories, and maybe others.  Second, by having flows, we can "trap" ourselves during development - if the stories MUST be done in order, then we can't do story 2 until story 1 is done, and can't do story 3 until after story 2.  This means we "single thread" on these stories - we can't have more than one in play at a time (since they depend on each other).  This can lead to long lead times. 

The first thing I'd consider to reduce this tension is thinking about whether we've actually split the stories the right way, or if there's a different split that allows more independence.  The piece that's suspicious to me here is the separate "check for errors" and "submit."  A possibly better split would be "collect personal information from a wizard and save it to a profile," "collect address information from a wizard and save it to a profile," and "collect payment information from a wizard and save it to a profile."  Rather than needing to build up a whole profile before saving, add the mechanism with each piece to error check and save that piece.  If we want to do the "payment" story first, so be it.  We might have (at least temporarily) profiles that are just anonymous payment information - is that actually a problem, or just a different way of thinking about profiles?  That's not to say EVERY dependency issue can be resolved by a different splitting of stories, but it's sometimes the case that we block ourselves by thinking too narrowly about our options (in my example, thinking of the "profile" as an atomic thing that we need to build completely before submitting).  
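To make that alternative split concrete, here's a minimal sketch in Python.  Everything here (the `save_section` function, the `REQUIRED_FIELDS` table, the field names) is a hypothetical illustration rather than code from any real system - the point is just that each wizard section validates and saves its own slice of the profile, so the stories can be played in any order:

```python
# Hypothetical sketch: each wizard section validates and persists its own
# slice of the profile independently, so no story depends on another.

REQUIRED_FIELDS = {
    "contact": {"name", "email"},
    "address": {"street", "city", "postal_code"},
    "payment": {"card_number", "expiry"},
}

def save_section(profile: dict, section: str, data: dict) -> dict:
    """Validate and save one wizard section; return the updated profile."""
    missing = REQUIRED_FIELDS[section] - data.keys()
    if missing:
        raise ValueError(f"{section} is missing: {sorted(missing)}")
    updated = dict(profile)
    updated[section] = data
    return updated

# We can play the "payment" story first; the profile is simply partial
# (anonymous payment info) until the other stories are done.
profile = {}
profile = save_section(profile, "payment",
                       {"card_number": "4111...", "expiry": "12/25"})
```

With this shape, cutting the "payment" story later means deleting one entry from the table, not reworking a "check for errors" story that assumed a complete profile.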

Negotiable vs. Estimable

When stories are negotiable, a range of potential solutions are possible, so long as we solve the underlying business problem.  However, a "range of possible solutions" can be difficult-to-impossible to estimate.

For example, a story to "provide feedback to a user when they are missing required data elements on the registration form" could be as simple as re-displaying the form with the missing fields highlighted in red, or could be as complicated as providing a wizard to guide the user through the form, with step-by-step instructions translated into the user's language.   

It would be nearly impossible for a reasonable development team to provide a single estimate with any degree of accuracy that covers both ends of this spectrum.  On the other hand, if we decide that, in order to be estimable, we will estimate the first option (redisplay with a color highlight), then we've reduced negotiability and made a choice we might regret later (say, when we have a great framework we need in 6 other places to provide text guidance, but have to do something different here because "the story says to").

To resolve some of this tension, a good practice is to separate the necessary (the business problem and the acceptance criteria) from the assumed (the specific implementation we're assuming in order to estimate).  I like to keep a separate "estimating assumptions" section in stories, where we clearly record what assumptions our estimate was based on.  Relatedly, we need to revisit these estimating assumptions - if it becomes clear that the way we're going to implement the story is different from what we assumed earlier, we should revisit whether the estimate still makes sense.  Finally, we need to set the expectation with the team that solving the business problem is more important than justifying our estimate - if there's a better way to solve the problem than the one we assumed when we estimated, we will pursue the better solution rather than justify our estimate.
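As a sketch of what that separation might look like (the field names here are my own invention, not any standard tool's schema), a story record could keep the necessary and the assumed in clearly distinct sections:

```python
# Illustrative structure only: keep what EVERY acceptable solution must
# satisfy (acceptance_criteria) apart from what the estimate happened
# to assume (estimating_assumptions).
story = {
    "problem": "Users get no feedback when required registration data "
               "is missing",
    "acceptance_criteria": [   # necessary: holds for any solution
        "User learns their information is incomplete",
        "User is told which additional information is required",
    ],
    "estimate_points": 3,      # based only on the assumptions below
    "estimating_assumptions": [  # assumed: revisit if the design changes
        "Re-display the form with missing fields highlighted",
        "No guided wizard, no per-language instructions",
    ],
}
```

If the eventual design departs from the `estimating_assumptions` list, that's the trigger to re-estimate, rather than a reason to defend the old number.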

Negotiable vs. Testable

I actually see two separate sources of tension here I want to tease apart.

First, similar to the tension with Negotiable vs. Small, there's a temptation to assume a specific implementation in our acceptance criteria for a story, either to make the story more concrete, or because there's only one possible solution in the mind of the person writing the acceptance criteria, and they don't realize other solutions are possible.

Second, similar to the tension with Negotiable vs. Estimable, a story that's highly open to a variety of options is very difficult to plan specific test cases for.  We can't plan what we're going to test if we don't know exactly what the code is going to do yet.

For the first tension, I think it's important to think about our acceptance criteria as different entities from the tests we're going to execute to ensure the code works properly.  Acceptance criteria ought to be "properties that EVERY acceptable solution to the problem must have."  It takes thought to make our acceptance criteria implementation-agnostic.  It also introduces a translation step - when we test the actual implementation, we need a mapping from the abstract "thing we need" to the specific "what we're going to do."

Consider the following two formulations:
  • GIVEN I have not provided all the required registration information, WHEN I attempt to register, THEN I should get clear feedback that my information is incomplete AND I should be told what additional information is required.
  • GIVEN that I have not filled out some required fields on my registration form, WHEN I submit the form, THEN I should see my form re-displayed with an error message at the top and the missing fields highlighted in red.
The first form is relatively implementation agnostic.  Multiple implementations could satisfy it.  However, it's not easily executable - to actually test this, we need to know what "required information" is, what "attempting to register" means, what "feedback" will look like, etc.  The second formulation is considerably closer to a runnable test case.  However, it assumes a number of implementation elements - a "form," some kind of "submit" action, "fields" to highlight, and a separate "error message" at the top.  
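To illustrate the translation step, here's a hedged sketch in Python.  The `register()` function and its return shape are invented stand-ins for whatever the real implementation would be; the assertions check an implementation-specific contract (which fields get reported as missing) - exactly the kind of detail the first, abstract formulation deliberately leaves open:

```python
# Hypothetical sketch: register() stands in for the real registration
# endpoint.  Its return shape (ok flag plus a list of missing fields)
# is an implementation choice the abstract criterion doesn't dictate.

REQUIRED = {"name", "email", "password"}

def register(form: dict) -> dict:
    """Toy registration: reports which required fields are missing."""
    missing = sorted(REQUIRED - form.keys())
    if missing:
        return {"ok": False, "missing_fields": missing}
    return {"ok": True, "missing_fields": []}

# Abstract criterion: "told what additional information is required."
# Concrete test: the response names the specific missing fields.
result = register({"name": "Ada"})
assert result["ok"] is False
assert result["missing_fields"] == ["email", "password"]
```

A different implementation (a guided wizard, say) would still satisfy the abstract criterion but would need entirely different concrete tests - which is why we defer writing them until the design is chosen.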

The second tension is more a question of when we need to do the translation from an abstract "need this to be true" into "we will do the following to know that it's true." It's virtually impossible to practically test a user story if you don't know whether you're submitting a form full of data or retrieving information from a third-party repository.  At some point, we need tests specific to our chosen implementation. 

There are a few things I suggest to lessen this tension.  First, just as I suggested separating the "necessary" from the "assumed" when talking about Negotiable vs. Estimable, I suggest separating Acceptance Criteria (describing the necessary) from Acceptance Tests (describing how we'll verify we meet the acceptance criteria).  Ideally, we present the developers with the Acceptance Criteria, and work out the details of the Acceptance Tests with them when we're ready to implement the story and so have to pick a specific design.  Related - just as it's often an anti-pattern to have too large a backlog of fully fleshed out user stories, we should resist the temptation to build significant test plans for stories we haven't yet begun to work on.  One technique I've found successful with multiple teams is to have a "story kick-off" with the analyst, testers, and developers (and ideally product owner) when we start development on a story.  This ensures we have a common understanding of what the goals of the story are, and that the developers can articulate their expected vision for the implementation of that story.  This allows the testers to develop more concrete test cases to the actual design "just in time" when the actual design is known.

Wrapping up

As I mentioned earlier, I could go on with this - there are a lot of other places where there can be perceived tension between the various tenets of INVEST (Independent vs. Testable, you got lucky this time...)

What I hope I've accomplished in an overly-long blog post is to at least illustrate WHY there are potential tensions, and that no matter what you're doing on your team, you ARE making some of these tradeoffs.  If things are going well, you're probably making the tradeoffs that work well for your team, so well done.  If things are frustrating, hopefully I've suggested some places to look and some balances you might want to revisit. 

Tuesday, May 7, 2013

Trust the people, not the process

In my view, one of the most misunderstood pieces of the Agile Manifesto is "People and Interactions over Processes and Tools."

Too many people believe this point is limited to one or both of the following statements:
  • The process steps in your SDLC in Agile are different from the process steps in waterfall.
  • Changing to Agile means you need to use different tools like Mingle or Rally instead of ClearCase, Trac, or MS Project.
In fact, "People and Interactions over Processes and Tools" is a much more fundamental mindset change - one of the hardest things to accomplish in an Agile transformation.

Quick quiz.  In a "traditional" waterfall environment, who is responsible for ensuring the team produces high-quality, useful software?

Is it the project manager, who's responsible for the overall project (even though they don't build anything)?  The business analyst, who puts together the requirements (but doesn't execute them)?  The developers, who build the software (but don't have a lot of say in what we're building or why)?  The testers, who test the application meets its requirements (but don't have any say in what those requirements are, and in many cases don't really understand them)?

I believe the "right" answer for waterfall projects is that it's not any person's responsibility.  Instead, it's the PROCESS that is responsible.

Here's how this usually works.  Before the project begins, a subset of the team spends a lot of time putting together a highly detailed set of requirements.  They write detailed use cases and produce high-fidelity comps.  They make all the decisions on what the moving pieces need to be, and create class diagrams and database models for everything.  They break all the development work down into highly detailed tasks.  And then those artifacts are handed off to "the team" to execute the tasks.

How does a developer on the team know what they're doing is right?  Easy!  They read the plan.  If the plan says to build these three classes with these 12 methods, they build those classes.  Why those classes and not others?  Because the plan says to, and I trust that the process put together the "right" plan.  Is the stuff I built useful?  It must be - the plan said so!

How does a tester know we built a good user interface?  Easy!  They compare "what it does" to the list of "what it's supposed to do" in the requirements.  Can a customer actually figure out how to use the screen to accomplish a goal?  They must be able to - the plan said so!

The "build a plan, follow the plan" mentality actively reduces agency by individual team members.  Team members are supposed to do "their job," and if everyone does, well, I guess we'll get high-quality useful code as a byproduct.  The team needs to have faith that the people who built "the plan" knew what they were doing.   They understood all the customer needs, all the architectural foibles, all the possible edge cases, and put together a plan that covered all the contingencies (other than a few tweaks that will come in through the oh-so-friendly change control process).  Because that was their job.  Doing what they told me is mine. 

If I did my job, I'm no longer accountable.  Hey, don't blame me that the system cratered - I built it just the way the architect designed it.  Don't blame me - I tested everything against the documented requirements.  Don't blame me - we built exactly what the customer told us when they signed off on the specs.  If it doesn't work, it's not MY fault.  We all followed the process!  

In an Agile world, yes, we have different steps in our processes.  Yes, we document our requirements differently.  Yes, we are more iterative in our approach.  Yes, we use different tools.  But more importantly than ANY of those things, we stop believing that "the process" is the thing that produces positive results.

In Agile, if a developer doesn't believe that a given user story is the right way to achieve that story's stated goal, we EXPECT that developer to question it.  They need to have a conversation with the analyst who put the story together, the product owner, the customer - whoever is the right person.  We don't trust that the "story generation process" produced the right story.  Instead, we trust our smart and thoughtful developers to ask reasonable questions and expect either good answers or appropriate changes.  If a tester doesn't think that the acceptance criteria documented for a story really capture what's necessary for a user to accomplish the stated goal, we don't expect them to ignore that belief and blindly trust that "the process" captured what's needed.  We expect them (indeed, we demand of them) to express their concerns and hold others accountable for getting it right.  If the user has a problem and suggests a possible solution, we expect the business analyst to work with them to explore other solutions and validate that their suggestion really is the best approach, rather than blindly assuming "the customer must know exactly how to solve the problem" and writing down exactly what the user said.  

To move to Agile effectively, we can't just swap out a team's process with a different process, or their tools with other tools.  We have to attack the mindset that the processes and tools are the things that make us successful.  We have to attack the mindset that "doing your job" means "doing what you're told."  We have to attack the mindset that blindly "doing your job" inherently leads to success.  We have to attack the mindset that understanding what someone meant means reading a document.  We have to attack the mindset that responsibility for the project being successful resides with "someone else."  We have to attack the mindset that the thing that makes us successful in Agile is doing standup meetings and estimating in story points. 

This is a hard problem.  It's one a number of putatively "agile" teams I've worked with have not in fact solved. 

Customers and Product Owners need to expect and get used to being questioned on whether what they asked for is right.  Analysts need to get used to pushback on whether what they wrote up is the right thing.  Developers need to get used to being questioned on why they built it that way.  Testers need to get used to being expected to be expert on the business problem, and pushing back on things that (strictly speaking) aren't in the requirements.

People who "grew up" in traditional software environments are often scared of this.  How can I "do my job" when I'm no longer completely sure what "my job" means?  Won't my manager yell at me if I push back on "what the customer asked for" and "slow down" the process?  Will I be called on the carpet if I do something that's not on the script and it turns out to be wrong?

The key to all of this is establishing trust.  The team needs to feel trusted to be good at their jobs.  Trusted to solve problems creatively.  Trusted that they will make the right decisions.  Some of this trust can come from within the team.  But even more important, the team needs to feel trusted by management - that they won't be constantly second guessed.  That they have freedom to occasionally make mistakes.  That the time they spend talking through issues won't be considered "waste" time to be minimized.  That their estimates will be respected.  But most importantly, that they know what they're doing, and they're trusted with ownership of their own quality. 

A team that's trusted, and that trusts each other, will naturally build the communications links necessary to validate their assumptions.  They'll talk to each other constantly.  They'll develop "just enough" processes to ensure they all know what's going on, and that everyone is aligned.  A group of trusted people with a clear goal is the most powerful force in software.