Thursday, August 28, 2014

What A Piece of Plywood Can Teach Us About Software Design

When I was in high school and college, I was heavily into technical theatre (scenery, lighting, etc).  Today, I want to talk a little about an experience I had studying stage design, and somehow try to relate that back to software design, business analysis and problem solving.

A little background on me.  I love to build things.  I love the idea of shaping things by hand, and of building things from scratch (it's also why I enjoy cooking).  I got into theatre through carpentry.  I started off building stuff, and eventually worked my way up to designing the stuff to build (hey, software parallels already!).  I loved the idea of designing and building these artificial worlds for characters to inhabit.  So, I did a bunch of shows, and also in college I took a few electives in scenic design.

In my scenic design course, I ran across a quote that was attributed to famous scenic designer Ming Cho Lee, one of the leading voices in contemporary stage design.  I'm unfortunately unable to find the original reference or citation here, so I can't swear to the source of the quote.  But this is more a story about me, so I'm not sweating it.  What Ming Cho Lee (allegedly) said was this: "Once you know a piece of plywood measures 4 feet by 8 feet, you will never be as good a designer again."

I want to pause here and explain why the size of a piece of plywood would be important to a stage designer.  While scenery can come in all kinds of shapes and sizes, there are two major building blocks that go into most designs.  First, there are upright framed pieces called flats, which generally make up the "walls" of most sets.  Second, there are heavy duty flat pieces called platforms or risers, which are used to make raised floors or levels.  Most sets are built from combinations of these two basic components.

As you've probably guessed by now, both flats and platforms are usually constructed from plywood - in both cases, a wooden frame of some kind covered by a piece of plywood (thin paintable plywood for flats, heavy duty flooring grade plywood for platforms).  And since plywood comes in 4' x 8' sheets, the most convenient size of wall flats and floor platforms to build is 4' x 8'.  In fact, many theaters keep a stock of 4x8 pieces, which can simply be reassembled and repainted into any configuration the show requires.

So, anyways, I was pretty angry about what Ming Cho Lee (allegedly) said.  As I said earlier, I came up as a carpenter before being a designer, so I knew exactly how big a piece of plywood was.  I had always considered my knowledge of how to build scenery an asset to my work as a designer - when I designed something, I knew how it would be built.  I knew I wasn't designing anything with impossible geometry.  I could draft up plans for my carpenters, anticipate their problems, and have good conversations.  I'd seen a number of student designers get into trouble from not realizing that there was no way to get out of that exit door, or that the walls couldn't come apart like they were anticipating for the final scene.  I never had those problems.

Knowing about construction (I felt strongly) made me a better designer, not a worse one.  So I interpreted Ming Cho Lee's alleged quote as something an artist who was too avant garde for their own good would say.  Something like "the true artist doesn't concern themselves with the mundane details of how the art is executed" or some such.  I saw it as impractical nonsense - hey, dude, however "pure" your art might be, we're going to build it out of plywood eventually.  Screw you, person who might be Ming Cho Lee!

I didn't really revisit my opinion of (possibly) Ming Cho Lee for a long time.  But a decade later, and after moving into a job where I help build things out of 1's and 0's instead of plywood and 2 by 4's, I think  I finally get what (possibly) Ming Cho Lee was trying to (allegedly) say.

What I think he wanted to do was to draw a distinction between the problem of design, and the problem of execution.

An empty theatre is a magical place.  It can be anything and anywhere, from a Viking long boat to the palace of a king to a shabby apartment to a merry-go-round.  Whatever fits the needs of the show we've chosen to put on and what we're trying to say by producing that show.  The designer's job is to envision a world for these characters to inhabit that looks right - right size, right feel, right flow, right colors.  This is a highly artistic endeavor.  What's right is what looks right in the designer's head (generally communicated in drawings to the rest of the team).

When it comes time to actually construct the design, we might make certain tradeoffs for practicality - if the walls in the drawings are 8'3" tall, we might ask if we're OK shrinking them down to 8'.  If the stair landing measures 3'10" x 4'2", we'll ask if we really want to construct something custom, or if the 4'x4' platform we have in stock will do.  Often, if it's not a big deal, we'll tweak the original vision slightly to make the execution easier.  Sometimes we won't - "look, I realize 8' walls are easier to build, but we really need 10' walls here to convey the grandeur of a Victorian mansion."  Sometimes this is a construction convenience, sometimes this is budget - "I'd really like 6 over 6 divided light windows, but the windows we have in stock are single pane, and we don't have the budget to buy new ones, so we'll use them."

But regardless of how much freedom (and budget and time...) we have, what we want to do is design the thing that's "right," and once we know what that is, consider making tradeoffs for convenience in execution.

But now consider what happens if our designer isn't thinking about what looks "right," and instead tries to think about the execution first - the designer who's thinking about the plywood and not the design.  Rather than designing a properly proportioned space with the right "feel," they'll start lining up 4x8 flats across the space until it's "close enough" to the size we need.  Rather than thinking about what the right proportions are for that balcony hanging over the garden, they'll start by assuming we'll take a 4x8 platform we have in stock and putting it up on legs.  They'll never consider a wall that's taller (or shorter) than 8 feet, no matter what the show is.  They'll never consider a transition into the sunken floor of the living room that's not a straight line across (and a multiple of 4' long).

And this is where they've stopped designing.

It's no longer the case that we're adapting a vision to fit practical reality.  It's that we've never worried about a vision at all.  We've never considered any elements that weren't easy, or weren't "standard" sized.  We'll wind up with a set that's probably functional, and certainly easy (and cheap) to build.  But we've never really considered whether it was "right" for the show.  We've abdicated being an artist to instead be a draftsman.  And the art (by necessity) suffers.

OK, so why am I talking about this in a blog that's (allegedly) about software in general, and business analysis in particular?  Because I think the same logical trap of the set designer stacking up 4x8 blocks applies to someone designing a piece of software.


The heart and soul of business analysis (to me, anyways) is to understand our users' goals, objectives, and desired outcomes, and to understand how we'd know when those goals were achieved.  It's about the "why" and the "what," not the "how."  Eventually, of course, we'll need to work with the rest of our team to define the "how," and the "how" is what we'll actually implement.  But thinking about the "how" too soon can turn your brain off - you stop listening for the goal, and simply start "stacking up" the tasks needed to make whatever the customer asked for "fit" in your current solution.

For example, let's say we're working for an e-commerce company.  They've done all their business to date on a "sales" model, but now they also want to add "rentals."  As we're working with the customer to understand how rentals work, we think to ourselves, "OK, we already have an 'Order Type' table - we can just add another order type called 'rental' and it will flag all the rentals.  OK, the customer is talking about returns now - we'll just add another 'Return_Type' for rental returns.  Easy peasy."

Except that we've really stopped listening halfway through the conversation.  We envisioned an easy solution, and started filtering everything the customer said through that lens.  We stopped thinking about what was different for rentals, and started thinking about how we'd make them fit our existing paradigm.  Which might be disastrous.  Because there might be a whole host of things that might be critical to launching a rental business other than just having another order type.

Is rental a separate kind of inventory?  Do we need to asset-tag it separately from "purchase" inventory?  Don't we need a way to track which item is where?  What happens if we get something back "damaged"?  When do we allow someone else to rent something that's currently rented to someone else but is expected back soon?  Just having two more order types ignores a whole host of areas we might want (and need) to explore.
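To make that concrete, here's a rough sketch of the difference between "just another order type" and actually modeling what a rental needs.  Every name and field below is something I made up for illustration, not a real system's schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# The "stopped listening" version: a rental is just a sale with a different flag.
@dataclass
class Order:
    order_id: str
    order_type: str   # "sale", "rental", "rental_return" - and that's the whole design

# Some of what listening to the rest of the conversation might surface.
# Every field below is a question we'd never have asked otherwise.
@dataclass
class RentalAsset:
    asset_tag: str
    sku: str
    status: str = "available"   # "available", "on_rent", "in_repair", "retired"

@dataclass
class RentalAgreement:
    agreement_id: str
    customer_id: str
    asset_tag: str                               # which specific unit went out the door?
    due_back: date                               # when do we expect it back?
    returned_on: Optional[date] = None
    condition_on_return: Optional[str] = None    # "ok", "damaged", "missing parts"...
    damage_fee: float = 0.0

def can_promise(asset: RentalAsset, current: Optional[RentalAgreement], wanted_from: date) -> bool:
    """Can we promise this unit to a new customer starting on wanted_from?
    A policy question that simply doesn't exist in a pure "sales" model."""
    if asset.status == "available":
        return True
    return (
        asset.status == "on_rent"
        and current is not None
        and current.due_back < wanted_from
    )
```

The point isn't that this is the right model - it's that none of these questions even come up if we've already decided "it's just another order type."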

We need to understand the problem first, and only when we've got that down should we worry about what our options are for implementation, and which options are easier.  The approach that's simplest in our architecture is OFTEN the best approach.  But if we assume an approach too early, we miss the opportunity to discover other options that may be superior.

Listen to your customers, understand them well, and then be the best designer you can be. 


Thursday, June 26, 2014

I Love Pecha Kucha

Hi, folks,

I want to talk this week about my favorite technical/non-technical presentation format - PechaKucha™ **

This week, I want to talk about what Pecha Kucha is, why I like it so much, how to use the format effectively, and close with a Pecha Kucha I gave on how to do a good Pecha Kucha.

What's a Pecha Kucha, anyways?

Simply put, a Pecha Kucha is a type of lightning talk with a very specific format.  I'm using "lightning talk" in the general sense to refer to "any short presentation" here.  All Pecha Kuchas are lightning talks, but not all lightning talks are Pecha Kuchas.

A Pecha Kucha is sometimes referred to as a 20x20, because the specific restriction of the format is "Exactly 20 slides, with each slide timed to auto-advance to the next slide after exactly 20 seconds."  This means a Pecha Kucha is exactly 6 minutes and 40 seconds long.  By design, the presenter has no active control during the presentation - the slides neither wait for the presenter to be finished nor advance whenever the presenter is ready.  It's up to the presenter to find the right rhythm.

Where did Pecha Kucha come from?

Pecha Kucha was invented in 2003 by Astrid Klein and Mark Dytham, two architects who lived in Tokyo and also owned a bar/experimental space called SuperDeluxe, as a way to get people excited to come to their venue.  They were inspired by a simple idea - architects (like most people) tend to talk too much.  60 minutes of PowerPoint is a lot to listen to.  So, they devised a format to force presenters to boil their presentations down to their essence.

And the first PechaKucha™ night was born.

There are now regular open Pecha Kucha nights in (as of this writing) 700 cities worldwide, with people speaking on all kinds of topics.  You can learn more about them at pechakucha.org.  I highly recommend checking out their "Presentation of the Day" archive for some truly mind-blowing stuff.

What's so cool about Pecha Kuchas?

The big thing I love about the format is that technologists suffer from the same problem as architects - they talk too much, and if you ask the average technologist to write a presentation, it's likely to be long and dry. 

But we're also a learning culture - the state of the art in technology changes rapidly, and it's hard to keep up.  Learning what our peers are up to, what they're experimenting with, what's worked for them -- all of this is vital to keeping on the cutting edge.

Pecha Kucha, with its short, focused format that forces presenters to get to the point, is ideal for technologists to give each other a peek at what's new in the world, without taking too long or getting into too much detail.  It's just enough content for me to know whether I'm interested in learning more about a topic, and to get excited about new ideas.

It's also an accessible presentation format for people who haven't necessarily had experience giving public presentations before.  The format does have some challenges, but 6 minutes and 40 seconds of engaging content is considerably easier to generate than 60 minutes of content.  (At ThoughtWorks' internal training program for new college grads, we require every participant to write and present a Pecha Kucha.)

I host a semi-regular Pecha Kucha night at ThoughtWorks' New York office, where our consultants (and some special guests) get the opportunity to let the rest of ThoughtWorks know what they're working on, looking into, or doing in their spare time to make the world a better place.

Sounds interesting - how do I get started?

Glad you asked!  I actually put together a Pecha Kucha on why you should do a Pecha Kucha - one that also covers how to put together a good Pecha Kucha - which was presented at one of our ThoughtWorks Pecha Kucha nights!




** PechaKucha™ is trademarked to protect the term and the network of open PechaKucha nights around the world. You can read more about the trademark at www.pechakucha.org/faq. I very much support their right to trademark their work. However, to avoid crazy overuse of the ™ symbol all over this post, I will be omitting the trademark symbol when I'm talking about the concept (as opposed to the organization).

Thursday, June 12, 2014

Software Doesn't Have Requirements

I'm back!  Today I'm going to talk about one of the most pernicious myths in software development today.  It's this: Software has Requirements.

This may seem like a surprising thing to take issue with.  People have been talking about "software requirements" since, well, since there's been software.  It's the industry standard term to describe "what we're going to build."  Even leaders in the Agile space talk about software having requirements.  

So why am I picking on the word "requirements"?  Because it's the wrong word.  But more than that, it's a dangerous word.  Thinking about software as having "requirements" affects how we think about our development process in many suboptimal ways.

Software (by and large) doesn't have requirements!

In this article, I want to convince you why "requirements" is the wrong word, how thinking about "requirements" holds back our thinking and collaboration, and propose a better way to think and talk about what we're trying to build. 

Why software doesn't have requirements

I'll admit this isn't a completely new idea.  Among others, Kent Beck wrote about this back in 2004, and Jeff Patton wrote a very influential article on the subject back in 2007.

What do I mean when I say software doesn't have requirements?  I mean that very little of what goes under the heading of "requirements" is really "required" for us to build successful software.

If software has requirements, the first question to ask is "required by what (or whom)?"  And "required for what?"  Something that absolutely must be included for our product to have any hope to succeed in the market?  Something that's the only way to solve the business problem we're trying to tackle?  Something that we're certain we can't omit without risking failure?  How would we even know these things (if they exist)?

What we're really talking about when we talk about requirements are DECISIONS made by potentially fallible humans.  A trusted human, perhaps.  One who has a good understanding of the problem, hopefully.  But it's someone's best understanding.  Not some great truth about the market for software that's been hiding out there waiting for us to discover it.  They're not something that's "required" for success.  They're our BEST UNDERSTANDING of what successful software MIGHT look like.

Why am I belaboring the point here?  Because what we call requirements...aren't.   There might be other things we could do that could succeed.  There might be things we think are required that it turns out we could have done without.  They're not objective fact.  They're subjective opinions.

The closest most projects come to having true "must have" requirements is actually the much maligned, little understood non-functional requirements.  If we're rolling out an app to 10,000 people, it's pretty much required that it can support 10,000 users.  If we're building an app to integrate our HR and accounting data, it's got to be interoperable with our HR and accounting systems.  If we're building software for medical records, it needs to comply (in the US) with HIPAA rules.  This isn't hard and fast - many non-functional requirements are really decisions too.  But to the extent true "requirements" exist, they're more likely to be on the non-functional list.

Why believing in requirements is dangerous

Regardless of what we call them, in any software project there's going to be a set of stuff that we build.  So why do I care so much what we call them?

Because labeling the "stuff we're building" as requirements is actively dangerous to our thinking. 

First, belief that software HAS objective requirements implies we can (and should) discover those requirements.  The requirements exist independent of the team.  Belief in "requirements" is belief in a right answer.  There's a correct "set of stuff" out there.  We just need to figure out what it is.  This implies investing (potentially considerable) time trying to determine and build a list of requirements.

And once we do, we should think the list is largely static, since everything on the list is "required."  After all, if we can decide not to do it, it must not have been "required" in the first place!  Yes, in Agile projects we manage our backlog, and re-prioritize frequently.  We can discover new "requirements" over time.  But (in my experience) we rarely REMOVE items from the backlog.

Thinking about "requirements" drives a wedge between people who ought to be collaborating.  It sets up a distinction between "the smart people who understand what's required" and "the people who implement those requirements."  It's not the team's job to think about what we should be building.  We're just building what's "required."  Requirements, by definition, are non-negotiable. 

Belief in requirements inverts our thinking.  The most important piece of building software is determining what we want to build and why.  But if the "what" is a non-negotiable list, and the "why" is "because it's required," we're telling the team to only focus on the "how."  But the "how" is the least important piece - we can do the wrong thing as well as humanly possible, but it's still the wrong thing. 

In practice, we know this is the wrong way to think about software.  There are usually multiple ways to solve a problem.  There are many different approaches that could potentially succeed in the market.  There's not a "right" answer - there are MANY right answers.

As Fred Brooks wrote in The Mythical Man Month, "The hardest single part of building a software system is deciding precisely what to build."  (quote stolen wholesale from Patton's article)  We need to keep our best efforts focused on solving that problem, not assuming the answer and focusing on less important matters. 

Rather than try to figure out in advance what will work, it's usually a better idea to TRY things and see what works.  This principle underlies the Lean Startup movement, A/B testing, and other experiment-driven approaches.  Don't try to figure out what OUGHT to work.  Don't rely on an expert to DECIDE what should work in advance.  Try something, see what works, and adjust.

Hey, isn't experimentation a way to "discover" requirements?  Sort of.  But only retrospectively.  Believing that we can have "requirements" for software is the belief that there's something knowable IN ADVANCE that tells us where to go.  Experimentation turns this on its head - explore many possibilities, and the ones that work were the right ones.  Even here, it's not clear that the working approaches were "required" - there could be other things that would have worked we didn't try.

Problems, Ideas, Decisions, Designs

This would be a pretty boring article if all I was doing was complaining about a word without offering any constructive ideas on how to think about "the stuff we build" differently.  Here are one analyst's thoughts on a better way to think about it.

In my view, there are four major components of the "what do we build?" problem.

PROBLEMS

Problems are our current perception of issues that people have that need to be addressed.  They might be based on things someone told us, things we observed, or just things we think could be better.  No matter where they come from, there's some set of "problems" we think exist for some set of people in the world that we could potentially try to address with our software.


There's no guarantee our list of problems is CORRECT.  We might think something's a problem when it isn't actually important.  We might not understand a problem that's really important.  Our list of "problems" is our BEST UNDERSTANDING of what is out there to potentially be solved.

If we're thinking about the online banking space, here are some examples of problems we might perceive:
  • When I want to buy something, I don't always know how much money I have.
  • Dealing with paper checks is a hassle.
  • I can't always find an in-network ATM when I want cash. 
  • It's inconvenient to pay monthly bills one at a time on different sites. 
We don't have to solve all of these problems together.  We might decide not to solve some of them at all.  But the list of problems is a good start for the universe of "what issues we MIGHT choose to address."

Problems, by definition, exist independent of any particular solution.  Some problems might have a single solution.  Some might have many.  Some might be insoluble.  When we're identifying problems, we shouldn't care. 

A key thing to avoid when identifying problems is to avoid the self-justifying solution.  The absence of a particular solution is never ipso facto a problem.  "I don't have a mobile application that guides me to the closest ATM for my bank." isn't a good problem statement, for two reasons.  First, it pre-supposes one (and only one) solution (a mobile app).  Second, it doesn't tell us anything about WHY we want that solution - what's the issue that causes me to want an ATM finder app? 

IDEAS

Smart, creative people can formulate a number of ways that we might completely or partially address some of the problems we've identified.  Those possible solutions are our ideas of things we can potentially do.

There's no expectation that we have a single good idea on how to solve every problem we've identified.  Sometimes we might have no ideas.  For some problems, we might have multiple ideas.  We might have ideas that completely solve a problem, or only partially address it.  Our ideas might be contradictory.  This is OK.

Here are some ideas we might have on how to deal with "I can't always find an in-network ATM when I want cash":
  • Build a mobile app that uses geolocation and a map of known ATM locations to guide me to the closest ATM.
  • Partner with Google Maps to have an option to show our ATM locations as a native overlay.
  • Waive our fees for using out-of-network ATM's and reimburse the other banks' fees so our customers can use any ATM without penalty.
  • Build an NFC-based mobile wallet application linked to a bank account so I don't need cash so often.
  • Deploy iBeacons on all our ATM's so they're easier to find.  
  • Partner with a well-known company with a large footprint (e.g. McDonald's) to have an ATM in every one of their stores, to increase our footprint and increase our visibility.
  • Build mini-ATM's and install one in the home of every customer who requests one.  
Not all our ideas necessarily need to involve software.  Some of our ideas might not be great in combination (would we partner with Google Maps and ALSO build our own native app?)  Some of our ideas might be ridiculously infeasible (mini-ATM's in everyone's home) or cost-ineffective (waiving all our ATM fees).

Our ideas represent the universe of things we think might be worth doing.

DECISIONS

Obviously, we can't implement every idea.  If we're thinking creatively, we will almost certainly have more ideas than we're capable of implementing.  Some problems are more important than others.  Some ideas are better than others.  Sometimes we need to make a choice between several plausible ideas to address a problem.  These choices are our decisions.

Decisions represent our choices of which of our existing ideas we think are worth implementing and want to implement first.

We don't have to exhaustively decide on every idea we want to implement before we start.  We don't necessarily have to rank every idea we want to do.  It could be enough to start with a single idea to implement (lean-ists would probably recommend this approach).  We don't have to make all our decisions up front (and we probably shouldn't).

Deciding to implement an idea isn't an irrevocable decision.  If we're following an experimental approach, we might determine that an idea we thought was good didn't work out.  That's OK - our decisions can be reconsidered.  

A prioritized backlog of "As X, I want Y, so that Z" user stories can be thought of as the output of our decisions (the ordering of the stories) about our chosen ideas (the "I want" clause of the stories) to address important problems (the "as a" and "so that" clauses).  That's not to say that a good backlog "just implements" my model - while you can produce a backlog this way, the backlog (being the decision output) hides the universe of problems and ideas we did NOT choose to pursue.
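As a purely illustrative sketch (my own toy structure, not a template anyone should adopt wholesale - the persona is hardcoded just for brevity), here's one way to picture how problems, ideas, and decisions hang together, and how a story backlog falls out of the decisions:

```python
from dataclasses import dataclass

@dataclass
class Problem:
    description: str     # e.g. "I can't always find an in-network ATM when I want cash."

@dataclass
class Idea:
    description: str     # e.g. "a mobile app that guides me to the closest ATM"
    addresses: Problem   # an idea only makes sense relative to a problem

@dataclass
class Decision:
    idea: Idea
    priority: int        # our current ordering - revisable, not "required"
    rationale: str = ""

def backlog_from(decisions: list[Decision]) -> list[str]:
    """Render the chosen ideas as 'As X, I want Y, so that Z' stories.
    Note what this throws away: every problem and idea we did NOT choose."""
    ordered = sorted(decisions, key=lambda d: d.priority)
    return [
        f"As a customer, I want {d.idea.description}, "
        f"so that '{d.idea.addresses.description}' stops being a problem."
        for d in ordered
    ]

# A tiny worked example:
atm_problem = Problem("I can't always find an in-network ATM when I want cash.")
atm_app = Idea("a mobile app that guides me to the closest in-network ATM", atm_problem)
print(backlog_from([Decision(idea=atm_app, priority=1)])[0])
```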

Our decisions represent our evaluation of  which ideas we think are the best way to address our known problems. 

DESIGNS

Once we've decided a given idea is worth implementing, we have to figure out how we're going to go about it.  These decisions include the fine details of exactly what our idea means, how we'll determine we're successful, the technical design we plan to use, what our solution will look like visually, and how it will fit in with everything else we've built or are planning to build.  Those decisions encompass our design.

Our design can (and should) be informed by what we know or have learned about the problems we're solving, what other ideas have and haven't worked, and potentially some quick lo-fi experimentation we might choose to do as part of designing our solution.

If we're building software, our design encompasses all the things we need to do to translate our idea into high-quality, well-tested code that addresses our chosen problem.   The output of our designs is working software.

By way of comparison, the "designs" piece is the ONLY piece the team really owns in "requirements-based" thinking.  All the team does is implement other people's choices.  The determination of the problems, generation of ideas, and decisions around which ideas have merit are all part of the "requirements" that come to the team from the outside.

That's not to say designs aren't important - they're the only piece that directly results in working software.  But a good design is only valuable if it represents a good idea on how to solve an important problem.  Keeping the team isolated from the problem determination, idea generation, and decision making makes it very difficult to feel invested in the business problem.  And it keeps your skilled, experienced team at arm's length from contributing to those important pieces of the process.

Why change?  

Why go through all the trouble to replace a common, reasonably well understood term?  And why replace it with four concepts that only replace it when taken as a whole? 

The reason I'm advocating changing the terminology is primarily to change our thinking about the most important process in software.  It doesn't matter how well we build the wrong thing.

The thing I like about my terminology is that each term refers to something we shouldn't expect to be static.  We can always identify a new problem.  We can always come up with new ideas.  We can always revisit decisions.  We can always redesign something.

All of these words imply things we should expect to change.  They're all inherently negotiable concepts.  None of them imply a correct answer, or an expectation that a single correct answer exists.  In short, they describe the software development world as we find it - dynamic, indeterminate, complex, and full of valuable, important problems that deserve to be solved.

Thanks to many of my colleagues at ThoughtWorks who gave feedback on a version of these thoughts presented at our North American AwayDay 2014.  




Thursday, September 5, 2013

Should you re-estimate work?

Hi, folks.  This week, I'm potentially making a "land war in Asia" level mistake by weighing in on one of the "holy wars" around Agile projects.  Specifically, should you re-estimate stories once you've estimated them the first time?

This might strike some as a crazy thing to have a discussion about.  Why wouldn't you update your estimates as you learn more?   Our estimates should be as "good" as they can possibly be.  If we've learned something new that influences "how big" a given chunk of work will be, we should update our estimates.  Shouldn't we?  

Actually, maybe we shouldn't.  There are some reasonable arguments that we SHOULDN'T strive to keep our estimates as "good" as possible.  Estimation effort is potentially costly.  Estimates will never be perfectly accurate.  And there are some reasonable questions around whether updating estimates actually makes your project predictions more accurate or not.

This week, I want to take a look at why we have estimates, what the schools of thought on re-estimation are, and how I'd recommend approaching re-estimation.

Why have estimates in the first place?

That's actually not a rhetorical question.  While "you have to estimate everything!" is deeply ingrained in most software professionals by now, it's not completely obvious that we HAVE to have estimates to  build software.

As folks in the "lean development" camp will point out, work can flow just fine independent of a "schedule."  (I recommend reading some of Mary and Tom Poppendieck's work on lean software if you want to explore this further).  Just let the team work out the most valuable stuff, have them work on it, and release to production.  If you're able to deliver work continuously, you don't necessarily need a great master schedule of "when will you be done?" to steer your team.  (I mentioned this from a different angle last week).

"Should you estimate at all?" is a topic for another time.  What I want to point out is that estimates are not per se valuable.  They are not an end in themselves.  They are a tool, used to answer a question.  "Having good estimates" is not the goal.  "Being able to understand the project" is.

Also, estimates are costly.  Every minute your team spends estimating is time spent on activities that do NOT result in any valuable, working, tested software being built.  Estimation time takes the team away from delivering value to do something else.  If the estimates provide valuable insight, they might be worth the investment, but they're not free.  All else equal, we should try to minimize the time spent on tasks that don't result in building valuable software.

On most Agile teams, a common practice is to do fairly lightweight "relative size" estimation of work.  We then use velocity (the team's speed at delivering over time) to project how fast the team can deliver work.  Mike Cohn's book Agile Estimating and Planning is a great reference on this.  He also has a video of a presentation on the topic that's a good intro if you're unfamiliar.

For the rest of this blog post, I'm going to assume you're familiar with relative size estimation and velocity planning.

Getting back to my not-quite-rhetorical question at the top of this section, I'm going to assert that we estimate on Agile projects to allow us to answer two related-but-not-identical questions:
  • How much time do we think it will take to accomplish a given large scope of work? (the MACRO question)
  • Which specific pieces of work is it reasonable for us to take on in the near future? (the MICRO question).
The macro question is the "big picture" question - when is the new website likely to be ready?  It's also effectively the "how much does it cost" question - cost can be projected as the "run rate" cost for the team ($X per week) multiplied by time.  This is the "looking forward six months, where will we be?" question (as I talked about last week, there's a question of whether we should be thinking about our work in such long chunks, but that's a different discussion).

The micro question is the "what can we do right now" question.  It often boils down to "which stories off our backlog do we think can fit in the next n-week iteration?"  If we know our velocity is 20 points, we can in theory count the points on the work already "in" the iteration, and decide whether "one more story" will fit.
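As a back-of-the-envelope sketch of both questions (all the numbers below are made up for illustration, including the run rate):

```python
import math

velocity = 20             # points the team completes per iteration (observed average)
remaining_points = 200    # relative-size estimates left on the backlog
weekly_run_rate = 30_000  # illustrative team cost in dollars per week
weeks_per_iteration = 2

# The MACRO question: how long, and roughly how much?
iterations_left = math.ceil(remaining_points / velocity)
projected_cost = iterations_left * weeks_per_iteration * weekly_run_rate
print(f"~{iterations_left} iterations left, roughly ${projected_cost:,}")

# The MICRO question: does one more story fit in this iteration?
committed_points = 18
candidate_story = 4
fits = committed_points + candidate_story <= velocity
print(f"One more {candidate_story}-point story fits: {fits}")
```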

Why would we potentially want to re-estimate?

Again, for purposes of this blog post, I will posit a project that has a relatively large scope of work that needs to be released together.  The project team did a discovery/inception process, and found a number of user stories.  They did a relative-size estimation on those stories, projected a velocity, and produced a burn-up chart to project a likely delivery date.  At the start of the project, this was the best information the team had.

As the project wears on, however, the team learns new things.  Assumptions we might have made when we put our original estimates together might no longer be true.  The team might have learned that integrating with an outside system they thought would be easy is actually a nightmare, and there are 20 stories that need to touch that system.  They might have changed their architectural approach in a way that makes some stories easier, and some stories harder.

When the team learns these things, they are faced with a question - do we update our estimates based on our new knowledge?  Or do we leave our estimates "as-is?"

The case for re-estimation

Some teams think the answer is obvious - our estimates should be the best they can be, to give us the most "accurate" picture of the project possible.  If we learned something new that impacts how long a story will take, we should re-estimate the story.

Doing frequent re-estimation will tend to give this team a smoother velocity over time (because stories are always as "right sized" as the team can make them before playing them, so we don't get "bumps" due to a story being bigger or smaller than its estimate).  This team will be better able to answer the "micro" question - they can use their velocity much more accurately as a check on "can these stories fit into a 2-week iteration?"

A team that re-estimates frequently believes their long-term macro estimates are more believable because they've "baked in" their best knowledge.  However, their long-term estimate is more likely to fluctuate iteration-to-iteration, even if velocity is steady, because the number of points in the release will fluctuate as estimates change (hopefully around a "steady" middle point, but there will be some variation).

The net belief is that re-estimating makes our micro planning better, and makes our macro estimates no worse and in some ways better.  So while re-estimating involves investing more effort in our estimation process, it's worthwhile.  

Philosophically, re-estimators will argue that the "don't re-estimate" crowd is tolerating bad information.  We know that software estimation and planning is an inherently imprecise exercise.  When we are presented with an opportunity to improve the information we can give others about the project, we should do so. 

The case against re-estimation

The opposite school of thought is that you should not change your estimates from what you thought initially, even if you've learned more.  Teams that follow this approach would likely bring up the following points:

First, estimates are ESTIMATES.  They're not intended to be perfect.  As long as they are ON AVERAGE correct (roughly same number above or below), from a macro perspective, those inaccuracies will even out over the course of the release.  Re-estimating (they argue) creates an illusion of accuracy on an inherently inaccurate exercise.  We know there will be unexpected bumps no matter what we do, so let's not worry too much about attempting to smooth them out.

Second, they will point out that, while IN THEORY teams that re-estimate will improve estimates in both directions, in practice people tend to re-estimate UP more than they re-estimate down. 

This risk of "net estimation up" usually (in my experience) comes from an asymmetric application of risk in estimates.  Let's say we have a story that was estimated as a 2-point story.  From past experience, some stories similar to this one were "real" 2's, but some were more like 4's.  It might be a 2, it might be a 4.  Let's make it a 4 to be safe.  Now consider a story estimated as an 8-point story.  We know some similar stories were "real" 8's, some were really only 4's.  Let's leave it as an 8 to be safe.  Even without ill will, the natural inclination in both cases is for uncertainty to ratchet more stories up than it ratchets down.
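Here's a toy simulation of that ratchet (the "bump up when unsure, leave alone when unsure" policy and all the numbers are mine, purely to illustrate the asymmetry - not data from any real team):

```python
import random

random.seed(42)

def reestimate(estimate: int, uncertain: bool) -> int:
    """Asymmetric policy: uncertain small stories get bumped up 'to be safe',
    uncertain large stories get left alone 'to be safe'."""
    if uncertain and estimate <= 2:
        return estimate * 2      # the 2 that "might be a 4" becomes a 4
    return estimate              # the 8 that "might be a 4" stays an 8

backlog = [random.choice([1, 2, 4, 8]) for _ in range(100)]
uncertainty = [random.random() < 0.4 for _ in backlog]   # ~40% of stories feel shaky

before = sum(backlog)
after = sum(reestimate(e, u) for e, u in zip(backlog, uncertainty))

print(f"Total before re-estimation: {before} points")
print(f"Total after re-estimation:  {after} points")
# Even with no change in the actual work, the backlog (and, over time, the
# measured velocity) ratchets upward.
```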

In practice, this means a team that re-estimates frequently will have its total estimate of the same backlog ratchet up over time.  They will also have their velocity ratchet up (since over time they'll be doing more and more stories that have had "slightly larger" estimates applied.)  The end result may be the same in terms of time taken, but the metrics will be harder to read.  The "don't re-estimate" school will argue that re-estimating actually hinders our understanding of the "macro" question.  Our "scale" for "what does a 2-point story mean?" will change over time, so simple linear assumptions around velocity and scope won't work properly.  

A generalization of this argument is a belief that re-estimation inherently causes "drift" in our estimation scale.  We began the project with a number of stories all estimated together with a consistently low level of information.  As we go, we have some stories retaining the "relatively little information" estimates and other stories with "more information" estimates.  Does a 2-point story estimated with more information really contain the same amount of work as a 2-point story estimated with less information?  The suspicion is a "mixed" backlog containing some stories that are re-estimated and some that aren't has an "apples and oranges" problem that makes it hard to apply a single "velocity" number consistently.

The "don't re-estimate" folks will agree that if we don't change our estimates, deciding what stories we take into an iteration will be a less mathematically consistent exercise - if there are three stories available that are all estimated as 2 points, the team might feel comfortable picking up one of them (which is currently believed to be a "real" 2), but not feel comfortable with a different story (which is a 2, but has a lot of "known issues" that likely make it bigger).  The "don't re-estimate" school sometimes argues this is a benefit and not a problem - it means choosing stories for an iteration has to be a conversation with the team, and not a math problem for the project manager.  If nothing else, they'd argue that the time it takes to talk about "this 2-point story, not that 2-point story" is probably less than the time we'd spend re-estimating stories (which we do on stories that might turn out not make a difference in the story selection exercise).

The net belief is that re-estimating isn't a high-value investment for the amount of micro predictability it potentially brings, and potentially actually makes our macro predictability worse. 

Philosophically, the "don't re-estimate" school believes estimates are inherently imperfect, and that trying to tinker with them might be well intentioned but actually introduces more uncertainty than it removes.

To re-estimate or not to re-estimate?

I don't think either extreme position here is completely right.  However, my sympathies are closer to the "don't re-estimate" crowd.  I think changes in scale and "ratcheting up" are a real (not a hypothetical) risk.  I also believe the "macro" question of "when is the release done?" is of considerably higher value to the project team than the "micro" question of what fits in the next iteration.

Also, in my experience, quite a lot of the "always re-estimate!" teams I encounter are teams that are either new to Agile, or teams whose management doesn't completely trust them.  In both cases, re-estimating is done not for predictability, but for "accounting" reasons.  The team re-estimates stories because they are afraid that if they pull in a story that's "bigger than it looks" from its estimate, it will cause their metrics to show a drop in "productivity" and someone will overreact to a perceived problem.  "If we don't get 20 points done this iteration, the development manager is going to yell at us for 'missing our velocity target.'"  This is solving the wrong problem - the issue is really an EXPECTATION problem about what estimates should mean.  Investing significant "no value add" time to try and make your estimates look "accurate" isn't going to solve that underlying expectation issue.

That said, I think the "never re-estimate ever" position is too extreme.  There are times when we've genuinely discovered that the way we're going to solve a problem is nothing like what we'd assumed initially and will require a radically different amount of work.  Our estimate is for what's effectively a different story than the one we're actually going to do.  Never accounting for that work to our project plan hurts us both in the micro and macro scales - if the project will genuinely take longer (or shorter!), let's say so.

I have two rules I'd recommend for "when do we re-estimate?"

First, I recommend that when we do our initial estimates of a story, we record any key estimating assumptions we're making - the assumptions the team agrees drove us to choose which "bucket" to put the story in.  e.g. "This report can be built entirely on data that's already in RDB.  No additional data sources need to be built for this story."  My rule of thumb is we should only really consider a story for re-estimation if at least one key estimation assumption is violated - if the story differs in a SIGNIFICANT way from what we thought at the time of the initial estimation.

Second, I recommend a "two-bucket" rule.  If we're using "Powers of 2" for story points, don't spend time re-estimating a 2-point story that we think MIGHT be a 4 point story.  Only talk about it if we think it's at least POTENTIALLY an 8.  Don't spend time on the 4 that might be a 2 - only talk about the 4's where we could make a case for them being a 1.  This doesn't mean we can't decide to only move the story one bucket at the end.  Rather, our filter on "does this story really need to be re-estimated?" should be "it's so far out of whack that it MIGHT be MORE THAN ONE bucket off."

The purpose of the two-bucket rule is to keep us from arguing around the edges - "Is this a large 2 or a small 4?" isn't a high-value thing to get right (and is the situation most likely to lead to "ratcheting up" for risk).  We only want to talk about the ones that we think are BADLY mis-categorized.  Those are both the ones that are likely to have a major issue, and the ones that are potentially the biggest issues for our "macro" predictability.
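If it helps, here's a tiny sketch of both rules rolled into a single filter (assuming the powers-of-2 point scale mentioned above; the function name and structure are just my illustration):

```python
# Powers-of-2 buckets for story points, as assumed above.
BUCKETS = [1, 2, 4, 8, 16]

def should_reestimate(current: int, suspected: int, assumptions_violated: bool) -> bool:
    """Rule 1: only consider re-estimating if a key estimating assumption was violated.
    Rule 2 (two-bucket rule): only spend time on it if the story might be MORE THAN
    one bucket away from its current estimate."""
    if not assumptions_violated:
        return False
    gap = abs(BUCKETS.index(suspected) - BUCKETS.index(current))
    return gap > 1

# "I know it's a 2, but I'm wondering if it's maybe a 4" -> leave it alone.
print(should_reestimate(current=2, suspected=4, assumptions_violated=True))   # False
# "That 2 looks like an 8 now that the data-source assumption is dead" -> talk about it.
print(should_reestimate(current=2, suspected=8, assumptions_violated=True))   # True
```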

Here's how I see this working.  We do our initial estimates.  As we're pulling together details for the "up soon" user stories, the BA is regularly reviewing progress with the devs, QA, and product owner.  As part of the conversation about the story, we should at least look at the estimate and assumptions.

"...So that's the story.  It's a estimated at a 2.  Anyone want to holler about the estimate?"
"Hmm...I know it's a 2, but I'm wondering if it's maybe a 4?  There's a few tasks that might be big, and that assumption about data sources is totally wrong."
"OK.  Do you think it's 'maybe a 2, maybe a 4,' or 'definitely a 4, maybe an 8'?"
"There's no way it's an 8."
"OK, then let's leave it at a 2 and move on." 
"Fair enough."

By focusing your re-estimation effort on the clear outliers, you can hopefully avoid getting mired in a lot of debate about things that don't significantly improve your predictability. 

Thanks to many colleagues and now-ex-colleagues at ThoughtWorks who've provided feedback on earlier versions of this rant...

Thursday, August 29, 2013

Is your burn-up chart holding back your thinking?

This week I'm going to take some shots at something that's actually one of my favorite Agile project metrics - the burn-up chart.

There's a lot to like about burn-up charts.  They're compact, but very information rich.  They convey several key concepts visually and in a way that's easy to understand.  They make it highly intuitive to understand the likely end date, and the uncertainty around that likely date.

So why am I picking on them?  Because (like many things, especially metrics) they can be used badly or in the wrong mindset, and in my experience often are.  Wrongly approached, burn-up charts can be reflective of some problematic thinking that can keep your team from being successful. 

I don't think these problems are the result of bad intention.  Rather, I think they stem from not thinking about how "what the burn-up chart shows" relates to "how we expect a typical Agile project to run."  My goal is to point out some of these "impedance mismatches," and present some ideas on how to make sure you and your burn-up chart are on the same page.

What's a burn-up chart, anyways?  

Simply put, a burn-up chart is a plot over time of work completed, and a plot over time of the "target" amount of work we need to  complete to achieve a goal.  By looking at the level and slope of the "completed" line, we can understand when we expect it to intercept the "goal." 

The concept of a burn-up chart can be applied to a wide variety of situations where "work" is achieved towards a "goal."  Burn-up charts can be done with many kinds of "work achieved" (story points completed?  stories completed?  defects closed?  tasks complete?), can be plotted on multiple timescales (daily?  hourly?  monthly?), and can be used for a variety of goals (stories burning up for a single iteration?  tasks burning up to complete a given story?  stories burning up for a release?)

For purposes of this article, however, I'm going to restrict myself to the most frequently used type of burn-up chart, which is a "release burn-up."  The "unit of time" for this chart is generally sprints/iterations, and the "unit of work" is generally measured in story points (or whatever the team's estimation unit of choice may be).  The "goal" line represents the total estimated work that's "in scope" for the current release.  The "burn up" line represents the number of story points that are completed each iteration.  It looks something like this:



To make it more useful, we can add a "trend line" for the work completed, which allows us to visualize where the intercept will happen:



Because our velocity isn't completely smooth, we can also extrapolate "worst case" and "best case" velocity estimates, to show us the range of uncertainty around our possible dates:



And there we have it.  It's fairly simple to draw.  And it answers a whole host of potential questions around "how is the team doing" at once.  You can see at a glance when we expect to finish.  The range of uncertainty is reasonably easy to understand.  You can clearly see whether the team's velocity is relatively steady or jumping around.

You can also answer a number of what-if questions easily - how much would we need to cut if we need to be ready for production by date X?  Draw a vertical line on the date, and you can read from the projected trend lines the best/middle/worst case estimates on how much we can get done, so we know how big the cuts are.  Will we be ready by date Y?  Is it in the range of uncertainty?  If not, probably not without making changes.  If it is, we can see whether it's an aggressive or conservative target.
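For the curious, here's a minimal sketch of the arithmetic behind those projections (my own toy numbers, assuming straight-line best/likely/worst velocity trends):

```python
# Observed points completed in each iteration so far.
completed_per_iteration = [18, 22, 19, 21]
total_scope = 200   # points currently "in scope" for the release

done = sum(completed_per_iteration)
avg = done / len(completed_per_iteration)
best = max(completed_per_iteration)
worst = min(completed_per_iteration)

def iterations_to_finish(velocity: float) -> float:
    return len(completed_per_iteration) + (total_scope - done) / velocity

for label, v in [("best case", best), ("likely", avg), ("worst case", worst)]:
    print(f"{label}: finish around iteration {iterations_to_finish(v):.1f}")

# The what-if question: if we must ship at the end of iteration 8,
# how much scope fits in the likely case, and how much would we need to cut?
deadline_iteration = 8
likely_done_by_deadline = done + avg * (deadline_iteration - len(completed_per_iteration))
print(f"Likely cut needed: {max(0, total_scope - likely_done_by_deadline):.0f} points")
```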

So, yeah, it's a pretty good chart.  It's so commonly used that most Agile tracking tools will build a burn-up chart for you.  Here are a few examples:

ThoughtWorks Mingle
Atlassian Jira/Greenhopper
Rally

So, enough about what's good about burn-ups.  This post is supposed to be about what can go wrong with them, so let's get into that.

Burn-ups can block discovery

Let's say we're in the early stages of a project.  We did a quick discovery exercise, and found 200 points worth of user stories.  We did a velocity projection and estimated the team could do about 20 points of work per iteration.  Our first two iterations roughly bore that prediction out.  Our burn-up chart looks something like this:



Great. It looks like we'll be done around iteration 10.

Fast forward to iteration 9.  We should be in the home stretch.  But we're not - if we look at the actual backlog, we see that there are still 3 more iterations of work!  What happened?  Let's look at the burn-up chart again:



Ah.  The scope moved over time, as new stories were added to the project.  This means we have more work, and so we need more time to finish.

The two words most project managers (especially "traditional" project managers) are probably thinking about right now are "scope creep."  Clearly, the team should have "held the line" on new scope - if we wanted to add something new, we should have taken something out.  This project was "mismanaged."

And, in my opinion, if that's what they think, they're probably wrong.  What I described in the last paragraph is completely non-Agile thinking.

The notion that all the work "ought to be known" at the start of a project comes from "traditional" waterfall thinking, where we have an extensive "planning" phase up front that's supposed to uncover every requirement and task in the project before we get started.  Deviations from that plan are expected to be infrequent, and must be managed via a strong "change control" process to keep from materially impacting the schedule unnecessarily.

In Agile projects, we do away with the long, drawn-out planning phase that's expected to find everything we're going to need to do.  And in doing so, we need to do away with the assumption that all the work we need to do to accomplish the goal is known before we start.

While we might do a discovery phase up front, we should EXPECT that phase is imperfect and will miss a few things that need to be done.  Because we're collaborating with customers, we should EXPECT some amount of re-work or re-imagining of features will happen over the course of the project.  Those changes aren't "scope creep" - they're the norm of an Agile project.

Expecting that the team should "hold the line" on the scope we knew about at the start of the project, and that any "discovered" stories (either from customer collaboration or new discovery) means we need to remove something else to make room for it, is following a plan over responding to change.

So, what's the solution?

The simplest thing you can do to address this issue is start with the assumption that your scope line will be UPWARD SLOPING, not flat.  We don't know exactly which stories we'll uncover, but we KNOW we'll find some.  Assuming our scope line will be flat unless "something unexpected happens" sets the wrong expectations, both for the team and the customer.

The exact slope of the line will vary depending on the project, and factors like how unknown the problem space is, how expert the team is in the domain, how many different stakeholders we're trying to satisfy, etc.  I find a good rule of thumb to be that a "typical" team will spend about 20% of its time working on "newly discovered stuff," leaving 80% for the "known at the start" work.  So my sample team with a velocity of 20 should expect to discover about 4 points of stories per iteration.
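Here's what that rule of thumb does to the projection arithmetic (the numbers are illustrative, matching the examples above):

```python
velocity = 20            # points completed per iteration
initial_scope = 200      # points known at the start
discovery_rate = 0.20    # rule of thumb: ~20% of effort goes to newly discovered work
discovered_per_iteration = velocity * discovery_rate   # 4 points per iteration

# Flat scope line: finish when completed work crosses the initial scope.
flat_finish = initial_scope / velocity
# Upward-sloping scope line: scope grows while we burn it down,
# so only the "net" velocity closes the gap.
sloped_finish = initial_scope / (velocity - discovered_per_iteration)

print(f"Assuming no discovery: done around iteration {flat_finish:.0f}")
print(f"Budgeting for discovery: done around iteration {sloped_finish:.1f}")
```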

Some people may be howling that allowing this discovery factor is an open invitation to scope creep.  I don't think it is.  Just assuming the line to be upward sloping isn't the same as saying we'll stop thinking about newly discovered stories and just accept everything into the project.  Regardless of how we build the metrics, we need to have regular conversations with our product owner and only accept changes we genuinely want to deliver.

Also, assuming a "discovery rate" can make this a somewhat scientific process.  Assigning a "budget" for discovery helps us have productive conversations early in the project.  We tell our product owner their "budget" for story discovery is 4 points per iteration.  We then track it - in iteration 1, we added 6 points.  In iteration 2, we added 3 points; in iteration 3, we added 2 points.  Great - we're on track.  Or, if we're going into iteration 4 and we've been adding 8 points an iteration, not 4, we have to have a conversation.  Was our "discovery" budget too low, and we need to change it (and so change our expected end date)?  Or is it that we had a major correction early in the project, and we don't expect it to recur?  The point here is that by tracking our discovery against a planned "budget," we can validate our assumptions.  We can even do it right in the burn-up chart:



In this example, the green line is how we expect scope to grow over time with discovery, so we should expect to be finished in Iteration 12 (and not Iteration 10, where we'd have projected to finish if we "held the line").  We can see from the trend in the Current Scope line that we're upward sloping, but roughly sticking to our discovery "budget."

One problem with this approach is that (to my knowledge) there aren't any automated tools that make it easy to account for an "expected discovery rate" - the burn-up targets tend to be horizontal lines (if anyone knows of a tool that DOES make this easy, would love to hear about it in the comments).  So if you want to track this, you may need to "roll your own" chart to do so. 

Burn-ups can cause complacence

Having talked about burn-up charts leading to thinking that potentially caused us to exclude things we ought to do, I want to talk about the opposite problem - burn-ups causing us to do work we don't need to do.

Again, by design, the initial "discovery" for Agile projects trades complete accuracy for speed - we're willing to tolerate some false positives/negatives in exchange for being able to get started writing useful code faster.   This means we'll miss some things, but it also means we'll potentially pick up some stuff we think is high value at the time that we'll later learn isn't actually necessary.

Most teams are pretty good at reviewing stories we recently discovered and vetting them for "is this really part of the project?"  But teams can be less careful about reviewing the things that are already "on the plan" and "in scope" for the project.

The burn-up chart can contribute to this lack of introspection.  If we had 100 points of work at the start of the project, and we're projecting to finish all 100 points within the expected timeframe people are asking for, then we'll just keep churning through those 100 points.  They're part of the "baseline" (which on a burn-up chart is literally a line).  The fact that we have a scope line with a fixed value as the "target" mentally causes us to treat that amount of work differently from "new" work, because we've already baked it into our metrics.


A good team should (and will) regularly review and "prune" the backlog - just as we expect to discover new work over time, we also expect that we will discover existing work isn't necessary.  We should remove those items, even if our burn-up chart "looks good" for getting the work done.

Some of you may be noticing that in the previous section, I argued the scope should be expected to grow over time as we learn things, and here I'm arguing some of the initial scope should be expected to be removed over time.  Can't we just assume these effects "cancel out" and go back to the "flat" projected scope line?

While the two effects do partially cancel each other out, the magnitudes aren't necessarily the same.  In my experience, the "20% growth" assumption is what I've seen as the "net" effect - we discover new work faster than we remove work.

Again, the goal is to be deliberate about scope changes.  The team should be diligent on removing unneeded items, just as they should be diligent about only accepting necessary change.  There's nothing "special" about the work that was "in the baseline" - unnecessary work is unnecessary, whether it's "already in the metrics" or not. 

Burn-ups can encourage infrequent delivery

There are times when the right way to do a project is to spend several months building a cohesive product, and then release it together.  Sometimes you're building a brand new application that doesn't deliver value until it can support certain likely usage scenarios end-to-end.  Sometimes we're doing a major restructuring of a module that needs to change all at once.  Sometimes external circumstances around how we deliver necessitate a single large release (e.g. something that needs to deploy in parallel with a third party's application that we integrate with).

But these situations do NOT apply to all projects.  On many projects (probably MOST projects) we don't need to "build up" 6-12 months of work to have something of value to release to the marketplace.  And if we don't need to wait to do a big release, we probably shouldn't - one of the key goals of Agile is to deliver value to production rapidly, and begin to capture that value quickly.

Having valuable code that could be making us money "waiting" on a release is throwing money away.  We could be getting value from it, but we're not, because it's "part of the release" that happens six months from now. 

The continuous delivery movement has a lot to say on techniques/processes around the "how" of getting code to production quickly.  But I'm talking about metrics today, so I won't go into those.

The reason I'm seeing a potential problem with burn-up charts is that, once you're familiar with burn-up charts and grow to like them, you may start structuring your projects in ways that produce a good burn-up chart, rather than alternate structures that don't work well with burn-ups but may deliver value more quickly.

The key insight you can get from a burn-up chart is a projection of when a given scope of work is likely to be complete in the (relatively distant) future.  If your burn-up chart only tracks points once per iteration, it's close to worthless for giving us meaningful insight on how long it will take to complete a scope of work that's only an iteration-or-two worth of effort.

You can address some of this by plotting points more frequently (you might plot your "work complete" in days rather than iterations).  But in some cases, that's fitting the data to make the chart work, as opposed to choosing a metric because it gives us useful insight we can make actionable decisions on.  The real question is whether thinking of our project as relatively slowly "burning up" to a large goal is the right way to think about our project at all.

I see this thinking trap frequently on "new to Agile" teams.  They are coming from a world where a nine month project is considered "short" and the executive team is used to looking at a Gantt chart.  When they switch to Agile, they have slightly shorter projects (say, four months), but they don't have a Gantt chart anymore.  But we can show the execs a burn-up chart, and explain it to them, and we can make them relatively satisfied that they still have good insight into how the project is going.  But then the burn-up becomes a self-fulfilling obligation - we need to have burn-up charts because management expects burn-up charts.  And, subtly, we become mired in our thinking that the "right way" to think about projects is by burning up over time to a goal.

So, what do we do about this?  The first thing is to consider whether, like any other metric, a burn-up chart is actually appropriate to your project.  What is it going to tell you?  Does it fit the way we're structuring the project to deliver value?  Let's be OK having the conversation that a burn-up chart isn't appropriate to our project.

One of the most productive teams I was ever a part of didn't bother with burn-up charts.  We'd simply agree "once these 4 stories are done, we can go to production with them," and make that happen - we'd deploy every week or two.

Burn-ups can cause missed expectations

Burn-up charts are great at predicting the "development complete" date for a given project.  Ideally, that's also the date we could "go live" with the new project - just click the button at the end of the last iteration and go.

In the real world, this is rarely the case.  First (as discussed in the last section), if you're running the kind of project that warrants a significant burn-up chart, you probably aren't sufficiently comfortable with your deployment process to just "push the button and go."

There may be a genuine need for other activities to take place between "we're done writing code" and "we're able to go to production."  Maybe your application requires regulatory review and approval.  Maybe your process is to have a beta testing phase for final feedback before you flip the switch in production.  Maybe you need to coordinate the launch of the product with a marketing campaign.  Maybe you need to bring a swarm of tech writers in to put together the final "user manual" after the app is done.  Maybe the move to production requires a major outage to do a data migration into the new system.

Some of these activities might be able to be done "as you go" during your development iterations, but probably not all of them.  Which means there are likely activities that take place AFTER your development is complete, but BEFORE you're in production (and delivering actual business value).

By design, a burn-up chart doesn't show these things.  What it shows is how we're progressing towards the "goal," where the goal is generally "we have completed all the stories."

The way we get an expectation miss is if we're reporting our burn-up charts out to a wider audience, and they're taking back the message of "the development and testing for features will be complete at the end of iteration 9 on Aug 1st," and conflating that with "we will be in production Aug 1st."  If those two dates aren't actually the same, you need to be careful about the messaging around your burn-up.  Which probably means "don't communicate ONLY the burn-up chart."

Some teams "account for" this by putting in placeholder stories and assigning them to "iterations" near the end of the project, so this non-development work will show in the burn-up.  I think this is often problematic, and is twisting the process to fit the chart.  Assigning "points" to "do the marketing for the release" is implicitly claiming the marketing activities are measurable on the same scale as the development work, and will have the same velocity.  But if the development team moves faster than we expect, the marketing doesn't magically take less time, and we'd be in trouble if our chart assumed we would. 

While I'm not a fan of them for many purposes around tracking software delivery, the much maligned Gantt chart can actually be your friend here - tracking dependencies and "cascading" dates appropriately is exactly what a Gantt chart is good at.  If you're putting together a "showcase" for your metrics, a slide with a name like "From Development Complete to In Production" showing the dependencies and timeline starting with your current best "development complete" date and ending with the date you're live in production can help everyone understand what those activities are, how long they'll take, and when the project is actually going to be done.
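
If a full Gantt tool feels like overkill, even a back-of-the-envelope calculation of the cascading dates makes the point.  Here's a rough sketch in Python - the activities, durations, and the Aug 1st development-complete date are placeholders borrowed from the earlier example, not a real plan, and it assumes each activity starts only when the previous one finishes (the purely sequential case a real Gantt chart would refine):

from datetime import date, timedelta

# Placeholder dates and durations - a real plan comes from the people
# who own each of these activities.
dev_complete = date(2014, 8, 1)

post_dev_activities = [
    ("Regulatory review", 15),            # duration in calendar days
    ("Beta test window", 14),
    ("Data migration / cutover outage", 3),
]

print(f"Development complete: {dev_complete}")
start = dev_complete
for name, days in post_dev_activities:
    end = start + timedelta(days=days)
    print(f"{name}: {start} -> {end}")
    start = end                           # next activity starts when this one ends

print(f"Projected in production: {start}")

Even a crude timeline like this makes it hard for anyone to walk away thinking "development complete" and "in production" are the same date.
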

Wrapping up

As I said at the beginning, I generally like burn-up charts.  They're incredibly useful tools, and for certain kinds of projects are an ideal way to communicate progress succinctly.

That said, I hope I've at least raised your awareness that (like all tools) burn-ups aren't perfect.  If we think about them the wrong way we can drive undesirable behavior, set incorrect expectations, or miss opportunities.  

Thursday, August 22, 2013

As a User, I want to be represented in User Stories

I'm a big fan of user stories for tracking upcoming work.  They keep us focused on the problem, not the solution.  They're good at reminding us why each chunk of functionality is valuable.  Done well, they're easy to prioritize and plan with.

But one of the big issues I have with them is that it's really easy to lose the "user" in the "user story."  I've seen way too many projects with several dozen stories that all begin "As a user, I want...."  The placeholder "user," used over and over again, is usually indicative that we've stopped thinking about the actual users at all, and we're just putting it in for the sake of form.

Which is a shame, because thinking about who we're delivering value for is really important for us to write good stories.  So I'm going to talk about a few ways to break out of the rut of constant "as a user..." stories.

Users are not actors

One mistake I see frequently, especially with people transitioning from a use case heavy background to user stories, is to treat the "As a..." clause of a user story as a place to identify an actor - the individual who is PERFORMING an interaction with a system.  But user stories are much more about stakeholders - the individual who gets VALUE from a feature.

In many cases, these are the same, but they need not be.  Even though a "user" of the system might be the one interacting with it, something is happening that delivers value for someone else.  That's the person who "wants" the feature we're providing.  (By the way, I think the term "user story" is a slight misnomer for this reason). 

For example, let's say we're building an e-commerce portal, and we have a prospective feature to validate that a credit card number is a valid number (e.g. by running a checksum) on the client before we submit the credit card transaction to the credit card processor for approval.  This is a useful thing to do because many credit card processors charge by the request.  If we can figure out on our own that the card number is bad BEFORE we submit the request, we save money - we don't need to incur the expense of a transaction to tell us the credit card didn't go through.
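
(For the curious: the standard checksum for card numbers is the Luhn algorithm, and a minimal sketch of it in Python looks something like the following.  It only catches mistyped numbers - it says nothing about whether the card is real, active, or funded.)

def luhn_valid(card_number: str) -> bool:
    """Return True if the digits pass the Luhn checksum."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if len(digits) < 13:              # too short to be a plausible card number
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:                # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("4111 1111 1111 1111"))   # True - a well-known test number
print(luhn_valid("4111 1111 1111 1112"))   # False - one digit miskeyed
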

I could write a story for this like "As a user, I want to know that I entered a valid credit card number before I'm allowed to submit my order, so that I can fix the card number if it's wrong."

The problem here is that, while the "user" (a customer in this case) is indeed the one who entered the number, they probably couldn't care less about having an intermediate check between "I clicked submit" and "the transaction was sent to the processor."  Heck, they might not even notice it.  From the "As a user" perspective, this story is close to valueless.

However, the feature is NOT valueless.  Given that miskeyed credit card numbers are a major cause of credit card transaction failures, doing an internal check here can cut out a lot of the "waste" of paying for failed credit card transactions.  Let's re-write the story from the perspective of the person who actually gets value from this feature.

"As the accounts payable manager, I want credit card numbers pre-checked for validity before they're sent to the credit card processor, so that I don't pay transaction fees for transactions I know will fail."

Suddenly, we've transformed the same feature from a "why would they care?" into a much more compelling story that clearly relates to business value.  And it's a lot easier to prioritize this against other stories.  If we only have 10 failed credit card transactions per day, this might still not be high value.  If we have 10,000 failed transactions a day, this might pay for itself in a week.
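
(To put rough, completely made-up numbers on that: if the processor charged, say, 30 cents per declined authorization, 10 bad submissions a day is about $3 a day - hardly worth a story - while 10,000 a day is roughly $3,000 a day, or over $20,000 a week.  Same feature, very different priority.)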

 "Accounts payable manager" might not be a "role" in the system.  Heck, the accounts payable team might never use our system directly to do their jobs.  And they're certainly not the ones who are actually entering the credit card numbers. But they're the ones who care about this feature - they're the ones paying the cost of not having it, and the ones who benefit from it being in place.  Their perspective is the one that's important when evaluating this story.

A good rule of thumb when putting together the "As a..." clause of a user story is to think "If we took this feature out of scope, who would be first in line to pay me to put it back?"

Users are not system roles

Another way we can lose track of who we're delivering value for is to simply think of individuals in terms of what their "system role" is.  We could think about every feature targeted by someone in a "user" level role as "As a user...", and have the features that are only for administrators written as "as an administrator."

This makes some sense from a technical perspective - we're delivering a feature that's intended for everyone whose role in the system is "user."  And since we're not intending to restrict the feature to only a subset of users, everyone who's a "user" gets the benefit.

The problem with this approach is that it treats everyone in a user role as being identical.  That's rarely the case.  For example, think of an online auction site.  Everyone who has an account (other than the admins) is a "user" - none are different from a perspective of what they have the ability to do.  Anyone can search items, bid on an item, sell items.

But think of the richness that's obfuscated if we just think of everyone as "users."  There are lots of subcategories of users that have different perspectives, and different needs.  You have power sellers - factory outlets selling many copies of the same item in bulk.  You have high-dollar-item sellers, who need lush descriptions and photos to make the item look good.  You have online yard-sale sellers, trying to get rid of a few things as cheaply as possible to clean out their basement.  You have targeted buyers who want one and only one thing, and want to find it fast.  You have browsers who want to compare multiple items of the same general category (such as laptops from 10 different sellers).  You even have your day traders, who buy and sell the same items frequently to try and arbitrage short-term market fluctuations.

Each of these "users" might have access to the same tools, but thinking about which one wants a feature can help bring clarity.

"As a user, I want to create a template of an item that I want to auction, so that I can later list that item quickly" vs. "As a power seller, I want to create a template listing of an item I have in stock, so that I can create multiple individual auctions quickly."

With the first story, we might wonder why the templating is useful - couldn't we just create the listing directly?  The rationale is much more clear when we think about the context of a power seller - the template helps because they can create the "same" auction multiple times.  And thinking through that lens clarifies how this needs to work - I clearly need to be able to re-use the template.  I may need to be able to change specifics with each new listing (like a serial number).  Also, we can probably assess the priority better - how many of our users actually fit the "power seller" profile?  Is it worth building this feature primarily for their use?

Users are people

Your users are not faceless anonymous masses.  They are (for the most part) human beings.  They are people trying to live real lives.  In many cases, their interaction with whatever you're building is a small part of their busy life.  Often, the thing they're trying to use your system to accomplish is a means to an end (I'm buying this book online so I'll have something to read when I go on my beach vacation).

The problem here, of course, is that every user is different.  Trying to write user stories that really capture each user's individual goals, nuance, and perspective is almost impossibly hard for any system that has more than a handful of users.

One technique to try and capture some of this "real user" feel without being able to model real users is to use personas.  For those unfamiliar, a "persona" is a somewhat fictionalized biographical sketch of a "typical" user, with more realistic goals.  While it's possible to go overboard with them, using personas can bring valuable insight into our implementation of a user story.

Many projects create user personas as part of an inception/exploration phase at the start of the project.  It can be very powerful to map stories back to personas.  If there's no persona that clearly would use a feature, that's telling us something.  Are we missing a segment of our expected userbase in our personas?  Or are we devising a feature that no one we think will use the product actually wants? 

Let's enhance our user story from the last section by tying it to a persona.  Back when we started the project, we anticipated we'd have power sellers, so we built a persona for them.

Meet Linda.  Linda runs her own distressed asset disposal business out of her garage.  She has an IT inventory management background, and has contacts with a number of local and regional technology distributors.  Linda spends all morning calling around looking for large cancelled orders, unopened returns, and other unwanted tech in bulk.  She buys pallet quantities for cheap prices, and sells them off online.  The margins aren't great, and she has limited space, so her business is all about speed and volume.  She opens dozens of auctions a day, and the less time she needs to spend setting up and maintaining her auctions, the more time she can spend finding great deals.

Now let's try our user story again, this time with the persona as the user.  "As Linda, I want to create a template listing of an item I have in stock, so that I can create multiple individual auctions quickly."

We can almost picture this in our mind.  We can see Linda sitting in her dusty garage with a laptop, next to a shrink-wrapped pile of LCD monitors, trying to set up her auction template.  Hey - I bet she wants to be able to pull down the manufacturer's stock description into her auction template, rather than type out a description herself.  Hey - I bet she wants some way to track how many of these she has.  Hey - I bet when she creates auctions, she's going to want to auto-relist items that don't sell.  Hey - you know what would be cool?  If she could just scan the barcode on the manufacturer's box and pull all the specs up automatically....

Some of these thoughts are probably different user stories - we don't want to suddenly drag the mechanism to create and auto-relist auctions into a story to create a template.  Even the barcode scanning idea is probably its own feature that we can add after we build the basic template concept.  But keeping Linda and her problems in mind as we build each feature will probably guide dozens of small decisions we make over the course of each story.

It's possible to go overboard with personas.  Sometimes we build so many that we lose track - who was "John" again?  It's possible to build personas that aren't accurately reflective of the real user community, and so could steer us in the wrong direction (what if 95% of our power sellers aren't like Linda - they're like Charles, who works for the distributor and is more interested in recovering the most value than in turning items quickly?)  That said, writing stories to a reasonable set of identifiable personas is a powerful way to keep your team focused on solving real people's real problems.

Wrapping up

Software is built to solve problems for people.  Keeping the team focused on who they're building software for, what they want, and why they want it is the key insight behind user stories as a technique, and why they're so powerful.

I don't imagine changing a single word in a user story will take your stories from good to great in a single stroke.  The "As an X..." clause isn't so important in itself.  Masses of stories reading "As a user..." are a symptom of a problem, not the problem itself.

What's important is the thinking behind our stories.  Are we really thinking about who we're delivering value for?  Are we clearly thinking about our users as something other than a uniform monolith?  Can we see their faces and breathe their problems?  Because that's what makes us great builders of software. 

Thursday, June 13, 2013

Measure your Non-Functional Requirements

Like most people who work in the software industry, I often hate working with non-functional requirements.   They're the ugly stepchild of software requirements - forgotten as soon as they're created, difficult to manage, and generally never rearing their heads until the end of the project, when they become a stick to beat the development team with and a cause of late-breaking deployment delays.

I'd like to change that.  I believe it's possible for us to work effectively with non-functional requirements, make them visible, and do more to make sure they're met than simply crossing our fingers. We just need to plan for them. 

For purposes of this post, I'm going to define Non-Functional Requirements (now sometimes called Cross-Functional Requirements, often abbreviated NFR's) as follows:  NFR's are the set of "things your team is required to deliver" AND the set of "conditions your team must make true" over and above the delivery of working, tested code that implements user-visible features.

Examples of non-functional requirements are documentation requirements ("We need to produce a user guide"), performance requirements ("All page load times need to be 2 seconds or less under production load"), and a variety of "ilities" like supportability, accessibility, deployability, etc. ("All major events need to be logged to a common file," "Every page and message needs to be I18n compliant," etc.)

On "traditional" waterfall projects, the typical handling of NFR's is that in the planning phase, we make a list of all of the NFR's.  Then we record them on a spreadsheet.  Then we hand it to the architect to "keep in mind" when planning out the project.  Then we stop thinking about them until the late-game pre-deployment testing, when we cross our fingers and see if they're actually met.  Then, when they aren't (and trust me, at least some of them aren't), we argue about whether we live with it or slip the date.

On "agile" projects, the process is largely the same up until we record them on the spreadsheet.  Then we....well...we're not sure.  In theory, some of these become "coding standards" that we make people aware of (but often don't enforce).  Maybe we'll remember to cross our fingers and test before we deploy.  Maybe we'll wait until someone complains after we go live to say "oh, yeah...we should fix that."  Regrettably, since they fit poorly into our "focused on features" development process, they're easily ignored. 

The real problem with non-functional requirements is that they tend to play by different rules than "everything else" we're deciding to build.  I see two key differences. 

First, non-functional requirements have a "cost visibility" problem.  It's hard to estimate "how much" a requirement like quick page load times will cost the project, because it (generally) increases with the number of pages we build.  It's hard for us to give customers visibility into tradeoffs like "how many other features will I need to cut in order to include ADA compliance?"

Second, most non-functional requirements have a "focus" problem.  On an Agile team, we're constantly looking at some kind of board with all the "in flight" functional features for the current iteration/sprint/cycle.  But because (most) non-functional requirements span all features, they're never visible on the board - they're "universal acceptance criteria" for all stories/features.  And like everything else that's "boilerplate" to each item, we stop thinking about them. 

So, what do we do about non-functional requirements?  I've had some success with a two-part approach to handling them.

First, we need to separate non-functionals into "playable requirements" and "standards."  Some requirements are genuinely "do once and then you're done" items that look and act just like User Stories/Features.  An example would be "we need to build a pre-production staging environment with data copied regularly from production."  That's a thing we could choose to build in, say, Iteration 3, and once it's done we don't have to build it again. I tend to treat these like "standard" requirements - estimate them, assign them an iteration, and run through the normal process.

Then, we have the requirements left that are NOT "one and done."  They're the ones that span all stories.  For these requirements, I focus on having a METRIC.  I have a conversation with the team (INCLUDING the product owner) where I ask them "how do we want to ensure, over the course of the project, that this requirement is met?"

For example, let's say our requirement is that "all page load times will be less than 2 seconds under production load."  Great - how do we ensure we meet that requirement?  There are a number of ways we could in theory ensure that.  At one end of the spectrum, we could say "OK, we'll build a dedicated, prod-like load test environment, on a clone of prod hardware, along with a dedicated pool of load generation machines.  We'll also build a set of load test scripts that test every major function of the system.  Every check-in that passes CI gets auto-deployed to this environment and load tested." 

That's probably the most robust possible testing we could have to meet the load requirement.  Unfortunately, it's also expensive and time-consuming to build.  Is this requirement worth it?  If not, what might we do that's less than this?  Maybe we'll just build some JMeter scripts that run in CI to collect directional performance metrics we'll watch over time - it won't tell us DEFINITIVELY we'll perform under load, but it will tell us if we're getting worse/slipping.  Riskier, but cheaper.  Maybe we'll periodically have the testing team step through some key manual scenarios with a stopwatch and measure times.  Or maybe we'll choose NOT to invest in load testing at all - we'll accept the risk that we might not meet this requirement, because the load numbers are expected to be small and the technology we've chosen for this project is the same technology we're using elsewhere at far greater volumes, so we think the risk is very small.
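
To make the "directional metrics" option concrete, here's a minimal sketch in Python - the URLs and the 2-second threshold are placeholders from the example requirement, and hitting a handful of pages single-threaded is obviously not a load test, just a tripwire in CI that flags obvious regressions:

import sys
import requests

PAGES = [
    "https://staging.example.com/",
    "https://staging.example.com/search",
    "https://staging.example.com/checkout",
]
THRESHOLD_SECONDS = 2.0

slow_or_broken = []
for url in PAGES:
    response = requests.get(url, timeout=10)
    seconds = response.elapsed.total_seconds()
    print(f"{url}: {seconds:.2f}s (HTTP {response.status_code})")
    if seconds > THRESHOLD_SECONDS or response.status_code != 200:
        slow_or_broken.append(url)

if slow_or_broken:
    print("Over threshold or failing:", slow_or_broken)
    sys.exit(1)    # fail the CI step so the team sees the regression
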

The point is we have these discussions as a team, and INCLUDE the product owner.  This gets us over the "cost visibility" problem - if you want the team to have extremely high confidence they meet this requirement, what will that cost us?  Are we willing to invest that much?  Or are we willing to take on more risk, for a cheaper cost?

Once we've decided on how we'll measure our compliance, we need to make it "part of the plan."  Let's say our plan for performance testing was for a tester to manually go through the application once an iteration with a stopwatch and measure response times in the QA environment.  Great!  We put a card on our wall for "Iteration 4 performance test," and (whenever in I4 we feel it's appropriate) have a person do that test and publish the results.  If they look good, we've got continued reassurance that we're in compliance.  By publishing them, we remind the team it's a "focus", so we remember we need to be thinking about that for every story.  If we find an issue, we add a card to the next iteration to investigate the performance issue to get us back on track.  

You can have similar conversations around things like a user guide.  How will the team produce this document?  One option is to say we won't do anything during development.  Instead, we'll engage a tech writer at the end of the project to look at the app and write the guide.  That would work, but it means we'll have a gap at the end of the project between "code is done" and "we're in production." 

Another approach would be to build this up over time - with every story, we agree that someone (maybe the analyst, maybe the developer, maybe the tester) will update the user guide to cover whatever new thing we built.  Thus, we build up a guide over time. 

This is great in theory, but again, how will we ensure the team's doing this?  Are we going to periodically review the user guide?  Is the user guide going to be presented/reviewed in our regular demo meetings?  Are we going to add this to the testers' checklist for "what do I need to validate before I sign off on a feature?"  The goal is to make our decision explicit, make sure everyone understands the level of investment we expect, and how we're going to demonstrate our compliance.

Having explicit conversations around our investment in non-functional requirements, and setting explicit metrics, can turn "non-functional requirements" from vague semi-forgotten dictums into explicit common-sense cost/benefit tradeoffs made between the product owner and the development team.  We can take them from items we don't think about until the end of the project to something that's constantly visible.  And we can take them from a cause of massive heartburn and late schedule slips into a source of pride and confidence for the team.