When I’ve Skipped the Estimates…

While the debate carries on over whether one must have estimates or not, I thought I’d provide a viewpoint on when I found them no longer needed.

However, before we go there, let’s start off with a bit of a story about a time when estimates were not useful but were required, so I took the *EASIEST* path out.

Let’s go back to 2008; I had just been hired as a software development Branch Chief at USDA and was asked to prepare the budget for the next fiscal year. Of course, the first thing I did was poll around on what upcoming work there was. No one knew, except that there would be about the same amount of maintenance as last year. That part was easy: apply an inflation factor to what we had this year, add a management reserve, and we’re done.

Now onto the harder problem: the unknown new projects looming. So I investigated how these normally got funded; any estimate done is simply reported up the chain (as requested Development monies), but the funds are actually provided by the programs that need the work done for them. The estimates are used as a projection for the branch and nothing more. Any work actually undertaken goes through its own process of requesting funds, and only then is real money provided.

So I asked: how many projects did we do the year prior, and how much did they cost? And the year before that? And the year before that? 4 projects, 4 projects, and 6 projects were the answers. (I won’t go into the money numbers, but I’ll note this branch did not develop super huge applications; it built small to medium sized applications with some complexity: a GIS app, an analytical app, several tracking-type apps, a loan package development application. That may give you the picture.) I didn’t need to know the number of apps for the reporting, but I used that number to calculate the average cost per app we developed, projected into 2009 dollars; adding a standard deviation gave me some more certainty, then a 15% management reserve went on top. Once I had those numbers, the process was literally a half hour to run through the math a couple of times to ensure I was on target.
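For those who like to see the arithmetic, here is a minimal sketch of that budget math; the project costs, counts, and inflation factor below are made-up placeholders, not the actual figures.

```python
# Minimal sketch of the budget projection described above.
# All numbers are fictional placeholders, not the real USDA figures.
from statistics import mean, stdev

# Cost of each project completed over the prior three years (fictional).
historical_project_costs = [310_000, 275_000, 340_000, 290_000, 325_000, 300_000]

inflation_factor = 1.03                  # assumed projection into next-year dollars
projects_per_year = mean([4, 4, 6])      # average project count from the three prior years

avg_cost = mean(historical_project_costs) * inflation_factor
one_sigma = stdev(historical_project_costs) * inflation_factor

# Average plus one standard deviation for added certainty,
# then a 15% management reserve on top.
budget = projects_per_year * (avg_cost + one_sigma) * 1.15
print(f"Projected development budget: ${budget:,.0f}")
```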

My project managers could not believe I was going to use that number; they always went around to each potential customer and asked them to conjecture on applications or upgrades they wanted. Most of those never got funded, and something else came up and got funded instead, so why spend time estimating what never happened?

This was a very low precision estimate, but it got me to a reasonable and justifiable target number. (If the system had allowed for ranges, I would have provided those, but alas it didn’t.)

I’m guessing you are wondering how ‘correct’ I was with that… We had 5 projects, and the cost came in fairly close to the average. The next year we did the same thing, but it was off: much higher, as the Recovery Act kicked into high gear. But as I pointed out before, it didn’t matter.

OK, that was budgets built using the least painful method of estimates possible. (Sometime in the future, ping me on how I executed on real work within the branch… The spoiler hint is that I limited the WIP of projects going on at any one time so I could keep my team at a close to constant size; even with the increase in work, contractor headcount only grew by about 2 people.)

So now onto some maintenance estimation I did away with…

When I took over running the maintenance team at the Office of Pesticide Programs, every Software Change Request (SCR) that came in went into a queue where it was examined in a meeting and the contractor was told to go estimate it. When the contractor came back with their estimate, usually a week later, the work was approved. They estimated in time, and once they figured out who was going to do the work, they could apply that person’s labor rate and quote the money. This singular meeting was at least an hour long every week and consisted of telling the contractor to go estimate the amount of work to do and of reporting out on estimates already made. This never went anywhere; no one did anything with these estimates. We never said no to the SCRs for the legacy systems we maintained, mostly because no one worked with the business well enough to know whether the work should happen or not. On top of that, there were 20-some legacy apps with at least that many stakeholders to try to satisfy. Perhaps at some point this estimation process had been used to say no, but with the mostly low complexity work coming in, there was no drive to say no.

We set budgets based on annual contractor headcount. Perhaps at some point this estimation exercise was used for this, but it wasn’t any longer.

So I did a couple of things. I killed the meeting. I put the onus on the government application maintenance staff to work with the business to prioritize the work from their viewpoint. I set up a rule set that took these priorities, along with a quick technical assessment (which set severity) and the date received, to establish a prioritization across all apps (a rough sketch of such a rule set follows below). I got the stakeholders to agree to this scheme so I didn’t have to fight over each app. We still never said no; we just continuously reprioritized the work that hadn’t been started.
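As a rough illustration of what such a rule set could look like (the fields, ordering, and example data here are my own assumptions, since the actual scheme isn’t spelled out above):

```python
# Illustrative only: a simple cross-application prioritization rule that sorts by
# business priority, then technical severity, then age of the request.
from dataclasses import dataclass
from datetime import date

@dataclass
class ChangeRequest:
    app: str
    business_priority: int   # 1 (highest) .. 5 (lowest), set by the business
    severity: int            # 1 (critical) .. 4 (minor), from the quick technical assessment
    received: date

def rank_key(scr: ChangeRequest):
    age_days = (date.today() - scr.received).days
    # Lower tuple sorts first; older requests float up within a tie.
    return (scr.business_priority, scr.severity, -age_days)

backlog = [
    ChangeRequest("App A", business_priority=2, severity=3, received=date(2011, 2, 10)),
    ChangeRequest("App B", business_priority=1, severity=2, received=date(2011, 4, 15)),
    ChangeRequest("App C", business_priority=2, severity=3, received=date(2011, 3, 1)),
]

for scr in sorted(backlog, key=rank_key):
    print(scr.app, scr.business_priority, scr.severity, scr.received)
```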

And I eliminated the estimates. I decided on contractor staffing based on how much work I could get through; I concentrated on further process improvements before I thought of increasing headcount. (You can read about the Kanban system that was set up on GovLoop if you so desire.)

To come full circle to where I once again found an estimate helpful in this environment: a potential regulatory change was going to require a rather large piece of work on our legacy PowerBuilder app. I was asked how long it would take; upper management was interested in ensuring that we had enough lead time to get it done. Not having it done would have had a financial impact on the Agency.

Since I had a Kanban system implemented in Trac, I filtered that legacy app’s past enhancements down to ones similar to the requested work and calculated the average lead time and two standard deviations. I gave them that range, stating that we had roughly 95% confidence we’d fit within the high number. They deeply appreciated the accuracy and precision in this case. This is a form of estimation, of course, but the real point is that day-to-day we never estimated; there was zero value in it. We did capture actual data using our system, though, which made predictability possible just as I mentioned above.
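A minimal sketch of that forecast, assuming the lead times of similar past enhancements have already been pulled out of the ticketing system (the values below are fictional):

```python
# Sketch of the range forecast: average lead time plus/minus two standard deviations.
# Lead times below are fictional stand-ins for data pulled from Trac.
from statistics import mean, stdev

similar_enhancement_lead_times = [18, 25, 31, 22, 40, 27, 35, 20]  # days

avg = mean(similar_enhancement_lead_times)
two_sigma = 2 * stdev(similar_enhancement_lead_times)
low, high = max(avg - two_sigma, 0), avg + two_sigma

print(f"Average lead time: {avg:.0f} days")
print(f"Range: {low:.0f} to {high:.0f} days "
      f"(we'd quote the high end as the ~95% confidence figure)")
```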

Hopefully this will help others at least understand one context where estimates weren’t needed, and also where low fidelity estimates were good enough to establish a reasonable target. I consider myself a no estimates guy only because I look at the assumptions behind why I need to estimate; if I don’t need to, and can derive a more suitable answer in some other manner, I’ll probably use that. It’s all a matter of context.

An Alternative for Identifying Classes of Service


In most established Kanban systems, classes of service refer to an assessment of impact to the business. While I personally like this approach, this assessment technique often doesn’t fit well for some teams or organizational issues. It may also not be very informative for some of the work items being managed. I have always believed in using Kanban, and particularly its associated metrics, for identifying areas to improve. Sometimes we need the ability to slice by items that are similar in impact but vary along other dimensions. So I’d like to present a few other styles for identifying classes of service. I’ll start at the team level and move upwards towards something more organization-wide.

Maintenance Activities

I often find teams performing maintenance activities (upgrades, defect/bug fixes, small to large enhancements, etc.) struggling to find ways of understanding the metrics that will be useful to them. While an Expedite class of service, with its own identifiable swimlane and corresponding WIP limit, is invaluable, a single standard class of service is not when the timeframe or scope tends to skew the metrics. I want to be able to predict, with some confidence, when an activity may be done. If I lump all of the activities into one standard class of service, the larger items will skew the average lead time higher than my smaller activities warrant, and my variability will be very high (the toy numbers below illustrate this).
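A toy illustration of that skew, with invented lead times:

```python
# Toy numbers only: lumping large upgrades in with small fixes inflates both the
# average lead time and the variability, hurting predictability for the small items.
from statistics import mean, stdev

small_fixes = [3, 5, 4, 6, 5, 4]     # lead times in days
big_upgrades = [45, 60, 52]          # lead times in days
pooled = small_fixes + big_upgrades

print(f"Small fixes alone: mean={mean(small_fixes):.1f}d, stdev={stdev(small_fixes):.1f}d")
print(f"Upgrades alone:    mean={mean(big_upgrades):.1f}d, stdev={stdev(big_upgrades):.1f}d")
print(f"Lumped together:   mean={mean(pooled):.1f}d, stdev={stdev(pooled):.1f}d")
# The pooled numbers say little about when the next small fix will actually land.
```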

A concrete example is an ERP upgrade versus an important (but perhaps not critical enough to go into the Expedite swimlane) bug fix. The ERP upgrade may fix numerous bugs that are just as important. ERP upgrades often can’t be broken into apples-to-apples comparisons with bug fixes; the tasks are entirely different, even though the lifecycle managed through the Kanban process may be identical. Additionally, the items that must be completed for the definition of done (which become cumulative entry/exit criteria along my columns) may also be different.

BTW, these types of items may be tracked within a higher-level Kanban and not necessarily a team-based one…

Portfolio Items

Moving a level or two upward: portfolio items that need to follow an identical process, but have varying entry/exit criteria or varying typical timelines, may also be worth tracking as separate classes of service, even though each may be more or less equally important to the business (i.e., close to the same prioritization in the backlog). Here are some examples: reorganizing a particular function, redesigning a business process, implementing a new application (at the highest level). Each may follow a similar process: Backlog -> Analyze -> Implement -> Measure Performance -> Done. The definition of done and the timelines may be quite different for each of these items. Wouldn’t it be nice if, the next time a reorganization is proposed, I understood how long my last set of reorganizations took in terms of average lead time and its variability, so I could predictably give an answer to the board? I don’t want to skew that data with the data from my last network upgrade.

One could argue that we could (or should) use separate Kanban boards for these, but I think this is less than useful. I can think of two reasons to have these on the same board.

  1. I want to understand how my organizational WIP of change affects cycle-time overall.  This would be very difficult to do if these were spread across multiple boards. (This is not to say that each effort may not have its own more detailed board.)
  2. If I want to think through alternative approaches and compare cycle-times as a magnitude of cost (since time is often money) and benefit (time to market), having these on the same board makes this much easier. I can use this information as input to my decision-making on which approach to take; for example, to analyze whether I redesign my current business process or automate the existing one. Knowing the cycle-time becomes part of the analysis in terms of both cost and benefit (see the sketch after this list).
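As a back-of-the-envelope sketch of that comparison (every rate, cycle-time, and benefit figure below is invented for illustration):

```python
# Invented figures: turn observed cycle-times into a rough cost/benefit comparison
# between two alternative approaches tracked on the same board.
weekly_run_rate = 25_000      # assumed cost of the team per week
horizon_weeks = 52            # compare over a one-year horizon

approaches = {
    # name: (average cycle-time in weeks from the board, weekly benefit once delivered)
    "Redesign business process": (16, 12_000),
    "Automate existing process": (9, 8_000),
}

for name, (cycle_weeks, weekly_benefit) in approaches.items():
    cost = cycle_weeks * weekly_run_rate
    benefit = max(horizon_weeks - cycle_weeks, 0) * weekly_benefit
    print(f"{name}: cost ~ ${cost:,}, first-year benefit ~ ${benefit:,}")
```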

A Quick Analysis View

So how do we determine these different classes of service? Well, I have already hinted at the dimensions we will use. We’re going to categorize the work item types by the time it takes to get them done (just a gut feel of time) and by differences in the scope of the definition of done, looking for vastly large differences. You can place these on a grid such as the one below.

Classes of Service Matrix

Even items with a similar definition of done may have vastly different timelines; knowing this keeps us from skewing the data when we want to compare like items. Additionally, not lumping together things that have vastly different definitions of done (column exit criteria) yet follow an identical process at the level we are looking at can also be very helpful. The bottlenecks that occur can be different, which also makes this a useful distinction. Lastly, I can now view all of these dissimilar items on the same board and yet have a means of distinguishing them and their corresponding metrics (a rough sketch of tagging items this way follows below).
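One way this could be captured in practice is to tag each work item type with a rough time-to-done bucket and a definition-of-done scope, and let that pair name the class of service; the labels and thresholds below are illustrative assumptions, not a prescribed standard.

```python
# Illustrative tagging scheme for the matrix above; thresholds and labels are assumptions.

def class_of_service(gut_feel_days: int, dod_scope: str) -> str:
    """dod_scope is a rough label such as 'narrow' or 'broad'."""
    time_bucket = "short" if gut_feel_days <= 10 else "long"
    return f"{time_bucket}/{dod_scope}"

work_item_types = {
    "Bug fix": (3, "narrow"),
    "Small enhancement": (8, "narrow"),
    "ERP upgrade": (60, "broad"),
    "Data migration": (30, "narrow"),
}

for item, (days, scope) in work_item_types.items():
    print(f"{item}: class of service = {class_of_service(days, scope)}")
```

Items that land in the same cell can then share a swimlane (or a tag) and have their metrics measured together.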

When one is stuck on identifying classes of service, or the classes of service across items appear meaningless, give this a shot and see if it helps. I’d be interested in other viewpoints.