Using Economics to Encourage Testing Incrementally (or As You Go Along)

At TriAgile, I had an interesting conversation with a Product Owner. She described to me a problem where the testers could not keep up and their behaviors were actually holding them back. Let me describe her situation…

In her content development team, they had a couple of testers. They manually tested hyperlinks and other HTML/JavaScript/CSS elements towards the end of the iteration. While she would love to move to automated testing, there were some hurdles to get software approved for use, plus she had this whole behavioral mindset she needed to overcome. The testers on her team felt building and running tests incrementally as a developer completed work on acceptance criteria was wasteful. They preferred to do it after a story was completed by teh content developer; this then always put them being crunched. No matter how hard this Product Owner tried to convince them to do their testing as they go, they resisted. Her Scrum Master was also not providing any influence one way or the other.

As we discussed this at TriAgile, I finally settled on economics to help her understand the situation.

Suppose a content developer produced a defect that prevented a CSS library from working by using a faulty assumption (let’s say it was as simple as a misspelled directory in the URL). And let’s suppose this faulty assumption caused the error to be reproduced 10 times. And further, let’s say each time the person did this, it took them 10 minutes to implement each instance. Lastly, it took the tester 5 minutes to test EACH instance.  So let’s do some math. (All of the times are hypothetical; they could be longer or shorter.)

So first up, testing at the end: 10 x 10 minutes implementation + 10 x 5 minutes testing = 150 minutes. But wait, we now have to fix those errors. So presuming that great information got passed back to the developer and it only takes them 5 min to correct each instance, we need to add: 10 x 5 minutes fix time + 10 x 5 minutes retesting = 100 minutes.  So our total time to get to done is 150 + 100 = 250 minutes to implement, test, correct, and retest the work. Our Product Owner had actually said that this kind of error replication had happened multiple times.

OK, what would have happened if it happened incrementally? Well our implementation time is the same, but after the first implementation occurs it would go to get tested. If an error is found, it goes back to the content developer and having seen the error she or he was making, they can now avoid reproducing it. So the time would be something like this: 1 x 10 minutes implementation + 1 x 5 minutes testing = 15 min, then 1 reworked item x 5 min + 1 item retested x 5 min = 10 min, and finally 9 remaining items x 10 minutes implementation + 9 x 5 minutes testing = 130. Total time now is 160 min.

If the cost was $2/minute (assuming a $120/hour rate employee), you easily wasted

$250 – $160 or $90.

Now multiply this by however many teams are not testing as they go and how many times they have this happen.

Of course there could be items caught that are not recurring, but the fact of the matter is, every recurrence of an error that has to be backed out introduces a lot of waste into the system (defect waste for you Lean/Six Sigma types). Testing as you go and stopping ‘the line’ to prevent future defects from occurring saves money in the long run since labor time is what we are talking about.

In addition to this direct savings as calculated above, one ALSO has the queue time for each test that is awaiting to being tested before it can be OK-ed to produce value. In the first instance, this may be building up considerably, delaying production readiness. And suppose out of the 10 occurrences above, only 8 could be completed because we’re near the end of the iteration? Then we’re probably not going to get all of them tested and any fixes done in time. If we had been testing along the way, then if something didn’t get tested, we could talk with the Product Owner about releasing what was completed and successfully tested. Something of value is completed as opposed to deploying nothing. There is a real opportunity cost for this delay.

So there is something to be learned by each area with this. For the tester, testing completed work, even manually, incrementally keeps you from becoming a bottleneck to producing value. For the developer, giving developed items to the tester incrementally and getting feedback after each item allows you to correct along the way, possibly avoiding future errors. And for the business, having this occur incrementally actually reduces both the real and potential opportunity costs of the work.

The UX-CX Dance

How seriously do you consider user experience for your internal applications? There  seems to be much discussion around creating good user experiences for outwardly facing applications; however, equally important is those that are internal, particularly if they support people that directly interact with your organization’s customers.

Here’s an example of what I mean; let’s start with some context.

My family and I were flying back from Australia earlier this week. In order to make our flight, we got up very early (4am) to do final prep before our taxi picks us up.We made plans to arrive a little over 2 hours20140523_VH-XFB_Keith_Anderson prior to our flight knowing that would be sufficient to check-in, make it through security, and have a coffee. We’re going to be flying domestic to Sydney before boarding our international flight to LAX and then transfer to their affiliate airline domestically back to DC. We have a premium economy seating internationally and domestic internally; that isn’t offered domestic in Australia (only economy and business).

We arrive at the airport without a hitch. Because we are starting off domestically and not going straight international from Melbourne, we go up to the domestic agent to do our check in. She is very pleasant and nice! (Particularly for this early in the morning, which is about a 5:40ish and a bit earlier than even we anticipated arriving.)

First up, we provide our information and US passports and at her request place our first bag up for weighing. She states that we’re overweight on our bag. The policy of the airline as we had checked it, was 32kg international premium economy (it was 23kg for just economy). I personally checked each bag with a sale we bought as we packed it and ensured we were well under for each bag (greater than 4kg). My wife pleasantly points this out; our agent wasn’t argumentative, but stated she would have to check since we started off only on economy. About 5-6 minutes later when she returned, she had her answer that yes it was allowed. By the time she had gotten back, my wife already had the policy pulled up on their own website. UX point #1: This should have been available on the screen to her without her having to go check with someone (presumably a manager).  Perhaps an agent at a check-in for international would have had this available, but I doubt it; most likely they would just know the policy due to necessity. Given the airlines current route structure, MOST international flights to places in other parts of the world would fly from airports other than Melbourne, thus this ‘help’ feature would have made sense to be made available to every agent.

So she returns to entering our passport information. She apologizes that she has to key in the address we are going to (in this case our home address) for each person separately and thus consumes more of our (and her) time. We casually discuss that this multiple entry seems inconvenient. UX point #2: There should have been a way for identifying people flying together as being members of the same household so that the address field would only be need to be entered once. There is good reason for having this as an option; I could tell she was a bit frustrated about it and it was preventing her from helping others in the line as the airport got busier. She remained very pleasant to us and it never impacted us being able to depart on time.

My wife is both a US and Australian citizen, where as I and my son are just a US citizens. The next issue came up when she entered my wife’s passport information; it didn’t want to let her complete the transaction since she hadn’t entered in on a visa.  She had flown in on her Australian passport since she is an Aussie citizen and didn’t need a visa, where as my son and I entered on visas. So she swapped to using my wife’s Aussie passport; now it wanted a visa for entering the US. After a bit of hassle and finally asking someone, she found out that it wasn’t possible to enter two passport numbers on the screen without having someone link the records on the back-end (presumably some configuration/database entry) to enable that feature. UX point #3: the developers had not considered the persona of a dual citizen and now it had become a clunky customer support  feature. There are lots of dual citizens in Australia, particularly with Britain.

So at this point, let’s stop for a moment and consider perhaps a deeper cause to these three UX points. (BTW, I never saw her screens, but the last two had her frustrated enough that she was pleasantly talking through what she had to do.) I would venture to guess that the development team, and particularly the product owner/business representative, of this application never fleshed out many personas of either the agents nor the customers they would be helping. They probably ONLY considered ‘agent’ role as the one possibility and never the people they help.

Want to improve the product owner’s ability to support her or his user base? Help them understand their customer and that customer’s customer using customer journey maps. (I particularly like using the Lego Customer Experience Wheel or the Innovation game Start Your Day.) Flesh out the personas with Empathy maps and further refine your backlog based on these.  If you want to understand better how backlogs change based on personas (whether it be customer persona or role), check out the game “Backlog is in the eye of the beholder” game.

Organizational or business agility means attending to customer needs; gaining the right UX/CX experience in your product, release, and iteration planning is key to doing that right.

(Incidentally, we had overall CX impacts with how the airline had negotiated arrangements in how people are physically moved by a bus between terminals in Sydney as well, so using customer journey maps can really help give you a holistic view in how to improve your relationships with them, something that is all important these days.)

Demonstrating the INVEST Criteria

potters_gold-2

I’ve been doing some rather “loftier” types of post, let’s return to something a bit more fundamental to (software) product development, user stories and in particular the INVEST acronym as developed by Bill Wake (see INVEST in Good Stories, and SMART Tasks). I was helping a coworker with some good examples of stories to showcase the INVEST criteria and felt this may be a useful post for people.

Let’s start with two formats User Stories may be expressed, we’ll stick with latter:

Who-What-Why

Or more commonly as

As a (role or persona)

I want to (perform some business function)

So that I can (get some business value/rationale)

Usually breakdowns in good user stories fail to articulate one or more of the INVEST criteria. Let’s look at each separately along with some examples.

I = Independent

We want stories to be independent; an independent story should be small vertical slice through most, if not all, of the software stack (UI, business logic, data persistence, etc.). Let’s start with a counter example to help demonstrate this.

As a decision-maker,

I want the data selection table menu to show the latest option results

So that I can determine which one to analyze.

Sounds OK right? Not really, the menu is a UI item. Where is this data going to come from, presumably a database, file, or API. It may get processed in a middle tier to do some filtering or sorting. The UI layer where the menu resides is only one layer; this story would be dependent on other stories in other layers to be able to be implementable. Usually any story that goes into the ‘how’, becomes less independent. Let’s rewrite it to –

As a decision-maker,

I want to view the latest option results

So that I can determine which one to analyze.

Besides appearing simpler, this doesn’t specify the menu, leaving the development team needing to do all the tasks to implement the results. Tasks could be querying the table, apply filter algorithm for outliers, sort from highest to lowest, display as a menu. It also doesn’t lock the team into the how – if the result could also come from an API or web service they can present those as an options to the product owner for selection; same with the menu, perhaps a table would be better.

N = Negotiable

Negotiable means the product owner and development team can make trade-offs on the priority of the story and/or acceptance criteria. Again let’s start with a counter example.

As a survey reviewer

I want to compare multiple respondent data sets

So that I can see if a correlation may exist.

What data sets? What data of the data sets? How is the product owner supposed to negotiate on this? Let’s add some detail –

As a survey reviewer

I want to compare age bracket data to geographic region

So that I can see if particular geographic regions contain particular high levels of a particular age group.

This is more negotiable; why? Suppose there was a second story –

As a survey reviewer

I want to compare income bracket data to geographic region

So that I can see if particular geographic regions contain particular high levels of a particular income.

Now the product owner can negotiate on which one is more important? They could also dig into acceptance criteria and talk about the ages or incomes that make up those brackets or what level of granularity they need to do for the regions. Often non-negotiable stories, ones that seem that MUST be done and can’t be ranked against others that MUST be done also are an indicator they are too big; they encompass too much.

V = Valuable

Another counter example will illustrate a story that doesn’t articulate value…

As a decision-maker,

I want to view the latest results

So that I can see them in order.

Why do I want to see them in order? (It’s presumed the order desired would be acceptance criteria. Better to specify the why, this also usually indicates why not only is the function needed, but why the particular acceptance criteria was chosen. Here is our refined story again –

As a decision-maker,

I want to view the latest results

So that I can determine which one to analyze.

Now we know why we need to do it.

E= Estimable

We don’t care so much about the estimate, which is one reason we use relative estimation based on complexity over trying to nail down an estimate in effort/length of time (hours for either). We care that some amount of certainty in the complexity can be articulated; this gives us a gauge that it is understood well enough to start. The higher the estimate, the less certainty, meaning it is more complex. At some point, this may require splitting into 2 or more stories to reduce complexity.

As a investor,

I want the latest analysis

So that I can decide what to do.

What do we mean by latest analysis? How do we estimate that? And that value statement doesn’t help; what decision are we trying to make – the business function – and why do I want to make it – the why. Here’s a story that may be estimable (providing acceptance criteria can be drawn from this)

As a investor,

I want the latest ROI graph with my minimum threshold shown

So that I can decide whether to continue making this investment.

OK, we want a graph, which we know must draw on data; if the raw data needs to go through calculations, we will need to do that. This threshold, is it entered or stored somewhere? Looks like well need tests to ensure the calculations are done properly. If we need to ensure web accessibility for people with sight disabilities, we may need a textual equivalent. Regardless, even with this uncertainty, being able to see most of the tasks and thinking on their complexity will give me the ability to estimate. Many have found that the estimate becomes pointless once the team actually has confidence they can complete it along with other stories in an iteration; remember this is mostly to describe common understanding. This may take months or even years to get to that point though.

S = Sized properly

Hand-in-hand with estimable, is sizing. If the story is large, really complex, then we need to think about splitting it into smaller independent stories. A good example of a story that is probably too large is the first story that dealt with a survey reviewer. The stories that follow it describing the data sets to compare are smaller and clearer and probably could be successfully implemented within an iteration. Who knows if the first one could? Also, if I couldn’t I get no partial credit for getting some of it done. If I get any small story done, then I can take credit for it.

And lastly, T = Testable

Testable stories are determined by their acceptance criteria. Let’s go to our first good story and fill in some acceptance criteria to see this clearly.

As a decision-maker,

I want to view the latest option results

So that I can determine which one to analyze.

When we turn the card over, we find the…

Acceptance Criteria:

  • Display options as menu choices
  • Display options in descending order from highest to lowest
  • Display results below my threshold in red and bold these
  • Don’t display negative results
  • Option results are calculated by the uncertainty index to the simulation result
  • Return the results in 0.3 of a second

These are easily testable, manually or in an automated fashion. (NOTE: there is a more sophisticated method called Given-When-Then from Specifications by Example by Gojko Adzic that allow these tests to be more easily automated in tools such as Cucumber.)

Using Dollars as a Constraint on a Project

I’ve been planning to write this for awhile, and this seemed to me more important to post after seeing an update from a Kickstarter campaign I am backing.

So I backed a board game, I was particularly interested in that it i intended to be small so I can take it with me almost anywhere I go.  What was amazing to me was how they calculated what they needed for funding.

Before I dive into that, I’ve backed quite a few boardgames on Kickstarter (along with music albums, music gear, and camping gear…) Most Kickstarter projects go in with varying degrees of estimations; one nice thing Kickstarter does is if you don’t reach your funding goal you don’t owe to make anything and the backers keep their money.  If you get funded, your estimates hopefully allow you to produce the game and have at least a small measure of profit. Most projects offer stretch goals that when they are reached, component upgrades and such kick in – these usually have a change in your estimate.

The gentleman that developed Carrier Commander, decided on a price point he wanted to be able to sell the game ($3 as it is a nanogame; I love small games to take with me when I travel). From there he reversed everything into size and weight by calculation based on what would be possible should he hit his stretch goals.

On the campaign page, he reveals the cost breakdown including the “Uh-Oh” zone which is the profit area…

To read up on how he calculated his way into the $3 price point without estimating, see this update:

https://www.kickstarter.com/projects/1078944858/star-patrol-carrier-commander-3-sci-fi-strategy-na/posts/1348731?ref=dash

Should all Kickstarters work this way?  Probably not…  The larger the game, the more the calculations would become overly cumbersome, particular as stretch goals needed to be calculated, so using estimation and factoring in reserve to cover the uncertainty would probably suffice.  In his instance, his upgrades were in cardboard only, so this made it much easier.

So how would this relate to software development? As I wrote in my post “When I Have Skipped Estimates”, one could use a team size as a constraint and then measure throughout.  Once the constraining bottleneck is understood and all worthwhile options for increasing throughput there have been exhausted, you could increase capacity.  This really works well for software maintenance.

One could also use something akin to what this gentleman did for his Kickstarter game; establish a fair market value for the cost of what you are building (i.e. how much is someone willing to pay to have something by a particular point in time).  Once you have this you have both time and budget constraint and now you can see how much that would pay for in terms of people and other infrastructural resource one may need; i.e. what is the capacity it can purchase?  Let’s say we got enough money that it would pay for 7 people for 6 months (+ servers, desktops, software licenses, etc). We can then execute and develop based on that.

One may ask at this point, how do you know if you will make what is needed? You actually don’t. What you do know is that this is what the person or people that set the constraint said would be what they are willing to pay. Like a venture capitalist, they have in their mind, I am willing to risk this amount of money to see if I can get what I want.  Yep, no guarantee. But then, an estimate doesn’t produce one either.

Should you do this under all cases? Absolutely not. In fact, I would say estimation is needed more often than not when deciding to fund a project (or program). And for those cases, we as an industry need to improve in estimation. However, there are cases,where estimation doesn’t necessarily help us. The more novel the project (and thus its approach), the greater the uncertainty and at some point it may be best to establish a cost (and perhaps schedule) constraint and see what you get at the end of that.  Got something valuable? Perhaps continue forward (and perhaps now introduce estimation); what you got isn’t valuable? Then you can use the knowledge you gained to decide to continue or not (and perhaps add in estimation or not).  You can use the knowledge you have to make a decision.

At least those are the cases I have for when I would go a #noestimates route… What are yours?

I’m interested in exploring each side; if this interests you, I hope you will consider joining me at the first Agile Dialogs unconference I am putting together.

Using a Business Canvas in a Government Environment

At least some of you know I worked at the Environmental Protection Agency in the Office of Pesticide Programs (OPP).  At one point I and a colleague created a Business Canvas for our office; this concept comes from Alex Osterwalder’s book, Business Model Generation.  Below is what I can remember of our canvas (we did this about 5 years ago and I did not take it with me, so this was reproduced from memory; it’s mostly correct).

OPP_Biz_Canvas

These high level items allowed us to identify quite a few useful things. I’m not going to go through every box at the moment, but what we found we could do with this was identify weak spots (our IT contractor at the time was a weakness for us) and the primary activities to leverage to create our value propositions.  We did some postulating on new possible customer segments and thought specifically targeting farmers (one of the largest users of pesticides) may be a good thing to call out.

We then did an analysis on various trends. One trend stuck out; while we were a monopoly, we still were subject to market forces. The economy at the time had been in recession for a couple of years, a pretty severe one at that.  PRIA registrant fees funded much of our work. If the economy is tanking, less pesticides will be purchased (farmers in particular will try and get with less to lower costs). This in turn normally lowers the amount companies will invest in R&D. Without R&D, less new pesticides will be rolling out for registration, meaning less funds and work for OPP. There isn’t anything magic here, but the canvas had us postulating on it.  We went to talk with our IT Director as we wanted to find a way of testing this hypothesis as it would have a severe impact on the work we do; he showed little interest.

Later that year, the Office Director for OPP announced we were going to have the least number of registrations on record since the Office was founded. I can only envision had we tested our hypothesis we would have had a leading indicator as opposed to the lagging indicator of watching the number of registrations trend significantly lower than expected.

Most Government organizations have only appropriation.  Even so, thinking in terms of the value propositions being delivered to customer segments and the activities and partners needed to do this can be really advantageous.

T-Shaped/H-Shaped Contracting Officers

Recently the US Digital Services and Office of Federal Procurement Policy issued an OMB Challenge; in it they discuss how contracting officers need to be more knowledgeable in digital services procurements. (Digital Services seems to be the new 18F-ish buzzword for user-centric software development, though they also reference cloud-based services…)

In this challenge, they mention creating depth of knowledge in digital services procurement, however they also suggest a desire to increase their business savviness, though they don’t express exactly what is meant.

T-shaped people have both depth and breadth of expertiseThis prompts me to simply point out that contracting officers and specialists (as well as any acquisition-related professional) are needed to aspire to become generalizing specialists or T-shaped people.  What do I mean by this?  For a contracting officer, this means becoming not only steeped in contracting services, but knowing enough about information technology to understand what may or may not apply to procurements. I’d also suggest getting more knowledgeable in their department’s or agency’s mission and understanding their needs earlier on is what will also aid them in becoming better at digital services procurements.

The challenge wants a CORE-Plus curriculum; IMHO this indicates that the government is interested in beginning to create contracting officers that have more breadth.  This helps attune their contributions to become more valuable as their knowledge increases to better align with the services being procured.  In some ways the desire to have contracting officers undergo a CORE-Plus certification, means they will be more like H-shaped people with some deper knowledge of digital services technologies as well.

Contracting, particularly in the government, is a complex undertaking.  As someone who maintained several DAWIA (Defense Acquisition Workforce Improvement Act) certifications myself, I can attest to how valuable it is for personnel to have a broader understanding for what they are acquiring and how it fits into the needs of the organization that will utilize it.

For an excellent general write-up on what T-shaped people are, drop by Darren Negraeff’s post The Importance of T-Shaped Individuals.  It contains links to further reading and is also where the T-shaped image above comes from…

A Short Essay on Using Models – Why Should You Use Them & Why You Should Create Some

EA-7L_Corsair_Line_Drawing I use many models in my thinking, whether they are mine or someone else’s, yet I don’t think of myself as a theorist. I thought it may be helpful to some on why models are so valuable to a pragmatist. Another word for model is framework…

“essentially, all models are wrong, but some are useful”

George E.P. Box

This quote is the first thing to remember when you begin using any model; you need to remember that at some point a model will break down and no longer support what you were using it for…  Like a lean start-up idea, create and use models passionately, but stop using them the moment evidence points that they are no longer helpful.  (he nice thing about a model though is that generally this means you have crossed an edge-case where the model doesn’t work any longer, but may still be useful in the long run.  If the model consistently doesn’t work, then perhaps the model has some invalid assumptions.  Exploring these assumptions then may help you refine the model into something that once again works or to find or develop a model that does work under the broader circumstances.

This brings me to the next point – ALWAYS realize models have a set of assumptions.  Explore how the model works under these assumptions.  This helps you understand when the model may be useful and when it may not. With that, why do you need them if you are simply someone (particularly a coach or manager) who needs to help people get things done?

Models help you understand systems; they may not provide a means to achieve an answer, but may simply may provide a means for organizing your thoughts.  The Cynefin model by David Snowden is one of these latter ones – it can help you understand the problem space you are exploring for decision-making. Finding models that can represent systems or at least significant and important portions of a system is mostly useful for helping you organize your thoughts.  The act of thinking through when and how these apply including valid and invalid assumptions about variables, algorithms, or organization (for more pictorial models) really helps you determine on which things to pay attention.  Even if you find the model doesn’t work, the amount of thinking you went through will serve you well.

And I invite you, particularly when you don’t find a model that seems to represent what you need, to try and think through creating one.  Don’t worry about it being perfect, you can always adapt the model after inspecting how it works.  Again, you are using this to organize your thoughts.  Creating a model could be as simple as combining models; Jurgen Appelo’s CHAMPFROGS model about motivation does this.  It appears Jurgen saw gaps, overlaps, and some inconsistencies in representation and blended a new model to make it more clear to him.

It’s also extremely useful to find where different models connect in explaining the same observations (data) differently.  This helps you understand where options may be found and where the thinking on these has many dimensions, which again exposes assumptions about the models.

Going back to the usefulness, one huge benefit for applying or creating a model is stepping back from tactical thinking to a more strategic layer.  This helps in prioritizing based on importance over simple urgency.

People serving as coaches and managers are there to help the people improve the system, you can do this best when you have your own thoughts organized. Models can be an essential tool in selecting and organizing the particular tools and techniques needed to apply.