Sunday, May 18, 2014

Testing in Production?? Of Course! No Never! … ?!?

Recently I’ve come across a number of discussions on testing in production and whether it is good or bad.

Misunderstandings all the way down

Of course it all depends on what you take “testing in production” to mean. If it means delivering products that ripen at the client (known as “Banana Software” in Germany), that is quite different from “being able to probe the running system without (too much) disturbance of vital functions”.

How do other professions handle it?

A little while ago I elaborated a bit more on the subject of testing, and I think most of the ideas from that earlier article are still valid. Testing should contribute to better and more reliable solutions. Whether this requires testing at creation time, build time, roll-out time or during production, testing at the right level with the right approaches is a great thing – of course!!

What do you think?

’till next time

Sunday, May 04, 2014

The blackout version of “stop-the-line”

As the story goes:

“Whenever a worker in that Toyota plant saw anything suspicious or a fault in the product he worked on, he pulled a cord hanging from the ceiling and the whole production line stopped.”

This may seem counterintuitive at first, but it actually makes a lot of sense if the circumstances are right. Consider, for example, a misalignment between the rear-view mirror and the type label on the bonnet of the car that is discovered close to the end of the production line. If it is just caused by a misplaced label, stopping the line might be ‘a bit’ over the top. But if it is caused by misaligned mounting holes for the bonnet (drilled at the very beginning), which in turn lead to errors everywhere downstream from that station (bent hinges, torn padding, sheared bolts etc.), it might be a good idea to stop the line as early as possible and fix the root cause first.

But that’s not related to software, or is it?

This might seem to be less of a problem in software, but in my experience it isn’t – quite the contrary. Let’s assume that a new function is introduced in the newest version of a library or framework, and that this function is redundant (it duplicates an existing one) and also faulty. Not “stopping the line” and eradicating the problem at its roots will probably lead to widespread usage of exactly that new function. Sometimes, in fact, so widespread that the whole system becomes unstable – and a maintenance nightmare as well!

But how to do it in (software-related) development?

Most development teams with a “stop-the-line” policy tend to use another concept from the TPS (the Toyota Production System), the andon, a system to spread important information by visualizing it very prominently. A common example is a traffic light or a set of lava lamps.
But there is a problem with these approaches – they still require everyone to follow the agreement that a faulty build means “stop-the-line”. Also, they only work for faulty builds – not for conceptual problems.
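
To make the two limitations a bit more concrete, here is a minimal sketch (in Python) of an andon signal that anyone can trigger for any reason, not only a failed build, and that the tooling itself can consult before accepting new work. It is purely illustrative: the class `Andon` and its methods `report_build`, `pull_cord`, `resolve` and `may_commit` are names invented for this example and do not belong to any real build server or library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Andon:
    """Hypothetical shared stop-the-line signal for a team."""

    build_ok: bool = True
    cord_reasons: List[str] = field(default_factory=list)

    def report_build(self, ok: bool) -> None:
        # Called by the (assumed) build server after every build.
        self.build_ok = ok

    def pull_cord(self, reason: str) -> None:
        # Anyone may stop the line, also for conceptual problems.
        self.cord_reasons.append(reason)

    def resolve(self, reason: str) -> None:
        # Remove a reason once its root cause has been fixed.
        self.cord_reasons.remove(reason)

    @property
    def light(self) -> str:
        # Red while the build is broken or any cord is still pulled.
        if self.cord_reasons or not self.build_ok:
            return "red"
        return "green"

    def may_commit(self, is_fix: bool = False) -> bool:
        # While the line is stopped, only fixes are let through.
        return self.light == "green" or is_fix


andon = Andon()
andon.pull_cord("new library function duplicates an existing one")
print(andon.light)                    # red, although the build is fine
print(andon.may_commit())             # False: regular work is held back
print(andon.may_commit(is_fix=True))  # True: fixing the cause is allowed
```

Whether such a check should really block commits or just colour the traffic light is a team decision; the sketch only shows that the “agreement” can be moved from people’s heads into the tooling.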

A really cool (but slightly scary) version – The Blackout! …

… was recently brought up by a client of mine: connect the “stop-the-line” buzzers (or cords) to a dedicated power circuit for the displays… Thus, once someone hits the “stop-the-line” button, all screens effectively go dark!
Even though this idea came up as a somewhat humorous remark, I could imagine that it might actually work – at least for teams that have reached the high quality levels typical of ‘hyper-productive’ teams.

So – what’s your policy for defects? And what’s it going to be?

’till next time
  Michael Mahlberg