Chaos Monkey

Last week something amazing happened. For maybe the 100th time ever, an employee gamed a very well-designed HR policy in place at my friend’s company. As I watched her walk the tightrope between wanting to penalize the employee and wanting to thank him for bringing the loophole to light, I recalled yet another of my favorite tech terms.

Chaos Monkey is a tool built by engineers at Netflix to pressure-test their systems against unexpected, unplanned failures. It deliberately switches off servers in live environments at random, taking the pain of disappearing servers and bringing it forward. Because the team is deliberately sabotaging its own systems (think friendly hackers), it has a strong incentive to design in the redundancy and automation needed to stay reliable in the face of random failures. Training for this randomness produces stronger, more resilient, fault-tolerant systems and software.
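For the technically curious, here is a toy sketch of the idea in Python. It is not Netflix's code, and it touches no real servers: the function names, the kill probability and the "minimum servers required" rule are all my own assumptions, made up purely for illustration. It randomly "switches off" servers from a pretend pool and counts how often the service would have fallen below the capacity it needs.

```python
import random


def unleash_monkey(servers, kill_probability, rng):
    """Return the servers that survive one round of random shutdowns.

    Each server is switched off independently with the given probability.
    (Hypothetical helper, for illustration only.)
    """
    return [s for s in servers if rng.random() >= kill_probability]


def service_is_healthy(survivors, min_required):
    """Assumed health rule: the service needs at least min_required servers."""
    return len(survivors) >= min_required


def count_outages(servers, kill_probability, min_required, rounds, seed=0):
    """Run several chaos rounds and count how many left the service unhealthy."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    outages = 0
    for _ in range(rounds):
        survivors = unleash_monkey(servers, kill_probability, rng)
        if not service_is_healthy(survivors, min_required):
            outages += 1
    return outages


if __name__ == "__main__":
    pool = ["server-%d" % i for i in range(6)]
    outages = count_outages(pool, kill_probability=0.3, min_required=3, rounds=1000)
    print("Outages in 1000 chaos rounds:", outages)
```

Adding spare servers to the pool, or lowering the number the service needs, and re-running it shows how redundancy brings the outage count down. That is the lesson the real tool teaches, only with much higher stakes.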

In short, Chaos Monkey is a metaphor for actively working on what life could throw at your system before it happens. People attempt to break systems all the time. Employees do it, hackers do it, and you and I are always contemplating ways to do it: it’s everywhere. The beauty lies in knowing how to use that to your advantage, as the engineers at Netflix do.

Think of ‘Chaos Monkey sessions’ where you invite a diverse set of ‘hackers’ to come together and try to break an existing or new product, policy, tool or process. Maybe even reward them for doing so. This way, you learn most of the ways the system can be gamed, and most of the foreseeable problems, before they come into play, which lets you build more resilient interventions. When I say diverse, I do not mean just end users. Begin by pulling in a varied set of people from within the HR function: recruiters, compensation consultants, learning partners. Then add a set of non-HR roles. Once you’ve done that, look for a cross-section across gender, disability, color vision deficiency, neurodiversity, and so on. It will surprise you how much you learn when you do this. Not very long ago, I learnt that my new dashboard, which took days to design and relied heavily on color coding, was absolutely useless to one of my color vision deficient leaders.

Sometimes the best-laid plans fall apart, and that is ok. There are only two things one can do: prepare as well as you can, and face failure with grace. Embracing Chaos Monkey helps us prepare. As for the employee, he’s still around and now leads these sessions.
