Building Resilient Teams Through Strategic Redundancy

I’ve been thinking a lot about redundancies lately – not the kind splashed across headlines about corporate downsizing, but the reassuring kind, like knowing there are backup engines when one fails on your transatlantic flight. This fascination hit me hard during the recent wave of tech layoffs, where I watched companies strip away their safety nets in the name of efficiency. It’s ironic, really. We carefully build redundancies into nearly everything that matters – from backing up our family photos to keeping spare house keys with trusted neighbors. These aren’t just backup plans; they’re the invisible thread of trust woven into our daily lives. In engineering, we’ve long known that a small investment in redundancy can dramatically reduce the chance of failure. That is why it’s common practice to run a system on multiple hosts. The math is simple, but the implications are profound: spend a little more, get exponentially more reliability. Yet somehow, when it comes to our organizations and teams, we seem to be ignoring this fundamental principle.

The mathematics of redundancy is fascinating. While I’ll draw primarily from engineering teams, the area I am most familiar with, these principles apply across many organizational contexts. In engineering, adding redundant components so that a system keeps working when one of them fails is called building a redundant architecture. A system with two independent components, each with 90% reliability, has a 99% chance of at least one working, because both have to fail at once and that happens only 10% × 10% = 1% of the time. Add a third, and you’re at 99.9%. This compelling math has shaped everything from aircraft design to data centers. Of course, redundancy isn’t always beneficial. The math assumes failures are independent. In complex systems, cascading failures can actually be amplified by redundant components, turning what should be a safety net into a liability. Think of how a single algorithmic trading error can replicate across backup systems, magnifying losses instead of preventing them. But in most system designs, thoughtfully implemented redundancies build customer trust and system resilience. However, when we turn the same lens to organizational design, it gets far more interesting – and complicated.
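The 99% and 99.9% figures above can be reproduced with a few lines of Python. This is a minimal sketch that assumes components fail independently of one another (exactly the assumption that cascading failures break); the function name `at_least_one_works` is made up for illustration.

```python
def at_least_one_works(reliability: float, copies: int) -> float:
    """Probability that at least one of `copies` components works.

    Assumes every component has the same reliability and that failures
    are independent, which real systems only approximate.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system fails only if every copy fails at the same time.
    all_fail = (1.0 - reliability) ** copies
    return 1.0 - all_fail


if __name__ == "__main__":
    for copies in (1, 2, 3):
        print(f"{copies} component(s): {at_least_one_works(0.9, copies):.1%}")
```

Each extra copy multiplies the failure probability by another 0.1, which is why the gains look so dramatic for so little added cost.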

Unlike mechanical components, humans defy neat mathematical formulas. When we add redundancy to teams, we’re not simply duplicating identical parts with predictable failure rates. Each person brings unique perspectives, skills, and approaches – making the math messier but potentially more powerful. Take a software engineering team maintaining a critical service: having three engineers who can deploy fixes might seem like straightforward redundancy. But Engineer A might excel at debugging performance issues, Engineer B might have deep security expertise, and Engineer C might be best at system architecture. Their overlapping capabilities provide basic redundancy, but their diverse strengths create a multiplier effect that goes beyond simple backup and risk management. It creates space for innovation and breakthroughs that designing for efficiency alone cannot achieve. Yes, constraints breed resourcefulness and innovation, but constraints can also stifle innovation when engineers are struggling to keep their heads above water and live in constant fear of being axed.

On the flip side, redundancies introduce complexity that mechanical systems don’t face. Communication and management overhead grows faster than headcount: a team of n people has n(n-1)/2 possible one-to-one communication channels, so every additional person adds more coordination than the last. Knowledge sharing becomes far more critical. Team dynamics shift in ways that can either enhance or diminish effectiveness. And unlike mechanical components that operate independently, humans influence each other’s performance, for better or worse. A redundant team member might step up brilliantly during a crisis, or they might create confusion about ownership that slows down decision-making.
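One common way to put a number on this overhead is to count the possible one-to-one communication channels in a team, which is n(n-1)/2 for n people. The sketch below is only a rough proxy, since real coordination cost depends on far more than pair counts, and the function name `communication_channels` is made up for illustration.

```python
def communication_channels(team_size: int) -> int:
    """Number of distinct pairs of people in a team of `team_size`."""
    if team_size < 0:
        raise ValueError("team_size cannot be negative")
    # Each person can pair with everyone else; halve to avoid double counting.
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for size in (3, 4, 5, 10):
        print(f"{size} people: {communication_channels(size)} channels")
```

Going from four people to five adds four new channels, while going from nine to ten adds nine, which is one reason extra capacity is worth placing deliberately rather than everywhere.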

However, we must be clear-eyed about when redundancy can become dangerous. When roles become truly redundant – with multiple people doing exactly the same thing – it can lead to confusion, reduced accountability, and inefficiency. The goal isn’t to duplicate everything but to strategically build in capacity where it matters most.

The key, then, isn’t just about having backup like we do with non-human systems. It’s about thoughtfully designing teams where overlapping capabilities enhance rather than interfere with each other. This is where the art of organizational design meets the science of redundancy planning.

In practice, strategic redundancy works best when built around three core principles: critical path coverage, clear ownership with collaborative overlap, and planned slack.

First, identify your critical paths. These are the essential processes and knowledge areas where failure isn’t an option. In software teams, this might mean ensuring multiple engineers can handle production incidents or having overlapping expertise in core technologies. But it’s not just about technical skills. Maybe your product manager is the only one who deeply understands both your customer needs and technical constraints. That’s a single point of failure waiting to happen.
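A simple way to start is to write down who can cover each critical area and flag the areas with only one name against them. The sketch below is one possible approach, not a prescribed method; the area names, people, and the `single_points_of_failure` helper are all hypothetical.

```python
def single_points_of_failure(
    coverage: dict[str, set[str]], minimum: int = 2
) -> dict[str, set[str]]:
    """Return the critical areas covered by fewer than `minimum` people."""
    return {
        area: people
        for area, people in coverage.items()
        if len(people) < minimum
    }


# Hypothetical coverage map: critical area -> people who can handle it.
coverage = {
    "production incidents": {"engineer_a", "engineer_b", "engineer_c"},
    "database expertise": {"engineer_a"},
    "customer needs and technical constraints": {"product_manager"},
}

at_risk = single_points_of_failure(coverage)

if __name__ == "__main__":
    for area in sorted(at_risk):
        names = ", ".join(sorted(at_risk[area]))
        print(f"Only {names} can cover: {area}")
```

The list it produces is a starting point for conversation, not a verdict: some thinly covered areas may not be critical, and a name on the map says nothing about how deep that person's knowledge goes.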

Second, maintain clear ownership while fostering collaborative overlap. Each system, project, or area should have a primary owner, a single throat to choke, who drives decisions and holds ultimate accountability. But they should be working alongside others who understand enough to step in when needed. This isn’t about creating backup copies of people; it’s about building a web of complementary capabilities. In practice, this might look like regular rotation of responsibilities (as we do with on-call rotations), pair programming sessions, or cross-functional project teams where people can learn from each other while maintaining their primary focus areas.

Third, and perhaps most controversially, plan for slack. This means intentionally operating below maximum capacity, maintaining a talent buffer of 15-20% in critical areas. This isn’t new. Most sprint planning already accounts for slack, typically reserving 20-30% of capacity for unexpected work and improvements. It might seem inefficient to have five people doing work that four could handle, but this slack serves multiple crucial purposes. It creates space for learning and innovation, allows teams to handle unexpected challenges without burning out, and provides capacity for knowledge transfer and collaboration that strengthens the entire system.

I recently witnessed the power of this approach in action when a lead database engineer unexpectedly took extended leave. Because we’d intentionally built overlap in database expertise across the team and maintained enough slack for others to step up, what could have been a crisis became a growth opportunity. The team not only maintained our systems but used the challenge to improve our documentation and cross-training practices.

Of course, this approach requires defending against the constant pressure to optimize for short-term efficiency. It means being able to articulate why having ‘extra’ capacity isn’t waste – it’s insurance, innovation capacity, and organizational resilience all rolled into one. It means acknowledging that this may sometimes translate into ‘extra’ managers to manage careers, overlaps, and strategic redundancies, and standing firm when finance questions why your team isn’t operating at 100% utilization. And sometimes, it means accepting higher immediate costs for longer-term stability and growth potential.

As we navigate an increasingly unpredictable business landscape, the true cost of running too lean is becoming painfully clear. I’ve watched organizations sacrifice long-term resilience for short-term efficiency, only to find themselves scrambling when key people leave or when market conditions demand rapid adaptation. The ‘lean and mean’ approach might look good on quarterly reports, but it often masks growing organizational debt that comes due at the worst possible moments.

Building thoughtful redundancy into our organizations isn’t about inefficiency or waste; it’s about creating systems that can not only survive but thrive through change and challenge. Just as we wouldn’t want to fly in a plane without backup engines, we shouldn’t build teams without backup capabilities. The key is being strategic about where and how we build these buffers.

Remember that spare house key you keep with your neighbor? It might seem unnecessary until the day you really need it. Similarly, organizational redundancy might seem like a luxury until it becomes the difference between resilience and collapse. The math might be messier when we’re dealing with human systems, but the principle remains the same: small investments in strategic redundancy can yield disproportionate returns in reliability, innovation, and long-term success.

In the end, the question isn’t whether we can afford to build redundancy into our organizations. The question is whether we can afford not to. So, the next time you are designing a team, add a slide or a paragraph on strategic redundancies and get ready to defend it. Here’s to hoping we succeed in making this acceptable in a world where it may be frowned upon.

Further recommended reading: https://blog.joemag.dev/2024/01/the-mathematics-of-redundancy.html

Published by Ankita Poddar