Site Reliability Engineering (SRE) may be a fairly new practice, but it is gaining momentum quickly, particularly among leading-edge companies intent on building high-performance SDLC organizations.

According to results from the Global SRE Pulse 2022 survey, 62% of respondents reported that their organizations employ SRE processes today, with:

  • 19% using SRE across their entire org
  • 55% leveraging it within specific teams, products, or services
  • 23% just beginning to pilot it

One of the most powerful ways to leverage SRE is to institute it in your SDLC at the points where developers and help desk teams interact. Doing so can immediately improve how quickly and effectively your team addresses voice of the customer (VoC) issues in existing products, and it helps the team anticipate potential issues in future generations of products. Let’s dive deeper…

 

The Impact of SRE

At its core, SRE is a software engineering approach to IT operations that focuses on building a collaborative IT organization without rigid development and production silos. Instead, individuals can move between both disciplines for periods of time, which helps create a “learning organization” that can fold operational lessons learned into future development more quickly. Its popularity is growing largely because of its ability to close SDLC gaps, optimize processes and systems, and drive superior customer experiences.

SRE needs to be injected somewhere in the SDLC process … and that somewhere should be right between the worlds of the help desk and developers. While DevOps has made great progress in creating stronger cross-team collaboration, a core chasm still remains between the help desk and developers. Here’s how it can play out…

Historically, the definition of “Done” for developers has followed something like this:

  • Scrum teams sprint for weeks at a time
  • Developers focus on producing features
  • Features get out the door and pass the demo level
  • The demo satisfies the use case and can be seen to be working
  • The feature/functionality moves from dev to production

But what this flow doesn’t recognize is how completely that feature/functionality will end up satisfying the needs of the end user in terms of usability. Here’s what often happens next.

The help desk gets flooded with angry customer calls and queries about the product. In many cases, those who work the help desk lack the technical expertise to solve these problems… so they do their best to capture and describe the feedback to elevate and escalate back to the dev team.

Dysfunctional feedback loops start to occur. An endless game of “telephone” ensues – and now the people responsible for creating the code must decipher help desk narratives and tackle unscheduled rework that detracts from velocity.

True chaos and something we can remedy!

It starts by expanding the definition of “Done” for the dev team, with “Done” also being about meeting VoC needs. How do we do that? By introducing SRE into the help desk.

 

SRE as the Bridge

Several layers of organizational structure sit between dev and help desk teams. The challenge is that we lose continuity between those who create the code and those who interact with the customers firsthand.

High-performing SDLC functions should have the dev team live with the help desk for bits of time — and vice versa — so that each group can have an appreciation for the problems which the other solves. This is about tearing down barriers that keep teams from collaborating effectively and driving continuous integration. At its core, this is what SRE accomplishes.

To begin, consider creating a rotational roster among your devs so that each one periodically serves a tour of duty in the help desk (for example, six weeks, usually at the 2nd level of support) once code has been released. The rotation helps your devs deepen their technical expertise and app knowledge, while letting them hear straight from the horse’s mouth what is or isn’t working. It also recalibrates their definition of “done” from code that is working to code that is also maintainable and reliable. What’s more, it helps ensure that your devs design and build future products around the way end users actually use them.
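As a rough illustration, a back-to-back rotation like the one described above could be sketched in a few lines of Python. The function name `build_rotation`, the developer names, and the release date are hypothetical placeholders, and the six-week default simply mirrors the example tour length mentioned above. Treat it as a starting point under those assumptions rather than a prescribed scheduling method.

```python
from datetime import date, timedelta


def build_rotation(devs, release_date, tour_weeks=6):
    """Assign each dev a consecutive help desk tour starting at release.

    Hypothetical helper: returns a list of (dev, start, end) tuples,
    where each tour begins on the day the previous one ends.
    """
    roster = []
    start = release_date
    for dev in devs:
        end = start + timedelta(weeks=tour_weeks)
        roster.append((dev, start, end))
        start = end  # next dev takes over when this tour ends
    return roster


# Example with made-up names and an assumed release date.
for dev, start, end in build_rotation(["dev_a", "dev_b", "dev_c"], date(2024, 1, 1)):
    print(f"{dev}: {start} to {end}")
```

A real roster would also need to account for holidays, sprint commitments, and overlap days for handover, which this sketch leaves out.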

Be warned: devs might resist it at first! After all, devs tend to love sitting behind the desk coding, but this is a great evolution for them. It will strengthen their skill sets and create a mindset of continuous focus and care for VoC.

A reciprocal “tour of duty” placing help desk personnel in the dev teams as Business Analysts or developers (depending on their technical background) is equally beneficial. The help desk professionals work directly with the Product Owner — possibly for the first time — and bring the VoC directly to the ear of the Product Owner. This can bring an enormous improvement in the “fit for purpose” of the release before it is ever seen by end users.

SRE makes VoC the responsibility of both devs and help desk. It becomes the responsibility of the help desk to help the dev team write more reliable code that incorporates VoC, and it forces the devs to change their definition of done to include those factors: it closes the loop.

The impact of closing the loop will be felt immediately in the metrics, particularly if you track things like:

  • Mean Time to Repair: The average time it takes to resolve a reported issue should decrease
  • Recidivism Rate: Defects that are opened, closed, and then opened again should become less frequent
  • Defect Density: The number of defects relative to the amount of code you introduce; with an expanded definition of what done means, done should start to be truly done
  • Defect Distribution: Error-prone sections of code will be identified more readily and refactored to improve their maintainability and reliability
  • Help Desk Customer Satisfaction Scores: Go from angry to happy customers with a team that can expeditiously resolve issues and create customers who feel truly heard
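To make the first three metrics concrete, here is a minimal Python sketch of how they might be computed from help desk ticket data. The `Ticket` fields, the function names, and the sample values are all assumptions for illustration; your ticketing system will expose its own fields, and defect density is shown per thousand lines of code (KLOC), which is one common convention rather than the only one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Ticket:
    # Hypothetical ticket record; real systems will have different fields.
    opened: datetime
    resolved: datetime
    reopen_count: int  # times the ticket was reopened after being closed


def mean_time_to_repair(tickets):
    """Average time from a ticket being opened to being resolved."""
    if not tickets:
        raise ValueError("need at least one ticket")
    total = sum((t.resolved - t.opened for t in tickets), timedelta())
    return total / len(tickets)


def recidivism_rate(tickets):
    """Fraction of tickets that were reopened at least once."""
    if not tickets:
        raise ValueError("need at least one ticket")
    reopened = sum(1 for t in tickets if t.reopen_count > 0)
    return reopened / len(tickets)


def defect_density(defect_count, kloc):
    """Defects per thousand lines of code introduced."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    return defect_count / kloc


# Made-up sample data: four tickets taking 4, 8, 6, and 6 hours to resolve.
sample = [
    Ticket(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 13), 0),
    Ticket(datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 17), 1),
    Ticket(datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 15), 0),
    Ticket(datetime(2024, 1, 4, 9), datetime(2024, 1, 4, 15), 0),
]
print(mean_time_to_repair(sample))   # average resolution time
print(recidivism_rate(sample))       # share of reopened tickets
print(defect_density(12, 8.0))       # defects per KLOC
```

Tracking these numbers before and after the first tours of duty gives you a baseline to judge whether closing the loop is actually paying off.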

The era of SRE is here. And it can have the biggest impact when applied within the help desk and dev teams. So, who do you believe will be your first devs to serve their tour of duty?
