Analytics 101: Why People Leave

Organizations spend a considerable amount of time, money and effort on hiring the right people. Once employees are in the system, it makes good sense for their tenure to balance out that investment. However, as any working professional will tell you, the calculation isn’t always that simple. In addition to hiring and training costs, there is the loss of critical knowledge, the impact on customer relationships and many other factors that can’t always be assigned a financial cost. It is thus easy to understand why employee turnover has given, and will continue to give, HR professionals many sleepless nights.

The marriage of attrition and analytics isn’t new. Most organizations have a number of attrition metrics that are reviewed monthly, if not more frequently. How useful are these metrics in providing insight to help tackle the problem? In truth, they are agonizingly limited. They might point you in the direction of trouble, but they rarely help provide a solution.

Let us take the top reasons for leaving as an example. In their 2010 study, Allen, Bryant and Vardaman found that even the strongest correlation with turnover is only 0.25, which isn’t great: a correlation of that size accounts for only about 6 percent of the variation in turnover. What does this tell us? In absolute terms, very little. To dig deeper, it is necessary to understand why people leave.

Figure: Inverse Correlation with Turnover. Source: Allen, Bryant & Vardaman (2010)

I find the figure below useful for understanding the process that leads to the decision to quit and for seeing which metrics may help. Of course, it isn’t all-inclusive, but it gives a pretty good idea of what goes on in the mind of an employee.

Figure: Process Perspectives on Turnover. Adapted from Lee & Mitchell, 1994

As you can see, there are a number of reasons why people quit. Any one explanation will only account for what’s happening in a fairly small population. You will find people who love their job but quit anyway. On the other hand, there will be people who hate their job and their supervisors but, after comparing the alternatives, find that nothing better is available and decide to stay. Under these circumstances, it becomes extremely difficult to come up with a predictive model that will pass with flying colors when tested for validity.

The good news is that because we deal with the unpredictable nature of human beings, we aren’t extremely hung up on high correlations or high validity. What we are looking for is a model that can help identify the levers that can increase retention. One effective way to build such a model is to use multivariate regression. Putting all the different variables that affect attrition into a single regression helps untangle their individual effects. It also helps you understand which variable has the strongest effect on who stays and who leaves.
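To make the regression idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the two predictors (an engagement score and an overtime level), the rule that generates the outcome, and the helper name fit_ols are all made up, and the data is a toy grid rather than real HR records. It fits a simple least-squares model with a 0/1 "left" outcome; for a binary outcome like this, logistic regression is a common alternative.

```python
# Toy illustration only: every value below is made up, not real HR data.

def fit_ols(rows, targets):
    """Least-squares fit via the normal equations; an intercept is added."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build X'X and X'y.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back-substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(A[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - rest) / A[i][i]
    return coef

# Hypothetical predictors: engagement score (1-5) and overtime level (0-4).
# Hypothetical rule for the toy outcome: an employee left (1) when overtime
# was high relative to engagement, and stayed (0) otherwise.
rows, left = [], []
for engagement in range(1, 6):
    for overtime in range(0, 5):
        rows.append((engagement, overtime))
        left.append(1 if overtime >= engagement - 1 else 0)

intercept, b_engagement, b_overtime = fit_ols(rows, left)
print(f"intercept={intercept:.2f}")
print(f"engagement effect={b_engagement:.2f}")  # negative: less likely to leave
print(f"overtime effect={b_overtime:.2f}")      # positive: more likely to leave
```

The sign and size of each coefficient are what you would read as a "lever". In this toy grid the two predictors are uncorrelated, so there is nothing to untangle; with real data the predictors usually overlap, and that is where putting them into one regression earns its keep.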

A better way, and my favorite, is to use a survival model. Survival analysis is used heavily in epidemiology and is an excellent fit for understanding turnover. It models time to an event, here the time until an employee leaves, and lets you plot a curve of the share of people still on a team over time. You can then play around with variables, say experience level at the time of hiring, to see if the curve moves up or down. The good part of this analysis is that you are not looking at just a 3-month, 6-month or 1-year chunk at a time but plotting it all together.
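As a rough sketch of what such a curve looks like in code, the snippet below uses the Kaplan-Meier estimator, one common way to estimate a survival curve (the post itself does not name a specific method). The tenure figures, the two groups and the function name kaplan_meier are all hypothetical. Employees who are still with the company are treated as censored: they count as "at risk" up to their current tenure but are not counted as leavers.

```python
# Toy illustration only: the tenures below are invented, not real HR data.

def kaplan_meier(durations, left):
    """Return [(month, share_still_employed)] at each month someone left.

    durations: months each employee has been observed
    left: True if the employee quit at that point, False if still employed
    """
    event_times = sorted({t for t, quit_ in zip(durations, left) if quit_})
    curve = []
    surviving = 1.0
    for t in event_times:
        # Everyone observed for at least t months was still at risk at t.
        at_risk = sum(1 for d in durations if d >= t)
        exits = sum(1 for d, quit_ in zip(durations, left) if d == t and quit_)
        surviving *= 1 - exits / at_risk
        curve.append((t, surviving))
    return curve

# Hypothetical group A: hired with prior experience.
months_a = [3, 6, 12, 12, 24, 24]
left_a = [True, True, True, False, False, False]

# Hypothetical group B: hired without prior experience.
months_b = [2, 3, 3, 6, 9, 24]
left_b = [True, True, True, True, True, False]

curve_a = kaplan_meier(months_a, left_a)
curve_b = kaplan_meier(months_b, left_b)

for name, curve in (("experienced", curve_a), ("inexperienced", curve_b)):
    print(name, [(t, round(s, 2)) for t, s in curve])
```

Comparing the two lists is the code equivalent of checking whether the curve moves up or down when you change a variable. With real data you would more likely use an established library and a formal comparison test than a hand-rolled function like this one.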

The growing availability of data and strong analytical tools provides a fantastic opportunity to bring more rigor into turnover analysis. It helps us move from systems based on instincts and hunches to data-backed decisions. However, I must leave you with a caveat. Data can provide excellent validation or throw you completely off course. There will be times when, say, network data indicates that your promotion candidate is not optimally connected with the necessary stakeholders whereas another candidate is. Under those circumstances, you may begin to question your judgment and all the other indicative data. Realize that data comes with noise and limited correlations. Use data wisely, and understand when to be data-informed and when to be data-driven.

P.S: This post was first published on HCI on August 16, 2016.

