What is Somers’ Delta?
Somers’ Delta (Somers’ D) is a measure of agreement between pairs of ordinal variables. Ordinal variables are ordered, like best to worst or smallest to greatest (the Likert scale is one of the more popular ordinal scales). A measure of agreement tells you something about how two variables are connected. This connectivity is defined by concordance and discordance. Put simply, concordant pairs “match” and discordant pairs don’t (for a more in-depth explanation and illustrated example, see: What are Concordant and Discordant Pairs?).
Delta can predict column categories from row categories in a contingency table. More specifically, asymmetric* Somers’ D measures how much the prediction for the dependent variable improves, based on knowing a value of the independent variable. Therefore, it’s important to define which variable is the independent variable and which is the dependent variable when running this test: you’ll get two different results for (X,Y) and (Y,X). As a simple example, let’s say you wanted to know whether customer satisfaction (on a scale of 1 to 5) was dependent on how friendly your sales staff were (on a scale of 1 to 3). If you switch the independent and dependent variables around, you’ll be measuring how friendliness of your sales staff was affected by customer satisfaction. That may be interesting information, but it isn’t the relationship you’re interested in.
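To make the asymmetry concrete, here is a minimal sketch that computes asymmetric Somers’ D directly from a contingency table, predicting the column variable from the row variable. The function names and the counts in the table are hypothetical, made up for illustration (a simplified 3 x 2 version of the friendliness/satisfaction scenario), and the rows and columns are assumed to be listed in their ordinal order.

```python
def somers_d_from_table(table):
    """Asymmetric Somers' D: column variable predicted from the row variable.

    `table` is a list of rows of counts (hypothetical helper, not a library API).
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = tied_on_column_only = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Compare each cell only with cells in later rows, so every
            # counted pair differs on the row (independent) variable.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:
                        concordant += pairs
                    elif l < j:
                        discordant += pairs
                    else:
                        tied_on_column_only += pairs
    untied_on_rows = concordant + discordant + tied_on_column_only
    return (concordant - discordant) / untied_on_rows


def transpose(table):
    """Swap rows and columns, i.e. swap the independent and dependent roles."""
    return [list(column) for column in zip(*table)]


# Made-up counts: rows = staff friendliness (1 to 3), columns = satisfaction (low, high)
table = [[8, 2],
         [5, 5],
         [1, 9]]

d_satisfaction_given_friendliness = somers_d_from_table(table)
d_friendliness_given_satisfaction = somers_d_from_table(transpose(table))
print(d_satisfaction_given_friendliness, d_friendliness_given_satisfaction)
```

With these made-up counts the two directions give different values (roughly 0.47 and 0.63), which is why the choice of independent variable has to be made before running the test.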
Delta is an ordinal alternative to Pearson’s Correlation Coefficient. Like Pearson’s R, the range for Somers’ D is -1 to 1:
- -1 = all pairs disagree,
- 1 = all pairs agree.
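Large absolute values of Somers’ D (tending towards -1 or 1) suggest the model has good predictive ability. Smaller values (tending towards zero from either direction) indicate the model is a poor predictor. Let’s say you had a Delta of .549 in the friendly sales staff/customer satisfaction scenario. Customer satisfaction is the dependent variable, so you can say that knowing how friendly the sales staff were improves your prediction of customer satisfaction by 54.9%. More precisely, among pairs of customers who rated staff friendliness differently, the proportion of concordant pairs exceeds the proportion of discordant pairs by 54.9 percentage points. This describes the strength of the association; it does not show that friendlier staff raise satisfaction by 54.9%.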
Somers’ D increases as a contingency table’s dimensions increase, but does tend to underestimate the actual degree of association in tables (Göktaş & İşçi, 2011).
Somers’ D has been defined in several ways. One way is as “the difference between the number of concordant pairs and the number of discordant pairs divided by the total number of pairs not tied on the independent variable” (Oxford Index). This definition gives you an idea of how complex it is to calculate: finding concordant and discordant pairs is no quick task. In addition, the specific formula for Delta depends on the position of the independent variable (Göktaş & İşçi, 2011). This is one reason software is usually used to find Delta.
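The quoted definition can be sketched in a few lines of Python by brute-force pair counting. This is an illustration under stated assumptions rather than what any particular package does: the function name `somers_d` and the six paired ratings are invented for the example, and the first argument is treated as the independent variable.

```python
from itertools import combinations


def somers_d(x, y):
    """Somers' D of y given x: (concordant - discordant) / pairs not tied on x."""
    concordant = discordant = untied_on_x = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # pairs tied on the independent variable are left out
        untied_on_x += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means tied on y only: it stays in the denominator
    return (concordant - discordant) / untied_on_x


# Invented ratings for six customers
friendliness = [1, 1, 2, 2, 3, 3]   # independent variable
satisfaction = [1, 2, 2, 4, 3, 5]   # dependent variable

d_yx = somers_d(friendliness, satisfaction)  # satisfaction predicted from friendliness
d_xy = somers_d(satisfaction, friendliness)  # roles swapped
print(d_yx, d_xy)
```

For these six customers there are 10 concordant pairs, 1 discordant pair and 12 pairs not tied on friendliness, so D is 9/12 = 0.75; swapping the roles changes the denominator to 14 and gives about 0.64. The loop visits every pair, so it is only practical for small samples, and it will raise a division error if every observation has the same value of the independent variable.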
Somers’ D is also sometimes defined in terms of Kendall’s Tau:
D(Y|X) = τ(X,Y) / τ(X,X)
where:
- (X,Y) is a pair of bivariate random variables.
- τ is Kendall’s Tau (tau-a).
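Alternatively, given two observations where one X is larger than the other, it can be defined as the difference between the two corresponding conditional probabilities: the probability that the pair is concordant minus the probability that it is discordant. Delta differs from Tau-b only in how it handles ties: Delta corrects for pairs tied on the independent variable alone, whereas Tau-b corrects for ties on both variables.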
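The link to Kendall’s Tau can be checked numerically. The sketch below assumes the commonly cited form D(Y|X) = τ(X,Y) / τ(X,X), with τ taken as Kendall’s tau-a, and computes Tau-b alongside it for comparison. All function names and the data are invented for the example.

```python
import math
from itertools import combinations


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def kendall_tau_a(x, y):
    """Kendall's tau-a: average of sign(x_i - x_j) * sign(y_i - y_j) over all pairs."""
    pairs = list(combinations(zip(x, y), 2))
    total = sum(sign(x1 - x2) * sign(y1 - y2) for (x1, y1), (x2, y2) in pairs)
    return total / len(pairs)


def somers_d_via_tau(x, y):
    """Somers' D of y given x, dividing out only the ties on x."""
    return kendall_tau_a(x, y) / kendall_tau_a(x, x)


def kendall_tau_b(x, y):
    """Kendall's tau-b, which adjusts for ties on both x and y."""
    return kendall_tau_a(x, y) / math.sqrt(kendall_tau_a(x, x) * kendall_tau_a(y, y))


# Invented data with ties on both variables
x = [1, 1, 2, 2, 3, 3]
y = [1, 2, 2, 4, 3, 5]

d_yx = somers_d_via_tau(x, y)
tau_b = kendall_tau_b(x, y)
print(d_yx, tau_b)
```

Here τ(X,X) is simply the share of pairs not tied on X, so dividing by it removes the X ties from the denominator. With this data D is about 0.75 while Tau-b is about 0.69, because Tau-b also adjusts for the one pair tied on Y.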
Somers’ D vs. Gamma
Both Somers’ D and Goodman and Kruskal’s gamma find associations between two ordinal variables. Unlike Goodman and Kruskal’s gamma, Somers’ D differentiates between the independent variable and the dependent variable. The difference between the two can be fuzzy, although if you know your data and the goal of your analysis (i.e. if having one variable labeled as dependent is important), it should be clear which of the two procedures to use.
*Two versions of Delta exist: asymmetric and symmetric. The asymmetric version is by far the most popular and is the one you’re likely to come across when using software (e.g. SPSS). When you read about “Somers’ D” you’re probably reading about the asymmetric version (although a lot of authors don’t clarify that). A symmetric version, where neither variable is designated as independent or dependent, does exist, so it’s wise to check which one you’re using before interpreting the results.
Göktaş, A. & İşçi, O. (2011). A Comparison of the Most Commonly Used Measures of Association for Doubly Ordered Square Contingency Tables via Simulation. Metodološki zvezki, Vol. 8, No. 1, 17-37.