A benchmarking example in R

DEA
The aim of this example is to learn how to apply DEA and to demonstrate that DEA is capable to reveal inefficiency. Therefore, we will first simulate our dataset of 100 courts, based on a certain assumption on inefficiency of each court (we will therefore know the’true efficiency’). And then we will use DEA to determine efficiency by means of DEA. By comparing DEA results to the assumed efficiency, we will establish whether DEA is capable to reveal true efficiency.
Authors
Affiliations

Misja Mikkers

University of Twente

Victoria Shestalova

Dutch Healthcare Authority (NZa)

Published

August 21, 2021

Introduction

In this post we will give you an example of a DEA benchmark performed in R. We will do this with simulated data.

In this case study we consider 100 courts. Courts handle ‘disputes’, which result in cases or settlements. Hence, settlements and cases are (perfect) substitutes. On the input side, courts incur costs. We consider several DEA models for benchmarking courts. The aim of this example is to learn how to apply DEA and to demonstrate that DEA is capable to reveal inefficiency. Therefore, we will first simulate our dataset of 100 courts, based on a certain assumption on inefficiency of each court (we will therefore know the’true efficiency’). And then we will use DEA to determine efficiency by means of DEA. By comparing DEA results to the assumed efficiency, we will establish whether DEA is capable to reveal true efficiency.

Packages

Code
library(tidyverse) # for manipulating data and plots
library(Benchmarking) # for DEA
library(PerformanceAnalytics) # for correlation plots
library(knitr) # for tables

Simulation of data

We will first simulate some simple data of 100 Courts. The courts produce Solutions to disputes (“production”), which are “Cases” and “Settlements”

ID: just an identification number that we will not use: Total_Eff_Cost : Is the “real efficient total cost”, not observed in practice Efficiency: This is the “real efficiency score”, not observed in practice. Because we want to have some efficient firms for this examples we make sure that firms with id’s 1, 50 and 100 become efficient. Total cost: The total cost we observe and can use in our benchmark as input Cases: The number of verdicts written by each court Settlements: Cases that did not come to a verdict (e.g. plea guilty or compromise) Production: The total of cases and settlements

The data is simulated so that:

  • all inefficiency leads to higher cost
  • production is a (lineair) function of total efficient cost
  • which means we have chosen for a CRS technology (see discussion later)
  • settlements and cases are subsitutes
Code
set.seed(123)
id <- 1:100 
d1 <- as.data.frame(id) %>%
  mutate(Total_Eff_Cost = round(runif(100, 100, 500),0)) %>%
  mutate(Efficiency_1 = round(runif(100, 0.6, 1 ),2))  %>%
  mutate(Efficiency = ifelse(id == 1 | id ==50 | id ==100, 1, Efficiency_1)) %>%
  mutate(TC = round(Total_Eff_Cost * (1/Efficiency), 0)) %>%
  mutate(Production = round(15 * Total_Eff_Cost,0)) %>%
  mutate(Share_cases = round(runif(100, 0.4, 0.7),2)) %>%
  mutate(Cases = round(Share_cases * Production, 0)) %>%
  mutate(Settlements = round((1-Share_cases) * Production, 0))

We can inspect the observable data:

Code
d_CC <- d1 %>%
  select(TC, Cases, Settlements)

suppressWarnings(chart.Correlation(d_CC))

Model 1 with 1 input (TC) and 1 output (Cases)

For this model we chose to run a CRS model with 1 input: TC (Total Cost) and 1 output (Cases).

Since Settlements and Cases are subsitutes, firms with relatively more Settlements should score worse.

Select our inputs and outputs

The Benchmarking package needs inputs and outputs to be matrices. As long as we want 1 input and 1 ouput, we just need to select the relevant variables.

Code
mx <- d1 %>%
  select(TC)
my <- d1 %>%
  select(Cases)

Running the model

Now we can run the DEA model:

We need to chose the inputs and the outputs (as we have done above), the orientation (since we are interested in input minimization we chose “in”) and returns to scale (CRS)

Code
model_1 <- dea(mx, my, ORIENTATION = "in", RTS= "crs")

Looking at the output

The package Benchmarking produces a list with some information.

Most interesting information are the scores and the peers.

Code
efficiency_scores <- model_1$eff

d_eff <- as.data.frame(efficiency_scores) %>%
  mutate(Peers = peers(model_1))

kable(summary(d_eff))
efficiency_scores Peers.peer1
Min. :0.4190 Min. :100
1st Qu.:0.5835 1st Qu.:100
Median :0.6812 Median :100
Mean :0.6896 Mean :100
3rd Qu.:0.7901 3rd Qu.:100
Max. :1.0000 Max. :100

As we can see, DMU with id 100 is peer for all firms.

Evaluation of the model

In reality we cannot evulate the model, but since we work with simulated data we can see how well our model performs by comparing the “real efficiency scores” with the estimated scores from the model.

First, we will add the efficiency scores to our initial dataset

Code
d1 <- d1 %>%
  mutate(Model_1 = model_1$eff)

Now we can plot the modelled inefficiency versus the real inefficiency

Code
ggplot(data = d1, aes(x= Efficiency, y=Model_1)) + 
  geom_point(color = "blue") + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  xlim(0.4,1) +
  ylim(0.4,1) +
  xlab("True efficiency") +
  ylab("Efficiency scores model 1") +
  theme_minimal()

We can conclude that our model does not perform very good:

  1. most efficiency scores are underestimated
  2. a few are overstated

The reason is that we only took 1 output, while in our simulated dataset Courts produced 2 outputs.

Full model (1 input, 2 outputs) CRS

Select our inputs and outputs

We will run a model with 1 input (TC) and 2 outputs (Cases and Settlements). First we have to select the right variables as input and outputs. Please note: since we have 2 chosen 2 outputs, we should explicitly transform the dataframe into a matrix.

Code
mx_1 <- d1 %>%
  select(TC)

my_1a <- d1 %>%
  select(Cases, Settlements)

my_1 <- as.matrix(my_1a)

Running the model

The only change compared to the first model is the difference in outputs.

Code
model_CRS <- dea(mx_1, my_1, ORIENTATION = "in", RTS= "crs")

Looking at the output

Again we can extract the efficiency scores and peers

Code
efficiency_scores_CRS <- model_CRS$eff

Peers <-as.data.frame(peers(model_CRS))

id <- 1: 100
d_CRS <- as.data.frame(id) %>%
  mutate(efficiency_scores_CRS = efficiency_scores_CRS)

d_CRS_1 <- cbind(d_CRS, Peers)

kable(summary(d_CRS_1))
id efficiency_scores_CRS peer1 peer2
Min. : 1.00 Min. :0.6016 Min. : 1.00 Min. : 93.00
1st Qu.: 25.75 1st Qu.:0.7206 1st Qu.: 1.00 1st Qu.:100.00
Median : 50.50 Median :0.8093 Median : 1.00 Median :100.00
Mean : 50.50 Mean :0.8189 Mean : 30.07 Mean : 98.28
3rd Qu.: 75.25 3rd Qu.:0.9068 3rd Qu.: 93.00 3rd Qu.:100.00
Max. :100.00 Max. :1.0000 Max. :100.00 Max. :100.00
NA NA NA NA’s :31

We can that the efficiency scores -in general- have improved. The average score is up from around 69% to nearly 82%.

Code
id <- 1:100
Diff <- as.data.frame(id) %>%
  mutate(Difference = round(d_CRS_1$efficiency_scores_CRS - d_eff$efficiency_scores , 2))

ggplot(data = Diff, aes(x = reorder(id, Difference), y = Difference)) +
  geom_bar(stat = "identity", color = "blue", fill = "lightblue") + 
  theme_minimal() +
  xlab("ID") +
   theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 5))

We can also have a closer look at the peers:

Code
d_peers_CRS <- d_CRS_1%>%
  filter(efficiency_scores_CRS == 1)

kable(d_peers_CRS)
id efficiency_scores_CRS peer1 peer2
1 1 1 NA
50 1 1 100
93 1 93 NA
100 1 100 NA

We now see more peers. In the dataframe DMU’s with id’s 1, 100 and 93 are peer. DMU with ID 50 (has also score 1, but is not a peer)

Evaluation of the model

In reality we cannot evulate the model, but since we work with simulated data we can see how well our model performs by comparing the “real efficiency scores” with the estimated scores from the model.

First, we will add the efficiency scores to our initial dataset

Code
d1 <- d1 %>%
  mutate(Model_CRS = model_CRS$eff)

Now we can plot the modelled inefficiency versus the real inefficiency

Code
ggplot(data = d1, aes(x= Efficiency, y=Model_CRS)) + 
  geom_point(color = "blue") + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  xlim(0.5,1) +
  ylim(0.5,1) +
  xlab("True efficiency") +
  ylab("Efficiency scores model 2") +
  theme_minimal()

We can now conclude that the model performs reasonbly well.

Model 2 VRS

We will now estimate the same model with VRS.

Running the model

We only have to change the RTS (Returns to scale option) into VRS.

Code
model_VRS <- dea(mx_1, my_1, ORIENTATION = "in", RTS= "vrs")

Again we can extract the efficiency scores and peers

Code
efficiency_scores_VRS <- model_VRS$eff

Peers <-as.data.frame(peers(model_VRS))

id <- 1: 100
d_VRS <- as.data.frame(id) %>%
  mutate(efficiency_scores_VRS = efficiency_scores_VRS)

d_VRS_1 <- cbind(d_VRS, Peers)

kable(summary(d_VRS_1))
id efficiency_scores_VRS peer1 peer2 peer3
Min. : 1.00 Min. :0.6047 Min. : 1.00 Min. : 18.00 Min. : 24.00
1st Qu.: 25.75 1st Qu.:0.7433 1st Qu.: 6.00 1st Qu.: 20.00 1st Qu.: 93.00
Median : 50.50 Median :0.8491 Median : 11.00 Median : 50.00 Median : 93.00
Mean : 50.50 Mean :0.8413 Mean : 22.33 Mean : 57.78 Mean : 92.91
3rd Qu.: 75.25 3rd Qu.:0.9523 3rd Qu.: 26.00 3rd Qu.: 89.00 3rd Qu.:100.00
Max. :100.00 Max. :1.0000 Max. :100.00 Max. :100.00 Max. :100.00
NA NA NA NA’s :13 NA’s :34

The score is slightly higher than under CRS (from approx 82% to 84%)

We can also have a closer look at the peers:

Code
d_peers_VRS <- d_VRS_1%>%
  filter(efficiency_scores_VRS == 1)

kable(d_peers_VRS)
id efficiency_scores_VRS peer1 peer2 peer3
1 1 1 NA NA
6 1 6 NA NA
11 1 11 NA NA
18 1 18 NA NA
20 1 20 NA NA
24 1 24 NA NA
26 1 26 NA NA
50 1 50 NA NA
74 1 74 NA NA
87 1 87 NA NA
89 1 89 NA NA
93 1 93 NA NA
100 1 100 NA NA

We see now much more firms with score 1, being their own peers.

Evaluation of the model

We can add the efficiency scores to our initial model

Code
d1 <- d1 %>%
  mutate(Model_VRS = model_VRS$eff)

Now we can plot the modelled inefficiency versus the real inefficiency

Code
ggplot(data = d1, aes(x= Efficiency, y=Model_VRS)) + 
  geom_point(color = "blue") + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  xlim(0.4,1) +
  ylim(0.4,1) +
  xlab("True efficiency") +
  ylab("Efficiency scores model 3") +
  theme_minimal()

As we see, in general firms improved (a bit). Especially large or small firms will be efficient be default.

Please note:

  • VRS is always beneficial to the DMU’s
  • The returns to scale choice may depend on:
    • underlying technology
    • possibility for firms to adjust their scale (by merging or splitting up)