2012 Index Computation
There are several steps in the process of constructing a composite Index. Some of those involve deciding which statistical method to use in the normalization and aggregation processes. In arriving at that decision, we took into account several factors, including the purpose of the Index, the number of dimensions we were aggregating, and the ease of disseminating and communicating it, in an understandable, replicable, and transparent way.
The following 10 steps summarize the computation process of the Index:
- Take the data for each indicator from the data source for the 61 countries covered by the Index for the 2007-2011 time period.
- Impute missing data for every (secondary) indicator for the sample of 61 countries over the period 2007-2011. Some indicators were not imputed as it did not make sense (logically) to do so. Those are noted here. Broadly, the imputation of missing data was done using two methods: country-mean substitution when the missing value falls in a middle year (e.g. 2008 and 2010 are available but 2009 is not), and geometric average growth rates computed on a year-by-year basis (that is, calculate the year-on-year growth rates and then take their geometric average). Most missing data for 2011 are imputed by applying the (geometric) average growth rate for the period to the 2010 number (some data sources have not yet provided 2011 data for the selected indicators). For the indicators that did not cover a particular country in any of the years, no imputation was done for that country/indicator. None of the primary data indicators were imputed. Hence the 2011 Index is very different from the Indexes computed using secondary data only.
- Normalize the full (imputed) dataset using z-scores, making sure that for all indicators, a high value is “good” and a low value is “bad”. For example, for the Freedom House indicators (raw data), a low score is good and a high score is bad. These were inverted after normalization so that they are consistent with all the other values in the Index, where a high score is always good and a low score is always bad.
- Cluster some of the variables (as per the scheme in the tree diagram), taking the average of the clustered indicators post normalization. For the clustered indicators, this clustered value is the one to be used in the computation of the Index components.
- Compute the 7 component scores using arithmetic means, using the clustered values where relevant.
- Compute the min-max values for each z-score value of the components, as this is what will be shown in the visualization tool and other publications containing the component values (generally, it is easier to understand a min-max number in the range of 0-100 than a standard deviation number). The formula for this is: [(x - min)/(max - min)]*100.
- Compute sub-Index scores by averaging the z-scores of the relevant components for each sub-Index, but applying the relevant weights as found in the “Reference Weighting Scheme” page of the Index file (and below). This is done by multiplying the assigned weight by the z-score value of the component.
- Compute the min-max values for each z-score value of the sub-Indexes, as this is what will be shown in the visualization tool and other publications containing the sub-Index values.
- Compute overall composite scores using the weighted average of the sub-Indexes. The weights are found in the “Reference Weighting Scheme” page (and below). This is done by multiplying the assigned weight by the z-score value of the sub-Index.
- Compute the min-max values (on a scale of 0-100) for each z-score value of the overall composite scores, as this is what will be shown in the visualization tool and other publications containing the composite scores.
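The two imputation rules in the second step can be illustrated with a small sketch. It is not the code used to build the Index. The function names are hypothetical, and "country-mean substitution" for a middle year is read here as the mean of the two neighbouring years, which is an assumption. The sketch also assumes strictly positive values, so that growth rates are defined.

```python
from math import prod


def geometric_mean_growth(values):
    """Geometric average of the year-on-year growth factors of a complete series."""
    factors = [later / earlier for earlier, later in zip(values, values[1:])]
    return prod(factors) ** (1 / len(factors))


def impute_series(series):
    """Fill gaps in one country/indicator series ordered by year (e.g. 2007..2011).

    Missing values are given as None. A series with no usable data is returned
    unchanged, matching the rule that uncovered countries are not imputed.
    """
    filled = list(series)

    # Rule 1 (assumed reading): a single missing middle year takes the mean
    # of the two neighbouring years.
    for i in range(1, len(filled) - 1):
        if filled[i] is None and filled[i - 1] is not None and filled[i + 1] is not None:
            filled[i] = (filled[i - 1] + filled[i + 1]) / 2

    # Rule 2: a missing final year is the previous year's value multiplied by
    # the geometric average growth factor over the earlier years.
    if filled and filled[-1] is None:
        observed = filled[:-1]
        if len(observed) >= 2 and all(v is not None and v > 0 for v in observed):
            filled[-1] = observed[-1] * geometric_mean_growth(observed)

    return filled


# Hypothetical indicator values for 2007..2011 (2008 and 2011 missing).
example = impute_series([100.0, None, 121.0, 133.1, None])
```

In this made-up series, 2008 is filled from its neighbours and 2011 is extrapolated from the 2010 value.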
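The normalization, weighting, and rescaling steps can be sketched in a few lines. This is an illustration under stated assumptions, not the actual Index code. The function names are hypothetical. The population standard deviation is assumed for the z-scores, because the text does not say which variant was used. Inversion of "low is good" indicators is done here by flipping the sign of the z-score, which is one way to make a high value "good".

```python
from statistics import mean, pstdev


def z_scores(values, higher_is_better=True):
    """Standardize values across countries; flip the sign when low raw scores are good.

    Assumes the values are not all identical (otherwise the deviation is zero).
    """
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    sign = 1.0 if higher_is_better else -1.0
    return [sign * (v - mu) / sigma for v in values]


def min_max_0_100(values):
    """Rescale with [(x - min)/(max - min)]*100, the formula given for display values."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * 100 for v in values]


def weighted_score(z_by_name, weights):
    """Sum of weight * z-score over components (or sub-Indexes); weights assumed to sum to 1."""
    return sum(weights[name] * z for name, z in z_by_name.items())


# Hypothetical raw values for three countries on one indicator.
raw = [1.0, 2.0, 3.0]
display = min_max_0_100(z_scores(raw))

# Hypothetical component z-scores and weights for one country.
score = weighted_score({"a": 1.0, "b": -1.0}, {"a": 0.75, "b": 0.25})
```

Aggregation is carried out on z-scores. The 0-100 rescaling is applied only afterwards, for display, which mirrors the order of the steps above.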