The econometric setting for the EDI is a panel with a large cross-sectional dimension: it consists of a large number of indicator series, each with a relatively short time series. The objective is to design a weighting scheme that reduces the large number of indicators to a small number of diversification indices: potentially three sub-indices (production, trade, and government revenue) and/or a single overall index (diversification).
Conceptually, the problem is one of dimensionality reduction: for the set of indicators relevant to each sub-index and to the overall index, the objective is to reduce the number of dimensions in the dataset from the number of indicators to just one. Two general approaches are available for this kind of problem: data compression and prediction. Data compression approaches reduce the dimensionality of a dataset by uncovering the key components of variation across indicators and summarizing them, in a purely mathematical way, according to a pre-defined criterion. Prediction approaches use a given function of the indicator set to predict a variable of interest that should be strongly correlated with economic diversification.
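The contrast between the two approaches can be illustrated with a minimal sketch. It is not the EDI methodology itself: principal component analysis stands in for data compression, ordinary least squares stands in for prediction, and the data are synthetic. The function names (`compression_weights`, `prediction_weights`), the number of indicators, and the target variable `y` are all hypothetical choices made for illustration.

```python
import numpy as np


def standardize(X):
    """Centre each indicator (column) and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def compression_weights(X):
    """Data compression (illustrative): weights are the loadings of the
    first principal component of the standardized indicators."""
    Z = standardize(X)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    w = Vt[0]
    # The sign of a principal component is arbitrary; fix it for readability.
    return -w if w.sum() < 0 else w


def prediction_weights(X, y):
    """Prediction (illustrative): weights are least-squares coefficients
    from regressing a centred target variable on the standardized indicators."""
    Z = standardize(X)
    w, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=None)
    return w


# Synthetic data (assumption): 200 pooled observations of 5 indicators that
# all load on one latent factor, plus a target driven by the same factor.
rng = np.random.default_rng(0)
n_obs, n_ind = 200, 5
latent = rng.normal(size=n_obs)
loadings = np.array([1.0, 0.8, 0.6, 0.9, 0.7])
X = np.outer(latent, loadings) + 0.3 * rng.normal(size=(n_obs, n_ind))
y = 2.0 * latent + 0.5 * rng.normal(size=n_obs)

w_pca = compression_weights(X)
w_reg = prediction_weights(X, y)

# Each approach collapses the 5 indicators into a single index per observation.
index_pca = standardize(X) @ w_pca
index_reg = standardize(X) @ w_reg
```

In both cases the output is one weight per indicator and one index value per observation; the difference lies in the criterion that determines the weights. Note that when the number of indicators exceeds the number of observations, the least-squares weights are no longer unique, so a prediction approach in that setting would likely need some form of regularization or prior variable selection.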