
This article was downloaded by: [Indian Institute of Technology Guwahati] On: 26 March 2014, At: 22:54 Publisher:

Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

The Journal of Development Studies


Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/fjds20

Quantifying the Qualitative: Eliciting Expert Input to Develop the Multidimensional Poverty Assessment Tool
Alasdair Cohen & Michaela Saisana

University of California, Berkeley, USA

European Commission, Joint Research Centre, Ispra, Italy Published online: 20 Nov 2013.

To cite this article: Alasdair Cohen & Michaela Saisana (2014) Quantifying the Qualitative: Eliciting Expert Input to Develop the Multidimensional Poverty Assessment Tool, The Journal of Development Studies, 50:1, 35-50, DOI: 10.1080/00220388.2013.849336 To link to this article: http://dx.doi.org/10.1080/00220388.2013.849336


The Journal of Development Studies, 2014 Vol. 50, No. 1, 35–50, http://dx.doi.org/10.1080/00220388.2013.849336

Quantifying the Qualitative: Eliciting Expert Input to Develop the Multidimensional Poverty Assessment Tool

ALASDAIR COHEN* & MICHAELA SAISANA**


*University of California, Berkeley, USA, **European Commission, Joint Research Centre, Ispra, Italy

ABSTRACT This article discusses the participatory creation of the Multidimensional Poverty Assessment Tool (MPAT), a survey-based thematic indicator developed in China and India. The core of the article focuses on the use of expert elicitation to inform the construction of MPAT's household and village surveys, the cardinalisation of survey responses, and the weighting scheme design. This is followed by a discussion of the potential pitfalls of expertise in development, the decision not to aggregate MPAT into an index, creating locally relevant poverty lines, and ideas for future research. The article closes with a summary of lessons learned.

1. Introduction: Multidimensional Poverty, Indicators and Participation

Poverty is relative and multidimensional. Given its complexity, multidimensional nature and its connections with history, context, politics and power, understanding and measuring poverty is by no means straightforward (Bossert, Chakravarty, & D'Ambrosio, 2009; Bourguignon & Chakravarty, 2003; Narayan, Pritchett, & Kapoor, 2009; Roe, 1998; Streeten & Burki, 1978). Many development researchers suggest two steps for poverty assessment: (1) determine who is poor; and (2) measure/quantify their poverty. Sen (1976) arguably catalysed this approach in his seminal paper 'Poverty: An Ordinal Approach to Measurement'. Decades later there is still debate on how best to accomplish these steps (Bossert et al., 2009) and, indeed, 'an index of poverty' (Sen, 1976, p. 219) is not always the desired end result. Income remains the most commonly used indicator of poverty. However, even if accurately measured (a difficult and costly feat in many areas), income does not provide a reliable proxy for poverty (Bourguignon & Chakravarty, 2003; Sen, 2000; Sullivan, 2006). Let us briefly consider, then, two well-known multidimensional, index-based approaches used for country-level comparisons. The United Nations Development Programme's (UNDP) Human Development Index (HDI)1 is perhaps the best-known welfare indicator. It combines three dimensions (health, living standards and education) built on four indicators (life expectancy at birth, gross national income per capita, mean years of schooling and expected years of schooling). Some have questioned the real policy impact of HDI country rankings (Srinivasan, 1994) and others the extent to which the HDI is actually based on Sen's 'relatively abstract formulations in terms of functionings and capabilities' (Ravallion, 2010, p. 9).
Debates and critiques notwithstanding (Chowdhury & Squire, 2006; Klugman, Rodríguez, & Choi, 2011), what is particularly relevant for our discussions here is the HDI's use of equal weights to aggregate the HDI's indicators, as opposed to expert weights. Chowdhury and Squire (2006) elicited expert suggestions to examine alternative weighting schemes for the HDI (as well as the Commitment to Development Index [CDI]) and found that the expert weights approximated the equal weighting scheme of the HDI (though less so for the CDI).
Correspondence Address: Alasdair Cohen, University of California, Berkeley, Department of Environmental Science, Policy & Management, 140 Mulford Hall, Berkeley, CA 94720, USA. Email: alasdaircohen@linacre.oxon.org © 2014 Taylor & Francis

The second index of note is the Multidimensional Poverty Index (MPI), which also uses equal weights. Recently, researchers at the University of Oxford (Alkire & Santos, 2010), in coordination with UNDP and others, expanded on the HDI's mandate and created the MPI, an application of the Alkire and Foster (2011a) methodology, which attempts to reconcile Sen's identification step in the context of multiple poverty dimensions (Alkire & Foster, 2011b). The MPI is based on 10 indicators organised under the three dimensions of health (nutrition and child mortality), education (years of schooling and children enrolled) and living standards (cooking fuel, toilet facility, water access, electricity, flooring material and assets).2 A poverty/deprivation score is measured for each indicator, and these deprivations are summed and then aggregated with equal weights across the population in question. The MPI was initially used to calculate scores for 104 countries, though the methodology may also be used for local-level poverty measurement (Alkire & Foster, 2011b; Ferreira, 2011). The MPI's weighting scheme was subjected to a robustness test, which found that while the poverty estimates changed, the country rankings were relatively constant (Alkire, Santos, Seth, & Yalonetzky, 2010). Measuring multiple dimensions is important, yet as Ravallion (2011, p. 16) argues, 'recognising that poverty is not just about lack of household command over market goods does not imply that one needs to collapse the multiple dimensions into one (unidimensional) index'. Indeed, we ought to question how much any one (over)simplified number can reveal about poverty in any given region. In contrast to the HDI and MPI, the Multidimensional Poverty Assessment Tool (MPAT) uses purpose-built surveys and an expert-based weighting scheme, and is not aggregated into an index.
To date, most of the expert elicitation literature deals with better understanding, measuring and forecasting uncertainty in economics and public health. However, this research can also inform the design of poverty assessment tools. But, before delving into the core content of this article, it must be noted that developing an indicator to quantify some of the core constructs surrounding rural poverty is just one option for trying to simplify and organise this complexity. There is an extensive literature on the use of participatory methods to support poverty assessment as well as project planning and implementation at the micro (field) level and at more macro (political-economic) levels (Chambers, 1995, 2008; Cleaver, 1999; Cooke & Kothari, 2001; Hart, 2001; Hickey & Mohan, 2004; Mohan & Stokke, 2000; Szal, 1979). In spite of the many difficulties inherent in securing balanced participation (for example, Kapiriri, Norheim, & Heggenhougen, 2003), participatory approaches often offer a nuanced means of assessing local poverty and perceptions of poverty. There is a growing body of research on eliciting input from the poor themselves and then using these qualitative data to make statistical inferences (for example, Barahona & Levy, 2003). Others have used such participatory approaches to generate poverty lines (for example, Hargreaves et al., 2007; Narayan et al., 2009; Noble, Wright, Magasela, & Ratcliffe, 2008; Wright, Noble, & Magasela, 2007) or set health priorities (Kapiriri et al., 2003). And others have also suggested using valuations derived from poor people's opinions to create poverty assessment tools (see Boltvinik [1998] for a discussion of related expert elicitation methods). However, due to the need for a standardised, quantifiable approach for use across regions and countries, MPAT's design team adopted an expert-based approach.
In light of the subjective nature of such an endeavour, poverty-related indicators should be based on sound theoretical foundations, with well-operationalised conceptions of what is being measured and how, accompanied by transparent aggregation rules and weights. In addition to explaining the elicitation methods used to develop MPAT, we hope this article will also help bridge the cited gap (Leal, Wordsworth, Legood, & Blair, 2007) between theoretical elicitation approaches and those actually used. The article is structured as follows. Section 2 describes MPAT's purpose and architecture. The next sections detail how expert input was elicited to create purpose-built household and village surveys (Section 3), assign cardinal scores and aggregation rules to survey item responses (Section 4), and finally arrive at a weighting scheme for aggregating the subcomponents (Section 5). In Section 6 we briefly address how expertise in development frames problems and solutions in certain ways, why MPAT does not distil all the information into a single number, and offer ideas on how to use MPAT to


create local-level poverty lines. We close by putting forward ideas for additional research and summarising lessons learned.

2. MPAT's Purpose, Architecture and Data Sources

MPAT was designed to support poverty alleviation project planning, design, management, monitoring and evaluation. Specifically, MPAT is a survey-based thematic indicator built on household- and village-level data. Indicators are calculated for each household and then averaged for each village and the entire area/project. The tool was developed in 2008–2009 by an international consortium led by the International Fund for Agricultural Development (IFAD), a specialised agency of the United Nations (UN), and a working paper version was released in early 2010 (Cohen, 2009b). MPAT provides an assessment along 10 dimensions of poverty (see Figure 1) that are fundamental to rural life, livelihoods and well-being. MPAT is meant to be comprehensive and multidimensional. For these reasons, MPAT was developed as a thematic indicator, whereby the values for the 10 components are not further aggregated into a single number (Cohen, 2010). Each of MPAT's 10 components is itself a composite indicator comprising three subcomponents (with the exception of Farm Assets, which can have three or four subcomponents depending on the household). The subcomponents are built on questions from the MPAT Household Survey and the MPAT Village Survey. The survey response items (mostly categorical, but also numerical) are converted to a 1–10 scale, with 10 being the best achievable score. Survey responses for a given household are then aggregated into subcomponent scores using weighted arithmetic averages (and scaled up to 10–100 for greater precision). Finally, subcomponent scores are aggregated into component scores using weighted geometric averages (for details, see Cohen [2009b]; IFAD, in press).
To better understand how this works, consider the expert-led cardinalisation (that is, converting ordinal and nominal categorical data to cardinal data)3 for the Domestic Water Supply Access subcomponent (#2.3). This subcomponent builds on two questions from the MPAT Household Survey. The first asks about the time needed to collect water for drinking and cooking. The answer, in minutes, is divided into intervals which are assigned values from 1–10. For example, the interval of 11–20 minutes receives a value of 8.5 and the interval of 21–30 minutes a value of 6.5. The second question has to do with ability to pay fees (if applicable) to access the household's primary water source, and thus the responses are categorical, with a response of 'Always' being assigned a value of 10. The expert weighting scheme for calculating the subcomponent dictates that the first question (access time) receives a weight of 60 per cent and the second question the remaining 40 per cent (using


Figure 1. MPAT's indicators: components and subcomponents (source: Cohen, 2009a).

a weighted arithmetic average), with the score then scaled to 10–100. Finally, all three Domestic Water Supply subcomponent scores are aggregated into a component score using a weighted geometric average.
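The two-stage aggregation just described can be sketched in Python. The 60/40 weights and the cardinal scores of 8.5 and 10 are taken from the worked example above; the scores and weights used for the other two Domestic Water Supply subcomponents are hypothetical placeholders, included only to illustrate the geometric step.

```python
# Sketch of MPAT's two-stage aggregation, using the Domestic Water
# Supply Access subcomponent (#2.3) figures given in the text.

def water_access_subcomponent(time_score, fee_score):
    """Weighted arithmetic average (60/40 per the expert scheme),
    then scaled from the 1-10 range up to 10-100."""
    avg = 0.60 * time_score + 0.40 * fee_score
    return avg * 10

def component_score(sub_scores, sub_weights):
    """Weighted geometric average of subcomponent scores."""
    total_w = sum(sub_weights)
    prod = 1.0
    for s, w in zip(sub_scores, sub_weights):
        prod *= s ** (w / total_w)
    return prod

# Household collects water in 11-20 minutes (expert score 8.5) and can
# 'Always' pay fees (score 10):
access = water_access_subcomponent(8.5, 10)  # 91.0

# Hypothetical scores and weights for the other two Domestic Water
# Supply subcomponents, to show the geometric aggregation step:
component = component_score([80, 70, access], [0.35, 0.30, 0.35])
```

Because the component score is a geometric average, a very low score on any one subcomponent pulls the component down more sharply than an arithmetic average would, which is consistent with MPAT's emphasis on identifying deprived dimensions.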

3. Using Expert Input to Design the MPAT Household and Village Surveys

In 2007, a draft MPAT framework was created and sector-specific experts were invited to form a 'sounding board' to support the tool's creation. Experts were primarily sought for a combination of normative and instrumental expertise in order to provide guidance on what should be measured for each component (which aspects of rural poverty are most important) and how best to measure it (which data or proxies). Consequently, individuals were recruited due to their experience with poverty assessment work in less-developed countries and/or their sector-specific expertise (for example, in agriculture and soil science, microcredit, water management and so on). Particular attention was paid to ensuring participation of both international and regional experts, since the goal was to incorporate a wide range of perspectives, backgrounds and experiences into the tool's development. Winkler and Clemen (2004, p. 174) note that the variability of expert input decreases as both the number of methods and experts increases, but that overall the gains are much greater from multiple experts than from multiple methods. Of the 63 experts invited, 39 (62%) agreed to join the sounding board, though the degree of participation varied over the course of the project.4 Most board members came from UN agencies, research institutes and universities. In hindsight, we should have used a more representative sampling procedure for recruitment, rather than a convenience sample. One option would have been to develop a sampling frame of potential experts, with a consistent number per sector, and then randomly select and invite the required number from that larger group (multiplied by 30–40% to address non-response). Even so, self-selection bias would likely still have been an issue (for example, Chowdhury and Squire [2006, p. 764] reported a 53% expert response rate from a similar solicitation effort).
However, in 2007 MPAT's future utility was still unknown, and thus it was easier to recruit willing experts by way of formal and informal professional networks. The first task of the sounding board was to help define MPAT's framework and suggest survey questions. Unavoidably, in requesting input from experts a number of heuristics will bias the information received; for example, availability and anchoring heuristics (Kadane & Wolfson, 1998; Schwarz & Sudman, 1996). Indeed, the method of eliciting expert opinion itself is important, since different methods will introduce varying degrees of bias. A psychometrics consultant was recruited to create a primer5 on bias and heuristics, which was emailed to board members along with the instructions for how to create MPAT's subcomponents and survey items (including examples of appropriate and inappropriate questions). Ensuring timely feedback from such a large group of professionals volunteering their time was also an issue, though almost everyone did provide feedback eventually. The board's suggestions for subcomponents and survey items were compiled and distributed at the project's start-up workshop (September 2008, Beijing) to provide the starting point for the survey design (contributors' names were omitted to avoid potentially biasing workshop participants). This workshop yielded MPAT's basic architecture, with multiple and overlapping survey items for each subcomponent. Iterative field testing in rural China and India was then used to fine-tune the surveys. Besides the experts, the enumerators and respondents who participated in the repeated testing of the surveys in China and India provided invaluable input for MPAT's development.
However, as touched on above, since MPAT was designed to provide a standardised assessment of key dimensions of rural poverty across less-developed countries (an admittedly challenging goal), the project team felt it would have been counterproductive to solicit input from the rural poor in China and India specifically, since they would, understandably, prioritise locally specific issues. After five iterations of testing and revision in different areas of rural China and India, the MPAT pilot was conducted in the spring of 2009 (n = 527 households).


4. Eliciting Expert Input for MPAT's Survey Item Cardinalisation

As Welsch (2002, p. 477) noted in a paper on expert valuation, 'well-being has no natural units'. Since the MPAT data from the village and household surveys were mostly categorical, and the notion of 'the more the better' was not always evident in the response scales, translating the survey responses to comparable scales across questions was accomplished through expert consultation. Furthermore, MPAT was not intended to rank households or villages as such, but rather to identify which dimensions of poverty would likely require support in different regions at different scales. Consequently, popular statistical methods of normalisation, or cardinalisation, could not be applied (as, for example, reviewed in Agresti [2007]; Boltvinik [1998]; OECD [2008]). Expert opinion was therefore used to identify both the preference relation and the intensity of preference between the possible answers to survey questions. As the literature on expert elicitation suggests, it is important to provide would-be contributors with examples of what is desired in clear and simple language, and to elicit quantities with which they are already familiar (Leal et al., 2007; O'Hagan, 1998). Though it was assumed that MPAT contributors were familiar with Likert scales, they were provided with instructions and examples to guide them through the process of assigning cardinal scores and suggesting weights for aggregating the survey items into subcomponents (see Cohen, 2009a, Annex VII). We believe that this may represent one of the few cases in which expert input was sought at this level of indicator development. A brief side note is warranted here, since there are potential problems related to the use of correlation analysis for data falling on non-ratio Likert scales (that is, without a meaningful zero point), an issue perhaps first flagged by Schmidt (1973).
The key problem, as Evans (1991, p. 13) put it, is that 'simple correlations are not an appropriate way to assess the relationship between a multiplicative composite and an outside variable'. Trauer and Mackinnon (2001) point out that this also applies to latent variable models, such as factor analysis (a statistical approach used in indicator development). However, to be clear, in this case we are not discussing correlations between a multiplicative index and outside variables. We mention this, then, as a point of caution for others considering similar research/methods. For MPAT, the experts' suggestions were first rendered anonymous and then distributed to participants at the 2009 New Delhi workshop. Participants were divided into groups of three to four people to debate the suggested cardinal scores and survey item aggregations, keeping in mind the many contexts and countries where MPAT could be used. Due to time constraints not all cardinalisations were discussed, and even for those reviewed a perfect consensus rarely emerged. Overall, the process was closer to art than science and, ultimately, it fell upon the lead author to make many of the final decisions. This was most often accomplished by taking the average value suggested by the sector-specific experts, with judgments made in areas where there was no obvious consensus and high variability.6 At this stage of MPAT's development, input from multiple experts was elicited over two stages (pre-workshop and during the workshop) using, essentially, a single method. While there is support for this in the literature (Winkler & Clemen, 2004), in hindsight using another method to collect the same input from contributors would have allowed us to analyse the consistency of some of their suggestions (perhaps then only using inputs with high inter-rater reliability). Another option would have been the Delphi Method (Dalkey & Helmer, 1963; Landeta, 2006).
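A Delphi-style opinion feedback loop of this kind can be sketched in a few lines. This is a minimal illustration, not MPAT's actual procedure: the convergence threshold, round limit and the toy update rule (each expert moving halfway toward the group mean) are all assumptions made for the example.

```python
# Minimal sketch of Delphi-style opinion feedback: compile suggestions,
# feed the group average back, and stop once opinions have converged.
from statistics import mean, stdev

def delphi_rounds(initial_opinions, revise, max_rounds=4, tol=0.5):
    """Iterate opinion feedback until the spread of suggestions falls
    below `tol` or `max_rounds` is reached. `revise(opinion, feedback)`
    models how an expert updates their view given the group mean."""
    opinions = list(initial_opinions)
    for _ in range(max_rounds):
        if stdev(opinions) < tol:
            break
        feedback = mean(opinions)
        opinions = [revise(o, feedback) for o in opinions]
    return mean(opinions), stdev(opinions)

# Toy update rule: each expert moves 50% of the way toward the group
# mean after seeing the feedback (a hypothetical behavioural model).
consensus, spread = delphi_rounds(
    [30, 40, 25, 45], lambda o, f: o + 0.5 * (f - o))
```

In practice the `revise` step is each expert's own judgement between rounds; the code only makes the compile/feedback/repeat structure concrete.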
Once board members had provided their suggestions, these could have been compiled, averaged and then shared with the group again (opinion feedback), accompanied by a second request for input, with the option of repeating this step a third or fourth time, or until some sort of stopping criterion was met. Those areas which still lacked consensus could have been shared with participants at the Delhi workshop so as to more efficiently use that limited time. A final issue relates to the often ordered nature of the survey responses and accompanying expert-derived cardinal scores (usually from worst case to best case). It would be reasonable to suggest that a linear scaling method from 1 to 10 would be a simpler and equally suitable method. With this possible critique in mind, during the statistical validation of the pilot data from China and India, we




Table 1. Example of expert- and linear-based cardinalisation of survey responses (MPAT v.6 Village Survey, question #54): 'Does each centre usually have enough medical supplies to provide adequate health care?'*

Answer      Answer code   Expert score   Linear score
Never            1              1             1
Rarely           2              2             3.25
Sometimes        3              4             5.5
Often            4              6.5           7.75
Always           5             10            10

Note: *Enumerator supervisors administer the MPAT Village Surveys and are trained to understand the nuances related to the questions; they then select the pre-coded answer which most closely matches the respondent's answer (details are in the MPAT User's Guide).


also calculated intensity of preference using a linear scaling with equal distances between responses, while assuming that the preference relationships of the responses as decided by the experts were appropriate. We then compared the results of this linear scaling to the expert-led cardinal scores. To give an example, in the Health and Healthcare Quality subcomponent (#3.3), a question from the village survey (administered to village health care staff) asks: 'Does each centre usually have enough medical supplies to provide adequate healthcare?', with the possible answers Never (1), Rarely (2), Sometimes (3), Often (4) and Always (5). In both the expert-based and linear scaling, the response codes Never (1) and Always (5) receive 1 and 10 points respectively (the minimum and maximum values). However, the intermediate responses receive different scores; for example, Sometimes (3) receives a score of 4 points under the expert-based valuation but 5.5 points under the linear scaling (see Table 1). These differences were analysed across all survey items, and the use of expert scaling usually resulted in significant differences for the majority of the indicators included in MPAT as compared to linear scaling (Saisana & Saltelli, 2010). Consequently, the Standardised MPAT is based on the expert-based scores, as well as on expert-derived weightings.
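The two cardinalisations in Table 1 can be reproduced directly: the expert scores are those reported for question #54, while the linear scores simply interpolate evenly between the shared endpoints of 1 and 10.

```python
# Expert vs linear cardinalisation for village-survey question #54.
# Expert scores are taken from Table 1; linear scores interpolate
# between the fixed endpoints 1 (Never) and 10 (Always).
ANSWERS = ["Never", "Rarely", "Sometimes", "Often", "Always"]
EXPERT = {1: 1.0, 2: 2.0, 3: 4.0, 4: 6.5, 5: 10.0}

def linear_score(code, n_codes=5, lo=1.0, hi=10.0):
    """Equal spacing between response codes on the 1-10 scale."""
    return lo + (code - 1) * (hi - lo) / (n_codes - 1)

for code, answer in enumerate(ANSWERS, start=1):
    print(answer, EXPERT[code], linear_score(code))
```

The divergence is largest in the middle of the scale (Sometimes: 4 vs 5.5), which is exactly where the experts judged the notional distances between responses to be unequal.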

5. Eliciting Expert Input for MPAT's Weighting Scheme

The other main task of the 2009 Delhi workshop was to try to reach a consensus on the weights for aggregating MPAT's subcomponents into their respective components. There is much discussion and debate on the use of expert weights versus equal weights in indicator construction (for example, Cooke, 1991; Saisana & Tarantola, 2002; Srinivasan, 1994; Trauer & Mackinnon, 2001), and the decision to use equal weights could arguably be considered an implicit expert weighting scheme. Ahead of the second MPAT workshop, feedback from board members and other development professionals on the weights for subcomponents was solicited using a template sent by email (Cohen, 2009a, Annex VII); experts were asked to decide which subcomponents ought to be given relative priority when aggregated to the component score by assigning each a proportion out of 100. Forty experts from 10 countries and 28 organisations provided weighting suggestions prior to the Delhi workshop. Unfortunately, we do not know the response rate, since in addition to inviting board members to participate, UNDP-India helped solicit input from one of their email-based networks. This also partially explains the disproportionate number of Indian nationals who submitted weighting suggestions, discussed below. In hindsight, it might have been preferable to use a budget allocation approach based on simple ratios for the weightings, rather than prompting respondents to arrive at a total of 100 per cent across the subcomponents (converting the ratios to percentages later). The Delphi method could have been used as well to help corral expert opinions more tightly through one or more iterations of opinion feedback.
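A ratio-based budget allocation of this kind reduces to a simple normalisation; the 3 : 2 : 1 priorities below are hypothetical, used only to show the conversion from ratios to percentage weights.

```python
# Sketch of the ratio-based budget allocation suggested above: experts
# state relative priorities in any convenient units, which are then
# normalised to percentages, rather than being forced to total 100
# up front. The example priorities are hypothetical.
def ratios_to_percentages(ratios):
    total = sum(ratios)
    return [100.0 * r / total for r in ratios]

# An expert rates three subcomponents at 3 : 2 : 1 relative priority:
weights = ratios_to_percentages([3, 2, 1])  # [50.0, 33.33..., 16.66...]
```

Freeing respondents from the mental arithmetic of summing to 100 tends to make the elicitation task easier, while the normalisation step guarantees a valid weighting scheme.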

Another option is the analytic hierarchy process (AHP), which also provides a means of identifying inconsistencies in expert suggestions on weights. AHP is based on the use of ordinal pairwise comparisons, with expert preference expressed on a scale of one to nine, and an eigenvector technique then used to calculate the relative weights based on expert inputs (Saaty, 1980). As such, expert contributors have less control in establishing the weights directly, since the weights are calculated as opposed to assigned (as in budget allocation methods). Figure 2 summarises the mean subcomponent weighting suggestions from the experts, displayed with one standard deviation. Given that all 10 components were composed of three subcomponents (before the optional fourth subcomponent was added to Farm Assets), equal weighting within a component would imply a 33 per cent weight for each subcomponent. In four components, Education (component 6), Farm Assets (component 7), Exposure and Resilience to Shocks (component 9), and Gender Equality7 (component 10), contributors on average assigned approximately equal weights to the subcomponents. For the remaining six components, expert opinion diverged, especially so with regard to Sanitation and Hygiene (component 4). The practically equal weights suggested for components 9 and 10 may well have been due to participant fatigue after having already thought through so many components, or they may reflect the perception that all three subcomponents were of equal importance. If we had the option of conducting this exercise again, instead of sending all experts the same template, we would have randomised the order in which components were presented for each contributor, in order to control for this possible fatigue bias.
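Returning to the AHP option mentioned above, the eigenvector step can be sketched as follows. The pairwise comparison matrix is hypothetical (and deliberately consistent, so the weights are exact), and a simple power iteration stands in for a full eigensolver.

```python
# Minimal AHP sketch: a reciprocal pairwise-comparison matrix (entries
# on Saaty's 1-9 scale) whose principal eigenvector gives the relative
# weights. Power iteration approximates that eigenvector.
def ahp_weights(matrix, iterations=100):
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply the matrix by the current weight vector...
        w = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        # ...and renormalise so the weights sum to 1.
        s = sum(w)
        w = [x / s for x in w]
    return w

# Hypothetical judgements: subcomponent A is twice as important as B
# and three times as important as C; B is 1.5x as important as C.
pairwise = [
    [1.0,     2.0,     3.0],
    [1 / 2.0, 1.0,     1.5],
    [1 / 3.0, 1 / 1.5, 1.0],
]
weights = ahp_weights(pairwise)  # ~[0.545, 0.273, 0.182]
```

With inconsistent judgements (for example A > B, B > C, but C > A), the principal eigenvalue exceeds the matrix dimension, which is precisely how AHP flags the inconsistencies mentioned in the text.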
To better assess the consistency of the experts' weighting suggestions, the weights were averaged based on the experts' country of origin, with a focus on India and China. Figure 3 displays the average weighting suggestions by country of origin (India: n = 21, China: n = 5, and other countries: n = 14 experts). Interestingly, the weights suggested by the experts from India are very similar to those from the rest of the world for almost all subcomponents. For example, compare the average weights for subcomponent #1.1 of approximately 40 and 44 per cent with the suggested 55 per cent from the Chinese experts. All told, in 10 of the 30 subcomponents, the average weights from the experts in China differed significantly from those of the other contributors (in particular subcomponents #1.1, #1.3, #4.1, #8.1 and #8.2). These results may indicate that too few experts from China participated, or that there might be a sample bias in the selection of experts from China, or this might reflect a divergence of opinions based on contextual factors and development priorities within China and India. Ideally, we would have solicited more input from additional China-based experts to get a clearer idea of the source of these differences. In hindsight, we should have also collected additional information from contributors, such as their age and educational background, to conduct additional analyses (such as in Chowdhury & Squire [2006] and Cooke [1991]).


Figure 2. Summary of suggested subcomponent weightings discussed at the Delhi workshop.



Figure 3. Average expert weights based on contributors' country of origin.

An avenue for future research would be to conduct a simple ranking exercise of MPAT's 10 components by groups of development experts and rural poor in the same country/countries, and then compare the rankings to better understand the potential utility of conducting participatory weighting design nationally, with and without rural communities, to develop country-specific MPAT weights. In the end, in most cases MPAT uses the average expert weights shown in Figure 2. The robustness of MPAT to different weights was tested using pilot data from villages in China and India (Saisana & Saltelli, 2010). Here, we present the results from a random sample of four villages in Kenya (a total of 118 households, from MPAT data collected in 2011). The sample of 35 expert weighting schemes used includes those from India (n = 21) and other countries (n = 14); we excluded the five experts from China due to the potential bias issues discussed above. Hence, for each household 35 scores were calculated for each of the 10 components, each corresponding to a different expert-driven set of weights. The differences between any of these scores and the reference MPAT score were calculated for each household and component. Figure 4 summarises the results of the robustness analysis based on those differences. The black line is the median across all households and sets of weights, and the boxes include 50 per cent of the cases (from the 25th to the 75th percentile). The 90 per cent confidence intervals are displayed by the vertical lines. Thus, a median close to zero, with a small box and a short vertical line, indicates an MPAT component that is robust to changes in the subcomponents' weights (within the space of the expert sample). The components most sensitive to the choice of weights are Sanitation and Hygiene, Housing, Clothing and Energy, Non-farm Assets, and Exposure and Resilience to Shocks. The most robust components are Education and Gender and Social Equality.
Yet, even for the most sensitive components, the impact on the poverty estimates is moderate. In fact, for all MPAT components, the median is close to zero and the boxes span only 3 points. In addition, the Pearson correlation coefficients between the MPAT scores and those obtained using single expert-based weights range from 0.95 to 1.00 for all 10 components. This demonstrates that none of the 10 components measured by MPAT is driven by the weights when these are changed within reasonable limits.
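The shape of this weight-sensitivity analysis can be sketched as follows. Everything here is a stand-in: the subcomponent scores are randomly generated, the number of subcomponents is arbitrary, and the 35 'expert' weight vectors are Dirichlet samples rather than the elicited MPAT weights; the sketch only illustrates the mechanics of comparing expert-specific scores against the average-weight reference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 118 households x 4 subcomponents for one MPAT
# component (0-100 scale), plus 35 expert weight vectors that each sum to 1.
subscores = rng.uniform(20, 95, size=(118, 4))
expert_weights = rng.dirichlet(np.ones(4), size=35)   # 35 expert schemes
reference_weights = expert_weights.mean(axis=0)       # average-expert weights

ref_scores = subscores @ reference_weights            # reference component scores
alt_scores = subscores @ expert_weights.T             # 118 x 35 alternative scores

# Differences from the reference score, pooled over households and experts,
# summarised the way the box plots in Figure 4 are (median, IQR, 90% interval).
diffs = (alt_scores - ref_scores[:, None]).ravel()
median, q25, q75 = np.percentile(diffs, [50, 25, 75])
lo90, hi90 = np.percentile(diffs, [5, 95])
print(f"median {median:.2f}, IQR [{q25:.2f}, {q75:.2f}], 90% interval [{lo90:.2f}, {hi90:.2f}]")

# Correlation between reference scores and each expert-specific set of scores.
corrs = [np.corrcoef(ref_scores, alt_scores[:, j])[0, 1] for j in range(35)]
print(f"Pearson r range across experts: {min(corrs):.3f} to {max(corrs):.3f}")
```

A small median and interquartile range, together with uniformly high correlations, would indicate (as in the Kenya sample) that the component scores are not driven by the particular expert's weights.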

Eliciting Expert Input to Develop the Multidimensional Poverty Assessment Tool 43


Figure 4. Robustness analysis of 2011 MPAT results from 118 households in rural Kenya (data courtesy of Nuru International).

6. Discussion

Throughout the article we have endeavoured to candidly expose possible shortcomings in our methods in the hopes that others may benefit from understanding the challenges and problems we faced (those which were immediately evident and those discovered only with the benefit of hindsight), as well as the successes we were fortunate enough to achieve. While we used a number of sensitivity tests to assess the cardinalisation and weights in order to inform the final design of the tool (see Saisana & Saltelli, 2010), other techniques might have been equally informative. Given that there is no absolute rubric against which to calibrate, and no standardised set of methods for eliciting expert input to develop such a tool, MPAT is, necessarily, an imperfect tool. Beyond these issues, given this article's focus on the use of experts to develop new tools for poverty assessment, it is appropriate to take a step back and reflect on the larger implications of expertise in development, and how expertise impacts the creation and character of tools such as MPAT.

6.1. The Malleability of Expertise and the Value of Transparency

There are at least three interconnected levels that merit attention with regard to expertise in an endeavour such as this: (1) the macro-level (and often unexamined) expertise which frames development problems in certain ways, making certain solutions more attractive; (2) the potential problems and shortcomings inherent in expert elicitation (the primary subject of this article); and (3) the expertise of those who design and construct poverty-related indicators. In contemporary development work, solutions often tend toward the technical and readily quantifiable and, consequently, many development interventions seek to ameliorate symptoms (hunger, disease, contaminated water) instead of their underlying causes, which are often political in nature (Ferguson, 1994; Li, 2007; Mitchell, 2002; Scott, 1998). The MPAT indicators, and especially the expert-derived cardinal scores and weights, are themselves political statements about the means and goals of rural development. The sociopolitical predilections of MPAT's creators and contributors (their philosophies with regard to poverty reduction, governance and economics) have shaped the entire tool. Consequently, those using MPAT ought to realise that in choosing this tool they are framing rural development and the goals of rural poverty reduction in a certain way, which in turn will likely

favour certain strategies for poverty reduction over others. This may all go without saying, but it is too often not said. It follows, then, that a different project management team and different group of experts would have created a different tool. Given this, and that there is no perfect method for developing such indicators, the key to ensuring that others have an understanding of how any given indicator is both necessarily flawed and potentially useful is, in a word, transparency (Cohen, 2010, p. 893). As part of MPAT's development as an open-source tool, an Excel spreadsheet which automatically calculates MPAT's indicators is provided on the IFAD website (www.ifad.org/mpat). The default values and weights are those arrived at through the process of expert elicitation described herein: what we refer to as the 'Standardised MPAT'. Yet the cardinal scores and subcomponent weights can be changed by users, allowing them to create a 'customised MPAT' which may better reflect local conditions or priorities. A caveat here is that expertise can be manipulated to deliver desired results, and it would not be difficult to adjust MPAT's cardinal scores and weights so that the final MPAT scores appeared high/positive for a given area, whereas the Standardised MPAT would have yielded lower scores. Ideally then, customised MPAT results should always be presented alongside, or overlaid with, standardised results. This warning is mentioned to further highlight the need for users to look behind the curtain at the inner workings which produce the MPAT results for a given household, village, district or project.

6.2. Measuring Rural Poverty with One Number?

Ravallion (2010, 2011) has questioned the value of aggregating multiple poverty-related dimensions into an index, suggesting that the sum may not be greater than the parts, and that 'dashboard' approaches may therefore be preferable.
Others (Ferreira, 2011) suggest that dashboard approaches miss out on the information provided by joint distributions of poverty/deprivation. There is something to be said for both points of view, and of course the context will undoubtedly dictate which approach is more suitable. With regard to MPAT, aggregating the 10 components into a single index, though perhaps tempting for policy consumption, was deemed conceptually unsuitable from the outset of the project (Cohen, 2010). It was nonetheless discussed extensively during the development of MPAT, from both theoretical and statistical points of view. Correlation analysis and scatterplots between the 10 components suggest that they account for different aspects of rural poverty, and that there is little overlap of information between them. This is evident in the non-significant correlations between the 10 components (using the 2009 pilot data from China and India); see Table 2. Though the 10 components are conceptually interrelated, since they measure key aspects of rural poverty, statistically this is exactly the desired outcome: at least moderate correlation within components, and weak correlation between them. If one attempted to merge these 10 components into a single number, its resulting utility for project support would be of dubious value. The community of composite indicator developers may find this case quite interesting, as it suggests that a final composite indicator should not be seen as a goal in every instance. Indeed, as our work with MPAT illustrates, it is sometimes preferable to stop the aggregation procedure at the component level.

6.3. Using MPAT to Create Local-Level Poverty Lines

Since the release of the working paper version of MPAT in early 2010, a number of agencies have used MPAT and some have asked IFAD how to use the results to set poverty lines.
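The between-component correlation screening described in Section 6.2 can be sketched as below. The component names and household scores are invented for illustration (the real check used all 10 components and the 2009 pilot data, n = 527); the threshold value is likewise an assumption, not the one used for MPAT.

```python
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length score sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical component scores for a handful of households.
components = {
    "water":      [61, 74, 55, 82, 68, 49, 77, 63],
    "education":  [70, 58, 66, 73, 61, 52, 80, 59],
    "sanitation": [45, 69, 51, 77, 60, 42, 71, 58],
}

# If most pairwise correlations are weak, the components carry largely
# distinct information, which argues for keeping them separate (a dashboard)
# rather than collapsing them into a single index.
THRESHOLD = 0.9  # illustrative redundancy cut-off
for (name_a, a), (name_b, b) in combinations(components.items(), 2):
    r = pearson(a, b)
    verdict = "overlapping?" if abs(r) > THRESHOLD else "distinct information"
    print(f"{name_a} vs {name_b}: r = {r:+.2f} ({verdict})")
```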
On the MPAT Excel spreadsheet, results are presented using colour codes such that scores under 30 (the lowest values) are highlighted in red, those between 30 and 59 in orange, those from 60 to 79 in yellow, and those from 80 to 100 in green. These colour codes allow for the quick identification of which sectors, regions, villages or even households may require support, but they are not poverty lines. Given the context-specific nature of poverty, it would be imprudent to create an absolute numerical line based on MPAT values for all regions/countries, such that those below are poor, and those above



Table 2. Pearson's correlation coefficients between the 10 MPAT components

[The 10 × 10 lower-triangular correlation matrix could not be recovered from this copy's layout; in the original, the coefficients range in absolute value from 0.01 to 0.42, and only four pairs are starred as significant.]

Components: 1. Food and nutrition security; 2. Domestic water supply; 3. Health and health care; 4. Sanitation and hygiene; 5. Housing and energy; 6. Education; 7. Farm assets; 8. Non-farm assets; 9. Exposure and resilience to shocks; 10. Gender equality.

Notes: *Significant coefficients are greater than 0.27 (p < 0.05, n = 527). Spearman rank correlations are very similar to the Pearson coefficients reported here.

not. That said, generally speaking one could say that a community scoring in the green range (80-100) is likely in a relatively good position with regard to people's well-being and access to social services. In addition, since MPAT is designed to support monitoring and evaluation, MPAT scores can provide indications of where the situation has improved, or worsened (assuming the Standardised MPAT is used at two or more time intervals). This can be seen, for example, by comparing MPAT results from the same region at two time points, as in Figure 5. The MPAT scores and colour codes, then, are meant to serve as general guidelines, not concrete signposts. That said, agencies may wish to use MPAT scores to create local-level poverty lines. One approach would be to form a consultative group of local and national experts to set numerical goalposts for their interventions. Thus, one project might decide that scores of 70 or more are sufficiently high with regard to their objectives for Food and Nutrition Security as measured by MPAT, but scores of 80 or higher would be needed with regard to Domestic Water Supply. This could also be facilitated by using some of the participatory approaches mentioned in this article's introduction for demarcating local-level poverty lines with MPAT indicator scores, and possibly calibrating them against poverty lines developed with approaches such as the 'Ladder of Life' (Narayan et al., 2009).
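The spreadsheet's colour bands, and the kind of project-specific goalposts just described, amount to simple threshold classification. A minimal sketch, using the band thresholds stated in the text; the two components, their scores and the goalpost values are the hypothetical example from the paragraph above, not elicited values:

```python
def mpat_colour(score: float) -> str:
    """Map a 0-100 MPAT score to the spreadsheet's colour band.

    Bands per the text: <30 red, 30-59 orange, 60-79 yellow, 80-100 green.
    These flag where support may be needed; they are not poverty lines.
    """
    if not 0 <= score <= 100:
        raise ValueError("MPAT scores fall on a 0-100 scale")
    if score < 30:
        return "red"
    if score < 60:
        return "orange"
    if score < 80:
        return "yellow"
    return "green"

# Hypothetical project-specific goalposts, as in the text's example:
# 70 for food/nutrition, 80 for domestic water.
goalposts = {"Food and Nutrition Security": 70, "Domestic Water Supply": 80}
scores = {"Food and Nutrition Security": 74, "Domestic Water Supply": 76}

for component, score in scores.items():
    status = "meets" if score >= goalposts[component] else "below"
    print(f"{component}: {score} ({mpat_colour(score)}; {status} goalpost of {goalposts[component]})")
```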


6.4. Future Research and Unanswered Questions

This article discussed methods for recruiting experts and soliciting their input, but how ought one best determine the minimum expert sample size? One option would be to determine the minimum number of experts for which the weights' standard deviation would not significantly change. As for creating a representative sample, above we suggested one approach (creating a sampling frame of potential experts), but what guidelines should be established for participants to ensure sufficient expertise? That is, how ought one determine the inclusion criteria for experts in a given domain? Surely years of experience alone would be insufficient, so how best to balance quantity (experience) with quality (depth of understanding)?
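One way to operationalise the stopping rule suggested above is to track the running standard deviation of the elicited weights as experts are added, and stop recruiting once an additional expert barely changes it. The weight values and the tolerance below are invented for illustration; in practice one would run this check per subcomponent across all elicited weights.

```python
from statistics import stdev

# Hypothetical weights (per cent) suggested by successive experts
# for one subcomponent.
weights = [30, 25, 35, 28, 32, 30, 27, 33, 29, 31, 30, 28]

# Illustrative stopping rule: stop once adding an expert changes the
# running standard deviation by less than a chosen tolerance.
TOL = 0.15
for n in range(3, len(weights) + 1):
    change = abs(stdev(weights[:n]) - stdev(weights[:n - 1]))
    print(f"n={n}: sd={stdev(weights[:n]):.2f} (change {change:.2f})")
    if change < TOL:
        print(f"SD stabilised at n={n} experts (tolerance {TOL})")
        break
```

Whether such a purely statistical criterion suffices is exactly the open question raised in the text, since it says nothing about the representativeness or depth of the expertise sampled.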

Figure 5. Comparing baseline and mid-point MPAT results in rural Kenya, n = 480 households; insufficient data for 'Education' component comparison (data courtesy of Nuru International).
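A comparison like the one shown in Figure 5 reduces to per-component differences between two survey rounds. The sketch below uses invented component scores (not the Nuru data) and assumes, as the text requires, that the Standardised MPAT was used at both time points so the scores are comparable:

```python
# Hypothetical baseline and mid-point component scores for one project area.
baseline = {"Food and nutrition security": 58, "Domestic water supply": 64,
            "Sanitation and hygiene": 47, "Farm assets": 71}
midpoint = {"Food and nutrition security": 66, "Domestic water supply": 62,
            "Sanitation and hygiene": 55, "Farm assets": 73}

# Per-component change between the two rounds, flagged as improved/worsened.
for component, before in baseline.items():
    after = midpoint[component]
    trend = "improved" if after > before else ("worsened" if after < before else "unchanged")
    print(f"{component}: {before} -> {after} ({trend}, {after - before:+d})")
```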

Similarly, in addition to the suggestion of comparing local (poor) people's suggested rankings with those of regional development experts, it would be worthwhile to compare different sources of expertise for each sector of interest. For example, with regard to water supply, do the weights suggested by hydrologists differ from those of epidemiologists, or anthropologists or economists? If so, how ought one determine which sector-specific expertise is most appropriate for a given application? Should a variety of experts be invited to suggest weightings and subsequent sensitivity analysis be used to determine which subset of experts provided the most robust weightings? Or would it be more appropriate to make these decisions before expert input is solicited? With regard to MPAT specifically, at what point should the weightings be re-evaluated? That is, at what point in the future will the weights no longer be applicable for most regions? Already we have seen that the Millennium Development Goals have spurred countries to focus their efforts on specific sectors; as the forthcoming Sustainable Development Goals continue to provide such development guidance, at what point will it be necessary to conduct another exercise to adjust MPAT's weights?
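The question of whether sector-specific expertise matters could be probed by comparing the average weight vectors elicited from different disciplines. Everything below (the two disciplines, the three subcomponents and all values) is hypothetical; the point is only the shape of the comparison:

```python
from statistics import mean

# Hypothetical weight suggestions (per cent) for three water subcomponents,
# grouped by the experts' discipline; each inner tuple is one expert.
by_discipline = {
    "hydrologists":    [(45, 35, 20), (50, 30, 20), (40, 40, 20)],
    "epidemiologists": [(30, 30, 40), (25, 35, 40), (35, 25, 40)],
}

# Average weight vector per discipline (column-wise mean over experts).
group_means = {d: tuple(mean(col) for col in zip(*w))
               for d, w in by_discipline.items()}

# Mean absolute difference between the two groups' average vectors:
# a large gap would suggest the choice of discipline materially matters.
a, b = group_means["hydrologists"], group_means["epidemiologists"]
gap = mean(abs(x - y) for x, y in zip(a, b))
print(group_means, f"mean absolute difference: {gap:.1f} points")
```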


7. Conclusions

In this article we discussed how expert elicitation can be used for bottom-up, survey-based, multidimensional poverty indicator development (see Table 3 for a summary of key points). This type of
Table 3. Summary of key issues, lessons learned and suggestions (concept/area; lessons learned; related literature)

Multiple experts, multiple methods. Overall, it is better to use multiple elicitation methods and multiple experts, but it appears multiple experts provide more value than multiple methods. (Winkler & Clemen, 2004)

Don't expect full participation. 62 per cent of invited Board members agreed to participate (53 per cent in Chowdhury and Squire's 2006 study), meaning self-selection bias certainly played a role. Compensating experts might increase participation. (Chowdhury & Squire, 2006)

Diverse but balanced contributors. We recommend choosing experts with a balance of diverse backgrounds (gender, age, profession, education, country of origin, and so forth) and collecting information on these characteristics to later analyse as possible sources of bias (as we did with country of origin). (Cooke, 1991)

Psychometrics and surveys. Responsibly designing new surveys is complicated; as such we recommend recruiting a psychometrics expert to assist with survey design so as to limit the bias introduced by the surveys themselves. (Schwarz & Sudman, 1996)

Randomise the order of tasks. When distributing forms/templates to contributors for weights or cardinalisation, randomise the order of the tasks so as to control for participant fatigue (or other bias based on ordering).

Cardinalisation and scales. Use multiple experts and, ideally, multiple methods. The Delphi Method would likely be helpful in quickly condensing cardinalisation inputs. Similarly, consider eventual aggregation methods when determining the cardinal score scales (for example, 0-5, 1-5, 1-10) and compare expert-based results with linear scaling when/as appropriate. (Dalkey & Helmer, 1963; Landeta, 2006)

Setting weights. When soliciting expert input for indicator weights it is probably preferable to use a relative, ratio-based, budget allocation approach, in one or more steps (Delphi Method), rather than a 100 per cent sum approach; it may also be useful to use an analytic hierarchy process. (Saaty, 1980; OECD, 2008)

Index vs. dashboard. It is not always appropriate or desirable to strive for a multidimensional poverty index; in some cases, a thematic indicator (a 'dashboard' approach) may be more appropriate. (Ravallion, 2011)
approach can yield useful results, but the process itself is by no means straightforward. Participatory indicator design is a necessarily imperfect, and sometimes messy, process from which consensus eventually emerges. Indeed, soliciting input from a wide range of people at different organisations with different incentives for contributing to a project is challenging. What is more, there is no pre-existing rubric against which to calibrate the new tool. Consequently, one has to rely on statistical analysis and in-field assessments to try to ensure the tool is measuring what is intended. An additional benefit of the expert elicitation approaches we used is that those who contributed to MPAT's creation gained a sense of ownership, since their suggestions were incorporated into the tool in highly visible ways. This in turn helped perpetuate their willingness to support the project as it unfolded, and bolstered the confidence of potential users and collaborating government agencies with regard to the tool's utility. At this writing, IFAD is working to incorporate feedback on MPAT from the last few years and from recent case studies (in Bangladesh and Mozambique) to create a final version of the tool, slated for release in 2014. With regard to future research, we are considering using recent and forthcoming MPAT data sets to explore the utility of using MPAT scores to analyse joint distributions of deprivations by household. Given that this is still a relatively nascent field, hopefully the lessons learned from our efforts thus far will better elucidate the path forward for future research.


Acknowledgements

This article benefited from extensive referee comments for which the authors are grateful. Many people offered their time and energy to support MPAT's 2008-2009 development, especially Thomas Rath, as well as Rudolph Cleveringa, Mattia Prayer-Galletti, Shaheel Rafique, Roxanna Samii, Sun Yinhong and other colleagues at IFAD and other agencies (acknowledged in detail in the MPAT Book at http://www.ifad.org/mpat/resources/book.pdf). The corresponding author also extends thanks to Jeff Romm for his support of this work and sage advice. This work was originally funded by IFAD, DFID, Fulbright and government agencies in China and India.

Notes
1. For more information, see: http://hdr.undp.org/en/statistics/indices/.
2. For more information, see: http://hdr.undp.org/en/statistics/mpi/.
3. Both authors have also referred to this cardinalisation process as 'normalisation' in past publications, since, while most of the MPAT survey items fall on categorical scales, they are not always ordinal/linear.
4. Board members were not financially compensated for participation (except for travel and lodging expenses for those who attended workshops). The interested reader may consult the acknowledgements section (pp. 12-17) of the MPAT Book (http://www.ifad.org/mpat/resources/book.pdf) for a list of contributors.
5. Available in Annex I of the MPAT Book (http://www.ifad.org/mpat/resources/book.pdf).
6. In the interests of transparency and reproducibility, all of the survey item cardinalisations and aggregation rules are presented in the MPAT User's Guide (available on the http://www.ifad.org/mpat website); in light of the debatable nature of the cardinal scores and weightings, MPAT users are also encouraged to examine and change them as appropriate (to create a customised MPAT).
7. For MPAT v.6, this was called 'Gender Equality'; in subsequent iterations it was extended and renamed 'Gender and Social Equality'.

References
Agresti, A. (2007). Categorical data analysis (2nd ed.). Hoboken: Wiley.
Alkire, S., & Foster, J. (2011a). Counting and multidimensional poverty measurement. Journal of Public Economics, 95, 476-487.
Alkire, S., & Foster, J. (2011b). Understandings and misunderstandings of multidimensional poverty measurement. Journal of Economic Inequality, 9, 289-314.
Alkire, S., & Santos, M. E. (2010). Acute multidimensional poverty: A new index for developing countries (Working Paper No. 38). Oxford: Oxford Poverty and Human Development Initiative.



Alkire, S., Santos, M. E., Seth, S., & Yalonetzky, G. (2010). Is the Multidimensional Poverty Index robust to different weights? Oxford: Oxford Poverty and Human Development Initiative, University of Oxford. Retrieved from http://www.ophi.org.uk
Barahona, C., & Levy, S. (2003). How to generate statistics and influence policy using participatory methods in research: Reflections on work in Malawi, 1999-2002. Brighton: Institute of Development Studies.
Boltvinik, J. (1998). Poverty measurement methods: An overview. New York: United Nations Development Programme. Available at: ftp://www.econ.bgu.ac.il/Courses/Labor_Marcet_Policy_Selected_Issues/lectures/articles/boltvinik_measurement%201998.pdf
Bossert, W., Chakravarty, S. R., & D'Ambrosio, C. (2009). Multidimensional poverty and material deprivation. Montreal: University of Montreal.
Bourguignon, F., & Chakravarty, S. R. (2003). The measurement of multidimensional poverty. Journal of Economic Inequality, 1, 25-49.
Chambers, R. (1995). Poverty and livelihoods: Whose reality counts? Environment and Urbanization, 7, 173-204.
Chambers, R. (2008). Revolutions in development inquiry. London: Earthscan.
Chowdhury, S., & Squire, L. (2006). Setting weights for aggregate indices: An application to the commitment to development index and human development index. Journal of Development Studies, 42, 761-771.
Cleaver, F. (1999). Paradoxes of participation: Questioning participatory approaches to development. Journal of International Development, 11, 597-612.
Cohen, A. (2009a). The Multidimensional Poverty Assessment Tool: Design, development and application of a new framework for measuring rural poverty. Rome: IFAD.
Cohen, A. (2009b). The Multidimensional Poverty Assessment Tool: User's guide (working paper). Rome: IFAD. http://www.ifad.org/mpat/resources/user.pdf
Cohen, A. (2010). The Multidimensional Poverty Assessment Tool: A new framework for measuring rural poverty. Development in Practice, 20, 887-897.
Cooke, B., & Kothari, U. (2001). Participation: The new tyranny? New York and London: Zed Books.
Cooke, R. (1991). Experts in uncertainty: Opinion and subjective probability in science. New York: Oxford University Press.
Dalkey, N., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9, 458-467.
Evans, M. (1991). The problem of analyzing multiplicative composites. The American Psychologist, 46, 6.
Ferguson, J. (1994). The anti-politics machine: 'Development', depoliticization, and bureaucratic power in Lesotho. Minneapolis, MN: University of Minnesota Press.
Ferreira, F. (2011). Poverty is multidimensional. But what are we going to do about it? Journal of Economic Inequality, 9, 493-495.
Hargreaves, J. R., Morison, L. A., Gear, J. S. S., Makhubele, M. B., Porter, J. D. H., Busza, J., Watts, C., Kim, J. C., & Pronyk, P. M. (2007). 'Hearing the voices of the poor': Assigning poverty lines on the basis of local perceptions of poverty. A quantitative analysis of qualitative data from participatory wealth ranking in rural South Africa. World Development, 35, 212-229.
Hart, G. (2001). Development critiques in the 1990s: Culs de sac and promising paths. Progress in Human Geography, 25, 649-658.
Hickey, S., & Mohan, G. (2004). Participation: From tyranny to transformation? New York: Zed Books.
IFAD. (in press). The Multidimensional Poverty Assessment Tool: User's guide. Rome: The International Fund for Agricultural Development.
Kadane, J., & Wolfson, L. J. (1998). Experiences in elicitation. Journal of the Royal Statistical Society: Series D (The Statistician), 47, 3-19.
Kapiriri, L., Norheim, O. F., & Heggenhougen, K. (2003). Public participation in health planning and priority setting at the district level in Uganda. Health Policy and Planning, 18, 205-213.
Klugman, J., Rodríguez, F., & Choi, H.-J. (2011). The HDI 2010: New controversies, old critiques. Journal of Economic Inequality, 9, 249-288.
Landeta, J. (2006). Current validity of the Delphi method in social sciences. Technological Forecasting and Social Change, 73, 467-482.
Leal, J., Wordsworth, S., Legood, R., & Blair, E. (2007). Eliciting expert opinion for economic models: An applied example. Value in Health, 10, 195-203.
Li, T. (2007). The will to improve: Governmentality, development, and the practice of politics. Durham, NC and London: Duke University Press.
Mitchell, T. (2002). Rule of experts: Egypt, techno-politics, modernity. Berkeley, CA and Los Angeles, CA: University of California Press.
Mohan, G., & Stokke, K. (2000). Participatory development and empowerment: The dangers of localism. Third World Quarterly, 21, 247-268.
Narayan, D., Pritchett, L., & Kapoor, S. (2009). Moving out of poverty: Success from the bottom up. Washington, DC: World Bank.
Noble, M. W. J., Wright, G. C., Magasela, W. K., & Ratcliffe, A. (2008). Developing a democratic definition of poverty in South Africa. Journal of Poverty, 11, 117-141.
O'Hagan, A. (1998). Eliciting expert beliefs in substantial practical applications. Journal of the Royal Statistical Society: Series D (The Statistician), 47, 21-35.

Downloaded by [Indian Institute of Technology Guwahati] at 22:54 26 March 2014



OECD. (2008). Handbook on constructing composite indicators: Methodology and user guide. Paris: OECD and European Commission Joint Research Centre.
Ravallion, M. (2010). Mashup indices of development (Policy Research Working Paper 5432). Washington, DC: The World Bank.
Ravallion, M. (2011). On multidimensional indices of poverty (Policy Research Working Paper 5580). Washington, DC: The World Bank.
Roe, E. (1998). Taking complexity seriously: Policy analysis, triangulation, and sustainable development. Boston, MA: Kluwer Academic.
Saaty, T. (1980). The analytic hierarchy process: Planning, priority setting, resource allocation. New York: McGraw-Hill.
Saisana, M., & Saltelli, A. (2010). The Multidimensional Poverty Assessment Tool (MPAT): Robustness issues and critical assessment (EUR Report 24310 EN). Luxembourg: European Commission, JRC-IPSC.
Saisana, M., & Tarantola, S. (2002). State-of-the-art report on current methodologies and practices for composite indicator development (EUR Report 20408 EN). Luxembourg: European Commission, Joint Research Centre, Institute for the Protection and the Security of the Citizen.
Schmidt, F. L. (1973). Implications of a measurement problem for expectancy theory research. Organizational Behavior and Human Performance, 10, 243-251.
Schwarz, N., & Sudman, S. (1996). Answering questions: Methodology for determining cognitive and communicative processes in survey research. San Francisco, CA: Jossey-Bass.
Scott, J. C. (1998). Seeing like a state: How certain schemes to improve the human condition have failed. New Haven, CT and London: Yale University Press.
Sen, A. (1976). Poverty: An ordinal approach to measurement. Econometrica, 44, 219-231.
Sen, A. (2000). Development as freedom. New York: Anchor.
Srinivasan, T. N. (1994). Human development: A new paradigm or reinvention of the wheel? The American Economic Review, 84, 238-243.
Streeten, P., & Burki, S. J. (1978). Basic needs: Some issues. World Development, 6, 411-421.
Sullivan, C. (2006). Do investments and policy interventions reach the poorest of the poor? In P. Rogers (Ed.), Water crisis: Myth or reality? (Vol. 1, pp. 221-231). London: Taylor and Francis.
Szal, R. (1979). Popular participation, employment and the fulfilment of basic needs. International Labour Review, 118, 27-38.
Trauer, T., & Mackinnon, A. (2001). Why are we weighting? The role of importance ratings in quality of life measurement. Quality of Life Research, 10, 579-585.
Welsch, H. (2002). Preferences over prosperity and pollution: Environmental valuation based on happiness surveys. Kyklos, 55, 473-494.
Winkler, R. L., & Clemen, R. T. (2004). Multiple experts vs. multiple methods: Combining correlation assessments. Decision Analysis, 1, 167-176.
Wright, G., Noble, M., & Magasela, W. (2007). Towards a democratic definition of poverty: Socially perceived necessities in South Africa. Cape Town: HSRC Press.

