
Paper presented in the Prime-Latin America Conference at Mexico City, September 24-26 2008

Evaluation of Mexican Research Groups in Physics and Mathematics using an Endogenous Approach1
Leonardo Reyes-Gonzalez2, Francisco Veloso3

Abstract

Throughout the last three decades, tightening budgets, increasing competition and a better understanding of the outputs of science and technology have stimulated the development of new types of research evaluations. Although useful, existing methods typically use ad-hoc definitions of the unit of analysis, lacking approaches capable of incorporating the self-organizing mechanisms of the research endeavor into the evaluation effort. This paper addresses this limitation by developing an evaluation method that takes into account these endogenous characteristics of the research effort. In particular, the collaboration patterns of researchers are used to identify the frontiers of the focal research units, and the backward citation patterns are employed to establish relevant benchmark units for each focal unit. This method is then tested on the fields of Physics and Mathematics in five leading Mexican institutions for the period 1997-1999. Finally, we present a detailed peer benchmark.

1 This work was supported by the Mexican Council for Science and Technology (CONACYT), CMU's Berkman Faculty Development Fund and the National Science Foundation (NSF).
2 Department of Engineering & Public Policy, Program in Strategy, Entrepreneurship and Technological Change, Carnegie Mellon University, lreyesgo@andrew.cmu.edu.
3 fveloso@cmu.edu. The usual disclaimers apply.


1. Introduction and Motivation

Throughout the last decades, tightening budgets and increasing competition between research projects, combined with a higher awareness of the outputs of science, have stimulated the development of new approaches to research evaluation (COSEPUP, 1999; Georghiou and Roessner, 2000; van Raan, 2000; Rip, 2000; Frederiksen, Hansson and Wenneberg, 2003). For example, current assessments have evolved from the classical peer review to an informed peer review, in which research is evaluated with the aid of quantitative benchmarks; in the UK, the Research Assessment Exercises are tying funding to output (publications) and recognition (citations) (Leydesdorff, 2005). Despite this important evolution, evaluations still have a critical limitation: the boundaries of the unit of analysis are typically rigid (by institutes/departments, institutions, disciplines, regions, or countries), overlooking the unique and self-organizing characteristics of the research endeavor (Guimerà et al., 2005). This means, for example, that present techniques have difficulty noting differences between low and top performing groups within a focal unit, say a university or even a department within a university. Likewise, benchmarking the performance of university departments in a given area based solely on the number of papers or citations fails to recognize that theoretical or experimental research profiles will necessarily imply different levels of publication and citation output. This renders a comparison based on average levels of productivity or impact for an area of limited value and potentially misleading. Furthermore, the performance of groups, or subunits (as described in Gläser, Spurling and Butler (2004)), cannot be independently measured. Finally, current methods are particularly limited when assessing interdisciplinary research groups (RGs), because it is difficult to ascribe these groups to a particular field of knowledge and measure their performance within this field in comparison with equivalent groups.

To overcome these limitations, we develop and test a research evaluation method that recognizes the endogenous, or self-organizing, characteristics of research groups. Instead of an ad-hoc definition of the unit of analysis, the proposed method uses patterns of collaboration and the specific body of knowledge that these collaborations entail to identify the frontiers of the focal units, as well as other units that qualify as relevant benchmarks. First, the boundaries of an RG are identified based on co-authorship and using the notion of Cohesive Groups, a method used in Network Analysis for subgroup identification (Wasserman and Faust, 1994, pp. 249-284). Second, we use backward citations (found in the work published by each group) to establish its knowledge footprint. These footprints are used to evaluate the degree of structural similarity between groups, i.e. the similarity between RGs is defined by how much their work cites common papers/journals. Once the RGs are characterized and their peers identified, we measure and benchmark the performance/productivity of each RG. The method is demonstrated by ranking groups of the top five institutions in Physics and Mathematics in Mexico.

This paper is divided into five sections. First, we briefly describe different types of evaluations and their limitations. Second, we make reference to the theories that support our method, bibliometric and network analysis. Third, we explain the method. Fourth, we apply this method to analyze the fields of Physics and Mathematics in Mexico in 1997-1999 for five leading institutions and rank, both absolutely and relatively, this area of knowledge in this country within this same time period. Finally, we present some policy implications.


2. Research Evaluation

According to Papaconstantinou and Polt (1997, p. 10), evaluation refers to a process that seeks to determine as systematically and objectively as possible the relevance, efficiency and effect of an activity. The evaluation can occur at three different stages: after (ex-post), during, or before (ex-ante) the activity. These evaluation types produce information that can be used in the assessment of past policies, the monitoring of ongoing initiatives or the forward planning of innovation and technology policies (Papaconstantinou and Polt, ibid). As stated previously, in the last 30 years evaluations (in general) have been on the rise. This trend has been fueled by budget stringency, the need to better allocate scarce public resources and a broader reassessment of the appropriate role of government (Papaconstantinou and Polt, 1997, p. 9). With respect to Science and Technology (S&T), new developments in these areas, increasing costs, higher awareness of their effects, and the desire to use the knowledge and outcomes of these activities, combined with the need to understand the consequences of S&T policies, have spurred the demand for these types of activities (Martin and Irvine, 1983; COSEPUP, 1999; van Raan, 2000; Rip, 2000; Frederiksen, Hansson and Wenneberg, 2003).

2.1 Types of Research Evaluations

To accommodate these changes and to address the new realities of S&T, several types of evaluations have been developed. Table 1 lists and describes the most common types of evaluations.

Table 1 Current Methods for Research Evaluation

Bibliometric analysis: Assumes that publications, citations, and patent counts signal the work and productivity of a unit of analysis.
Economic rate of return: Used to estimate the economic benefits (such as rate of return) of research; gives a metric for the outcomes of research.
Peer review, classic approach: Traditional method for evaluating science in which scientists continuously exercise self-evaluation and correction. Focuses on individual scientific products.
Peer review, modified approach: A natural development from the classic approach; it incorporates issues that are not strictly cognitive. Focuses on group learning.
Case studies: Historical accounts of the social and intellectual developments that led to key events in science or applications of science; they illuminate the discovery process in greater depth than other methods.
Retrospective analysis: Similar to case studies, but instead of focusing on one scientific or technological innovation it focuses on multiple cases.
Benchmarking: Used to assess whether a particular unit of analysis is at the cutting edge in terms of research, education or other measures.

Source: COSEPUP (1999, pp. 18-22); van Raan (2000); and Frederiksen, Hansson and Wenneberg (2003).

Each method is substantially useful in its own way, but has significant drawbacks (Table 2). For all established methods noted above, important additional limitations exist. Typically, the boundaries of the unit of analysis are defined in an ad-hoc way, overlooking the endogenous characteristics of such units. Furthermore, these evaluations often assign broad cohort groups and use a working assumption that within a field all units are homogeneous in terms of the knowledge they use and their comparative advantage (e.g. differences between low and top performing actors within a given unit). The method proposed in this study aims to address these limitations.

Table 2 Pros and cons of current evaluation methods

Bibliometric analysis. Pro: Quantitative; useful on an aggregate basis to evaluate quality for some programs and fields. Con: At best, measures only quantity; not useful across all programs and fields; comparisons across fields or countries difficult; can be artificially influenced.
Economic rate of return. Pro: Quantitative; shows economic benefits of research. Con: Measures only financial benefits, not social benefits; time separating research from economic benefit is often long; not useful across all programs and fields.
Peer review. Pro: Well-understood method and practices; provides evaluation of quality of research and sometimes other factors. Con: Focuses primarily on research quality; other elements are secondary; evaluation usually of research projects, not programs; great variance across agencies; concerns regarding use of the "old boy network"; results depend on involvement of high-quality people in the process.
Case studies. Pro: Provides understanding of the effects of institutional, organizational, and technical factors influencing the research process, so the process can be improved; illustrates all types of benefits of the research process. Con: Happenstance cases not comparable across programs; focus on cases that might involve many programs or fields, making it difficult to assess federal-program benefit.
Retrospective analysis. Pro: Useful for identifying linkages between federal programs and innovations over long intervals of research investment. Con: Not useful as a short-term evaluation tool because of the long interval between research and practical outcomes.
Benchmarking. Pro: Provides a tool for comparison across programs and countries. Con: Focused on fields, not federal research programs.

Source: COSEPUP (1999, p 18-22).

3. Theoretical Background

In this section we describe the bodies of knowledge that have been used in the development of our method. First, we use dynamic network analysis to establish the patterns of collaboration within a field and delimit the boundaries of an RG. In particular, we draw on the notion of Lambda Sets to determine the level of cohesiveness of these groups. Second, we use two techniques from bibliometric analysis to establish the similarities between RGs (co-citation analysis), assess their performance (with respect to their publication output, citation counts, citation impact and citation impact by group size) and rank these groups against their peers.

3.1 Network Analysis

According to Rogers, Bozeman and Chompalov (2001), networks are used in the study of science and technology (S&T) as guiding metaphors and as techniques to measure structural properties of the ensemble. They classify these types of studies by the level of analysis, which is given by the nature of the actors placed at the nodes (i.e. nodes can be individuals, teams, departments or institutions), by the nature of the links between nodes (interaction networks vs. position networks)4, and by the domain to which the actors belong (intra-organizational vs. inter-organizational)5. In this work, we use network theory to understand the patterns of collaboration in science and delimit the boundaries of an RG. For this purpose, we focus our analysis on the ties that emerge through co-authorship and define an RG based on its levels of cohesiveness and the connectivity between co-authors.

4 In interaction networks, the links represent actual information exchanges or other communication events between actors, whereas in position networks they represent relationships established by the relative positions of the actors in the system (Rogers, Bozeman and Chompalov, 2001).
5 Intra-organizational studies only consider links between actors inside the boundaries of a single organization; in contrast, inter-organizational studies consider links between actors across organizational boundaries (Rogers, Bozeman and Chompalov, 2001).


Cohesive subgroups

According to Wasserman and Faust (1994, p. 249), a cohesive subgroup for dichotomous relationships is a subset of nodes among whom there are relatively strong, direct, intense, frequent, or positive ties; for valued relationships (like the ones encountered in this work) the subsets of nodes have strong or frequent ties. In addition, these authors identify four mechanisms by which these subgroups can be formed, namely mutuality of ties, closeness or reachability between members, frequency of ties among members, and relative frequency of ties among subgroup members compared to non-members. These authors contend that in the first three mechanisms, cohesive subgroups are defined based on the relative strength, frequency, density, or closeness of ties within the subgroups, whereas in the fourth this delineation is done on the relative weakness, infrequency, sparseness or distance of ties from subgroup members to non-members. On a scale of strict to less strict, the fourth mechanism (LS and Lambda Sets) is the most restrictive one, followed by the first, and then by the other two. Table 3 summarizes these methods. In this paper, we combine lambda sets and cliques as a first approach to measure group cohesiveness because they allow us to identify the Principal Investigators (or most connected people) in the network, the former, and all the (direct) collaborators of three or more co-authors, the latter. Furthermore, the other two measures can be understood by weakening the clique approach, which can be further explored in future work.

Lambda sets

Lambda sets measure the level of cohesiveness of a group based on the frequency of ties within vs. outside subgroups. This type of subgroup is relatively robust in terms of its connectivity, i.e. it is difficult to disconnect by the removal of lines from the subgraph (Wasserman and Faust, ibid). In order to assess the level of cohesiveness of a group, this algorithm varies the number of ties, d, within a group; the more ties that have to be dropped (within a group), the more cohesive the group is. In addition, this measure is used to identify the most connected people within a network (whom we denote as PIs).

Table 3 Methods used to delimit subgroups

Mechanism | Method | Definition
Mutuality of ties | Cliques | A clique is a maximal complete subgraph with at least three nodes. It is a subset of nodes, all of which are adjacent to each other, and there are no other nodes that are also adjacent to all of the members of the clique.
Reachability and diameter (or closeness) | n-cliques | An n-clique is a maximal subgraph in which the largest geodesic distance between any two nodes is no greater than n. When n = 1, the subgraphs are cliques.
 | n-clans | An n-clan is an n-clique in which the geodesic distance, d(i,j), between all nodes in the subgraph is no greater than n for paths within the subgraph.
 | n-clubs | An n-club is a subgraph in which the distance between all nodes within the subgraph is less than or equal to n; furthermore, no nodes can be added that also have geodesic distance n or less from all members of the subgraph.
Nodal degree (frequency of ties among members) | k-plexes | A k-plex is a maximal subgraph in which each node may be lacking ties to no more than k subgraph members. When k = 1, the subgraph is a clique.
 | k-cores | A k-core is a subgraph in which each node is adjacent to at least a minimum number, k, of the other nodes in the subgraph.
Frequency of ties within vs. outside subgroups | LS sets | An LS set is a subgroup definition that compares ties within the subgroup to ties outside the subgroup by focusing on the greater frequency of ties among subgroup members compared to the ties between subgroup members and outsiders.
 | Lambda sets | A lambda set is a cohesive subset that is relatively robust in terms of its connectivity, i.e. it is difficult to disconnect by the removal of lines from the subgraph.

Source: Based on Wasserman and Faust (1994, pp. 251-267).

Cliques

A clique consists of a subset of nodes, all of which are adjacent to each other, and there are no other nodes that are also adjacent to all of the members of the clique (Wasserman and Faust, 1994, p. 254). As previously stated, this technique allows us to identify all the direct collaborators of a researcher.

3.2 Bibliometric Analysis

Since the 1970s, performance analysis (based on publication output and received citations) (Martin and Irvine, 1983; van Raan, 2000; Kane, 2001; van Raan, 2005) and the mapping of science (based on co-citation analysis) (Small, 1973; Narin, 1976; Narin, 1978; Small, 1978; Leydesdorff, 1987; Gmür, 2003) have been widely used in evaluative bibliometrics (see Noyons, Moed, and Luwel (1999) for a combined perspective, and van Raan (2004) for a historical development). However, it is important to note that, with the exception of Noyons, Moed, and Luwel (1999), these two approaches have been used separately. In the following sections we briefly summarize both techniques.


Performance Analysis

The traditional approach toward analysis of output performance by research units is to use publications, citations, and patent counts as signals of the work and productivity of the focal unit. This method is based on the premise that an article will be published in a refereed journal only when expert reviewers and the editor approve its quality, and it will be cited by other researchers as recognition of its authority (COSEPUP, 1999, p. 18). In order to measure this output, several metrics have been developed in the literature (van Raan, 2004; Thomson-ISI, 2003). In this work, we use some of the most established metrics, including total scientific output, citation counts and citation impact, defined as:
$$\text{Total scientific output} = \sum \text{Published articles} \qquad (1)$$

$$\text{Citation count} = \sum \text{Received citations} \qquad (2)$$

$$\text{Citation impact} = \frac{\sum \text{Received citations}}{\sum \text{Published articles}} \qquad (3)$$

In addition, we also use a normalized citation impact, in order to have a measurement of performance that is comparable across RGs, regardless of their size composition:

$$\text{Normalized citation impact} = \frac{1}{\text{Group size}} \cdot \frac{\sum \text{Received citations per group}}{\sum \text{Published articles per group}} \qquad (4)$$
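As an illustration of how these four indicators fit together, the following minimal sketch (Python, with invented data; the metrics in this paper were computed from the ISI database, not with this code) applies equations (1) to (4) to the publication records of a single hypothetical research group.

```python
# Minimal sketch of equations (1)-(4) for one research group (hypothetical data).
from collections import namedtuple

Article = namedtuple("Article", ["article_id", "citations"])

def performance_metrics(articles, group_size):
    """Return scientific output, citation count, citation impact and normalized impact."""
    total_output = len(articles)                          # eq. (1): sum of published articles
    citation_count = sum(a.citations for a in articles)   # eq. (2): sum of received citations
    citation_impact = citation_count / total_output if total_output else 0.0   # eq. (3)
    normalized_impact = citation_impact / group_size if group_size else 0.0    # eq. (4)
    return total_output, citation_count, citation_impact, normalized_impact

# A hypothetical three-member group with four papers
papers = [Article("p1", 10), Article("p2", 3), Article("p3", 0), Article("p4", 7)]
print(performance_metrics(papers, group_size=3))  # -> (4, 20, 5.0, 1.666...)
```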
Co-citation analysis

Co-citation analysis studies the structure and development of scientific communities and areas. This methodology is based on the notion that a citation is a valid and reliable indicator of scientific communication, and that this measure signals the relevance of an article (Small, 1978; Garfield, 1979; Gmür, 2003). The specialized literature identifies a co-citation when two publications or authors cite the same reference, and uses this concept as a measure of similarity of content between the two publications or authors (Gmür, 2003).

In the last 30 years, two approaches have been developed within this framework, namely document co-citation (focused on documents and publications with peer-review procedures) and author co-citation. In this work, we use the first approach to measure the similarity between the knowledge bases (citations) of two RGs. For this purpose, we define the knowledge footprint (KFP) of group i as all the backward citations used by all members of the group in all of their papers within a specific time frame.
4. Method

In this section we discuss the method developed for the identification of research groups and for benchmarking their productivity and impact. This method consists of five steps.


First, the collaboration patterns among researchers are identified, by finding all the direct and indirect ties of each researcher and measuring their Lambda Set Level (LSL)6; this collection of patterns and collaborators forms a set of collaboration groups (CG). Second, different research groups (RG) are defined based on the researchers with the highest LSL (identified in this work as PIs), the CGs of these authors, and the characteristics of their direct ties. Third, the Knowledge Footprint (KFP) of each group is delimited, and a co-citation analysis is performed with this footprint to establish the similarities between groups. Fourth, the performance of each RG is measured using the absolute number of articles and the number of articles per research group per researcher. Finally, the research groups are benchmarked by combining the previous results. Figure 1 shows a conceptual representation of the method.
[Figure 1: steps of the method and associated techniques: 1. Identify co-authors (bibliometric analysis); 2. Define collaboration groups and 3. Define research groups (network analysis, grouping algorithms); 4. Establish knowledge footprint (KFP) (co-citation analysis); 5. Measure research group performance (performance analysis); 6. Benchmark research groups.]
Figure 1 Method steps. This figure shows the steps of the method for the definition and benchmark of research groups.

4.1 Identification of Collaboration Groups

In this step, the collaboration groups are established based on the pattern of co-authorship in a field within a certain period of time. This step is subdivided into four sub-steps. First (sub-step 1.1), all articles produced for a certain broad area of knowledge, time period and relevant universe (e.g. country, region, city, university) are identified. This sub-step generates a database containing a set of articles, authors, area of knowledge, backward citations, and citations received within a certain timeframe.

Second (sub-step 1.2), a dichotomous mode-2 matrix (NxM, N Authors, M articles) is created, allowing an author to be linked to all articles she or he has produced.
6 Lambda Set Level: the number of ties that need to be removed so that an author becomes completely isolated.


Because co-authorship exists in many articles, this matrix also provides an indirect relationship between authors (Figure A1 in Appendix A shows an example of these types of matrices). Third (sub-step 1.3), to identify the direct relationships between authors, the previous mode-2 matrix is converted into a mode-1 matrix (or adjacency matrix). This produces a weighted NxN matrix (author by author) where the w_ij value of each cell indicates the strength or frequency of the relation (co-authorship) between authors i and j (Figure A2 in Appendix A gives an example of this type of matrix).7 Fourth (sub-step 1.4), the most interconnected groups and people within a field of knowledge are identified using the level of cohesiveness. Specifically, the concept of Lambda Sets is used: by varying the number of ties removed within a group, all relevant cohesive groups are found. Furthermore, this procedure provides the Lambda Set Level (LSL) of each author, i.e. the number of ties that need to be removed so that a researcher becomes completely isolated, identifying the key players within a network, or Principal Investigators (PIs). In addition, all collaboration groups (CG) within the area of knowledge are identified using a second measure of cohesiveness. In this case the concept of clique is used so that all the direct ties are found. For the purpose of the study, a collaboration group consists of three or more co-authors and/or authors that jointly share two or more co-authors. In contrast, individual authors (authors whose co-authors have no relation between them) will not be part of a collaboration group.
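To make sub-steps 1.2 to 1.4 concrete, the sketch below (Python with the networkx library, on invented data) builds a small mode-2 matrix, projects it into a weighted co-authorship matrix, extracts the cliques of three or more authors, and scores each author by the number of co-authorship ties that would have to be removed to isolate him or her. That last quantity is only a simplified stand-in for the Lambda Set Level; the actual computations in this paper were done in UCINET (see footnote 7).

```python
# Sketch of sub-steps 1.2-1.4 on hypothetical data (not the UCINET workflow used in the paper).
import numpy as np
import networkx as nx

authors = ["A", "B", "C", "D"]
# incidence[i, j] = 1 if author i co-authored article j (mode-2 matrix, invented values)
incidence = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
])

# Mode-1 projection: co-authorship counts between authors (diagonal = papers per author)
coauth = incidence @ incidence.T

G = nx.Graph()
for i, a in enumerate(authors):
    for j in range(i + 1, len(authors)):
        if coauth[i, j] > 0:
            G.add_edge(a, authors[j], weight=int(coauth[i, j]))

# Collaboration groups: maximal cliques of three or more co-authors
cliques = [c for c in nx.find_cliques(G) if len(c) >= 3]

# Number of co-authorship ties to remove before an author is isolated
# (a crude proxy for the Lambda Set Level; the highest scores flag candidate PIs)
isolation_cost = {a: sum(d["weight"] for _, _, d in G.edges(a, data=True)) for a in G}

print(cliques)          # two overlapping triangles in this toy example
print(isolation_cost)   # e.g. {'A': 3, 'B': 4, 'C': 5, 'D': 2}
```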
4.2 Research Group Delimitation

In this step we use the information on each author's LSL, their CGs and the characteristics of their direct ties to define the boundaries of the research groups (RG). The intuition of this step is that we first find all the Principal Investigators (PIs) within a particular knowledge area, using the LSL of each author. After this, we assign each PI to a research group and, based on the characteristics of their direct collaborators, we allocate those collaborators to this or other RGs. This entails four sub-steps.

First (sub-step 2.1), the Principal Investigators within a particular knowledge area are identified (by ordering and ranking all the authors based on their LSL) and each is assigned to a unique RG. Second (sub-step 2.2), based on the CGs of the PI, all of its direct collaborators are identified and categorized as non-shared or shared collaborators. A direct collaborator is defined as a non-shared collaborator if all of its collaboration groups are a subset of the collaboration groups of the PI. Conversely, a direct collaborator is defined as a shared collaborator if its collaboration groups are part of the collaboration groups of two or more PIs.

7 The conversion of a mode-2 matrix into a mode-1 matrix and the calculations for extracting the cliques, as well as the normalization and symmetrization of the KFP, are performed using UCINET VI, a software package that has built in all the routines necessary for Network Analysis. In addition, we used NetDraw v. 2.25 to plot the different networks, like figure YYY in section BBB.


Third (sub-step 2.3), direct collaborators categorized as non-shared collaborators are immediately assigned to the PI's research group. For the remaining ones a two-stage rule is used. First, the overlap of CGs between the shared collaborator and the PIs is assessed; the shared collaborator is assigned to the RG whose PI has the highest CG overlap with that researcher. If the overlap is the same, a second rule is used: the institutional affiliation of the shared collaborator and of the PI's RG, assigning the researcher to the group with the highest similarity in institutional affiliations. Finally (sub-step 2.4), isolated dyads or individual researchers with low LSL are assigned to PIs (and their corresponding RGs) with higher LSL using sub-step 2.3. Figure 2 provides an example for this part of the method (the identification of collaboration groups and of shared and non-shared collaborators).
[Figure 2 legend: author A; shared collaborators; non-shared collaborators; collaborators of non-shared collaborators; research group for author A at a given LSL; research groups for other authors at other LSLs.]


Figure 2 Example of Research Group (RG) Delimitation. This figure provides an example of how the proposed method delimits the boundaries of an RG. First, it starts with the author (the PI of this group) that has the highest Lambda Set Level (in this case the green square) and creates a group (yellow triangle); this process is repeated for all the authors in a certain network. Second, all the collaborators of the author are linked to this group and a distinction is made between shared (with other RGs) and non-shared collaborators. Third, the shared collaborators are assigned to this RG based on the LSL of the PIs and, if necessary, on the institutional affiliation of the shared researcher, i.e. this author will be assigned to the PI's RG with the highest LSL and, if necessary, to the one with the highest institutional overlap between the shared researcher and the RG.

After taking these steps, the universe of analysis is clustered into a set of RGs, plus various researchers that are not integrated into any research group as defined by the method.
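The assignment rule of sub-step 2.3 can be summarized in a few lines of code. The sketch below is only a hypothetical illustration (the data structures and names are ours, not the authors'): a shared collaborator is assigned to the PI whose collaboration groups overlap most with his or her own, and institutional affiliation breaks ties.

```python
# Hypothetical sketch of the two-stage assignment rule of sub-step 2.3.
def assign_shared_collaborator(collab_cgs, collab_affils, pis):
    """pis maps a PI id to {"cgs": set of CG ids, "affils": set of institutions}."""
    def score(pi_id):
        pi = pis[pi_id]
        cg_overlap = len(collab_cgs & pi["cgs"])            # first rule: CG overlap
        affil_overlap = len(collab_affils & pi["affils"])   # second rule: affiliation overlap
        return (cg_overlap, affil_overlap)
    return max(pis, key=score)

# The collaborator shares two CGs with PI "pi_2" and only one with "pi_1"
pis = {
    "pi_1": {"cgs": {"cg1", "cg3"}, "affils": {"UNAM"}},
    "pi_2": {"cgs": {"cg2", "cg4", "cg5"}, "affils": {"CINVESTAV"}},
}
print(assign_shared_collaborator({"cg1", "cg2", "cg4"}, {"CINVESTAV"}, pis))  # -> pi_2
```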
4.3 Knowledge footprint and group similarity

This part of the method is divided into four sub-steps. First (sub-step 3.1), we establish the Knowledge Footprint (KFP) of each group. For this purpose, all backward citations for each RG are aggregated in a mode-2 matrix (KxJ, K research groups, J citations). Second (sub-step 3.2), we convert this matrix into a KxK matrix using RG affiliation. This produces a (directed) co-citation matrix (CCM) for all the research groups in a field of knowledge within a certain period of time. Figure 3 shows a hypothetical example of this type of matrix.
       RG1  RG2  RG3  RG4  RG5  RG6  RG7  RG8  RG9  RG10
RG1     40    0    0    0   22    0    0   10    0    0
RG2      0   50    0    0    0    0    0    0    0    0
RG3      0    0   21    0    0    2    0    5    0    0
RG4      0    0    0   14    0    0    0    0    0    0
RG5     22    0    0    0   29    0    0    0   10    0
RG6      0    0    2    0    0   39    0    0    0    0
RG7      0    0    0    0    0    0   20    0    0    0
RG8     10    0    5    0    0    0    0   35    0    0
RG9      0    0    0    0   10    0    0    0   24    2
RG10     0    0    0    0    0    0    0    0    2    2

Figure 3 Co-citation matrix. This figure represents a hypothetical co-citation matrix. The value in the ii-th (diagonal) cell corresponds to the total number of backward citations for group i. The value in the ij-th cell represents the total number of common citations between groups i and j.

Third, the CCM is normalized with respect to the maximum number of citations in each column. This produces a (directed) normalized co-citation matrix (NCCM). Figure 4 shows a hypothetical example of this type of matrix.


       RG1  RG2  RG3  RG4  RG5  RG6  RG7  RG8  RG9  RG10
RG1   100%   0%   0%   0%  76%   0%   0%  29%   0%   0%
RG2     0% 100%   0%   0%   0%   0%   0%   0%   0%   0%
RG3     0%   0% 100%   0%   0%   5%   0%  14%   0%   0%
RG4     0%   0%   0% 100%   0%   0%   0%   0%   0%   0%
RG5    55%   0%   0%   0% 100%   0%   0%   0%  42%   0%
RG6     0%   0%  10%   0%   0% 100%   0%   0%   0%   0%
RG7     0%   0%   0%   0%   0%   0% 100%   0%   0%   0%
RG8    25%   0%  24%   0%   0%   0%   0% 100%   0%   0%
RG9     0%   0%   0%   0%  35%   0%   0%   0% 100% 100%
RG10    0%   0%   0%   0%   0%   0%   0%   0%   8% 100%

Figure 4 Normalized co-citation matrix. This figure represents a hypothetical NCCM. The value of the ij-th cell represents the level of similarity of the group in the i-th row with respect to the group in the j-th column, e.g. the KFP of RG5 has a level of similarity of 55% with respect to RG1; conversely, RG1 has a 76% citation overlap with respect to RG5.

4.4 Scientific output and performance

In order to measure the scientific performance of each research group, we sum the number of articles published by each research group, a proxy for RG scientific output, and the citation counts, a proxy for RG total scientific impact. Finally, citation impact by group size is established as a proxy for a normalized measure of scientific impact.

4.5 Group benchmark

Once the research groups have been defined, the similarities among the RGs established and their performance assessed, we proceed to benchmark groups against their peers. The benchmark groups are identified through minimum thresholds in the level of similarity of the knowledge on which the groups rely for their work. We used ten, five and one percent overlap in KFP as the baseline levels of similarity between groups. Figure 8 shows a hypothetical research group, its peers (second column), the normalized co-citation share (third column), the different levels of similarity (different shades of gray) and, at the bottom, the KFP overlap.

The KFP overlap is defined as:

$$\text{KFP overlap}_i = \sum_{j \in CG} \text{similarity}_j \qquad (5)$$

where CG is the set of peer groups of group i and j indexes those peers. This indicator tells us how much the aggregated KFP of the peer groups overlaps with group i.
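The following sketch (Python, with invented knowledge footprints; not the UCINET procedure actually used) shows how the co-citation matrix of Figure 3, its normalized version in Figure 4 and the KFP overlap of equation (5) fit together. We assume here that the common citations between two groups are counted as the overlap of their reference lists.

```python
# Sketch of sub-steps 3.1-3.2 and of equation (5) on hypothetical knowledge footprints.
import numpy as np

# Backward citations (reference -> count) per research group, invented values
kfp = {
    "RG1": {"ref_a": 3, "ref_b": 2, "ref_c": 1},
    "RG2": {"ref_b": 1, "ref_d": 4},
    "RG3": {"ref_e": 2},
}
groups = sorted(kfp)

def common_citations(g1, g2):
    if g1 == g2:                        # diagonal: total backward citations of the group
        return sum(kfp[g1].values())
    shared = set(kfp[g1]) & set(kfp[g2])
    return sum(min(kfp[g1][r], kfp[g2][r]) for r in shared)

ccm = np.array([[common_citations(gi, gj) for gj in groups] for gi in groups], dtype=float)
nccm = ccm / np.diag(ccm)               # column j normalized by group j's own total citations

threshold = 0.01                        # 1% similarity cutoff used to select peer groups
focal = groups.index("RG1")
peers = [j for j in range(len(groups)) if j != focal and nccm[focal, j] >= threshold]
kfp_overlap = nccm[focal, peers].sum()  # equation (5): summed similarity over the peers
print(nccm.round(2), kfp_overlap)
```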

5. Demonstration of the Method

5.1 Data

To test the proposed method, we use a database from Thomson Scientific (2004), formerly known as the Institute for Scientific Information (ISI), owned by the Mexican Council for Science and Technology (CONACYT).


This database contains all papers published between 1980 and 2003 with at least one address in Mexico.8 It contains the following information: article name, author(s), address(es), year of publication, journal, volume, pages, backward citations (i.e. references) and total number of citations received. To illustrate the application of the method, we selected all the papers published in Physics and Mathematics by five leading institutions (and the departments related to these fields) in the period 1997-1999. We chose this period because we wanted the most recent publications while still allowing for a citation window of five years (i.e. for the papers published in 1997 we restricted the citation count to the period 1997-2001, for the ones published in 1998 this window corresponded to 1998-2002, and for 1999 we chose a 1999-2003 citation window). Table 4 provides a list of institutions and their departments with the number of papers published in each of them, as well as the number of forward citations. Finally, we provide an example of how to benchmark RGs using the method developed in this paper.
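As a small illustration of this rolling citation window, the snippet below (hypothetical field names, not the code used to process the CONACYT database) counts only the citations that arrive within five calendar years of publication.

```python
# Five-year citation window, counted inclusively from the publication year.
def citations_in_window(pub_year, citing_years, window=5):
    """Count citations received within `window` calendar years of publication."""
    return sum(1 for y in citing_years if pub_year <= y <= pub_year + window - 1)

# A 1997 paper cited in 1998, 2001 and 2003: only the first two fall inside 1997-2001
print(citations_in_window(1997, [1998, 2001, 2003]))  # -> 2
```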

5.2 Main outcomes

Table 4 shows a traditional assessment done by institution (university or research center). From this table it can be seen that the Autonomous National University of Mexico (UNAM) is the leading institution in absolute terms and ranks second if both measures are normalized by the total number of researchers, while the National Polytechnic Institute lags behind (among these selected institutions) in the critical output and impact measures.

Table 4 Number of Articles and Citations in Physics and Math for Selected Institutions, 1997-1999

Institution | Articles: Total (Rank) | Articles per Researcher | Citations: Total (Rank) | Citations per Researcher
Autonomous National University of Mexico (UNAM) | 1051 (1) | 2.2 | 3588 (1) | 7.4
Research and Advanced Studies Centre (CINVESTAV) | 417 (2) | 5.6 | 1145 (2) | 15.5
Metropolitan Autonomous University Iztapalapa (UAM-I) | 280 (3) | 2.1 | 719 (3) | 5.4
Autonomous University of Puebla (BUAP) | 202 (4) | 2.4 | 499 (4) | 5.9
National Polytechnic Institute (IPN) | 114 (5) | 0.4 | 218 (5) | 0.8

8 In the last stage of our analysis we realized that this database also contained publications with at least one address in New Mexico and none in Mexico. We think that Thomson Scientific might have created this database with the keyword "Mexico", including all the papers from Mexico and New Mexico. We preserved the papers with all the addresses because we did not want to mistakenly eliminate useful data; our primary concern, however, was to concentrate on Mexico.

In Table 5, each institution has been broken down by administrative boundary. These boundaries depend on the nature of the institution and may reflect, for example, a research center focused just on research; an institute or department training graduate students and doing research; or a school primarily training undergraduate students and doing some research. This table reflects the typical level of detail allowed by existing methods. It already provides a more complete perspective than a purely institutional characterization. For example, the Physics Institute at UNAM has a stellar performance (in total output), stronger than the Faculty of Science and much different from the Applied Math and Systems Research Institute. By contrast, at the Autonomous University of Puebla (fourth place at the institutional level) the Physics Institute performs above two departments of the Research and Advanced Studies Centre (second place at the institutional level). However, if we now look at the patterns of co-authorship and collaboration and identify research groups based on these patterns, a different and much more complete perspective emerges. Figure 6 shows the distribution of research groups, identified by the method, ranked by the number of publications of each research group. From this figure it can be seen that the performance of RGs is more heterogeneous, e.g. UNAM leads both the top (for good) and bottom (for bad) percentiles, with eight and eleven RGs respectively, whereas the RGs from BUAP and IPN trail the other institutions.

Table 5 Total Number of Articles and Citations in Physics and Math by Department for Selected Institutions, 1997-1999

Department | Number of Articles (Rank) | Number of Citations (Rank)
Autonomous National University of Mexico:
  Physics Institute (IF UNAM) | 594 (1) | 2495 (1)
  Faculty of Science (FC UNAM) | 285 (3) | 709 (3)
  Mathematics Institute (MI UNAM) | 114 (6) | 271 (6)
  Applied Math and Systems Research Institute (IIMAS) | 82 (8) | 212 (8)
Research and Advanced Studies Centre:
  Physics Department (Phy CINVESTAV) | 289 (2) | 879 (2)
  Applied Physics Department (APhy CINVESTAV) | 83 (8) | 204 (9)
  Math Department (Math CINVESTAV) | 55 (9) | 88 (12)
Metropolitan Autonomous University Iztapalapa:
  Physics Department (Phy UAM-I) | 225 (4) | 625 (4)
  Math Department (Math UAM-I) | 55 (9) | 94 (11)
Autonomous University of Puebla:
  Physics Institute (IF BUAP) | 124 (5) | 356 (5)
  Faculty of Science (FC BUAP) | 96 (7) | 171 (10)
National Polytechnic Institute:
  School of Physics and Mathematics (ESFM) | 114 (6) | 218 (7)

Furthermore, if we normalize this distribution by the number of researchers in each group, a more complex picture emerges. Figure 7 shows the distribution of research groups, identified by the method, ranked by the number of publications of each research group and normalized by group size. From this figure it can be seen that the performance of RGs is quite heterogeneous across institutions: UNAM does not lead the top percentile and still leads the bottom one, and the numbers of RGs in BUAP, IPN and UAM-I are about the same. In Figure 8, each institution has been broken down by administrative boundary, showing the distribution of RGs (identified by the method) by department, ranked by the number of papers per RG. Once again it can be seen that the performance of research groups is quite heterogeneous across departments. Furthermore, the same heterogeneous picture emerges if the groups are ranked by number of citations (Figures A3 and A4 in the appendix).

[Figure 6: bar chart; number of research groups (y-axis) in each percentile of papers per RG (x-axis), by institution (BUAP, IPN, UNAM, UAM-I, CINVESTAV).]

Figure 6 Distribution of Research Groups (RG) by Institution, Ranked by Number of Papers per RG. This figure provides the distribution of RGs based on the total number of articles produced by each RG within a certain institution. The left-hand side shows the institutions that have the highest number of leading groups, e.g. UNAM has eight RGs in the top percentile, whereas BUAP has three. In contrast, the right side gives the number of laggard research groups, e.g. UNAM has 11 and BUAP has only one. From this figure it can be seen that the performance of RGs is more heterogeneous across institutions.
[Figure 7: bar chart; number of research groups (y-axis) in each percentile of papers per RG per researcher (x-axis), by institution (BUAP, IPN, UNAM, UAM-I, CINVESTAV).]

Figure 7 Distribution of Research Groups (RG) by Institution, Ranked by Number of Papers per RG per Researcher. This figure provides the distribution of RGs based on the total number of articles produced by each RG per researcher within a certain institution. The left-hand side of this figure shows the top performing institutions by number of research groups in the top percentile, e.g. UNAM has only one RG in the top percentile, whereas BUAP has four and UAM-I has the highest number of RGs, eight. In contrast, the right side gives the least performing institutions, e.g. UNAM has 17, BUAP three and IPN none. From this figure it can be seen that the performance of RGs is quite heterogeneous across institutions, and the overall performance of each institution changes if the size of the RGs is taken into account.


[Figure 8: bar chart; number of research groups (y-axis) in each percentile of papers per RG (x-axis), by department (APhy CINVESTAV, Phy CINVESTAV, Phy UAM-I, Math CINVESTAV, Math UAM-I, ESFM, FC UNAM, FC BUAP, IIMAS, IF BUAP, IF UNAM, MI UNAM).]

Figure 8 Distribution of Research Groups (RG) by Department, Ranked by Number of Papers per RG. This figure provides the distribution of RGs based on the total number of articles produced by each RG within a certain department. The left-hand side shows the departments that have the highest number of leading groups, and the right side gives the number of laggard research groups by department. From this figure it can be seen that the performance of RGs is quite heterogeneous across departments.
As suggested by the analysis so far, identifying research groups should allow a much better understanding of the dynamics of scientific productivity within and across institutions and departments. Yet, this process also suggests that these groups will have very different characteristics as far as the nature of their research is concerned. Therefore, it may not be reasonable to compare groups across the board, but rather to complement the ideas described above with a method that allows the identification of other benchmark groups with a comparable research profile. As described in detail below, the method allows us to compare a given group with others that produce comparable research by looking at the overlap of citations in the papers they publish. Table 6 provides an example for RG 080: overall, this group ranks 32 out of 143 (for publications by research group per researcher) and 24 (for citations by research group per researcher). However, within its cohort it ranks 5th and 4th, respectively, and 1st at a level of similarity of 5% or more. Furthermore, if we break the RGs down by administrative boundary we can see that this group is located in the 20th percentile, vs. the 80th percentile if this group is benchmarked only against the overall network (Figure 8). The example above shows the power of combining a method that allows an identification of research group boundaries with another that characterizes distances in their knowledge footprints.


The potential is for a very sharp and clear identification and benchmark of the relevant pockets of knowledge generation and impact in an institution or region, allowing a more precise evaluation and reward process for university administrators, program managers or policy makers.

Table 6 Relative Ranking Based on Number of Articles per Researcher Using Group Knowledge Similarity Over 3 Years

Research Group | Level of Similarity | Articles per Researcher | Articles: Rank within cohort | Articles: Rank overall | Citations per Researcher | Citations: Rank within cohort | Citations: Rank overall
52     | 2.50%  | 2.5 | 1 | 17 | 3.5 | 1  | 16
017_2  | 16.80% | 1.7 | 2 | 32 | 2.8 | 2  | 22
30     | 16.80% | 1.7 | 2 | 32 | 2.7 | 3  | 24
=> 80  | BASE   | 1.7 | 2 | 32 | 2.7 | 3  | 24
047_1  | 16.80% | 1.3 | 3 | 40 | 1.4 | 5  | 47
89     | 2.50%  | 1.3 | 3 | 39 | 0   | 12 | 40
013_12 | 14.90% | 1.1 | 4 | 49 | 1.7 | 4  | 38
76     | 4.30%  | 1.1 | 4 | 50 | 1.1 | 8  | 57
012_1  | 7.50%  | 1   | 5 | 52 | 1   | 9  | 59
022_4  | 7.50%  | 0.8 | 6 | 58 | 1.3 | 6  | 52
049_2  | 16.10% | 0.8 | 6 | 55 | 1   | 9  | 59
018_6  | 23.60% | 0.8 | 6 | 58 | 1   | 9  | 59
92     | 87.60% | 0.7 | 7 | 63 | 1.2 | 7  | 53
026_3  | 23.60% | 0.7 | 7 | 62 | 1.1 | 8  | 57
024_5  | 23.60% | 0.7 | 7 | 64 | 0.9 | 10 | 61
007_6  | 1.20%  | 0.7 | 7 | 64 | 0.8 | 11 | 62
118    | 17.40% | 0.6 | 8 | 67 | 0.8 | 11 | 64

Note: Overall, group 80 ranks 32 out of 143, but within its cohort (17 RG at a 1% level of similarity) it ranks 2nd.

6. Discussion and Policy Implications

In the last thirty years the realm of science and technology has evolved dramatically. These changes have spurred an evaluation culture (and industry), enhancing traditional methods (like peer review (Guston, 2003)) or creating new ones, including benchmarking between countries (May, 1997; Adams, 1998; King, 2004; Veloso, Reyes-Gonzalez and Gonzalez-Brambila, 2005) or socioeconomic assessments (van Raan, 2000). While useful, these evaluation methods have an important common limitation: the boundaries of the focal unit are typically artificial and rigid, failing to capture the unique and self-organizing characteristics of the research endeavor.


To address this limitation, this paper develops an evaluation method that takes into account the endogenous, or self-organizing, characteristics of research groups. It defines research groups based on the strength and frequency of collaboration patterns (within a field of knowledge) and ranks them using the level of similarity of their knowledge footprints (i.e. common citations). In addition, this method is tested with a database from the fields of Mathematics and Physics in Mexico containing all the papers published between 1997 and 1999 by five leading institutions (as reported by ISI), and a detailed full and relative peer benchmark is performed for both areas.

The method developed in this paper produced two main results. First, as expected, the strength and frequency of the collaboration patterns allowed us to single out cohesive groups, i.e. the method identifies the key research groups (or collective actors) in a field of knowledge regardless of the institutional or location context of their members (researchers). This is a departure from traditional methods, under which a potential evaluator is not able to assess the internal cohesiveness of groups, or their self-organizing mechanisms. In addition, this method can be used to break down groups that extend beyond traditional boundaries, i.e. a multi-institutional group could be divided in two. Second, the knowledge footprint (KFP) and the benchmark at different levels of KFP similarity are an important departure from the established evaluation literature. This step allows potential evaluators to identify similar research groups, assess these groups and produce more meaningful comparisons and rankings (e.g. see Table 6). This solution contrasts with the more traditional approach, where the evaluator typically uses broad and artificial similarities, such as comparing mechanical engineering departments across universities, assuming that they are more or less the same. In addition, this method has an important feature: it can easily be extended to other types of focal units, including institutions, departments, networks or regions, and even individuals, with minor modifications.

From the preliminary results it can be concluded that this method can support policy makers and scientists in better identifying the frontiers of research groups and finding suitable and relevant peers for benchmarking. In addition, this procedure allows scholars, as well as policy makers, to better understand the self-organizing mechanisms of research groups and assess how they evolve over time. We believe this whole process will increase our knowledge of the research endeavor and, combined with other methods (like peer review), will produce better assessments. In addition, this method helps to close the gap between performance analysis and the mapping of science in bibliometric analysis, by giving a different perspective to Noyons, Moed, and Luwel (1999). It also creates a link between the areas of research evaluation and network analysis. But the development of this new method also generates new questions that need to be addressed in subsequent work. One of them is the effect of weakening the clique assumption, using other measures of group cohesiveness (e.g. n-cliques, k-plexes, etc.) to define collaboration groups. Another issue is to understand the impact of aggregating groups with a low Lambda Set Level (LSL) into groups with a higher LSL.


In addition, further analysis is also needed to test the robustness of this method, by incorporating other fields of knowledge or using data from other countries or regions. Finally, this method could further our understanding of the determinants of research group productivity (Gonzalez-Brambila, Veloso and Krackhardt, 2005) in a number of ways. One possibility is to study how the characteristics of the naturally emerging groups are tied to their productivity. Another possibility would be to extend the approach to other types of research output data amenable to equivalent analysis, in particular patents.


References

Adams, J., 1998. Benchmarking International Research. Nature 396, 615-618.
Carley, K. M., 2002. Smart Agents and Organizations of the Future. In: Lievrouw, L., Livingstone, S. (Eds.), The Handbook of New Media. Sage, Thousand Oaks, CA, Ch. 12, pp. 206-220.
Carley, K. M., 2003. Dynamic Network Analysis. In: Breiger, R., Carley, K., Pattison, P. (Eds.), Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Committee on Human Factors, National Research Council, pp. 133-145.
COSEPUP, 1999. Evaluating Federal Research Programs: Research and the Government Performance and Results Act. National Academy Press, Washington, DC.
Frederiksen, L. F., Hansson, F. and Wenneberg, S. B., 2003. The Agora and the Role of Research Evaluation. Evaluation 9, 149-172.
Garfield, E., 1979. Is Citation Analysis a Legitimate Evaluation Tool? Scientometrics 1, 359-375.
Georghiou, L. and Roessner, D., 2000. Evaluating technology programs: tools and methods. Research Policy 29, 657-678.
Gläser, J., Spurling, T. H. and Butler, L., 2004. Intraorganisational evaluation: are there least evaluable units? Research Evaluation 13, 19-32.
Gmür, M., 2003. Co-citation analysis and the search for invisible colleges: A methodological evaluation. Scientometrics 57, 27-57.
Gonzalez-Brambila, C. N., Veloso, F. and Krackhardt, D., 2005. Social Capital and the Creation of Knowledge. Working Paper. Carnegie Mellon University, Pittsburgh, PA.
Guston, D. H., 2003. The expanding role of peer review process in the United States. In: Shapira, P., Kuhlmann, S. (Eds.), Learning from Science and Technology Policy Evaluation. Edward Elgar Publishing, Northampton, MA, pp. 81-97.
Guimerà, R., Uzzi, B., Spiro, J. and Nunes Amaral, L. A., 2005. Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance. Science 308, 697-702.
Kane, A., 2001. Indicators and Evaluation for Science, Technology and Innovation. Paper for an ICSTI working group on evaluating basic research. Retrieved August 2005 from http://www.economics.nuigalway.ie/people/kane/2002_2003ec378.html.
King, D. A., 2004. The Scientific Impact of Nations. Nature 430, 311-316.
Krackhardt, D. and Carley, K. M., 1998. A PCANS Model of Structure in Organization. In: Proceedings of the 1998 International Symposium on Command and Control Research and Technology, June, Monterey, CA. Evidence Based Research, Vienna, VA, pp. 113-119.
Leydesdorff, L., 2005. The Evaluation of Research and the Evolution of Science Indicators. Current Science 89, 1510-1517.
Leydesdorff, L., 1987. Various methods for the Mapping of Science. Scientometrics 11, 291-320.
Martin, B. R. and Irvine, J., 1983. Assessing basic research: Some partial indicators of scientific progress in radio astronomy. Research Policy 12, 61-90.
May, R. M., 1997. The Scientific Wealth of Nations. Science 275, 793-796.
Narin, F., 1976. Evaluative Bibliometrics: The Use of Publication and Citation Analysis in the Evaluation of Scientific Activity. National Science Foundation, Washington, DC.
Narin, F., 1978. Objectivity versus relevance in studies of scientific advance. Scientometrics 1, 35-41.
Noyons, E. C. M., Moed, H. F. and Luwel, M., 1999. Combining Mapping and Citation Analysis for Evaluative Bibliometric Purposes: A Bibliometric Study. Journal of the American Society for Information Science 50, 115-131.
Papaconstantinou, G. and Polt, W., 1997. Policy Evaluation and Technology: An Overview. OECD Proceedings, Conference on Policy Evaluation in Innovation and Technology. OECD, Paris.
van Raan, A. F. J., 2000. R&D evaluation at the beginning of the new century. Research Evaluation 9, 81-86.
van Raan, A. F. J., 2004. Measuring Science: Capita Selecta of Current Main Issues. In: Moed, H. F., Glänzel, W. and Schmoch, U. (Eds.), Handbook of Quantitative Science and Technology Research. Kluwer Academic Publishers, Dordrecht, pp. 19-50.
van Raan, A. F. J., 2005. Fatal Attraction: Conceptual and methodological problems in the ranking of universities by bibliometric methods. Scientometrics 62, 133-143.
Rip, A., 2000. Societal Challenges for R&D Evaluation. In: Shapira, P., Kuhlmann, S. (Eds.), Proceedings from the US-EU Workshop Learning from Science and Technology Policy Evaluation, Bad Herrenalb, Germany, September 2000.
Rogers, J. D., Bozeman, B. and Chompalov, I., 2001. Obstacles and opportunities in the application of network analysis to the evaluation of R&D. Research Evaluation 10, 161-172.
Small, H., 1973. Co-citation in scientific literature: A new measure of the relationship between publications. Journal of the American Society for Information Science 24, 265-269.
Small, H. G., 1978. Cited documents as concept symbols. Social Studies of Science 8, 327-340.
Thomson-ISI, 2003. National Science Indicators database, Deluxe version. Thomson Research, Philadelphia, PA.
Thomson Scientific, 2004. National Citation Report 1980-2003: customized database for the Mexican Council for Science and Technology (CONACYT). Thomson Research, Philadelphia, PA.
Veloso, F., Reyes-Gonzalez, L. and Gonzalez-Brambila, C. N., 2005. The Scientific Impact of Developing Nations. Working Paper. Carnegie Mellon University, Pittsburgh, PA.
Wasserman, S. and Faust, K., 1994. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, UK.


Appendix A
Figures A1 and A2 are used in the description of the method provided above. Figure A1 provides an example of a dichotomous mode-2 matrix (NxM, N authors, M articles), and Figure A2 shows a mode-1 matrix (or adjacency matrix) of author by author.
[Figure A1: excerpt of a binary author-by-article matrix; rows list 11 authors (ABREU, CRA; AGUDO, AL; AGUILAR, R; AGUILAR-RIOS, G; AGUILAR-RODRIGUEZ, E; AGUILARA, R; AHONEN, P; ALVARADO, JFJ; ALVAREZ, J; ALVAREZ-RAMIREZ, J; ANCHEYTA-JUAREZ, J) and columns list articles 1 to 27.]

Figure A1 Hypothetical example of a mode-2 matrix (author by paper). In this case each row lists all the articles published by a particular author and each column relates all the co-authors of an article. A 1 in the ij-th cell indicates that author i is related to publication j; a 0 indicates no relationship.
Author                      1  2  3  4  5  6  7  8  9 10 11
1  ABREU, CRA               1  0  0  0  0  0  0  0  0  0  0
2  AGUDO, AL                0  2  0  0  0  0  0  0  0  0  0
3  AGUILAR, R               0  0  7  0  0  0  0  0  0  1  0
4  AGUILAR-RIOS, G          0  0  0  2  0  0  0  0  0  0  0
5  AGUILAR-RODRIGUEZ, E     0  0  0  0  5  0  0  0  0  0  5
6  AGUILARA, R              0  0  0  0  0  1  0  0  0  1  0
7  AHONEN, P                0  0  0  0  0  0  1  0  0  0  0
8  ALVARADO, JFJ            0  0  0  0  0  0  0  1  0  0  0
9  ALVAREZ, J               0  0  0  0  0  0  0  0  3  0  0
10 ALVAREZ-RAMIREZ, J       0  0  1  0  0  1  0  0  0 18  0
11 ANCHEYTA-JUAREZ, J       0  0  0  0  5  0  0  0  0  0  5

Figure A2 Hypothetical example of an adjacency matrix (author by author). This matrix relates an author with all of its co-authors. The value in the ij-th cell indicates the strength or frequency of the relation between authors i and j.
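For readers less familiar with this projection, the small sketch below (illustrative values only) shows how a Figure A2-style adjacency matrix can be obtained from a Figure A1-style incidence matrix by multiplying it with its transpose; we assume this simple product is an adequate stand-in for the conversion performed in UCINET.

```python
# Mode-2 to mode-1 projection on a tiny invented incidence matrix.
import numpy as np

A = np.array([
    [1, 1, 0, 0],   # author 1 wrote articles 1 and 2
    [0, 1, 1, 0],   # author 2 wrote articles 2 and 3
    [0, 0, 0, 1],   # author 3 wrote article 4
])
print(A @ A.T)
# Diagonal entries count each author's papers; off-diagonal entries count co-authored papers:
# [[2 1 0]
#  [1 2 0]
#  [0 0 1]]
```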


Table A1 gives the number of research groups by institution and department identified by the proposed method.

Table A1 Total number of research groups in Physics and Math for selected institutions, 1997-1999

Institution and Department | RG by Institution* | RG by Department*
Autonomous National University of Mexico (UNAM) | 104 |
  Physics Institute (IF UNAM) | | 99
  Faculty of Science (FC UNAM) | | 27
  Mathematics Institute (MI UNAM) | | 7
  Applied Math and Systems Research Institute (IIMAS) | | 6
Research and Advanced Studies Centre (CINVESTAV IPN) | 61 |
  Physics Department (Phy CINVESTAV) | | 46
  Applied Physics Department (APhy CINVESTAV) | | 23
  Math Department (Math CINVESTAV) | | 6
Metropolitan Autonomous University Iztapalapa (UAM-I) | 44 |
  Physics Department (Phy UAM-I) | | 42
  Math Department (Math UAM-I) | | 3
Autonomous University of Puebla (BUAP) | 33 |
  Physics Institute (IF BUAP) | | 24
  Faculty of Science (FC BUAP) | | 16
National Polytechnic Institute (IPN) | 31 |
  School of Physics and Mathematics (ESFM) | | 31
TOTAL number of RG | 143 |

*There is an overlap of RG across institutions and departments, so the sum is more than the total number of RG (143).


Figure A3 shows the distribution of RGs by institution ranked by number of citations per RG (top) and by number of citations per RG per researcher (bottom). Figure A4 shows the distribution of RGs by department ranked by number of citations per RG.
[Figure A3, top panel: distribution of research groups by institution, ranked by number of citations per RG; bottom panel: distribution of research groups by institution, ranked by number of citations per RG per researcher. Bar charts; number of research groups (y-axis) in each percentile (x-axis), by institution (BUAP, IPN, UNAM, UAM-I, CINVESTAV).]

Figure A3 Distribution of Research Groups (RG) by Institution, Ranked by Number of Citations per RG (top) and by Number of Citations per RG per Researcher (bottom). This figure provides the distribution of RGs based on the total number of citations produced by each RG, and the normalization of this measure controlling for group size. The left-hand side of this figure shows the top performing RGs by institution and the right side gives the least performing RGs. From this figure it can be seen that the performance of RGs is quite heterogeneous across institutions and the overall performance of each institution changes if the size of the RGs is taken into account.

[Figure A4: bar chart; number of research groups (y-axis) in each percentile of citations per RG (x-axis), by department (APhy CINVESTAV, Phy CINVESTAV, Phy UAM-I, Math CINVESTAV, Math UAM-I, ESFM, FC UNAM, FC BUAP, IIMAS, IF BUAP, IF UNAM, MI UNAM).]

Figure A4 Distribution of Research Groups (RG) by Department, Ranked by Number of Citations per RG. This figure provides the distribution of RGs based on the total number of citations produced by each RG within a certain department. The left-hand side shows the departments that have the highest number of leading groups, and the right side gives the number of laggard research groups by department. From this figure it can be seen that the performance of RGs is quite heterogeneous across departments.
