Abstract: Evaluation is a fundamental part of a country's scientific and technological policy and, given its performative character, a tool for introducing changes in the sector. In this paper we present a critique of the hegemonic paradigm of evaluation by products, based mainly on quantitative indicators of papers and patents. Within this framework, we discuss the main issues of science and technology evaluation in Argentina, among them: the excessive weight assigned to quantitative parameters; anonymity and lack of transparency; the primacy of ex ante evaluation carried out exclusively by peers; the lack of coherence with policies and plans; and the overlapping of evaluation systems. Finally, we present a series of proposals to improve evaluation processes in Argentina, emphasising the need to achieve consistency with State policies and a National Science and Technology Plan, and to include non-academic actors in the evaluation of scientific-technological activity.
Keywords: scientific evaluation, scientific careers, S&T projects, social actors, scientific information systems, anonymity, blind peer review, Argentina.
Evaluation in science and technology in Argentina: state of affairs and proposals
Received: 20 September 2019
Accepted: 07 October 2019
*Article written collectively in the framework of the Cátedra Libre CPS by Gabriel M. Bilmes, Marcela Fushimi and Santiago Liaudat, with contributions by Julián Bilmes, Ignacio F. Ranea Sandoval and Jonatan Sabando.
Evaluation has been considered a central aspect of science and technology (S&T) policies especially since the second half of the 20th century, when the sector's activity grew to a much larger scale (a period Salomon (1997) characterised as the "industrialisation of science"). This role, however, is subject to permanent debate and tension, fundamentally because it is through evaluation processes that resources are distributed, stable jobs are accessed, careers are advanced, lines of research are consolidated or discarded, and reputations are built or devalued.
All evaluative processes explicitly seek to assess progress, measure results, weigh effects and assign scores; but implicitly they act performatively, providing guidelines that orient, organise and privilege one type of activity over another. Various studies have shown how the actors in the scientific and technological complex adapt their practices to what is expected of them (Davyt & Velho, 1999; Fernández Esquinas et al., 2011). For this reason, evaluation is also a fundamental tool for introducing changes in the implicit policies of a country's S&T sector. Returning to Herrera's (1975) categories, it could be said that by modifying evaluative processes an explicit policy (S&T plans) could be turned into an implicit one (the norms, values and forms of organisation that effectively guide the practices of the actors).
Evaluation is an instrument at the service of those who plan, finance and/or manage S&T activities. The objects to be evaluated are very diverse: from the macro level (policies, R&D plans and national programmes), through the meso level (institutional evaluations, sub-systems or specific areas), to the micro level (individuals, research groups and particular projects). Moreover, the evaluation process can be carried out at different times: prior to the beginning of the activity (ex ante), during the course of the activity (intermediate) or at the end (ex post).
In short, evaluation is a key issue of enormous complexity that often remains hidden from most of the actors who take part in some phase of the evaluation process; this, added to the bureaucratisation inherent in the activity, generates a sense of opacity and lack of transparency for many of those being evaluated (Atrio, 2018). In this article we do not intend to exhaust a topic on which abundant specialised literature exists, but rather to take up the emerging criticisms of the standard model of evaluation by products, to analyse specific problems of evaluation in our country, and finally to present a series of proposals from the Cátedra Libre Ciencia, Política y Sociedad of the UNLP.
The most widespread approach to S&T evaluation at the global level is the one enshrined in the Frascati (1963) and Oslo (1992) Manuals of the Organisation for Economic Co-operation and Development (OECD).1 This is a linear perspective that measures inputs to the system - basically money invested and existing human resources - against the results obtained, expressed as the number of scientific articles published in refereed journals (usually referred to as papers) or the technological development achieved, measured in terms of the number of patents obtained. It is an evaluation paradigm that applies the input-output matrix of economics to S&T production. Over time, both manuals have been extended with annexes and, although they have kept the original analysis matrix, they have become more complex and have incorporated complementary variables. However, papers and patents continue to be the most valued items in academic evaluations.
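Purely by way of illustration - the notation is ours, not the manuals' - this paradigm reduces evaluation to ratios of countable outputs over countable inputs, of the type:

\[ P_{\text{papers}} = \frac{N_{\text{papers}}}{\text{FTE researchers} \times \text{year}}, \qquad P_{\text{patents}} = \frac{N_{\text{patents}}}{\text{R\&D expenditure}} \]

Anything that cannot be placed in a numerator or denominator of this kind - social relevance, regional context, collaboration networks - simply falls outside the measurement.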
Of all the criticisms and questions that this evaluation paradigm has received, we highlight the following:
The S&T product is reduced to a set of measurable and quantifiable results, undervaluing or directly leaving aside aspects such as social relevance, intervention in public management, regional integration, environmental impact, public communication, among others.
Scientific results and technological products tend to appear as individually authored, hiding the social dimension of the activity and thus neglecting the role of research teams and groups, scientific networks and collaboration.2
Institutional scenarios, their changes over time and their regional particularities are not taken into consideration, ignoring the fact that the input-output relationship is necessarily mediated by these realities and their specificities.
It reinforces the positivist image of S&T by implicitly assuming a single methodology for the production and circulation of knowledge, and considering that all fields of knowledge can be evaluated according to a standard parameter.
The simplified and decontextualised use of bibliometric indicators such as the impact factor and the h-index for the evaluation of scientific production promotes unequal competition between disciplines and regions, favouring and reinforcing the power of oligopolistic databases and publishers.3 (A minimal sketch of one such index follows this list.)
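To make this reductionism concrete, the following is a minimal sketch of the standard h-index computation (illustrative code of our own, not drawn from any cited source): a researcher has index h if h of their papers have been cited at least h times each.

def h_index(citations):
    # Largest h such that at least h papers have h or more citations each.
    ranked = sorted(citations, reverse=True)  # citation counts, descending
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # this paper still supports a larger h
        else:
            break  # the remaining papers have too few citations
    return h

# Two very different publication records collapse to the same score:
print(h_index([50, 40, 30, 3, 2, 1]))  # h = 3
print(h_index([3, 3, 3]))              # also h = 3

As the example shows, a record with three highly cited papers and a record with three modestly cited papers receive exactly the same score: the index discards discipline, citation culture, region and content, which is precisely the decontextualisation criticised above.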
Several international statements and manifestos, widely supported by individual scientists as well as scientific associations, institutions and journals, question this assessment methodology (DORA, 2012; Hicks et al., 2015). Nevertheless, evaluation by products remains hegemonic. This is partly due to its practical merits: simplicity of application, ease and economy of data collection, and comparability of results, which facilitates resource-allocation decisions by managers. These advantages tend to prevail over alternative approaches that propose more elaborate and specific strategies but are, for the same reason, more complex to implement. On the other hand, and by way of hypothesis, we suggest that the persistence of evaluation by products also corresponds to the interests of the global powers assembled in the OECD, since this logic of evaluation tends to strengthen mainstream science in the central countries and scientism in the periphery (Kreimer, 2011).
In our country, as in other parts of the world, the prevailing evaluation paradigm has been increasingly questioned, especially during the expansion of the S&T sector promoted by the governments of Néstor Kirchner and Cristina Fernández (2003-2015). During those years there were debates, reflections and proposals that took shape in different documents and regulations.
The most important of these were those drafted by the Ministry of Science and Technology between 2011 and 2012,4 the CONICET evaluation regulations (2008) and their successive modifications, and the creation of an inter-institutional commission for the Humanities and Social Sciences (CIECEHCS, 2014). As a result, new projects and financing lines associated with technological, social and productive development were designed and proposed during those years: mainly the Technological and Social Development Projects (PDTS) and sectoral funds such as FONTAR, FONSOFT and FONARSEC, which incorporated different evaluation mechanisms.5 More recently, new criteria referring to strategic issues and institutional strengthening were also adopted for entry to research careers and scholarships at CONICET. These initiatives - which we will not analyse in this document - represent experiences to be taken into account when thinking about changes in evaluation mechanisms.
In our opinion, the issues we analyse below constitute central problems in the evaluation of scientific and technological activity in our country. They arise recurrently in the debates, although not always explicitly, and persist despite proposals for change. They are:
Excessive bias towards the quantitative. The uncritical application of evaluation by products has turned the quantification of research results into an almost exclusive indicator of scientific excellence, despite the fact that the need to prioritise qualitative assessment has been pointed out repeatedly. The negative consequences include: a) the orientation of research towards "fashionable" topics with a greater chance of being published in mainstream journals, to the detriment of local or regional issues; b) the strengthening of traditional disciplines and institutions, and of already consolidated groups, undermining those in formation or located outside the central geographical areas; c) the entrenchment of the logic of "publish or perish", which leads to a productivist and unnecessary multiplication of publications, often superfluous and of little scientific value; d) the bureaucratisation of scientific activity and the superficiality of evaluation work, which is reduced to counting papers and applying pre-assembled bibliometric indices - all of which generates a growing unease experienced as labour alienation.
Anonymity in the evaluation of resources and people. Not knowing who evaluates is no guarantee of quality and, on the contrary, can be a source of discretionary behaviour. Since public resources are involved, anonymity in the evaluation of R&D projects, research funds and permanent staff violates National Laws 25,200 and 27,275, which guarantee, respectively, the necessary transparency of any evaluation instance in State bodies and the right of access to public information. The main consequence of anonymity is that it can lead to irresponsibility in the exercise of the power to judge, resulting in ill-founded opinions, disguised nepotism, arbitrariness and other abuses, often hidden and "protected" from public scrutiny. The resistance that still persists in some of the institutions that make up the S&T complex is inexplicable, all the more so when in some organisations of the sector - for example, national universities - evaluations are already public.6,7
Lack of coherence with policies and plans, and overlapping of evaluation systems. In many organisations, the evaluation systems and criteria are inconsistent with the policies and plans that those same organisations promote, as well as with national and regional S&T plans. This is one of the problems researchers suffer most in their daily work, and one of the most widely acknowledged by the institutions themselves.8 Its origin lies in the disarticulation and lack of inter-institutional and inter-ministerial coordination among the organisations that make up the S&T complex (CONICET, universities, decentralised agencies, ANPCyT, etc.), since each implements its own mechanisms and systems and information is generally not shared.
The most important consequences are: a) the lack of coordination between agencies allows the evaluation criteria of a given instance to be inconsistent with public plans and policies, sometimes even contradicting them, and produces overlapping deadlines and calls for proposals with occasionally incongruent demands and objectives; b) since evaluations are not shared, each organisation carries out its own, so the same actor, project or institution is evaluated multiple times, unnecessarily overloading the system - it is common, for example, for one organisation to evaluate a researcher's promotion, another the accreditation of a project in which the same researcher participates, and a third the awarding of funds; c) the use of multiple virtual platforms for uploading background information (CVar, SIGEVA CONICET, SIGEVA of the universities, Incentives, etc.) generates excessive bureaucratisation and an avoidable work overload.
Primacy of ex ante evaluation. There is an almost absolute predominance of ex ante evaluation of research proposals, be they work plans or projects. Intermediate and ex post evaluation is generally reduced to fulfilling formal steps - filling out forms in due time and form - that do not usually affect the course of projects, and the absence of in situ evaluations is particularly noticeable.9 The only intermediate, in situ and ex post evaluations actually carried out are those linked to financial oversight, as a form of fiscal control focused on the rendering of expenses and the inventory of assets. As a result, there is a gap between what a project or work plan promised and what is actually done, and the opportunity to use results to reorient activities, save budget and improve the quality and relevance of the S&T carried out in the country is wasted.
Merely declarative use of social utility criteria. Although project application forms include items in which the areas of impact, the social usefulness of the results and their relevance must be made explicit, these are rarely taken into account. Impact and social utility are invoked merely declaratively, for the purpose of getting the project approved, and then have no practical effect, since S&T production is ultimately evaluated through traditional products (papers and patents). Other forms of communicating results are not valued, and non-traditional channels of dissemination - alternative media, open access publications, web dissemination, outreach activities and other means that facilitate the social use of knowledge - are sometimes undervalued outright. This is an expression of the split between what is said and what is done, which is particularly important when thinking about how to link S&T to the social problems of our country.
Exclusively peer evaluation. Peer evaluation is unanimously considered a guarantee of the quality of S&T work. However, evaluation carried out exclusively by specialists has negative consequences: a) it tends to follow current international evaluation guidelines rather than national problems, favouring the adoption of mainstream research topics and methodologies to the detriment of national and regional problems and traditions; b) it generates corporate endogamy, since peer-controlled evaluation gives rise to logics of group reproduction (exchange of favours, status and prestige) rather than dynamics of social problem-solving; c) in projects and plans with social impact, non-academic actors (workers, farmers, communities and others, whether beneficiaries or possible victims of S&T development) are generally not included at any stage of the evaluation, and the participation of economic actors (companies) and political actors (State bodies, municipalities) is very rare.
Concepts assumed to be objective and universal. Evaluation processes are often based on criteria that invoke expressions such as quality, excellence, productivity, relevance or impact. Far from being terms with generally agreed meanings, these are strongly value-laden concepts with multiple senses. By not making their definitions explicit (that is, what exactly the evaluator should assess), it is assumed that a common, objective and universal idea exists in this regard. This implicitly leads to the adoption of the dominant criteria emanating from the core countries and propagated throughout the world by international organisations (IDB, WB, OECD). The growing effort to link the S&T sector with the business sphere should also alert us to the need to define evaluation criteria clearly, to prevent the private sector from shaping the State's research guidelines.
Lack of transparency in evaluation. Despite progress in recent years, there are still evaluation instances, mechanisms and procedures that are not sufficiently public and transparent: ambiguous criteria, non-explicit parameters and scores, evaluators and evaluees unaware of the assessment grids to be used, and inaccessible reports, among others, are frequent manifestations of this problem.
On the basis of the above diagnosis, the Cátedra Libre CPS proposes the following actions, which we consider necessary to guide changes in evaluation processes:
Finally, and as part of an ethic that we want to promote, we believe it is necessary to move towards evaluation that is collaborative and formative, inclusive and plural, contextual and situated. Although there will always be an inevitable competitive side, insofar as limited resources and positions are in dispute, the purpose of evaluation must not be lost sight of.