
EVALUATION AND EFFICIENCY OF RESEARCH AND EXPERIMENTAL DEVELOPMENT (R&D) INSTITUTIONS AND UNITS
Simon Schwartzman and Amaury de Souza
March, 1984
Summary:
1. The question of R&D evaluation
2. R&D evaluation in developing countries
3. Conflicting criteria for R&D evaluation
4. Who should evaluate?
5. Objective vs. subjective indicators of performance
6. The uses of R&D evaluation
Notes
1 - The question of R&D evaluation
Evaluation is a crucial component of any R&D activity as well as one
of the most difficult tasks to handle. Whether basic research or experimental
development is concerned, scientific work is expected to be performed at
the highest possible level of competence. R&D efforts are justified
by the expectation that they will reach beyond what is already known and established,
opening up new horizons for knowledge and creating novel and better ways of
dealing with the world and using it for the benefit of mankind. To work
at the frontier of knowledge, making the best use of available technical
and intellectual resources, is part of the very definition of R&D.
Scientific research and development activities, however, are not easily
amenable to objective evaluation. In the first place, working at the frontier
of knowledge, scientists can rightly claim that it is up to them to evaluate
their own work, or, at least, that no such evaluation can be carried out
without their active participation. It follows that scientific research
entails a necessary element of self-evaluation and control, which may lead
in turn to difficult problems of professional and institutional self-protection
and deception.(1)
Second, scientific research and development are usually open-ended activities,
in the sense that it is difficult or even impossible to determine in advance
their final product. There are, of course, significant differences between
more academic or "pure" and more applied fields of research. However,
even when precise specifications of final products do exist, it is normally
impossible to fully anticipate the steps required to bring them about, since
such steps constitute an integral part of the research effort.
Finally, there is hardly any consensus as to the specific meaning of "performance"
and "quality." Thus, the problem of scientific research evaluation is not
only that there are no clear and indisputable judgmental parameters and
procedures, but also that scientists, policymakers, and the relevant public have
different and often contradictory expectations regarding the outcomes of
scientific work as well as the standards of excellence by which it should
be evaluated.
2 - R&D evaluation in developing countries
Research and development evaluation in countries which do not have a firmly
established science and technology tradition poses further difficulties.
It remains a matter of scholarly dispute precisely which mechanisms
led countries such as Germany, England, France, and the United States
to play a central role in the development of modern science and technology.
It is generally accepted, however, that the emergence of advanced scientific
centers was historically linked with the existence of strong and fairly
autonomous universities, intensive industrialization and the diffusion of
social values which stressed the centrality of reason and empirical knowledge
and placed a premium on individual effort.(2)
Government support and sponsorship of scientific and technological research
was another important component of past national experiences, but not necessarily
the major one. In modern times, however, there is a clear parallel between
the roles played by governments in science and technology and in economic
development. Latecomers to the industrial world display higher levels of
state participation in the economy than first comers(3).
The same is true of countries that developed scientific and technological
capabilities in the late nineteenth and early twentieth centuries, such as Japan,
the Soviet Union, and India.
After World War II, the notion that science and technology could not exist
and develop without government support and guidance became widely accepted.
The war effort demonstrated that scientific and technological advances were
key ingredients of economic and military power, and economic theories were
put forward stressing the role of education and technological change as
crucial components of economic growth. Moreover, as science and technology
expanded the radius of their potential contribution to societal welfare, governments
in most countries were hard pressed to increase investments in R&D activities
and higher education. Government involvement in science and technology was
also furthered by the growing awareness that public scrutiny and control
were needed to prevent the misuse of scientific knowledge and technical capabilities.
The increasing role of government in science and technology also raised
questions of efficiency that had been traditionally associated with the
production of goods and services. In a free market it is expected that open
competition will naturally drive out the less efficient and less competent(4).
Sociologists of science have used this market analogy to explain how the
scientific community is organized, and how resources and awards of prestige
and authority are distributed through mechanisms of competition and self-regulation(5).
Carrying the analogy further, however, leads to the recognition that scientific
research and development are also liable to monopoly or oligopoly growth
and stagnation. In addition, the existence of an international science and
technology market demands purposeful intervention to reverse or compensate
for the ensuing geographic concentration of talent and resources.
If governmental intervention seeks to redress imbalances produced by open
competition, then standards of evaluation derived from the operation of
the free market cannot be readily applied to protected scientific and technological
activities. Indeed, unhampered competition between national R&D systems
unequally endowed with scientific and technological capabilities is likely
to entail the destruction of incipient local efforts toward scientific and
industrial growth and its potential utilization to the benefit of developing
societies. At the same time, protection against market competition is often
transformed into complacency toward incompetence, irresponsibility, and
waste. Unproductive research undertakings may drain public resources for
long periods, and barriers to technology transfers or the closure of markets
to foreign products may substitute the protection of industrial inefficiency
for the original goal of stimulating development.
It is possible to argue that the creation and development of modern science
and technology institutions and policymaking bodies was the paramount concern
of developing countries in the 1960s and 1970s and that considerations of
effectiveness were accordingly granted low priority. Not infrequently, R&D
evaluation was seen as premature, hence harmful to ambitious programs of
institution building and scientific development. An opposite movement may
be in the offing in the 1980s. Faith in the power of governmental planning
and intervention has decreased(6). While science policymaking agencies are now to
be found everywhere, there are persistent doubts as to whether their cost
and the complexity they bring to bear on scientific and technological
activities are justified by tangible results. Furthermore, the number of
research and experimental development institutions and units has greatly
increased in many developing countries, but their average productivity ranks
low by most standards, and there is mounting evidence to the effect that
they can hardly be expected to improve just through more generous funding
or technical assistance(7).
The uncertainty that now surrounds R&D efforts in developing countries
is likely to affect the volume of resources devoted to science and technology,
although their levels of R&D expenditures still rank far below the developed
world's. Viewed in this context, sustained support for scientific and technological
development will be hard to come by in developing countries unless current
efforts are properly evaluated and reoriented.
3 - Conflicting criteria for R&D evaluation
Scientific research and experimental development include a plurality of
activities and objectives that can hardly be subjected to the same judgmental
criteria. Empirical studies show that research units and institutions hold
different objectives and orientations, making it impossible to gauge their
effectiveness in terms of a single criterion(8).
Scientific effectiveness entails at least the following dimensions:(9)
- academic effectiveness, or the contribution toward the development
of scientific knowledge, as indicated by scientific publications, participation
in professional meetings and the like;
- technological effectiveness, or the ability to bring about new experimental
devices and products;
- training effectiveness, or the capacity to develop new researchers and
provide technical and scientific training;
- social effectiveness, or the contribution toward the solution of
socially pressing problems; and
- economic effectiveness, or the ability to make products with higher
market value than their R&D cost.
The fact that dimensions of R&D effectiveness are uncorrelated, as several
empirical studies have concluded, means that different research institutions
and units emphasize different goals and that an outstanding performance
with respect to any one objective is no indication of similar performance
on all others. It is indeed a common mistake to single out one dimension
of effectiveness as paramount and then use it as the sole standard for evaluation
purposes. Planners will often demand economic or social effectiveness while
dismissing the quest for academic or technical excellence as self-serving
and wasteful; academics will argue that the production of new knowledge
is the precondition of all scientific and experimental work; and university
authorities may insist on the evaluation of research groups in terms of
their contribution to teaching. Needless to say, the opposite mistake is
also frequently made: it consists in accepting the self-defined goals
of research units or institutions as the only criteria for their evaluation.
It is undeniable, however, that a university department that refuses to
train students or to spend time doing routine work for outside clients, or
a technological research center that claims to be doing strictly academic
work, may well represent misplaced efforts and wasted resources from
the standpoint of their external supporters.
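The argument that effectiveness dimensions are largely uncorrelated can be made concrete with a small numerical illustration. The sketch below, in Python, uses entirely hypothetical scores for three fictitious research units; it is not drawn from the empirical studies cited here, and it merely shows that rankings shift according to which dimension an evaluator chooses to privilege.

    # Hypothetical effectiveness scores (0-10) for three fictitious research
    # units along the five dimensions listed above. Illustrative only.
    units = {
        "University lab A":    {"academic": 9, "technological": 3, "training": 8, "social": 4, "economic": 2},
        "Industrial center B": {"academic": 3, "technological": 9, "training": 2, "social": 5, "economic": 8},
        "Public institute C":  {"academic": 5, "technological": 5, "training": 6, "social": 9, "economic": 4},
    }

    def rank_by(dimension):
        """Rank units by a single effectiveness dimension, highest first."""
        return sorted(units, key=lambda name: units[name][dimension], reverse=True)

    for dimension in ("academic", "economic", "social"):
        print(dimension, "->", rank_by(dimension))
    # Each dimension yields a different ordering, which is why singling out
    # any one of them as the sole standard misrepresents the rest.

Even in this toy form, collapsing the five scores into a single index requires a weighting of dimensions that cannot be defended on purely technical grounds; the weights themselves express a policy choice.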
4 - Who should evaluate?
The search for proper criteria is but one aspect of a broader set of concerns
regarding the evaluation of R&D activities. Equally compelling is the
question of who is entitled to evaluate and how wide a latitude of choice
and decisional authority should be granted to evaluators.(10)
This question stems from the very nature of R&D efforts. The highly
specialized character of scientific and technological work entails a strong
measure of self-evaluation and autonomy, since the adequacy of its goals
and procedures can hardly be judged by laymen. In addition, scientific researchers
normally form a corporate group whose self-esteem and professional pride
tend to make them refractory to outside interference.
As a result, attempts by planning or regulatory agencies to bring scientific
research under close control are liable to fail either because these agencies
seldom possess the necessary expertise or because scientists will naturally
resist attempts at external control by moving to other jobs. Agencies can,
of course, hire scientists into administrative staffs and keep others under
control by granting them work privileges and monetary advantages; but the
best qualified researchers, who tend to be strongly committed to their academic
communities and enjoy access to a wider range of professional choice, are
not easily attracted.(11) Clearly, academic commitment is likely
to vary across fields of knowledge and different R&D activities, making
work in policymaking bodies probably more attractive to, say, scientific
researchers devoted to technological development than to those working in
less applied areas of research.
An alternative to R&D evaluation from without is the peer review system,
widely adopted for the selection of manuscripts submitted for publication
in scientific journals, the distribution of research grants through funding
agencies, and the hiring and promotion of faculty.(12)
Needless to say, peer review systems are far from foolproof. But they
can easily muster the required expertise and are largely accepted as legitimate
by those subject to their evaluation(13).
A crucial problem of peer review systems is, of course, the selection of constituent
members. Such bodies are not immune to acting as representatives of the corporate
interests of scientific researchers, particularly when their members are
nominated or elected by their own communities. Under such circumstances,
peer review bodies are bound to guide their decisions by the interests of
local constituencies which do not always coincide with more general criteria
of scientific quality or effectiveness. To avoid this pitfall, it is necessary
that members of peer judgment bodies be selected as independent and free
agents, whose decisional authority rests solely upon their professional
competence. It is equally necessary that they be known and respected by
their colleagues and hold a public image of expertise and probity if their
recommendations are to be at the same time consequential and legitimate.
The creation of peer review systems is thus a delicate operation, requiring
broad consultation and independent judgment. In fact, the process by which
such bodies are created is as important as the individual prestige of their
members in securing the authoritative nature of their recommendations.
A related problem has to do with the natural inclination of peer review
systems to base their evaluation of ongoing activities on previous achievements
of research groups and institutions. As a result, they are prone to reinforce
the tendency toward concentration of resources and talent within national
R&D systems. Such an inclination may be curbed or at least partially redressed
by means of quota systems which favor certain geographic areas, ethnic groups,
types of institutions, and fields of knowledge. The determination of quotas
seldom springs from purely technical or scientific criteria, but there is
no reason why they should not be established in close consultation with
specialists.
5 - Objective vs. subjective indicators of performance
Objective indicators of performance have often been advocated as an alternative
that could help reduce some of the biases stemming from the unavoidable
subjectivity of peer review mechanisms. There exists a consensus, however,
that objective indicators cannot replace qualitative evaluation of scientific
procedures and results by qualified reviewers.
Two major types of objective indicators of scientific performance have been
developed, one based on cost-benefit considerations and the other on estimates
of scientific productivity. Cost-benefit estimates are approximations to
efficacy measures. They typically compare the cost of technological innovation
with its potential economic benefit as measured by the final product's market
value, thus providing a standard for R&D investment decisions.
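As a purely illustrative sketch, the comparison amounts to computing a benefit-cost ratio for each candidate project; the project names and figures below are hypothetical and ignore, among other things, the discounting of future benefits.

    # Illustrative benefit-cost screening of hypothetical R&D projects.
    # All figures are invented and expressed in the same currency unit.
    def benefit_cost_ratio(expected_market_value, estimated_rd_cost):
        """Expected benefit per unit of R&D cost; ratios below 1.0 signal
        that projected market value does not cover the development cost."""
        return expected_market_value / estimated_rd_cost

    candidates = [
        ("New alloy process", 12_000_000, 4_000_000),
        ("Pilot seed variety", 3_000_000, 2_500_000),
        ("Instrumentation upgrade", 900_000, 1_200_000),
    ]

    for name, value, cost in candidates:
        print(f"{name}: benefit/cost = {benefit_cost_ratio(value, cost):.2f}")

The usefulness of such a ratio rests entirely on the quality of the two estimates that enter it, which is precisely where the limitations discussed next arise.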
Cost-benefit analysis has proved to be most useful where expensive and very
specific technological development projects are concerned. This approach,
however, is of limited value with regard to other areas of scientific development.
For one thing, it is practically impossible to estimate eventual spinoffs
of R&D programs or future market receptivity to products that are not
yet in existence. R&D project costs are also difficult to estimate,
to say nothing of external environmental effects which are usually not even
taken into account in many technological development programs.(14)
Lastly, long-run technological projects deal with future configurations
of social and economic needs which are difficult to predict. In such cases,
cost-benefit estimates should merely provide general guidelines for decisions
that have an essentially subjective dimension.
Empirical indicators of scientific productivity, some of which can be rather
sophisticated, have also been developed in recent years. Such indicators
are measures of effectiveness.
One well-known productivity indicator is the number of publications produced
by scientists, institutions, or research groups(15).
More elaborate versions, which weight physical production by quality, have
also appeared, in particular citation indexes through which the impact of
specific scientific contributions on the scientific community can be estimated.
Other common indicators include participation in international meetings,
academic qualifications and awards, and the like.
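A minimal sketch of how such productivity indicators are typically assembled is given below; the publication records are hypothetical, and the raw count and the citation-weighted count are only two of many possible constructions.

    # Hypothetical publication record of a research group: each entry is
    # (publication identifier, number of citations received).
    publications = [("P1", 0), ("P2", 12), ("P3", 3), ("P4", 45)]

    raw_count = len(publications)                       # simple productivity count
    total_citations = sum(c for _, c in publications)   # crude impact measure

    print("Publications:", raw_count)
    print("Total citations:", total_citations)
    print("Citations per paper:", round(total_citations / raw_count, 2))
    # The two measures can rank the same groups quite differently, which is
    # one reason why neither can replace qualified peer judgment.
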
The concern with the development of objective and quantitative measures of
scientific performance, however, should not obscure the fact that such indicators
rest, in the last instance, on subjective evaluations by specific segments
of the scientific community, be they editors of scientific journals, academic
departments, or peer judgment groups. Therefore, however useful they may
be as criteria for the evaluation of a whole area of R&D activities,
such indicators are certainly inappropriate as guides for the evaluation
of specific projects or research groups. "Number of publications,"
for example, is too rough a measure of scientific merit, as its meaning is
contingent upon academic discipline and personal style. The meaning of citations
can likewise vary, and it is not rare for articles to be cited merely because
they happen to report a widely used technique rather than a major
scientific contribution. Last but not least, the selection of journals and
papers that are to be indexed and included in these measures is always subject
to national or linguistic biases that are practically impossible to control.
Objective indicators, then, seem to serve two main purposes. On the one
hand, they are useful for the evaluation of long-term developments and may
help detect R&D efforts that have lagged behind or that are of exceptional
quality. On the other hand, objective measures of scientific performance
can provide review committees with useful points of reference for making
the comparisons which form the basis of peer evaluation.
6 - The uses of R&D evaluation
Research and development evaluation can serve important purposes. For planning
and science policy agencies, it can provide operational guidelines that
are necessary if limited resources are to be properly distributed to a large
number of applicants. At higher levels of authority, it can ensure that
resources are being applied according to the institution's goals. In general,
it can lead to better utilization of existing resources and contribute to
the improvement of scientific and technological work.
R&D evaluation may also have other less direct but perhaps more important
consequences. It can bestow legitimacy upon national R&D systems, enhance
their national and international prestige, and assure access to needed resources.
Moreover, an established system of evaluation extensively based on peer
review mechanisms provides scientific researchers with a sense of efficacy
and participation in the conduct of their own affairs. The establishment
of legitimate standards of evaluation can also have important educational
effects on national R&D systems. By providing criteria for the ranking
of research groups and institutions along a gradient of scientific excellence,
and by establishing the associated system of rewards, R&D evaluation
mechanisms help create and crystallize models of scientific performance.
Therefore, far from being a mere technical device for resource distribution,
evaluation can play an important role in the improvement and development
of national R&D systems. In reality, it may well be that an institutionalized
and respected system of R&D evaluation constitutes a crucial condition
for sustained support for science and technology in modern societies.
Notes
1. The notion that the scientific enterprise is strongly
tainted by corporate self-protection and deception appears in William Broad
and Nicholas Wade, Betrayers of the Truth: Fraud and Deceit in the Halls
of Science (New York: Simon and Schuster, 1982). For a rebuttal, see
Henry A. Bauer, "Betrayers of the Truth: A Fraudulent and Deceitful
Title from the Journalists of Science," 4S Review, Vol. 1,
N. 3, Fall 1983, pp.
2. The conditions for the emergence of modern science
are discussed in the seminal works of Robert K. Merton, Science, Technology
and Society in Seventeenth-Century England (New York: Harper &
Row, 1970) and Joseph Ben-David, The Scientist's Role in Society: A
Comparative Study (Englewood Cliffs, NJ: Prentice-Hall, 1971), pp. 17-23.
3. See A. Gerschenkron, Economic Backwardness in
Historical Perspective (Cambridge, Massachusetts: Harvard University
Press, 1962).
4. For the importance of market mechanisms and alternative
ways of fostering efficiency, see A. O. Hirschman, Exit, Voice, and
Loyalty: Responses to Decline in Firms, Organizations, and States (Cambridge,
Massachusetts: Harvard University Press, 1970).
5. The classic reference is M. Polanyi, Personal
Knowledge (Chicago: The University of Chicago Press, 1958). See also Robert
K. Merton, "The Normative Structure of Science," in R. K. Merton, The
Sociology of Science (Chicago: The University of Chicago Press, 1973).
6. See Naomi Caiden and Aaron Wildavsky, Planning
and Budgeting in Poor Countries (New York: John Wiley & Sons, 1974).
7. On the quality of third world research, see Eugene
C. Garfield, "Mapping Science in the Third World," Science
and Public Policy, June 1983, pp. 112-127.
8. "In most general terms, it can be said that efficiency
is a concept intrinsic to Science and Technology which measures how far
resources invested in R&D have been productive within reasonable time
limits; it can theoretically be judged by input-output ratios whereas
effectiveness is a concept extrinsic to S&T which gauges the output
of R&D both qualitatively and quantitatively against socioeconomic goals
or objectives pursued." See Unesco, Science, Technology and Governmental
Policy - A Ministerial Conference for Europe and North America, Minespol
II, 1979, p. 42.
9. The multidimensionality of research performance has
been one of the most consistent results of the "International Comparative
Study on the Research Performance of Research Units" which is being
conducted worldwide under the coordination of Unesco. The following classification
is adapted from Frank Andrews (editor), Scientific Productivity
(Cambridge University Press and UNESCO, 1979), p. 39.
10. Stuart S. Blume argues that there is a growing realization
that "the right priorities are those which emerge from a properly constituted
committee or other forum or process". The same applies to R&D evaluation.
See his "Determining Priorities for Science and Technology - A Review
of the Unesco Method and its Application", paper presented to the UNESCO
Seminar on Evaluation of Priority Determination Methods in Science and Technology,
Paris, 23-30 September 1983.
11. The notion that academic work entails a strong component
of discipline and collegially oriented authority, developed by Burton R.
Clark in his analysis of higher education systems, applies as well to the
scientific research community. See his The Higher Education System:
Academic Organization in Cross-National Perspective (Berkeley and Los
Angeles: University of California Press, 1983), especially Chapter 4, "Authority".
12. See Robert K. Merton and Harriet Zuckerman, "Institutionalized
Patterns of Evaluation in Science", in Robert K. Merton, The Sociology
of Science, edited by Norman W. Storer (Chicago: The University of
Chicago Press, 1973), pp. 460-496.
13. On the limitations of peer review mechanisms, see
Stephen Cole, Jonathan R. Cole and Gary A. Simon, "Chance and Consensus
in Peer Review", Science Vol. 214. N. 20. November, 1981.
pp.881-886. See also Rustum Roy. "An Alternative Funding Mechanism",
Science. Vol. 211, N. 27, March, 1981. p. 1377 (editorial) and
the ensuing debate in the same journal.
14. This is the object of what is known as "technological
assessment", constituting a wholly different area of R&D evaluation.
15. This is part of the growing field of "scientometrics",
originally associated with the work of Derek J. de Solla Price. See his Little
Science, Big Science (New York: Columbia University Press, 1963).