Indicators for Measurement of the Knowledge Base

Loet Leydesdorff

Science & Technology Dynamics, University of Amsterdam

Amsterdam School of Communications Research (ASCoR)

loet@leydesdorff.net; http://www.leydesdorff.net

 

Abstract

The knowledge infrastructure of an economy can be viewed in terms of the networked relations among universities, industries, and governmental agencies. While both the links and the nodes of the networks can be measured by using various indicators (e.g., patents, hyperlinks, citations), the knowledge base can be considered as a result of the interacting fluxes of communications through these networks. This article provides a state-of-the-art review of the indicators used for these measurements in science and technology studies from the perspective of developing indicators for knowledge-based innovations.

 

  1. Introduction

 

The competitive advantages in a knowledge-based economy can no longer be attributed to a single node in the network. The network coordinates the subdynamics of (i) wealth production, (ii) organized novelty production, and (iii) private appropriation versus public control. Political economies are upset and increasingly reshaped by knowledge-based developments (Schumpeter [1939], 1964; Nelson & Winter, 1982). These dynamics are complex and therefore cannot be expected to contain central coordination (Leydesdorff & Van den Besselaar, 1994; Krugman, 1996).

 

The knowledge infrastructure of knowledge-based systems of innovations, however, can be refocused (operationalized) in terms of networks of university-industry-government relations (Etzkowitz & Leydesdorff, 2000). Knowledge flows both within organizations and across institutional boundaries. In order to study organized knowledge production, one first has to distinguish analytically between the intellectual and the institutional organization of social systems. The intellectual organization evolves over time and across boundaries, whereas the institutional organization provides structural coordination at each moment in time.

 

The intellectual ‘knowledge base’ can also be considered as an overlay of the expectations and exchanges carried by institutional manifestations (Whitley, 1984; Luhmann, 1984 and 1990; Leydesdorff, 2001). Network arrangements provide the background for knowledge flows (Castells, 1996; David & Foray, 1994; Freeman & Perez, 1988). In a knowledge-based economy, the institutional arrangements among knowledge organizers (e.g., universities, industries, and governmental agencies) can become a necessary condition not only for producing, but also for retaining wealth from knowledge (e.g., Popper & Wagner, 2002).

 

In other words, the ‘knowledge base’ of an economy generates a dynamics different from that of a political economy. For example, pharmaceutical corporations can nowadays no longer carry the costs of biotechnological innovations without relying on knowledge networks (Owen-Smith et al., 2002). Corporate boundaries increasingly function as mechanisms for the appropriation and shielding of competitive advantages from the knowledge fluxes through such networks. However, systems that contain knowledge (e.g., knowledge-based innovation systems) cannot be considered as given or immediately available for observation. The analyst has to specify the relevant system (e.g., a national system, a sector or a technology) before the latter can be indicated or measured. For this specification, the quantitative analysis remains thoroughly dependent on the qualitative hypotheses.

 

For example, one can raise the question of whether the so-called “Mode 2”-type of knowledge production (Gibbons et al., 1994) has indeed become dominant in the production of scientific knowledge. “Mode 2” concentrates on the interaction, rather than the distinction, between fundamental principles and applied knowledge. What would count as a demonstration of this dominance, and what would count as a counterargument? Can instances be specified in which one would also be able to observe processes of transition between the two modes? What should one measure in which instances, and why?

 

  2. Institutional and functional differentiation

 

The knowledge base of an economy is operational. During its evolutionary reconstruction, elements from different sources are recombined under the pressure of economic competition. The network of university-industry-government relations can be considered as an institutional “knowledge infrastructure” that carries a system of operations containing science, technology, and knowledge-based innovations.

 
Figure 1

Institutional and functional differentiation in the Internet age

(from: Leydesdorff & Scharnhorst, 2002)

 

2.1 Technology Indicators

 

Patent databases provide us with the historically oldest indicators. The U.S. Patent and Trademark Office makes all its patents available on-line at http://www.uspto.gov/ with images for the period 1790-1975, and searchable texts since 1975. The World Patent database can be searched from the website of the European Patent Office at http://ep.espacenet.com/.

 

A patent contains a wealth of information. For example, the patent may cite previous patents, or may itself be cited. The Patent Examiner provides citations in order to document the place of the claimed novelty in relation to the state of the art (Granstrand, 1999).

 

In science-based fields like biotechnology, patents are also linked to the scientific literature by citations. The Bayh-Dole Act of 1980 granted publicly funded research institutes in the United States the right to patent their inventions. Since then, university patenting has been expanding rapidly (Cohen, Nelson, & Walsh, 2002). Figure 2 exhibits the participation of universities in U.S. patenting by using the word “university” as a search term among the patent assignees filed in this database.

 

Figure 2

Percentage of U.S. Patents containing the term “university” among the assignees since the Bayh-Dole Act of 1980

 

Figure 3 shows the outcome of a more sophisticated analysis using these data. The 16 U.S. patents of 2001 with “stem cell” in their title are here analyzed in terms of their scientific knowledge base. To this end, 53 meaningful title words in these 16 patents are combined with 47 title words in the 265 articles from the scientific literature that these patents cite. The title words are limited to those that occur at least eight times.

 

 

Figure 3

Mapping the knowledge base of 16 U.S. patents with the word combination “stem cell” in their title in 2001

 

The figure shows that a cluster of (white) patent words forms a core surrounded by a corona of scientific title words (in brown). Other words in patents are more loosely connected. Thus, the knowledge base of these patents can be made visible more precisely.
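The selection of title words and the counting of their co-occurrences can be sketched as follows. This is a minimal illustration, not the procedure actually used for Figure 3: the tokenizer, the stop-word list, the function names, and the choice to count a word once per title are all assumptions; only the threshold of eight occurrences is taken from the text.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; how "meaningful" words were selected is not specified.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def title_words(title):
    """Lower-case a title and return its set of non-stop words."""
    return {w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOP_WORDS}


def frequent_words(titles, min_count=8):
    """Words that occur in at least `min_count` titles (assumed counting rule)."""
    counts = Counter(w for t in titles for w in title_words(t))
    return {w for w, n in counts.items() if n >= min_count}


def coword_counts(titles, vocabulary):
    """Count how often two vocabulary words co-occur in the same title."""
    pairs = Counter()
    for t in titles:
        words = sorted(title_words(t) & vocabulary)
        pairs.update(combinations(words, 2))
    return pairs
```

The resulting pair counts form a symmetric co-word matrix that could serve as input to a network visualization of the kind shown in Figure 3.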

 

2.2 Science Indicators

 

The scientific literature was the first system to be subjected to bibliometric analysis (Garfield, 1979; Leydesdorff, 1995; Narin, 1976). The Science Citation Index and its counterparts in both the social sciences and the arts and humanities have become the standard for scientometric analyses to such an extent that funding decisions are often influenced by the status of research groups in these databases (Van Raan, 1988).

 

In addition to ranking authors and institutions in terms of numbers of publications and/or citations, the databases also enable us to “map the sciences” using co-authorship relations, co-citations, co-words, etc. These mappings can be done in terms of institutional units (research groups, institutes, countries) using address fields or in terms of cognitive units using the aggregated citation relations among scientific journals. Figure 4 provides a visualization of the international network of coauthorship relations taking countries as units of analysis. Taking a threshold of 50 coauthorship relations, it shows that 18 countries belong to a core set. A second group is related to this core set directly, while the largest group is peripheral to this network.

 

 

Figure 4

Core/periphery analysis of international co-authorship relations based on the Science Citation Index 2000 (coauthorship threshold > 50; from: Wagner & Leydesdorff, in preparation)
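A three-way grouping of this kind can be sketched in a few lines. The sketch below is only one possible operationalization and is not claimed to be the procedure behind Figure 4: it assumes that the core is the k-core of the network that remains after thresholding, that the second group consists of countries linked directly to a core member, and that all other countries are peripheral. The function names and the parameter `k` are hypothetical; the threshold of 50 is taken from the text.

```python
def threshold_links(weights, threshold=50):
    """Keep only country pairs with more than `threshold` coauthorship relations."""
    graph = {}
    for (a, b), w in weights.items():
        if w > threshold:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def k_core(graph, k):
    """Iteratively drop nodes that have fewer than k remaining neighbours."""
    nodes = set(graph)
    changed = True
    while changed:
        changed = False
        for n in list(nodes):
            if len(graph[n] & nodes) < k:
                nodes.remove(n)
                changed = True
    return nodes


def classify(weights, threshold=50, k=2):
    """Split countries into core, directly linked, and peripheral groups."""
    graph = threshold_links(weights, threshold)
    core = k_core(graph, k)
    linked = {n for n in graph if n not in core and graph[n] & core}
    countries = {c for pair in weights for c in pair}
    periphery = countries - core - linked
    return core, linked, periphery
```

Here `weights` is assumed to map unordered country pairs to their number of coauthored papers. The composition of the groups depends on both the threshold and `k`, which illustrates how sensitive such maps are to analytical decisions.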

 

Figure 5 shows a map of the cognitive domains that surround the leading journal Biotechnology and Bioengineering in terms of its aggregated citation relations. The journal-journal citation matrix was factor analyzed, and the resulting clusters are indicated in the map, which was produced by multidimensional scaling of the same matrix. In addition to the general science cluster, containing among others Science and Nature, the map shows the more precise position of the biotechnology group amidst microbiology and various groups of chemistry journals.

 

 

 

Figure 5

Citation environment of the journal Biotechnology and Bioengineering in 2000

 

A. Appl Biochem Biotech

B. Appl Environ Microb

C. Appl Microbiol Biot

D. Biochem Eng J

E. Bioprocess Eng

F. Bioresource Technol

G. Biotechnol Adv

H. Biotechnol Bioeng

I. Biotechnol Lett

J. Biotechnol Prog

K. Chem Eng Sci

L. Cytotechnology

M. Enzyme Microb Tech

N. J Am Chem Soc

 

O. J Bacteriol

P. J Biol Chem

Q. J Biosci Bioeng

R. J Biotechnol

S. J Chem Technol Biot

T. J Membrane Sci

U. J Mol Catal B-Enzym

V. Nature

W. P Natl Acad Sci USA

X. Process Biochem

Y. Science

Z. Trends Biotechnol

a. Water Res

b. Water Sci Technol
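Both factor analysis and multidimensional scaling start from a matrix of (dis)similarities among the journals. As a hedged illustration of this first step only, the sketch below computes Pearson correlations between the citing patterns of the journals. The text does not state which similarity measure was used for Figure 5, so the use of Pearson correlations, the orientation of the matrix, and the function names are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of citation counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A journal with a constant citing pattern has no defined correlation; return 0.0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def citing_profile_correlations(matrix):
    """matrix[i][j] = citations from journal i to journal j (assumed orientation)."""
    n = len(matrix)
    return [[pearson(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]
```

Journals whose citing patterns correlate highly would be expected to load on the same factor and to be positioned close to one another in the map.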

 

 

2.3 Web Indicators

 

As noted, the patent data can nowadays be searched full-text on-line using the Internet. However, more dedicated databases like the Science Citation Index are only available commercially. Increasingly, publishers are also making their journals available in electronic versions, but these are often accessible only to subscribers and institutions.

 

The Internet itself offers us a rich resource in the public domain, but quality control of the recall is a problem. The search algorithms of most search engines are not publicly available, and the more dedicated databases sometimes block searching with specific web crawlers (Introna & Nissenbaum, 2000). The results of searches using a search engine provide us with an instantaneous window on the Internet, while one knows that the Internet itself is dynamically evolving. Pages may be changed, and the history of the Internet is being rewritten as it develops.

 

Among the many search engines, the Advanced Search Engine of AltaVista enables us to generate time series of data by combining date restrictions with all kinds of Boolean operators.

 

Figure 5

Results of searches using the AltaVista Advanced Search Engine (March 24, 2002)

Figure 5 shows the results of a search using the terms “university,” “industry,” and “government” for the period 1993-2001 in the AltaVista domain. Using this search engine, the exponential growth of the Internet can also be assessed (Figure 6). The relationship between the size of the AltaVista domain and the Internet, however, varies over time (Rousseau, 1999; Butler, 2000). This generates uncertainty about the quality of the representation.

 

Figure 6

The exponential growth curve of the AltaVista domain
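Whether a time series of page counts grows exponentially can be assessed by fitting a straight line to the logarithm of the counts. The following is a minimal sketch under the assumption of one count per year; the function names are hypothetical and no actual AltaVista data are reproduced here.

```python
from math import exp, log


def fit_exponential(years, counts):
    """Least-squares fit of log(count) = log(n0) + rate * (year - years[0])."""
    xs = [y - years[0] for y in years]
    ys = [log(c) for c in counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    rate = sxy / sxx
    n0 = exp(mean_y - rate * mean_x)
    return n0, rate


def doubling_time(rate):
    """Time needed for the count to double at the fitted growth rate."""
    return log(2) / rate
```

A fitted rate that changes markedly when the most recent years are added or dropped would indicate that the growth is no longer exponential over the whole period.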

 

 

  3. The combination of various indicators

 

By combining science, technology, and innovation indicators, one can, for example, make a comparison between two units of analysis such as nation-states. Table 1 provides such a comparison between South Korea and the Netherlands.

 

Table 1

A comparison between South Korea and the Netherlands for 2001

(from: Park & Leydesdorff, in preparation; AltaVista searches on August 23, 2002)

 

| Categories | South Korea | Netherlands | Data source | Comments |
|---|---|---|---|---|
| Scientific research articles | 17,339 | 22,389 | Science Citation Index | Korea is weaker than the Netherlands in the production of scientific knowledge. |
| Patents invented or assigned | 3,958 | 2,098 | U.S. Patent and Trademark Office | Korea is industrially more inventive than the Netherlands. |
| Web pages | 804,558 | 1,395,093 | AltaVista | Korea is less present on the Internet (less post-industrial). |
| Incoming hyperlinks | 142,156 | 919,347 | AltaVista | Korea is far less internationally attractive. |
| Outgoing hyperlinks | 8,092 | 18,241 | AltaVista | Korea is less globally oriented. |

 

 

  4. The time dimension in processes of knowledge evolution

 

The evolving character of the Internet makes us reflexively aware that the time axis poses a problem for measurement. From a policy perspective, one is primarily interested in the current state of a system under study; one evaluates the information contained in the data with hindsight and with reference to the potentials for future developments. Statisticians tend to construct their databases on the basis of a historical time series, e.g., by taking 1990 as a baseline.

 

In emerging fields like biotechnology, artificial intelligence, new materials, etc., the meaning of the categories used in such databases is sometimes redefined as a result of the development. Thus, one needs to update the categories from the present understanding and to backtrack from there into their history. This procedure is unavoidable when using Internet data, but it may be problematic when using scientometric databases, since their categories cannot be changed each year. A reconstruction on the basis of today’s understanding can differ significantly from how a field was defined ten years ago.

 

In questions with high policy relevance, the scientometric measurement is sometimes highly sensitive to analytical decisions about the various parameters. For example, in the 1980s a debate raged in the literature about “the decline of British science” (Irvine, Martin, Peacock, & Turner, 1985; Leydesdorff, 1988; Anderson et al., 1988; Braun, Glänzel, & Schubert, 1991). Whereas the performance measurement of British science showed a decline using an ex ante fixed journal set, measurement in a dynamic journal set showed mainly stability (Leydesdorff, 1991; Martin, 1994).
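The sensitivity to the choice of the journal set can be made concrete with a small sketch. It assumes publication records of the hypothetical form (year, journal, country) and compares a national share computed within a journal set fixed in a base year with the share computed over all journals covered in the year itself. The record layout and function names are assumptions for illustration, not the method used in the studies cited.

```python
def fixed_journal_set(records, base_year):
    """Journals that were covered in the base year (the ex ante fixed set)."""
    return {journal for year, journal, _ in records if year == base_year}


def national_share(records, country, year, journals=None):
    """Share of `country` in the papers of `year`, optionally within a journal set."""
    selected = [
        r for r in records
        if r[0] == year and (journals is None or r[1] in journals)
    ]
    if not selected:
        return 0.0
    return sum(1 for r in selected if r[2] == country) / len(selected)
```

Calling `national_share` once with `journals=fixed_journal_set(records, base_year)` and once with `journals=None` yields the two time series whose divergence was at stake in this debate.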

 

The measurement of changes and fluxes requires an information calculus, whereas most indicators measure the values of variables instantaneously, for example, in multidimensional spaces. The differences that result from a comparison between such snapshots, however, do not necessarily represent “development.” For example, they could also represent differences in the measurement errors.

 

Probabilistic entropy measures provide us with a calculus that can also be used for the measurement of “self-organization” in complex dynamics (Leydesdorff, 2001). However, the more sophisticated indicators are less accessible to the intuitive understanding. For example, Figure 7 shows the development of negative entropy in the mutual information between the data exhibited in Figure 5 above. This negative entropy increased during the expansion of the Internet in the second half of the 1990s, but the increase has slowed in recent years, although the Internet has continued to expand (see Figure 6).

 

Figure 7

Mutual information in three dimensions (‘university,’ ‘industry,’ ‘government’) as measured using the AltaVista Advanced Search Engine (March 24, 2002)
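A mutual information in three dimensions can be computed from the hit counts of Boolean searches. The sketch below uses the standard information-theoretical definition T(uig) = H(u) + H(i) + H(g) - H(ui) - H(ug) - H(ig) + H(uig), which can become negative. The text does not spell out the formula or the normalization used for Figure 7, so two choices are assumptions here: the sign convention just given, and the restriction of the population to pages that contain at least one of the three terms. The function names are hypothetical.

```python
from math import log2


def cell_counts(u, i, g, ui, ug, ig, uig):
    """Derive the cells of a 2x2x2 table from hit counts by inclusion-exclusion.

    u, i, g are hits for single terms; ui, ug, ig, uig are hits for AND-combinations.
    Pages containing none of the terms are not observed and are left out (assumption).
    """
    cells = {
        (1, 1, 1): uig,
        (1, 1, 0): ui - uig,
        (1, 0, 1): ug - uig,
        (0, 1, 1): ig - uig,
        (1, 0, 0): u - ui - ug + uig,
        (0, 1, 0): i - ui - ig + uig,
        (0, 0, 1): g - ug - ig + uig,
    }
    if any(n < 0 for n in cells.values()):
        raise ValueError("hit counts are mutually inconsistent")
    return cells


def entropy(counts):
    """Shannon entropy in bits of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def marginal(cells, axes):
    """Sum the cell counts over all dimensions not listed in `axes`."""
    out = {}
    for key, n in cells.items():
        sub = tuple(key[a] for a in axes)
        out[sub] = out.get(sub, 0) + n
    return list(out.values())


def mutual_information_3d(cells):
    """T = H(u) + H(i) + H(g) - H(ui) - H(ug) - H(ig) + H(uig), in bits."""
    def h(*axes):
        return entropy(marginal(cells, axes))
    return h(0) + h(1) + h(2) - h(0, 1) - h(0, 2) - h(1, 2) + h(0, 1, 2)
```

With this convention, three fully independent dimensions yield a value of zero, while a configuration in which the dimensions are pairwise independent but jointly constrained yields a negative value.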

 

There is an intimate connection between indicator research and parameter estimation in simulation studies when analyzing knowledge-based systems (Leydesdorff & Scharnhorst, 2002). Indicators study knowledge production and communication in terms of the traces that communications leave behind, while simulations try to capture the operations and their interactions. The common assumption of indicator research and simulation studies is that knowledge production, communication, and control are operations that change the materials on which they operate. The unit of analysis is replaced with a unit of operation.

 

The relations between empirical studies and algorithmic simulations have to be guided by theorizing. Otherwise, the number of options explodes without quality control. What do the different pictures mean? Further theoretical specification as well as methodological control is continuously needed when one reconstructs knowledge-based systems reflexively.

 

 

References

 

Anderson, J., P. M. D. Collins, J. Irvine, P. A. Isard, B. R. Martin, F. Narin & K. Stevens (1988). On-line approaches to measuring national scientific output: A cautionary tale. Science and Public Policy 15, 153-161.

Braun, T., W. Glänzel & A. Schubert (1991). The Bibliometric Assessment of UK Scientific Performance—Some Comments on Martin’s Reply. Scientometrics, 20, 359-362.

Butler, D. (2000). Souped-up search engines. Nature, 405 (11 May 2000), 112-115.

Castells, M. (1996). The Rise of the Network Society. Cambridge, MA: Blackwell Publishers.

Cohen, W. M., R. R. Nelson & J. P. Walsh (2002). Links and Impacts: The Influence of Public Research on Industrial R&D. Management Science, 48(1), 1-23.

David, P. A., & D. Foray. (1994). Dynamics of Competitive Technology Diffusion Through Local Network Structures: The Case of EDI Document Standards. In L. Leydesdorff & P. Van den Besselaar (Eds.), Evolutionary Economics and Chaos Theory: New directions in technology studies (pp. 63-78). London: Pinter.

Etzkowitz, H., & L. Leydesdorff (2000). The Dynamics of Innovation: From National Systems and “Mode 2” to a Triple Helix of University-Industry-Government Relations. Research Policy, 29(2), 109-123.

Freeman, C., & C. Perez (1988). Structural crises of adjustment, business cycles and investment behaviour. In G. Dosi, C. Freeman, R. Nelson, G. Silverberg, and L. Soete (Eds.), Technical Change and Economic Theory (pp. 38-66). London: Pinter.

Garfield, E. (1979). Citation Indexing. New York: Wiley.

Gibbons, M., C. Limoges, H. Nowotny, S. Schwartzman, P. Scott, & M. Trow. (1994). The new production of knowledge: the dynamics of science and research in contemporary societies. London: Sage.

Granstrand, O. (1999). The Economics and Management of Intellectual Property: Towards Intellectual Capitalism. Cheltenham, UK: Edward Elgar.

Introna, L. D., & H. Nissenbaum. (2000). Shaping the Web: Why the Politics of Search Engines Matter. The Information Society 16, 169-185.

Irvine, J., B. Martin, T. Peacock & R. Turner (1985). Charting the decline of British science. Nature, 316, 587-590.

Krugman, P. (1996). The Self-Organizing Economy. Malden, MA, and Oxford: Blackwell.

Leydesdorff, L. (1988). Problems with the “measurement” of national scientific performance. Science and Public Policy 15, 149-152.

Leydesdorff, L. (1991). On the “Scientometric Decline” of British Science. One Additional Graph in Reply to Ben Martin. Scientometrics, 20, 363-367.

Leydesdorff, L. (1995). The Challenge of Scientometrics: the development, measurement, and self-organization of scientific communications. Leiden: DSWO Press, Leiden University; at http://www.upublish.com/books/leydesdorff-sci.htm .

Leydesdorff, L. (2001). A Sociological Theory of Communication: The Self-Organization of the Knowledge-Based Society. Parkland, FL: Universal Publishers; at http://www.upublish.com/books/leydesdorff.htm .

Leydesdorff, L., & P. Van den Besselaar (Eds.). (1994). Evolutionary Economics and Chaos Theory: New Directions in Technology Studies. London and New York: Pinter.

Leydesdorff, L. & A. Scharnhorst (2002). Measuring the Knowledge Base: A Program of Innovation Studies. Report to “Förderinitiative Science Policy Studies” of the German Bundesministerium für Bildung und Forschung. Berlin: Berlin-Brandenburgische Akademie der Wissenschaften.

Luhmann, N. (1984). Soziale Systeme. Grundriß einer allgemeinen Theorie. Frankfurt a. M.: Suhrkamp [Social Systems. Stanford, CA: Stanford University Press, 1995].

Luhmann, N. (1990). Die Wissenschaft der Gesellschaft. Frankfurt a.M.: Suhrkamp.

Martin, B. R. (1994). British Science in the 1980s–Has the Relative Decline Continued? Scientometrics, 29, 27-57.

Narin, F. (1976). Evaluative Bibliometrics: The Use of Publication and Citation Analysis in the Evaluation of Scientific Activity. Washington, DC: National Science Foundation.

Nelson, R. R., & S. G. Winter. (1982). An Evolutionary Theory of Economic Change. Cambridge, MA: Belknap Press of Harvard University Press.

Park, H., & L. Leydesdorff (in preparation). A Comparison of the Knowledge-Based Innovation Systems in the Economies of South Korea and the Netherlands.

Rousseau, R. (1999). Daily time series of common single word searches in AltaVista and NorthernLight. Cybermetrics 2/3, Paper 2; at http://www.cindoc.csic.es/cybermetrics/articles/v2i1p2.html .

Van Raan, A. F. J. (Ed.). (1988). Handbook of Quantitative Studies of Science and Technology. Amsterdam: Elsevier.

Wagner, C. S. & L. Leydesdorff. Mapping Global Science Using International Co-authorships: A Comparison of 1990 and 2000. Paper to be presented at the 9th Conference of the International Society of Scientometrics and Informetrics, Beijing, August 2003.

Whitley, R. D. (1984). The Intellectual and Social Organization of the Sciences. Oxford: Oxford University Press.