Knowl. Org. 40(2013)No.1
KO
KNOWLEDGE ORGANIZATION
ISSN 0943 7444
Official Bi-Monthly Journal of the International Society for Knowledge Organization
International Journal devoted to Concept Theory, Classification, Indexing and Knowledge Representation
Contents

Editorial
Richard P. Smiraglia. ISKO 12's Bookshelf - Evolving Intension: An Editorial ... 3

Articles
Birger Hjørland. User-based and Cognitive Approaches to Knowledge Organization: A Theoretical Analysis of the Research Literature ... 11
Mercedes de la Moneda Corrochano, María J. López-Huertas and Evaristo Jiménez Contreras. Spanish Research in Knowledge Organization (2002-2010) ... 28
Joseph T. Tennis. Ethos and Ideology of Knowledge Organization: Toward Precepts for an Engaged Knowledge Organization ... 42
Maria Luiza de Almeida Campos, Maria Luiza Machado Campos, Alberto M. R. Dávila, Hagar Espanha Gomes, Linair Maria Campos, and Laura de Lira e Oliveira. Information Sciences Methodological Aspects Applied to Ontology Reuse Tools: A Study Based on Genomic Annotations in the Domain of Trypanosomatides ... 50

Book Review
Rajendra Kumbhar. Library Classification Trends in the 21st Century. Witney, UK: Chandos Publishing (Oxford) Ltd.: 2012. ISBN: 1843346605, 9781843346609 ... 62

Classification Issues
Nancy J. Williamson. Paradigms and Conceptual Systems in Knowledge Organization, the Eleventh International ISKO Conference, Rome, 2010 ... 64

Letters to the Editor
Rick Szostak. Speaking Truth to Power in Classification: Response to Fox's Review of My Work; KO 39:4, 300 ... 76
Guohua Xiao. A Knowledge Classification Model Based on the Relationship Between Science and Human Needs ... 77

Index to Volume 39 ... 79
This journal is the organ of the INTERNATIONAL SOCIETY FOR KNOWLEDGE ORGANIZATION (General Secretariat: Vivien PETRAS, Humboldt-Universität zu Berlin, Institut für Bibliotheks- und Informationswissenschaft, Unter den Linden 6, 10099 Berlin, Germany. E-mail: secr@isko.org).

Editors

Dr. Richard P. SMIRAGLIA (Editor-in-Chief), School of Information Studies, University of Wisconsin, Milwaukee, Northwest Quad Building B, 2025 E Newport St., Milwaukee, WI 53211 USA. E-mail: smiragli@uwm.edu
Dr. Joseph T. TENNIS (Book Review Editor), The Information School of the University of Washington, Box 352840, Mary Gates Hall Ste 370, Seattle WA 98195-2840 USA. E-mail: jtennis@u.washington.edu
Dr. Nancy WILLIAMSON (Classification Research News Editor), Faculty of Information Studies, University of Toronto, 140 St. George Street, Toronto, Ontario M5S 3G6 Canada. E-mail: william@fis.utoronto.ca
David J. BLOOM (Editorial Assistant), School of Information Studies, University of Wisconsin, Milwaukee, Northwest Quad Building B, 2025 E Newport St., Milwaukee, WI 53211 USA.
Melodie Joy FOX (Editorial Assistant), School of Information Studies, University of Wisconsin, Milwaukee, Northwest Quad Building B, 2025 E Newport St., Milwaukee, WI 53211 USA.
Daniel MARTÍNEZ ÁVILA (Editorial Assistant), Department of Library and Information Science, University Carlos III of Madrid, C/Madrid 126, 28903 Getafe, Madrid, Spain. E-mail: dmartine@bib.uc3m.es

Editors Emerita

Dr. Hope A. OLSON, School of Information Studies, 522 Bolton Hall, University of Wisconsin-Milwaukee, Milwaukee, Northwest Quad Building B, 2025 E Newport St., Milwaukee, WI 53211 USA. E-mail: holson@uwm.edu
Dr. Clare BEGHTOL, Faculty of Information Studies, University of Toronto, 140 St. George Street, Toronto, Ontario M5S 3G6, Canada. E-mail: clare.beghtol@utoronto.ca
Ingetraut DAHLBERG, Am Hirtenberg 13, 64732 Bad Konig, Germany. E-mail: IDahlberg@t-online.de

Editorial Board

Dr. Jonathan FURNER, Graduate School of Education & Information Studies, University of California, Los Angeles, 300 Young Dr. N, Mailbox 951520, Los Angeles, CA 90095-1520, USA. E-mail: furner@gseis.ucla.edu
Prof. Jesús GASCÓN GARCÍA, Facultat de Biblioteconomia i Documentació, Universitat de Barcelona, C. Melcior de Palau, 140, 08014 Barcelona, Spain. E-mail: gascon@ub.edu
Claudio GNOLI, University of Pavia, Mathematics Department Library, via Ferrata 1, I-27100 Pavia, Italy. E-mail: gnoli@aib.it
Dr. Rebecca GREEN, Assistant Editor, Dewey Decimal Classification, Dewey Editorial Office, Library of Congress, Decimal Classification Division, 101 Independence Ave., S.E., Washington, DC 20540-4330, USA. E-mail: greenre@oclc.org
Dr. José Augusto Chaves GUIMARÃES, Departamento de Ciência da Informação, Universidade Estadual Paulista-UNESP, Av. Hygino Muzzi Filho 737, 17525-900 Marília SP Brazil. E-mail: guima@marilia.unesp.br
Dr. Birger HJØRLAND, Royal School of Library and Information Science, Copenhagen Denmark. E-mail: bh@iva.dk
Dr. Barbara H. KWASNIK, School of Information Studies, Syracuse University, Syracuse, NY 13244 USA. E-mail: bkwasnik@syr.edu
Dr. María J. LÓPEZ-HUERTAS, Universidad de Granada, Facultad de Biblioteconomía y Documentación, Campus Universitario de Cartuja, Biblioteca del Colegio Máximo de Cartuja, 18071 Granada, Spain. E-mail: mjlopez@ugr.es
Dr. Kathryn LA BARRE, The Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA. E-mail: klabarre@illinois.edu
Dr. Marianne LYKKE, e-Learning Lab, Center for User-driven Innovation, Learning and Design, Department of Communication, Aalborg University, Kroghstraede 1, room 2.023, 9220 Aalborg OE, Denmark. E-mail: mlykke@hum.aau.dk
Dr. Ia MCILWAINE (Literature Editor), Research Fellow, School of Library, Archive & Information Studies, University College London, Gower Street, London WC1E 6BT U.K. E-mail: i.mcilwaine@ucl.ac.uk
Dr. Jens-Erik MAI, Royal School of Library and Information Science, Copenhagen Denmark. E-mail: jem@iva.dk
Dr. Widad MUSTAFA el HADI, Université Charles de Gaulle Lille 3, URF IDIST, Domaine du Pont de Bois, Villeneuve d'Ascq 59653, France. E-mail: widad.mustafa@free.fr
H. Peter OHLY, Prinzenstr. 179, D-53175 Bonn, Germany. E-mail: president@isko.org
Dr. K. S. RAGHAVAN, DRTC, Indian Statistical Institute, Bangalore 560 059, India. E-mail: raghavan@isibang.ac.in
Dr. M. P. SATIJA, Guru Nanak Dev University, School of Library and Information Science, Amritsar-143 005, India. E-mail: satija_mp@yahoo.com
Dr. Aida SLAVIC, UDC Consortium, PO Box 90407, 2509 LK The Hague, The Netherlands. E-mail: aida.slavic@udcc.org
Dr. Dagobert SOERGEL, Department of Library and Information Studies, Graduate School of Education, University at Buffalo, 534 Baldy Hall, Buffalo, NY 14260-1020. E-mail: dsoergel@buffalo.edu
Dr. Renato R. SOUZA, Applied Mathematics School, Getulio Vargas Foundation, Praia de Botafogo, 190, 3o andar, Rio de Janeiro, RJ, 22250-900, Brazil. E-mail: renato.souza@fgv.br
Dr. Maja ŽUMER, Faculty of Arts, University of Ljubljana, Askerceva 2, Ljubljana 1000 Slovenia. E-mail: maja.zumer@ff.uni-lj.si
ISKO 12's Bookshelf - Evolving Intension: An Editorial

Richard P. Smiraglia
1.0 The 12th International ISKO Conference, Mysore, India

The 2012 biennial international research conference of the International Society for Knowledge Organization was held August 6-9, in Mysore, India. It was the second international ISKO conference to be held in India (Canada and India are the only countries to have hosted two international ISKO conferences), and for many attendees travel to the exotic Indian subcontinent was a new experience. Interestingly, the mix of people attending was quite different from recent meetings held in Europe or North America. The conference was lively and, as usual, jam-packed with new research. Registration took place on a veranda in the garden of the B. N. Bahadur Institute of Management Sciences at the University of Mysore, where the meetings were held. This graceful tree (Figure 1) kept us company and kept watch over our considerations (as indeed it does over the academic enterprise of the Institute).

The conference theme was "Categories, Contexts and Relations in Knowledge Organization." The opening and closing sessions fittingly were devoted to serious introspection about the direction of the domain of knowledge organization. This editorial, in line with those following past international conferences, is an attempt to comment on the state of the domain by reflecting domain-analytically on the proceedings of the conference, primarily using bibliometric measures. In general, it seems the domain is secure in its intellectual moorings, as it continues to welcome a broad granular array of shifting research questions in its intension. The continual concretizing of the theoretical core of knowledge organization (KO) seems to act as a catalyst for emergent ideas, which can be observed as part of the evolving intension of the domain.

The proceedings of the conference (Neelameghan and Raghavan 2012) were used to generate the analysis reported here.
For the first time in recent memory, many papers were not presented by their authors, but rather were presented by colleagues who were in attendance. In addition, two papers (one by Szostak and one by Campbell) appear in the printed proceedings that were not presented in Mysore. After some rumination it was decided to include those papers in the present analysis, insofar as the printed record of the conference will live on into the future with those papers in the mix. It continues to be a problem for domain analytic research that Thomson Reuters Web of Science, for some reason, is not indexing international ISKO proceedings. Manual indexing such as that represented here is difficult and time-consuming, but at the moment is our only option. Therefore the spreadsheet that was the nexus of the present analysis, including the conference program as well as all of the references from all of the papers, is available for download (caveat emptor!) at my LazyKOblog (http://lazykoblog.wordpress.com/). Ultimately, 55 papers included in the proceedings were used for this analysis.

Figure 1. Garden at the B.N. Bahadur Institute of Management Sciences, University of Mysore

2.0 International Presence and Thematic Foci

As at most international ISKO conferences, attendance was influenced to a greater or lesser degree by the location of the meeting. The country of affiliation of the first author of each paper was recorded. This yielded a list of seventeen nations represented, and these are shown in Figure 2. Regional attendees were present in substantial numbers (18% from India, another 8% from other Asian countries), presenting a different geopolitical mix from that of the 11th international conference in Rome in 2010, at which there was no Asian participation (Smiraglia 2011). Also impressive was the Brazilian presence, which accounted for nearly a third of the papers, more than doubling their presence in 2010. Notable newcomers were authors from Iran and Algeria. Clearly, the domain continues to grow internationally.

Conference sessions were divided thematically, and the distribution of themes as represented in the official conference programme is shown in Table 1.
Theme                       No. of papers
digital KO                  7
relationships               7
design and development      6
domain of KO                6
domain specificity          6
archives                    4
ontology                    4
users and context           4
categories                  3
general classifications     3
information mining          3
navigation                  3

Table 1. Conference themes
Figure 2. Countries of affiliation
Thematically speaking, the contents of the conference were typical, with digital KO and relationships forming the largest clusters, and new clusters (since 2010) for information mining and archives. A 2x2 matrix was used to generate a three-dimensional visualization of the thematic interests by country of affiliation, shown in Figure 3. There is greatest diversity in the large cohorts from Brazil, India, Canada and the USA, but thematic diversity is spread evenly across the whole geographic distribution as well.

3.0 Citations

There were 850 citations in 55 papers. The number of citations per paper ranged from 1 to 36 with a mean of 14.45 (which is comparable to the 2010 mean of 14.88). The median was 12 and the mode was 8, which were both higher than in 2010. In other words, the most common number of citations per paper was 8, within a fairly wide range. The mean per country was analyzed, ranging from 6 to 18.7 with most hovering near the mean.

The age of work cited also was analyzed; the mean was 13.1 years, with a median of 9 years and a mode of 1 year. This means the most frequently cited material was very recent, but as always, there was a wide range (from 1 to 173 years). Calculated by country of affiliation the mean age of citation ranged from 6 (Slovenia) to 18.7 (Poland). The mean age of work cited and mean number of references per country were plotted together and this is shown in Figure 4.
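The difference between the three averages reported above can be illustrated with a minimal sketch. The reference counts below are hypothetical (they are not the Mysore data, which is available only in the author's spreadsheet), and the function name `summarize` is an assumption made for illustration. The point is only that in a right-skewed distribution the mean sits above the median, which in turn sits above the mode.

```python
from statistics import mean, median, multimode

# Hypothetical reference counts for ten papers (NOT the Mysore data).
refs_per_paper = [1, 8, 8, 8, 10, 12, 14, 20, 28, 36]


def summarize(counts):
    """Return the mean, median and mode(s) of a list of counts."""
    return {
        "mean": round(mean(counts), 2),
        "median": median(counts),
        "modes": multimode(counts),  # every value tied for most frequent
    }


summary = summarize(refs_per_paper)
print(summary)
```

With these toy values a few long reference lists pull the mean upward while the most frequent value stays low, which is the same pattern the editorial reports (mean 14.45, median 12, mode 8).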
Figure 3. Theme by Country; Country by Theme
Figure 4. References and age of citation by country of affiliation
The figure helps visualize the variation by country, although it also emphasizes the fact that both hover near the overall means. Thematic clusters also were analyzed and plotted and these are shown in Table 2 and visualized in Figure 5.
Theme                       Mean # of References   Mean Age of Citation
archives                    15.7                   9.9
categories                  17.6                   23
design and development      10.8                   9.5
digital KO                  12.8                   8.6
domain of KO                21.3                   14.2
domain specificity          15.8                   17.8
general classifications     12                     23.4
information mining          8.6                    7.1
navigation                  15.6                   14.7
ontology                    18.5                   10.5
relationships               11.2                   9.9
users and context           13                     9.9

Table 2. References and age of citation by theme

Figure 5. References and age of citation by theme

T-tests showed that the differences from cluster to cluster were not statistically significant, suggesting the variation is due either to the individual preferences of the researchers involved, or to epistemological differences reflected methodologically. That is, humanistically-oriented papers likely will have more and older references than papers that report empirical research results.

The distribution of media was also analyzed. Table 3 shows the distribution. About half of the citations are to journal articles. If KO were truly a science one might expect that proportion to be higher. But, given that there are few journals in the domain, and the constant stream of conferences provides a platform for the presentation of new research, one could read this the other way and say that about half of the papers cited are not from journals, but rather come from more immediate and scholar-oriented publications.

Medium                Proportion
journal articles      47%
conference papers     16%
monograph             15%
chapter               9%
web                   7%
theses                2%
unidentifiable        1%

Table 3. Media types

In all, 56 journals were cited; the most cited journals are shown in Table 4.

Journal title                                             No. of citations
Knowledge Organization                                    33
Journal of the American Society for Information Science   21
Journal of Documentation                                  17
Information Studies                                       9
Information Processing and Management                     5
Cataloging & Classification Quarterly                     4
Ciência da Informação                                     4
Scire                                                     4
Archivaria                                                4

Table 4. Most cited journals
There are no surprises in this table: the present journal received the most citations, and as we saw in Table 3, the next largest cluster came from conference proceedings.

3.1 Citedness

The 789 citations in the 55 contributed papers were sorted by first author and duplicates removed to generate a list of most-cited authors. This demonstrated that the citations were to 382 individual works, certainly a large number, but demonstrating much less breadth than the 2010 conference (in which 972 citations were to 891 works). Single-occurrence authors were removed from the list, leaving 101 multiply-cited authors. The remaining authors were arrayed by frequency of citation, and the upper tier of this distribution appears in Table 5.
Author          Frequency of citation
Smiraglia       22
Hjørland        18
Neelameghan     16
Ranganathan     12
Dahlberg        11
Tennis          11
La Barre        8
Szostak         8
Gardin          7
Guimarães       7
Beghtol         6
McIlwaine       6

Table 5. Most cited authors
These names were used to generate two co-citation analyses. First, the proceedings were analyzed for co-citation among the contributed papers. This matrix was plotted using SPSS and appears in Figure 6. In this case we are visualizing the perceptions of the authors who contributed papers to the Mysore conference concerning similarities among the co-cited authors. The plot exactly fits the model; however, only 7 of the 13 authors were co-cited sufficiently to run the software. There are no secrets in this plot: the upper left cluster represents the movement for subject ontogeny started by Tennis and now joined by research teams studying the history of the UDC. The other cluster clearly joins concept theory with faceted classification; interestingly, the Brazilian influence on the conference is seen clearly in this cluster. These are artifacts of the particulars of the Mysore conference.

A second author co-citation analysis was compiled using the same set of most-cited authors, but this time deriving co-citation data from Web of Science; this means that this external analysis reveals the perception of the domain at large about this cluster of authors whose research is most cited in the contributed papers for this conference. This is visualized in Figure 7. This plot also closely fits the model. This time co-citation is abundant. There are two major clusters, but both closely adhere to a separate cluster around the classic Ranganathan. In the upper cluster are digital systems for KO, and in the lower cluster is classical North American KO, but now clearly including an approach to faceted classification. Notice the proximity of La Barre to Ranganathan (facets) and also the distance of Beghtol and Hjørland, anchoring classical concept-theoretical positions similar to the author co-citation analysis of the most-cited 2010 authors. Notice also the density of the research front.
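How a co-citation matrix of this kind might be tallied can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used for the editorial: the function name `cocitation_counts` and the three toy reference lists are hypothetical, and the scaling and plotting step carried out in SPSS is not reproduced.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how many papers cite each unordered pair of authors together."""
    pairs = Counter()
    for cited in reference_lists:
        # Each author counts once per paper; sorting gives a stable pair key.
        for a, b in combinations(sorted(set(cited)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical cited-author lists for three papers (illustration only).
papers = [
    ["Ranganathan", "La Barre", "Tennis"],
    ["Ranganathan", "La Barre"],
    ["Tennis", "Beghtol", "Ranganathan"],
]
counts = cocitation_counts(papers)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with higher counts are treated as more similar and so are drawn closer together in the resulting plot. Authors who are rarely co-cited produce too few non-zero cells to scale, which is consistent with the remark above that only some of the authors could be included.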
When we compare the two visualizations we see clearly how the research front represented by the authors who contributed papers to this conference perceives the movement of the domain's intension, toward faceted digital systems. But we also see a tightening of the core theoretical positions. These are signs of a domain that is intellectually secure, and is protecting its extension, while allowing experimentation on a broad scale in its intension.

Figure 6. Interconference author co-citation (stress = 0 R2 = 1)

Figure 7. Author co-citation from Web of Science (stress = .03066 R2 = .99763)

4.0 Co-Word Analysis

The titles of the 55 contributed papers were entered into WordStat and a frequency distribution of title keywords was generated. An unfiltered distribution yielded 301 keywords from the titles of the conference papers; when passed through a dictionary designed for ISKO, the filtered distribution revealed 21 key terms. These two lists are brought together in Table 6. The filtered terms fit (more or less) with the thematic clusters from the conference programme; the unfiltered terms show us what the contributing researchers had in mind. There are few surprises, except that the granularity in the long tail (not shown here) included another 260 terms. Even frequently used terms such as "classification" appear only 2.1% of the time. So this is further evidence of the expanding, or shifting, or arguably evolving intension of the domain as represented by the papers contributed to this conference.

A three-dimensional plot of the filtered keywords helps us visualize the thematic core of the domain as represented by the papers contributed to this conference. This is shown in Figure 8. The model fits the plot fairly well. As in the author co-citation analysis we have little density and clearly defined clusters. The associations are relatively consistently weak, but there are two distinct clusters. These clusters are familiar from earlier analyses of parts of the KO domain; there is a theoretical cluster around classification and concept theory; and there is a systems design cluster around the development of specific KO systems. Interestingly, in this cluster, epistemology resides with the system design cluster.
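The two distributions described above, an unfiltered word count and a count filtered through a dictionary of truncated stems, can be approximated in a few lines. The sketch below is not WordStat and does not reproduce the ISKO dictionary: the titles, the three wildcard patterns and both function names are hypothetical, and no stop-word removal is attempted.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase


def title_keywords(titles):
    """Lower-case word counts across all titles (the unfiltered distribution)."""
    words = []
    for title in titles:
        words.extend(re.findall(r"[a-z]+", title.lower()))
    return Counter(words)


def filter_by_dictionary(word_counts, patterns):
    """Sum counts of the words matching each wildcard pattern, e.g. 'classificat*'."""
    filtered = Counter()
    for word, n in word_counts.items():
        for pattern in patterns:
            if fnmatchcase(word, pattern):
                filtered[pattern] += n
    return filtered


# Hypothetical titles and dictionary entries (illustration only).
titles = [
    "Classification and Ontologies in Digital Libraries",
    "Organizing Knowledge: A Classificatory Approach",
    "Ontology-Based Knowledge Organization",
]
unfiltered = title_keywords(titles)
filtered = filter_by_dictionary(unfiltered, ["organiz*", "classificat*", "ontolog*"])
print(unfiltered.most_common(3))
print(filtered)
```

Truncated patterns collapse word variants ("organizing", "organization") into a single term, which is why a filtered list is so much shorter than an unfiltered one and why its percentages are so much higher.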
Filtered Term     Freq.   %
Organiz*          17      28.80%
Classificat*      8       13.60%
Domain            5       8.50%
Ontolog*          4       6.80%
Model             3       5.10%
Access            2       3.40%
Cognit*           2       3.40%
Concept           2       3.40%
Construct*        2       3.40%
Domain_analy*     2       3.40%
Thesaur*          2       3.40%
User              2       3.40%

Unfiltered Term   Freq.   %
Knowledge         19      4.40%
Organization      17      4.00%
Information       10      2.30%
Classification    9       2.10%
Study             9       2.10%
Domain            8       1.90%
Semantic          6       1.40%
Subject           5       1.20%
Analysis          4       0.90%
Categories        4       0.90%
Indexing          4       0.90%
Systems           4       0.90%

Table 6. Title Keywords

Figure 8. Co-Word Analysis (stress = 0.26976 R2 = 0.7517)

5.0 Mysore is Different

This conference is different in many ways from those that preceded it. Of course, it was in India, and not in North America. Yet, although there was increased presence from Asian scholars, the theoretical core of the domain seems not to have shifted greatly. There is less granularity than we saw in Rome in 2010, but there is still sufficient activity in hypothesis-generation to keep the intension shifting gelatinously. Journal productivity measures, number of citations, and age of citation are consistent with the 2010 conference. The most-cited author list is a bit different from usual, although the visualization of the intellectual core that it provides indicates a shifting intension in the domain, particularly regarding facets, subject ontogeny, and digital solutions.

References

Neelameghan, A. and Raghavan, K.S. eds. 2012. Categories, contexts and relations in knowledge organization: Proceedings of the Twelfth International ISKO Conference 6-9 August 2012 Mysore, India. Würzburg: Ergon Verlag.
Smiraglia, Richard P. 2011. ISKO 11's Diverse Bookshelf: An Editorial. Knowledge organization 38: 179-86.
User-based and Cognitive Approaches to Knowledge Organization: A Theoretical Analysis of the Research Literature
Birger Hjørland
Royal School of Library and Information Science, 6 Birketinget, DK-2300, Copenhagen, Denmark, <bh@iva.dk>
Birger Hjørland holds an M.A. in psychology and a Ph.D. in Library and Information Science. He has been professor of knowledge organization at the Royal School of Library and Information Science in Copenhagen since 2001 and was at the University College in Borås 2000-2001. He was research librarian and coordinator of computer based information services at the Royal Library in Copenhagen 1978-1990, and taught information science at the Department of Mathematical and Applied Linguistics at the University of Copenhagen 1983-1986. He is a member of the editorial boards of the Journal of the American Society for Information Science and Technology and Journal of Documentation, chair of ISKO's Scientific Advisory Council, and consulting editor of Knowledge Organization.

Hjørland, Birger. User-based and Cognitive Approaches to Knowledge Organization: A Theoretical Analysis of the Research Literature. Knowledge Organization. 40(1), 11-27. 95 references.

ABSTRACT: In the 1970s and 1980s, forms of user-based and cognitive approaches to knowledge organization came to the forefront as part of the overall development in library and information science and in the broader society. The specific nature of user-based approaches is their basis in the empirical studies of users or the principle that users need to be involved in the construction of knowledge organization systems. It might seem obvious that user-friendly systems should be designed on user studies or user involvement, but extremely successful systems such as Apple's iPhone, Dialog's search system and Google's PageRank are not based on the empirical studies of users. In knowledge organization, the Book House System is one example of a system based on user studies. In cognitive science the important WordNet database is claimed to be based on psychological research. This article considers such examples. The role of the user is often confused with the role of subjectivity. Knowledge organization systems cannot be objective and must therefore, by implication, be based on some kind of subjectivity. This subjectivity should, however, be derived from collective views in discourse communities rather than be derived from studies of individuals or from the study of abstract minds.

Received 25 August 2012; Revised 12 September 2012; Accepted 12 September 2012
1.0 Introduction

Hjørland (2008) listed six different approaches to knowledge organization (KO), including the facet-analytical approach, the information retrieval tradition, user oriented and cognitive views, bibliometric approaches, and the domain analytic approach. The theoretical assumptions underlying these different approaches have not been thoroughly discussed in the literature, and papers are planned about each of these traditions. The purpose of the present article is to examine the theoretical foundations of the user-based and cognitive approach to KO, but it will not examine user-based or cognitive views in library and information science (LIS) in general, and will not include other subfields such as human-computer interaction. It will, however, include some overall perspectives on user studies and cognitive studies, which are considered important as background knowledge.

The user-based and cognitive approaches to KO developed as part of the overall development in LIS especially in the 1970s and 1980s. In LIS, studies of the users of libraries and information services go back, according to Siatri (1999), to 1948 in the Scientific Information Conference of the Royal Society, where Urquhart (1948) and Bernal (1948) reported their research findings. According to Martin (1976, 483), however, they go yet farther back: "There is a long history of reader studies in American librarianship ... . In the 1920s and 1930s the stream widened and deepened, with the efforts first of William Gray and Ruth Monroe (1929) and then of Douglas Waples (1939) all seeking to utilize reliable samples and to reach valid conclusions." Also, Wilson (1994, 2000, 2008) identifies studies of library use and users dating back to 1916, reviewed by McDiarmid in 1940. Another early contribution to the field of user studies was the Russian researcher N. A. Rubakin's (1862-1946) writings on bibliopsychology (Simsova 1968). It should also be mentioned that, in the neighboring field of media studies, use and gratification studies have been a related trend. Lazarsfeld (1940) is an early example who began seeing patterns from the perspective of the uses and gratifications of radio listeners.

Menzel (1966) refers to two comprehensive bibliographies of user studies in LIS in 1964 and 1965, containing 438 and 676 studies, respectively. Since then, the field has grown further, and it is today one of the most researched areas in LIS (often referred to as information behavior studies). Some studies seem to indicate that user-based and cognitive views became influential in information science from about 1980. White and McCain (1998, 351) write: "Our data have implied an increase of interest in the cognitive side of information science and generally in user studies since about 1980, the start of the second period. This independently corroborates claims to that effect by expert judges, such as Saracevic (1992), who calls it a paradigm shift, and Ingwersen (1996), who writes of it as the turning point 1977-1980."

It is a very fragmented field with very many theories.
Fisher, Erdelez, and McKechnie (2005) presented 72 different conceptual frameworks (and this is by no means a complete coverage of approaches). Although it is a productive subfield within LIS, it is not without problems and critics. Cronin (2009), for example, wrote: "A great deal has been written on the subject of [users'] information seeking over the years ... but there is a regrettable lack of cumulation and coherence."

This development within LIS is related to developments in the broader society. Information scientist Harry Bruce (2002, 29) wrote:
"In the past twenty-five years or so, we have seen what some have referred to as a user-centered revolution (Nahl 1996, 2003). This revolution is manifest in the policy, theory, methodology and practice of a range of disciplines and fields of study. The terminologies used to describe a focus on the beneficiaries or recipients of services, products, systems or professional actions vary. Engineers design end-user technologies. Businesses, organizations and institutions claim to be client centered, customer oriented or market driven. The education field is learner centered. Various stakeholders in the development of the Internet have developed versions of the user centered revolution but overall we can see a shift from technology to people, from product to service, from outcome to process and so on. The common ground is a focus on people - user oriented, people centered, user based, human centered, user responsive and so on. The user focus is an amalgam of methods, approaches and techniques that provide professions and disciplines with ways to define, understand, explain, measure and ultimately serve, the needs of people."

What Harry Bruce describes here is a general interdisciplinary and social trend of which LIS forms a part. A recent trend is customizing to make products tailored to specific customers. Pariser (2011), for example, describes how sites from Google and Facebook to Yahoo News and the New York Times are now increasingly personalized - based on your Web history, they filter information to show you the stuff they think you want to see. That can be very different from what everyone else sees - or from what we need to see.

Very few people have questioned these user-based trends and discussed their overall ideological perspective. Such a discussion is much needed, however. It is not without problems to make educational institutions, libraries, scientific journals, databases, etc. driven by commercial criteria and user demands rather than by scholarly principles and criteria of quality (or, in the case of public libraries, by cultural policies). One hypothesis is, therefore, that the user-based approaches to LIS and KO are part of a larger trend, but that this has not been explicitly considered.

Only a few people within LIS (e.g., Suominen 2007; Rosenbaum et al. 2003) have questioned the user-centered revolution in the scholarly literature. Also very few people have contrasted this view with alternatives. It has often been considered a kind of safe basis on which information professionals may avoid difficult questions. Recently, Jonathan Furner (2012) wrote in relation to the work about IFLA's principles known as Functional Requirements for Subject Authority Records (FRSAR):

"Ultimately, the FRSAR Working Group 'does not take a philosophical position on the nature of aboutness; rather, it looks at the problem from the user's point of view' (Zeng, Žumer and Salaba 2010, 8). The implication here is that, not only is it desirable to refrain from taking a philosophical position on the nature of aboutness when modeling bibliographic and authority data, but also that it is indeed possible to so refrain. On reflection, I have to admit that I am not comfortable with the Working Group's implicit endorsement of the latter claim. I am not sure that it is possible to avoid taking a philosophical position on this matter."

In this quotation, Furner expresses the view that researchers cannot avoid theoretical and philosophical problems by choosing user studies as an alternative. Theoretical issues are also inherent in user studies and therefore need to be examined.

2.0 The case for user-friendliness

A part of the trend described by Harry Bruce may be seen as a trend against user unfriendliness. Would anybody argue that an information system or a knowledge organization system (KOS) should be difficult, cumbersome, and frustrating to use? This is certainly difficult to imagine today, but actually such ideals have formerly, to a limited degree, driven some design principles for libraries and KOSs. Around 1900, for example, it was a goal for many public libraries to limit the use of fiction (and to increase the use of non-fiction), and they deliberately limited the ratio of fiction to non-fiction books a user could borrow. They also consciously attempted to make fiction books hard to find in the classification systems and on the shelves (Eriksson 2010; only available in Danish).
We can also imagine that some university teachers, as well as librarians, saw prestige in making their lessons and their classifications difficult. It was another era: users were not considered customers, as they often are today, but were seen as people who should prove that they were capable and motivated to learn difficult things. (The idea that it should be difficult for users may find theoretical justification in the handicap principle (Nicolaisen and Frandsen 2007), which is in opposition to the principle of least effort (cf. Zipf 1949).)

Johannes Jensen (1973/1947 trans. from Danish) has a story about a person who (before 1947) looked for information about the mercantile law of the Netherlands. He came to the Royal Library in Copenhagen, approached the librarian and, rather than being helped directly, was referred to the catalog. He discovered that the catalog was (at that time) written in Latin, and that its title was: Catalogus Bibliothecæ Regiæ Hafniensis sub Auspiciis et Jussu Munificentissimi ejus Evergetæ, Augustissimi Regis Frederici VIti adornatus. In spite of his knowledge of Latin, he was not able to find what he needed and returned to the librarian, who informed him that he was presumed to know Roman law, because the library's catalog was organized according to the principles of Roman law (after 1950, a new catalog was developed based on new principles, but it is still necessary to use the old catalog for books printed before 1950).

At the end of the 1970s, it was still common to come across the attitude that users should not have direct access to the shelves in research libraries, because only a catalog search would provide a full display of what the library owned on a given subject (although many libraries still do not provide access to the shelves, the motives are probably different today). So, yes, principles for design in the LIS context have sometimes been based on unfriendliness.

In what follows, it will be assumed that all approaches to LIS and to KO today are devoted in some way or another to the principle of user-friendliness. This article discusses user-based and cognitive approaches as one family of approaches among others. All existing approaches will argue that they provide user-friendly systems.
Different approaches are competing views on how best to provide user-friendly systems. We therefore have to make a sharp distinction between user-friendly systems on the one hand and user-based systems on the other. User-based and cognitive approaches do not differ from other approaches in attempting to be friendly, but in their view of how to accomplish this goal.

3.0 User-oriented versus user-based design

In order to design good systems for the users, what kind of knowledge should the information specialists have? User-based approaches may be defined as approaches in which KOS are constructed on the basis of information derived either from empirical studies of users or from users' input during the design process, as suggested by Elaine G. Toms (2010, 5452):

"User-Centered Design (UCD) ... is founded on the principle that users need to be involved in the design and development process for systems to be truly usable - efficient, effective, and satisfying."

It is important to say that user-based design is based on assumptions rather than on evidence. For people subscribing to this view, these assumptions may seem evident. It may seem evident, for example, that in order to design a user-friendly laptop, cellphone, database system, dictionary, etc., you should examine what the users need and what they prefer. However, many of the greatest design successes, such as Apple's computers and iPhones, Dialog's search system, and the heart of Google's search engine, PageRank, were not constructed on the basis of user studies. Apple's approach is described in the following quotation from Verganti (2009, viii):

"A marketing manager for Apple described its market research as consisting of Steve (Jobs) looking in the mirror every morning and asking himself what he wanted (Young and Simon 2005). This claim seems preposterous and illogical - almost blasphemous. It contradicts popular theories of user-centered innovation. We have been bombarded by analysts saying that companies should get a big lens and peruse customers to understand their needs. The framework provided in this book shows that even if a company does not get close to users, even if it apparently does not look at the market, it can be much more insightful about what people could want."

It is thus clear that Apple's philosophy is not based on user studies. The lesson from Dialog is similar: when it was established around 1972, there were two major competitors: Bibliographic Retrieval Services (BRS) and System Development Corporation (SDC).
The latter examined the need for databases using survey methodology, but Dialog constructed a supermarket of different databases and became the leader; each database brought in new customers who in turn used existing databases, a kind of push/pull phenomenon. Our last example is Google, whose PageRank algorithm was not based on user studies, but inspired by bibliometric links between papers. Although Google has since modified its system and has now also introduced principles based on customization and user-based principles (perhaps primarily in order to optimize advertisement rather than retrieval?), all three examples are powerful challenges to prevailing theories of user-centered innovation.

The idea of user-based approaches to KO is that the knowledge needed to design a KOS comes primarily from the study of users (or the involvement of users). This is in contrast to other approaches to KO, which focus on, respectively, technical aspects of computer systems, analysis of documents, expert evaluations, or the analysis of knowledge domains and genres, including their different epistemologies and ideologies. A historical voice, that of the founder of the UDC classification, Paul Otlet, is expressed by Boyd Rayward (1994, 247):

"Otlet's primary concern was not the document or the text or the author. It was also not the user of the system and his or her needs or purposes. Otlet's concern was for the objective knowledge that was both contained in and hidden by documents."

Another classical demand in KO is Hulme's (1911) concept of literary warrant, which is also clearly an alternative to user-based approaches. Does user-based KO represent an alternative or a supplement to such approaches? Not much has so far been said about this in the literature about user-based and cognitive approaches.

A remark should also be made about folksonomies and related social technologies, which are a hot topic these days. The success of such systems depends on the amount of qualified input; they are often considered user-based, but they could alternatively be considered systems drawing on a large amount of volunteer and/or distributed subject expertise. Therefore they do not provide new arguments in relation to the examination of the value of user-based principles in KO.

4.0 Users: abstract or specific?

How are users being studied?
What kinds of assumptions drive the field? Different psychological, sociological, and anthropological theories or paradigms have very different implications for the study of information user(s). In psychology, particularly in behavioral and cognitive psychology, there has been a tendency to consider human beings as fundamentally governed by general, species-specific principles. "The human mind is physiologically and psychologically the same since the homo sapiens was born," wrote Neelameghan et al. (1992, xiv). Neelameghan and other researchers thus work from the premise that the mind is closely related to the brain and therefore assume that the mind has not changed either. Apart from biologically determined variations in the population (as reflected, for example, in bell curves), the mind is considered universal. That means that there are certain universal principles that can be discovered by experimental psychology and by cognitive ergonomics and applied to information science. Examples are that designers of information systems should avoid the color red because red is difficult to perceive, or that human short-term memory has a limited capacity and therefore designers should avoid presenting more than seven units of information at a time (Miller 1956). Cognitive psychologist George A. Miller is of particular interest to information science because he later developed the WordNet system. We shall return to him and cognitive psychology when we look at the cognitive approach to KO below.

An alternative to the understanding of the mind as a universal mechanism (e.g., a universal computer) is to consider it as culturally, socially, and individually shaped. The fields of cultural psychology and social anthropology are based on the understanding that the basic functions of the human mind are determined by the languages and other cultural symbolic systems that are learned in a given culture or domain. This cultural view is in opposition to the cognitive view in information science and is the perspective from which the present author approaches problems of KO.

Case (2006, 2007) categorized the groups studied in information science as defined by occupation/discipline, by role, or by demographic status.
Some examples of groups studied are:

By occupation/discipline: Scientists, engineers, doctors, nurses, pharmacists, social scientists, humanities scholars, psychologists, industrial managers, journalists, lawyers, farmers, artists, police officers, arts administrators, theologians, architects, teachers.

By role: Patients, students, researchers, professors, citizens, jobseekers, genealogists, hobbyists (e.g., cooks, coin buyers, knitters), library users, shoppers, readers, Internet users.

By demographic: Children, teenagers, women, mothers, older people, immigrants, poor people, homeless people, retired people, inhabitants of particular countries or areas, ethnic minorities.
Information-seeking behavior is, of course, partly determined by whether you are a scientist, a nurse, a farmer, or a teacher, and also by your role and demographic characteristics. But individual characteristics are also at play. Jannica Heinström (2005), for example, assessed the information behavior of students by survey, identifying three behavioral patterns (fast surfing, broad scanning and deep diving) and relating these patterns to different personalities and learning styles.

There seem to be four important critical issues in relation to such kinds of studies. The first is that they are descriptive studies of what users do. But how can we, as information professionals, use such knowledge to help users improve their information searching? How can we get from descriptions to prescriptions? If we learn, for example, that students prefer Google to library OPACs (Rosa et al. 2005, 2006; Pors 2005), what do we learn from this about how to improve our information services? Larson (1991) informed us that people learned to avoid Library of Congress Subject Headings, but not how to improve the system for users. We, as information specialists, should have knowledge and be able to help people search for information. Our knowledge as information professionals cannot, therefore, be obtained from what the users do. Empirical studies of users may be popular because this seems to be a relatively simple way to do scientific studies in information science. But it is always important to consider what kind of knowledge it is important to gain.

A second problem is the tendency to consider the average or typical information behavior. Allen (1966) is a famous study showing that engineers prefer an easily available information source at the expense of information sources considered by the engineers to be of higher quality. But libraries need to be there with high-quality information in order to serve the minority who care to check the correctness of information.
Information services should not be designed only for the average user, but also for users who are critical and who want to examine things carefully. Without such critical people (and quality information services to support them), errors would never have a chance of being corrected.

A third problem is the way studies are often generalized. Shiri, Revie, and Chowdhury (2002, 12), for example, found that the results of these studies demonstrate the usefulness of thesauri both in terms of providing users with alternative search terms for query expansion and in improved retrieval performance. The quality of the specific thesauri was not investigated, however. It seems obvious that the quality and the usefulness of thesauri must be related and that the quality of a specific thesaurus depends on the principles and qualifications on which it is constructed. As stated by the authors:

"Given the fact that few domain-specific thesauri have been evaluated in terms of their coverage and performance for query expansion, research needs to be carried out to evaluate thesaurus-aided query expansion in a range of subject domains" (Shiri, Revie, and Chowdhury 2002, 13).

The fourth problem is that it is a fundamental error to see users as outside information and to investigate information behavior as variables between supposedly independent factors. People need to obtain the information that is needed in, for example, their jobs. Otherwise, they are not qualified and would not keep their jobs. Therefore users always have some kind of pre-knowledge and are positioned somewhere inside the information ecology. Whitley's (1984/2000) The Intellectual and Social Organization of the Sciences is a book that classifies scholarly disciplines according to scientists' functional and strategic dependence, and technical and strategic uncertainty. Krampen, Fell, and Schui (2011) is a study of psychologists' information seeking based on this model. The point is that in some fields users may be freer to have individual preferences in formulating research problems, selecting research methods and seeking information, whereas in other fields there are narrowly defined norms that have to be followed. Understanding human information behavior as shaped in this way by social arrangements is much more fruitful than studying correlations between variables, as is traditionally done (Day 2011; Johnson 2011).
5.0 The Book House System as an example of a system based on user studies

One of the few prominent examples of systems developed on the basis of user/cognitive studies is the Book House System (or AMP system) developed in 1987 by Annelise Mark Pejtersen and associates (Pejtersen 1989, 1992). This system represents, in many different ways, a pioneering work and is probably one of the most prominent examples of KOS based on the user-based view. It was a Danish system developed for information retrieval in fiction over a period of 20 years. It contained about 3,000 references to books for adults and children. The books have been analyzed according to user preferences, and the system is based on comprehensive research. It was based eclectically on many ideas. It used the most advanced computer technology of the day, e.g., color screens and an icon-based user interface. It used a kind of facet analysis for indexing fiction, and Pejtersen abandoned many traditional properties of classification systems: the class marks, the hierarchies and the idea of exhaustivity and mutually exclusive classes (and the reason for doing so was that her classification was not meant for shelf arrangement). The system was well received.

Rune Eriksson (2010, 99-130) is a careful study of the AMP system (unfortunately, as already said, only available in Danish). Annelise Mark Pejtersen is the researcher, among all countries and all times, who has worked most intensively with the classification/indexing of fiction from the perspective of public libraries. She developed the AMP system in several versions; the system changed surprisingly little across them, although it was constantly modified and improved. It was never finished in the sense that it was always meant to be followed by new, improved versions. Some of the versions were implemented in the so-called Book House System from 1987 (Pejtersen 1992).

The AMP system is very thoroughly described and documented. There is (or was) the system itself, its empirical research base, a well-argued structure, detailed interpretations of the categories, examples of records, and manuals for indexers. Some of these things are published in English, but the overwhelming part is only available in Danish, and the manuals exist only in an unpublished form.

In this article, we have to disregard many things, such as the advanced technology relative to its construction, and just focus on the question: how did the study of the users contribute to this successful system?
The claim that it is based on user modeling and applying a cognitive view in knowledge organization is, for example, expressed by Pejtersen (1992, 573) here:

"Traditionally classification and indexing schemes have been developed to reflect the contents of a document in terms of its relationships with the knowledge structure of the subject field to which it belongs and does not usually take the user's request as a focus. What is needed to extend this foundation is an appropriate frame of reference of indexing and searching based on a cognitive analysis focusing on the needs and capabilities of the end-user. Among other things, this can lead to solutions which let the user choose search attributes which adequately cover the specific domain of interest and, at the same time, give the user the opportunity to solve his/her problem in a natural way."
In this quotation, Pejtersen expresses the view that traditional classification and indexing systems do reflect the subject field, but not the users' needs and requests. This is claimed without any critical examination of such traditional systems. It might be the case that traditional classification systems also use classification criteria that are relevant for the users of the domain. For example, the genre concepts developed in fiction are relevant for classification in that domain. It is therefore not demonstrated in the quotation that cognitive studies are superior to literature-based studies.

It is correct that the Book House System uses many more dimensions of indexing documents and is therefore superior to traditional classification systems, which is, of course, an important achievement of Pejtersen. But the idea to do so may simply come from the technology that enabled it and from knowledge of the nature of fiction. There is no evidence that this idea derived from the study of users. It should also be mentioned that Pejtersen was educated in literature studies/literary theory. What role did this domain knowledge play in the design of the system?

The AMP system applied user studies in two ways: 1) The users were consulted before the system was realized in order to get information about how to design it; 2) Users were asked to evaluate versions of the system in order to improve it (or simply tell whether it was good or bad). The first user studies were recordings of conversations between users and librarians in real-life situations.
On the basis of a careful reading of Pejtersen's publications, Eriksson (2010, 108-109; my translation, BH) writes:

"In this way the quantitative analysis of the user-librarian conversations as well as the final examples of such conversations almost come to be a kind of postscript; perhaps it is just unfortunate, but as the publication is, it is the system which came first, while the quantitative analysis and the main part of the examples appear as the second link, that is rather as a legitimation of the relevance of the system for practice than as the foundation of the system. This is not to say that the user-conversations have not played a role for the design of the system ... there are certainly elements from them, of which it can be said that they are expressed in the system. The connection between the conversations and the system is perhaps not quite as intimate as many of the publications say it is."

Pejtersen has also acknowledged, in an interview with Eriksson on May 16, 2007, that one of the dimensions in the AMP system, the author intention, was partly inspired by literary theory. Eriksson finds that literary theory plays a much bigger role than what Pejtersen expresses in her many publications and even in the interview in 2007. During her career, Pejtersen totally ignored the connection to literary theory after 1976. Pejtersen wanted to give the impression that the system was based on the empirical studies of users, not on the application of literary theory. It is Eriksson's opinion, however, that the AMP system is generally wiser than the user conversations on which it claims to be based, and this wisdom is attributed to Pejtersen's background in literary studies. Pejtersen (1994) and Pejtersen et al. (1996) argued for work domain analysis as the methodological basis, but, according to Eriksson (2010, 103), this concept does not change the basic aspects of the AMP system, and, even if Pejtersen et al. (1996) have five authors, this publication is, for long stretches, simply a rewording of Pejtersen (1989).

Why would Pejtersen deny that she uses her knowledge from her formal education in literature? (Why would anybody make themselves look less wise than they really are?) The methodological descriptions of how the system was developed (Pejtersen 1989) underplay the fact that the author has a background in literature studies. Such an attitude may reflect a kind of positivism in which the empirical studies of users are seen as better research than the scholarly studies of literary genres. That might be one reason to repress the role of literary theory. (If the importance of literary theory had been acknowledged, the approach would have been domain-analytic rather than user-based.)

Eriksson (2010, 108) writes that Pejtersen's empirical investigations probably did not reveal all the needs of the users.
He finds it ironic that it is another investigation by Pejtersen that makes this probable. Pejtersen et al. (1996, 42) demonstrated that 31% of users did not find that they had problems finding good books, but, of the remaining 69%, only 8 solved the problem by consulting the librarian. Eriksson (2010, 108):

"This is unfortunate, but it demonstrates that the problems of the users are far more comprehensive than revealed by the specific enquiries. It is remarkable how many enquiries refer to the easy genres, so perhaps users with more complex needs avoid asking the librarian because they do not expect that she is able to help. Another possibility is that they are not able to formulate their query properly, but that does not mean that there are no problems. It is therefore absolutely thinkable that the user-librarian conversations only reveal a part of the real user needs."

This criticism reveals that user studies are able to identify user needs only to a limited degree. Another problem with Pejtersen's user studies, according to Eriksson (2010, 108), is that she based her system on the same studies after 15 years. During that time, society changed, new user groups arrived, and the literature itself evolved in ways that provoked new kinds of enquirers.

We can conclude this example by stating that the Book House System was probably user-based and cognitive more through claims than in reality. The basic ideas and structures may have been based on domain knowledge, and much of the careful empirical work may have contributed only to a limited degree. In addition, it can be said that the empirical studies probably could not have been carried out without solid domain knowledge in literary studies.

6.0 The word association method as an example of a user-based methodology

Marianne Lykke (formerly Marianne Lykke Nielsen) is a Danish information scientist. She applied the word association method in her Ph.D. dissertation (Lykke Nielsen 2002) as a method of thesaurus construction. In this method, subjects respond to a stimulus word by naming the first other word that comes to mind. The method was developed in psychology by Sir Francis Galton (Galton 1883) to demonstrate his claim that very few thoughts or actions are ever the spontaneous product of the will but are related to desires and ideas, the associations of which we have little conscious awareness. Also, psychologist C. G.
Jung (1875-1961) became curious about the time delay that occurred in responding to certain words. Jung theorized that the delay between stimulus and response indicated some sort of block in self-expression and developed a word association test in 1910. The first consideration is, therefore, that the user-based approach in LIS is here applying a psychological methodology as a tool for thesaurus construction.
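How responses from such a test might be turned into candidate thesaurus terms can be sketched as follows. This is only an illustrative sketch, not the procedure used by Lykke Nielsen (2002): the function names, the frequency threshold `min_count`, and the sample data are all hypothetical assumptions introduced here.

```python
from collections import Counter, defaultdict


def tally_associations(responses):
    """Count how often each response word was given for each stimulus word.

    `responses` is an iterable of (stimulus, response) pairs, one pair per
    subject answer. Words are lower-cased so that "Synonym" and "synonym"
    are counted together (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for stimulus, response in responses:
        counts[stimulus.strip().lower()][response.strip().lower()] += 1
    return counts


def candidate_terms(counts, stimulus, min_count=2):
    """Return responses given at least `min_count` times, most frequent first."""
    ranked = counts[stimulus.lower()].most_common()
    return [(word, n) for word, n in ranked if n >= min_count]


# Hypothetical answers from six subjects to the stimulus word "thesaurus".
responses = [
    ("thesaurus", "dictionary"),
    ("thesaurus", "synonym"),
    ("thesaurus", "Synonym"),
    ("thesaurus", "index"),
    ("thesaurus", "dictionary"),
    ("thesaurus", "synonym"),
]

counts = tally_associations(responses)
suggested = candidate_terms(counts, "thesaurus")
print(suggested)
```

The sketch only counts and ranks what subjects say. Who the subjects are, and whether a frequent association is a good term for the domain, are decisions the counting cannot make.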
What is it an alternative to? It might be an alternative to literary collection methods. Lykke Nielsen (2002, 174) writes: "compared to literary collection methods it [the word association method] is an economic and efficient method." However, both the literary method and the word association method may be carried out in many different ways: different documents could be examined and different people could be used as subjects for the word association method. In order to determine the relative benefits and drawbacks of the two methods, both alternatives have to be considered carefully. In both cases the question arises: What are the information sources with the highest level of cognitive authority? In order to answer that question, and thus to select documents or persons, subject knowledge and subject theory are required.

This leads to another question: should the people used for the word association test by Lykke be considered experts, or should they be considered users? If they are considered experts, then we are not talking of a method of thesaurus construction that is user-based, but of a method to gain knowledge from experts. As I previously wrote (Hjørland 2002, 259-60):

"The data collection methods described in Lykke Nielsen (2000) are well known in AI [artificial intelligence] as techniques or methods of knowledge elicitation. If you are going to build an expert system, you have to get the expert knowledge from somebody or somewhere. An obvious solution is to elicit the needed knowledge from somebody considered an expert on the task or issue. Cooke (1994), for example, presents a variety of such knowledge elicitation techniques, including group discussions and free associations. Such methods have primarily been considered of a psychological nature, while the domain-analytic methods that I have been a spokesman for have mainly been of a sociological and epistemological nature."

We shall go no further with the word association test here.
As in the Book House example, there seems to be a problematic tendency to claim that the necessary information comes from users rather than from adequate domain knowledge.

7.0 The meaning of the cognitive approach

The cognitive view (or in the plural: the cognitive views) in KO is related to the cognitive views in LIS in general as well as to broader trends related to the development of cognitive science. Within psychology, the cognitive paradigm is mostly used synonymously with information-processing psychology. Its basic assumptions have been expressed in this way (Pylyshyn 1983, 70):

"[The approach is] the attempt to view intelligent behavior as consisting of processing information or to view intelligence as the outcome of rule-governed activity. But these characterizations express the same underlying idea: computation, information processing and rule-governed behavior all depend on the existence of physically instantiated codes or symbols that refer to or represent things and properties outside the behaving system. In all these instances, the behavior of the systems in question (be they minds, computers or social systems) is explained, not in terms of intrinsic properties of the system itself, but in terms of rules and processes that operate on representations of extrinsic things."

This paradigm was introduced in psychology around 1956 by, in particular, Jerome Bruner, Noam Chomsky, George A. Miller, and Ulrich Neisser. It was received as a scientific revolution. By the 1990s, it was, however, confronted by increasing criticism, and many researchers, including Bruner (1990), turned against their own former understanding.

The relationship between cognitive psychology and information science is based partly on a specific understanding of users as governed by internal rules, structures, capacities, and programs, as in George A. Miller's study of limits in short-term memory. It is also based on the concept of expert systems, and there has been a mutual inspiration between cognitive psychologists and computer scientists developing such artificial intelligence.
This issue was also taken up in information science: for example, Peter Ingwersen (1992) developed the so-called MEDIATOR model, and he also decided that a textbook on cognitive psychology (Lindsay and Norman 1977) should form the basis for the new master's program at the Royal School of Library and Information Science in Denmark in 1990.

In his 1992 monograph, Ingwersen (1992, 157) saw the cognitive view as a synthesis between user-oriented approaches and the traditional approach and wrote: "The transformation from the user-oriented and the traditional approaches into a cognitive one happens when IR research comes to have each other's isolated models in mind."

In Ingwersen and Järvelin (2005, 191), however, user-oriented and cognitive views seem no longer to be separated. Here, the authors [discuss] the development of cognitive and user-oriented research from the 1970s and onwards under one umbrella and state that the cognitive approach to IR could briefly be characterized as user- and intermediary-oriented. I interpret this, in line with other writings, as a tendency to give up the cognitive approach as differentiated from user-based approaches.

However, a broader historical description may be necessary in order to explain the appearance (and fall?) of the cognitive view. After 1990, many people became skeptical about the theoretical basis of the cognitive paradigm, in particular the way the role of culture and society in cognition was marginalized by cognitive science. Also, in information science, this view has been seriously attacked (see e.g., Palermiti and Polity 1995). We shall return to this in section 9, Reception and criticism of the cognitive view in KO, below.

8.0 Color classification as an example of a controversy over cognitivism

The purpose of this section is to address a fundamental issue related to the cognitive view as it has been discussed in the interdisciplinary literature; we connect a basic problem of KO with an important debate: should concepts and classification be determined by studying our biological make-up or by studying different domains? Research based on the assumptions in the cognitive view may assume that concepts are somehow hardwired to our mind or brain, for example, in our so-called mental lexicon. Sociocultural views, on the other hand, tend to assume that concepts are learned by growing up and living in a specific culture. The difference between these two points of view is perhaps seen most clearly in the controversy in the research on color concepts.

The book Basic Color Terms: Their Universality and Evolution (Berlin and Kay 1969) has had a big impact on the view of color terms.
In that book the authors claimed the universality and evolutionary development of 11 basic color terms (BCTs); the following characterization of this view is written by two of its main critics, Barbara Saunders and Jaap van Brakel (2001, 162):

"According to the dominant view in cognitive science, in particular in its more popularized versions, color sensings or perceptions are located in a quality space. This space has three dimensions: hue (the chromatic aspect of color), saturation (the intensity of hue) and brightness. This space is structured further via a small number of primitive hues or landmark colors, usually four (red, yellow, green, blue) or six (if white and black are included). It has also been suggested that there are eleven semantic universals - the six colors previously mentioned plus orange, pink, brown, purple and grey."

One of the influential standards for classifying colors is the Munsell color system developed by the American painter Professor Albert Henry Munsell (1858-1918). This system is in cognitive science often assumed to reflect the human visual system, although not all color names are developed in all cultures (Saunders 1998):

"The relation between Munsell, the workings of the visual system [in the brain], and the colour-naming behaviour of people, is so tight it can be taken to be a causative law. Diversity of colour-naming behaviour is defined as a system-regulated stability evinced by Evolution. The full lexicalisation of the human colour space is designated Evolutionary Stage Seven, as in American English; languages below this level are the fossil record."

Berlin and Kay's (1969) view of color concepts is contrasted with a cultural-relative view in which our color concepts (and semantics in general) are not supposed to be determined primarily by our visual (neurological) system, but by our relative needs to act in relation to the colored environment. Cultural psychologist Carl Ratner (1989, 361) writes:

"Sociohistorical psychology emphasizes the fact that sensory information is selected, interpreted and organized by a social consciousness. Perception is thus not reducible to, or explainable by, sensory mechanisms per se. Sapir, Whorf, Vygotsky and Luria all maintained that sensory processes are subordinated to and subsumed within higher social psychological functions."
Van Brakel and Saunders (2001, 162) continue with critical comments on the cognitive view:

Scientific evidence for these widely accepted theories is at best minimal, based on sloppy methodology and at worst non-existent. Against the standard view (Berlin and Kay's view), it is argued that color might better be regarded as the outcome of a social-historical developmental trajectory in which there is mutual shaping of philosophical presuppositions, scientific theories, experimental practices, technological tools, industrial products, rhetorical frameworks, and their intercalated and recursive interactions with the practices of daily life. That is: color, the domain of color, is the outcome of interactive processes of scientific, instrumental, industrial, and everyday lifeworlds. That is: color might better be called an exosomatic organ, a second nature.

Regarding relativism in color concepts, see also Goodwin 2000; Lucy 1997; Roberson, Davies, and Davidoff 2000; and Saunders 2000. We may thus conclude that the universality of color terms is a controversial point of view. The view that was dominant for a period (and probably still is) is based on cognitivism and maintains the universality of concepts, while a well-argued minority maintains a relativist view of color concepts. This debate is important for the theory of knowledge organization: should colors (and all other concepts) be classified the same way for all groups of users? Should the study of concepts be founded on psychological studies, or should it rather be based on cultural and domain-specific studies? This is what the controversy about cognitivism is basically about.

9.0 Reception and criticism of the cognitive view in KO

The cognitive view came to the forefront of knowledge organization in 1992, when the Second International ISKO Conference in Madras had this approach as its theme (Neelameghan et al. 1992). In the proceedings, there is an introduction from which we quote (Neelameghan et al. 1992, xiii):

Cognitive paradigms indicate the knowledge-seeking behaviour of individuals and groups of individuals. 
It is a nascent state of human mind wherein a kind of gap in knowledge structure occurs and the mind searches for a connection through its external environment. In the context of Information Retrieval, the searcher seeks some relevant information from the vast store of a knowledge base to find some kind of equilibrium in the knowledge state. The analysis and
diagnosis of this state of inquiring mind provides guidelines for organisation of information in databases and similar environments. Such guidelines are aimed at providing a conducive compatibility between searcher's approach and knowledge organisation in the database.

It continues (Neelameghan et al. 1992, xiv): "The human mind is physiologically and psychologically the same since the homo sapiens was born." This introduction does not provide any hint at all about how to investigate the mind in a way that may provide a basis for indexing, classification, or metadata assignment, which is what KO is about. The remark that the human mind has been physiologically and psychologically the same since Homo sapiens was born is related to the controversial assumption in the cognitive paradigm that the mind is a universal system of mechanisms. Against this stands the alternative view that, psychologically, the mind is also historically, culturally, and socially determined. The introduction above thus disregards the social nature of knowledge. It also fails (as does Xiao (1994), cf. below) to compare the cognitive paradigm with other paradigms in KO.

Ingetraut Dahlberg, the founder of ISKO and the journal Knowledge Organization, wrote an editorial about the cognitive view in KO (Dahlberg 1992). Here the term "cognitive approaches" is declared a tautology because all approaches to KO must, in one way or another, be concerned with conceptual and cognitive issues; the term thus does not specify anything new in KO. Then different paradigms in LIS are considered. Both the so-called physical view associated with the Cranfield experiments and the influence of Shannon's Information Theory are said to have led astray generations of information workers. Ranganathan's approach is mentioned as the first (and only) paradigm in KO. 
De Mey's (1980, 48) often-used definition that "any processing of information, whether perceptual or symbolic, is mediated by a system of categories or concepts which, for the information-processing device, are a model of his [its] world" is quoted by Dahlberg, as is the conclusion that the meaning of the cognitive view is that an information retrieval system should reflect in its operations, in some way or other, the cognitive world of the user. Whether Dahlberg sees an inherent conflict between different approaches to KO, or whether the cognitive view somehow improves Ranganathan's
theory is not discussed. Perhaps she also felt that it would be inappropriate to make a fundamental criticism since cognitive paradigms were chosen as the theme of the conference? This example demonstrates that it may be difficult to find clarification of theoretical views in the printed literature.

Xiao (1994) is a paper about facet analysis as a paradigm in KO which also includes a discussion of the relation between facet analysis and cognition. She fails, however, to consider the specific literature about cognitive views in LIS and the basic assumptions put forward under this label. She just says that Ranganathan had an epistemological view (that knowledge is dynamic, multidimensional, and unlimited). She also fails to identify other contemporary approaches to KO with which the facet-analytic paradigm can be compared.

Bernd Frohmann (1990) is the most important critic of the cognitive view in KO. Drawing on the philosopher Ludwig Wittgenstein, he contests the mentalism represented in current work on human indexing. Frohmann claims that indexing rules are not based on cognitive processes resident in the minds of users (as understood in cognitive views, which he also includes in the term mentalism). Rather, indexing is based on socially constructed rules apprehended by indexers. Frohmann (1990, 96) therefore argues that indexing theory must shift away from the cognitive view and rule discovery and toward rule construction:

Mentalism's focus on processes occurring in minds conceals the crucial social context of rules. Since we do not understand the rule we are constructing without understanding its social context, or the way it is embedded in the social world, its point, its purpose, the intentions and interests it serves, in short, the social role of its practice, indexing theory cannot avoid investigation into the historical, economic, political and social context of the rules in its domain. 
Mentalism, on the other hand, either erases the social dimension altogether by conceiving rules as operating in disembodied, ahistorical, classless, genderless and universal minds, or else acknowledges it only by expanding the set of rules of mental processing.

In a paper from the Second International ISKO Conference in Madras, Frohmann (1992, 47) writes:

Human subjectivity, or personal identity, consists far less in offering a stable ground for the
unification of messages into a coherent picture, image or model of the world than various competing, temporary, fragmentary and contradictory postures and poses, tentatively stitched together from the available products of real social relations. A genuine shift to users can therefore not be carried out within the abstract, universal and representational form of the cognitive paradigm in LIS theory.

A more recent criticism has been put forward by Jack Andersen (2004, 139-144). He discusses request-, user- and cognitive-oriented indexing and writes:

A cognitive approach to indexing has been put forward in several writings by John Farrow (Farrow 1991; 1994 and 1995). Farrow's objective is to provide an understanding of the indexing process based on cognitive psychology and cognitive reading research. Reading research distinguishes between perceptual and conceptual reading. The former is relying on scanning the text for cues, whereas the latter is dependent on the background knowledge (e.g. knowledge of subject matter) a reader approaches the text with. Basically, Farrow argues that the indexing process may be viewed in light of these two modes of reading. It is, however, difficult to see what a cognitive approach to indexing offers and, if it offers something, what is cognitive about it. Turning indexing (and reading) into a cognitive matter is to remove attention away from the typified socio-cultural practices of document production and use, that authors, indexers and readers are engaged in.

Mai (2000, 123-124) also criticizes Farrow's cognitive model of indexing as it "... adds no further knowledge or instructions to the process. He simply says that indexing is a mental process, which can be explained by using models of human information processing from cognitive psychology. But these arbitrary models of minds, memory and cognition explain little about the indexing process."

Joacim Hansson (2006, 33) is also among the critics:

In knowledge organization theory, cognitive perspectives have not been as dominant as in information behavior research. The reason for this is it is practically impossible, at least in the long run, to avoid connecting knowledge organization and classification research to the actual content of the
documents and document collections in relation to the classification and indexing performed. This can seem trivial, but it is actually not.

As described above, the cognitive view in KO has been met with important criticisms. Unfortunately, the adherents of the cognitive view have not provided a proper scholarly response. Konrad (2007, 23) also found that "the cognitive viewpoint literature [in LIS] is sparse in its use of, and even reference to, any of these [cognitive science disciplines], preferring to originate its own postulates in these areas." The cognitive view in KO thus seems to lack sufficient intellectual foundations.

10.0 WordNet as an example of a system based on the cognitive paradigm

WordNet is today a very large lexical database freely available on the Internet (http://wordnet.princeton.edu/), and it is constantly evolving. It is a very fine English-English dictionary that is useful for looking up unknown words and their relations to other words and underlying concepts. It was developed by the previously mentioned cognitive psychologist George A. Miller as a tool for developing AI technologies and is claimed to be based on principles derived from psycholinguistics. The question for us is: what is the connection between the (claimed) cognitive foundation and the actual database? We cannot go into detail here; we shall look briefly into the issue and leave a thorough investigation for another time. The psychological/cognitive principles underlying WordNet have been presented in, among other works, Fellbaum (1998, 2005), Miller (1998a, 1998b), and Nikolova, Boyd-Graber, and Fellbaum (2009). George Miller (1998b, 43) wrote:

In earlier descriptions of WordNet it was suggested that WordNet is based on psycholinguistic principles in the same sense that the Oxford English Dictionary is based on historical principles. That claim has not borne the fruit that was expected at the time it was first made. 
The fact is that WordNet has been largely ignored by psycholinguists.

Miller (1998b, 44) continues:

Development of the nouns in WordNet has therefore been driven far more by potential application to computational linguistics than by
advances in theories of cognitive psychology. Perhaps this outcome should have been foreseen. After all, a dictionary based on historical principles contributed little to the study of history.

These quotations are interesting for two reasons. First, if we consider dictionaries to be kinds of knowledge-organizing systems, as Hodge (2000) does, the quotations confront two different approaches to their construction: the historical versus the psycholinguistic/cognitive approach. Second, they are partially an acknowledgement that the cognitive approach did not succeed. (The claim that historical dictionaries did not contribute to the study of history may be wrong: the German tradition of Begriffsgeschichte has, as far as I know, contributed considerably to the understanding of the historical periods which they reflect.) It seems that Miller does acknowledge that research in cognitive psychology has not had much to offer. It should be said, however, that Miller says more than what is quoted here. He also says that it is not false that WordNet is based on psycholinguistic principles, and he exemplifies why that is the case, but we shall not go into those arguments here. Instead two things should be said: Miller acknowledges that semantic relations (e.g., synonymity) are not universal but context-dependent, yet WordNet itself does not reflect this. Later a semantic concordance was developed at the Princeton Cognitive Science Laboratory (Fellbaum 1998, 13). In my opinion, this is an approach that is closer to being a historical-social approach than a cognitive approach. There seem to be underlying assumptions about one correct way of representing semantic relations (Miller 1998a, xvii). Nowhere is there an indication that semantic relations reflect scientific theories, for example that whether a certain drug is a tranquilizer or not depends on medical and biological experiments. Human users learn such empirically established knowledge, which cannot be hardwired into our brains from birth. 
Therefore the cognitive enterprise seems to be based on problematic assumptions. As was the case with the Book House System and with the word association method, it may be that WordNet is in reality less based on cognitive views than was expected and than has been claimed.

11.0 Psychology versus epistemology

Perhaps the popularity of the user-based and cognitive views is based on confusion between users and subjectivity, between psychology and epistemology?
Psychology is about general models of minds or about individual minds. Epistemology, on the other hand, is about ways of thinking (paradigms) as reflected by scientific disciplines and by groups of people. It is one thing to say that indexing should reflect an abstract human mind, and quite a different thing to say that indexing can be tailored to specific groups of users, e.g., evidence-based medical doctors or feminist scholars. The domain-analytic view in LIS and KO is an attempt to base the field on the criteria of relevance shared by groups of people (Hjørland 2010). What Fidel (1994) calls user-centered indexing, as opposed to document-oriented indexing, may very well be oriented towards certain perspectives such as evidence-based practice or feminist epistemology without being based on studies of users. It seems better to say that, on the epistemological view, a specific way of indexing serves certain theoretical views (e.g., an evidence-based view or a feminist point of view) better than others, rather than serving a specific group of people. If indexing leaves the studies of users and abstract minds and turns instead towards serving specific epistemological criteria, then we have turned away from the cognitive view to the domain-analytic approach to KO.

12.0 Conclusion

This article has put forward a wide range of problematic assumptions concerning the user-based and cognitive approaches to knowledge organization. Does that mean that the enormous amount of research in the field has been fruitless? Bawden and Robinson (2012) have provided their view on this issue and try to summarize the results revealed so far. Their conclusions seem, however, rather vague and general. They state:

While there is therefore a large body of good evidence to support the practice of information provision to a variety of user groups, it is not so clear that many general findings have emerged. What, after over fifty years of effort, do we know about information behaviour in general? 
(Bawden and Robinson 2012, 204).

Bawden and Robinson do not, however, demonstrate that the large body of good evidence is useful in the construction of KOS. In general, their answers to what we have learned from many years of user studies are rather vague, for example, that users tend to follow the principle of least effort and that they do not tend to use the products of LIS very much. There is one thing in knowledge organization that we really seem to have learned from user studies: when online systems were introduced in the 1960s
and 1970s, a common experience was that searchers preferred verbal search languages. They did not consider classification codes to be user-friendly. Often user studies may also approach domain studies by characterizing the nature of information in a given domain (e.g., Bates, Wilde, and Siegfried 1993). The basic issue in KO is, however, about questions such as: should document A be classified in class X? Is term A synonymous with term B? User-based and cognitive approaches cannot contribute to solving such core issues.

References

Allen, Thomas John. 1966. Managing the flow of scientific and technological information. Ph.D. dissertation. Massachusetts Institute of Technology, Sloan School of Management.
Andersen, Jack. 2004. Analyzing the role of knowledge organization in scholarly communication: an inquiry into the intellectual foundation of knowledge organization. Ph.D. dissertation. Copenhagen: Royal School of Library and Information Science. Available http://www.db.dk/dbi/samling/phd/jackandersenphd.pdf.
Bates, Marcia J., Wilde, Deborah N. and Siegfried, Susan. 1993. An analysis of search terminology used by humanities scholars: the Getty Online Searching Project report no. 1. Library quarterly 63: 1-39.
Bawden, David and Robinson, Lyn. 2012. Introduction to information science. London: Facet.
Berlin, Brent and Kay, Paul. 1969. Basic color terms. Their universality and evolution. Berkeley, CA: University of California Press.
Bernal, John Desmond. 1948. Preliminary analysis of pilot questionnaires on the use of scientific literature. In Proceedings of the Royal Society Scientific Information Conference. London: Royal Society, pp. 589-637.
Bruce, Harry. 2002. A focus on usings. In: The user's view of the Internet. Lanham, MD: Scarecrow Press, pp. 31-67.
Bruner, Jerome. 1990. Acts of meaning. Cambridge, MA: Harvard University Press.
Case, Donald O. 2006. Information behavior. Annual review of information science and technology 40: 293-327.
Case, Donald O. 2007. Looking for information: a survey of research on information seeking, needs and behavior, 2nd ed. New York: Academic Press.
Cooke, Nancy J. 1994. Varieties of knowledge elicitation techniques. International journal of human-computer studies 41: 801-49.
Cronin, Blaise. 2009. Introduction. In Annual review of information science and technology 43. Medford, NJ: Information Today, pp. vii-x.
Dahlberg, Ingetraut. 1992. Cognitive paradigms in knowledge organization. International classification 19: 125, 145.
Day, Ronald E. 2011. Death of the user: reconceptualizing subjects, objects, and their relations. Journal of the American Society for Information Science and Technology 62: 78-88.
Eriksson, Rune. 2010. Klassifikation og indeksering af skønlitteratur - et teoretisk og historisk perspektiv. Ph.D. dissertation. Copenhagen: Royal School of Library and Information Science. Available http://pure.iva.dk/files/30769518/Eriksson%5Fphd%5F2010.pdf
Farrow, John D. 1991. A cognitive process model of document indexing. Journal of documentation 47: 149-66.
Farrow, John D. 1994. Indexing as a cognitive process. In Kent, Allen, ed., Encyclopedia of library and information science 53 supp. 16. New York: Marcel Dekker, pp. 155-71.
Farrow, John D. 1995. All in the mind: concept analysis in indexing. The indexer 19(4): 243-47.
Fellbaum, Christiane. 1998. Introduction. In Fellbaum, Christiane, ed., WordNet: an electronic lexical database. Cambridge, MA: The MIT Press, pp. 1-19.
Fellbaum, Christiane. 2005. WordNet and wordnets. In Brown, Keith, ed., Encyclopedia of language and linguistics. Oxford: Elsevier, pp. 665-70.
Fidel, Raya. 1994. User-centered indexing. Journal of the American Society for Information Science 45: 572-6.
Fisher, Karen E., Erdelez, Sanda and McKechnie, Lynne. 2005. Theories of information behavior. Medford, NJ: Information Today.
Frohmann, Bernd. 1990. Rules of indexing. A critique of mentalism in information retrieval theory. Journal of documentation 46: 81-101.
Frohmann, Bernd. 1992. Cognitive paradigms and user needs. In Neelameghan, A., Gopinath, M. A., Raghavan, K. S. and Sankaralingam, S. P., eds., Cognitive paradigms in knowledge organization: second international ISKO conference. Madras, August 26-28, 1992. Madras: Sarada Ranganathan Endowment for Library Science, pp. 35-50.
Furner, Jonathan. 2012. FRSAD and the ontology of subjects of works. Cataloging & classification quarterly 50 nos. 5-7: 494-516.
Galton, Francis. 1883. Inquiries into human faculty and its development. London: Macmillan.
Goodwin, Charles. 2000. Practices of color classification. Mind, culture and activity 7: 19-36.
Gray, William S. and Monroe, Ruth Learned. 1929. Reading interests and habits of adults. New York: Macmillan.
Hansson, Joacim. 2006. Knowledge organization from an institutional point of view: implications for theoretical & practical development. Progressive librarian: a journal for critical studies & progressive politics in librarianship 27: 31-43.
Heinström, Jannica. 2005. Fast surfing, broad scanning and deep diving: the influence of personality and study approach on students' information-seeking behaviour. Journal of documentation 61: 228-47.
Hjørland, Birger. 2002. Epistemology and the socio-cognitive perspective in information science. Journal of the American Society for Information Science and Technology 53: 257-70.
Hjørland, Birger. 2008. What is knowledge organization (KO)? Knowledge organization 35: 86-101.
Hjørland, Birger. 2010. The foundation of the concept of relevance. Journal of the American Society for Information Science and Technology 61: 217-37.
Hodge, Gail. 2000. Systems of knowledge organization for digital libraries: Beyond traditional authority files. Washington, DC: The Council on Library and Information Resources. Available http://www.clir.org/pubs/reports/pub91/contents.html
Hulme, Edward Wyndham. 1911. Principles of book classification. Library Association record 13: 354-58, 389-94, 444-49.
Ingwersen, Peter. 1992. Information retrieval interaction. London: Taylor Graham.
Ingwersen, Peter. 1996. Information and information science in context. In Olaisen, Johan Leif, Munch-Petersen, Erland and Wilson, Patrick, eds., Information science: from the development of the discipline to social interaction. Oslo: Scandinavian University Press, pp. 69-111.
Ingwersen, Peter and Järvelin, Kalervo. 2005. The turn. Integration of information seeking and retrieval in context. Dordrecht, The Netherlands: Springer.
Jensen, Povl Johannes. 1973. Catalogue and scholarship: D. G. Moldenhawer's catalogue in the Royal Library of Copenhagen. Copenhagen: The Royal Library.
Johnson, Nathan R. 2011. Review of Ron Day's (2011) Death of the user. Available http://whoisnate.com/2011/07/01/ron-days-death-of-the-user/
Jung, Carl G. 1910. The association method. The American journal of psychology 31: 219-69.
Konrad, Allan Mark. 2007. On inquiry: human concept formation and construction of meaning through library and information science intermediation. Ph.D. dissertation. University of California. Available http://escholarship.org/uc/item/1s76b6hp
Krampen, Günter, Fell, Clemens and Schui, Gabriel. 2011. Psychologists' research activities and professional information-seeking behaviour. Journal of information science 47: 439-50.
Larson, Ray R. 1991. The decline of subject searching: long-term trends and patterns of index use in an online catalog. Journal of the American Society for Information Science 42: 197-215.
Lazarsfeld, Paul Felix. 1940. Radio and the printed page. New York: Duell, Sloan and Pearce.
Lindsay, Peter H. and Norman, Donald. 1977. Human information processing: an introduction to psychology, 2nd ed. New York: Academic Press.
Lucy, John A. 1997. Linguistic relativity. Annual review of anthropology 26: 291-312.
Lykke Nielsen, Marianne. 2000. Domain analysis, an important part of thesaurus construction. In Advances in classification research online. http://journals.lib.washington.edu/index.php/acro/article/view/12768
Lykke Nielsen, Marianne. 2002. The word association method: a gateway to work-task based retrieval. Åbo: Åbo Akademi University Press.
McDiarmid, Erret Weir. 1940. The library survey: problems and methods. Chicago: American Library Association.
Mai, Jens-Erik. 2000. The subject indexing process: an investigation of problems in knowledge representation. Ph.D. dissertation. The University of Texas at Austin.
Martin, Lowell A. 1976. User studies and library planning. Library trends 24: 483-96.
Menzel, Herbert. 1966. Information needs and uses in science and technology. Annual review of information science and technology 1: 41-69.
Mey, Marc de. 1980. The relevance of the cognitive paradigm for information science. In Harbo, Ole and Kajberg, Leif, eds., Theory and application of information research. Proceedings of the 2nd international research forum on information science. London: Mansell, pp. 49-61.
Miller, George A. 1956. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological review 63: 81-97. Available http://psychclassics.yorku.ca/Miller/
Miller, George A. 1998a. Foreword. In Fellbaum, Christiane, ed., WordNet: an electronic lexical database. Cambridge, MA: The MIT Press, pp. xv-xxii.
Miller, George A. 1998b. Nouns in WordNet. In Fellbaum, Christiane, ed., WordNet: an electronic lexical database. Cambridge, MA: The MIT Press, pp. 23-46.
Nahl, Diane. 1996. The user-centered revolution: 1970-1995. In Kent, Allen, ed., Encyclopedia of microcomputers 19. New York: Marcel Dekker, Inc., pp. 143-99.
Nahl, Diane. 2003. The user-centered revolution. In Drake, Miriam A., ed., Encyclopedia of library and information science (2nd edn.). New York: Marcel Dekker, Inc., pp. 3028-42.
Neelameghan, Arashanapalai, Gopinath, M. A., Raghavan, K. S. and Sankaralingam, S. P. 1992. Introduction. In Cognitive paradigms in knowledge organization: second international ISKO conference. Madras, August 26-28, 1992. Madras: Sarada Ranganathan Endowment for Library Science, pp. xiii-xvi.
Nicolaisen, Jeppe and Frandsen, Tove Faber. 2007. The handicap principle: a new perspective for library and information science research. Information research 12(4): paper colis23. Available http://InformationR.net/ir/12-4/colis/colis23.html
Nikolova, Sonya, Boyd-Graber, Jordan and Fellbaum, Christiane. 2009. Chapter 5: Collecting semantic similarity ratings to connect concepts in assistive communication tools. In Mehler, Alexander, Kühnberger, Kai-Uwe, Lobin, Henning, Lüngen, Harald, Storrer, Angelika and Witt, Andreas, eds., Modelling, learning & processing of text-technological data-structures. New York: Springer-Verlag. Available http://wordnet.cs.princeton.edu/papers/evocation_chapter.pdf
Palermiti, Rosalba and Polity, Yolla. 1995. Desperately seeking user models in information retrieval systems: benefits and limits of cognitivist and marketing approaches. The new review of information and library research 1: 57-65. Available http://www.iut2.upmf-grenoble.fr/RI3/Usermodels.htm
Pariser, Eli. 2011. The filter bubble: what the Internet is hiding from you. New York: Penguin Press.
Pejtersen, Annelise Mark. 1989. The Book House: modeling user needs and search strategies as a basis for system design. Roskilde, Denmark: Risø National Laboratory. (Risø report M-2794).
Pejtersen, Annelise Mark. 1992. The Book House. An icon based database system for fiction retrieval in public libraries. In Cronin, Blaise, ed., The marketing of library and information services 2. London: Aslib, pp. 572-91.
Pejtersen, Annelise Mark. 1994. A framework for indexing and representation of information based on work domain analysis: a fiction classification example. In Albrechtsen, Hanne and Ørnager, Susanne, eds., Knowledge organization and quality management. Proceedings of 3rd International ISKO Conference, Copenhagen, June 1994. Frankfurt: Index Verlag, pp. 251-64.
Pejtersen, Annelise Mark, Albrechtsen, Hanne, Lundgren, Lena, Sandelin, Ringa and Valtonen, Riitta. 1996. Subject access to Scandinavian fiction literature: Index methods and OPAC development. Copenhagen: Nordisk Ministerråd.
Pors, Niels Ole. 2005. Studerende, Google og biblioteker: en undersøgelse af 1694 studerendes brug af biblioteker og informationsressourcer. Copenhagen: Biblioteksstyrelsen and Royal School of Library and Information Science. Available http://www.statensnet.dk/pligtarkiv/fremvis.pl?vaerkid=45550&reprid=1&iarkiv=1
Pylyshyn, Zenon W. 1983. Information science as viewed from the perspective of cognitive science. In Machlup, Fritz and Mansfield, Una, eds., The study of information: interdisciplinary messages. New York: John Wiley & Sons, pp. 63-74.
Ratner, Carl. 1989. A sociohistorical critique of naturalistic theories of color perception. Journal of mind and behavior 10: 361-72. Available http://web.archive.org/web/20031029152929/http://www.humboldt1.com/~cr2/colors.htm
Rayward, W. Boyd. 1994. Visions of Xanadu: Paul Otlet (1868-1944) and hypertext. Journal of the American Society for Information Science 45: 235-50.
Roberson, Debi, Davies, Ian and Davidoff, Jules. 2000. Color categories are not universal: replications and new evidence from a stone-age culture. Journal of experimental psychology: general 129: 369-98.
Rosa, Cathy de, Cantrell, Joanne, Cellentani, Diane, Hawk, Janet, Jenkins, Lillie and Wilson, Alane. 2005. Perceptions of libraries and information resources. A report to the OCLC membership. Dublin, OH, USA: OCLC Online Computer Library Center, Inc. Available http://www.oclc.org/reports/pdfs/Percept_all.pdf.
27
Rosa, Cathy de, Cantrell, Joanne, Hawk, Janet and Wilson, Alane. 2006. College students perceptions of libraries and information resources. A report to the OCLC membership. A companion piece to perceptions of libraries and information resources. Dublin, OH, USA: OCLC Online Computer Library Center, Inc. Available: http://www.oclc.org/ reports/pdfs/studentperceptions.pdf. Rosenbaum, Howard, Davenport, Elisabeth, Lievrouw, Leah and Day, Ron. 2003. The death of the user. Panel presentations at ASIST 2003 Annual Meeting. Westin Long Beach, CA. Available http:// www.asis.org/Conferences/AM03/abstracts/Sun-3 30-4.html Saracevic, Tefko. 1992. Information science: origin, evolution, relations. In Vakkari, Pertti and Cronin, Blaise, eds., Conceptions of library and information science: historical, empirical and theoretical perspectives. London: Taylor Graham, pp. 527. Saunders, Barbara. 1998. Revisiting basic color terms. Paper presented at the conference on Anthropology and psychology: the legacy of the Torres Strait Expedition, St. Johns College, Cambridge 10-12 August. Available http://human-nature.com/science-asculture/saunders.html Saunders, Barbara. 2000. Revisiting basic color terms. Journal of the Royal Anthropological Institute 6: 81-99. Shiri, Ali Asghar, Revie, Crawford and Chowdhury, Gobinda. 2002. Thesaurus-assisted term selection and query expansion: a review of user-centered studies. Knowledge organization 29: 1-19. Siatri, Rania. 1999. The evolution of user studies. Libri 49(3): 13241. Available http://www.libri journal.org/pdf/1999-3pp132-141.pdf Simsova, Silva. 1968. Nicholas Rubakin and bibliopsychology, translated by M. Mackee & G. Peacock. Hamden, CT: Archon Books. Suominen, Vesa. 2007. The problem of 'userism', and how to overcome it in library theory. Information research 12(4) paper colis33. Available http:// InformationR.net/ir/12-4/colis/colis33.html Toms, Elaine G. 2010. User-centered design of information systems. In Bates, Marcia J. 
and Maack, Mary Niles, eds., Encyclopedia of library and information sciences 3rd ed. VII. London: Taylor & Francis, pp. 545260. Urquhart, Donald J. 1948. The distribution and use of scientific and technical information. Proceedings of the Royal Society Scientific Information Conference. London: Royal Society, pp. 408-19.
van Brakel, Jaap & Saunders, Barbara. 2001. Color: an exosomatic organ? In Eschbach, Reiner and Marcu, Gabriel G., eds., Color imaging: device-independent color, color hardcopy, and applications VII. Proceedings from SPIE [the international society for optics and photonics] 4663, pp. 162-76. Verganti, Roberto. 2009. Design-driven innovation: changing the rules of competition by radically innovating what things mean. Boston, MA: Harvard Business Press. Available http://www.designdriven innovation.com/letter.html. Waples Douglas. 1939. People and print. Chicago: University of Chicago Press. White, Howard. D. and McCain, Katherine W . 1998. Visualizing a discipline: an author co-citation analysis of information science, 19721995. Journal of the American Society for Information Science 49: 32755. Whitley, Richard. 1984. The intellectual and social organization of the sciences (2nd edn. with a new introduction 2000). Oxford: Oxford University Press. Wilson, Tom D. 1994. Information needs and uses: fifty years of progress. In Vickery, Brian Campbell, ed., Fifty years of information progress: a journal of documentation review. London: Aslib, pp. 1551. Wilson, Tom D. 2000. Human information behaviour. Informing science 3(2): 4955. Wilson, Tom D. 2008. The information user: past, present and future. Journal of information science 34: 457464. Xiao, Yan. 1994. Faceted classification: a consideration of its features as a paradigm for knowledge organization. Knowledge Organization 21: 648. Young, Jeffrey S. and Simon, William L. 2005. Icon: Steve Jobs the greatest second act in the history of business. Hoboken, NJ: Wiley. Zeng, Marcia Lei, umer, Maja and Salaba, Athena. 2010. Functional Requirements for Subject Authority Data (FRSAD). IFLA Working Group on the Functional Requirements for Subject Authority Records. Available http://www.ifla.org/files/classi fication-and-indexing/functional-requirements-forsubject-authority-data/frsad-final-report.pdf Zipf, George Kingsley. 1949. 
Human behavior and the principle of least effort: An introduction to human ecology. Cambridge, MA: Addison-Wesley.
Knowl. Org. 40(2013)No.1 M. de la Moneda Corrochano, M. J. López-Huertas, E. Jiménez-Contreras. Spanish Research in Knowledge Organization
Spanish Research in Knowledge Organization (2002-2010)

Mercedes de la Moneda Corrochano*, María J. López-Huertas**, and Evaristo Jiménez-Contreras***
*/**/*** Universidad de Granada. Departamento de Biblioteconomía y Documentación, *<dlmoneda@ugr.es>, ** <mjlopez@ugr.es>, *** <evaristo@ugr.es>
Prof. Evaristo Jiménez-Contreras holds a Ph.D. in Documentation (1993) and is professor of Bibliometrics in the Department of Information and Documentation of the University of Granada. He coordinates the research group Evaluation of the Scientific Information and Communication. His main research lines are methodologies for the evaluation of science in academic institutions.

Mercedes de la Moneda Corrochano holds a Ph.D. in Documentation from the University of Granada. She teaches in the Department of Information and Documentation of the same university. Her publications are on the evaluation of science. She is part of the University of Granada research group EC3, which carries out several research projects within the National Research Plan I+D. She is in charge of the Impact Index of Social Science Journals (INRECS).

María J. López-Huertas holds a Ph.D. in Roman Linguistics from the University of Granada. She teaches Knowledge Organization in the Department of Information and Documentation of the University of Granada at graduate and Ph.D. level. Her research interests are knowledge organization design and methods, interdisciplinary knowledge organization, and evaluation of interdisciplinary knowledge. She is the former president and current vice president of ISKO.

De la Moneda Corrochano, Mercedes, López-Huertas, María J., and Jiménez-Contreras, Evaristo. Spanish Research in Knowledge Organization (2002-2010). Knowledge Organization. 40(1), 28-41. 9 references.

ABSTRACT: This study analyzes Spanish research on Knowledge Organization from 2002 to 2010. The first stage involved extracting records from the national and international databases that were queried. The pertinent records were then normalized and processed according to the usual bibliometric procedure. The results point to a mature specialty following the path of the past decade. There is a remarkable increase of male vs. female authors per publication, although the gender gap is not big. It is also evident that there is a remarkable internationalization in publication and that the content map of the specialty is more varied than in the previous decade.

Received 2 October 2012; Revised 27 October 2012; Accepted 5 November 2012
1.0 Introduction

In a previous study (López-Huertas and Jiménez-Contreras 2004), scientific output in the area of knowledge organization was first analyzed for the period 1992-2001. Since then, there have been great changes in the field itself, as well as in the university setting where most of this research and these authors are rooted. The present study attempts to reflect the state of knowledge organization research in Spain during the period 2002 to 2010, and to compare it with the results described for the previous decade of 1992-2001.
With the focus on knowledge organization (KO), the trend detected in 1992-2001 was one of positive evolution and expansion, with overflow into corporate settings or the workplace in general, into decision-making and so-called competitive intelligence (2012). At any rate, there is still no overall consensus among specialists as to whether the aforementioned contexts pertain to KO or not (Hjørland 2008; Smiraglia 2005). The present study is limited to the realm of KO in a strict sense, centering on information retrieval systems. This focus will be justified later on. Moreover, as affirmed in the paper published in 2004 and cited above, the difficulty of drawing conceptual boundaries for KO and its epistemological weakness or lack of theoretical coherence have been stressed by previous authors (Hjørland 2002).

Very few contributions about KO studies have come to light in recent years. One publication partly regarding the subject has a more limited temporal coverage than our study (Oliveira, Grácio, and Silva 2010), and another is limited to a single source (Alves et al. 2011). Other studies have a similar coverage of time (Moneda, López-Huertas, and Jiménez-Contreras 2011) or consider a longer period of time (Travieso 2011).

1.1. Justification and objectives

Since the aforementioned paper, published in 2004, hardly anyone has conducted research into Spanish contributions to KO, which gives us good reason to undertake a review of the state of the art. Furthermore, this second endeavour complements the perspective traced in 2004 while allowing us to follow the evolution of the field and pinpoint possible changes in the (roughly) two decades analyzed. The time span covered by the present study is nine years instead of the ten covered by the 2004 publication. The reason is that we presented a short paper on this topic at the 10th ISKO-Spain Conference held in Ferrol in 2011.
The conference topic was the evolution of KO in Spain, and it seemed interesting to us to study Spanish research on KO for the period 2002-2010, since the presentation would be in 2011, following up on the 2004 study.

As conceptual limits, in order to produce homogeneous results that would permit comparison of the two periods involved (1992-2001 and 2002-2010), we adopted the same bases as in the previous study. That is, we restricted the concept of KO fundamentally to systems that approach the subject area from a linguistic-conceptual perspective, although other approaches clearly focused on Knowledge Organization are also included. All research into specialized conceptual structures or encyclopaedias was taken into account, regardless of whether the approach was theoretical, methodological, practical, or professional. Therefore, content analysis and indexing per se were not considered. Accordingly, only publications by researchers born in Spain and by naturalized citizens of Spain were included.

2.0 Material and methods

Considering that the specialized area chosen has diversified output, in a number of different formats, our study embraced all of them: monographs, theses, conference papers (national or international), and articles of any length published in the journals indexed by the databases specified below.

2.1 Databases consulted

In general, the same patterns as in the previous study were followed, for the sake of consistency. Nevertheless, some changes were necessary due to the appearance of new databases, such as Dialnet, which in turn led us to disregard the databases of Teseo, Rebiun, and Rueca, given that they refer to the same document types, Ph.D. theses and monographs, and offer a similar degree of coverage in their collections. Therefore, the databases finally consulted were ISI, LISA, Dialnet, and ISOC. In addition, the proceedings of the international ISKO conferences were incorporated manually, as they are not included in the databases consulted; those of ISKO-Spain were included in view of the importance they have in the context of our study.

The search strategies were likewise repeated, as detailed in Tables 1 and 2. This terminological approach to the databases made it necessary to perform several searches so that the combined sum of all would guarantee exhaustive retrieval of our subject area, even though that implied that the results would have some duplication, given that the use of Knowledge Organization as the only term could have missed numerous relevant documents.
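Combining several overlapping searches and then removing the duplicates can be sketched as follows. This is only an illustration in Python, not the procedure the authors ran (they used Procite); the record fields, the matching key (normalized title plus year), and the sample titles are all assumptions made for the example.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation so near-identical titles compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", unaccented.lower()).strip()


def merge_results(*result_sets):
    """Merge several search result lists, keeping the first copy of each record."""
    seen = set()
    merged = []
    for results in result_sets:
        for record in results:
            # Assumed matching key: normalized title plus publication year.
            key = (normalize(record["title"]), record["year"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


# Hypothetical records standing in for the output of two overlapping searches.
search_a = [
    {"title": "Tesauros y ontologías", "year": 2005},
    {"title": "Folksonomies in practice", "year": 2008},
]
search_b = [
    {"title": "Tesauros y ontologias.", "year": 2005},  # same work, unaccented
    {"title": "Taxonomies for the enterprise", "year": 2006},
]

merged = merge_results(search_a, search_b)
print(len(merged), "unique records")
```

A stricter or looser key (for instance, adding the first author) would change which records count as duplicates, which is why a manual review step remains useful.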
The duplicated references were detected and eliminated after loading all the search results into the Procite database.

With respect to the structure of the consulted databases, we should acknowledge the lack of homogeneity and standardization in the formats for data retrieval, and above all the lack of information in fields that are of great importance for bibliometric studies, such as author affiliation, something that was found
to be common in Dialnet, and which led to considerable manual labor afterwards. We should also mention that Dialnet does not offer search syntax, meaning that the retrieval of documents by terms is difficult and time-consuming. Finally, we underline the well-known lack of normalization of author names, above all in international databases (Ruiz, Delgado, and Jiménez 2002), which, along with the all-too-frequent appearance of first names shown by initials, can severely affect analyses concerned with author gender. The bibliographic processor Procite was used to process the data.

2.2 Obtaining and processing data

Thematic searches were carried out for the selection of documents, mainly searches by terms and in some cases by classification codes, depending on the database. The international results were obtained by consulting the databases of ISI and LISA. In the case of ISI, the query was made with the list of terms shown in Table 1. Please note that, in addition to the use of the field topic, where the terms were stored, a further refined search was conducted by place (Spain). No refined search was based on the specialized areas arising from each search in order to obtain a more pertinent retrieval, although this called for subsequent manual filtering to eliminate any irrelevant documents. In this way, we were able to include 42 publications that were not included under the tag of Library and Information Science. The search conducted in the LISA database involved a list of a priori terms shown in Table 1.

The national results were extracted from the ISOC and Dialnet databases. For ISOC, the search strategy was twofold, using the classification codes of the database (which led to a search of low precision and wide scope) and a manual selection of pertinent documents for our study. The codes used are shown in Table 3. Aside from this search, another was carried out with terms to cross-check the results of the previous search. The terms used to retrieve information from ISOC and Dialnet are indicated in Table 2.

The list of terms used in English and in Spanish is basically the same as the one used in the study published in 2004. However, we added new expressions considered necessary given the appearance of new topics or the increasingly generalized use of some terms over the past decade. Such is the case of ontologies, taxonomies, folksonomies, and systems for knowledge organization (see Table 2). The results of these searches were reviewed and filtered in every case to ensure their relevance.

Once the references had been selected, they were exported to a bibliographic processor to process the information obtained. The duplications were eliminated, and authority control was exercised to correct
Table 1. Search terms in ISI and LISA
Table 2. Search terms in ISOC and Dialnet
Table 3. Search codes in ISOC
and normalize the names of authors, which, in many cases, called for consulting alternative sources such as personal webpages. Statistical treatment of the data was trivial and will not be specified here, except to clarify that, in the count of authors associated with institutions, we used fractional counting; that is, the portion of each document attributed to each author's institution was expressed as 1/n, n being the number of authors of the document.

3.0 Results and discussion

The results obtained respond to the following research questions: What are the characteristics of the population of publishing authors? Is there equality in terms of author gender? How much research is actually printed and divulged? How has output evolved over time? Where is Spanish research published, and how many studies have come to light in the period of study here?

Heterogeneity in the identified documents made it necessary to group them into three types: articles, monographs, and dissertations. Each group has its own characteristics, in terms of structure and objectives, as well as in the data identifying the authors. Thus, we first performed a sectorial analysis and then arrived at an analysis of the data set as a whole, which allowed us to reflect the conduct and the dynamics of Spanish research in the field of Knowledge Organization.

The figures obtained were counted after eliminating irrelevant documents or duplications in the databases consulted. These generally presented the aforementioned problems of little visibility of KO researchers in bibliometric studies, which may have to do with inadequate categorization of the subject matters included under the specialty, and the inclusion of specific categories within other more general categories, which makes it difficult to identify them while furthermore producing noise in the retrieval process. The number of documents contributed by each database used in this study is shown in Table 4.
Databases    Documents
ISI                 96
LISA                66
ISOC               145
DIALNET            226
TOTAL              533

Table 4. Documents in the databases consulted
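The relative weight of each database in Table 4 can be recomputed with a few lines of Python. The counts are copied from the table; the function name and the two-decimal rounding are choices made for this sketch only. Note that these are the retrieved documents before the final filter.

```python
# Document counts per database, copied from Table 4.
TABLE_4 = {"ISI": 96, "LISA": 66, "ISOC": 145, "DIALNET": 226}


def database_shares(counts):
    """Return each database's percentage share of the summed counts."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


shares = database_shares(TABLE_4)
for name, pct in shares.items():
    print(f"{name:8s} {TABLE_4[name]:4d} {pct:6.2f}%")
```

The ISI share obtained this way is about 18%, in line with the percentage of ISI-indexed output discussed in the text.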
After the final filter, the number of articles consulted was just 357, a figure slightly below that of the previous period 1992-2001 (399 documents). Yet we must emphasize that, in the latter case, one more year of study was included. Indeed, if we extrapolate the data gathered in this study to a ten-year period, the number of publications would be around 497 hypothetical articles. Hence, we stress the numerical difference observed with respect to the decade 1992-2001, as documented in international databases and, in particular, the ISI, where a great increase in publications indexed in the period 2002-2010 is witnessed, as a total of 497 works were published. This stands in remarkable contrast to the previous decade and its publications indexed in the ISI: the rise in publication reached as much as 18% of total output. Meanwhile, the ISI publications identified in the previous decade represented only 4.2% of the entire set of documents. In LISA, there is also an increase, but it is not as surprising as the case just described. This finding has very interesting implications, as it suggests that Spanish research in KO has greater impact internationally than at the national level.

3.1. Quantification of author output

Because the collection of documents obtained was irregular from the documental standpoint, as commented earlier, we present the results in three groups: authors of articles and presentations, authors of monographs, and authors of Ph.D. theses.

3.1.1 Authors of articles and their output
Under this heading, we describe both the articles published in journals and those printed in the proceedings of national or international conferences and congresses. We identified 489 authors, who produced 298 papers (179 journal articles plus 119 conference communications). A summary of the most productive ones is offered in Table 5.

Authors                            Number of papers published
García Marco, F. J.                11
López-Huertas, M.J.                10
Moreiro González, José A.          10
Sorli Rojo, A.                      7
López Alonso, M.A.                  6
Morato Lara, Jorge                  6
San Segundo, R.                     6
Sánchez Cuadrado, S.                6
Sicilia Urban, M.A.                 6
Ureña López, L.A.                   6
Eito Brun, R.                       5
Granados, M.                        5
Montejo Raez, A.                    5
Sánchez Jiménez, R.                 5
Caldera Serrano, J.                 4
Caro Castro, C.                     4
García Barriocanal, E.              4
Llorens Morillo, J.B.               4
Martínez Méndez, F.J.               4
Pastor Sánchez, J.A.                4
Pérez Agüera, J.R.                  4
Rodriguez Bravo, B.                 4
Sánchez Alonso, S.                  4
Authors with 3 publications        12
Authors with 2 publications        59
Authors with 1 publication        395

Table 5. Publications in journals and proceedings of conferences, by author

Accordingly, 23 of the 489 authors may be considered productive in the development and diffusion of Knowledge Organization. They represent 4.7% of total authors. The output of this particular group is 130 articles, which stands as 43% of overall KO publication. As in the model put forth by A. J. Lotka, and as corroborated in the previous decade studied, a small percentage of authors does in fact produce a high percentage of publications, in this case 43%.

If we compare these results with those of the previous decade, a period for which 201 authors were identified, we find that, between 2002 and 2010, the number of authors increased to 395. However, the production in this decade is not greater than in the previous one, during which a total of 330 articles came to press. That is, the number of productive authors is on the rise, but productivity per se is not, showing a somewhat disappointing harvest of 298 articles. Hence, we must conclude that the increase in author ranks is related to the number of undersigning authors: 32.5% of the documents analyzed were signed by three or more authors, and 15% were co-authored by four to six researchers. Table 6 reflects these figures, taking all the document types into account.

Table 6. Number of authors per work published
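The concentration figures quoted for Table 5 (23 authors, 4.7% of all authors, 130 papers, 43% of output) can be checked with a short Python sketch. The numbers are copied from the table and from Section 3.1.1; the variable and function names are hypothetical, and the calculation simply follows the article's own arithmetic, which sums papers per author without correcting for co-authored papers.

```python
# Papers per named author, copied from Table 5 (the 23 most productive authors).
TOP_AUTHOR_PAPERS = [11, 10, 10, 7] + [6] * 6 + [5] * 4 + [4] * 9

# Number of remaining authors by publication count, also from Table 5.
OTHER_AUTHORS = {3: 12, 2: 59, 1: 395}

# Journal articles plus conference communications (Section 3.1.1).
TOTAL_PAPERS = 298


def concentration(top_papers, other_authors, total_papers):
    """Summarize how much of the output comes from the most productive authors."""
    n_top = len(top_papers)
    n_authors = n_top + sum(other_authors.values())
    return {
        "top_authors": n_top,
        "all_authors": n_authors,
        "author_share_pct": round(100 * n_top / n_authors, 1),
        "top_papers": sum(top_papers),
        "paper_share_pct": round(100 * sum(top_papers) / total_papers, 1),
    }


summary = concentration(TOP_AUTHOR_PAPERS, OTHER_AUTHORS, TOTAL_PAPERS)
for key, value in summary.items():
    print(f"{key}: {value}")
```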
It is important to point out that many of the journals chosen by the cited authors to publish their research are not LIS journals. These journals comprise 42.65% of the total titles and account for 26.78% of the articles. Their main areas of knowledge are informatics (11.48% of the articles) and economy and enterprises (7.10% of the articles). Journals devoted to health sciences, psychology, translation, etc. follow with less representation. Also remarkable is the lack of collaboration between LIS and non-LIS areas in these publications, where almost none of the authors belong to the LIS area of knowledge. Considering the articles published in LIS journals, authors coming from areas outside LIS represent only 11%. Table 7 shows the number and percentage of articles published in LIS and non-LIS journals.
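The LIS and non-LIS shares quoted above can be recomputed from the counts in Table 7. The following Python sketch copies the article and journal counts from that table; the helper name `pct` and the data layout are choices made for the illustration.

```python
# (articles, journals) per knowledge area, copied from Table 7.
TABLE_7 = {
    "LIS": (134, 39),
    "Informatics": (21, 6),
    "Economy-Enterprise": (13, 10),
    "Health Sciences": (5, 5),
    "Translation": (4, 2),
    "Social sciences": (3, 3),
    "Psychology": (1, 1),
    "Architecture": (1, 1),
    "Museums": (1, 1),
}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


total_articles = sum(articles for articles, _ in TABLE_7.values())
total_journals = sum(journals for _, journals in TABLE_7.values())
non_lis_articles = total_articles - TABLE_7["LIS"][0]
non_lis_journals = total_journals - TABLE_7["LIS"][1]

print("Non-LIS articles:", non_lis_articles, pct(non_lis_articles, total_articles))
print("Non-LIS journals:", non_lis_journals, pct(non_lis_journals, total_journals))
print("Informatics articles:", pct(TABLE_7["Informatics"][0], total_articles))
print("Economy-Enterprise articles:", pct(TABLE_7["Economy-Enterprise"][0], total_articles))
```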
Knowledge Areas        Articles   Journals
LIS                         134         39
Informatics                  21          6
Economy-Enterprise           13         10
Health Sciences               5          5
Translation                   4          2
Social sciences               3          3
Psychology                    1          1
Architecture                  1          1
Museums                       1          1
TOTAL                       183         68
LIS                         134         39
Non LIS                      49         29
% of Non LIS              26.78      42.65
% of LIS                  73.22      57.35

Table 7. Knowledge areas of journals of selected publications

3.1.2 Authors of monographs and their output

In this group, we look at complete monographs (13) and book chapters (23), giving a total of 36 publications. This collection amounts to 8.4% of all the works referenced. They were signed by 43 authors, who represent just 8.26% of all authors identified for all the document types published in the period 2002-2010. Here, unlike the case of articles, co-authorship is very low: at most, a work has three authors. In terms of productivity, we again see that a small number of authors (15) produce 41.5% of all the monographs. In contrast to the period 1992-2001, here the collaborative authors are few, and therefore were not analyzed separately. Results are given in Table 8.

Authors                            No. of publications
Gil Urdiaciain, Blanca             3
López-Huertas, María J.            3
Moreiro González, José A.          3
Agustín Lacruz, M. del Carmen      2
Caro Castro, Carmen                2
Torres Ramírez, Isabel             2
38 authors                         1

Table 8. Authors of monographic works and their productivity

Comparison of these results with those from the previous decade makes evident a sharp decline in publication. In that period, there were 141 authors who produced 278 monographic works. One possible explanation is the fact that the institutions or organizations that undertook publishing tasks in the past, largely involving thesauri or subject headings, have since become less active in this area.

3.1.3 Global analysis of authors of articles and monographs

Finally, we prepared a joint list of all the most productive authors of articles or monographs, so as to derive an integral notion of the group dynamics and assess productivity overall. The results are shown in Table 9. In this context, we should point out that 12 of the 25 authors from the table for 1992-2001 are still active a decade later. On the other hand, 2002-2010 is witness to 35 new authors who, due to their low productivity in most cases, are not referenced by name in Table 9.

3.1.4 Authors of PhD theses and their output

The number of Ph.D. dissertations published comes to 23, a higher figure than the 15 of the previous period. Thus, we can speak of a moderately heightened activity if we moreover bear in mind that the second period of analysis is one year shorter than the first. Ph.D. theses were generated in eleven Spanish universities, listed in order of importance: Universidad de Valencia, Universidad Politécnica de Valencia, Universidad Carlos III of Madrid, Universidad de Alcalá de Henares, Universidad de Murcia, Universidad Complutense de Madrid, Universidad de la Coruña, Universidad de Granada, Universidad de Málaga, Oberta de Cataluña, and the universities of León and Salamanca.
Authors                              No. of publications
Moreiro Gonzalez, Jose Antonio       14
Lopez-Huertas, Maria Jose            13
Garcia Marco, Francisco Javier       12
Garcia Jimenez, Antonio               8
Mochon Bezares, Jose Angel            8
Morato Lara, Jorge                    8
Sorli Rojo, Angela                    8
Lopez Alonso, Miguel Angel            7
San Segundo, Rosa                     7
Sanchez Cuadrado, Sonia               7
Ureña Lopez, Luis Alfonso             7
Caro Castro, Carmen                   6
Sicilia Urban, Miguel Angeles         6
Agustin Lacruz, Maria del Carmen      5
Eito Brun, Ricardo                    5
Granados, Mariangels                  5
Montejo Raez, Arturo                  5
Pastor Sanchez, Juan Antonio          5
Sanchez Jimeno, Rodrigo               5
Vicedo, Jose Luis                     5
Authors with 4 works                 12
Authors with 3 works                 15
Authors with 2 works                 67
Authors with 1 work                 422

Table 9. Most productive authors of articles and monographs

3.1.5 Most cited authors

Along the lines of the previous methodology (López-Huertas and Jiménez-Contreras 2004), we located citations of the works recorded in the ISI. Of the 96 publications identified, 41 were cited. Thus, we can say that the international visibility is greater, as we are speaking of 96 ISI papers as opposed to 17 in the previous decade. Similarly, the repercussion as measured by the number of citations, in absolute terms, was also greater than in the previous period, since the total of citations received was 135 versus seven in the previous decade.

One explanation of the increased international visibility of Spanish research could be the fact that ISI has introduced conference proceedings in its database and, especially, the inclusion of two Spanish journals: El Profesional de la Información, which published 19 of the selected ISI articles, which received 12 citations, and the Revista Española de Documentación Científica, with nine works selected and three citations thereof. The existence of these two journals might make the publication process easier for authors. An external cause could also be responsible for the increase: the Spanish Agency of Evaluation for Universities increasingly considers international publications a must for promotion. All of these factors, taken together, may explain this phenomenon.

Altogether, the number of citations received is distributed as shown in Table 10. We should underline that 60% of the citations (81) are concentrated in five papers; it is also noteworthy that the most productive authors in the area of KO are not precisely the ones showing a greater number of citations of their work. Regarding the geographic origin of the citing authors (Table 11), a considerable degree of diversification is seen, though the order of the first two positions is maintained with respect to the previous period. Logically, the top spot is occupied by Spain,
Works (columns: citations received / no auto-citation / identified works)

1. Moya, F. et al. A new technique for building maps of large scientific domains based on the cocitation of classes and categories: 39 / 19 / 4
1. García-Berrocal, E. et al. Usability evaluation of ontology editors: 15 / 14 / 2
1. Díaz, I. et al. A specification pattern for use cases: 11 / 7 / 1
1. Zazo, N.F. et al. Reformulation of queries using similarity thesauri: 10 / 8 / 3
1. Sánchez-Alonso, S. et al. Making use of upper ontologies to foster interoperability between SKOS concept schemes: 6 / 4 / 4
1. Guerrero, V.P. Automatic extraction of relationships between terms by means of Kohonen's algorithm: 4 / 4 / 2

                Citations received
6 works           3
5 works           2
24 works          1
Total           135

Table 10. Distribution of citations received
Origin of the First Author of the Citing Works    No. of publications
Spain                                              61
USA                                                23
Brazil                                              6
UK                                                  5
Canada                                              4
China                                               4
South Korea                                         4
Germany                                             3
Mexico                                              3
Argentina                                           2
Australia                                           2
Belgium                                             2
Cuba                                                2
France                                              2
Italy                                               2
Taiwan                                              2
Colombia                                            1
Croatia                                             1
Ecuador                                             1
Slovenia                                            1
Finland                                             1
Holland                                             1
Jordan                                              1
Poland                                              1
Total                                             135

Table 11. Origin of the citations received

INSTITUTIONS                                                          % OF OUTPUT
UNIVERSITY                                                            80.10
SERVICES (administration, archives, non-university libraries)          7.10
CSIC (Consejo Superior de Investigaciones Científicas)                 2.96
PUBLIC AND PRIVATE ENTERPRISES                                         2.84
FOUNDATIONS                                                            2.38
HOSPITALS                                                              0.49
NOT LOCATED                                                            4.00
Total                                                                 99.87

Table 12. Most productive institutions
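The country and regional shares reported for Table 11 can be reproduced with the following Python sketch. The counts are copied from the table. The assignment of countries to geopolitical areas is an assumption of this example (the article does not list it), chosen because it yields the regional percentages given in the text.

```python
# Citing works by country of the first author, copied from Table 11.
CITING_WORKS = {
    "Spain": 61, "USA": 23, "Brazil": 6, "UK": 5, "Canada": 4, "China": 4,
    "South Korea": 4, "Germany": 3, "Mexico": 3, "Argentina": 2,
    "Australia": 2, "Belgium": 2, "Cuba": 2, "France": 2, "Italy": 2,
    "Taiwan": 2, "Colombia": 1, "Croatia": 1, "Ecuador": 1, "Slovenia": 1,
    "Finland": 1, "Holland": 1, "Jordan": 1, "Poland": 1,
}

# Assumed grouping into geopolitical areas (not stated in the article).
REGIONS = {
    "Europe": ["Spain", "UK", "Germany", "Belgium", "France", "Italy",
               "Croatia", "Slovenia", "Finland", "Holland", "Poland"],
    "North America": ["USA", "Canada"],
    "Iberoamerica": ["Brazil", "Mexico", "Argentina", "Cuba", "Colombia", "Ecuador"],
    "Far East": ["China", "South Korea", "Taiwan"],
    "Other": ["Australia", "Jordan"],
}


def region_shares(works, regions):
    """Percentage of citing works per region, rounded to two decimals."""
    total = sum(works.values())
    return {
        region: round(100 * sum(works[c] for c in countries) / total, 2)
        for region, countries in regions.items()
    }


total_citing = sum(CITING_WORKS.values())
by_region = region_shares(CITING_WORKS, REGIONS)
spain_share = round(100 * CITING_WORKS["Spain"] / total_citing, 2)
usa_share = round(100 * CITING_WORKS["USA"] / total_citing, 2)

print(total_citing, spain_share, usa_share)
print(by_region)
```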
with 45.19% of the citing works; in second place, we find the USA with 17.04% of the citing works. Grouping the citations by geopolitical areas shows Europe to be the first citing region (59.26% of works), followed by North America (20%), Iberoamerica (11%), the Far East (7.41%), and other countries, with a representation of 2.22%.

3.2 Institutional affiliation of authors

The spatial and institutional distribution of the authors, as summed up in Table 12, reflects the spatial and corporate geography of Spanish research in Knowledge Organization. It is evident that the vast majority of authors are affiliated with Spanish universities (80.10%). This collective is followed by authors who work in non-university archives and libraries. Further behind stand the CSIC (Spain's Scientific Research Council), foundations, and hospitals. The trend for universities to generate more publications appears as a constant: in the previous decade of study, they accounted for 80% of output as well. Table 13 sums up these results.

The degree of productivity of the universities reflects some changes with respect to the period 1992-2001, when the ranking was: Zaragoza, Carlos III, Murcia, Granada, Salamanca, Sevilla, Autónoma de Madrid, Valencia, and Barcelona. At any rate, we find that only five universities (Zaragoza, Carlos III, Murcia, Granada and Salamanca) were among the most productive in both periods.

In the ranking of the most productive universities (2002-2010), we have to point out the entrance of the Universities of Jaén and Alicante, which do not have LIS in their curricula. The authors of the selected papers work in the Department of Informatics in the former case and the Department of Languages and Informatic Systems in the latter. In both cases, the articles written by them do not include an author from LIS departments. This is another reason, on top of the journals where they publish, to consider the interest in knowledge organization from other specialties, although their productivity in KO is much lower. In the same line, we could count more than 15 articles from authors belonging to Departments of Economy and Management of Enterprises. We also found authors belonging to the Departments of Informatics, Health Sciences, Translation, Psychology, or Architecture, in which production is lower than 10 papers. Nevertheless, this evidences the interest in KO from other specialties.

Likewise noteworthy is the presence of non-university entities, which generated 15.77% of total output. At the same time, we see some diversification
of the institutions involved, especially libraries and archives, but also film archives and press documentation centers. Moreover, there are centers that did not appear in the previous period, including diverse enterprises, foundations, and hospitals, responsible for 31 publications. The broadening area of interest in knowledge organization suggests greater social sensitivity regarding the benefits that it may hold for private companies or institutions, especially businesses and hospitals, together with its use in informatics, social research, museums, etc. As stated above, these facts might mean recognition of the usefulness of knowledge organization in contexts not only linked to information retrieval.

3.3 Evolution of output over time

In general, output is seen to be more or less stable over the nine years studied here (Figure 1), ranging from the 29 publications of 2010 to as many as 53
SOURCE Universidad Carlos III de Madrid Universidad de Granada Universidad de Zaragoza Universidad Complutense de Madrid Universidad de Extremadura Universidad de Murcia Universidad de Salamanca Universidad de Jan Universidad de Alcal de Henares Universidad de Alicante Other Universities (39)
in 2007. There are peaks of greater production in the years 2002, 2003, and 2007, coinciding with the publication of the ISKO Proceedings. These congresses also show the greatest volume of activity in the 1990s. We can therefore speak of a definite impact of ISKO events on the volume of Spains scientific output in the area of knowledge organization. 3.4 Analysis of scientific output by gender Without a doubt, approaching this type of analysis is of general interest, but it gains extra interest in a specialized field where women are present at all levels, whether as professionals, students, or teachers. We hoped to determine whether this reality was reflected in the scientific output. Yet it was impossible to determine the first name of some authors (and therefore their gender), since only first initials were used in some records. As an average value, we found that, for 41% of the studied publications, at least half of
No. of contributions 40.83 25.94 20.75 20 14.33 14.25 12.25 10 9.91 9.40 108.26 % Contributions 11.44 7.27 5.81 5.60 4.01 3.99 3.43 2.80 2.78 2.63 30.33
Table 13. Most productive universities 2002-2012
Figure 1. Output by year
37
Table 14. Distribution of authors by gender
the undersigning authors were women. This trend depends on whether there was one author or more. Among the published works with just one author, women represent 42% of the total, but as the number of authors increases, so does the proportion of female authorship. Overall, we found that the more the coauthors, the higher the participation of women. At any rate, however, the differences were slight and do not point to any significant gender gap. The following table displays the participation of women in authorship. 3.5 Distribution of output by subject In order to carry out this part of the study, the content of each one of the published works was analyzed. We observed considerable thematic variety in the collection of publications, particularly in this second period. Furthermore, we found other topics that were not present in the previous period, which generated new terminology and the need for a certain internal restructuring of the subject matter. Although we attempted to maintain the thematic groups used previously, at times it was necessary to introduce changes due to the evolution of the area. In the first place, very general groups were drawn to provide a clearer view of the contents of the publications analyzed. Figure 2 shows the general subject areas and their percentage-wise distribution.
Figure 2. Display of percentage of general themes

It is interesting to note that the terminology used to represent the contents of the output from 2002-2010, if compared with that of the previous decade, shows only one coincidence: Knowledge Organization, which here represents 11% of all output, stood for 13% in the previous period. Over the period 2002-2010, there were a number of terminological and conceptual changes, and only one of the groups of the previous period is still present, namely Knowledge Organization Systems. It is an expression rooted in the specialized area studied, and it has largely come to denote what was once referred to as Documental Languages. For this reason, documental languages have been included within Knowledge Organization Systems together with all the specific types of systems (Classifications, Subject Headings, Thesauri, Taxonomies, Ontologies, etc.), making it the most important group of the set, with 56% of the output. The rest of the subjects that appear in the figure are novel, which we believe can be attributed to the fact that research into knowledge organization has become increasingly specialized and is now more focused on searches, with a presence of 4%, or retrieval, with 5%. Also new is Knowledge Representation, with 7% of the total, which includes the study of any linguistic, conceptual, or algorithmic method used to represent the contents of documents in information systems. Knowledge Processing likewise appears for the first time in the realm of study, with 5% of output, but is oriented more toward managerial knowledge (that is, knowledge for companies and organizations), while still having strong connections with knowledge organization in a strict sense. For example, there are works describing the need to organize and prioritize this knowledge in order to process it adequately later, even in decision-making processes. Some authors focus on the study of knowledge in general, this minor group representing just 2% of output. Web Systems arise as a new subject area of interest in this second period, with a presence of 10%. The contents represented with this tag concern knowledge organization on the web, portals, social networks (Facebook, etc.), folksonomies, etc. It is interesting to examine the internal composition of the most representative group, Knowledge Organization Systems, and to compare it, in turn, with the results of the period 1992-2001, as shown in the tables below.
| Topic | No. of documents |
|---|---|
| Classification | 76 |
| Subject Headings | 47 |
| Documental Languages | 5 |
| Thesauri | 236 |
| Total documents | 364 |

Table 15. Representation of the subject areas of the Documents 1992-2001

| Topic | No. of documents |
|---|---|
| Classification | 56 |
| Subject Headings | 8 |
| Documental Languages | 12 |
| Thesauri | 63 |
| Conceptual Maps | 7 |
| Ontologies | 48 |
| Taxonomies | 8 |
| Systems for Knowledge Organization | 47 |
| Total documents | 249 |

Table 16. Representation of the topics in no. of documents 2002-2010

Comparison of Tables 15 and 16 makes evident that the total number of documents in this group was greater in the previous period than in 2002-2010; interest in these subjects therefore appears to be declining. We also observe a diversification of subject matter, with the number of topics doubling in the second period. Except for documental languages, which increase their presence, the topics carried over from the previous period decrease, especially thesauri, which suffer a dramatic drop from 236 to 63, and subject headings, which go from 47 to 8. The migration of researchers' interest toward new systems and new subject areas would appear to explain the present situation. Among the latter, a growing interest is seen in ontologies, which are 15% of the group.
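The comparison between Tables 15 and 16 can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are copied from the two tables, while the variable names and the helper `change` are hypothetical names introduced here and are not part of the original study.

```python
# Document counts per topic, copied from Table 15 (1992-2001) and Table 16 (2002-2010).
# All names below are illustrative assumptions, not taken from the study itself.
period_1992_2001 = {
    "Classification": 76,
    "Subject Headings": 47,
    "Documental Languages": 5,
    "Thesauri": 236,
}
period_2002_2010 = {
    "Classification": 56,
    "Subject Headings": 8,
    "Documental Languages": 12,
    "Thesauri": 63,
    "Conceptual Maps": 7,
    "Ontologies": 48,
    "Taxonomies": 8,
    "Systems for Knowledge Organization": 47,
}


def change(before, after):
    """Absolute change in document counts for topics present in both periods."""
    return {topic: after[topic] - before[topic] for topic in before if topic in after}


total_before = sum(period_1992_2001.values())
total_after = sum(period_2002_2010.values())
deltas = change(period_1992_2001, period_2002_2010)

print(f"Total documents: {total_before} -> {total_after}")
print(f"Number of topics: {len(period_1992_2001)} -> {len(period_2002_2010)}")
for topic, delta in deltas.items():
    print(f"{topic}: {delta:+d}")
```

Under these assumptions, the sketch reproduces the totals given in the tables and shows that, of the four topics shared by both periods, only documental languages grow, while the number of topics doubles from four to eight.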
Figure 3 below displays the percentages of the group Knowledge Organization Systems. It reflects an evident interest in the study of documental languages, approached from a general perspective, at 24%, and in what are now called Knowledge Organization Systems, with 15%, which together give us 39% of the total. Looking more closely at the groups deserving mention, thesauri, with 20%, mostly concern thesauri of different specialized areas, while the rest of the documents look into norms, theory, the state of the art, and methodology for constructing them. Classification comprises, in descending order, specialized classification, with 25 documents; theory and general aspects of classification, with 18 documents; and bibliographic classifications, with 9 documents, 7 of them corresponding to the UDC. Finally, ontologies, with 15%, are mostly ontologies built for specific subject areas, whereas the rest deal with aspects related to construction theory and methodology. The comparative evolution of the subjects configuring the group Knowledge Organization Systems in the two periods (1992-2001 and 2002-2010) can be viewed in Figure 4.
Figure 3. Percentage-wise distribution of the group Knowledge Organization Systems
Figure 4. Comparative evolution of Knowledge Organization Systems (1992-2001 and 2002-2010)
4.0 Conclusions

In view of the results obtained and described here, we may affirm that research into knowledge organization is well consolidated in Spain, and indeed shows growth and development with respect to the previous decade. The ISKO conferences, in particular the national ones but also the international ones, clearly contribute to the increase in Spain's research output. Aside from a slight increase in the number of publications by some of the most prolific Spanish authors from the previous decade, we detected a fresh influx of newcomers: only 12 authors from the period 1992-2001 are also active in the period 2002-2010. Deserving mention is the appearance of 35 novel authors in this field of study, although their output is limited: José Antonio Moreiro co-signed 14 articles, María J. López-Huertas co-authored 13, and Javier García Marco produced or co-produced 12. Interest in knowledge organization in areas beyond library and information science is evident from the presence of papers coming from specialties other than LIS, especially informatics and economics and business. This interest is no doubt partly responsible for the appearance of new authors on the list of the most productive researchers, such as Ureña and Montejo Vicedo or Sánchez Alonso from the Informatics Department. Likewise, our analysis allows us to confirm that this field of study is increasingly interdisciplinary. Despite a modest overall increase in output during the period 2002-2010, there is a manifest drop in monographic publications. This points to a change in perspective on the part of researchers; we believe that they now tend toward the social sciences as the realm of dissemination of research findings. There were nearly 50% more Ph.D. dissertations in the second period of study, indicating a greater degree of interest in knowledge organization on the part of students enrolled in LIS studies.
According to the ISI, there was a noteworthy increase in knowledge organization studies stemming from Spain. We highlight this finding as a sign of heightened quality in research output and greater international visibility of Spanish research efforts in the area of knowledge organization. It also suggests a change in publishing habits, perhaps due to Spain's overall scientific policy and decision-making procedures. Such development translates into increased citation of Spanish authors, as recorded by the ISI database.
Results suggest that the gender gap has receded. Women were at least half of the co-authors of 41.07% of the papers produced by Spanish research institutions. Nevertheless, the fact that there are more women researchers active at present sheds some essential light on these data. Indeed, we found that the greater the number of undersigning authors, the greater the proportion of female authors. Overall, a considerable change is seen in the arena of knowledge organization output from Spanish institutions. Five topics are seen to emerge with vigor: knowledge representation, information search, information retrieval, web systems, and knowledge management. Deserving special mention is the group we denote as knowledge organization systems (KOS), which incorporates documentary languages. Its internal composition reveals that specialized areas such as ontologies, conceptual maps, and taxonomies are gaining research interest. The new area known as folksonomies, generally included under web systems, is also a topic of growing interest. In short, the growth in output documented here reflects conceptual advances in knowledge organization on the part of Spanish researchers on the whole.

References

Alves, Bruno Henrique, Gracio, Maria Cláudia Cabrini and Oliveira, Ely Francina Tannuri. 2011. A produção científica da revista Scire: uma análise bibliométrica do período 2006/2010. Presented at I Congresso Brasileiro de Organização e Representação do Conhecimento (ISKO-Brasil), Brasília.

Hjørland, Birger. 2008. What is knowledge organization (KO)? Knowledge organization 35: 86-101.

Hjørland, Birger. 2002. The methodology of constructing classification schemes: A discussion of the state of the art. In: López-Huertas, María ed., Challenges in knowledge representation and organization for the 21st century. Integration of knowledge across boundaries. Proceedings of the 7th International ISKO Conference, 10-13 July 2001, Granada, Spain. Würzburg: Ergon Verlag, pp. 450-56.
López-Huertas, María José and Jiménez Contreras, Evaristo. 2004. Spanish research in knowledge organization (1992-2001). Knowledge organization 31: 136-50.

Moneda Corrochano, Mercedes de la, López-Huertas, María José and Jiménez Contreras, Evaristo. 2011. La investigación sobre organización del conocimiento en España (2002-2010). In: Resúmenes X Congreso Capítulo Español de ISKO, Ferrol, 30 junio - 1 julio, p. 74.
Oliveira, Ely Francina Tannuri; Gracio, Maria Claudia; Cabrini Silva, Ana Claudia. 2010. Investigadores de mayor visibilidad en organización y representación del conocimiento: un estudio desde el análisis de cocitaciones. Scire 16n2: 39-45.

Ruiz Pérez, Rafael, Delgado López-Cózar, Emilio and Jiménez-Contreras, Evaristo. 2002. Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies. Journal of the Medical Library Association 90: 411-30.

Smiraglia, Richard. 2005. About knowledge organization: An editorial. Knowledge organization 32: 139-40.

Travieso-Rodríguez, Críspulo, Lascurain-Sánchez, María Luisa, Sal-Agero, Alberto and Sanz-Casado, Elías. 2011. La organización del conocimiento en España a partir del análisis bibliométrico de los Congresos ISKO-Capítulo Español. In: Resúmenes X Congreso Capítulo Español de ISKO, Ferrol, 30 junio - 1 julio, p. 73.
Knowl. Org. 40(2013)No.1 J. T. Tennis. Ethos and Ideology of Knowledge Organization: Toward Precepts for an Engaged Knowledge Organization
Ethos and Ideology of Knowledge Organization: Toward Precepts for an Engaged Knowledge Organization
Joseph T. Tennis
The Information School of the University of Washington, Box 352840, Mary Gates Hall, Ste 370, Seattle, WA, United States 98195-2840, <jtennis@uw.edu>
Joseph T. Tennis is an Assistant Professor at the Information School of the University of Washington and an Associate Member of the Peter Wall Institute for Advanced Study at The University of British Columbia. He has been an occasional visiting scholar at the State University of São Paulo since 2009. He is Reviews Editor for Knowledge Organization, Managing Editor for Advances in Classification Research Online, and on the editorial board for Library Quarterly and Scire. He holds a Ph.D. in Information Science from the University of Washington. He works in classification theory, scheme versioning, and comparative studies of metadata.

Tennis, Joseph T. Ethos and Ideology of Knowledge Organization: Toward Precepts for an Engaged Knowledge Organization. Knowledge Organization. 40(1), 42-49. 10 references.

ABSTRACT: This paper provides rationale for considering precepts for an engaged knowledge organization based on a Buddhist conception of intentional action. Casting knowledge organization work as craft, this paper employs Žižek's conception of violence in language as a call to action. The paper closes with a listing of precepts for an engaged knowledge organization.

Received 6 June 2012; Revised 30 September 2012; Accepted 1 October 2012
1.0 Ethos and ideology

Ethos is the spirit that motivates ideas and practices. When we talk casually about the ethos of a town, state, or country, we are describing the fundamental or at least underlying rationale for action, as we see it. Ideology is a way of looking at things. It is the set of ideas that constitute one's goals, expectations, and actions. In this brief essay, I want to create a space where we might talk about the ethos and ideology in knowledge organization from a particular point of view, combining ideas and inspiration from the Arts and Crafts movement of the early twentieth century, critical theory in extant knowledge organization work, the work of Slavoj Žižek, and the work of Thich Nhat Hanh on Engaged Buddhism. I will expand more below, but we can say here and now that there are many open questions about ethos and ideology in and of knowledge organization, both its practice and products. Many of them in classification, positioned as they are around identity politics of race, gender, and other marginalized groups, ask the classificationist to be mindful of the choice of terms and relationships between terms. From this work, we understand that race and gender require special consideration, which manifests as a particular concern for the form of representation inside extant schemes. Even with these advances in our understanding, there are still other categories about which we must make decisions and take action. For example, there are ethical decisions about fiduciary resource allocation, political decisions about standards adoption, and even broader zeitgeist considerations like the question of Fordist conceptions (Day 2001; Tennis 2006) of the mechanics of description and representation present in much of today's practice.
Just as taking action in a particular way is an ethical concern, so too is avoiding a lack of action. Scholars in knowledge organization have also looked at the absence of what we might call right action in the context of cataloguing and classification. This leads to some of the problems noted above and hints at the larger ethical concern of watching a subtle semantic violence go on without intervention (Bowker and Star 1999; Bade 2006). The problem is not whether to act or not act, but how to act or not act in an ethical way, or at least with ethical considerations. The action advocated by an ethical consideration for knowledge organization is an engaged one, and it is here where we can take a nod from contemporary ethical theory advanced by Engaged Buddhism. In this context, we can see the manifestation of fourteen precepts that guide ethical action and warn against lack of action. This paper pulls together four distinct lines of thought and brings each in its part to bear on the issue of intentionality in knowledge organization. In what follows, I will make an argument 1) for knowledge organization work as craft that uses words that are potentially instruments of violence, as conceptualized both by 2) critical theory in KO research and by 3) Žižek, and 4) that a framework for intentional action guided by a Buddhist ethical stance can serve as one amelioration to this violence, while doing justice to the ontology of knowledge organization work as craft. Each of these arguments is introduced in a separate section below. Each section ends with a formal assertion about knowledge organization work. I then use these assertions to talk about precepts for an engaged knowledge organization. In the following section, I will make some assertions about work and language in relation to the ethos and ideology present in the practices of indexing.
I will start with conceptions of craftwork and move to a hermeneutics of suspicion (seeing a different perspective than many) manifest in the analyses of Slavoj Žižek. The result is a recasting of intention in indexing based on this composite frame of viewing work and the raw material of our indexing work, that is, language.

2.0 The Arts and Crafts Movement and assertion number one

William Morris responded to the advances of the industrial revolution by returning to nature and to history. His work surfaced in the milieu of the Arts and Crafts Movement in Britain (1850-1900). This movement was a response to the Industrial Revolution. The development of the steam engine by James Watt in 1765 led to the mechanization of industry, agriculture, and transportation and changed the life of the workingman in Britain. Industrialization left people with a sense that their lives had changed for the worse. Many had sacrificed a rural lifestyle 'in England's green and pleasant land' for the sake of a job in the 'dark Satanic mills' of the Industrial Revolution. As a result, they lost that feeling of security and belonging which comes from living in smaller communities. The members of the Arts and Crafts Movement included artists, architects, designers, craftsmen, and writers. They feared that industrialization was destroying the environment in which traditional skills and crafts could prosper, as machine production had taken the pride, skill, and design out of the quality of goods being manufactured. They were convinced that the general decline of artistic standards brought on by industrialization was linked to the nation's social and moral decline.

Knowledge organization as craft is implicit in many of the discussions of cataloguing, classification, and bibliography. For example, bibliography, the systematic description and enumeration of books, is, in my estimation, a craft. It is skilled work done with handheld tools. These tools, the extant catalogues and classification schemes, require specialized knowledge, often limited to those who have apprenticed under another skilled bibliographer, or what we commonly call a cataloguer. Their work combines learned practices, methods of analysis, and the creation of representations, which result in a bibliographic description. Bibliographic description, as an artifact, is born of this labor and, when well done, is often described as beautiful by peer bibliographers.
We can contrast this view of bibliography or knowledge organization with other work practices that do not honor skilled labor or apprenticeship and that lack aesthetic interest. Some consider bibliography an act that requires no skill, that can be done by anyone, and that is only functional, not a product of labor by a trained and skilled craftsperson. In recent developments in the HathiTrust project, we have seen how this latter work ethic has resulted in low-quality metadata and a need for more skilled hands to create more robust and viable representations of the now millions of digitized volumes (York and Downie, 2012). This latter view might be likened to the industrial view of labor, because as a work practice it erases the value of metadata created by craftspeople. This leads us to our first assertion.
Assertion One: We can see this historical conflict as a metaphor for conflicting stances on the work done in knowledge organization today. We have artisanal work in our knowledge organization systems. Work in knowledge organization does not have to be industrialized.

3.0 Assertion number two: critical theory, knowledge organization research, and right action

As it stands, there are many open questions about ethics in knowledge organization, its practice and products. Many of them, relevant to knowledge organization but cast as classification research and positioned as they are around identity politics of race, gender, and other marginalized groups, ask the classificationist to be mindful of the choice of terms and relationships between terms. To highlight these concerns, scholars have invoked feminist philosophy of the limit (Olson 2002), queer theory (Campbell 2001), and critical race theory (Furner 2007). From this work, we understand that race and gender require special consideration, which manifests as a particular concern for the form of representation inside extant schemes and indexing languages. Even with these advances in our understanding, there are still other categories about which we must make decisions and take action. For example, there are ethical decisions about fiduciary resource allocation, political decisions about standards adoption, and even broader zeitgeist considerations, like the question of Fordist conceptions of the mechanics of description and representation present in much of today's practice versus a more Morris-esque Arts and Crafts version of the same (Day 2001; Tennis 2006). Just as taking action in a particular way is an ethical concern (for example, assigning suspect indexing terms to documents), so too is avoiding a lack of action. Scholars in knowledge organization have also looked at the absence of what we might call right action in the context of cataloguing and classification.
This leads to some problems related to identity (mentioned above) and hints at the larger ethical concerns, namely watching a subtle semantic violence persist in our systems without intervention (Bowker and Star 1999; Bade 2006). What Bowker and Star discuss in their 1999 work is the accretion of compromises in systems of representation, and how a lack of control over the design context and timeline can lead, inadvertently, to structures that do not benefit and, in fact, might be seen to hurt. Bade, for his part, discusses how it is possible to adhere to standard practice of representation in cataloguing and data entry into online bibliographic utilities, but still expend resources in the form of time and electricity to no helpful end. That is, we can maintain nonsense in online catalogues (which contain indexing work as well as cataloguing work) with impunity because we followed a standard practice. Thus, we must fully understand what kind of action we take and how such action might or might not be considered beneficial or, at the other extreme, violent. In this case, violence can be understood as the expression of force against self or other, compelling action against one's will on pain of being hurt. Violence is used as a tool of manipulation. Right action is understood as action for which one is responsible. If one understands the consequences of her or his actions, and they accord with engendering benefit, then the action can be said to be right action. It is through the combination of understanding violence (in all its guises) and understanding right action (in what we do and what we choose not to do) that we can reflect on intention in indexing. And if we are concerned with doing beneficial work with our scarce resources, we can make a second assertion.

Assertion Two: Not taking right action in knowledge organization practice is an act of violence.

4.0 Žižek's concept of violence and a third assertion

Slavoj Žižek in Violence describes three forms that frame our understanding of the same: subjective violence, symbolic violence, and systemic violence (the latter two are both considered objective violence). Subjective violence is carried out by a subject, an actor, an identifiable agent. Clear examples of subjective violence are acts of crime, terror, civil unrest, and international conflict (Žižek 2008, 1).
The two forms of objective violence are not as obvious. In Žižek's analysis, these other forms of violence are embedded and invisible to most of our observations. Žižek (2008, 2) writes: Objective violence forms the status quo against which we measure subjective violence. Symbolic violence is the universe of meaning imposed by a language on a group of people and systemic language. Systemic violence is the consequences (often catastrophic) of the smooth functioning of economic and political systems. The most striking example of violence Žižek calls out is the violence of the liberal communist. This is someone who has made money (thereby taking it from others) and has turned around to fix problems in under-privileged and developing world contexts. This asserts a particular socio-political stratification: the liberal global capitalist democracy. In indexing, it is easy to see that objective violence can surface in our work, because our work is rooted in what Žižek calls symbols and systems. First, we use the symbolic systems of language and its more refined subset of indexing languages, often controlled indexing languages. And we operate within systems, as defined by Žižek, that are part of the sociopolitical system, legitimated as components to help the (capitalist) democratic citizen.

Assertion Three: Objective violence (symbolic and systemic) is potentially present in contemporary acts of indexing.

5.0 Assertion number four: toward an engaged, reflective, and intentional practice of knowledge organization

It would seem to me that if we buy the assertion that objective violence can surface in our work, then we have ethical decisions to make to prevent it. We must establish a reflective understanding of intention in indexing. If we establish the perspective that indexing, its practice and its products, are at least complicit in, if not tools for, propagating violence as outlined above, we are then forced to engage with this new stance. I would argue that the action advocated by an ethical consideration for knowledge organization, in this case right action, is an engaged one, and it is here where we can take a nod from contemporary ethical theory advanced by Engaged Buddhism.
In this context, we can see the utility of precepts that guide ethical action and warn against lack of action.

Assertion Four: Engaged knowledge organization acknowledges objective violence in our work and works toward following guiding precepts to teach us how to work with awareness and to work less violently.

The emergence of an engaged indexing will have to be based on our understanding of how we might act in the practice of indexing to prevent violence. That is, we have to ask: at what level of intention should we operate?

6.0 Levels of intention

The philosophy of intention operates in conjunction with other philosophical investigations. For our purposes, we can see indexers as possessing a high level of intention. That is, when they go to do their work, they intend to do indexing. However, beyond this first level of intention, there are others. We can refine our conception of intention by saying that simply because one acts does not mean one acts with the best of intentions. That is, we can assume that acting carries with it an ethical component. We can decide to act for the benefit of ourselves or others, or we can act with the intention of harming others. If we conceive of indexers as interested in benefitting others, we can then begin to examine to what degree. Intention, for our purposes, is performing an action for a specific purpose. If we want to believe we are doing good work, then we have to believe our intentions are good. However, we immediately see the need for guidance. What happens if someone wants to do good work and works to provide access to the written word, but finds they cannot do a certain level of cataloguing because of budgetary restrictions? Or say someone wants to not harm animals, but accidentally steps on an insect? We can see a need to clarify intention in these two cases. In order to solve the philosophical and ethical problem that surfaces from such scenarios, ethicists have constructed a two-part measure for considering how unwholesome an act is. This measure asks: what knowledge do we have of the act, and what is our level of intentionality when carrying out the act? To this end, we end up with two sets of measures.
Five Levels of Intentionality:

1. An action performed without intending to do that particular action (for example, accidentally treading on an insect) without any thought of harming;
2. If one knows that a certain kind of action is evil, but does it when one is not in full control of oneself, for example, when drunk or impassioned;
3. If one does an evil action when one is unclear or mistaken about the object affected by the action;
4. An evil action done where one intends to do the act, fully knows what one is doing, and knows that the action is evil. This is the most obvious kind of wrong action, particularly if it is premeditated;
5. An evil action done where one intends to do the act, fully knows what one is doing (as in 4), but does not recognize that one is doing wrong.

Measures of Knowledge of the Act:

a) One is in a state of mind in which one knows one is doing that act (yes or no);
b) One knows the act to be wrong, if it is intentionally done (yes or no).

These are binary measures (either yes or no) and are combined with intentionality to see the extent to which the act is unwholesome. If we establish the perspective that knowledge organization, its practice and its products, are at least complicit in, if not tools for, propagating violence as outlined above, we are then forced to engage with this new stance. I would argue that the action advocated by an ethical consideration for knowledge organization, in this case right action, is an engaged one, and it is here where we can take a nod from contemporary ethical theory advanced by Engaged Buddhism. In this context, we can see the manifestation of precepts that guide ethical action, and warn against lack of action. Though this is a marked secularization of the religious concepts present in Buddhism (Engaged or otherwise), it does not lose any of its applicability to the professional environment. That is, though we call this Buddhism, it is, in fact, more philosophy than not.
If we replace the core belief that we are setting ourselves free by removing suffering (a religious conception of action) with the belief that we are helping others change their lives for the better through access to information (an ethical conception of action, specifically knowledge organization), then we can trade one for the other without losing the applicability of considering intention and our mental states when engaging in action, and again, specifically when we are engaging in knowledge organization. In an effort to help guide action given the points on intentionality and knowledge of the act outlined above, we can perhaps work with precepts (or ways of judging actions) that specifically wed knowledge organization actions and conceptions of right action drawn from the ethical base of Engaged Buddhism. In order to root this discussion in the latter, I have drawn on Thich Nhat Hanh's writing on precepts for an Engaged Buddhism (Nhat Hanh 1998).

7.0 Precepts for an engaged knowledge organization

What follows is a list of nine precepts that I believe are useful for us to consider and debate in the context of an engaged knowledge organization, its concepts and praxis. The nine precepts are titled by me, have their foundation text quoted from Nhat Hanh (1998), and are followed by a commentary also written by me.

7.1 The precept of bound by doctrine

Do not be idolatrous about or bound to any doctrine, theory, or ideology, even Buddhist [professional] ones. All systems of thought are guiding means; they are not absolute truth.

Here we can see that we are attempting to get at symbolic violence through detachment. We don't need our identity permanently attached to doctrine, theory, or ideology (even professional ones). This then allows us to act in an engaged way when we organize knowledge, but not act in a dogmatic way.

7.2 The precept of knowledge changes

Corollary 1: Schemes need to change
Corollary 2: You must constantly learn new knowledge

Do not think the knowledge you presently possess is changeless, absolute truth. Avoid being narrowminded and bound to present views. Learn and practice non-attachment from views in order to be open to receive others' viewpoints. Truth is found in life and not merely in conceptual knowledge. Be ready to learn throughout your entire life and to observe reality in yourself and in the world at all times.

Here we see an amplification of the first precept. And it is key for semantic, conceptual, and reference violence. We have to be able to change our schemes should violence appear in them, and to learn our whole working life (and beyond).
7.3 The precept of harm (violence) in knowledge organization

Do not force others, including children, by any means whatsoever, to adopt your views, whether by authority, threat, money, propaganda, or even education. However, through compassionate dialogue, help others renounce fanaticism and narrow-mindedness.

Here we see that a major part of acknowledging violence in knowledge organization is the commitment to educate, but only through dialogue and only by avoiding fanaticism and narrow-mindedness, even in thinking we are doing the right thing by helping to change violence.

7.4 The precept of acting because you have knowledge

Do not avoid suffering or close your eyes before suffering. Do not lose awareness of the existence of suffering in the life of the world. Find ways to be with those who are suffering, including personal contact, visits, images and sounds. By such means, awaken yourself and others to the reality of suffering in the world.

For engaged knowledge organization, this relates directly to the belief that we should, upon being educated on the presence of violence in KO, not close our eyes to it.

7.5 Precept of sharing and connection with all

Do not accumulate wealth while millions are hungry. Do not take as the aim of your life fame, profit, wealth, or sensual pleasure. Live simply and share time, energy, and material resources with those who are in need. Do not maintain anger or hatred. Learn to penetrate and transform them when they are still seeds in your consciousness. As soon as they arise, turn your attention to your breath in order to see and understand the nature of your hatred.

Here we see the need to eliminate the ego-self, a distinctively Buddhist concept, but one that I think plays well into an engaged conception of knowledge organization and puts us in check as well. We are not saviors. We are not hoarders of conceptual knowledge. We are not in a position to harbor anger or hatred. We are here to share and make better through our work in knowledge organization.

7.6 Precept of joyful work at the present moment

Do not lose yourself in dispersion and in your surroundings. Practice mindful breathing to come back to what is happening in the present moment. Be in touch with what is wondrous, refreshing, and healing both inside and around you. Plant seeds of joy, peace, and understanding in yourself in order to facilitate the work of transformation in the depths of your consciousness.

And even if we aren't the center of the universe, our health is important to organizing knowledge. We have to feel joy and peace in order to carry out the work, and what's more, we need to lead by example.

7.7 Precept of right language

Do not utter words that can create discord and cause the community to break. Make every effort to reconcile and resolve all conflicts, however small. Do not say untruthful things for the sake of personal interest or to impress people. Do not utter words that cause division and hatred. Do not spread news that you do not know to be certain. Do not criticize or condemn things of which you are not sure. Always speak truthfully and constructively. Have the courage to speak out about situations of injustice, even when doing so may threaten your own safety.

Here we see that speech is an important factor, but more to our purposes in engaged KO, we can see the power of language represented in these precepts. When we acknowledge violences in KO, then we must speak truthfully but not sow conflict.

7.8 Precept of good vocation

Do not use the Buddhist community for personal gain or profit, or transform your community into a political party. A religious community, however, should take a clear stand against oppression and injustice and should strive to change the situation without engaging in partisan conflicts. Do not live with a vocation that is harmful to humans and nature. Do not invest in companies that
deprive others of their chance to live. Select a vocation that helps realise your ideal of compassion.

Here we see Thich Nhat Hanh address his Buddhist community specifically. For us, it would be the community of engaged KO researchers. And we can also see how we benefit and do not benefit from this vocation.

7.9 Precept of scarce resources

Do not kill. Do not let others kill. Find whatever means possible to protect life and prevent war. Possess nothing that should belong to others. Respect the property of others, but prevent others from profiting from human suffering or the suffering of other species on Earth. Do not mistreat your body. Learn to handle it with respect. Do not look on your body as only an instrument. Preserve vital energies (sexual, breath, spirit) for the realisation of the Way. (For brothers and sisters who are not monks and nuns:) Sexual expression should not take place without love and commitment. In sexual relations, be aware of future suffering that may be caused. To preserve the happiness of others, respect the rights and commitments of others. Be fully aware of the responsibility of bringing new lives into the world. Meditate on the world into which you are bringing new beings.

Here we see the primacy of integrity as it relates to individual human beings and their means of living a happy life. If engaged knowledge organization is built on the practices paid for by others, using natural resources as well as people-power, then we have to act with an ethical imperative on helping them realize their full potential through interaction with the written record, through organized knowledge.

Thus there are nine precepts proposed here for an engaged knowledge organization:
1. the precept of bound by doctrine,
2. the precept of knowledge changes,
3. the precept of harm (violence) in knowledge organization,
4. the precept of acting because you have knowledge,
5. the precept of sharing and connection with all,
6. the precept of joyful work at the present moment,
7. the precept of right language,
8. the precept of good vocation,
9. the precept of scarce resources.
8.0 Concluding remarks

I have made four assertions in the course of this paper:
1. We can see this historical conflict between the Industrial Revolution and the Arts and Crafts Movement as a metaphor for conflicting stances on the work done in knowledge organization today.
2. Not taking right action in knowledge organization practice is an act of violence.
3. Objective violence (symbolic and systemic) is potentially present in contemporary acts of knowledge organization.
4. Engaged knowledge organization acknowledges objective violence in our work and works toward following guiding precepts to teach us how to work with awareness and work less violently.

It seems that assertion 1 is not linked to the others. Now we can connect assertion 1 with the rest. In order to make tractable the idea of precepts in engaged knowledge organization, we need to acknowledge the conflict present in contemporary bureaucratic practices of knowledge organization and the art and craft of description. We have a systemic violence at work in the factory life of 19th century Britain. It might be possible to see systemic violence at work in the bureaucratic routine in the design and maintenance of knowledge organization systems, especially those that privilege standardization over aiding users in reaching their full potential through access. The upshot then is perhaps a need for an artistic turn in descriptive practices. And with the artistic turn in knowledge organization, we can realize that we have agency and can operate with intention in carrying out the work of cataloguing, indexing, and classification scheme design.
We can also realize that there is an ethos and ideology in our work, and that we can begin to debate the merits of working in an engaged way, understand our intentions, and test for ourselves whether or not working with precepts is a helpful way forward in knowledge organization.

References

Bade, David. 2006. Colorless green ideals in the language of bibliographic description: making sense and nonsense in libraries. Language and communication 27: 54-80.
Bowker, Geoffrey and Star, Susan Leigh. 1999. Sorting things out: classification and its consequences. Cambridge: MIT.
Campbell, D. Grant. 2001. Queer theory and the creation of contextualized subject access tools for gay and lesbian communities. Knowledge organization 27: 122-31.
Day, Ron E. 2001. Totality and representation: a history of knowledge management through European documentation, critical modernity, and Post-Fordism. Journal of the American Society for Information Science and Technology 52: 725-35.
Furner, Jonathan. 2007. Dewey deracialized: a critical race-theoretic perspective. Knowledge organization 34: 144-68.
Nhat Hanh, Thich. 1998. Interbeing: fourteen guidelines for engaged Buddhism. Berkeley: Parallax.
Olson, Hope A. 2002. The power to name: locating the limits of subject representation in libraries. Boston: Kluwer.
Tennis, Joseph T. 2006. Social tagging and the next steps for indexing. In Furner, Jonathan and Tennis, Joseph T., eds., Proceedings 17th Workshop of the American Society for Information Science and Technology Special Interest Group in Classification Research, Austin, Texas. Available http://journals.lib.washington.edu/index.php/acro/article/viewFile/12493/10992
York, J. and Downie, S. J. 2012. Data in detail. Presentation at the HathiTrust Research Center, UnCamp 2012. Available htrc-uncamp2012-york_htrc_uncamp_sep_2012_-_long.pdf.
Žižek, Slavoj. 2008. Violence. New York: Profile Books.
Knowl. Org. 40(2013)No.1 M. L. de Almeida Campos et al. Information Sciences Methodological Aspects Applied to Ontology Reuse Tools
Information Sciences Methodological Aspects Applied to Ontology Reuse Tools: A Study Based on Genomic Annotations in the Domain of Trypanosomatides
Maria Luiza de Almeida Campos*, Maria Luiza Machado Campos**, Alberto M. R. Dávila***, Hagar Espanha Gomes****, Linair Maria Campos*, and Laura de Lira e Oliveira*
* UFF-GCI-PPGI/UFF, Rua. Tiradentes 148, Ingá, Niterói, RJ, Brasil, <marialuizalmeida@gmail.com>, <linair@cisi.coppe.ufrj.br>, <llira@gbl.com.br>
** UFRJ-PPGI, Athos da Silveira Ramos s/n, Ilha do Fundão, Rio de Janeiro, RJ, Brasil, <mluiza@nce.ufrj.br>
*** FIOCRUZ/IOC, Av. Brasil, 4365, Manguinhos, RJ, Brasil, <davila@ioc.fiocruz.br>
**** Rua. Tiradentes 148, Ingá, Niterói, RJ, Brasil, <hagarespanhagomes@gmail.com>
Maria Luiza de Almeida Campos is Researcher and Professor, Department of Information Science, Graduate Program, Universidade Federal Fluminense. She has a B.S. in documentation and library science from Universidade Federal Fluminense, and a master's and Ph.D. in information science from Universidade Federal do Rio de Janeiro, covenant with Brazilian Institute of Science and Technology. She held a post-doctorate in Laboratório Biologia Molecular da FIOCRUZ in the area of ontologies. Research interests include knowledge organization, models and theories of knowledge representation, terminology, and foundational ontologies.

Maria Luiza Machado Campos is Researcher and Professor, Department of Computer Science of the Mathematical Institute of the Federal University of Rio de Janeiro. She graduated in civil engineering from Universidade Federal do Rio Grande do Sul, and has a master's in systems engineering and computation from Coppe, Federal University of Rio de Janeiro and a Ph.D. in information systems from the University of East Anglia, Norwich, England. Her areas of focus include databases, knowledge management, data warehousing, management of metadata, and ontologies, applied particularly to the areas of bioinformatics, oil, and emergencies.

Hagar Espanha Gomes holds a degree in librarianship and documentation from the National Library Foundation, specializing in master's and bibliographic research by the Brazilian Institute of Bibliography and Documentation, and a Doctorate in documentation. Her areas of interest are: classification, terminology, information architecture and representation, and information retrieval.

Alberto Martín Rivera Dávila graduated with a bachelor's in biological sciences from the Federal University of Mato Grosso do Sul (1997) and a Ph.D. in cell and molecular biology from the Oswaldo Cruz Foundation (2002).
He is currently a senior researcher of the Instituto Oswaldo Cruz, FIOCRUZ, and he has experience in bioinformatics, computational biology, and molecular biology, mainly in the following areas: bioinformatics and computational biology of protozoa; molecular characterization of protozoa; computational aspects of systems biology; and metagenomics.
Laura de Lira e Oliveira studied at the School of Medicine, Medicine and Surgery of Rio de Janeiro (1974), the State University of Rio de Janeiro (1978) and has a master's of information science (covenant UFRJ / IBICT - 1980), and a Ph.D. in information science (UFF / IBICT - 2011). She has experience in medicine, specializing in cardiology and homeopathy. She has interests in the following topics: classification theory, organization of knowledge representation information, and medical terminology.

Linair Maria Campos is Manager of Information Technology CISI / COPPE / UFRJ, with a master's in computer science at IM / NCE / UFRJ (2004) and Ph.D. in information science from the UFF / IBICT (2011). She has over 25 years of experience in IT, having worked in management, development, and maintenance of information systems. Currently, she is also a substitute teacher at UFF.

De Almeida Campos, Maria Luiza, Machado Campos, Maria Luiza, Dávila, Alberto M. R., Espanha Gomes, Hagar, Campos, Linair Maria, and de Lira e Oliveira, Laura. Information Sciences Methodological Aspects Applied to Ontology Reuse Tools: A Study Based on Genomic Annotations in the Domain of Trypanosomatides. Knowledge Organization. 40(1), 50-61. 33 references.

ABSTRACT: Despite the dissemination of modeling languages and tools for representation and construction of ontologies, their underlying methodologies can still be improved. As a consequence, ontology tools can be enhanced accordingly, in order to support users through the ontology construction process. This paper proposes suggestions for ontology tools' improvement based on a case study within the domain of bioinformatics, applying a reuse methodology. Quantitative and qualitative analyses were carried out on a subset of 28 terms of Gene Ontology on a semi-automatic alignment with other biomedical ontologies.
As a result, a report is presented containing suggestions for enhancing ontology reuse tools, derived from the difficulties that we had in reusing a set of OBO ontologies. For the reuse process, a set of steps closely related to those of Pinto and Martins' methodology was used. In each step, it was observed that the experiment would have been significantly improved if ontology manipulation tools had provided certain features. Accordingly, problematic aspects in ontology tools are presented and suggestions are made aiming at getting better results in ontology reuse.

Received 29 April 2012; Revised 24 September 2012; Accepted 27 September 2012
We would like to thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) for partially supporting this work.
1.0 Introduction

During the last few years, initiatives of the international scientific community in the field of genomics have led to an explosive growth of biological information, which keeps growing today. The initial concern was the creation and maintenance of databases to store and describe biological data. As genomes continue to be sequenced and described, studies shifted their focus gradually from genome mapping to the analysis of a broad range of information resulting from the functional characterization of genes by means of molecular biology and bioinformatics. In this scenario, it becomes essential to support the interoperation of data obtained through various research projects around the world, interrelating enzymes, genes, chemical components, diseases, cell types, organs, etc. (Mendes 2005).
Ontologies play an essential role in this process, supporting semantic interoperability of heterogeneous distributed systems in a standard way. The Open Biological and Biomedical Ontologies (OBO) Library (OBO 2009) is a terminology repository developed for shared utilization among several biological and medical domains. Among OBO's most disseminated vocabularies, we can highlight Gene Ontology (GO) (Gene Ontology Consortium 2001). GO is a large vocabulary, comprising more than 38,000 terms (http://www.geneontology.org/GO.downloads.ontology.shtml), independent of organism species (Ashburner and Lewis 2002). Still, although GO has a large number of descriptors, other vocabularies are needed in the biomedical domain, as we can see by the variety of ontologies available in OBO. It is worth noting that some of those ontologies use several terms that are equivalent to GO terms, and sometimes even contain references to GO term IDs, as can be observed in the INOH Molecule Role ontology (Yamamoto et al. 2004). This scenario, considering the complexity of building and maintaining such vocabularies, brings about the issue of ontology reuse.

One important aspect of ontology reuse concerns the principles adopted for the organization of concepts and their relationships, and also for building definitions associated with such concepts. In this context, this study points towards the importance of investigations within the area of information organization in information science. Unfortunately, information about such principles is not always available, and, even when it is, vocabularies are built based on different approaches that require conciliation when their reuse is intended. In this context, this study also points towards the importance of investigations within the area of language compatibility in information science. Research in this area may provide theoretical and methodological guidelines (Gangemi, Steve, and Giacomelli 1996) that can help make ontology reuse tools more useful and precise.

In parallel with the adoption of well founded methodological practices, ontology tools can be improved accordingly to support users throughout the ontology construction process, as well as in providing management strategies for the production and reuse of high quality ontologies. This paper intends to discuss issues that are inherent to ontology reuse as a methodological step towards acquisition of knowledge in ontologies, and thus propose supporting guidelines for ontology mapping and alignment tools. A case study within the domain of bioinformatics is presented, more specifically focused on genome annotation of trypanosomatides at the BiowebDB consortium (Biowebdb 2006).
This paper is organized as follows: in section 2, common kinds of ontology reuse and related work in computer science are presented; in section 3, we discuss the information science perspective on vocabulary compatibilization; in section 4, some of the issues found in our reuse experience are discussed; in section 5, some semantic aspects of reuse and their impact on ontology tools are presented as a result of experiments reusing OBO ontologies; finally, in section 6, future studies are suggested.

2.0 Ontology reuse

Guarino and Musen (2005, 1) highlight the role ontologies have been playing in information systems:
"Building ontologies is now an essential activity that underlies nearly everything we do in the development of computational systems."

Although Gruber's (1993, 1) is the most commonly cited definition of ontology ("an ontology is the specification of a conceptualization"), Guarino (1998, 4) also gives a clear definition: "In its most prevalent use in AI, an ontology refers to an engineering artifact, constituted by a specific vocabulary used to describe a certain reality, plus a set of explicit assumptions regarding the intended meaning of the vocabulary words."

Ontologies can be reused in many ways depending on users' needs and ontologies' availability. Pinto and Martins (2001) divide reuse processes into merging and integration. In a merging process, a single ontology is created from the reuse (partial or total) of two or more ontologies about the same subject. In an integration process, an adapted (some of the concepts will probably be extended, joined, deleted, or reformulated) and independent ontology is created from the reuse (partial or total) of two or more ontologies on different (although possibly related) subjects.

Some authors (Bruijn et al. 2006; Euzenat and Shvaiko 2007) also include alignment as an ontology reuse process. This process, however, differs from those aforementioned by its result: instead of creating an additional ontology, alignment keeps the reused ontologies preserved in their original sources, while creating a set of links between terms of the reused ontologies. Such links express the kind of relationship that connects terms from the reused ontologies and are stored in a separate persistent model. This model is the result of a process named term matching (Euzenat and Shvaiko 2007), which aims to identify terms that express similar concepts. The set of links between ontologies produced by means of the alignment process is a mapping between these ontologies.
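The idea of term matching that yields a separate set of links can be illustrated with a minimal sketch. It assumes that similarity between term names is the matching criterion; the `Link` record, the `match_terms` function, the 0.85 threshold, and the sample term names are all hypothetical illustrations and are not taken from any tool or data set discussed in this paper.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Link:
    """One alignment link between a term of each reused ontology."""
    source_term: str
    target_term: str
    relation: str   # hypothetical relation label, e.g. "similar"
    score: float    # name similarity in the range [0, 1]


def normalize(name: str) -> str:
    # Lower-case the name and treat underscores and repeated spaces alike.
    return " ".join(name.lower().replace("_", " ").split())


def match_terms(source, target, threshold=0.85):
    """Return candidate links whose name similarity reaches the threshold.

    The two term lists are only read, never modified, so the reused
    ontologies stay as they are and the links live in a separate list.
    """
    links = []
    for s in source:
        for t in target:
            score = SequenceMatcher(None, normalize(s), normalize(t)).ratio()
            if score >= threshold:
                links.append(Link(s, t, "similar", round(score, 3)))
    return links


# Illustrative term names only (assumption, not the study's data).
ontology_a = ["protein binding", "cell nucleus", "DNA repair"]
ontology_b = ["Protein_Binding", "nucleus", "RNA splicing"]

mapping = match_terms(ontology_a, ontology_b)
for link in mapping:
    print(link)
```

In this sketch only the pair whose normalized names coincide is linked; "cell nucleus" and "nucleus" fall below the threshold, which is the kind of candidate that a purely name-based comparison misses and that a domain specialist would have to review.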
Information contained in the mapping will depend on the type of semantic relationship existing among elements and on the type of formalism used in the ontology to represent its semantics. For example, two elements may be similar (to varying degrees), or one can be a part of the other, or they may have some other kind of relation that is identified with the help of a domain specialist. One of the issues of mapping concerns how to find candidates. Another aspect involving mapping concerns the type of technique employed to estimate candidates. It can be based, among other aspects: i) on similarities between terms' names; ii) on the ontology structure, such as, for instance, considering the terms' positions within the hierarchical structure of ontologies under comparison, or their part-of relations, or even other types of relations (Euzenat and Shvaiko 2007); iii) on the addition of supplementary knowledge, such as information from another ontology or vocabulary with a concept hierarchy, such as WordNet, which may be used to search for synonyms (Reynaud and Safar 2007).

2.1 Studies related to ontology reuse in computer science

Regarding methodological aspects of how to reuse ontologies, to the best of our knowledge, the literature is more often concerned with computational aspects, such as which algorithms are most effective to promote compatibility among ontologies regarding both the accuracy and the speed of their results (Choi, Song, and Han 2006). Nevertheless, some authors propose general tasks that are necessary in the reuse process. Gangemi, Steve, and Giacomelli (1996), for instance, state that it is necessary to identify the basic terms and their necessary and sufficient conditions in textual format. However, they provide no suggestion on how to perform such identification, or on which principles should be used to build the definitions. The more comprehensive view of Pinto and Martins (2001), on the other hand, suggests that the reuse process actually starts during the selection of ontologies to be reused. No systematic details are given, though, on how to perform such tasks.

Some of the many studies carried out by Guarino (1998), Barry Smith (2005), and Guizzardi et al. (2011), although not directly focused on reuse per se, may help the process, since they explore the semantic and formal nature of the concepts of an ontology.
In practice, Guarino's Formal Ontology, as well as Guizzardi's UFO Ontology, can be defined as theories of prior distinctions concerning entities of the world (physical objects, events, regions, amounts of matter) and meta-level categories used to model the world (concepts, properties, qualities, states, roles, and parts). Guarino accepts the creation of several, not necessarily complementary, views of the same domain, which he calls possible worlds. Barry Smith (2005), on the other hand, is inspired by the Aristotelian Theory of Classes to suggest a jointly developed set of axioms and definitions to be applied in the biomedical domain. Smith, as opposed to Guarino and Guizzardi, advocates the idea that there is only one, commonly agreed, possible world, albeit with different, orthogonal, complementary views.
3.0 Vocabulary compatibilization in information science

Semantic issues have been objects of study and research in Information Science since the beginning of the second half of the last century within a computer environment. Such studies focused on the construction and compatibilization of documentary languages, and their contributions are still valid for compatibilization among and reuse of ontologies. Two methods distinctly stand out among others used for converting and creating compatibility between languages based on the integration of vocabularies. These are Neville's thesaurus reconciliation method (Neville 1972) and Dahlberg's concept correlation matrix (Dahlberg 1983a).

Neville's method is based on the principle that concepts (the conceptual contents of descriptors, which are expressed by the definitions), and not descriptors alone, must be made compatible. This method suggests an intermediate language approach, based on the numeric coding of concepts and a series of 11 scenarios with rules to treat vocabulary compatibility issues, which enables the establishment of a conceptual equivalence of descriptors of different languages.

The method suggested by Dahlberg is based on the construction of a concept compatibility matrix and a concept register. The concept compatibility matrix provides the results of the language compatibility analysis from the semantic and structural points of view. The first step to elaborate the matrix is the verbal matching of terms. In the second step, additional information supports the understanding of the terms' intended meaning by means of a conceptual analysis, whose result is recorded in a concept register. The concept register may be implemented as a database table, although Dahlberg did not propose a solution to implement it computationally.
It contains some useful information that helps to identify the semantics associated with each concept, such as: i) the name of the concept in other vocabularies; ii) the concept's form category, which indicates its nature, e.g., if it is an object, a process, a quality; iii) additional information about the concept, for instance, its source; and, iv) related concepts. Recent studies include these issues within KOS (Zeng and Chan 2004). Nevertheless, this paper aims at pointing to a better concept description so that automatic compatibilization procedures work with better precision.
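Since the concept register may be implemented as a database table, the four kinds of information listed above can be sketched as columns of one table. This is only an illustration under stated assumptions: Dahlberg proposed no computational implementation, so the table name `concept_register`, its column names, the helper functions, the choice to store lists as semicolon-separated text, and the two sample entries are all hypothetical.

```python
import sqlite3

# In-memory database; the schema below is an assumption made for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE concept_register (
        concept        TEXT PRIMARY KEY,  -- name in the master vocabulary
        other_names    TEXT,              -- i) names in other vocabularies
        form_category  TEXT,              -- ii) object, process, quality, ...
        source         TEXT,              -- iii) additional information
        related        TEXT               -- iv) related concepts
    )
    """
)


def add_concept(concept, other_names, form_category, source, related):
    # Lists are flattened to semicolon-separated text to keep the sketch short.
    conn.execute(
        "INSERT INTO concept_register VALUES (?, ?, ?, ?, ?)",
        (concept, "; ".join(other_names), form_category, source,
         "; ".join(related)),
    )


def concepts_in_category(form_category):
    """Return the registered concepts that share one form category."""
    rows = conn.execute(
        "SELECT concept FROM concept_register "
        "WHERE form_category = ? ORDER BY concept",
        (form_category,),
    )
    return [row[0] for row in rows]


# Illustrative entries only (assumption, not the study's data).
add_concept("DNA repair", ["repair of DNA"], "process",
            "illustrative entry", ["DNA replication"])
add_concept("nucleus", ["cell nucleus"], "object",
            "illustrative entry", ["nuclear envelope"])

print(concepts_in_category("process"))
```

Recording the form category next to each concept lets a compatibilization procedure restrict comparisons to concepts of the same nature, for example processes with processes, before any name-based matching is attempted.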
3.1 Information organization in information science

Literature on information organization in the field of information science proved to be helpful, specifically those theories strongly related to the representation of concept systems. In those, there are solid European theoretical foundations for the elaboration of documentary languages, providing a semantic base for integration. Examples are Ranganathan's faceted classification theory (1967) and Dahlberg's concept theory (1978), which allow the representation of knowledge domains. Ranganathan elaborated a series of principles and canons for knowledge classification, which intended to allow concepts of a knowledge domain to be structured in a systematic way. That is, concepts are organized in arrays and chains, which are, in turn, structured in comprehensive classes, called facets, and the latter are organized within a given Fundamental Category. The grouping of all categories comprises a concept system for a given subject area, and each concept within the category is also the manifestation of that category (Ranganathan 1967).

4.0 Case study methodology and results

The purpose of this paper is not to propose a reuse methodology, but to show that compatibilization criteria developed in information science are valid for ontology reuse. We may also take advantage of an existing methodology, such as that of Pinto and Martins (2001), to illustrate how ontology tools can benefit from a joint approach between theory and practice, in the scope of a reuse scenario. The sample of concepts (knowledge capture) consists of a set of GO terms used in a manual genomic annotation of Trypanosoma rangeli made by biologists of the BiowebDB group during a master's research project (Wagner 2006).
This group of terms constitutes a coherent set present in the three branches of GO (cell component, molecular function, biological process); biologists are familiar with those concepts and relations, and this is important to validate the structure of these terms when comparing with other ontologies. The result of the annotation of T. rangeli consists of 865 term classes distributed among those categories. The steps of ontology reuse can be summarized as follows: i) finding and selection of candidate ontologies; ii) evaluation of candidate ontologies by domain experts and ontology engineers; iii) final selection of ontologies to be integrated; and, iv) application of operations towards ontology integration, which we consider as a semi-automatic procedure.
4.1.1 Step I: finding and selecting candidate ontologies

To begin with, GO was considered the master ontology. One of the criteria for selecting a master vocabulary is its completeness (Dahlberg 1981). Among the OBO ontologies, especially for functional genomic annotation, GO is the most complete and the most widely used. It was therefore assumed from the beginning that GO could serve as the master ontology for the experiment. To identify themes for the compatibilization, a domain study was conducted. Many researchers have studied how to approach a given knowledge domain (Soergel 1982, 1997; Lancaster 1986; Hjørland 2002, 2003, 2004; Broughton et al. 2005; Gnoli and Hjørland 2009). They provide systematic guidelines for a preliminary domain analysis. The support provided by these theoretical contributions, and by others from the social sciences (Latour 1997), has allowed the elaboration of a preliminary draft of thematic groupings for the domain of trypanosomatides. At first, ten thematic groups were identified: protists; functional and systems biology; molecular biology and genomics; evolutive molecular genetics; comparative genomics; phylogeny; bioinformatics; diseases; metagenomics; and targets for drugs, each one with its own sub-groupings. The purpose of this selection was to identify a set of ontologies to be reused with the aid of software tools. This strategy is in accordance with Neville's feasibility study for the reconciliation of thesauri (Neville 1972) and with Dahlberg's intermediate language proposal for compatibilization (Dahlberg 1983b). The intermediate language, or master vocabulary, would be the starting point when establishing equivalence relations with terms of other ontologies. To identify ontologies that might be useful for the experiment, a search was made on the OBO site, where each ontology has a brief summary of its scope.
Through these summaries, it was possible to identify the ontologies that matched the thematic areas previously chosen and, at the same time, to verify their ontological commitment. This scope analysis showed, for example, that "molecular role" in one ontology does not refer to molecular roles but is, in fact, an ontology of proteins (Campos 2011). This step led to a selection of eleven ontologies that could be of interest to researchers on trypanosomatides.

4.1.2 Step II: evaluation of candidate ontologies by domain experts and ontology engineers

The selection of candidate ontologies for the experiment was validated through seminars with the research group. To support the ontology evaluation process, Onto-Edit was used, as it allows user-friendly visualization of concepts and hierarchies. Such visualization clearly shows the taxonomies inherent to each ontology and is useful for compatibilization and, where applicable, for further integration (Jie, Fei, and Sheng-Uei 2011). From the initial group of ten ontologies, six were confirmed by end-users as being of interest. These had already been used as knowledge sources, so that classes of interest within the domain of Trypanosomatides could be easily identified.

4.1.3 Step III: selecting ontologies

Laboratory researchers selected the following ontologies: NCBI organismal classification, pathway, sequence types and features (SO), Brenda tissue/enzyme source, Event-INOH pathway ontology, multiple alignment and system biology (Open Biomedical Ontologies 2009).

4.1.4 Step IV: applying operations towards ontology integration

Two procedures were required in order to evaluate the degree of compatibilization between GO and each selected ontology: the identification of hierarchies among the selected ontologies, and the semantic analysis of each term within each hierarchy. These also contributed to verifying their potential for reuse.

4.2 Identification of hierarchies

Hierarchies were visualized with the OBO-Edit editor (Day-Richter, Harris, and Haendel 2007), which supports multiple forms of visualization. Requirements not supported by this tool also arose, such as facilities for recording the justification of the choices made and for thematic superimposition. Terms in the selected ontologies were mapped with the Prompt tool. OntoExplore was developed to allow more effective searching and exploration of the hierarchy of terms in several biological ontologies, with dynamic trees, ontology subsets, and information retrieval. This tool was used afterwards in the process of genomic annotation, thus providing researchers with computational support.
OntoExplore was developed in Java, using the Jena API (http://jena.sourceforge.net) to parse ontologies in OWL and RDF formats and the Prefuse Visualization Toolkit (http://prefuse.org) to implement interactive data visualization mechanisms.
OntoExplore allows: (i) visualization and comparison of term hierarchies in different ontologies. It is possible to select a term and visualize its hierarchy in two different ontologies (see Figure 1), and thus to check and study the hierarchy of terms. It also allows (ii) the searching of terms within multiple ontologies. The goal is to find similar terms. Sometimes a term exists in another ontology under a different name; to handle this, a synonym-based search was implemented. The purpose of OntoExplore is to align ontologies by means of an algorithm that explores their hierarchical structure, the term, and also the semantic nature of the concepts, according to the Classification Theory (Ranganathan 1967). For the latter to be possible, the root classes of two of the reused ontologies were manually associated beforehand with terms denoting Fundamental Categories. Our goal in building such a tool instead of using existing ones, like Prompt (Noy and Musen 2003), is to implement and test some of the aspects we consider important to ontology mapping, in order to evaluate their helpfulness in the reuse process. The use of Fundamental Categories is an example of such aspects. Of the 865 GO terms, 28 were common to the selected ontologies. That means that each term was listed at least once in each ontology besides GO. In great measure, terms were found only in GO and Event (INOH pathway ontology). This result suggests that this ontology has enough thematic superimposition with GO. It deals with biological events such as mechanisms of gene expression and immunological response, concepts that belong to the Functional Genomics domain, and biological process is one of the three components of GO.
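The two OntoExplore functions described above can be illustrated with a minimal Python sketch. It is not the OntoExplore implementation (which was written in Java); the `Term` class, the function names, the single-parent simplification, and the synonym "DNA damage repair" are all assumptions made for illustration. Only the parent terms of DNA repair are taken from the comparison reported in this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    """One ontology term; `parent` is the label of its generic term (None at a root)."""
    label: str
    parent: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)


def normalize(name: str) -> str:
    """Lower-case and treat hyphens/underscores as spaces so spelling variants compare equal."""
    return " ".join(name.lower().replace("-", " ").replace("_", " ").split())


def find_term(query: str, ontologies: Dict[str, List[Term]]) -> Dict[str, str]:
    """Map each ontology name to the label of the first term whose label or synonym matches."""
    wanted = normalize(query)
    hits = {}
    for onto_name, terms in ontologies.items():
        for term in terms:
            if wanted in {normalize(n) for n in [term.label, *term.synonyms]}:
                hits[onto_name] = term.label
                break
    return hits


def ancestors(label: str, terms: List[Term]) -> List[str]:
    """Follow parent links upward and return the ascending hierarchy of a term."""
    by_label = {t.label: t for t in terms}
    chain = []
    current = by_label[label].parent
    while current is not None:
        chain.append(current)
        current = by_label[current].parent
    return chain


# Toy fragments only; the synonym "DNA damage repair" is invented for illustration.
ONTOLOGIES = {
    "GO": [Term("DNA metabolic process"), Term("DNA repair", "DNA metabolic process")],
    "INOH": [
        Term("molecular event"),
        Term("DNA Repair", "molecular event", synonyms=["DNA damage repair"]),
    ],
}

# Show the same term with its ascending hierarchy in each ontology.
for name, label in find_term("dna repair", ONTOLOGIES).items():
    print(name, label, ancestors(label, ONTOLOGIES[name]))
```

Placing the two ascending hierarchies side by side is the textual counterpart of the comparison in Figure 1. Real ontologies such as GO allow several parents per term, so a fuller version would return a set of paths rather than a single chain.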
This paper limits discussion to the results of experiments between GO and INOH.

Figure 1. Comparing the same term hierarchy in two ontologies

4.3 Semantic analysis

Once a term is found in a target ontology, a subset is derived from that ontology, composed of the term's ascending and descending hierarchy within GO and INOH. Mapping is done with the assistance of the Prompt tool and of our prototype tool. Each resulting mapping is then manually analyzed based on: (i) similarities in term designations; (ii) semantic similarity indicating concepts of similar nature (logically related); (iii) relations indicating concepts that are not similar, but that may be associated by means of category (logic) relations which are relevant to the domain, for instance, between a protein and a biological process in which it participates, or between a biological process and its product. Term definitions were also analysed according to methodological principles (Dahlberg 1978a, 1978b, 1981, 1983a, 1983b; Neville 1972). Terms and their definitions in both GO and INOH were manually analyzed and compared, in order to verify to what extent future automatic processing would provide consistent results with respect to semantic aspects; in other words, to evaluate the semantic potential for compatibilization by comparing the characteristics common to the definitions. This analysis made it possible to measure the degree of verbal and conceptual compatibility. Verbal coincidence was the starting point, followed by an analysis of definitions to establish conceptual coincidence. When no definition was provided, it was necessary to observe the position of the term in its respective hierarchy, so that similarity of classification between ontologies could be observed. Dahlberg's matrix of semantic compatibility starts with verbal coincidence. The rate of verbal coincidence indicates the possibility of measuring the degree of conceptual compatibility. Two measures were investigated:
(i) conceptual coincidence; and (ii) conceptual correspondence. Homonyms were also investigated.

4.3.1 Conceptual coincidence

Conceptual coincidence occurs when, for the same verbal form and the same content, 80% or more of the characteristics occur in both definitions. In this case, 31% of terms are considered conceptually identical. Two different situations were identified, however, in relation to hierarchical structure:

1 Some terms possess the same generic term: in both ontologies, "cell-cell signaling" is subordinated to "cell communication." It is worth observing that the definition of cell-cell signaling in both ontologies is: "Any process that mediates the transfer of information from one cell to another."

2 Some terms have different generic terms: DNA repair, in GO, is subordinated to "DNA metabolic process"; in INOH, it is subordinated to "molecular event." It is worth noting that DNA repair, in both ontologies, is defined as: "The process of restoring DNA after damage. Genomes are subject to damage by chemical and physical agents in the environment (e.g., UV and ionizing radiations, chemical mutagens, fungal and bacterial toxins, etc.) and by free radicals or alkylating agents endogenously generated in metabolism. DNA is also damaged because of errors during its replication. A variety of different DNA repair pathways has been reported that include direct reversal, base excision repair, nucleotide excision repair, photoreactivation, bypass, double-strand break repair pathway, and mismatch repair pathway."

The analysis of these terms indicates that, although they seem conceptually identical according to their definitions, they have different hierarchies, and a conflict will arise in an automatic analysis when determining conceptual similarity. This is due to the lack of a definition pattern ensuring that the first element in a definition is its immediate superordinate term in the conceptual chain. In this case, they were considered identical.

4.3.2 Conceptual correspondence

Conceptual correspondence occurs when terms with the same verbal form and similar concept content share 60-79% of the characteristics in both definitions; such terms are considered quasi-synonyms. In this case, 63% of terms can be considered quasi-synonyms. "Organ morphogenesis" is an example. In INOH, the term has the following definition: "Morphogenesis of a tissue or tissues that function together to perform a specific function. Organs are commonly observed as visibly distinct structures, but may also exist as loosely associated clusters of cells that function together as to perform a specific function." In GO, the definition is as follows: "Morphogenesis of an organ. An organ is defined as a tissue or set of tissues that work together to perform a specific function or functions. Morphogenesis is the process by which anatomical structures are generated and organized. Organs are commonly observed as visibly distinct structures, but may also exist as loosely associated clusters of cells that work together to perform a specific function or functions."

4.3.3 Homonyms

Only two terms (6%) in each ontology were semantically different, so they were considered homonyms (for example, Phosphorylation). The definition in INOH is as follows: "Reversible reaction that can affect D, C, H, S, T, Y, R residues." The definition in GO is: "The process of introducing a phosphate group into a molecule, usually with the formation of a phosphoric ester, a phosphoric anhydride or a phosphoric amide."

The analyses showed that, for lack of a pattern, definitions do not allow consistent results when the degree of semantic compatibility between concepts is processed automatically. As observed above, even with conceptual coincidence (identical definitions), two identical terms cannot be considered identical when their hierarchical structures do not match. To obtain a consistent degree of semantic compatibilization between ontologies, either the definitions will have to be revised, or software will have to be developed that quantifies the characteristics shared by definitions and verifies the superordinate term in each hierarchy. Correspondence between concepts would be easier to obtain if granularity, synonymy, and principles for a standard terminology were established beforehand.

The use of categories associated to ontologies was one of the functionalities added to OntoExplore. It resulted in increased accuracy when handling false positives, which brings us closer to the ideal set of intended mappings. As an example, we can mention the case of the excretion concept, found in the GO and Brenda ontologies. In the former, the term refers to a process and means the elimination of excreta by an organism, resulting from metabolic activity. In the latter, it refers to the product of an activity and means the matter, such as urine or sweat, excreted by blood, tissues, or organs. When both ontologies are mapped through the Prompt tool, it indicates that the terms are similar, but they actually require a semantic analysis. Similarly, the terms transporter, from the MoleculeRole ontology, and transport, from GO, also generate false positives in the mapping suggested by Prompt.
Transport, as in GO, is a process, defined as "processes specifically pertinent to the activities of integrated living units: cells, tissues, organs and organisms." Transporter, as in MoleculeRole, on the other hand, is a protein, defined as linking specific solutes to be transported and undergoing a series of conformation changes to transfer the linked solute. As can be seen, these term pairs, despite their linguistic similarity, denote concepts of distinct natures (different categories); Prompt should therefore not have suggested those terms as mapping candidates. A person can observe this, but the tool provides no mechanism to register it, so the same issue will have to be dealt with again whenever the ontologies are aligned.
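The thresholds used in Sections 4.3.1 to 4.3.3, together with the check on the generic term, could be automated along the following lines. This is a minimal sketch under stated assumptions, not the procedure used in the experiment, which was manual: the characteristics are assumed to have been extracted by hand into sets, the percentage is computed over the union of both sets (the paper does not state the base), and everything below 60% is treated as a homonym. All function names and the characteristic sets are hypothetical.

```python
from typing import Optional, Set


def shared_percentage(chars_a: Set[str], chars_b: Set[str]) -> float:
    """Percentage of characteristics common to both definitions.

    Assumption: the base is the union of both sets.
    """
    union = chars_a | chars_b
    if not union:
        return 0.0
    return 100.0 * len(chars_a & chars_b) / len(union)


def compatibility(label_a: str, chars_a: Set[str], label_b: str, chars_b: Set[str]) -> str:
    """Classify a term pair, starting from verbal coincidence as in Dahlberg's matrix."""
    if label_a.strip().lower() != label_b.strip().lower():
        return "no verbal coincidence"
    pct = shared_percentage(chars_a, chars_b)
    if pct >= 80:
        return "conceptual coincidence"
    if pct >= 60:
        return "conceptual correspondence"
    return "homonym"  # assumption: anything below 60% counts as a homonym


def same_generic_term(parent_a: Optional[str], parent_b: Optional[str]) -> bool:
    """Check whether both terms are subordinated to the same generic term."""
    if parent_a is None or parent_b is None:
        return False
    return parent_a.strip().lower() == parent_b.strip().lower()


# Hand-extracted characteristics; these sets are illustrative, not taken from GO or INOH.
repair = {"process", "restoring DNA", "after damage"}
organ_inoh = {"morphogenesis", "tissue", "specific function", "distinct structure"}
organ_go = {"morphogenesis", "tissue", "specific function"}

print(compatibility("DNA repair", repair, "DNA Repair", set(repair)))
print(same_generic_term("DNA metabolic process", "molecular event"))
print(compatibility("organ morphogenesis", organ_inoh, "organ morphogenesis", organ_go))
```

Reporting the generic-term check separately from the percentage reflects the DNA repair case: identical definitions but different superordinate terms, which is the conflict an automatic analysis would need to surface for a human decision.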
In this context, it was possible to manually confirm a suggested relation between the terms excretion (Brenda) and excretion (GO) by means of a process-product category relation; that is, excretion (a matter, in Brenda) is the product of excretion (an activity, in GO). Similarly, transporter (a protein, in MoleculeRole) participates in transport (a process, in GO), which suggests a relation between a biological object (a protein) and a process (transport). In this scenario, OntoExplore provides mechanisms, absent in Prompt, to persist the acknowledgement of the validity of such a relation, so that an alignment attempt can be made in an incremental and more precise way.

5.0 Semantic aspects of reuse and the impact on ontology tools

During this work on OBO ontologies, several aspects of importance to ontology reuse were found, considering not only machines but also humans: (a) concept comprehension; (b) concept categorization; (c) concept definition; (d) ontological commitment elucidation; (e) concept matching; and (f) ontology articulation.

Aspect (a) regards showing people (and not machines) information about the ontology as a whole (for instance, its purpose and design rationale) and about the intended meaning of each term in each ontology, as accurately as possible. This may be helpful when people are trying to understand the perspectives used by different ontologies to represent domain knowledge.

Aspect (b) regards providing people with some input about the principles by which ontology categories are organized. This may be particularly helpful if one intends to extend the ontology or relate it to another one, because it helps prevent ambiguous categorization or association. Some of these principles can be formalized in order to be used by tools, for instance, in the context of ontology alignment.
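One way such a principle might be formalized is sketched below, using the excretion and transporter examples from Section 4.3.3. The category names ("process", "product", "object"), the relation labels, and the `confirmed` register are assumptions made for illustration; they are not the Fundamental Categories or the data model actually used in OntoExplore.

```python
from typing import Dict, Tuple

# Hypothetical category relations: (category of term A, category of term B) -> relation.
CATEGORY_RELATIONS: Dict[Tuple[str, str], str] = {
    ("product", "process"): "is product of",
    ("process", "product"): "has product",
    ("object", "process"): "participates in",
    ("process", "object"): "has participant",
}

# Category assigned to each (ontology, term) pair; assumed to be assigned manually.
CATEGORY: Dict[Tuple[str, str], str] = {
    ("Brenda", "excretion"): "product",
    ("GO", "excretion"): "process",
    ("MoleculeRole", "transporter"): "object",
    ("GO", "transport"): "process",
}

# Register of relations acknowledged by an expert, kept so that later
# alignment attempts do not have to revisit the same pair.
confirmed: Dict[Tuple[Tuple[str, str], Tuple[str, str]], str] = {}


def suggest_relation(category_a: str, category_b: str) -> str:
    """Propose equivalence only within one category, a category relation otherwise."""
    if category_a == category_b:
        return "equivalence candidate"
    return CATEGORY_RELATIONS.get((category_a, category_b), "no relation suggested")


def confirm(term_a: Tuple[str, str], term_b: Tuple[str, str]) -> str:
    """Record the relation suggested for a term pair in the register and return it."""
    relation = suggest_relation(CATEGORY[term_a], CATEGORY[term_b])
    confirmed[(term_a, term_b)] = relation
    return relation


print(confirm(("Brenda", "excretion"), ("GO", "excretion")))
print(confirm(("MoleculeRole", "transporter"), ("GO", "transport")))
```

Under these assumptions, a pair of linguistically similar terms from different categories is never offered as an equivalence candidate, which is the kind of false positive described for Prompt, while the register keeps the relation an expert has already accepted.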
Aspect (c) regards improving consistency among the definitions of terms and, with such well-formed definitions, helping people to organize and extend the taxonomic structure of an ontology. In addition, the use of standard definitions can improve the results of ontology tools (for instance, in mining operations that propose relations between terms), which can be configured to take advantage of such semi-structured information.

Aspect (d) regards helping people to evaluate and decide whether the ontologies considered in a first selection are useful for the purpose they have in mind.
Aspect (e) regards providing an overview of the issues encountered during an ontology compatibility enterprise. Although it may be difficult to keep such records up to date when ontologies change, it is worth keeping this information available to an organized ontology community (such as OBO) as feedback from the process of ontology compatibilization; it can be used to improve the evolution of ontologies.

Finally, aspect (f) regards helping users envision possibilities for extending the scope of a particular ontology by connecting its terms with terms of another ontology that may complement it.

It is worth noting that this list does not claim to be exhaustive; it is, instead, a proposed set of issues derived from our own experience of reusing a set of OBO ontologies. We have also observed that the aspects on our list are present among the steps carried out in a reuse process and so should be present, in some form, in an ontology reuse methodology. It is assumed that a more consistent and accurate reuse can be achieved if ontology tools reflect the multiple aspects and steps of ontology reuse accordingly. Some of those aspects, following the theories presented so far and the experiments conducted in the BiowebDB project, are illustrated in Table 1. Their usage, as situated within the steps of a reuse methodology, may improve the precision of ontology compatibilization mechanisms. This happens because they enhance the semantics associated with the concepts of ontologies and help users to accomplish most of the tasks carried out in each step of such a reuse methodology.

6.0 Conclusion

Although the aforementioned tools provided valuable help in reusing ontologies, especially regarding the task of finding candidate terms (matching) for mapping, many of the features observed as useful for reusing ontologies are lacking.
In this sense, further studies will investigate whether the application of the proposed suggestions, based on semantic aspects of reuse, contributes to an increase in accuracy when using software tools. Future enhancements, modifications and investigations are necessary to improve OntoExplore, such as providing a broad set of metrics to compare term hierarchies in distinct ontologies. This paper points to the need for systematization of definitions when constructing semantic tools. It is important to follow a pattern that reveals the nature of concepts and their epistemological contexts, or the best computing tools will continue to produce unsatisfactory results. The biomedical domain is complex and challenging. In this scenario, the experiment points towards suggestions for the improvement of ontology tools. Existing tools lack mechanisms to deal accurately with large and multiple ontologies and to help users understand their purpose, subject, scope, and ontological commitment. This paper proposes enhancements that can be made to ontology tools in order to provide features consonant with ontology reuse methodologies. Such enhancements, had they existed, would have been of great utility, as pointed out in the experiment. The experiment suggests the possibility of applying theoretical principles of the compatibilization of documentary languages to the ontology domain, aiming at a better classification in a taxonomy.

Table 1. Suggestions to improve precision on ontology reuse tools

Aspect: (a) Concepts comprehension
Tools should tackle: Exhibit the matching concepts' definitions alongside their main hierarchic structure.
How to do it: Multiple ontology visualization mechanisms, showing term definitions and other relevant information, such as concept categorization and ontological commitment.

Aspect: (b) Concepts categorization
Tools should tackle: Support documenting and viewing the fundamental categories under which the ontology has been built.
How to do it: Standard metadata associated to each concept of each ontology, possibly stored as a concept register, similar to Dahlberg's proposal, on an ontology repository provided along with the tool.

Aspect: (c) Concepts definition
Tools should tackle: Allow definition of concepts based on patterns, possibly associated to underlying fundamental categories identified.
How to do it: Standard metadata associated to each category, suggesting attributes that should be present in the definition of a concept belonging to such category.

Aspect: (d) Ontological commitment elucidation
Tools should tackle: Allow documenting and viewing the principles under which the ontology has been built, its purpose, scope, subject, and premises, among others.
How to do it: Standard metadata associated to each ontology, possibly stored on an ontology repository provided along with the tool.

Aspect: (e) Concepts matching analysis
Tools should tackle: Show compatibility issues, as presented by Neville, e.g., difference in granularity; different number of terms to denote the same concept; synonyms; homonyms. Provide support to suggest standard term names.
How to do it: Standard metadata, related to such compatibility issues, associated to the equivalence relationship between matched concepts.

Aspect: (f) Ontologies articulation possibilities
Tools should tackle: Offer suggestion of relationships between concepts in different ontologies, possibly underpinned by categorical relations occurring between concepts that belong to those categories.
How to do it: Through analysis of concept definitions and domain analysis; e.g., when one concept refers to another already existent in one of the ontologies to be articulated and the concepts involved belong to categories that may be related.
References

Ashburner, Michael and Lewis, Suzanna. 2002. On ontologies for biologists: the gene ontology uncoupling the web. In Bock, Gregory and Goode, Jamie A., eds., In silico simulation of biological processes: Novartis Foundation symposium 247. Chichester, UK: John Wiley & Sons, pp. 66-80.
BiowebDB Consortium. [2009]. Comparative genomics approaches. Available http://biowebdb.org/.
Broughton, Vanda; Hansson, Joacim; Hjørland, Birger and López-Huertas, Maria José. 2005. Knowledge organization. In Kajberg, Leif and Lørring, Leif, eds., European curriculum reflections on library and information science education. Copenhagen: Royal School of Library and Information Science, pp. 133-48.
Bruijn, Jos, Ehrig, Marc, Feier, Cristina, Martín-Recuerda, Francisco, Scharffe, François and Weiten, Moritz. 2006. Ontology mediation, merging and aligning. In Davies, John, Studer, Rudi and Warren, Paul, eds., Semantic web technologies: trends and research in ontology-based systems. Chichester, UK: John Wiley & Sons.
Campos, Linair M. 2011. Diretrizes para definição de recorte de domínio no reúso de ontologias biomédicas: uma abordagem interdisciplinar baseada na análise do compromisso ontológico. PhD dissertation. Universidade Federal Fluminense / Instituto Brasileiro de Informação em Ciência e Tecnologia.
Choi, Namyoun, Song, Il-Yeol and Han, Hyoil. 2006. A survey on ontology mapping. SIGMOD record 35 no. 3: 34-41.
Dahlberg, Ingetraut. 1978a. A referent-oriented, analytical concept theory of INTERCONCEPT. International classification 5: 142-51.
Dahlberg, Ingetraut. 1978b. Teoria do conceito. Ciência da informação 7: 101-07.
Dahlberg, Ingetraut. 1981. Towards establishment of compatibility between indexing languages. International classification 8: 88-91.
Dahlberg, Ingetraut. 1983a. Conceptual compatibility of ordering systems. International classification 10: 5-8.
Dahlberg, Ingetraut. 1983b. Terminological definitions: characteristics and demands. In Problèmes de la définition et de la synonymie en terminologie. Québec: Girsterm, 13-51.
Day-Richter, John, Harris, Midori A., Haendel, Melissa, The Gene Ontology OBO-Edit Working Group and Lewis, Suzanna. 2007. OBO-Edit: an ontology editor for biologists. Bioinformatics 23: 2198-200.
Euzenat, Jérôme and Shvaiko, Pavel. 2007. Ontology matching. Berlin: Springer Verlag.
Gangemi, Aldo, Steve, Geri and Giacomelli, Fabrizio. 1996. ONIONS: an ontological methodology for taxonomic knowledge integration. In ECAI-96 workshop on ontological engineering.
Gene Ontology Consortium. 2001. Creating the gene ontology resource: design and implementation. Genome research 11: 1425-33.
Gnoli, Claudio and Hjørland, Birger. 2009. Letter to the editor: Phylogenetic classification revisited. Knowledge organization 36: 78-79.
Gruber, Thomas R. 1993. A translation approach to portable ontology specifications. Knowledge acquisition 5: 199-220.
Guarino, Nicola. 1998. Formal ontology in information systems. Proceedings of FOIS'98, Trento, Italy, 6-8 June 1998. Amsterdam: IOS Press, pp. 3-15.
Guarino, Nicola and Musen, Mark A. 2005. Applied ontology: focusing on content. Applied ontology 1: 1-5.
Guizzardi, Giancarlo, Almeida, João Paulo, Guizzardi, Renata S.S., Barcellos, Monalessa P. and Falbo, Ricardo. 2011. Foundational ontologies, conceptual modeling and semantic interoperability. Proceedings of the Iberoamerican meeting on ontological research. Available http://iaoa.org/isc2012/docs/Guarino2005_Focusing_on_content.pdf.
Hjørland, Birger. 2002. Domain analysis in information science: eleven approaches traditional as well as innovative. Journal of documentation 58: 422-62.
Hjørland, Birger. 2003. Fundamentals of knowledge organization. Knowledge organization 30: 87-111.
Hjørland, Birger. 2004. Arguments for philosophical realism in library and information science. Library trends 52 no. 3: 488-506.
Jie, Xie, Fei, Liu and Sheng-Uei, Guan. 2011. Tree structure based ontology integration. Journal of information science 37: 594-613.
Lancaster, Frederick W. 1986. Vocabulary control for information retrieval. 2nd ed. Arlington, VA: Information Resources Press.
Latour, Bruno. 1997. Ciência em ação: como seguir cientistas e engenheiros sociedade afora. São Paulo: Editora Unesp.
Mendes, Pablo N. 2005. Uma abordagem para construção e uso no suporte à integração e análise de dados genômicos. M.A. thesis. Federal University of Rio de Janeiro.
Neville, Hugh Henry. 1972. Thesaurus reconciliation. Aslib proceedings 24: 620-6.
Noy, Natasha F. and Musen, Mark A. 2003. The PROMPT suite: interactive tools for ontology merging and mapping. International journal of human-computer studies 59: 983-1024.
Open Biomedical Ontologies. 2009. Available at: http://www.obofoundry.org.
Pinto, Helena Sofia and Martins, João P. 2001. A methodology for ontology integration. K-CAP '01 proceedings of the 1st international conference on knowledge capture: 131-8.
Ranganathan, Shiyali R. 1967. Prolegomena to library classification. New York: Asia Publishing House.
Reynaud, Chantal and Safar, Brigitte. 2007. Exploiting WordNet as background knowledge. In International ISWC'07 ontology matching (OM-07) workshop, Busan, Korea.
Smith, Barry. 2005. The logic of biological classification and the foundations of biomedical ontology. In Logic, methodology and philosophy of science. Proceedings of the 12th international conference, London, pp. 505-20.
Soergel, Dagobert. 1982. Compatibility of vocabularies. In Proceedings of conference on conceptual and terminological analysis in the social sciences. Bielefeld, Frankfurt, pp. 209-23.
Soergel, Dagobert. 1997. Multilingual thesauri and ontologies in cross-language retrieval. In AAAI Spring Symposium on Cross-Language Text and Speech Retrieval, Stanford University, March 24-26, 1997. Available http://www.dsoergel.com/cv/B60.pdf.
Wagner, Glauber. 2006. Geração e análise comparativa de seqüências genômicas de Trypanosoma rangeli. M.A. thesis. Instituto Oswaldo Cruz.
Yamamoto, Satoko, Asanuma, Takao, Takagi, Toshihisa and Fukuda, Ken I. 2004. The molecule role ontology: an ontology for annotation of signal transduction pathway molecules in the scientific literature. Comparative and functional genomics 5: 528-36.
Zeng, Marcia L. and Chan, Lois M. 2004. Trends and issues in establishing interoperability among knowledge organization systems. Journal of the American Society for Information Science and Technology 55: 377-95.
Knowl. Org. 40(2013)No.1 Book Review
Book Review
Edited by Joseph T. Tennis
Book Review Editor
Library Classification Trends in the 21st Century, by Rajendra Kumbhar.
Publisher: Chandos Publishing (Oxford) Ltd. Place of publication: Witney, UK. Publication year: 2012. Number of pages: 172 pp. ISBN: 1843346605, 9781843346609

The intent of this book is "to trace the developmental trends in classification as reflected in the library and information science literature published in the last decade, i.e., the first decade of the 21st century" (p. ix). The method used was to search ten years (1999-2009) of the Library and Information Science Abstracts (LISA) using the search term "classification," and then to refine the resulting set to the abstracts that dealt specifically with classification (omitting book reviews and other publications that the author judged not to be on the topic). These were reviewed and organized into ten chapters, each dealing with a different aspect of classification, from KO systems, to classification uses, to classification schemes, education and modern trends. Within each chapter the articles are organized into subthemes, and the literature is summarized in the style of a descriptive annotated bibliography. As far as I could tell, the content of the chapters and subdivisions is determined exclusively by what was covered in LISA, and no more. The author offers very little in terms of reflective synthesis or commentary that does not come directly from the literature being reviewed. In this sense, then, the book is uneven in its coverage and thin in its cohesion as a discussion of trends, except to note that something was published on a given topic. Annotations are strung together without a general framework in the form of an evaluative (rather than merely descriptive) introduction to each chapter or section. Nevertheless, despite the uneven coverage and lack of editorial voice, the book can serve as a starting point for a deeper discussion of what forms the actual intellectual movement in the field.
A follow-up work could put the pieces together to create rhetorical arguments for describing evolving thought, change in emphasis in practice and research, novelty, influences both internally and in cognate areas, or, most telling, debate in the field. These are missing in the present work, but begging for development. Many of the components are here, but unfortunately they are put together without a conceptual shell to explain the trends. In reading the specific entries I was struck by how difficult it was to create such an interpretation for oneself. There is a bibliography of the abstracts covered, but these are not indexed to the text. There is no index of authors, so it is impossible to discover the various contributions of individuals to developments. In other words, it is difficult to get a bigger picture by tracing the network of publications and the connections among them. This is a shame because there are parts of the book that are quite rich in detail, the sections on text categorization and classification schemes being two of them, and yet there is no graceful way to connect them to anything else in other sections. A big part of the problem arises from the sparse and limited criteria for establishing the data set in the first place (only what was covered by LISA, only what was mentioned in the abstracts and only searches on the term "classification"). While a substantial number of abstracts were used as the base, there are very large gaps. For example, there is a brief mention of work on genre, but none of the several HICSS (Hawaii International Conference on Systems Science) sessions is included. These are evidently not covered by LISA, but in fact form an interesting and strong trend in which the principles of classification influence many other areas of scholarly endeavor outside the field of library and information science. Similar filtering of classification into other fields has been occurring in computational linguistics, retrieval systems, and personal information management.
All of these are touched on lightly in the book, but without any explicit discussion of how the trend lines are crossing, which way the trends are moving, or how the various fields are benefitting from cross-pollination. Closer to home, the author devotes only about a page (138-139) to conferences. Tracking conferences is, in principle, an excellent way of tracking trends because not only do
the papers and panels reflect the latest thinking, the themes of the conferences themselves form a sort of timeline of evolving interest and focus, as well as a peek into what is trending (to use some popular jargon) into the future. Perhaps if LISA and conferences had been combined for a decade's worth of literature, the picture would have been more complete. As it is, the author omits mention of several of the ISKO conferences that occur biennially, the work emanating from the many ISKO chapters around the world that also have conferences, seminars and workshops, and the American Society for Information Science and Technology Special Interest Group on Classification Research (SIG/CR) that has been specializing for over two decades in convening both researchers and practitioners in an annual workshop specifically focused on trends (e.g., social networking, museums, and so on). I would encourage the author to pursue the review already started in this volume and push the work in three ways: (1) Expand the base of literature beyond LISA to other literatures that are more reflective of the broadening reach of classification work. Keeping it parochial produces a local picture that does not do the field justice. (2) Include what is published outside of the traditional literature, such as conferences, workshops, institutes and the private sector. If, as the author rightly asserts, classification is at the core of much important knowledge work, then the boundaries of a review must be expanded accordingly. (3) Provide a much needed overview of the connections between the various contributions to our impressive body of work as suggested in the pages of this book. These could be in the form of a semantic map, a citation network, or a description of the relationships among working practitioners, researchers and applications developers. At the very least a way of tracing authors and works should be provided so that readers can develop their own pathways and connections. By way of example, the call for papers for the ISKO UK Biennial Conference for next year summarizes some current trends in the role of classification. These all seem to be towards pushing the boundaries, and include the boundaries between research and practice and how to achieve better synergy; between knowledge management professionals and IT professionals; or between one scientific discipline and another (with new knowledge taking shape at the boundary). What are the trends that are captured in the ten years of LISA-indexed literature? It's difficult to tell from the volume as published, but an interesting narrative is just waiting to be told.

Reference
ISKO UK Conference 2013: Call for Papers. Available: http://www.iskouk.org/conf2013/index.htm
Barbara H. Kwasnik, School of Information Studies, Syracuse University, Syracuse, NY 13244 USA, bkwasnik@syr.edu
Classification Issues
Paradigms and Conceptual Systems in Knowledge Organization, the Eleventh International ISKO Conference, Rome, 2010
Nancy J. Williamson
Faculty of Information, University of Toronto, 140 St. George Street, Toronto M5S 3G6 Ontario Canada <william@fis.utoronto.ca>
The eleventh International ISKO Conference on Paradigms and Conceptual Systems in Knowledge Organization was held in Rome, February 23-26, 2010. The proceedings were edited by Claudio Gnoli and Fulvio Mazzocchi and published by Ergon Verlag in 2010. This analysis follows the order of the text of the proceedings, an order prescribed by the abridged scheme for KO literature published in Knowledge Organization, 25, 1998, no. 4, p. 226. Some invited papers, marked with [LR], have been included and are labelled as such in the table of contents. In all, 64 papers were published. The keynote address, "Organizing and disseminating knowledge: theoretical and instrumental innovations of Paul Otlet," was presented by Boyd Rayward (United States and Australia). In his presentation he recognized the present-day efforts to understand the relationship between knowledge organization and knowledge management. In doing so, he referred back to Paul Otlet, one of the founders of knowledge organization, and traced the development of Otlet's ideas from bibliography through his enlarged concept of document and documentation to his attempts to envisage new organizational forms for knowledge management. There is a brief abstract of the paper in the proceedings. The first group of two papers deals with Order and Knowledge Organization. Thomas Dousa (United States) deals with "The simple and the complex in E.C. Richardson's theory of classification." The paper focuses on an early KO model of the relationship between ontology and epistemology. He discusses the topic in general and then turns to Richardson's work. There he discusses Richardson's classification as an ontological order from the simple to the complex and then as an epistemological order from the complex to the simple. He concludes with the nature and limits of the application of Richardson's epistemology. In the end he states that what may have been suitable in Richardson's day would fail to do justice to the variety of perspectives that one encounters in the multicultural world of today. In the second paper Hope Olson (United States) tackled "Hegel's epistemograph, classification, and Spivak's postcolonial reason." In this context, she describes teleology as a major characteristic of classification and uses Hegel's three stages of development (being, essence and idea), as set out in his Science of Logic, as an example of the codified progression that Gayatri Spivak refers to as an epistemograph (a graduated diagram of the coming to being of knowledge), and applies it to bibliographic classification. She then looks at order in terms of DDC and UDC. Hegel's three stages are discussed, followed by Spivak's approach, leading into a discussion of globalization as exemplified in DDC and UDC. Olson concludes that Hegel provides a rationale for the sequence of main classes but it is now more of a convention than a teleological progression. Nevertheless, bibliographic classification, at least DDC and UDC, reflects and reinforces the mainstream epistemograph even though its meaning and significance are obscured.
Four papers were presented on the topic of Conceptology in Knowledge Organization. Alfred Gerstenkorn (Germany) discussed "Entities and quiddities." He is concerned with the binarity of concepts and their connection with two kinds of conceptualization: the ontological and epistemological points of view. He introduces the topic of binarity, and compares some concept models. Finally, he proposes a chosen concept model. In his conclusion he states that the binary conceptualization seems to be an operable transdisciplinary approach which facilitates the communication of KO, especially concerning paradigms and conceptual systems. Further application needs to be done before its usefulness can be proven. Birger Hjørland (Denmark) addressed "Concepts, paradigms and knowledge organization." He begins with the generally accepted agreement that concepts should be the building blocks of knowledge organization systems. This view is held by Dahlberg and many others but some researchers disagree. Hjørland sets out his understanding of concept theory with its four ideals: empiricism, rationalism, historicism and pragmatism. With this basis he proceeds with a discussion of the criticism of concepts as units. In particular he examines the suggestion that ontologies should not be based on concepts, but rather on universals and particulars which exist in reality and are captured in scientific laws, and on the idea that a concept is a sign. The nature of the topic is very complex. In his conclusion Hjørland states that it would seem very problematic not to inform users about different opinions at play. To accept concepts as units in KO by implication means to accept the theory-laden nature of KO and recognize that specific KOSs are supporting specific views about the knowledge being organized. Agnes Hajdu Barat (Hungary) looked at conceptology in "From paradigms of cognition and perception to phenomenon."
Her paper explores the possibilities of perception and cognition in the field of knowledge organization from an epistemological point of view. She attempts to reveal some examples of new elements in the theory and practice of knowledge organization and to emphasize the necessary connection to human perception, phenomena and content dimensions. In doing so, she studies the epistemological questions, summarizes the knowledge from different sciences according to perception, phenomena and influences, and makes a case for cognitivism in knowledge organization systems. She concludes that there needs to be a comprehensive rethinking of knowledge and information along new and completely different lines. In the final
paper in this group, Charles van den Heuvel (Netherlands) and Richard Smiraglia (United States) discussed "Concepts as particles: metaphors for the universe of knowledge." They use the metaphor of the particle to accumulate the components of a theory of knowledge that underlies the science of knowledge organization. They outline the concepts of the universe, identify the central role of the concepts and the intertwining roles of works, instantiations and documents. This research takes a different approach in that it demonstrates a semantics that is based on structure and related forces between components rather than on content. It permits the development of mechanisms for linking related entities with so far undiscovered similarities. The paper is an attempt to outline the central role of concepts in the knowledge universe, and the intertwining roles of works, instantiations and documents. They begin with an analysis of Paul Otlet's views of the universe of knowledge through his work in the development of UDC and follow with some later efforts on the subject. Finally, they describe their own explorations. Eventually, they hope to demonstrate a semantics that is based on the structure of knowledge rather than on the content of documents. The section dealing with concepts in general is followed by three sets of papers dealing with concepts in particular subject disciplines. On Mathematics in Knowledge Organization there was one paper, by Ágota Fóris (Hungary), who examined "Change of paradigm in terminology: new method in knowledge organization." The goal of the author is to describe the application to terminology of a scale-free network model used in the fields of natural sciences and information technology. In the research, the author used network theory as the basis and examined and interpreted the historic development of knowledge storage and knowledge transfer. The general laws of network theory are discussed, as are the properties of a scale-free network.
This leads into a description of language networks and the model of the terminological network created in the research. The author has produced a model in which each of the three aspects of the terminological approach (the cognitive, linguistic and communication components) forms a scale-free network; by combining these three networks it is possible to model the process of communication. Two papers address Psychology and Knowledge Organization. A paper by José Antonio Frías Montoya (Spain) entitled "Postmodernism, constructivism and knowledge organization: applications of repertory grid to knowledge construction and representation" was
not available at the time of printing. The second paper under this topic was "Perception, knowledge organization and noetic affective social tagging" by Richard Smiraglia (United States). The author points out that studying perception and its role in the identification of concepts is critical for the advancement of KO. The purpose of the research is to advance our understanding of the role of perception in knowledge organization systems. Our present understanding of perception in KO is explained, as is its place in social tagging. Noesis, which is rooted in the ego, is defined and noetic tagging is presented as a methodology. At the broader disciplinary level, under Science and Knowledge Organization, four papers were presented. Rick Szostak (Canada) examined "Universal and domain specific classifications from an interdisciplinary perspective." His starting point is an exchange of views on the nature of classification between the author and Birger Hjørland which was published in the Journal of Documentation in 2008. Specifically there were two views: 1) the view that urges the development of a superior universal classification, and 2) the view that concepts are ambiguous and it is best to classify documents only broadly in domains. These opposing views have long been known. It is possible that these views could be complementary rather than substitutes. This paper examines the possibility of their complementarity. The author first sets out the key arguments regarding both the feasibility and the desirability of pursuing the two types of classification. The second section looks at the strengths and weaknesses of the two types. Then the author reviews theoretical and practical reasons for the two types to be complementary. Finally, he discusses the ways and means by which the two methods might be set up to be used in a complementary fashion. He closes with a list of five questions that might be used in setting up such a system.
In the second paper, Thomas Dousa (United States) asks "Whither pragmatism in knowledge organization?" Specifically the author is concerned with classical pragmatism (cp) and neopragmatism (np) as KO metatheories. Up to now, the two theories have appeared virtually indistinguishable in philosophical terms. His research appears to have found that this is still virtually true. In the third paper Joliza Chagas Fernandes and Nair Yumiko Kobashi (Brazil) considered "The complexity challenge: a contribution to the epistemological reflection regarding information science." For purposes of a better understanding, this paper reflects on the epistemology of information science. The authors first define epistemology and science, then identify the elements of the
reflection and report on the expected results. In the final presentation of this section María López-Huertas and María José López-Pérez (Spain) examined the "Epistemological dynamics in scientific domains and their influence in knowledge organization." They begin with the premise that socio-cultural context can affect the theoretical and epistemological development of a scientific domain. In doing so, it may affect not only a robust, consistent theoretical framework but also good practice. Knowledge organization systems should be concerned with this and avoid creating epistemological biases. This paper takes two domains, psychiatry and information science, to demonstrate this situation. In their conclusion the authors state that LIS and KO should look for research methodologies capable of producing integrated knowledge; KO and KOS designers should explore in a reflexive way the impact of external circumstances on the knowledge domain that will be represented and organized, trying to identify the bias that such a domain might have. Three papers discussed Problems in Knowledge Organization. Sergey Zherebchevsky (United States) addresses "Formalism in knowledge organization" using a thematic analysis of the ISKO proceedings. The purpose of the study was to improve theoretical comprehension of the domain of KO by investigating the presence of formalism and the amount of attention it receives in the field of KO as compared with the attention and efforts invested in addressing theoretical concerns. Two sets of six terms taken from an article by Elaine Svenonius in 2000 (citation not given) were used in the experiment. One set of terms was identified as Intellectual/Theoretical terms and the other as Methodological/Formalistic terms. The ISKO 10 proceedings were then used to determine how many terms of each type were used. More of the terms (61.56%) of the intellectual/theoretical type were retrieved than the methodological/formalistic type (34.44%).
The finer details of the results are discussed in full. The author suggests that a similar test should be tried on other KO data and that a comparison could be made with data in other disciplines, e.g. cognitive psychology. Luciano de Souza Gracioso (Brazil) presented a paper on the "Pragmatic approach to virtual information action from Wittgenstein." The author addresses the fact that information science is concerned with human actions while today a large part of information action is configured in the virtual technological plane. He begins with a description of the repositioning of the subject and the use of language in a virtual field, followed by a discussion of the philosophy of language and the proposals of Wittgenstein. While the investigation did not lead to the use of the proposal as an application in information science, it has provided direction for further research. In the third and final paper in this category, Fidelia Ibekwe-SanJuan and Erik SanJuan (France) examined the output of "Knowledge organization research in last two decades: 1988-2008." The authors apply an automatic system to records of publication in knowledge organization over two decades. The data came from journal articles in the KO field available from the Web of Science database (WoS). Motivation for the study and data collection are described. Methodology for term extraction and term clustering is given. Previous research on this topic was examined. The trends identified in the study were located automatically without human effort and the authors plan further, more intensive research. Cluster maps are included with the article. The topic Knowledge Organization Systems: General Questions includes seven papers focusing on a number of subtopics, including faceting, categories, ontologies, integration and genres. Renato Sousa (Brazil), Douglas Tudhope (Wales) and Mauricio Barcellos Almeida (Brazil) gave a presentation entitled "The KOS spectra: a tentative typology of knowledge organization systems." The purpose of the paper is to discuss why and how KOSs should be tentatively classified on a new basis, aiming to shed some light on the discussion. Representation and knowledge organization systems are discussed, and a taxonomy is drawn up and illustrated in a diagram of KOSs by type. The question "what to evaluate?" is addressed and the evaluation dimensions, or spectra, are illustrated. As a result of the research, the authors found that we are far from having a consensus on KOS taxonomies and the related terminology.
This paper is seen as a step in the evolving discussion, presenting some of the most important aspects to be taken into account when evaluating and choosing a specific KOS. Further research will publish a full model as a high level ontology. The next four papers, by Gnoli, Scognamiglio, Poli and Kameas, were invited and originally given at a workshop on levels of reality as a KO paradigm. The paper presented by Claudio Gnoli (Italy) was entitled "Levels, types, facets: three structural principles for KO." The three major principles involved in the structure of knowledge organization are identified and discussed. Then they are considered as to how they interact with each other, working as substructures that determine the macro-structure of a KOS. In what order should they be combined (e.g., types, then levels; facets, then types; levels, then facets; etc.)? Six possible orders for interaction are examined. It was found that types is the most classical principle while facets, studied and applied later, can be seen as another classical principle. Levels were acknowledged even later, by studies of the CRG. Levels interact as an implicit principle and their interaction with the other principles is believed to be a useful contribution to the development of a more complete theory of KO. Carlo Scognamiglio (Italy) prepared a paper entitled "Strata and top categories for an ontologically oriented classification." It addressed communication between ontological research and knowledge organization. This paper has as its theme the "something" (the matter) of classification. It begins with a discussion of the debate on the alternative between the classification of documents and the classification of entities. The author shows how every kind of rigorous classification must be supported by an ontology. An example of the ontological approach is a theory of levels of reality. In turn, the author states that a theory of levels needs a set of general categories common to all levels. In this paper the exploration of top categories uses a combination of the critical ontology elaborated by Nicolai Hartmann and some notions from General Systems Theory. The paper by Robert Poli offered "Domain theory: a preliminary proposal." The author states that domain theory can be built both on a theory of levels of reality and on the theory of wholes. To test this approach he uses biology as a test domain. The domain is analyzed and core entities and facets are identified. Four types of domains were located. Two of the types (domain in general and sub-domain or facet frames) are analyzed. The other two types will be described in later papers. In conclusion, he advances the hypothesis that the structure of the top levels is different for each of the four types of domain.
This reinforces the importance of distinguishing the various domain types. A fourth invited paper, entitled "Ontologies in adaptive systems supporting every human activity," by Achilles Kameas (Greece), was not available at the time of printing. The last paper under this topic is "An integrative approach to the design of knowledge organization systems," presented by Melanie Feinberg (United States). It presents a design process that negotiates between the communicative goals of an author, the information needs of an audience, and the structure of existing subject literature. Her design process is a theoretical model of the six primary activities that take place in the development of a research project, as follows: 1) Envisioning: persona and scenario development, or the identification of the target audience for the product of the research; 2) Strategizing: making a plan to achieve the nascent vision, or a scenario for identifying the research problem, presented in a brief; 3) Learning: a process in which the designer surveys the subject literature and compiles a source book of concepts to use as raw material in the strategy, and in which changes in strategy may take place; 4) Sketching: a process in which categories are sketched out, definitions developed and potential hierarchical and associative relationships created; 5) Revisiting, refining and reflecting until the interconnection of the documents which represent the user experience occurs; and 6) Finally, analysis and critique in preparation for further implementation. She concludes with a diagram showing the interrelationships of the six steps and how the design process is neither linear nor circular: while all activities are independent, they may be occurring simultaneously, and knowledge gained in one activity may necessitate revisiting another. In the final paper of this section Amelia Abreu (United States) presented a paper entitled "Medium cool: Genres, attitudes and affect." "Cool" is used as a metaphor to examine the role that affect and social context play in the design and use of knowledge organization systems. The author provides a theoretical basis for her study, and considers how knowledge organization might produce affect and how it may be used for further study. Finally, she considers the value "cool" may have as a commodity in the larger economy of information. Another popular area of interest was Knowledge Organization Systems Structure and Elements, Facet Analysis. Five papers were presented in this category. Uta Priss and John Old (Scotland) gave a paper on "Concept neighbourhoods in knowledge organization systems." They began by considering previous research on the topic.
A brief overview describes the FCA technologies for concept neighbourhoods and neighbourhood lattices and their application using the electronic version of Roget's Thesaurus. It was logical that the application might also be used with other lexical databases such as WordNet and Wikipedia. The paper describes further research using the application and the word "sleep" from Roget to illustrate its use with all three databases. An on-line interface for exploring Roget's Thesaurus in this manner is available at www.roget.org. A similar interface for WordNet will be available at the same site in the near future. An interface for Wikipedia is more difficult because of its size. Rebecca Green and Michael Panzer (United States) discussed "The ontological character of classes in the Dewey Decimal Classification." The ontological relationships among topics and classes in DDC are examined through a case study. The paper begins with a description of the nature of DDC classes as neighborhoods. This identifies implications for the ontological representation of the system and for operations on that representation. The association of topics with classes is examined and the development of neighborhoods is explored. The description of a neighborhood was achieved by expressing class-topic relationships as OWL class axioms. In concluding the experiment, the authors state that there is conflict with the possibility of a DDC ontology, but they believe that the construction of an ontological model that reuses a certain level of knowledge in the DDC is feasible. In a paper entitled "Semantic interoperability and retrieval paradigms," Felix Boteram, Winfried Gödert and Jessica Hubrich (Germany) present a new approach to understanding how indexing strategies, models for interoperability and retrieval paradigms interact and how this interaction can be used to support semantic navigation in information retrieval systems. A comprehensive interoperability model is presented. Clarification of characteristics and structural qualities is required to implement semantic interoperability on various levels. Some of these levels are introduced and discussed. The concept theatre is illustrated and discussed. In the fourth paper in this group, entitled "Finding Bliss on the Web," Vanda Broughton (United Kingdom) used the Bliss Bibliographic Classification, second edition (BC2) to address some problems of representing faceted terminologies in digital environments. Bliss, the only example of a fully faceted general classification, is maintained and managed in electronic format but the format is not suitable for use in a public interface. What are the problems of achieving such a format?
Broughton describes the coding used in BC2, some work done in converting the schedules to a thesaurus format and problems found in vocabulary control. Some degree of manual editing is required and the system cannot yet deal with many of the associative terms. Some of the coding added manually should be incorporated into the mark-up language. The role of encoding is described and various encoding systems such as EXML (Extensible faceted mark-up language), SKOS (Simple Knowledge Organization System) and TEI (Text Encoding Initiative) are considered for representation of the concepts, their functional roles and the relationships between them. Integrating these aspects in a coherent and interchangeable manner appears to be achievable but the most
appropriate system is as yet unclear. In the final paper in the session, entitled "Facets, search and discovery in next generation catalogs: informing the future by revisiting past understanding," Kathryn La Barre (United States) describes a project entitled "Folktales, Facets and FRBR" funded by a grant from OCLC/ALISE. The paper begins with a description of the North American encounters with Ranganathan and faceted classification theory through the years 1950 to 1969. This is followed by the contemporary understanding of facets in North American applications. The paper then examines faceted search and navigation in six next generation catalogues which provide such interfaces as guided discovery, mobile browsing and social networking (search refinements which have been called facets). A sample of 200 catalogues was selected and the search term "folktales" was used. A variety of types of facets were located (e.g. subject/topic, format, location, call number, genre, etc.). Faculty members were interviewed about their searches and six categories of information tasks were identified: exploring, creating, synthesizing, studying, collecting, and searching. Additional facets were found in the interviews. Some, but not all, of the facets were found in existing bibliographic records. Greater understanding of user information tasks is needed. Facet analysis should precede system design. This paper highlights the ways in which the heritage of facets and facet analysis may continue to inform research and development and spark a dialogue between system designers and facet theorists, thereby enhancing future access and discovery systems. In the session on Knowledge Organization Systems Construction three papers were presented. Two of the papers deal with specialized subject areas: gold and neurosurgery. Elena Cardillo, et al. (Italy) presented a paper entitled "GoldThes: a faceted thesaurus for goldsmith handcraftsmanship in a regional context."
The authors present the construction of a very specialized thesaurus for a very precise subject area. Their intention is to demonstrate how classifying and organizing information into multi-dimensional hierarchies makes it more accessible than using a single taxonomy, that is, a unique hierarchical dimension. The paper first sets up the background, citing AAT (Art & Architecture Thesaurus) as an example of the type of thesaurus required for their own particular approach. The project is carried out in two macrophases: the creation of a knowledge base and the construction of the thesaurus. Construction follows ISO 2788 (1986) and ANSI/NISO Z39.19 (2005).
The facets were chosen from those defined by the Classification Research Group. The second paper, by K.S. Raghavan and Chathoth Sajana (India), was entitled "NeurOn: modeling ontology for neurosurgery." The authors report on the initial results of an ongoing experiment in building an ontology using concepts extracted from the patient records in a large hospital. The process of building the ontology is described, including the nature of the domain concepts and the creation of a small query library to be used in defining the classes as well as properties to be included in the ontology. The classes and subclasses are described, as is the terminology. The authors hope that the final product will provide a usable decision support system for health care personnel. A paper on "Development of thesaurus structure through a work-task oriented methodology" was presented by Asam Sanatjoo (Iran). It describes an empirical study which investigated a mixed set of methods and developed a prototype thesaurus to evaluate the potential of a work-task oriented methodology (WOM) for constructing a more enriched thesaurus. The result was evaluated for usability and performance against a conventional thesaurus (specifically Agrovoc). The methodology and construction of the thesaurus are outlined. The research design and findings are described. From the research results it was concluded that an enriched work-task oriented thesaurus inspires information searchers by offering enhanced conceptual content in contrast to the classic thesaurus in the traditional format. However, the author states that the method cannot be a stand-alone one and must be combined with other construction methods. The challenge is to combine the methods in harmony with the context and purpose of the thesaurus in such a way that the advantages of each method are exploited optimally. Under the topic Knowledge Organization Systems Maintenance, Updating and Storage one paper was presented.
Joseph Tennis (United States) spoke on "Measured time: imposing a temporal metric to classificatory structures." For purposes of understanding and evaluating classificatory structures, he divides time into three units: long time (versions and states of classification schemes); short time (the act of indexing as repeated ritual or form); and micro-time (where stages of the interpretation process of indexing are separated out and inventoried). As professional practice, the act of classification has inherited the assembly-line work ethic of early twentieth century scientific management. This suggests assigning a work to an appropriate location, and assigning it only once; that is, in time, an act of permanence. However, as time goes by new subjects appear and new documents are added, some of which do not fit and require a new location. Thus the scheme has to be revised and permanence is called into question. Given this state of impermanence, the paper takes as its purpose the identification and characterization of the impermanence (temporal metric) of classmarks in schemes. Each of the units is discussed. The author concludes that the assembly-line approach is not art because it removes the authorial or artist's presence. Yet it is still a major component of the institutions of long-term classification schemes. Under the rubric of "Compatibility, Concordance, Interoperability between Indexing Languages" two papers were presented. Barbara Kwasnik and Mary Grace Flaherty (United States) discussed "Harmonizing professional and non-professional classifications for enhanced knowledge representation." They compared two separate but related classification schemes in the area of medical information in order to understand how they might be used together and support each other. Used in the experiment were Medical Subject Headings (MeSH) and a naïve scheme used by the consumer health website WebMD.com. The term "autism" was used to compare the strengths and weaknesses from the perspective of vocabulary, syntax, classificatory structure, context and warrant. The paper describes previous work, the method used and the results. In the conclusion, the authors recognize the many differences between the two and make some suggestions as to how they might be used to support each other, including using MeSH to informally update and keep things current, and harvesting current usage and concepts from the site and giving them temporary or pending status to bridge the gap between the scientific and lay perspectives. They have provided some first steps in developing guidelines for the mapping of disparate classifications.
A presentation by Jan-Helge Jacobs, Tina Mengel and Katrin Müller (Germany) addressed "Benefits of the CrissCross project for conceptual interoperability and retrieval." The paper discusses the goals, methods and benefits of the conceptual mapping approach of this particular project, in which the topical headings of the German subject headings authority file (SWD) are being mapped to notations from the Dewey Decimal Classification. The purpose of the project is to create crosswalks between the two systems to improve retrieval processes, at the same time ensuring the continued use of already existing indexing data. The two systems are briefly described and the methodology set out. To date, three applications of the
system have been worked out: enhancing access to DDC class and DDC-indexed documents, structuring document sets, and conceptual exploration. These are described in the paper and examples given. In the category "Theory of Classing and Indexing," two papers were given but only one was available at the time of printing. Carlos Alberto Corrêa and Nair Yumiko Kobashi (Brazil) described "A hybrid model of automatic indexing based on paraconsistent logic." Methods of automatic indexing are based on different theoretical assumptions. This paper argues for the theoretical potential of hybrid models of automatic indexing, specifically those using paraconsistent logic, a non-classical logic with the capacity to handle situations that involve uncertainty, imprecision and vagueness. The type of system is described. A number of authors made reference to special subject taxonomies in geography, pathology and practical medicine, and psychology, in papers focused on other topics. One author focused specifically on "Taxonomies in Communications Engineering." Michiko Tanaka (United States) presented a paper entitled "Domain analysis of computational science: fifty years of a scientific computing group." Bibliometric and historical methods were used to study the domain of the Scientific Computing Group at the Brookhaven National Laboratory over a period of fifty years, from 1958 to 2007. The methodology and data analysis are described and statistics included. In the results, the author noted the growing emergence of interdisciplinarity and identified a strong and consistent mathematics and physics orientation within the group. Similarly, there were a number of presentations that briefly referred to knowledge organization systems from special subject areas (e.g., biology, agriculture and horticulture, food science and technology, sociology, social aid and social politics, and general economics) while the papers focused mainly on broader topics.
Three papers focused specifically on "Special Knowledge Organization Systems in Literature." Two were printed in the proceedings. Pauline Rafferty (Wales) described "Genre theory, knowledge organisation and fiction." The author is concerned with the epistemological assumptions underpinning fiction categorisation, explores current genre theory and argues for an approach to the understanding of genre, and ultimately the description of genre, that is based on a cultural-materialist, historical world-view. There are sections on access to fiction in historical terms, and on genre theory and mapping generic history. Finally she proposes a fiction retrieval tool in the form of a user-based website. Francisco-Javier García-Marco (Spain), João Batista Ernesto de Moraes (Brazil), et al. described "Knowledge organization of fiction and narrative documents." They deal with this challenge in the age of multimedia. The paper examines the key facets for knowledge organization in the field of fiction, building on literature theory and faceted classification theory. The research focuses on the integration of two fields: library and information science, and research in literature theory. Subtopics addressed include the nature of narrative and fictional narrative, specificity in fiction and the theory of literature and subject indexing, fictional documents, levels of complexity and intertextuality, and the problem of canonical order. A model is proposed, recognizing that there is no single classificatory approach to fiction. In their conclusion, the authors recognize that much work still needs to be done in clarifying the big levels of analysis proposed in the paper. There was also one paper on "Special Knowledge Organization System in Cultural Sciences." Carol Tilley and Kathryn La Barre (United States) proposed "New models from old tools: leveraging an understanding of information tasks and subject domain to support enhanced discovery and access to folktales." Their paper provides an introduction to an ongoing research project, the purpose of which is to provide users with a more effective and efficient method of discovering and accessing folktales. In general, the research combines task analysis with facet analysis, and plans are to develop an enhanced bibliographic record type. This paper describes the first phase of the project, specifically the information tasks to be used, the information seeking obstacles and the desired features of the project. It also includes some of the bibliographic, cultural, and intellectual facets derived from a sample of folktale resources. Finally, it proposes a model for enhanced bibliographic records.
The methods and findings are described and the model is illustrated. The very broad category entitled "General Problems of Natural Language, Derived Indexing, Tagging" was popular among participants. Five papers were given in this category. Marianne Lykke (Denmark), Susan Price and Lois Delcambre (United States) explained "Using semantic components to represent and search domain-specific documents: an evaluation of indexing accuracy and consistency." The authors developed a semantic component model to supplement the existing representations of documents. They then conducted a comparative indexing study using a national health portal to assess the feasibility of semantic component indexing. Findings suggest that accuracy and consistency might be higher for semantic component (SC) indexing than for conventional indexing. Additional study is needed. Future analysis will evaluate the nature and number of indexing facets. Jungran Park et al. (United States) described "Locally added homegrown metadata semantics: issues and implications." The authors used data from a nationwide study of cataloguing and metadata professionals to assess the current state of metadata elements used in digital repositories. The homegrown elements included local notes and descriptions, local personal and place names and local subjects, as well as administrative, technical and preservation data that had been added locally to records. The additions are seen as examples of the perceived needs of local users. Currently there is no common data model for this kind of data in records. The aim of the study was to examine records to answer three questions: What homegrown elements are added? What were the criteria for adding such data? How are local metadata practices documented and shared? An overview of previous studies is given, the methodology used in the study is provided and conclusions are drawn. The results indicated that widespread use of homegrown metadata elements may present a challenge to the effective reuse and sharing of metadata in the networked environment. Further research using other methods and more varied data sources would provide a fuller picture. Maria Aparecida Moura and Juliana Assis (Brazil) investigated "Social networks, indexing languages and organization of knowledge: a semiotic approach." They conducted a theoretical discussion of semiotic categories and their application in information organization. This was followed by an experiment on the performance of the GEMET and EuroVoc thesauri, using the subject "sustainable development" and comparing them with folksonomies and classification systems. The result was a proposal for a semiotic approach to the design of indexing languages.
In the fourth paper Pertti Vakkari (Finland) described "How specific thesauri and a general thesaurus cover lay persons' vocabularies concerning health, nutrition and social services." The aim of the study was two-fold: 1) to compare the semantic structures in lay persons' questions addressed to an expert service in the areas of health, nutrition and social services; and 2) to determine to what extent lay persons' vocabularies are covered by a general thesaurus and a specific thesaurus in each of the three fields. Questions from each of the three areas were used for the tests. The results show that the overlap between the general controlled vocabulary and a specific one was most extensive in health (32%) and least extensive in social services (9%). It seems that in all fields tested there are limited links between the general and specific vocabularies from the point of view of users. In the case of nutrition and social services, the match was low and the need for enrichment from the specific tool is very great. In the final paper in this section Isto Huvila (Sweden) researched "Aesthetic judgements in folksonomies as criteria for organising knowledge." Using the Flickr photo-sharing service as an example, folksonomies were examined as a potential source of the collective judgements of a large group of people, with a special focus on everyday life aesthetics. Visual analysis of clusters of photographs was carried out using a system of tags. One presentation fell in the category of "Automatic Language Processing." Klaus Lepsky et al. (Germany) provided a paper on "Metadata improvement for image information retrieval." It discusses the goals and results of the research project Perseus-a. This project attempts to improve image retrieval by automatically connecting images with text-based descriptions. The project uses the image collection of Prometheus, a distributed digital image archive for research and studies. In order to connect the works with related texts, a matching process for images and texts had to be developed. Art historical terminological resources, classification data and an open source system for linguistic and statistical automatic indexing called lingo were used. It was concluded that, while the principal idea of the project was successfully demonstrated, much more research is needed on the underlying algorithms. Under the category "Online Retrieval Systems and Technologies" seven papers were given. Six were available for printing. Margherita Sini et al. (Italy) provided a paper on "Smart organization of agricultural knowledge: the example of the AGROVOC concept server and Agropedia."
The authors noted the importance of the use of the computer in disseminating information in the food and agriculture field. This paper analyses projects developed by two such organizations, aiming to make use of a concept-oriented approach in describing agricultural topics. The two projects are described with respect to their innovative aspects, their benefits and the technology used. The authors conclude that the work being undertaken by FAO and other AOS partners to make better use of traditional thesauri is in line with current strategies for making data more processable. Similarly, the Agropedia project opens the road to the representation of agricultural knowledge in the form of concept-based maps. It is noted that there is still much to be done. For example, for the AGROVOC Concept Server, investigations into the role of OWL 2 and OWL rules should be carried out, and the collaborative tool for maintaining the data pool should be completed. Further work is planned. A discussion by Currado Di Benedetto et al. (Italy) focused on a "Semantic approach to bioethics in the Ethicsweb project." Specifically, the authors describe building a semantic architecture for a European documentation system. The purpose of the paper is to present the activities of the European project referred to as Ethicsweb. The project has four general objectives: 1) to facilitate access to information on ethics in science using an integrated infrastructure; 2) the development of sophisticated tools, technical and semantic, to establish the infrastructure; 3) the creation of a European Reference Center for Bioethics; and finally 4) the development of multilingual tools (thesauri and ontologies) for searching documents in the bioethics field. The content of the paper focuses on the steps taken up to now. Maria Teresa Biagetti (Italy) discussed "Pertinence perspective and OPAC enhancement." A starting point for her paper is the ongoing debate on OPAC enhancement and the necessity of designing OPACs based on search engine features. The previous work done on OPAC enhancement is outlined, relevance/pertinence is defined, the semantic perspective is addressed and an improved model using traditional semantic indexing strategies is proposed. Anna Nosek et al. (Poland) reported on research on "Multidimensional analysis of the information structure of public libraries' websites in the Podlasie region (Poland)." It includes the results of quantitative surface research covering the contents of library websites and a detailed analysis of three subjects: information about literature, borderline knowledge and formal website quality assessment.
The introduction discusses the general nature of website use in small public libraries in Poland. The methodology and scope of the study are set out and the quantitative research results are outlined. The results of the study were both revealing and disappointing. They indicated that these small libraries are mostly connected only within the area in which they exist. There is little connection between the main regional library and the county libraries. Some of the libraries do not have websites of their own and there are few links between library websites. Thus an essential regional information network of libraries is practically non-existent. Many libraries see their websites as virtual bulletin boards giving practical information such as the library's address, telephone number, hours, etc.
The librarians often do not see the importance and benefits of websites as an information service. There is still a lot to be done in the development of a regional information service here. Elizabeth Milonas (United States) tackled "The use of facets in Web search engines." There were four web search engines in the study: two that utilized facets or facet terms, Exalead and Excite, respectively, and two that do not use facets, namely Google and AltaVista. The two faceted systems are described. Related research studies are identified and the methodology described. Participants were library and information science master's students (LIS) and PhD information studies students (IS), and the search terms were "social networks" and "lymphoma." The analysis looked at three characteristics: ease of the search process, search time, and confusion during the search process. The results provided three significant findings: 1) facets make the search process easier, whether searching for familiar or unfamiliar topics; 2) searching with facets takes longer for familiar topics than for unfamiliar topics; and 3) when searching for familiar topics, facets do not cause confusion for the searcher. Findings 1 and 3 are supported by the literature. Finding 2 is not. Some discrepancies were found. For example, IS students did find that facets made the search process easier but were confused when searching the term "social networks." LIS students found that facets made the search process easier and were not confused when searching the term "social networks." Marcia Lei Zeng et al. (United States) spoke on "Expressing classification schemes with OWL 2." In doing so they explored issues and opportunities based on experiments using OWL 2 for three classification schemes. The schemes used were the Dewey Decimal Classification, the Chinese Library Classification and the Library of Congress Classification.
The characteristics that OWL 2 and traditional classification have in common were identified. Most important, the issues in presenting various types of classes and their relationships were discussed; these included the following: centered entries, synthesis in classification schemes, class-topic relationships, alternative class location, presentation of auxiliary tables, presentation of index entries, presentation order/sequence of sibling classes, the internal structure of notes and the presentation of notation-building rules. The authors continue to explore these issues, as evidence is emerging that OWL 2 could be used to resolve some classification issues. Under the category of "Problems of Terminology" was one paper by Boyan Alexiev (Bulgaria) and Nancy
Marksbury (United States) entitled "Terminology as organized knowledge." It explores the possibilities of integration between knowledge organization and terminology by analyzing and comparing the basic theoretical and methodological premises of the two disciplines, with the idea of identifying threads that could be used to apply an interdisciplinary approach to a knowledge-oriented terminology. The theoretical and methodological nature of each discipline is analyzed. Commonalities were sought in three ways: semantic similarity in the terminology used, similarity in underpinnings and similarity in methodological approaches. In conclusion, the authors point out that in both KO and terminology there is a tendency to move toward a domain-specific approach. A final conclusion is that combined KO and terminology research methods would strengthen the collaborative links between specialists in the two fields, bringing about the development and improvement of their theoretical, methodological and practical achievements. In another single-paper topic, "Subject-Oriented Terminology Work," Peter Ohly discussed "Interrelations and dynamics in thematic networks: how to present bibliometric outcome?" In this presentation, network analyses of term and concept co-occurrences are examined to demonstrate their potential in combining both in one map. Alternative possibilities are discussed, and examples are taken from German literature. Using "elderly employees" as the example, a thematic network analysis was demonstrated, shaping both concept-specific words and broader concepts. Under the general category "General Problems of Applied Classing and Indexing, Catalogues, Guidelines" two papers were presented. Lynne Howarth (Canada) talked on the topic "Mapping the world of knowledge: cartograms and the diffusion of knowledge."
One issue in providing access to knowledge is the use of non-verbal representations such as notations, symbols or icons, or rich visual displays, including topical maps, to facilitate access to information; these warrant more attention. Here, Howarth uses Wordmapper as an example and examines cartograms, a derivative of the data map that adds dimensionality to the geographic positioning of information. This is one approach to representing and managing subject content and to tracking the diffusion of knowledge across place and time. The paper discusses applications of information visualization, mapping of data content and context, mapping the diffusion of knowledge and the use of cartograms to represent and manage subject content.
In this paper, cartograms emerge as key and opportunistic players in finding new ways of approaching knowledge. In the second paper in this section Athena Salaba (United States) looked at "Use and users of subject authority data." Here, the author reports on the findings of two surveys of subject authority data and its use by information professionals in the semantic Web environment and in libraries and information agencies. An introduction is provided on authority data and its changing and expanded use outside dedicated information retrieval systems. The two surveys are described: one on use by semantic web professionals and the other on use by information professionals. In the conclusion, the author points out that additional information beyond her paper would be provided in her presentation, including the implications of the findings and future directions for the research. In a category on "Classing and Indexing of Non-Book Materials (Images, Archives, Museums)" four presentations were given. Edward Ismael Murguia (Brazil) discussed "Collecting and knowledge organization: a theoretical approach from the material culture studies." Thiago Henrique Bragato Barros and João Batista Ernesto de Moraes (Brazil) addressed the topic "From archives to archival science: elements for a discursive construction." They are studying known concepts of archival science. The problem addressed is the identification and analysis of the discourse produced by archival science methodology from its key functions: description, organization, classification (current and intermediate archives) and arrangement (permanent archives). Two manuals were analyzed: the Manual of Dutch Archivists and Hilary Jenkinson's A manual of archive administration including the problems of war archives and archive making. The method used was discourse analysis.
In their conclusion the authors state that both manuals are fundamental to the construction of archival science as a discipline. However, though they seek a theoretical approach, like all discourses and productions, their concepts and their approaches are dated by their historical and social space. In the third paper in this category Natalia Bolfarini Tognoli and José Augusto Chaves Guimarães (Brazil) addressed the topic "Postmodern archival science and contemporary diplomatics in a search for new approaches for archival knowledge organization." New information technologies and new forms of document production have led archivists to rethink the role of archival science in the so-called information age. The authors have chosen to examine two trends with different approaches that
have emerged in North America and Europe. These are 1) the reformulation of the basic concepts and the functional analysis method focusing on the process and context of document creation, and 2) incorporation of all the theoretical and methodological models of classic diplomatics. The purpose of this study is to elucidate the connection points and distinct features between the two trends concerning the organization of archival knowledge. Each method is described in detail. In conclusion, both approaches have important insights to offer in understanding the record and both should be used as interrelated tools. The final paper in this category, by Hemalata Iyer and Abebe Rorissa (United States), is entitled "Representative images for browsing large image collections: a cognitive perspective." This paper addressed the issue of the choice of representative images within categories. A study of the free sorting of 50 images by 75 participants was conducted, in which they sorted the images into categories and selected a representative image for each category. They also indicated the prominent feature of each selected image. The authors found reasonable agreement in the choice of representative images and the identification of prominent features. In the final major category of these proceedings, "Persons and Institutions in Knowledge Organization, Cultural Warrant," there were three papers. Gloria Origgi and Judith Simon (France) wrote "On the epistemic value of reputation: the place of ratings and reputational tools in knowledge organization." The authors explored the epistemological relevance and value of reputation, understood as evaluative social information. They introduced a model of rational consensus and followed with an analysis of different reputational tools on the Web. The nature of the situation is described and a caution given about the dangers of using social information for epistemic purposes.
In conclusion, it is stated that a purely epistemological or cognitive analysis of using reputation for epistemic purposes will not suffice for KO. Nevertheless, reputational tools open up new possibilities for KO. Suellen Oliveira Milani and José Augusto Chaves Guimarães (Brazil) examined "Bias in the indexing languages: theoretical approaches about feminine issues." They take as their starting point the fact that the process of knowledge representation, as well as its procedures or tools and its products, is not neutral in value. Instead they imply moral values. In this context, they address the problems of bias in classification and thesauri. Starting from the reflections of earlier writers, they propose a preliminary categorization aimed at facilitating the identification of bias concerning feminine issues in indexing languages. Among other things, they offer suggestions to minimize the problems, in the form of use of the feminine form, insertion of notes and the use of gender qualifiers. In the final paper of the proceedings, Carel de Beer (South Africa) described "The troubadour of knowledge: a knowledge worker for the new knowledge age." The author provides a portrait of the new age and identifies new qualities or qualities
to be reinvented. The troubadour is described as the "instructed third," competent and very able to link the sciences (the "instructed first") with the humanities (the "instructed second"), while taking the third position him/herself, with the ability to move from one to the other and back again: a kind of traveller or voyager, hence the troubadour. Six characteristics of a troubadour emerge.
Letters to the Editor

Speaking Truth to Power in Classification: Response to Fox's Review of My Work; KO 39:4, 300

It is always a pleasure to see one's scholarship reviewed at length. And it is especially nice to see a review that shows how one has built an interconnected series of arguments over a series of publications, especially when these collected arguments support a novel approach to classification. There are, however, a few misunderstandings that should be corrected. And I think that these reflect broader issues of interest to the KO community. Fox notes at the outset that the books under review were aimed at a general scholarly audience. My research has become increasingly focused on information science since that time. Yet she misconstrues several of my remarks as if they were intended as advice on classification rather than advice on the performance of scholarly research in general. It would be absurd to suggest that we should not classify works that we thought extreme or substandard. The duty of information science is to make sure that there is a place for everything in our classifications. The quotes she cites concerning how we should be aware of the strengths and weaknesses of different theories and methods, or how we should be careful of extreme views, were advice on how to do research (and perhaps use a classification), not whether to classify certain works. It is ironic that after this misplaced plea for inclusiveness Fox reproves me for finding it regrettable that some scholars might reject the literature of others on a priori grounds. The point of that entire section of the 2003 book was to show that there were a variety of literatures that reached quite different conclusions but were talking about the same recognizable variables: they just disagreed about their relative importance. The point of the paragraphs she references is simply "Some scholars say this."
Even if I had been judging the value of these different arguments (which I was not), the important point for information science, that it makes sense to classify all these works in terms of this common set of variables, would still stand. Fox is likely not the first to conflate two distinct though related issues: the debate between me and Hjørland regarding the possibility of a universal classification (the latest installment is Szostak 2011), and the question of how to make sure that the views of disadvantaged groups are best represented in our classifications. The reviewer's main concern seems to be the second, whereas my writing has mostly focused on the first. So let me dip my toe into the second. I would ask a question: Are members of disadvantaged groups better served if the literature they generate is easily found by members of more powerful groups, is often stumbled upon by accident by members of those groups, and is then understood when it is encountered by others? Or, alternatively, if it is classified in a unique fashion so that members of any other groups have to make a special effort to find it and have difficulty navigating it once they do? I think that the first is most important, though I have consistently argued (see Szostak 2010 in particular) for the complementary pursuit of domain analysis and a universal classification: this would at least ensure that the meanings of that literature are well captured in the universal classification, and we might find it advantageous to have domain-specific classifications that are translatable into the universal. It would take a much longer letter to justify, if necessary, my non-naïve reasons for emphasizing that first option. Note that if one prefers the second option, then the debate between myself and Hjørland is moot: only a domain classification is desired. Hjørland's argument that domain analysis is all that we can do should be carefully distinguished from an argument that domain analysis is all that we should want.
They are, in my opinion, wrong for quite different reasons, but equally deserving of classification. But if one prefers the first option, then the debate between Hjrland and me becomes critical because it focuses on the feasibility of precisely the sort of universal classification that would facilitate cross-group understanding. I have also argued consistently that works can and should be classified by the perspective of the author (among other things). This argument is admittedly far less prominent in the books reviewed than in the later articles cited (which reflect the benign influence of Claudio Gnoli and the knowledge organization community more generally). So I seek a universal classification which facilitates cross-group conversation and understanding but yet allows the literature of any perspective to be readily identified.
The reviewer finds my recognition that there are potentially hundreds of thousands of relationships that scholars might study to be an argument against a universal classification. But it is in fact an argument in favor of the sort of universal classification I advocate: rather than try to signify individually this massive number of possible combinations (and especially to do this over and over in different domains), we instead rely on identification of the much smaller set of things and relationships that generate this huge number of combinations (Szostak 2011, 2012a). And if users can then search by any combination, the recall issues Fox mentions will be greatly alleviated (my approach allows structured postcoordinated searching). And the beauty is that these things and relationships lend themselves to a far greater degree of cross-group understanding than the combinations they generate. Groups, that is, disagree far more about how or whether one thing affects another than about the nature of the things and relationships themselves (the key argument of Szostak 2003).

Whether these more basic concepts lend themselves to enough shared understanding is an empirical question. We should not assume that we know the answer because of prior beliefs regarding ambiguity. The Basic Concepts Classification is now developed far enough for people to judge for themselves (Szostak 2012b). (See also the Integrative Levels Classification at www.iskoi.org/ilc.) I am the first to confess that with some elements of that classification I encountered an apparently irreducible degree of ambiguity greater than I would like (political ideology leaps to mind), but I would submit that the vast majority of the terms used are very non-ambiguous and all are non-ambiguous enough for the purposes of classification.

Does the BCC manage to eschew bias toward any group? That is certainly the aim. If some bias can be spotted I am confident that it can be repaired. But of course this is also a matter of judgment. Fox implies that the decisions made in developing the classification may have been biased but, though noting that I describe those decisions in detail, provides no example of a decision that I made while developing the classification that reflected any particular bias. I am, I confess, guilty of making decisions.

A final point: I am also not sure why we bother classifying anything if we do not think that human understanding can advance, but that is a discussion for another day. But I would close by sincerely thanking Fox for reading my work, commenting upon it at length, and recognizing the importance of the goals I have been pursuing. There is a great deal to like in her review. If her review (and this response) stimulate greater interest in my research I am in her debt.

Rick Szostak
Department of Economics, University of Alberta, Tory Building 9-18, Edmonton, Alberta, T6G 2H4, Canada, <rick.szostak@ualberta.ca>

References

Szostak, Rick. 2003. A schema for unifying human science: interdisciplinary perspectives on culture. Selinsgrove PA: Susquehanna University Press.

Szostak, Rick. 2010. Universal and domain-specific classifications from an interdisciplinary perspective. In Gnoli, Claudio and Mazzochi, Fulvio eds., Paradigms and conceptual systems in knowledge organization: Proceedings of the Eleventh International ISKO Conference 23-26 February 2010, Rome, Italy. Würzburg: Ergon Verlag, pp. 71-77.

Szostak, Rick. 2011. Complex concepts into basic concepts. Journal of the American Society for Information Science & Technology 62: 2247-65.

Szostak, Rick. 2012a. Classifying relationships. Knowledge organization 39: 165-78.

Szostak, Rick. 2012b. Basic Concepts Classification. http://www.economics.ualberta.ca/en/FacultyandStaff/~/media/economics/FacultyAndStaff/Szostak/Szostak-Basic-Concept-Classification2.pdf.
A Knowledge Classification Model Based on the Relationship Between Science and Human Needs

The basic needs of human beings are natural ones. Health, sexual desire, food, a house to live in and so on belong to this level. As social animals, humans also have middle-level needs: the social needs. These include money and love (to love and be loved). Money is the foundation on which one maintains relationships in society, and the objects of love are relatives, friends and lovers. Curiosity drives human beings to learn and think. It is the highest human need: the thinking level. By thinking about how to meet these three levels of need, people create science. People need each other, and the needs of individuals add up to the needs of all people.

A free individual needs knowledge of sports to stay healthy, appropriate sexual knowledge in order to have a healthy baby, and some knowledge of cooking to prepare good food. He or she also needs some common sense to stay safe at home and in traffic. All of this knowledge consists of applications of the natural sciences to individuals. Parallel to these individual needs, the applications of the natural sciences to humanity as a whole are medicine, agriculture and technology. What are the basic sciences behind these applications? They are biology and physics. Of course, the basis of biology is physics, and physics should be taken to include chemistry, earth science and astronomy. In terms of human needs, biology is the basis of all knowledge about sports, sex and food (for individuals) and about medicine and agriculture (for all human beings). The foundation of everyday common sense and of technology (such as materials science, energy and information technology) is physics.

Like the natural sciences, the social sciences should be divided into basic sciences and applications. The basic social sciences are economics and sociology, and economics is the basis of sociology. Accordingly, individuals need knowledge of investment to make money, and they also want knowledge about family and public relationships. Finance, commerce and politics (including law and related fields) are the applications of social science to all people.

The thinking science is about how people think. Once a free individual has everything he or she needs at the natural and social levels, he or she may want to reflect on everything about the world and about the three levels of human need. Individuals need all knowledge of the world to satisfy their curiosity, so library science becomes important. For people as a whole, education should be used to spread this knowledge. The basis of thinking science is itself: thinking science. The natural and social sciences are essential to thinking science, and the natural sciences are necessary for the study of the social sciences.

In order to study science, people invent tools for dealing with scientific problems. Languages are tools for describing and communicating these problems. Mathematics, philosophy, history and the like are thinking tools: mathematics helps people think quantitatively, while philosophy, history and the arts (including literature and music) help them think qualitatively.

Guohua Xiao
Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China, <guohua.xiao@gmail.com>
Knowl. Org. 40(2013)No.1 Index to Volume 39 (2012)
No. 1, pp. 1-65; No. 2, pp. 69-150; No. 3, pp. 153-228; No. 4, pp. 233-304; No. 5, pp. 309-399; No. 6, pp. 405-468.
Index to Volume 39 (2012)
ALPHABETICAL INDEX

1.0 Articles

Adler, Melissa A. Disciplining Knowledge at the Library of Congress ... 370
Almeida, Carlos Cândido de. The Methodological Influence of Peirce's Pragmatism on Knowledge Organization ... 204
Baio, Guilherme, Salgado Silva and Gercina Ângela Borém de Oliveira Lima. Using Topic Maps in Establishing Compatibility of Semantically Structured Hypertext Contents ... 432
Barrionuevo Almuzara, Leticia, Mª Luisa Alvite Díez, and Blanca Rodríguez Bravo. A Study Of Authority Control in Spanish University Repositories ... 95
Bianchini, Carlo. Colon Classification and Nuovo Soggettario: The Case of the Library of the Natural History Museum of Udine, Italy ... 23
Bonome, María G. Analysis of Knowledge Organization Systems as Complex Systems: A New Approach to Deal With Changes in the Web ... 104
Bourdenet, Philippe. The Catalog Resisting the Web: An Historical Perspective ... 276
Chiaravalloti, Maria Teresa, Erika Pasceri and Maria Taverniti. URT Indexing and Classification Systems Projects and Biomedical Knowledge Standards ... 3
Choi, Yunseon. A Practical Application of FRBR for Organizing Information in Digital Environments ... 233
Clavier, Viviane and Céline Paganelli. Including Authorial Stance in the Indexing of Scientific Documents ... 292
Cope, Jonathan. Librarianship as Intellectual Craft: The Ethics of Classification in the Realms of Leisure and Waged Labor ... 356
Couzinet, Viviane. Knowledge Organization in Information and Communication Sciences, a French Exception? ... 259
Doria, Orélie Desfriches. The Role of Activities Awareness in Faceted Classification Development ... 283
Fox, Melodie J. and Austin Reece. Which Ethics? Whose Morality?: An Analysis of Ethical Standards for Information Organization ... 377
Freitas, Juliana Lazzarotto, Rene Faustino Gabriel Junior and Leilah Santiago Bufrem. Theoretical Approximations Between Brazilian and Spanish Authors' Production in the Field of Knowledge Organization in the Production of Journals on Information Science in Brazil ... 216
Gilliland, Anne J. Contemplating Co-creator Rights in Archival Description ... 340
Gnoli, Claudio. Metadata About What? Distinguishing Between Ontic, Epistemic, and Documental Dimensions in Knowledge Organization ... 268
Gross, Tina. Eliminate, Abandon, Dismantle: Cataloging in Library Consultant Reports ... 398
Homan, Philip A. Library Catalog Notes for Bad Books: Ethics vs. Responsibilities ... 347
Keilty, Patrick. Tagging and Sexual Boundaries ... 320
Keilty, Patrick. Sexual Boundaries and Subcultural Discipline ... 417
Kim, Jong-Ae. Understanding Knowledge Representation in the Knowledge Management Environment: Evaluation of Ontology Visualization Methods ... 193
Malheiro da Silva, Armando and Fernanda Ribeiro. Documentation / Information and Their Paradigms: Characterization and Importance in Research, Education, and Professional Practice ... 111
Marijuán, Pedro C., Raquel del Moral, and Jorge Navarro. Scientomics: An Emergent Perspective in Knowledge Organization ... 153
Martínez-Ávila, Daniel, Hope A. Olson, and Margaret E.I. Kipp. New Roles and Global Agents in Information Organization in Spanish Libraries ... 125
Martínez-Ávila, Daniel, Margaret E. I. Kipp and Hope A. Olson. DDC or BISAC: The Changing Balance between Corporations and Public Institutions ... 309
Milani, Suellen Oliveira and Fabio Assis Pinho. Knowledge Representation and Orthophemism: A Reflection Aiming to a Concept ... 384
Mustafa El Hadi, Widad and Clément Arsenault. Dynamism and Stability in Knowledge Organization: From one Conference to Another: Toronto 2000, Lille 2011 ... 255
Noruzi, Alireza. FRBR and Tillett's Taxonomy of Bibliographic Relationships ... 409
Oh, Dong-Geun. Developing and Maintaining a National Classification System, Experience from Korean Decimal Classification ... 72
Ortega, Cristina Dotta. Conceptual and Procedural Grounding of Documentary Systems ... 224
Pinho, Fabio Assis and José Augusto Chaves Guimarães. Male Homosexuality in Brazilian Indexing Languages: Some Ethical Questions ... 363
Raieli, Roberto. The Semantic Hole: Enthusiasm and Caution Around MultiMedia Information Retrieval ... 13
Seeman, Dean. Naming Names: The Ethics of Identification in Digital Library Metadata ... 325
Souza, Renato Rocha, Douglas Tudhope, and Maurício Barcellos Almeida. Towards a Taxonomy of KOS: Dimensions for Classifying Knowledge Organization Systems ... 179
Szostak, Rick. Toward a Classification of Relationships ... 83
Szostak, Rick. Classifying Relationships ... 165
Tennis, Joseph T. A Convenient Verisimilitude or Oppressive Internalization? Characterizing the Ethical Arguments Surrounding Hierarchical Structures in Knowledge Organization Systems ... 394
van den Heuvel, Charles. Multidimensional Classifications: Past and Future Conceptualizations and Visualizations ... 446
Zhang, Jane. Archival Context, Digital Content, and the Ethics of Digital Archival Representation ... 332

2.0 Book Reviews

Boteram, Felix, Winfried Gödert and Jessica Hubrich, eds. Concepts in Context. Proceedings of the Cologne Conference on Interoperability and Semantics in Knowledge Organization, July 19-20, 2010. Biblioteca Academica, Reihe Informations- und Bibliothekswissenschaften, Bd. 1. Würzburg: Ergon Verlag, 2011. 183 pages. ISBN 9783899138719 ... 461
Frängsmyr, Tore, ed. The Structure of Knowledge: Classifications of Science and Learning since the Renaissance. Berkeley, California: Berkeley, 2001. 158 pages. ISBN 0-9672617-1-6 ... 137
Gilchrist, Alan, ed. Information Science in Transition. London: Facet, 2009. xxix, 401 pp. ISBN 9781856046930 ... 463
Szostak, Rick. A Schema for Unifying Human Science: Interdisciplinary Perspectives on Culture. Selinsgrove, Pennsylvania: Susquehanna University Press, 2003. 389 pages. ISBN 9781575910604 ... 300
Szostak, Rick. Classifying Science: Phenomena, Data, Theory, Method, Practice. Norwell, Massachusetts: Springer, 2004. 389 pages. ISBN 9781402030949 ... 300

3.0 Reports, Communications, Features, etc.

Dahlberg, Ingetraut. A Systematic New Lexicon of All Knowledge Fields based on the Information Coding Classification ... 142
Marradi, Alberto. The Concept of Concept: Concepts and Terms ... 29
Satija, Mohindar Partap. Enhancing the Subject Headings Minting Capacity of the Sears List of Subject Headings: Some Suggestions ... 60
Satija, Mohindar Partap. Abridged Dewey-15 (2012) in Historical Perspectives ... 466
Smiraglia, Richard P. Shifting Intension in Knowledge Organization: An Editorial ... 405
Williamson, Nancy J. International UDC Seminar 2011, The Hague ... 55



