INM304 Penelope Swan abfc607

Report on Search and Evaluation / Research paper –
Digital libraries & local history

Part 1 – Evaluation of online and web search – using Factiva, Altavista, Dialog Classic and
Clusty.

Topic description

Using Digital Libraries for local history research in the UK.

What kind of local history resources are there in digital libraries? Local history is concerned
with the social aspects of life in a small area, concentrated on people and their everyday lives.
It is not concerned with world politics, except in so far as world events impinge upon the lives
of ordinary people. It is an important part of the cultural heritage of a country.

To be judged relevant the documents retrieved must be digital libraries, which are defined as
online resources which

provide access to different types of information sources in a variety of
formats ... a digital library may contain simple metadata or catalogues of
information resources ... or may contain the full text of documents, images,
audio and video materials. (Chowdhury, 2004, p.426)

To satisfy this query they must provide resources relating to some aspect of local history
research in the United Kingdom. These resources might be church records or other historical
documents such as bills, maps, wills, letters, diaries, records of oral history or photographs.

Facet analysis

Facets are “clearly defined, mutually exclusive, and collectively exhaustive aspects, properties,
or characterisations of a class or specific project.” When we analyse the facets of an
information need we are breaking the subject into its core concepts: the information finally
retrieved should result in synthesis, with the concepts recombined to inform us about the
original subject. (Taylor, 1992)

The facets of this topic can be defined as:

Method – digital libraries – organised collections of documents held online.

Intention – local history – social aspects of history, not concerned with wider politics.
Concentrated on people, places, the local view. Restricted to the UK.

Developing the query

In devising a query for the information need we must first look at the different ways that it might
be expressed. A synonym dictionary can be a useful tool in the first instance, especially for
finding terms for the natural language search. From the dictionary we find:

Cultural, ethnic, population, dwelling, lives

Digital libraries are sometimes referred to in different ways:

Electronic libraries, virtual libraries

A quick look through the web pages that are returned in a search for ‘local history’ on
Altavista finds some connected words:

Parish record, archive, genealogy, regional, village, document, museum, heritage

Wikipedia’s entry provides other interesting terms:

Community, cultural, social, oral, record, settlement, parish, county

An initial search on Altavista using all of these terms returned only two results, one of which
was not relevant. The same search on Clusty returned only one, non relevant, site.
Obviously, we need to use discrimination in our choice of terms, and to find useful sites we
also need to consider which terms to exclude from our searches.

The most important terms are those that define the essence of the search: ‘digital library’ and
‘local history’. Digital library is used more widely – and with a broader range of meaning -
than ‘electronic library’ or ‘virtual library’ and so this is the term we will use. Searching for
these two terms brings reasonable results, judging by the summaries, but they are not yet
precise enough. As we want to find information about sources dealing with history in the UK,
we should include ‘UK’, and perhaps ‘Great Britain’ and ‘England’. This simple search brings
good results on both Altavista and Clusty, and removes American sources which are not
relevant to the defined topic.

As local history research centres on local records, especially church registers and archives, we
could try adding ‘social’, ‘archive’, ‘cultural’, ‘parish’ and ‘register’. This brings up some
interesting hits, but not necessarily better ones as an initial look at the summaries shows
fewer references to ‘digital library’ and ‘local history’. Some archives are returned, however,
and as ‘digital archive’ might be an alternative to ‘digital library’ we will retain ‘archive’. Parish
records are very important to local history research, being the only records of the past in some
places, so we will retain ‘parish’. ‘Record’ can also refer to an individual document entry, and
may bring up irrelevant hits, so we will not include it, but substitute ‘register’, which is the
proper name for church lists of births, marriages and deaths.

This leaves us with the following natural language search:

digital library local history uk england great britain archive parish register

We express this in Boolean terms as:

‘Digital librar*’ AND ‘local history’ AND (UK OR Great Britain OR England) AND (archive* OR
parish* OR register*) NOT (US OR USA OR America)
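
As a rough illustration of how this Boolean expression behaves, the following Python sketch applies it to plain text. The helper names (`has_term`, `matches_query`) and the sample sentences are invented for the example, and real search engines apply their own indexing and truncation rules, so this is only an approximation. It assumes that `*` means right-hand truncation and that all-capital terms such as UK and US should be matched case-sensitively, so that the ordinary word "us" does not trigger the exclusion.

```python
import re


def has_term(text, term):
    """Return True if `term` occurs in `text`.

    A trailing * is treated as right-hand truncation (librar* matches
    library and libraries). All-capital terms such as UK or US are
    matched case-sensitively; everything else ignores case.
    """
    flags = 0 if term.isupper() else re.IGNORECASE
    if term.endswith("*"):
        pattern = r"\b" + re.escape(term[:-1]) + r"\w*"
    else:
        pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=flags) is not None


def matches_query(text):
    """Apply the Boolean query from this section to one document's text."""
    core = has_term(text, "digital librar*") and has_term(text, "local history")
    place = any(has_term(text, t) for t in ("UK", "Great Britain", "England"))
    source = any(has_term(text, t) for t in ("archive*", "parish*", "register*"))
    excluded = any(has_term(text, t) for t in ("US", "USA", "America"))
    return core and place and source and not excluded


# Invented sample sentences, for illustration only.
uk_site = "Join us: a digital library of local history archives for the UK."
us_site = "Digital libraries for local history in the USA and England: county archives."

print(matches_query(uk_site))  # True
print(matches_query(us_site))  # False
```

The second sentence is rejected only because of the NOT clause, which is the effect the exclusion terms are meant to have on American sources.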

Evaluation of natural language and Boolean searches

All of the evaluations are based on the first 10 documents retrieved.

Repeated documents are any that contain the same information, expressed in the same way.
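
The P@5 and P@10 figures reported in this section are precision at rank k: the fraction of the first k results judged relevant. A minimal Python sketch of that calculation follows. The list of judgements is hypothetical: it is arranged to reproduce the values reported for the Clusty natural language search (0.6 and 0.8), but the report does not record which ranks the two non-relevant results occupied. EAP is not reproduced because the report does not state its formula.

```python
def precision_at_k(relevance, k):
    """Fraction of the first k results judged relevant.

    `relevance` is a ranked list of 1 (relevant) and 0 (not relevant).
    Dividing by k, not by the number of results returned, means an
    empty result list scores 0.
    """
    return sum(relevance[:k]) / k


# Hypothetical ranked judgements (positions are illustrative only).
clusty_natural = [1, 0, 1, 0, 1, 1, 1, 1, 1, 1]

print(precision_at_k(clusty_natural, 5))   # 0.6
print(precision_at_k(clusty_natural, 10))  # 0.8
```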

Engine     Query     P@5   P@10   EAP (top 10)   Repeated docs.   Link broken   Not retrieved   Spam
FACTIVA    natural   0     0      0              0                0             0               0
FACTIVA    Boolean   0     0      0              0                0             0               0
ALTAVISTA  natural   1     1      1              0                0             0               0
ALTAVISTA  Boolean   0.8   0.6    0.589          0                0             0               0
DIALOG     Boolean   0     0      0              0                0             0               0
CLUSTY     natural   0.6   0.8    0.529          0                1             0               0
CLUSTY     Boolean   0.8   0.7    0.60           0                0             0               0

Factiva

Factiva’s speciality is business news articles, so it was not a good match for this query, which
required links to digital libraries. The search as defined above returned no results. Factiva’s
search builder, however, uses Boolean categories to build a query (see appendix note i). The
options pane was also used to select the UK, and this was slightly more successful as eight
documents were retrieved. Although none fully satisfied the research criteria, two referred to
online historical archives.

Altavista

Excellent results were returned with the natural language query – every document was
relevant and there were no duplications or broken links.

The advanced search with the ‘build a query’ form uses Boolean categories and options for
timeframe and domain (see appendix note ii). All of the chosen search terms were included.
‘UK’, ‘England’ and ‘Great Britain’ were covered by selecting the .uk domain, and ‘anytime’
was selected as the timeframe.

The results were not quite as good with the advanced search, as some documents were
returned which discussed the use of digital archives or how to make a collection. Although
connected to the topic, these were not relevant as the intention was to retrieve searchable
resources themselves.

Dialog

Dialog is designed to retrieve journal articles and abstracts, and was not suitable for the search
as defined – no digital libraries were found, and combining the sets ‘local history’, ‘UK OR
England OR Great Britain’ and ‘digital libraries’ resulted in no hits at all.

The articles that were retrieved were all of interest to the local history researcher, however.
As sources can be carefully selected by using the descriptive bluesheets, a high degree of
accuracy can be achieved if facets are carefully chosen to define the search. Although Dialog
was unable to provide links to digital libraries, the items retrieved were all otherwise relevant
to the topic (see appendix note iii). A disadvantage of this site is that the documents are not
ranked: chosen databases are searched in the order entered by the searcher. The databases can
be ranked according to the
number of documents found in each, however, as was done here. The first ten documents
were all from ‘Historical Abstracts 1973-2005’.
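
Building a Dialog search facet by facet amounts to intersecting result sets. The sketch below mimics the `c 1 and 3`, `c 4 and 2` and `c 4 and 6` steps recorded in the appendix, using Python sets. The record numbers are toy values chosen for illustration, not Dialog data, and the variable names are assumptions.

```python
# Toy record ids standing in for Dialog result sets (not real data).
s1_local_history = {1, 2, 3, 4, 5}   # s local history
s2_digital_librar = {6, 7}           # s digital librar?
s3_place = {2, 3, 4, 6, 8}           # s great britain or england or UK
s6_sources = {3, 4, 9}               # s archive? or parish? or register?

s4 = s1_local_history & s3_place     # c 1 and 3
s5 = s4 & s2_digital_librar          # c 4 and 2
s7 = s4 & s6_sources                 # c 4 and 6

print(sorted(s4))  # [2, 3, 4]
print(sorted(s5))  # []
print(sorted(s7))  # [3, 4]
```

As in the real session, adding the ‘digital library’ facet empties the set, while swapping in the source-type facet keeps some records.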

Clusty

The results were good for the natural language search, with just one broken link and one site
that was not relevant because it was concerned with American local history. The interesting
feature of Clusty is its inclusion of alternative ‘clusters’, listed on the left side of the page.
These offer further refinements of the search. For this search the best ‘cluster’ was ‘local
studies’ which contained a further eight sites, three of which were relevant.

Clusty’s advanced search offers a choice of host, language, file type and number of results. UK
and English were chosen, and the results were restricted to 100. This brought back seven
suitable sites, and some useful clusters.

It was found that using the Boolean search in the query box without any of the advanced
options brought back 8 relevant results. There was little overlap - 9 out of the 15 relevant sites
were distinct. This search was not included in the evaluation as the intention was to look at
the tools provided for advanced search by the different search engines.

Overview

The usefulness of different search engines depends at least in part on the nature of the query.
This topic required that links to digital libraries be returned, and the Altavista web search and
Clusty meta search engines performed well. Factiva and Dialog Classic search online
collections of carefully chosen documents – where a search matches their remit they are very
accurate and the results are trustworthy. For this query, however, they could not provide what was asked for.
Although Dialog does not provide links to the specified format, the history subject area covers
the topic very well and queries can be built facet by facet to give precise results. Factiva is not
suitable for a query of this kind, as it does not cover the right kind of information.

Altavista and Clusty found a broad range of digital libraries dedicated to local history,
especially when the simple search was used, and Dialog contains some useful information for
the local history researcher to use. Our conclusion must be that the most important first step
in forming a search strategy is to choose the proper service for the information need, and to
use the form of query best suited to the search engine.

Part 2 – research topic

Digital libraries and the preservation of Cultural Heritage

Abstract

This paper looks at the major issues and difficulties of digitisation, at the importance of access
to digital information and at the particular relevance of digital libraries to the historian.

Key words

Digital libraries, digitisation, information, preservation, history

Introduction

A perennial question for libraries and museums has always been how best to preserve and
make available the various documents and artefacts of our common cultural heritage that are
in their possession. Many years of experience with the preservation of paper and the
development of new materials – such as acid-free paper – mean that we are well equipped to
restore and maintain physical archives: original items are the treasured centrepieces of
museums and archives. These artefacts require space and specially controlled environments,
however, and access to them is necessarily restricted both by location and by their fragility.
Also, paper cannot be preserved for ever, and many libraries and archives have turned to
digitisation as an alternative means of preservation. That this is a popular measure is easily
demonstrated by the fact that the first ten hits on the Altavista search in part 1 displayed ten
different sites offering digital access to historical documents. The general public can now read
(either transcribed or scanned) documents that previously would have been available only to
researchers under controlled conditions.

Major issues in digital preservation

Digital information, however, has its own problems of preservation: information has a longer
lifespan than the technology. New software formats can mean that there are viewing
problems with older forms of digital documents, and new computer hardware may not
support older programmes. There is a cycle of perhaps 3-5 years for new products and
methods and companies may cease to support older products, although there is some utility
for ‘backwards compatibility’ in new programmes. (Hedstrom, 2002) As well as changes in the
technology, the digital information itself is liable to degrade over time and must be checked
for errors and losses. Obsolescence is a major problem, and plans for migrating information
to new media – a time-consuming and expensive task - must be part of any programme of
digitisation. “Migration, emulation and enduring standards are the options for future
readability.” (Seadle, 2008, p.6) The digital archive JSTOR, for example, preserves multiple
copies of the print originals from which new digital copies can be made, has redundant data
centres for CD-ROMs and tape copies, upgrades the archive regularly to keep pace with new
technology and formats and has strategies in place to protect its collection and mission in the
event that JSTOR ceases to function. This vigilant approach is in line with that suggested in
the literature:

we ... cannot count on a stable technical or financial environment for
archiving works that we want to be available in the digital library in 100
years. Our safest route to minimize the risk of losing intellectually valuable
documents is not to link their survival to a single system or a single means of
financial support. (Seadle, 2008, p.9)

Because of this continual danger of obsolescence, and because it is almost impossible
accurately to predict which systems will fail and be discontinued and which will succeed, all
digitisation projects should plan for interoperability so that documents can more easily be
moved to new systems without loss of information.

Interoperability involves thinking about archiving systems the way we do
about OPACs, as commodity products that are in effect storage containers
for information that we can replace when they wear out. (Seadle, 2008)
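
The earlier point that digital information "must be checked for errors and losses" is commonly handled with fixity checks: a checksum is recorded when a file is stored and recomputed later, and any difference signals damage or loss. The Python sketch below uses SHA-256 from the standard library. The function names and the one-folder layout are assumptions made for the example, and it does not describe how JSTOR or any other archive mentioned here actually works.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_checksums(folder):
    """Map each file under `folder` (by relative path) to its checksum."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_or_missing(folder, recorded):
    """List recorded files whose current checksum differs or that are gone."""
    current = record_checksums(folder)
    return sorted(
        name for name, digest in recorded.items() if current.get(name) != digest
    )
```

In use, `record_checksums` would be run once at ingest and its result stored apart from the collection; `changed_or_missing` can then be run on a schedule, and again after every migration to new media.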

Another major issue for digitisation is the lack of a unifying strategy. JISC and the British
Library both have Digitisation Strategies to define how items should be chosen and how the
programme is to be funded, and they are committed to establishing and maintaining
sustainable standards for digital capture, description, preservation and interoperability. But
there is not yet a nationwide agreement about what should be digitally preserved or how it is
to be stored and paid for. As a result many digitisation projects are small-scale and isolated,
use a variety of standards and formats and sometimes duplicate each other’s work (Bultman,
2006). A recent CILIP response to ‘Digital Britain’, the report by the communications minister
Lord Carter, found that the weakest section was the one about digital content. The recent Blue
Ribbon Task Force report on the financial imperatives for our digital future, ‘Sustaining the
digital investment’, found inadequate funding, confusion about responsibility for maintenance,
inadequate long-term planning and agreement on standards, complacency, and a fear of the
sheer size of the task of preserving the best of our current ‘born digital’ output as well as the
older materials that are being digitised. “Digital preservation requires both technical
strategies and supporting infrastructure” (Bultman, 2006, p.111) and these twin requirements
should be as uniform as possible to allow for interoperability. As the amount of digital
information is growing faster than our ability to store it, it must be decided what should be
kept, and what is disposable, and this is an urgent task.

Decisions about the future of ... digital reference collections -- how they will
be migrated to future information technologies without interruption, what
kind of infrastructure will protect their digital content against damage and
loss of data, and how such efforts will be supported -- must be made now to
drive future innovation. (Blue ribbon task force, 2008, p.1)

Are digital libraries necessary?

Digital access to all kinds of information has become a part of modern life; for research, for
news, for communication and for fun. Indeed: “Information has to be accessible easily at a
mouse click, on the internet, otherwise it is no longer even taken notice of” (Ceynowa, 2008,
p.1). The 2006 OCLC report “College students” found that most students now prefer to use the
internet rather than the college library (referenced in Ceynowa). These facts mean that digital
library provision is essential to the sharing of resources and dissemination of information that
is at the heart of a civilised society. Some have seen digitisation as the beginning of the end
for libraries, but Klaus Ceynowa, in his examination of the Bavarian State Library’s digitisation
project, sees it as a positive reply to the perceived threat. The library has entered into a
contract with Google and their collection is being digitised as part of Google’s major library
project. The library will obtain a “library digital copy” which will be full-text indexed,
catalogued with structured data and web 2.0 functions, and integrated in the regional,
national and international portals without restrictions. If the full text of its books can be
accessed via a library’s catalogue, Ceynowa argues, then the library can better fulfil its mission.

The current development can be characterised as a renaissance of libraries,
which are enjoying a continuous increase in use as places of cultural and
scholarly exchange and concentrated learning, even though increasingly
comprehensive parts of their information offer are made available online.
(Ceynowa, 2008, p.12)

The addition of metadata is vital in this new library world: digital libraries must be more than
just copies of paper documents - they must support the integration of resources through
descriptive metadata which enables search and retrieval to a high degree. They must live up
to this definition:

A (potentially virtual) organization that comprehensively collects, manages,
and preserves for the long term rich digital content and offers to its user
communities specialized functionality on that content, of measurable
quality, and according to prescribed policies. (Candela et al, 2002-2006, p.8)

Advantages of digital libraries for historical researchers

The Institute of Historical Research maintains the British History Online digital library as a
means of widening access to resources, preserving them (by making digital copies, that allow
access and reduce physical handling of delicate originals) and making new connections:

The site juxtaposes many different but related resources, which opens up
many connections between sources that would be more difficult or
impossible in a print environment. Our site structure allows users to browse
our resources by place, subject and period. It is also possible to search within
and across different sources, allowing many new and unforeseen connections
to be made. (Institute of Historical Research)

It is this juxtaposition and interconnection that makes digitisation an important tool for
historical researchers. When a researcher had to search through two or three quite
unconnected archives the connections that can now be made between two documents might
have been missed altogether.

Looking quickly through the results of the Altavista search from part 1, we find ‘Family and
Local History Links’ which contains, amongst other interesting documents, the Oxfordshire
Archdeacon’s marriage bond index. Following this link takes us to a useful introduction,
prepared by an historian, and so to copies of the documents arranged as in the original, and
also alphabetically by the grooms’ and brides’ names and parishes. This is a simple example of
how ‘value’ can be added in a digital library. The lay genealogist benefits not only from having
the records to hand wherever he or she may happen to be, but also from an expert’s
instruction and a very helpful and easily searchable rearrangement of the data.
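
The rearrangement described here, one set of records presented in several orders, is simple to do once the entries exist as structured data. The Python sketch below shows the idea. The three entries and their field names are invented for the example and are not taken from the marriage bond index itself.

```python
from operator import itemgetter

# Invented sample entries; the real index's fields and contents are not
# reproduced here.
bonds = [
    {"year": 1701, "groom": "Smith, John", "bride": "Archer, Mary", "parish": "Witney"},
    {"year": 1702, "groom": "Baker, Thomas", "bride": "Young, Ann", "parish": "Banbury"},
    {"year": 1703, "groom": "Cole, William", "bride": "Hart, Jane", "parish": "Witney"},
]


def arranged_by(records, *fields):
    """Return a new list of records sorted by the given field names."""
    return sorted(records, key=itemgetter(*fields))


by_groom = arranged_by(bonds, "groom")
by_bride = arranged_by(bonds, "bride")
by_parish = arranged_by(bonds, "parish", "groom")

print([r["groom"] for r in by_groom])
# ['Baker, Thomas', 'Cole, William', 'Smith, John']
```

The original order is left untouched, so the "as in the original" view and the alphabetical views can be offered side by side.
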
Another site that was retrieved in the search is British History Online, an authoritative digital
library of “some of the core printed primary and secondary sources for the medieval and
modern history of the British Isles.” (BHO, homepage) This has been created and is
maintained by the Institute of Historical Research as a reliable, free and invaluable guide and
resource for professional and amateur historians. Before digitisation and the online world it
would not have been possible to see all the information gathered here without extensive travel
and a considerable historical knowledge. This new world of the digital library enables
anyone’s thirst for history to be satisfied.

Conclusion

Digital libraries are an excellent way to share information, especially in the field of historical
documents where copyright restrictions do not usually apply, and many documents that are
too delicate to be handled can be scanned and used more widely than ever before.

Digital incarnations of information are, however, just as delicate and subject to time and age
as any others. They may decay, their physical carriers may become outdated or broken - we
may forget how to read them. It is easy to assume that once copied onto disk, tape or hard
drive, our paper documents are preserved but care must be taken to maintain digital
information with careful storage, migration to new media when necessary, and establishment
of widely held standards to ensure that that migration is possible. At the same time, libraries
and archives must continue to look after the originals - a digital surrogate is a form of
preservation, but not a substitute for other means of preservation.

8
INM304 Penelope Swan abfc607
Report on Search and Evaluation / Research paper –
Digital libraries & local history

Bibliography

Blue Ribbon Task Force, interim report (2008) ‘Sustaining the digital investment: issues and
challenges of economically sustainable digital preservation’
http://brtf.sdsc.edu/biblio/BRTF_Interim_Report.pdf (accessed 20-04-2009)

Bultman, B; Hardy, R; Muir, A; Wictor, C (2006) ‘Digitised content in the UK research library
and archives sector’ Journal of library and information studies 38:2 pp 105-122

Candela, Leonardo; Castelli, Donatello; Ioannidis, Yannis (2008) ‘Report on the third
workshop on foundations of digital libraries’ D-lib magazine 14:1/2

Candela, L; Castelli, D; Ioannidis, Y; Koutrika, G; Pagano, P; Ross, S; Schek, H-J; Schuldt, H
(2002-2006) ‘The Digital Library Manifesto’ DELOS
http://delos-dl.isti.cnr.it/OLP/UI/1.0/Load/manifestation_ext/1240318472GqDqB7Y7pH?uri=http%3A//146.48.87.21%3A8003/OLP/Repository/1.0/Disseminate/delos/2006_other_0081/content/auto-html%3Fversion%3D1&epl_content_type=text/html
(accessed 01-04-2009)

Ceynowa, Klaus (2008) ‘Mass digitization for research and study: the digitisation strategy of
the Bavarian state library’ World library and information congress: 74th IFLA general conference
and council http://archive.ifla.org/IV/ifla74/papers/139-Ceynowa-en.pdf (accessed 20-04-2009)

Chowdhury, G G (2004) ‘Introduction to modern information retrieval’ 2nd edition London: Facet Publishing

Gladney, H M (2008) ‘Information science and scholarly writing’ Digital document quarterly
7:3

Gradmann, S (2008) ‘Interoperability: A key concept for large scale, persistent digital
libraries.’ A DPE Briefing paper
http://www.digitalpreservationeurope.eu/publications/briefs/interoperability.pdf (accessed 20-04-2009)

Hedstrom, M (2002) ‘Digital preservation: a time bomb for digital libraries’
http://www.uky.edu/~kiernan/DL/hedstrom.html (accessed 20-04-2009)

Institute of Historical Research ‘British history online’ http://www.british-history.ac.uk/

Lubell, Joshua; Rachuri, Sudarsan; Mani, Mahesh (2008) ‘Sustaining engineering informatics:
towards methods and metrics for digital curation’ International journal of digital curation 2:3
pp 59-73

NINCH guide to good practice in the digital representation and management of cultural
heritage materials http://www.nyu.edu/its/humanities/ninchguide/index.html

Seadle, Michael (2008) ‘The digital library in 100 years: damage control’ Library hi tech 26:1 pp
6-10

Taylor, Arlene G (1992) ‘Introduction to cataloging and classification’ 8th edition Englewood,
Colorado: Libraries Unlimited

Appendix to part 1

i
Factiva search form
All of these words – digital library
At least one of these words – archive parish register
None of these words – USA America US
This exact phrase – local history
Region – united kingdom

ii
Altavista advanced search
All of these words – local history
This exact phrase - digital library
Any of these words - archive parish register
None of these words - America USA US
Timeframe - anytime
Domain - .uk

iii
Dialog search
? b 411

DIALINDEX(R)

? set files history

? show files

7: Social SciSearch(R)_1972-2009/Apr W1
35: Dissertation Abs Online_1861-2009/Mar
38: America:History & Life_1963-2005/Q3
39: Historical Abstracts_1973-2005
47: Gale Group Magazine DB(TM)_1959-2009/Apr 07
88: Gale Group Business A.R.T.S._1976-2009/Apr 16
141: READERS GUIDE_1983-2009/MAR
142: Social Sciences Abstracts_1983-2009/Mar
190: BIBL.HISTORY OF ART_1991-2008/Q3
436: HUMANITIES ABS_1984-2009/APR
439: Arts&Humanities Search(R)_1980-2009/Mar W4

? b 39, 88, 436, 439, 7

File 39:Historical Abstracts 1973-2005
File 88:Gale Group Business A.R.T.S. 1976-2009/Apr 16
File 436:HUMANITIES ABS 1984-2009/APR
File 439:Arts&Humanities Search(R) 1980-2009/Apr W1
File 7:Social SciSearch(R) 1972-2009/Apr W1

? s local history

S1 1135 LOCAL HISTORY

? s digital librar?

S2 958 DIGITAL LIBRAR?

? s great britain or england or UK

69725 GREAT BRITAIN
255851 ENGLAND
190398 UK
S3 485895 GREAT BRITAIN OR ENGLAND OR UK

? c 1 and 3

1135 1
485895 3
S4 275 1 AND 3

? c 4 and 2

275 4
958 2
S5 0 4 AND 2

? s archive? or parish? or register?

93818 ARCHIVE?
22727 PARISH?
96193 REGISTER?
S6 202239 ARCHIVE? OR PARISH? OR REGISTER?

? c 4 and 6

275 4
202239 6
S7 50 4 AND 6

? ds

S1 1135 LOCAL HISTORY
S2 958 DIGITAL LIBRAR?
S3 485895 GREAT BRITAIN OR ENGLAND OR UK
S4 275 1 AND 3
S5 0 4 AND 2
S6 202239 ARCHIVE? OR PARISH? OR REGISTER?
S7 50 4 AND 6

? rd

S8 49 RD (unique items)

? t 8/9/1-10

Mark 74%

Comments:
Part 1: Your topic focuses on using Digital Libraries for local history research in the UK. Your
topic is clearly defined, but think about how you could give your information need more
structure using the TREC style topic template (see the topics we used in the tutorial for this
purpose). Your facet analysis is quite good, but it would have been better had you provided
more information on how your facets were developed from the information need (rather than
providing a definition of facets, which is not required in the coursework specification). You
have clearly shown the development of your query from the facet analysis and the process
you used in order to develop your search strategy is fully transparent. You have shown a
pretty good understanding of evaluation, particularly the effect of sources on your search
results (e.g. Factiva really was not much use for your need). Your report is well written and
structured. Part 2: You have concentrated more on the issue of preservation as it applies to
historians. You have provided a very good review of the issues, both in terms of the
preservation of material and the meta-data applied to it. You have also provided some useful
examples of DL's, thus showing quite clearly the implications of DL's on the profession. In
doing so you have provided a very good synthesis of the literature you gathered, and it
would have been excellent had you developed your abstract more. Your referencing is
excellent and you have adhered to the Harvard style. Your paper is very well written and you
have (mostly) adhered to the required structure.

Overall
You have met the learning requirements of the module at the distinction level.
