
Seminar 11: Computer-aided terminology

The role of computers in terminology/terminography:
- Documentation: access to databanks and specialized texts in electronic format prior to beginning work
- Creation of a corpus: automatic term extraction
- Terminological records in electronic format: writing the entry
- Checking the information in the entry
- Editing glossaries in electronic format
(ap. Rucareanu & Pavel, Cabre)

ACTIVITY 1: Assess your own activity and list the electronic tools you normally use in terminology work (+/- for translation), together with the instances in which you use them.

Problems related to using electronic tools in terminology work:
- Lack of integration of computer resources into work methods
- Lack of compatibility among the resources
- Limited degree of computer processing offered by each resource; human intervention constantly required (?)
- Lack of an operative, user-friendly interface between humans and computers (resulting from difficulties related to communication in natural languages)
- Limited number of corpora, especially in languages other than the major international languages (English, French)
(ap. M.T. Cabre, 1999)

ACTIVITY 2: List (at least three) problems you have faced in using electronic tools in terminology (+/- for translation).

Data banks = information organized in records, subdivided into data fields

Steps for creating a databank:
- Identification of needs
- Defining parameters (by a group of experts)
- Choosing the working group
- Design of the databank
- Choosing the hardware and software
- Entry: type of information, sources, structure of the information etc.
- Storage: type of records, relationship among records, structure of records
- Retrieval: types of queries, formats of retrieved information
(ap. M.T. Cabre, 1999)

INPUT
Data banks of interest to terminology:
o Document databases (Eurlex)
o Specialized text data banks (corpora)
  - Economic neology in Romance languages (Romanian among them): a glossary of economic terms extracted from the press, useful for a comparative approach to terminology and for the study of neology in a specific domain.
    http://obneo.iula.upf.edu/economia/esp/index.html
  - Ordilex.com, Lexiques multilingues: Lexique français-anglais de l'immobilier (lexique français-anglais des termes de l'immobilier)
    http://www.ordilex.com/documents/index.html
  - Grand dictionnaire terminologique
    http://w3.granddictionnaire.com/btml/fra/r_motclef/index1024_1.asp
o Specialized corpora [1] (check the Pavel Terminology Tutorial section on corpus search tools at http://www.btb.gc.ca/btb-pavel.php?page=chap4-71&lang=eng&contlang=eng)
  [1] A good page on corpora: http://corpus.leeds.ac.uk/list.html
  - BNC at http://corpus.byu.edu/bnc/
    ACTIVITY 3: Register with the BNC; check the status (synchronically, in various types of texts, and diachronically) of the singular/plural forms datum / data.
  - Corpus of Professional English (CPE): a major research project of PERC (Professional English Research Consortium), currently underway, that, when finished, will consist of a 100-million-word computerized database of English used by professionals in science, engineering, technology, law, medicine, finance and other fields.
    ACTIVITY 4: Run a similar search on the term refund with the purpose of identifying specific collocational patterns.

  - Wolverhampton Business English Corpus: 10,186,259 words in the general domain of business, collected from 23 different web sites around the world (over six months within the period 1999-2000), covering a wide variety of categories including product descriptions, company press releases, annual financial reports, business journalism, academic research papers, political speeches and government reports. POS-tagged. Alternatively, you can view and compare frequency lists and n-grams for various subcorpora/text genres (including business texts).
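The frequency lists, n-grams and collocational patterns mentioned for these corpora can be illustrated on a very small scale. The following is only a minimal sketch in Python: the function names (`tokenize`, `ngram_counts`, `collocates`), the crude tokenizer and the three sample sentences are assumptions made for illustration, and real corpus interfaces rely on much larger data and on statistical association measures rather than raw counts.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def collocates(tokens, node, window=2):
    """Count the words found within `window` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts

# Invented sample sentences standing in for a real business corpus.
sample = (
    "The customer may claim a full refund within thirty days. "
    "A full refund is issued once the goods are returned. "
    "The company refused to issue a refund."
)

tokens = tokenize(sample)
print(Counter(tokens).most_common(3))               # frequency list
print(ngram_counts(tokens, 2).most_common(3))       # most frequent bigrams
print(collocates(tokens, "refund").most_common(3))  # words co-occurring with "refund"
```

Because punctuation is discarded, the collocation window in this sketch may cross sentence boundaries; a more careful version would process each sentence separately.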

INPUT
- Terminological databank: a structured collection of information about the units of meaning and designation of a specialized field, addressed to the needs of a specific group of users
- Consists of:
  o A main database (including the terms)
  o Other databanks related to each other (containing information on some aspect of the terms)

Classification of databanks:
- By objectives: informative (they disseminate terminology); prescriptive (they intervene in term usage)
- By entries: based on terms; based on concepts
- By subject matter: specialized in a subject field; specialized in several related subject fields
- By size: large banks (administrative bodies, e.g. IATE); terminology minibanks (developed by a professional/centre specializing in a subject field)
- By type of data (term banks, with definitions, phrase banks, encyclopedic banks, visual banks)
- By number of languages (mono-, bi-, multilingual)
- By type of data organization (organized by document; organized by terms without context)
(M.T. Cabre, 1999)

Other tools:
o Tools for term extraction: term extractor TermoStat Web at http://idefix.ling.umontreal.ca/~drouinp/termostat_web/

  ACTIVITY 5: Run a similar search, extracting terms from a text you have been working on in Genetics (.txt or .doc format); compare the results with your own list of terms. Draw a conclusion.
o Alignment tools: alignment tool by Terminotix at http://www.youalign.com/Default.aspx
  ACTIVITY 6: Prepare two chunks of text from EURLEX (the English version at http://eurlex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:C:2012:287:0009:0009:EN:PDF and the Romanian version) and align the two texts.
o Tools for corpus analysis [2] (providing concordances, word lists, n-grams, collocations, +/- contexts):
  - AntConc at http://www.antlab.sci.waseda.ac.jp/antconc_index.html
  - Linguateca at http://www.linguateca.pt/corpografo/
o Tools for text analysis: Textalyser. This text analysis tool provides information on the readability and complexity of a text, as well as statistics on word frequency and character count. It can be of assistance to translators when calculating quotes for clients.
  http://www.lexicool.com/text_analyzer.asp

[2] More about corpora and tools at http://courses.washington.edu/englhtml/engl560/corplingresources.htm
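The kind of figures a text analysis tool such as Textalyser reports (character count, word frequency, rough indicators of complexity and readability) can be approximated in a few lines. This is a minimal sketch, not Textalyser's actual method: the function name `text_statistics`, the choice of indicators and the sample text are assumptions. The readability score used here is the Automated Readability Index, one common formula among several, and sentence splitting on ".", "!" and "?" is deliberately crude (abbreviations and decimals will be miscounted).

```python
import re
from collections import Counter

def text_statistics(text, top=5):
    """Return basic counts and rough complexity indicators for a text."""
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lowered = [w.lower() for w in words]
    letters = sum(len(w.replace("'", "")) for w in words)
    n_words, n_sent = len(words), len(sentences)
    ari = None
    if n_words and n_sent:
        # Automated Readability Index: higher values suggest a harder text.
        ari = 4.71 * letters / n_words + 0.5 * n_words / n_sent - 21.43
    return {
        "characters": len(text),
        "words": n_words,
        "sentences": n_sent,
        "words_per_sentence": n_words / n_sent if n_sent else 0.0,
        # Share of distinct word forms: a rough measure of lexical variety.
        "type_token_ratio": len(set(lowered)) / n_words if n_words else 0.0,
        "ari": ari,
        "top_words": Counter(lowered).most_common(top),
    }

# Invented sample text for illustration.
sample_text = (
    "Terminology is the study of terms. "
    "Terms designate concepts in a specialized field."
)

stats = text_statistics(sample_text)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The word count is the figure most directly relevant to quoting for clients; the other values only give a first impression of how dense the text is.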

ACTIVITY 7: Prepare a text you are currently working on in one of your PC Specialized Translation courses and analyse it for degree of complexity and statistics as provided by Textalyser.

Knowledge bases (Protégé, a free open-source ontology editor and knowledge-base framework, at http://protege.stanford.edu/index.html)
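A knowledge base stores concepts together with the relations among them, which is close to the concept-based databank described earlier in this handout (records subdivided into data fields, with entry, storage and retrieval as separate design steps). The sketch below shows that idea in Python. It is not how Protégé or any existing term bank is implemented: the class names `TermRecord` and `TermBank`, all field names and the two sample records are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TermRecord:
    """One concept-based record; each attribute is a data field."""
    concept_id: str
    subject_field: str
    terms: dict                 # language code -> term, e.g. {"en": "refund"}
    definition: str = ""
    source: str = ""
    related: list = field(default_factory=list)  # concept_ids of related records

class TermBank:
    """A tiny in-memory bank: storage by concept_id, retrieval by term or relation."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.concept_id] = record

    def lookup(self, term, lang):
        """Return all records whose term in `lang` matches, ignoring case."""
        wanted = term.lower()
        return [r for r in self._records.values()
                if r.terms.get(lang, "").lower() == wanted]

    def related_to(self, concept_id):
        """Return the records linked from the given concept."""
        record = self._records[concept_id]
        return [self._records[c] for c in record.related if c in self._records]

# Illustrative sample records (hypothetical identifiers and definitions).
bank = TermBank()
bank.add(TermRecord("C001", "commerce",
                    {"en": "refund", "fr": "remboursement", "ro": "rambursare"},
                    definition="money returned to a buyer",
                    related=["C002"]))
bank.add(TermRecord("C002", "commerce",
                    {"en": "payment", "fr": "paiement", "ro": "plată"},
                    definition="money given in exchange for goods or services"))

for rec in bank.lookup("Refund", "en"):
    print(rec.concept_id, rec.terms["ro"], [r.terms["en"] for r in bank.related_to(rec.concept_id)])
```

Keying the records on a concept identifier rather than on a term is what makes this bank concept-based: the same record carries the designations in every language, so a multilingual query needs only one lookup.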
