
MLTM03 Language Technology I

Week 4: Building MultiTerm 2009 termbases

MultiTerm 2009 is a powerful multilingual terminology management application developed by SDL Trados and sold as part of its Studio 2009 CAT (Computer-Assisted Translation) system. In addition to terms and their translations into one or more languages, MultiTerm allows you to add different types of descriptive fields, including graphics (a photo or diagram of the object) and even video clips. MultiTerm is the package that you will use for the group assignment in this module, as well as for your second Terminology Acquisition Project (TAP) in Advanced Translation. In this week's session you will first explore the MultiTerm interface [NB: MultiTerm has a capital T in the middle: it is important to spell application names accurately]. In Part 2 you will learn how to build a new termbase, and in Part 3 how to import data from an Excel spreadsheet. Finally, in Part 4, you will export your data again in different formats: this is a good data security measure, and a habit you should acquire.

Part 1: MultiTerm interface


1. Make a MultiTerm folder on Q and download the MultiTerm 2009 Sample Termbase into it from the Week 4 folder on Blackboard.

2. Run MultiTerm, click the Termbase > Open Termbase menu, in the Select Termbases dialog browse to the location of the sample termbase, and select its friendly name of SDL Sample 2009, then click OK:

3. If the terms list does not start with AC as above, ensure that the indexes are set with English first (also select Flags layout):

4. Select the Help > Help Topics menu and expand the Getting Started Tutorials folder:

5. Click the Online button in the lower left corner and note that you can choose to view online or local Help (online is best because it is likely to be most up to date). You should aim to work through the Tutorials before next week, but for the moment you may prefer just to follow the demonstration.

Part 2: Building a new termbase


Now that you have had a chance to explore the MultiTerm 2009 interface and have started to understand the types of data the program can contain, we will see how to create a new termbase. First, though, a reminder of some basic principles. MultiTerm is a concept-oriented termbase, which means that each entry contains data (source and target terms and information about them) which relates to a single concept. There is a crucial distinction between index fields (the different languages contained in the TB) and descriptive fields (such as Part of Speech, Gender, Definition etc.) which give information about the term. Finally, there is a hierarchy of locations in which descriptive fields can be placed within an entry. Information relating to the whole entry (such as a photo of the object it describes) is placed at Entry level (the photo applies equally to all the languages). Information specific to a given language goes at Index level, and information about the actual term in a given language goes at Term level. When you make a new TB you not only specify the name of each descriptive field, but also the level at which it will be placed.

Before closing the sample termbase, browse to an entry containing a definition. At which level is the definition? What effect does this have? Why has it been placed there? Now close the sample termbase.

1. Select Termbase > Create Termbase, give your new TB a name (e.g. Test1), select your Q:\MultiTerm folder and click Save. This launches the Termbase Wizard; read through the 5 steps, then click Next.

2. In the Termbase Definition step we will create a new definition from scratch (the default option), but note the other choices, which can save time if you have existing resources to use. You will come across the termbase definition file later in the class: when you export a TB in XML format, its structure is saved as a text-only definition file. Click Next.

3. The Termbase Name step requires you to create a Friendly Name and gives you the opportunity to add some further descriptive information, including, if you wish, a Copyright statement. The Add more button lets you change the default icon, etc. When you are ready, click Next.

4. In the Index Fields step you set the languages that your TB will contain. Check the Show sublanguages option, as this will allow you to select specific national varieties (e.g. US or UK English), as well as showing the appropriate flag (except, for political reasons, in some cases including Chinese). Select your SL, click the Add>> button to add it to the list, then do the same for your TL. Click on each of your Selected index fields and note the default Index field label which appears: you can edit this if you wish. When you are ready, click Next.

5. In step 4 of the wizard you decide which descriptive fields to include in your TB. For the moment, we will keep it simple. In the Field label box, type Definition, click the Add>> button to add it to the list, and type a short description in the Description (optional): box.

6. Click the Properties button and note that a small Properties dialog pops up:

7. Open the Data type: drop-down menu and note the different options you can select to identify the type of data the descriptive field will contain. The default (Text) is OK for our Definition field. If you were inserting a graphic, sound file or video, however, it would be important to select Multimedia File.

8. Repeat instructions 5 and 6 to create a field called Example and another one called PoS with the description Part of speech.

9. Since there are only a small number of possible parts of speech, we will set up a picklist (a drop-down list of selectable items) for this field. In addition to saving time, a picklist has the advantage that it enforces consistency: if you need to designate terms as nouns, on different occasions you might type n., n, noun etc., and to the software these would all appear to be different. With the PoS field selected, open the Data type: drop-down menu and select Picklist. Now in the Picklist editing dialog, click the icon and in the box that opens up, enter n. for noun. Click outside the editing box for n., then do the same for v. (verb), n.p. (noun phrase), v.p. (verb phrase), a. (adjective) and adv. (adverb). It is important to be comprehensive at this stage because once the termbase contains data it is impossible to modify a picklist. The other picklist editing icons allow you to delete an item and move items up and down the list.

10. If you make a mistake you can remove any descriptive field at this stage by clicking the <<Remove button. When you are ready, click Next.

11. The final step in the wizard, Entry Structure, requires you to decide at which level each descriptive field will appear. We don't have any Entry level fields, and although we might place Definition at Index level (as in the sample TB), it is probably better to place all 3 fields at Term level. To do this, in the Entry structure pane click Term level, then in the Available descriptive fields pane, click on each field in turn to select them all, then click the <<Add button. You should now have a structure like this:

12. At this point we decide that every entry must have a completed PoS field, so select PoS in the Entry Structure dialog, and in the Field settings box, check Mandatory (this means that a new entry will not be saved until the PoS field is complete) and uncheck Multiple. If you want to allow the option of more than one example per entry, select Example, check Multiple and uncheck Mandatory (so it will not be compulsory to add an example).

13. Finally, click Next and Finish to complete the wizard.

Now we can add a specimen entry.

14. Select Edit > Add New (or click the toolbar icon, or hit F3): a new entry will open in editing mode in the main termbase pane:

15. Note the flags, and that a mandatory PoS field has appeared beneath each term (we will need to add the other descriptive fields manually, when we want them). Under the English index, double-click the small empty box to the right of the drop-down arrow in the little editing toolbar to open the term editing window, and enter a sample English term.

16. Now double-click the editing box next to the English PoS field; this time a drop-down menu containing your picklist items appears, so select the appropriate one.

17. Next, click the drop-down arrow in the middle of the editing toolbar and note that it allows you to add the other descriptive fields you defined for this termbase. Add a Definition in English. On the TL side, add your term translation, select the correct PoS, add an Example, then Save the entry by clicking the Save icon, hitting F12 or selecting Edit > Save. Your new entry should look something like this (NB: a term should never be capitalized unless that is the form in which it is always found):
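If it helps to see the structure you have just built written out in another form, the following Python sketch models it under some loud assumptions: the names Term, Entry and POS_PICKLIST are invented for illustration and have nothing to do with how MultiTerm stores data internally. It only mirrors the choices made in Part 2: PoS is mandatory, single-valued and restricted to the picklist; Definition is optional; Example is optional and may occur more than once; all three sit at Term level.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Picklist items entered in step 9 of Part 2.
POS_PICKLIST = ("n.", "v.", "n.p.", "v.p.", "a.", "adv.")


@dataclass
class Term:
    """Term level: one term plus the descriptive fields attached to it."""
    text: str
    pos: str                              # mandatory, single value, picklist
    definition: Optional[str] = None      # optional, single value
    examples: List[str] = field(default_factory=list)  # optional, multiple

    def __post_init__(self) -> None:
        # A picklist enforces consistency: "noun" or "n" is rejected, only "n." passes.
        if self.pos not in POS_PICKLIST:
            raise ValueError(f"PoS {self.pos!r} is not in the picklist")


@dataclass
class Entry:
    """Entry level: one concept; each index (language) holds its own terms."""
    indexes: Dict[str, List[Term]] = field(default_factory=dict)

    def add_term(self, language: str, term: Term) -> None:
        self.indexes.setdefault(language, []).append(term)


# Illustrative usage with placeholder (invented) content.
entry = Entry()
entry.add_term("English", Term("sample term", pos="n.", definition="A placeholder definition."))
entry.add_term("French", Term("terme exemple", pos="n.", examples=["Premier exemple."]))
```

Trying Term("sample term", pos="noun") raises an error here, which is the same point the picklist makes in the wizard: to the software, n., n and noun are three different values unless you constrain them.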

Part 3: Importing terminology from Excel


Before encountering MultiTerm, you have become accustomed to collecting terminology data in Word. A Word data table can easily be pasted into Microsoft's spreadsheet application, Excel, and Excel data can in turn be imported into MultiTerm. In fact, Excel has become an informal standard for exchanging terminology data between different CAT tools, as you will see if you take the Language Technology 2 module. In this exercise we will import a small test data table in English and French, TGV termbase test.xls (don't worry if you don't know French, it's the principle that matters). The terminology data relates to the French high-speed train, or TGV, and the Excel spreadsheet looks like this (you need to start by downloading it from Bb):

The top row contains column labels which will become MultiTerm field names:

EN and FR are the index (language) fields; the others are descriptive fields (PoS, Definition and its Reference on the English side, Gender, Example and Reference in French). To import this spreadsheet into MultiTerm we need to run another wizard, called MultiTerm Convert, which allows MultiTerm to understand the structure of the Excel data. The end product of the wizard is a pair of text files, one (with extension .xdt) containing the termbase structure, the other (.xml) containing the terminological data. The structure file allows you to build an empty termbase with the correct structure to receive the data, and then import it.

1. Run MultiTerm Convert, read the text in the opening screen, and click Next.

2. In step 2, leave the default New conversion session selected and click Next.

3. Now select the Microsoft Excel format option and click Next.

4. In the Specify Files dialog, browse to the location of TGV termbase test.xls and select it as the Input file. Note that when you click the Open button, the other file location fields (Output file, Termbase definition file and Log file) are populated with default values: you can change these if you wish by clicking the appropriate Save as button. Click Next.

5. In step 5, the wizard reads the spreadsheet column headers, presents them in a list in the Available column header fields pane, and invites you to tell it what kind of data each one contains.

6. With EN selected, check Index field, open the drop-down menu, scroll down to English (United Kingdom) and click it. This tells MultiTerm Convert that the EN column contains the English terms.

7. Now move down the column header fields list and select PoS. This is a Descriptive field, so leave the default checked, but open the drop-down menu and select Picklist (as in the previous part of this workbook, though we will need to set up the picklist items once we have made the termbase).

8. Continue down the list of column header fields, leaving them all identified as the default (descriptive fields with Text content), except FR, which is the French Index field, and Gender, which is a descriptive field with a picklist. When you have finished, click Next.

9. In step 6 we will tell the wizard to which level all the fields belong. Exactly as you did in Part 2, add PoS, Definition en and Ref en at English Term level and Gender, Example fr and Ref fr at French Term level:

10. Click Next, read the summary of the conversion process, click Next again and wait while the wheels of the program convert your data. If all is well you should see the notification that 7 entries were successfully converted; if so, click Next, then Finish to complete the wizard.

11. Now we will use the termbase definition (.xdt) file to create a new empty termbase, then import the data from the .xml file into it.

12. Run MultiTerm, select Termbase > Create Termbase, give your new TB a name (e.g. TGV Test), select your Q:\MultiTerm folder and click Save to launch the Termbase Wizard, then click Next.

13. In the Termbase Definition dialog, this time check Load an existing termbase definition file and browse to the location on Q of TGV termbase test.xdt. Select it, click Open, then Next.

14. Enter the Friendly Name TGV, click Next and note that the languages (index fields) are already correctly selected. At this stage you could add further languages to your termbase if you wished; however, for the moment click Next.

15. Again, note that all the required descriptive fields are already listed, but you could choose to add new ones if you wished. Select those that you wish to make Mandatory (e.g. PoS) and/or Multiple (you wouldn't want multiple Definitions in a single entry, but what about Examples? If so, you'd also need multiple References for them), then click Next.

16. The wizard is now complete and your empty termbase structure has been created. Now you need to import the data.

17. Click the Catalog tab, then in Catalog categories select Import. In the right-hand pane, right-click on Default input definition, then select Process to launch the Import Wizard. In the Import file: dialog, browse to the location of your XML data file, select and Open it, then click Next.

18. In step 3, enter any name for the Exclusion file (which is where rejected entries would be stored), click Next, read the summary, then Next again. In step 8, watch while the import is processed, then click Next again, and Finish to complete the wizard. As the wizard closes, you should be taken to your new termbase:

19. You should be able to navigate it in exactly the same way as the sample termbase we started with, and add entries etc. if you like.
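To make the logic of the conversion more concrete, here is a rough Python sketch of what you told MultiTerm Convert in steps 6 to 9: which columns are index fields, and which descriptive fields hang off which language at Term level. It is not how MultiTerm Convert works internally and does not produce MultiTerm's XML. It assumes the spreadsheet has been saved as CSV text, and the sample row, the function names and the Gender value m. are all invented for illustration.

```python
import csv
import io

# Column roles chosen in MultiTerm Convert (steps 6-9 above).
INDEX_COLUMNS = {"EN": "English", "FR": "French"}
TERM_LEVEL_FIELDS = {
    "EN": ["PoS", "Definition en", "Ref en"],
    "FR": ["Gender", "Example fr", "Ref fr"],
}

# Invented sample data: one header row and one placeholder entry.
SAMPLE = (
    "EN,PoS,Definition en,Ref en,FR,Gender,Example fr,Ref fr\n"
    "sample term,n.,A placeholder definition.,placeholder ref,terme exemple,m.,,\n"
)


def row_to_entry(row):
    """Group one spreadsheet row into {language: {"term": ..., "fields": {...}}}."""
    entry = {}
    for column, language in INDEX_COLUMNS.items():
        term = (row.get(column) or "").strip()
        if not term:
            continue  # no term in this language on this row
        fields = {
            name: row[name].strip()
            for name in TERM_LEVEL_FIELDS[column]
            if (row.get(name) or "").strip()  # skip empty cells
        }
        entry[language] = {"term": term, "fields": fields}
    return entry


def read_entries(csv_text):
    """Read CSV text whose first row holds the column labels."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_entry(row) for row in reader]


entries = read_entries(SAMPLE)
```

Each spreadsheet row becomes one entry (one concept), with the English and French terms each carrying only their own descriptive fields; empty cells, such as the missing French example in the sample row, simply produce no field.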

Part 4: Exporting termbase data


Just as a MultiTerm termbase can be created out of two text files, the XDT structure file and XML data file, so it is possible to export the structure and data of any MultiTerm termbase back into these text file formats. This is a valuable backup option, because if anything goes wrong with your termbase you can always reconstruct it from the text files. You should therefore get into the habit of doing regular exports, particularly as it is a very easy procedure.

1. With your TGV termbase still open, click the Catalog tab, then in Catalog categories, right-click on Definition, then Save. This will save the termbase definition as an XDT file to a location of your choice.

2. Now we will export the termbase data. With the Catalog still active, select Export. Right-click the Default export definition (note also the other options, particularly the Word Dictionary export) and select Process.

3. In the Export Settings dialog, click Save As to give your Export file a name and location, then Save.

4. Click Sort termbase content by index field and open the drop-down list of indexes to select the one you wish to use.

5. Click Next to perform the export, and Next again to complete the wizard.

If you wanted to send your termbase to another translator, for instance, all you would need to do is attach the XDT and XML files to an email and they would be able to reconstruct it. Finally, you might like to experiment with some of the other export options: try the Word dictionary, for instance. TBX (TermBase eXchange) is set to become a termbase interchange standard, and is another XML-based file format.
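If you want to make regular exports a genuine habit, one option is to keep dated copies of the exported files. The short Python sketch below is only a suggestion, not part of MultiTerm: it copies every .xdt and .xml file from a folder of exports into a subfolder named after the date. The function name and the folder paths in the usage note are hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_exports(export_dir, backup_root, day=None):
    """Copy every .xdt and .xml file in export_dir into backup_root/YYYY-MM-DD/."""
    day = day or date.today()
    target = Path(backup_root) / day.isoformat()
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(export_dir).iterdir()):
        # Only the structure (.xdt) and data (.xml) files are needed to rebuild a termbase.
        if path.is_file() and path.suffix.lower() in {".xdt", ".xml"}:
            copied.append(Path(shutil.copy2(path, target)))
    return copied
```

You might call it as backup_exports(r"Q:\MultiTerm\Exports", r"Q:\MultiTerm\Backups"), where both folder names are examples only; each run on a new day creates a new dated folder, so older exports are not overwritten.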
