
Human Translation vs. Machine Translation: Rise of the Machines


Abdelhak Jebbar
Different approaches to automatic evaluation of machine translation (MT) quality are considered. We describe several methods for automatic evaluation of MT, such as methods based on string matching and n-gram models. The candidate translations done by Google and PROMPT are compared with the reference translation by an automatic translation evaluation program and the results of the evaluation are presented.

Keywords: automatic evaluation, quality of translation, machine translation, BLEU, F-measure, TER.

1. Introduction
The idea of machine translation (MT) of natural languages first appeared in the seventeenth century, but became a reality only at the end of the twentieth century. Today, computer programs are widely used to automate the translation process. Although great progress has been made in the field of machine translation, fully automated translations are far from being perfect. Nevertheless, countries continue spending millions of dollars on various automatic translation programs. In the early 1990s, the U.S. government sponsored a competition among MT systems. Perhaps one of the valuable outcomes of that enterprise was a corpus of manually produced numerical evaluations of MT quality, with respect to a set of reference translations [1]. The development of MT systems has given impetus to a large number of investigations, thereby encouraging many researchers to seek reliable methods for automatic MT quality evaluation. Machine translation evaluation serves two purposes: the relative estimate allows one to find out whether one MT system is better than another, and the absolute estimate (having a value ranging from 0 to 1) gives an absolute measure of efficiency (for example, a value equal to unity means a perfect translation). However, the development of appropriate methods for numerical MT quality evaluation is a challenging task. In many fields of science, measurable efficiency indices exist, such as, for example, the difference between the predicted and actually observed results. Since natural languages are complicated, an assessment of translation correctness is extremely difficult. Two completely different sequences of words (sentences) can be fully equivalent (e.g., "There is a vase on the table" and "The vase is on the table"), and two sequences that differ by a small detail can have completely different meanings (e.g., "There is no vase on the table" and "There is a vase on the table").
Traditionally, the bases for evaluating MT quality are adequacy (the translation conveys the same meaning as the original text) and fluency (the translation is correct from the grammatical point of view). Most modern methods of MT quality assessment rely on reference translations. Early approaches to scoring a candidate text with respect to a reference text were based on the idea of similarity of a candidate text (the text translated by an MT system) and a reference text (the text translated by a professional translator), i.e., the similarity score was to be proportional to the number of matching words [2]. At about the same time, a different idea was put forward. It was based on the fact that matching words in the right order in the candidate and reference sentences should have higher scores than matching words out of order [3]. Perhaps the simplest version of the same idea is that a candidate text should be rewarded for containing longer contiguous subsequences of matching words. Papineni et al. [4] reported that a particular version of this idea, which they call BLEU, correlates very highly with human judgments. Doddington [5] proposed another version of this idea, now commonly known as the NIST score. Although the BLEU and NIST measures might be useful for comparing the relative quality of different MT outputs, it is difficult to gain insight from such measures [6]. In this paper we consider different methods of MT quality assessment and analyze the translations of candidate and reference texts. In the following sections, we describe several automatic MT evaluation methods: some of them are based on string matching, others, such as n-gram models, are based on the use of information retrieval. Next, we will assess the quality of translation by using an automatic program.

2. Methods of automatic MT quality evaluation

To date, the main approach to the quality assessment of language models for MT systems relies on the use of statistical methods. In this case, the model is, in fact, a probability distribution on a set of all sentences of a language. Naturally, it is impossible to employ the model in this way; therefore, use is made of more compact algorithms. Let us briefly consider what models are currently used in commercial and experimental systems of MT quality assessment with unlimited dictionaries.

2.1 Method of approximate string matching


In computer science, approximate string matching (often colloquially referred to as fuzzy string searching) is a technique of finding strings that match a pattern approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding an approximate substring inside a given string and finding dictionary strings that match the pattern approximately [7]. The word error rate (WER) is a metric based on this approach. The WER is calculated as the sum of insertions, deletions, and substitutions, normalized by the length of the reference sentence. If the WER is equal to zero, the translation is identical to the reference text. The main problem lies in the fact that the resulting estimate is not always in the range from 0 to 1. In some cases, when the translation is wrong, the WER can be greater than 1. Another version of the WER is the WERg metric, in which the sum of insertions, deletions and substitutions is normalized by the length of the Levenshtein alignment path, i.e., the total number of aligned positions (matches plus edits). In information theory and computational linguistics, the Levenshtein distance (or edit distance) between two strings is defined as the minimum number of edits needed to transform one string into the other, with allowable edit operations being insertion, deletion, or substitution of a single character [8]; for the WER, the units are words rather than characters. The advantage of the WERg metric is that the value of the translation quality will always be in the range from 0 to 1 (even when nothing in the candidate matches the reference, or when the translation is missing, the value does not exceed unity). Experiments performed by Blattsom et al. have shown that the WERg metric is not reliable and does not agree with the estimates obtained when the machine translation is analyzed by humans [9]. The position-independent error rate (PER) neglects the order of the words in the string matching operation.
In this case, the difference between the candidate text and the reference text, normalized by the length of the reference translation, is calculated [10]. Another metric that is widely used in assessing the translation quality is the translation error rate (TER). This metric makes it possible to measure the number of edits required to change a system output into one of the given reference translations [11].

In fact, any string matching metric can be used for assessing the MT quality. One such example is the string kernel, which allows one to take into account different levels of natural language (e.g., morphological, lexical, etc.), or the relationship between synonyms [12].
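The error rates described in this subsection can be illustrated with a minimal Python sketch. It assumes whitespace tokenization and a single reference; the function names are hypothetical, and the PER formula shown is one common formulation rather than the definition used in [10].

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (insert, delete, substitute, cost 1 each)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a candidate word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(reference, candidate):
    """Word error rate: edits normalized by the reference length (can exceed 1)."""
    ref, hyp = reference.split(), candidate.split()
    return edit_distance(ref, hyp) / len(ref)


def per(reference, candidate):
    """Position-independent error rate: word order is ignored (assumed formulation)."""
    ref, hyp = reference.split(), candidate.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "there is a vase on the table"
candidate = "the vase is on the table"
print(wer(reference, candidate))  # 4 edits over 7 reference words
print(per(reference, candidate))  # 2 errors over 7 reference words
print(wer("a", "b c d"))          # 3.0, a WER greater than 1
```

The last call shows the problem noted above: a candidate much longer than the reference pushes the WER beyond unity.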

2.2 N-gram models


In n-gram language models, use is made of an explicit assumption that the probability of the next word in a sentence depends on the previous n-1 words. In practice, the models with n = 1, 2, 3 and 4 are used. For the English language, the most successful are three-gram or four-gram models. Today, almost all systems of MT quality assessment rely on n-gram models. In this case, the probability of the whole sentence is calculated as the product of the probabilities of its constituent n-grams. The main advantages of n-gram models are their relative simplicity and the possibility of constructing a model that can be trained on a sufficiently large corpus of a language. However, such models are not devoid of drawbacks. The n-gram models make it impossible to simulate semantic and pragmatic relationships in a language. In fact, if a dictionary contains N words, the number of possible pairs of words will be N^2. Even if only 0.1% of them actually occur in the language, the minimum volume of the language corpus, necessary to obtain statistically valid estimates, will amount to 125 billion words or about 1 terabyte. For three-gram models, the minimum corpus will reach hundreds of thousands of terabytes [13]. To overcome these drawbacks, use is made of well-developed smoothing techniques, which enable the assessment of the model parameters under the conditions of insufficient or non-existent data. The main metrics based on n-grams are BLEU, NIST, F-measure, and METEOR. BLEU (Bilingual Evaluation Understudy) is an algorithm for automatic evaluation of the quality of a machine translation, which is compared to the reference translation, using n-grams. This metric of MT quality assessment was first proposed and implemented by Papineni et al. [4]. Measuring translation quality is a challenging task, primarily due to the lack of definition of an absolutely correct translation.
The most common technique of translation quality assessment is to compare the output of automated and human translations of the same document. But this is not as simple as it may seem: one translator's translation may differ from that of another translator. This inconsistency between different reference translations presents a serious problem, especially when different reference translations are used to assess the quality of automated translation solutions. A document translated by specially designed automated software can have a 60% match with the translation done by one translator and a 40% match with that of another translator. Although both professional translations are technically correct (they are grammatically correct, they convey the same meaning, etc.), the 60% overlap of words would be taken as a sign of higher MT quality. Thus, although reference translations are used for comparison, they cannot be a completely objective and consistent measurement of the MT quality. The BLEU metric scores the MT quality on a scale from 0 to 1. The closer the score to unity, the greater is the overlap with the reference translation and, therefore, the better the MT system. In short, the BLEU metric measures how many words coincide in the same line, with the best score given not to isolated matching words but to matching word sequences. For example, a string of four words in the translation that matches the human reference translation (in the same order) will have a positive impact on the BLEU score and is weighted more heavily (and scored higher) than a one- or two-word match [14]. The NIST (National Institute of Standards and Technology) precision measure is a metric used to evaluate the MT variants [5]. NIST was intended as an improved version of BLEU. In this case, the arithmetic mean of n-grams is calculated. An important difference from the BLEU metric is the fact that NIST also relies on the frequency component (precision and recall). While BLEU simply calculates the n-gram precision by adding an equal weight for each exact match, NIST also calculates how informative each matching n-gram is. For example, even if the bigram "on the" coincides with the same phrase in the reference text, the translation still receives a lower score than for a correct match of the bigram "size distribution", because the latter phrase is less likely to occur. The F-measure is a metric which calculates the harmonic mean of precision and recall [15]. The metric is based on the search for the best match between the candidate and reference translations (the ratios of the total number of matching words to the length of the translation and to the length of the reference text). Sometimes it is useful to combine the precision and recall into a single averaged value [16]. The metric for evaluation of translation with explicit ordering (METEOR) is an improved version of the F-measure [17]. This system was designed to address some of the weaknesses in the BLEU metric. METEOR scores the output by matching the automated and reference translations word-for-word. When more than one reference translation is available, the automated translation is compared with each of them and the best result is reported [18]. One can have different attitudes to the different metrics, but at this point BLEU, METEOR and NIST are most widely used. It is against these metrics that all the other MT quality assessment systems are compared. The developers of the F-measure claim that their metric shows the best agreement with the assessment made by a human [15]. However, this is not always the case. The F-measure does not work well with the smallest average edit distance [9]. Empirical data show that more attention should be paid to the completeness (recall) of the translation. Studies suggest that the recall is most often the parameter that determines the quality of translation [17].
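As a rough illustration of the n-gram precision behind BLEU and of the unigram F-measure, consider the following Python sketch. It assumes lowercased whitespace tokenization, a single reference, and sentence-level scoring without smoothing; the function names are hypothetical, and the published BLEU of Papineni et al. [4] is computed over a whole test corpus, so this is a simplification rather than the reference implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(cand, ref, n):
    """Number of candidate n-grams also found in the reference (clipped), and their total."""
    c, r = ngrams(cand, n), ngrams(ref, n)
    return sum((c & r).values()), sum(c.values())


def bleu(candidate, reference, max_n=4):
    """Geometric mean of the 1..max_n n-gram precisions times a brevity penalty."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    log_sum = 0.0
    for n in range(1, max_n + 1):
        matched, total = clipped_matches(cand, ref, n)
        if matched == 0:  # without smoothing, one empty order zeroes the score
            return 0.0
        log_sum += math.log(matched / total) / max_n
    penalty = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return penalty * math.exp(log_sum)


def f_measure(candidate, reference):
    """Harmonic mean of unigram precision and recall."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    matched, _ = clipped_matches(cand, ref, 1)
    if matched == 0:
        return 0.0
    precision, recall = matched / len(cand), matched / len(ref)
    return 2 * precision * recall / (precision + recall)


reference = "the vase is on the table"
print(bleu("the vase is on the table now", reference))
print(bleu("there is a vase on the table", reference))
print(f_measure("there is a vase on the table", reference))
```

The second call returns zero because the two sentences share no four-word sequence, although the unigram F-measure of the same pair is well above zero; this is the sense in which BLEU rewards matching sequences rather than matching words.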

3. Automatic evaluation of the quality of statistical (Google) and rule-based (Prompt) MT systems
Translation is an intellectual challenge, and, therefore, skepticism about the possibility of using a computer for automated translation is quite natural. However, the creators of MT systems have managed to endow their systems with a form of understanding, and machine translation now belongs to a class of artificial intelligence programs. Currently, we can speak of two approaches to written translation: the first one is machine translation based on the rules of the source and target languages and the second approach involves statistical machine translation. The earliest machine translation engines were all based on the direct, so-called "transformer", approach. Input sentences of the source language were transformed directly into output sentences of the target language, using a simple form of parsing. The parser did a rough analysis of the source sentence, dividing it into subject, object, predicate, etc. Source words were then replaced by target words selected from a dictionary, and their order was rearranged so as to comply with the rules of the target language. This approach was used for a long time, only to be finally replaced by a less direct approach, which is called the "linguistic knowledge" approach. Modern computers, which have more processing power and more memory, can do what was impossible in the 1960s. Linguistic-knowledge translators have two sets of grammar rules: one for the source language, and the other for the target language. In addition, modern systems analyze not only the grammar (morphological and syntactic structure) of the source language but also the semantic information. They also have information about the idiomatic differences between the languages, which prevents them from making silly mistakes. A representative of the rule-based approach to machine translation is the Prompt software developed by the leading Russian developer of linguistic IT solutions.
The second approach is based on a statistical method: by analyzing a large number of parallel texts (identical texts in the source and target languages), the program selects the variants that coincide most often and uses them in the translation. It does not apply grammatical rules, since its algorithms are based on statistical analysis rather than traditional rule-based analysis. In addition, the lexical units here are word combinations, rather than separate words. One of the well-known examples of this approach is Google Translate, which is based on an approach called statistical machine translation. However, the translated sentences are sometimes so discordant that it is impossible to understand them [19]. In this section, using concrete examples, we will compare the quality of translations made by such MT systems as Google (http://translate.google.ru/) and Prompt (www.translate.ru). For the analysis, we selected five titles, abstracts, and keywords from the Kvantovaya Elektronika journal [20], which is first published in Russian and then translated into English by a group of professional translators.

[Texts 1-5: Russian source titles, abstracts, and keywords; the Cyrillic text is not recoverable.]

The corresponding translations were taken from http://iopscience.iop.org/1063-7818/42/2 . For an automatic analysis, we used the relevant software that is publicly available from http://www.languagestudio.com/LanguageStudioDesktop.aspx#Pro. Language Studio™ Lite is a free tool that provides key metrics for translation quality. This tool can be used to measure not only the quality, but also the improvements in quality because custom translation engines are constantly being updated via the quality improvement feedback cycle. Language Studio™ Lite currently supports such metrics as BLEU, F-Measure, and TER. From the point of view of syntax, the abstracts presented for the analysis are characterized mainly by simple sentences, i.e., "something is presented" or "something is investigated". Besides, compound sentences with an object clause are used most frequently, for example, "it is shown that ..." or "it is found that ...". As to the vocabulary, translators most often use one-word terms ("waveguide"), two-word terms ("light wave", "uncertainty relation"), and three-word terms ("target material droplets"), whereas four-word terms ("crystal-like spatially periodic structure") are extremely rare. For the program to correctly score the translations, we preprocessed the reference translations and the candidate translations made by Google and PROMPT. Each sentence started a new paragraph, and the texts were converted into .txt format. Initially, we compared the reference translation and the outputs from Google and PROMPT, using n-gram metrics. The results of the translation evaluation summary are presented below.

Translation Evaluation Summary

Job Start Date: 9/18/2012 1:52 PM
Job End Date: 9/18/2012 1:52 PM
Job Duration: 0 min(s) 2 sec(s)
Reference File: reference_1.txt
Candidate File: candidate_google_1.txt
Evaluation Lines: 28
Tokenization Language: EN
Results Summary: 62.554

Reference | Candidate | Gram | Gram | Gram | Gram | Score

1. Reference: Evolution of the distribution function of Au nanoparticles in a liquid under the action of laser radiation
   Candidate: The evolution of the distribution function of Au nanoparticles in a liquid under the action of laser radiation
   1- to 4-gram matches: 18/19, 16/18, 15/17, 14/16. Score: 92.490

2. Reference: Abstract.
   Candidate: Abstract.
   1- to 4-gram matches: 3/3, 2/2, 1/1, 0/0. Score: 100.000

3. Reference: Fragmentation of nanoparticles in a liquid under the action of pulsed laser heating is studied theoretically and experimentally.
   Candidate: Studied theoretically and experimentally the process of fragmentation of nanoparticles in a liquid under the action of pulsed laser heating.
   1- to 4-gram matches: 19/22, 15/21, 13/20, 11/19. Score: 73.753

4. Reference: Fragmentation is simulated by solving the kinetic equation for the nanoparticle size distribution function, taking into account the temperature dependence of the thermophysical parameters of the medium.
   Candidate: Simulation of the process carried out by solving the kinetic equation for the distribution function of nanoparticles size, taking into account the temperature dependence of the thermophysical parameters of the medium.
   1- to 4-gram matches: 27/34, 22/33, 18/32, 16/31. Score: 67.629

5. Reference: It is shown that fragmentation occurs after separation of smaller fragments from a molten nanoparticle.
   Candidate: It is shown that fragmentation occurs after separation from the molten fragments of smaller nanoparticles.
   1- to 4-gram matches: 15/17, 9/16, 7/15, 6/14. Score: 58.502

6. Reference: The simulation results are in good agreement with experimental data obtained in the fragmentation of gold nanoparticles irradiated in water by a copper vapor laser with a peak radiation intensity of about ##.
   Candidate: The simulation results are in good agreement with experimental data obtained in the fragmentation of gold nanoparticles in water under irradiation of copper vapor laser with a peak intensity of the radiation in the environment of about ##.
   1- to 4-gram matches: 33/41, 28/40, 23/39, 20/38. Score: 69.933

7. Reference: Keywords: nanoparticles, colloids, laser ablation of metals, plasmon resonance, fragmentation.
   Candidate: Keywords: nanoparticles, colloids, laser ablation of metals, plasmon resonance, the fragmentation.
   1- to 4-gram matches: 17/18, 15/17, 13/16, 12/15. Score: 88.671

8. Reference: Interaction of noncollinear femtosecond laser filaments in sapphire
   Candidate: Noncollinear interaction of femtosecond laser filaments in sapphire
   1- to 4-gram matches: 9/9, 5/8, 3/7, 2/6. Score: 59.673

9. Reference: Abstract.
   Candidate: Abstract.
   1- to 4-gram matches: 3/3, 2/2, 1/1, 0/0. Score: 100.000

10. Reference: The interaction of two coherent femtosecond laser pulses, propagating at a small angle with respect to each other in a sapphire crystal in the filamentation regime, has been investigated numerically and experimentally.
   Candidate: Numerically and experimentally investigated the interaction of two coherent femtosecond laser pulses propagating at a small angle to each other in the sapphire crystal in the regime of filamentation.
   1- to 4-gram matches: 29/31, 19/30, 14/29, 9/28. Score: 54.746

11. Reference: Distributions of the fluence and free-electron density in the laser-plasma channels formed in the crystal are obtained.
   Candidate: Obtained by dividing the distribution of the surface energy density and the concentration of free electrons in laser-produced plasma channels.
   1- to 4-gram matches: 15/24, 3/23, 0/22, 0/21. Score: 19.291

12. Reference: Additional filaments are found to form outside the plane of initial pulse propagation.
   Candidate: Revealed the formation of additional filaments outside the plane of the initial distribution of momenta.
   1- to 4-gram matches: 9/17, 4/16, 2/15, 1/14. Score: 26.224

13. Reference: Keywords: filamentation, femtosecond radiation, laser plasma, filament interaction.
   Candidate: Keywords: filamentation, femtosecond radiation, laser plasma interaction of filaments.
   1- to 4-gram matches: 12/14, 9/13, 8/12, 7/11. Score: 71.312

14. Reference: Influence of an electric field on near-surface processes in laser processing of metals
   Candidate: Effect of electric field on the near-surface processes during laser processing of metals
   1- to 4-gram matches: 13/16, 8/15, 5/14, 2/13. Score: 46.421

15. Reference: Abstract.
   Candidate: Abstract.
   1- to 4-gram matches: 3/3, 2/2, 1/1, 0/0. Score: 100.000

16. Reference: It is shown that by varying the external electric field with different polarity from 0 to ## in the course of laser processing with the mean radiation flux density ## the change in the evolution features of the plasma torch at the surface of some metals (Cu, Al, Sn, Pb) at early stages is quantitative rather than qualitative.
   Candidate: It is shown that when the external electric field of opposite polarity from 0 to ## during the action of laser radiation with an average flux density of ## at the surface of some metals (Cu, Al, Sn, Pb) modified features of the evolution of the plasma torch in the early stages is quantitative, not qualitative.
   1- to 4-gram matches: 54/66, 39/65, 27/64, 20/63. Score: 53.525

17. Reference: At the same time the characteristic size of the target material droplets, carried out from the irradiated zone, becomes essentially (by several times) smaller as the amplitude of the external electric field strength grows, independently of its polarity.
   Candidate: At the same time, the characteristic droplet size of the target material, made from the irradiated zone, significantly (several times) decrease with increasing amplitude of the external electric field, regardless of its polarity.
   1- to 4-gram matches: 33/41, 23/40, 16/39, 10/38. Score: 48.883

18. Reference: Keywords: laser radiation, electric field, plasma formation, gravity-capillary waves.
   Candidate: Keywords: laser light, electric field, plasma formation, gravity-capillary waves.
   1- to 4-gram matches: 16/17, 14/16, 12/15, 10/14. Score: 83.262

19. Reference: On associations of noninteracting particles (crystal-like neutron structures)
   Candidate: On associations of non-interacting particles (neutron crystal-structure)
   1- to 4-gram matches: 10/14, 5/13, 2/12, 1/11. Score: 35.248

20. Reference: Abstract.
   Candidate: Abstract.
   1- to 4-gram matches: 3/3, 2/2, 1/1, 0/0. Score: 100.000

21. We discuss the physical feasibility 21. of association with of noninteracting uncertainty each

We

discuss

the

physical 33/39 27/38 22/37 17/36 66.469

particles realizability of association of nonother, interacting particles with each other, under uncertainty in the corporate spatial a whole.

which arises in accordance with the which arises in accordance with the relation the corporate spatial confinement limitation of the particle ensemble as of the particle ensemble as a whole.

22. Reference: Investigation is conducted by the example of an ensemble of ultracold neutrons placed in a common potential well of infinite depth.
   Candidate: Examination conducted by the example of an ensemble of ultracold neutrons placed in a common potential well of infinite depth.
   1- to 4-gram matches: 21/22, 19/21, 18/20, 17/19. Score: 89.173

23. Reference: We present quantitative estimates and indicate the expected properties of the arising crystal-like spatially periodic structures.
   Candidate: Quantitative estimates and expectations are the properties of the crystal-space-periodic structures.
   1- to 4-gram matches: 10/14, 5/13, 2/12, 0/11. Score: 25.854

24. Reference: Keywords: quantum nucleonics, ultracold neutrons, laser methods of production of ultracold neutrons, neutron associations, and neutrons in the potential well of infinite depth.
   Candidate: Keywords: quantum nucleonics, ultracold neutrons, laser methods of production of ultracold neutrons, neutron Association, the neutrons in the potential well of infinite depth.
   1- to 4-gram matches: 28/30, 25/29, 23/28, 21/27. Score: 84.865

25. Reference: Elliptically polarized cnoidal waves in a medium with spatial dispersion of cubic nonlinearity
   Candidate: Elliptically polarized cnoidal waves in media with spatial dispersion of cubic nonlinearity
   1- to 4-gram matches: 12/13, 10/12, 8/11, 6/10. Score: 73.900

26. Reference: Abstract.
   Candidate: Abstract.
   1- to 4-gram matches: 3/3, 2/2, 1/1, 0/0. Score: 100.000

27. We present new specific analytic 27. We present new analytic solutions 56/63 45/62 36/61 29/60 67.754 solutions of a system of nonlinear of Schrodinger corresponding polarized isotropic spatial frequency conditions each of cnoidal gyrotropic dispersion and dispersion of the to partial system to of an nonlinear equations, elliptically in an with cubic waves of equations, Schrdinger elliptically corresponding in an polarized with isotropic cubic spatial medium of under of waves cnoidal gyrotropic dispersion and

medium

nonlinearity

second-order nonlinearity

frequency

the dispersion of the second order under the the conditions of formation of the

formation circularly

waveguides of the same type for waveguides single profile for each of polarized the circularly polarized components of the light field. components of the light field.

28. Reference: Keywords: cubic nonlinearity, spatial dispersion, nonlinear Schrodinger equations, elliptic polarization, cnoidal waves.
   Candidate: Keywords: cubic nonlinearity, spatial dispersion, nonlinear Schrodinger equation, elliptic polarization, the cnoidal wave.
   1- to 4-gram matches: 17/20, 13/19, 11/18, 9/17. Score: 68.713

-- Report End --
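The four fractions reported for each line appear to be clipped n-gram match counts (n = 1 to 4) over the number of candidate n-grams. The Python sketch below checks this reading on line 8 of the Google report. It assumes lowercased whitespace tokenization with the line number kept as a token; the function names are hypothetical, the two sentences are as read from the report, and the way the tool derives its Score column is not reproduced here.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def match_fractions(reference, candidate, max_n=4):
    """(matched, total) candidate n-grams for n = 1..max_n, with clipped counts."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    fractions = []
    for n in range(1, max_n + 1):
        c, r = ngram_counts(cand, n), ngram_counts(ref, n)
        fractions.append((sum((c & r).values()), sum(c.values())))
    return fractions


reference = "8. Interaction of noncollinear femtosecond laser filaments in sapphire"
candidate = "8. Noncollinear interaction of femtosecond laser filaments in sapphire"
print(match_fractions(reference, candidate))
# [(9, 9), (5, 8), (3, 7), (2, 6)]
```

These counts agree with the 9/9, 5/8, 3/7 and 2/6 listed for that line: every word of the candidate occurs in the reference, but the reordering of the first words breaks most of the longer sequences.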

Translation Evaluation Summary

Job Start Date: 9/18/2012 1:53 PM
Job End Date: 9/18/2012 1:53 PM
Job Duration: 0 min(s) 2 sec(s)
Reference File: reference_1.txt
Candidate File: candidate_prompt_1.txt
Evaluation Lines: 28
Tokenization Language: EN
Results Summary: 35.528

Reference | Candidate | Gram | Gram | Gram | Gram | Score

1. Reference: Evolution of the distribution function of Au nanoparticles in a liquid under the action of laser radiation
   Candidate: Evolution of function of distribution of nanoparticles of Au in liquid under the influence of laser radiation
   1- to 4-gram matches: 15/18, 8/17, 3/16, 0/15. Score: 37.286

2. Reference: Abstract.
   Candidate: Summary.
   1- to 4-gram matches: 2/3, 0/2, 0/1, 0/0. Score: 22.222

3. Reference: Fragmentation of nanoparticles in a liquid under the action of pulsed laser heating is studied theoretically and experimentally.
   Candidate: Theoretically also process of fragmentation of nanoparticles in liquid under the influence of pulse laser heating is experimentally investigated.
   1- to 4-gram matches: 15/21, 7/20, 4/19, 1/18. Score: 34.101

4. Reference: Fragmentation is simulated by solving the kinetic equation for the nanoparticle size distribution function, taking into account the temperature dependence of the thermophysical parameters of the medium.
   Candidate: Modeling of process is carried out on the basis of the solution of the kinetic equation for function of distribution of nanoparticles in the sizes taking into account temperature dependence of heatphysical parameters of the environment.
   1- to 4-gram matches: 22/38, 10/37, 5/36, 1/35. Score: 28.465

5. Reference: It is shown that fragmentation occurs after separation of smaller fragments from a molten nanoparticle.
   Candidate: It is shown that fragmentation occurs through separation from the melted nanoparticle of fragments of the smaller size.
   1- to 4-gram matches: 14/20, 6/19, 5/18, 4/17. Score: 41.519

6. Reference: The simulation results are in good agreement with experimental data obtained in the fragmentation of gold nanoparticles irradiated in water by a copper vapor laser with a peak radiation intensity of about ##.
   Candidate: Results of modeling are in a good consent with the experimental data received at fragmentation of nanoparticles of gold in water under the influence of radiation of the laser on pairs of copper at peak intensity of radiation in the environment of ~##.
   1- to 4-gram matches: 27/47, 9/46, 1/45, 0/44. Score: 22.454

7. Reference: Keywords: nanoparticles, colloids, laser ablation of metals, plasmon resonance, fragmentation.
   Candidate: Keywords: nanoparticles, colloidal solutions, laser ablyatsiya of metals, plazmonny resonance, fragmentation.
   1- to 4-gram matches: 14/18, 10/17, 6/16, 3/15. Score: 50.002

8. Reference: Interaction of noncollinear femtosecond laser filaments in sapphire
   Candidate: Interaction of not collinear femtosekundny laser filament in sapphire
   1- to 4-gram matches: 6/10, 3/9, 1/8, 0/7. Score: 27.947

9. Reference: Abstract.
   Candidate: Summary.
   1- to 4-gram matches: 2/3, 0/2, 0/1, 0/0. Score: 22.222

10. Reference: The interaction of two coherent femtosecond laser pulses, propagating at a small angle with respect to each other in a sapphire crystal in the filamentation regime, has been investigated numerically and experimentally.
   Candidate: Chislenno also experimentally investigated interaction of two coherent femtosekundny laser impulses extending under a small corner to each other in a crystal of sapphire in a mode of a filamentatsiya.
   1- to 4-gram matches: 19/32, 8/31, 5/30, 3/29. Score: 26.357

11. Reference: Distributions of the fluence and free-electron density in the laser-plasma channels formed in the crystal are obtained.
   Candidate: Conflicts of division of superficial density of energy and concentration of free electrons in being formed laser and plasma channels are received.
   1- to 4-gram matches: 12/24, 1/23, 0/22, 0/21. Score: 13.877

12. Reference: Additional filaments are found to form outside the plane of initial pulse propagation.
   Candidate: Formation of additional filament out of the plane of initial distribution of impulses is revealed.
   1- to 4-gram matches: 7/17, 3/16, 2/15, 1/14. Score: 21.432

13. Reference: Keywords: filamentation, femtosecond radiation, laser plasma, filament interaction.
   Candidate: Keywords: Filamentatsiya, femtosekundny radiation, laser plasma, interaction of filament.
   1- to 4-gram matches: 12/15, 6/14, 4/13, 2/12. Score: 44.149

14. Influence of an electric field on 14. Influence of electric field on 11/13 7/12 near-surface processes in laser pripoverkhnostny processes at laser processing of metals processing of metals

4/11

1/10

42.105

15. Abstract.

15. Summary.

2/3

0/2

0/1

0/0

22.222

16. It is shown that by varying the 16. It is shown that at change of 48/76 28/75 17/74 10/73 36.477 external electric field with different intensity of external electric field of polarity from 0 to ## in the course of various polarity from 0 to 106 ## in a laser processing with the mean course impact of laser radiation with radiation flux density ## the change average density of a flow of ~## on in the evolution features of the a surface of a number of metals (Cu, plasma torch at the surface of some Al, Sn, Pb) change of features of metals (Cu, Al, Sn, Pb) at early stages evolution of a plasma torch at early is quantitative rather than qualitative. stages carries quantitative, instead of qualitative character.

17. Reference: At the same time the characteristic size of the target material droplets, carried out from the irradiated zone, becomes essentially (by several times) smaller as the amplitude of the external electric field strength grows, independently of its polarity.
Candidate: At the same time the characteristic sizes of drops of substance of the target, taken out of the irradiated zone, essentially (several times) decrease at increase in amplitude of intensity of external electric field irrespective of its polarity.
N-gram matches (1- to 4-gram): 30/44, 21/43, 12/42, 6/41. Score: 39.596

18. Reference: Keywords: laser radiation, electric field, plasma formation, gravity-capillary waves.
Candidate: Keywords: laser radiation, electric field, plazmoobrazovaniye, gravitational and capillary waves.
N-gram matches (1- to 4-gram): 13/16, 10/15, 8/14, 6/13. Score: 60.731

19. Reference: On associations of noninteracting particles (crystal-like neutron structures)
Candidate: About associations of noninteracting particles (kristallopodobny neutron structures)
N-gram matches (1- to 4-gram): 9/11, 6/10, 4/9, 2/8. Score: 47.946

20. Reference: Abstract.
Candidate: Summary.
N-gram matches (1- to 4-gram): 2/3, 0/2, 0/1, 0/0. Score: 22.222

21. We discuss the physical feasibility 21. Physical feasibility of associations 23/34 12/33 7/32 of association with of particles noninteracting with each other the other, particles, arising according to an ratio is discussed under at corporate spatial restriction of noninteracting uncertainty each

3/31

31.964

which arises in accordance with the uncertainty relation

the corporate spatial confinement ensemble of particles as a whole. of the particle ensemble as a whole.

22. Reference: Investigation is conducted by the example of an ensemble of ultracold neutrons placed in a common potential well of infinite depth.
Candidate: Consideration is carried out on an example of ensemble of the ultracold neutrons placed in the general potential hole of infinite depth.
N-gram matches (1- to 4-gram): 17/24, 8/23, 4/22, 2/21. Score: 34.064

23. Reference: We present quantitative estimates and indicate the expected properties of the arising crystal-like spatially periodic structures.
Candidate: Quantitative estimates are presented and expected properties of being formed kristallopodobny spatial and periodic structures are specified.
N-gram matches (1- to 4-gram): 10/19, 4/18, 1/17, 0/16. Score: 19.655

24. Reference: Keywords: quantum nucleonics, ultracold neutrons, laser methods of production of ultracold neutrons, neutron associations, and neutrons in the potential well of infinite depth.
Candidate: Keywords: quantum a nukleonik, ultracold neutrons, laser ways of production of ultracold neutrons, neutron associations, neutrons in a potential hole of infinite depth.
N-gram matches (1- to 4-gram): 25/30, 19/29, 14/28, 10/27. Score: 58.972

25. Reference: Elliptically polarized cnoidal waves in a medium with spatial dispersion of cubic nonlinearity
Candidate: Elliptically the polarized knoidalny waves in the environment with spatial dispersion of cubic nonlinearity
N-gram matches (1- to 4-gram): 11/15, 7/14, 4/13, 3/12. Score: 46.451

26. Reference: Abstract.
Candidate: Summary.
N-gram matches (1- to 4-gram): 2/3, 0/2, 0/1, 0/0. Score: 22.222

27. Reference: We present new specific analytic solutions of a system of nonlinear Schrodinger equations, corresponding to elliptically polarized cnoidal waves in an isotropic gyrotropic medium with spatial dispersion of cubic nonlinearity and second-order frequency dispersion under the conditions of formation of the waveguides of the same type for each of the circularly polarized components of the light field.
Candidate: New private analytical decisions of system of the nonlinear equations of Schrodinger, corresponding elliptically to the polarized knoidalny waves are found in the isotropic girotropny environment with spatial dispersion of cubic nonlinearity and frequency dispersion of the second order at performance of conditions of formation of wave guides of a uniform profile for each of tsirkulyarno polarized a component of a light field.
N-gram matches (1- to 4-gram): 45/66, 19/65, 9/64, 5/63. Score: 30.796

28. Reference: Keywords: cubic nonlinearity, spatial dispersion, nonlinear Schrodinger equations, elliptic polarization, cnoidal waves.
Candidate: Keywords: cubic nonlinearity, spatial dispersion, nonlinear Schrodinger's equations, elliptic polarization, knoidalny waves.
N-gram matches (1- to 4-gram): 18/20, 13/19, 10/18, 8/17. Score: 67.052

-- Report End --

The comparison shows that Google scored 62.554, whereas PROMPT scored only 35.528. This suggests that Google copes well with the vocabulary, while PROMPT has difficulty translating unknown words (although we believe that proper training of this MT system may yield better results). This is not surprising, since statistical translation relies on n-gram models. The advantages of statistical systems become apparent when the system has been trained for a sufficiently long time and high-quality parallel corpora are available. Moreover, qualified linguists are not required in this case, and the system can be trained during operation. These systems do, however, have some drawbacks: large parallel corpora are needed for training; the systems rely on a complex mathematical apparatus; high-quality translation is possible only for phrases that match the n-gram model; and translation quality depends strongly on the corpora used for training.
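The per-sentence match counts in the report can be illustrated with a short sketch. It counts how many candidate n-grams (n = 1 to 4) also occur in the reference, clipping repeated n-grams to the number of times they appear in the reference. The tokenization rule (a leading item number such as "14." kept as one token, punctuation split off) and the function names are assumptions made for illustration; the evaluation program's actual tokenizer and scoring formula are not described in the text.

```python
import re
from collections import Counter

# Assumed tokenization: "14." stays one token, hyphenated words stay whole,
# every other punctuation mark becomes its own token.
TOKEN = re.compile(r"\d+\.|\w+(?:-\w+)*|[^\w\s]")


def tokenize(text):
    return TOKEN.findall(text)


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_matches(candidate, reference, max_n=4):
    """Return [(matched, total_candidate_ngrams)] for n = 1..max_n."""
    cand, ref = tokenize(candidate), tokenize(reference)
    results = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Counter intersection keeps the minimum count of each n-gram,
        # so a repeated candidate n-gram is credited at most as often
        # as it occurs in the reference.
        matched = sum((cand_counts & ref_counts).values())
        results.append((matched, sum(cand_counts.values())))
    return results


reference = ("14. Influence of an electric field on near-surface "
             "processes in laser processing of metals")
candidate = ("14. Influence of electric field on pripoverkhnostny "
             "processes at laser processing of metals")

print(ngram_matches(candidate, reference))
# [(11, 13), (7, 12), (4, 11), (1, 10)]
```

For the title pair from line 14 of the report, this sketch gives the same counts as the program (11/13, 7/12, 4/11, 1/10). How the program combines the four ratios into the final score is not stated, so that step is left out.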

The second analysis was performed using the BLEU, F-measure, and TER metrics. Both outputs were compared with the reference translation in a single run. The results are as follows:

Translation Evaluation Summary

Job Start Date: 9/18/2012 1:50 PM
Job End Date: 9/18/2012 1:50 PM
Job Duration: 0 min(s) 6 sec(s)
Number of Reference Files: 1
Number of Candidate Files: 2
Evaluation Lines: 28
Tokenization Language: EN
Evaluation Metrics: BLEU, F-Measure, TER (Inverted Score)

Results Summary

Metric                        Candidate File 1    Candidate File 2
BLEU Case Sensitive           28.41               59.59
BLEU Case Insensitive         29.83               61.72
F-Measure Case Sensitive      67.62               83.09
F-Measure Case Insensitive    68.89               84.72
TER Case Sensitive            45.77               67.26
TER Case Insensitive          46.42               68.08

Candidate Files:
1 : candidate_prompt_1.txt
2 : candidate_google_1.txt

Reference Files:
1 : reference_1.txt

Candidate File 1: candidate_prompt_1.txt

                    BLEU     F-Measure    TER
Case Sensitive:     28.41    67.62        45.77
Case Insensitive:   29.83    68.89        46.42

Candidate File 2: candidate_google_1.txt

                    BLEU     F-Measure    TER
Case Sensitive:     59.59    83.09        67.26
Case Insensitive:   61.72    84.72        68.08

-- Report End --

As in the previous test, Google shows better results, which is not surprising because scientific texts are highly standardized. The syntactic features of scientific and technical texts include syntactic and semantic completeness, frequent use of clichéd structures, and a comprehensive system of connecting elements (coordinating and subordinating conjunctions). Scientific writing is characterized by complicated syntax, which is reflected in the use of elaborate compound and complex sentences and in the complexity of simple sentences, mainly those with appositives. In addition, scientific and technical texts are characterized, above all, by the frequent use of highly specialized scientific terms. Scientific terminology evolves because experts in a field need to communicate with precision and brevity, but it often has the effect of excluding those who are unfamiliar with the specialized language of the group. Modern terminology is accurate, efficient, nominative, and stylistically neutral, and it lacks emotional bias. All of this allows Google to cope so well with standardized texts.

Nevertheless, it should be noted that PROMPT does a much better job when it comes to grammar: there are more grammatically correct sentences in the PROMPT output than in the Google output. This is not surprising, because PROMPT relies on rule-based machine translation (RBMT). RBMT is based on a linguistic description of two natural languages (bilingual dictionaries and other databases containing morphological, grammatical, and semantic information), formal grammars, and the translation algorithms proper. The quality of translation depends on the size of the linguistic databases (dictionaries) and the depth of description of the natural languages [21].

4. Conclusions
An overview of the most commonly used metrics of MT evaluation is presented. Automatic evaluation of MT quality by such metrics as BLEU, F-measure, and TER has significantly improved statistical MT. Typically, these metrics show good correlation of candidate translations with reference translations. One of the major drawbacks of these metrics is that they cannot assess MT quality at the semantic or pragmatic level. Nevertheless, at present these metrics are the only means of automatic translation quality assessment.
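To make two of these metrics more concrete, the following sketch computes a unigram F-measure and a simplified, TER-like edit rate for one sentence pair. Both are deliberately reduced versions written for illustration: the F-measure here is the plain harmonic mean of unigram precision and recall, and the edit rate uses word-level insertions, deletions, and substitutions only, whereas full TER also allows block shifts. The function names and the whitespace tokenization are assumptions, and the evaluation program used above may compute and scale its scores differently.

```python
from collections import Counter


def unigram_f_measure(candidate, reference):
    """Harmonic mean of unigram precision and recall (clipped counts)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def word_edit_distance(candidate, reference):
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    cand, ref = candidate.split(), reference.split()
    prev = list(range(len(ref) + 1))
    for i, c in enumerate(cand, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if c == r else 1
            curr.append(min(prev[j] + 1,          # delete a candidate word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(candidate, reference):
    """Edits per reference word; block shifts are NOT modelled here."""
    ref_len = len(reference.split())
    if ref_len == 0:
        return 0.0
    return word_edit_distance(candidate, reference) / ref_len


reference = ("Influence of an electric field on near-surface "
             "processes in laser processing of metals")
candidate = ("Influence of electric field on pripoverkhnostny "
             "processes at laser processing of metals")

print(round(unigram_f_measure(candidate, reference), 3))  # 0.8
print(word_edit_distance(candidate, reference))           # 3
print(round(simplified_ter(candidate, reference), 3))     # 0.231
```

A lower edit rate means a closer match, which is the opposite direction from BLEU and the F-measure; the report above lists TER as an "Inverted Score", presumably so that higher values are better for all three metrics, although the exact inversion is not given in the text.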

The quality of the Google and PROMPT outputs is compared with the reference translations using n-gram models and different metrics. In both cases, the Google output shows good correlation with the reference translation. The best match is registered at the vocabulary level, which is to be expected because statistical translation is based on the n-gram model. Google also shows the worst results in terms of grammar, which is understandable because PROMPT relies on the RBMT model, in which translation depends on the size of the linguistic databases (dictionaries) and the depth of description of the natural languages, i.e., the maximum number of features of grammatical structures. Since translation into English is a priority for Google, this MT system is constantly being improved. All this suggests that the potential of transfer translation systems will sooner or later be exhausted, while the translation quality of statistical MT systems will continue to improve. Nevertheless, we believe that in the future machine translation will combine these two approaches, rule-based and statistical, as well as the universal semantic hierarchy (USH) approach [22], in order to produce a correct translation.

The development of efficient and reliable MT evaluation metrics has been actively investigated in recent years. One of the most important tasks is to go beyond n-gram statistics while remaining fully automatic. The need for a fully automated metric cannot be overestimated, as it should provide the highest rate of development and progress of MT systems.

Acknowledgements
The author thanks S.N. Vekovishcheva for valuable advice during the preparation of the manuscript.

References
1. White, J., O'Connell, T., and Carlson, L. (1993) Evaluation of Machine Translation. In Human Language Technology: Proceedings of the Workshop (ARPA), pp. 206-210.
2. Melamed, I.D. (1995) Automatic Evaluation and Uniform Filter Cascades for Inducing N-Best Translation Lexicons. In Third Workshop on Very Large Corpora (WVLC3), pp. 184-198, Boston.
3. Brew, C., and Thompson, H. (1994) Automatic Evaluation of Computer Generated Text: A Progress Report on the TextEval Project. In Human Language Technology: Proceedings of the Workshop (ARPA/ISTO), pp. 108-113.
4. Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002) BLEU: a Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 311-318, Philadelphia.
5. Doddington, G. (2002) Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics. In Human Language Technology: Notebook Proceedings, pp. 128-132, San Diego.
6. Turian, J.P., Shen, L., and Melamed, I.D. (2003) Evaluation of Machine Translation and its Evaluation. In Proceedings of MT Summit IX, New Orleans, USA, 23-28 September 2003.
7. http://en.wikipedia.org/wiki/Approximate_string_matching
8. http://en.wikipedia.org/wiki/Levenshtein_distance
9. Blatz, J., Fitzgerald, E., Foster, G., Gandrabur, S., Goutte, C., Kulesza, A., Sanchis, A., and Ueffing, N. (2004) Confidence Estimation for Machine Translation. In Proceedings of COLING, pp. 315-321, Geneva.
10. http://en.wikipedia.org/wiki/Evaluation_of_machine_translation
11. http://www.lrec-conf.org/proceedings/lrec2008/pdf/785_paper.pdf
12. Cancedda, N., and Yamada, K. (2005) Method and Apparatus for Evaluating Machine Translation Quality. US Patent Application 20050137854.
13. http://www.intsys.msu.ru/invest/speech/articles/rus_lm.htm
14. http://www.languagestudio.com/TranslationQualityMetrics.aspx
15. Melamed, I.D., Green, R., and Turian, J.P. (2003) Precision and Recall of Machine Translation. In Proc. HLT-03, pp. 61-63.
16. http://ru.wikipedia.org/wiki/_
17. Lavie, A., Sagae, K., and Jayaraman, S. (2004) The Significance of Recall in Automatic Metrics for MT Evaluation. In Proceedings of the Sixth Conference of the Association for Machine Translation in the Americas (AMTA04), pp. 134-143.
18. Banerjee, S., and Lavie, A. (2007) METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In Proceedings of the Second Workshop on Statistical Machine Translation, pp. 228-231, Prague.
19. Ulitkin, I. (2011) Computer-assisted Translation Tools: A Brief Review. Translation Journal, Vol. 15, No. 1, January 2011.
20. http://www.quantum-electron.ru/
21. http://ru.wikipedia.org/wiki/
22. http://www.abbyy.ru/science/technologies/business/compreno/

Retrieved from http://www.translationdirectory.com/articles/article2432.php
