
Human Translation vs. Machine Translation: Rise of the Machines

By Ilya Ulitkin

The idea of machine translation (MT) of natural languages first appeared in the
seventeenth century, but became a reality only at the end of the twentieth century.
Today, computer programs are widely used to automate the translation process. Although
great progress has been made in the field of machine translation, fully automated
translations are far from perfect. Nevertheless, countries continue to spend
millions of dollars on automatic translation programs. In the early 1990s, the U.S.
government sponsored a competition among MT systems. Perhaps one of the most valuable
outcomes of that enterprise was a corpus of manually produced numerical evaluations of
MT quality with respect to a set of reference translations [1]. The development of MT
systems has given impetus to a large number of investigations, encouraging
many researchers to seek reliable methods for automatic MT quality evaluation.
Machine translation evaluation serves two purposes: a relative estimate tells us
whether one MT system is better than another, and an absolute estimate
(a value ranging from 0 to 1) gives an absolute measure of quality (for
example, a value of 1 denotes a perfect translation).
However, the development of appropriate methods for
numerical MT quality evaluation is a challenging task. In
many fields of science, measurable efficiency indices exist,
such as, for example, the difference between the predicted
and actually observed results. Since natural languages are
complicated, an assessment of translation correctness is
extremely difficult. Two completely different sequences of words (sentences) can be fully
equivalent (e.g., There is a vase on the table and The vase is on the table), and two
sequences that differ by a small detail can have completely different meanings
(e.g., There is no vase on the table, and There is a vase on the table).
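This pitfall is easy to make concrete. The following sketch (a hypothetical helper, not a metric from the article) scores a candidate by the fraction of its words that also appear in the reference; it rates a sentence with the opposite meaning almost as highly as an exact match.

```python
def word_overlap(candidate: str, reference: str) -> float:
    """Naive bag-of-words similarity: fraction of candidate words
    that also occur in the reference. Word order is ignored."""
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    return sum(1 for w in cand if w in ref) / len(cand)

reference = "There is a vase on the table"
# The negated sentence differs by a single word ("no" vs. "a"),
# so it still scores about 0.86 despite meaning the opposite:
print(word_overlap("There is no vase on the table", reference))
```

Any usable evaluation metric has to do better than this, which is why word order and longer matching sequences enter the picture below.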

Traditionally, the bases for evaluating MT quality are adequacy (the translation conveys
the same meaning as the original text) and fluency (the translation is correct from the
grammatical point of view). Most modern methods of MT quality assessment rely on
reference translations. Earlier approaches to scoring a candidate text with respect to a
reference text were based on the idea of similarity of a candidate text (the text
translated by an MT system) and a reference text (the text translated by a professional
translator), i.e., the similarity score was to be proportional to the number of matching
words [2]. At about the same time, a different idea was put forward, based on the
observation that matching words appearing in the right order in the candidate and
reference sentences should receive higher scores than matching words out of order [3].
Perhaps the simplest version of the same idea is that a candidate text should be
rewarded for containing longer contiguous subsequences of matching words. Papineni et
al. [4] reported that a particular version of this idea, which they call BLEU, correlates
very highly with human judgments. Doddington [5] proposed another version of this
idea, now commonly known as the NIST score. Although the BLEU and NIST measures
might be useful for comparing the relative quality of different MT outputs, it is difficult to
gain insight from such measures [6].
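The contiguous-subsequence idea behind BLEU can be sketched as a geometric mean of clipped n-gram precisions combined with a brevity penalty. This is a simplified single-reference version for illustration; the published metric [4] supports multiple reference translations.

```python
import math
from collections import Counter

def ngrams(words: list, n: int) -> list:
    """All contiguous n-word subsequences of a word list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) times a brevity penalty, against one reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    precisions = []
    for n in range(1, max_n + 1):
        c_counts = Counter(ngrams(cand, n))
        r_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count at its count in the reference,
        # so repeating a correct word cannot inflate the score.
        clipped = sum(min(c, r_counts[g]) for g, c in c_counts.items())
        precisions.append(clipped / max(sum(c_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

Because every n-gram from 1 to 4 must match, long contiguous runs of correct words are rewarded far more than the same words scattered out of order, which is precisely the property discussed above.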
