Relative Performance Evaluation upon Automated Grammar Checkers as Knowledge Systems

Aung Kyaw Oo, Ni Lar Thein


University of Computer Studies, Yangon, Myanmar
aungkyawoo@wwlmail.com, nilarthein@mptmail.net.mm

Abstract

Communication plays an essential role in today’s era of globalization. Its purpose is accomplished through two language modes, namely speaking and writing. In Natural Language Processing (NLP), these modes are usually handled as speech recognition and text recognition. For text recognition, every Knowledge System (KS) works with the aid of text processing facilities such as word processors, depending on the application area of the respective KS. This paper evaluates the relative performance of the Grammar Checkers (GCs) of widely accepted processors by specifying problem areas. In addition, it proposes a better Checker model that avoids the language errors and irrelevancies found during the analysis.

Keywords:

Grammar; error-detection; text; suggestion

Introduction

Human beings communicate with one another using languages of their own. Some languages have both spoken and written forms, although some have only a spoken one. It is accepted among scholars that only civilized peoples have both synchronized spoken and written languages. Between these two modes, the grammar of any language must care for its written texts. Linguists of any language agree on the importance of the grammaticality of written texts. Grammar matters a great deal. Examine the sentences

“I am John”,
“My name is John”, and
“I am called John”.

It can easily be seen that the meaning (semantics) carried by these three sentences is the same although the syntax is not. In sentences like

“Mary beats Elizabeth and Suzan”,
“Elizabeth beats Mary and Suzan”, and
“Suzan beats Mary and Elizabeth”,

the sentence structure (syntax) is clearly “Subject (proper noun) + Action verb (singular) + Object (proper noun) + Conjunction + Parallel Object (proper noun)”. All three sentences are composed of five words. Although the syntax does not change at all except for the word order, the semantics of the three sentences contradict one another. What should be firmly borne in mind is that syntax and semantics dwell together in formal written grammar. It is also worth noting that native writers usually make many more grammatical mistakes and errors than their non-native counterparts.

Since semantics is inherently a difficult problem to tackle, computational linguists focus on handling syntax-related matters. Syntax is the set of composition rules for a written language, which implies that syntax is a core pillar of grammar. Since the 1950s, computational linguists have tried to contribute models of human language recognizers for human-computer interaction (HCI).

Developers of knowledge systems, together with linguists, solve human language problems in HCI. As mentioned previously, for text-mode language, the automatic detection and correction of grammatical errors in any running written text is currently performed by some sort of Grammar and Spell Checker. Intrinsically, such Checkers are knowledge systems. If human grammarians are nowhere in sight, the Checkers will solve the grammar problems at hand instead. The sole purpose of Checkers is to detect grammatical errors in running written texts automatically, give relevant suggestions to writers, and ultimately make auto-corrections for writers.

In section 2, the reasons for initiating the present work are highlighted. Problem areas usually found in GCs are examined in section 3. Section 4 presents the relative evaluation of the performance of three GCs. A proposed model for a better GC is described in section 5. Section 6 concludes the paper with discussion on related work.

Motivation

Motivated by long and earnest study of linguistics and by the impracticalities found in using GCs, the present work highlights the pros and cons of GCs, proposes a theoretical model for achieving a better GC, and
elaborates on developing a reliable, precise, and robust KS.

Classification of Problem Areas

Nature of Natural Language Errors

As a rule, in any human language around the world, three types of errors are found, namely:

(a) Grammar error,
(b) Usage error, and
(c) Vocabulary (Vocab) error.

Among these three kinds of errors, every GC takes responsibility for grammar and vocab errors. What today’s GCs cannot handle is usage in a specific human language, such as idioms, colloquialisms, slang, dialects, jargon and so on.

Types of Grammatical Problem

In classifying the grammatical problems, it should be noted that most of the errors are caused by punctuation errors, apart from other factors. The following ten grammatical problems are cited in an established English grammar book and taken from it as an ideal guide for developing a better GC model [1].

No. | Example sentence | Problem
1 | The heavy rain turned the parking area to mud. Which meant that thousands of cars would get stuck. | Fragment
2 | The promoters called the insurance company they discovered their coverage for accidents was limited. | Fused sentence
3 | After talking with the grounds keeper, the security chief said he would not be responsible for the safety of the crowd. | Unclear pronoun reference
4 | The local authorities hadn’t scarcely enough resources to cope with the flooding. | Double negative
5 | The mayor was worried, she urged the promoters to cancel the event. | Comma splice
6 | After announcing the cancellation from the stage, the crowd began complaining to the organizers. | Dangling modifier
7 | Even the promoters promise to reschedule and honor tickets did little to stop the crowds complaints. | Lack of possessive apostrophe
8 | The mayor who was newly elected asked people to leave in an orderly manner. | Comma missing in nonrestrictive modifier
9 | Turning away from the crowd, the mayor said, I hope the security chief or the promoters has a plan to help all these people leave safely. | Subject-verb agreement
10 | Although the muddy parking area caused problems, all the cars and people, left the grounds without incident. | Unnecessary commas

Performance Evaluation of Grammar Checkers

The performance of the three commercial off-the-shelf (COTS) GCs named below is evaluated with respect to the ten grammatical problems specified previously. The results are shown in each of the following sub-sections.

Microsoft Word 2002

Except for the seventh problem, the GC in MS Word 2002 does not display any error. For the error found, it gives suggestions.

Grammar Expert Plus

This Checker displays errors for the first, fourth, and ninth problems, and no errors for the other problems.

Grammar Slammer Deluxe

The evaluation results for the specified problems are the same as those of Grammar Expert Plus.

General Discussion

Apart from evaluating the three GCs on the ten problem areas, another evaluation has been done with sentences that include commonly confused words. The test sentences are:
(a) I would like to see the principle of your school.
(b) We shall except the last provisions of the contract.
(c) We shall expect the last provisions of the contract.
For sentence (a), except for the MS Word 2002 GC, the other two GCs found an error and gave the suggestion “commonly confused words”.
For sentence (b), all three GCs displayed the same suggestion, “commonly confused words”, although the sentence is completely correct with respect to the writer’s intent.
For sentence (c), all three Checkers showed no errors and no suggestions, even though the writer’s intended word “except” has been confused with the word “expect”.

This being the case, the role of “confused words” should be taken into account for the reliability of the Checker.

Theoretical Model for an Ideal Grammar Checker

Based on the results obtained in the previous section 4, a theoretical model for a Grammar Checker is proposed in this section. The architecture of the proposed model is shown in Figure 1.

[Figure 1: diagram with boxes labeled Input Words, Parse Words, Tag Words, Check With Grammar Rules, Grammar Knowledge Base, Check Confused-Words, No Error, Error Found, Output Words, and Display Suggestions]

Figure 1 - A Proposed Model for a Grammar Checker

Although the proposed architecture seems no different from the structure of the widely used GCs on the market, the nucleus of this architecture lies in its grammar checking rules. In checking the tagged words against the grammar, punctuation checks, which are a weakness of the three sample GCs mentioned above, are to be carried out thoroughly to cover the specified problem areas. That is to say, new punctuation rules are to be contributed to prevailing GCs. In addition, no less attention is to be paid to checking confused words. As usual, the input words will be checked in detail until the end of the entire process.

Conclusion

It is generally accepted that no system can be expected to be flawless. Having evaluated the performance of three GCs on the specified grammatical problem areas, we consider the development of a more reliable, robust and precise Grammar Checker to be not far away. For future natural language processing purposes, developing GCs that handle usage problems in human languages will be the challenging task for all computational linguists.

References

[1] Anson, C.M. and Schwegler, R.A. (1997). The Longman Handbook for Writers and Readers.
[2] Arnold, D. “LG111 Introduction to Linguistics Semantics (2): Structural and Lexical Semantics”.
[3] Bigert, J. (2005). “Automatic and Unsupervised Methods in Natural Language Processing,” Ph.D. Thesis, Stockholm, Sweden.
[4] Borin, L. (2005). “Bilaga 4: CrossCheck Project Status Report,” Nada, KTH. Computational Linguistics, SU, pp. 10-15.
[5] Domeij, J. R., Kann, V. and Knutsson, O. (2002). “A Swedish Grammar Checker,” Association for Computational Linguistics.
[6] EnglishPlus+, Copyright © 2001.
[7] Helfrich, A. and Music, B. (1999). “Design and Evaluation of grammar checkers in multiple languages”.
[8] Johnson, E. “The Ideal Grammar and Style Checker”.
[9] Johnson, E. (1992). “Strong Writer and the Production and Marketing of Software”, V (2), first published in TEXT Technology, Vol. 2.5, pp. 2-4, September.
[10] Shaw, R. “Grammar Checkers Helpful or Harmful?” Gannett News Service.
[11] Wei, Y.H. and Davies, G. (1996). “Do grammar checkers work?” EUROCALL, Dániel Berzsenyi College, Hungary.
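
As a minimal illustration of the proposed model in Figure 1, the following Python sketch chains an input stage, one grammar rule, and a confused-words check, and returns suggestions when an error is found. It is only a sketch under stated assumptions: the function names, the single double-negative rule, the list of negative adverbs, and the confusion sets (including the alternative “principal”) are hypothetical and are not taken from any of the evaluated GCs. The parsing and tagging stages are omitted for brevity.

```python
import re

# Hypothetical confusion sets (illustrative only); a real Checker would
# load these from its grammar knowledge base.
CONFUSION_SETS = [
    {"principle", "principal"},
    {"except", "expect"},
]

# Hypothetical list of negative adverbs used by the double-negative rule.
NEGATIVE_ADVERBS = {"scarcely", "hardly", "barely"}


def tokenize(sentence):
    """Input stage: split a sentence into lowercase word tokens."""
    normalized = sentence.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)


def check_grammar(tokens):
    """Grammar stage: flag a negative contraction followed by a negative adverb."""
    messages = []
    for first, second in zip(tokens, tokens[1:]):
        if first.endswith("n't") and second in NEGATIVE_ADVERBS:
            messages.append(f"double negative: {first} {second}")
    return messages


def check_confused_words(tokens):
    """Confused-words stage: flag every token that belongs to a confusion set."""
    messages = []
    for token in tokens:
        for group in CONFUSION_SETS:
            if token in group:
                others = ", ".join(sorted(group - {token}))
                messages.append(
                    f"commonly confused word: {token} (alternatives: {others})"
                )
    return messages


def check_sentence(sentence):
    """Run all stages; an empty list means no error was found."""
    tokens = tokenize(sentence)
    return check_grammar(tokens) + check_confused_words(tokens)


if __name__ == "__main__":
    for text in [
        "The local authorities hadn't scarcely enough resources.",
        "We shall except the last provisions of the contract.",
    ]:
        print(text, "->", check_sentence(text) or "no error")
```

A purely list-based check of this kind flags test sentences (b) and (c) alike, because it has no notion of the writer’s intent; deciding which word was meant would require information beyond the word list.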
