Intelligent Tutoring Systems with Conversational Dialogue
Arthur C. Graesser, Kurt VanLehn, Carolyn P. Rosé,
Pamela W. Jordan, and Derek Harter
Copyright © 2001, American Association for Artificial Intelligence. All rights reserved. 0738-4602-2001 / $2.00 WINTER 2001 39

■ Many of the intelligent tutoring systems that have been developed during the last 20 years have proven to be quite successful, particularly in the domains of mathematics, science, and technology. They produce significant learning gains beyond classroom environments. They are capable of engaging most students' attention and interest for hours. We have been working on a new generation of intelligent tutoring systems that hold mixed-initiative conversational dialogues with the learner. The tutoring systems present challenging problems and questions to the learner, the learner types in answers in English, and there is a lengthy multiturn dialogue as complete solutions or answers evolve. This article presents the tutoring systems that we have been developing. AUTOTUTOR is a conversational agent, with a talking head, that helps college students learn about computer literacy. ANDES, ATLAS, and WHY2 help adults learn about physics. Instead of being mere information-delivery systems, our systems help students actively construct knowledge through conversations.

Intelligent tutoring systems (ITSs) are clearly one of the successful enterprises in AI. There is a long list of ITSs that have been tested on humans and have proven to facilitate learning. There are well-tested tutors of algebra, geometry, and computer languages (such as PACT [Koedinger et al. 1997]); physics (such as ANDES [Gertner and VanLehn 2000; VanLehn 1996]); and electronics (such as SHERLOCK [Lesgold et al. 1992]). These ITSs use a variety of computational modules that are familiar to those of us in the world of AI: production systems, Bayesian networks, schema templates, theorem proving, and explanatory reasoning. According to current estimates, the arsenal of sophisticated computational modules inherited from AI produces learning gains of approximately .3 to 1.0 standard deviation units compared with students learning the same content in a classroom (Corbett et al. 1999).

The next generation of ITSs is expected to go one step further by adopting conversational interfaces. The tutor will speak to the student with an agent that has synthesized speech, facial expressions, and gestures, in addition to the normal business of having the computer display text, graphics, and animation. Animated conversational agents have now been developed to the point that they can be integrated with ITSs (Cassell and Thorisson 1999; Johnson, Rickel, and Lester 2000; Lester et al. 1999). Learners will be able to type in their responses in English in addition to the conventional point and click. Recent developments in computational linguistics (Jurafsky and Martin 2000) have made it a realistic goal to have computers comprehend language, at least to an extent where the ITS can respond with something relevant and useful. Speech recognition would be highly desirable, of course, as long as it is also reliable.

At this point, we are uncertain whether the conversational interfaces will produce incremental gains in learning over and above the existing ITSs (Corbett et al. 1999). However, there are reasons for being optimistic. One reason is that human tutors produce impressive learning gains (between .4 and 2.3 standard deviation units over classroom teachers), even though the vast majority of tutors in a school system have modest domain knowledge, have no training in pedagogical techniques, and rarely use the sophisticated tutoring strategies of ITSs (Cohen, Kulik, and Kulik 1982; Graesser, Person, and Magliano 1995).

A second reason is that there are at least two success cases, namely, the AUTOTUTOR and
ATLAS systems that we discuss in this article. AUTOTUTOR (Graesser et al. 1999) is a fully automated computer tutor that has tutored approximately 200 college students in an introductory course in computer literacy. An early version of AUTOTUTOR improved learning by .5 standard deviation units (that is, about half a letter grade) when compared to a control condition where students reread yoked chapters in the book. ATLAS (VanLehn et al. 2000) is a computer tutor for college physics that focuses on improving students' conceptual knowledge. In a recent pilot evaluation, students who used ATLAS scored .9 standard deviation units higher than students who used a similar tutoring system that did not use natural language dialogues. Thus, it appears that there is something about conversational dialogue that plays an important role in learning. We believe that the most effective tutoring systems of the future will be a hybrid between normal conversational patterns and the ideal pedagogical strategies in the ITS enterprise.

This article describes some of the tutoring systems that we are developing to simulate conversational dialogue. We begin with AUTOTUTOR. Then we describe a series of physics tutors that vary from conventional ITS systems (the ANDES tutor) to agents that attempt to comprehend natural language and plan dialogue moves (ATLAS and WHY2).

AUTOTUTOR

The Tutoring Research Group (TRG) at the University of Memphis developed AUTOTUTOR to simulate the dialogue patterns of typical human tutors (Graesser et al. 1999; Person et al. 2001). AUTOTUTOR tries to comprehend student contributions and simulate dialogue moves of either normal (unskilled) tutors or sophisticated tutors. AUTOTUTOR is currently being developed for college students who are taking an introductory course in computer literacy. These students learn the fundamentals
40 AI MAGAZINE
state automaton that can handle different classes of information that learners type in. The DAN is augmented by production rules that are sensitive to learner ability and several parameters of the dialogue history.

How Does AUTOTUTOR Handle the Student's Initial Answer to the Question?

After AUTOTUTOR asks the question in the tutor-1 turn, the student gives an initial answer in the student-1 turn. The answer is very incomplete. A complete answer would include all the points in the summary at the final turn (tutor-30). What does AUTOTUTOR do with this incomplete student contribution? AUTOTUTOR doesn't simply grade the answer (for example, good, bad, incomplete, a quantitative score) as many conventional tutoring systems do. AUTOTUTOR also stimulates a multiturn conversation that is designed to extract more information from the student and get the student to articulate pieces of the answer. Thus, instead of being an information-delivery system that bombards the student with a large volume of information, AUTOTUTOR is a discourse prosthesis that attempts to get the student to do the talking and explores what the student knows. AUTOTUTOR adopts the educational philosophy that students learn by actively constructing explanations and elaborations of the material (Chi et al. 1994; Conati and VanLehn 1999).

How Does AUTOTUTOR Get the Learner to Do the Talking?

AUTOTUTOR has a number of dialogue moves to get the learner to do the talking. For starters, there are open-ended pumps that encourage the student to say more, such as What else? in the tutor-2 turn. Pumps are frequent dialogue moves after the student gives an initial answer, just as is the case with human tutors. The tutor pumps the learner for what the learner knows before drilling down to specific pieces of an answer. After the student is pumped for information, AUTOTUTOR selects a piece of information to focus on. Both human tutors and AUTOTUTOR have a set of expectations about what should be included in the answer. What they do is manage the multiturn dialogue to cover these expected answers. A complete answer to the example question in figure 2 would have four expectations, as listed here:

Expectation 1: You need a digital camera or regular camera to take the photos.
Expectation 2: If you use a regular camera, you need to scan the pictures onto the computer disk with a scanner.
Expectation 3: A network card is needed if you have a direct connection to the internet.
Expectation 4: A modem is needed if you have a dial-up connection.

AUTOTUTOR decides which expectation to handle next and then selects dialogue moves that flesh out the expectation. The dialogue moves vary in directness and information content. The most indirect dialogue moves are hints, the most direct are assertions, and prompts are in between. Hints are often articulated in the form of questions, designed to lead the learner to construct the expected information. Assertions directly articulate the expected information. Prompts try to get the learner to produce a single word in the expectation. For example, the tutor turns 3, 4, 5, and 6 in figure 2 are all trying to get the learner to articulate expectation 3. Hints are in the tutor-3 turn (For what type of connection do you need a network card?) and the tutor-5 turn (How does the user get hooked up to the internet?). Prompts are in tutor-4 (If you have access to the internet through a network card, then your connection is…, with a hand gesture encouraging the learner to type in information). Assertions are in tutor-5 and tutor-6 (A network card is needed if you have a direct connection to the internet.). AUTOTUTOR attempts to get the learner to articulate any given expectation E by going through two cycles of hint-prompt-assertion. Most students manage to articulate the expectation within the six dialogue moves (hint-prompt-assertion-hint-prompt-assertion). AUTOTUTOR exits the six-move cycle as soon as the student has articulated the expected answer. Interestingly, sometimes students are unable to articulate an expectation even after AUTOTUTOR spoke it in the previous turn. After expectation E is fleshed out, AUTOTUTOR selects another expectation.

How Does AUTOTUTOR Know Whether a Student Has Covered an Expectation?

AUTOTUTOR does a surprisingly good job evaluating the quality of the answers that learners type in. AUTOTUTOR attempts to "comprehend" the student input by segmenting the contributions into speech acts and matching the student's speech acts to the expectations. Latent semantic analysis (LSA) is used to compute these matches (Landauer, Foltz, and Laham 1998). When the tutor's expectation E is compared with the learner's speech act A, a cosine match score is computed that varies from 0 (no match) to 1.0 (perfect match). AUTOTUTOR considers each combination of speech acts that the learner makes during the evolution of an answer to a major question; the value of the highest cosine match is used when computing whether the student covers expectation E.
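The cosine-matching coverage test and the zone-of-proximal-development selection criterion discussed in this section might look roughly like the following sketch. The vectors here are toy stand-ins for real LSA document vectors (which come from a singular-value decomposition of a term-by-document matrix), and the function names are ours, not AUTOTUTOR's.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    # Standard cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def coverage_score(expectation_vec, speech_act_vecs):
    """Highest cosine between the expectation and any nonempty
    combination of the student's speech acts, where a combination is
    represented by the vector sum of its members."""
    best = 0.0
    for r in range(1, len(speech_act_vecs) + 1):
        for combo in combinations(speech_act_vecs, r):
            summed = [sum(vals) for vals in zip(*combo)]
            best = max(best, cosine(expectation_vec, summed))
    return best

def next_expectation(scores, threshold=0.70):
    """Zone-of-proximal-development criterion: among expectations not
    yet covered (score below threshold), pick the one with the highest
    score. Returns None when every expectation is covered."""
    uncovered = {e: s for e, s in scores.items() if s < threshold}
    return max(uncovered, key=uncovered.get) if uncovered else None
```

The .70 threshold mirrors the value the article reports for the current tutor; the coherence criterion would be analogous, scoring each candidate against the previously covered expectation instead.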
LSA is a statistical, corpus-based method of representing knowledge. LSA provides the foundation for grading essays, even essays that are not well formed grammatically, semantically, and rhetorically. LSA-based essay graders can assign grades to essays as reliably as experts in composition (Landauer et al. 1998). Our research has revealed that AUTOTUTOR is almost as good as an expert in computer literacy in evaluating the quality of student answers in the tutorial dialogue (Graesser et al. 2000).

How Does AUTOTUTOR Select the Next Expectation to Cover?

AUTOTUTOR uses LSA in conjunction with various criteria when deciding which expectation to cover next. After each student turn, AUTOTUTOR updates the LSA score for each of the four expectations listed earlier. An expectation is considered covered if it meets or exceeds some threshold value (for example, .70 in our current tutor). One selection criterion uses the zone of proximal development, selecting the expectation with the highest LSA score that is below threshold. A second criterion uses coherence, selecting the expectation that has the highest LSA overlap with the previous expectation that was covered. Other criteria that are currently being implemented are preconditions and pivotal expectations. Ideally, AUTOTUTOR will decide to cover a new expectation in a fashion that both blends into the conversation and advances the agenda in an optimal way. AUTOTUTOR generates a summary after all the expectations are covered (for example, the tutor-30 turn).

How Does AUTOTUTOR Give Feedback to the Student?

There are three levels of feedback. First, there is backchannel feedback that acknowledges the learner's input. AUTOTUTOR periodically nods and says uh-huh after learners type in important nouns but is not differentially sensitive to the correctness of the student's nouns. The backchannel feedback occurs online as the learner types in the words of the turn. Learners feel that they have an impact on AUTOTUTOR when they get feedback at this fine-grain level. Second, AUTOTUTOR gives evaluative pedagogical feedback on the learner's previous turn based on the LSA values of the learner's speech acts. The facial expressions and intonation convey different levels of feedback, such as negative (for example, not really while the head shakes), neutral negative (okay with a skeptical look), neutral positive (okay at a moderate nod rate), and positive (right with a fast head nod). Third, there is corrective feedback that repairs bugs and misconceptions that learners articulate. Of course, these bugs and their corrections need to be anticipated ahead of time in AUTOTUTOR's curriculum script. This anticipation of content mimics human tutors. Most human tutors anticipate that learners will have a variety of particular bugs and misconceptions when they cover particular topics. An expert tutor often has canned routines for handling the particular errors that students make. AUTOTUTOR currently splices in correct information after these errors occur, as in turn tutor-8. Sometimes student errors are ignored, as in tutor-4 and tutor-7. These errors are ignored because AUTOTUTOR has not anticipated them by virtue of the content in the curriculum script. AUTOTUTOR evaluates student input by matching it to what it knows in the curriculum script, not by constructing a novel interpretation from whole cloth.

How Does AUTOTUTOR Handle Mixed-Initiative Dialogue?

We know from research on human tutoring that it is the tutor who controls the lion's share of the tutoring agenda (Graesser, Person, and Magliano 1995). Students rarely ask information-seeking questions or introduce new topics. However, when learners do take the initiative, AUTOTUTOR needs to be ready to handle these contributions. AUTOTUTOR does a moderately good job in managing mixed-initiative dialogue. AUTOTUTOR classifies the learner's speech acts into the following categories:

Assertion (RAM is a type of primary memory.)
WH-question (What does bus mean? and other questions that begin with who, what, when, where, why, how, and so on)
YES-NO question (Is the floppy disk working?)
Metacognitive comment (I don't understand.)
Metacommunicative act (Could you repeat that?)
Short response (okay, yes)

Obviously, AUTOTUTOR's dialogue moves on turn N + 1 need to be sensitive to the speech acts expressed by the learner in turn N. When the student asks a What does X mean? question, the tutor answers the question by giving a definition from a glossary. When the learner makes an assertion, the tutor evaluates the quality of the assertion and gives short evaluative feedback. When the learner asks, What did
you say? AUTOTUTOR repeats what it said in the last turn. The DAN manages the mixed-initiative dialogue.

The Curriculum Script

AUTOTUTOR has a curriculum script that organizes the content of the topics covered in the tutorial dialogue. There are 36 topics, one for each major question or problem that requires deep reasoning. Associated with each topic are a set of expectations, a set of hints and prompts for each expectation, a set of anticipated bugs and misconceptions and their corrections, and (optionally) pictures or animations. It is very easy for a lesson planner to create the content for these topics because they are English descriptions rather than structured code. Of course, pictures and animations would require appropriate media files. We are currently developing an authoring tool that makes it easy to create the curriculum scripts. Our ultimate goal is to make it very easy to create an AUTOTUTOR for a new knowledge domain. First, the developer creates an LSA space after identifying a corpus of electronic documents on the domain knowledge. The lesson planner creates a curriculum script with deep-reasoning questions and problems. The developer then computes LSA vectors on the content of the curriculum scripts. A glossary of important terms and their definitions is also prepared. After that, the built-in modules of AUTOTUTOR do all the rest. AUTOTUTOR is currently implemented in JAVA for PENTIUM computers, so there are no barriers to widespread use.

ANDES: A Physics Tutoring System That Does Not Use Natural Language

The goal of the second project is to use natural language–processing technology to improve an already successful intelligent tutoring system named ANDES (Gertner and VanLehn 2000; VanLehn 1996). ANDES is intended to be used as an adjunct to college and high-school physics courses to help students do their homework problems.

Figure 3 shows the ANDES screen. A physics problem is presented in the upper-left window. Students draw vectors below it, define variables
in the upper-right window, and enter equations in the lower-right window. When students enter a vector, variable, or equation, ANDES will color the entry green if it is correct and red if it is incorrect. This approach is called immediate feedback and is known to enhance learning from problem solving (Anderson et al. 1995).

To give immediate feedback, ANDES must understand the student's entries no matter how the student tries to solve the problem. ANDES uses a rule-based expert system to solve the problem in all correct ways. It gives negative feedback if the student's entry does not match one of the steps of one of the solutions from the expert model. For this reason, ANDES and similar tutoring systems are known as model-tracing tutors. They follow the student's reasoning by comparing it to a trace of the model's reasoning.

How Does ANDES Hint and Give Help?

Students can ask ANDES for help either by clicking on the menu item What do I do next? or by selecting a red entry and clicking on the menu item What's wrong with that? ANDES uses a Bayesian network to help it determine which step in the expert's solution to give the student help on (Gertner, Conati, and VanLehn 1998). It prints in the lower-left window a short message, such as the one shown in figure 3. The message is only a hint about what is wrong or what to do next. Often a mere hint suffices, and the students are able to correct their difficulty and move on. However, if the hint fails, then the student can ask for help again. ANDES generates a second hint that is more specific than the first. If the student continues to ask for help, ANDES's last hint will essentially tell the student what to do next. This technique of giving help is based on human-authored hint sequences. Each hint is represented as a template. It is filled in with text that is specific to the situation where help was requested. Such hint sequences are often used in intelligent tutoring systems and are known to enhance learning from problem solving (McKendree 1990).

During evaluations in the fall of 2000 at the U.S. Naval Academy, students using ANDES scored about a letter grade (0.92 standard deviation units) higher on the midterm exam than students in a control group (Shelby et al. 2002). Log file data indicate that students are using the help and hint facilities as expected. Questionnaire data indicate that many of them prefer doing their homework on ANDES to doing it with paper and pencil.

Other intelligent tutoring systems use similar model-tracing, immediate-feedback, and hint-sequence techniques, and many have been shown to be effective (for example, Anderson et al. [1995]; McKendree, Radlinski, and Atwood [1992]; Reiser et al. [2002]). A new company, Carnegie Learning,1 is producing such tutors for use in high-school mathematics classes. As of fall 2000, approximately 10 percent of the algebra I classes in the United States will be using one of the Carnegie Learning tutors. Clearly, this AI technology is rapidly maturing.

Criticisms of ANDES and Other Similar Tutoring Systems

The pedagogy of immediate feedback and hint sequences has sometimes been criticized for failing to encourage deep learning. The following four criticisms are occasionally raised by colleagues:

First, if students don't reflect on the tutor's hints but merely keep guessing until they find an action that gets positive feedback, they can learn to do the right thing for the wrong reasons, and the tutor will never detect the shallow learning (Aleven, Koedinger, and Cross 1999).

Second, the tutor does not ask students to explain their actions, so students might not learn the domain's language. Educators have recently advocated that students learn to "talk science." Talking science is allegedly part of a deep understanding of the science. It also facilitates writing scientifically, working collaboratively in groups, and participating in the culture of science.

Third, to understand the students' thinking, the user interface of such systems requires students to display many of the details of their reasoning. This design doesn't promote stepping back to see the "basic approach" one has used to solve a problem. Even students who have received high grades in a physics course can seldom describe their basic approaches to solving a problem (Chi, Feltovich, and Glaser 1981).

Fourth, when students learn quantitative skills, such as algebra or physics problem solving, they are usually not encouraged to see their work from a qualitative, semantic perspective. As a consequence, they fail to induce versions of the skills that can be used to solve qualitative problems and check quantitative ones for reasonableness. Even physics students with high grades often score poorly on tests of qualitative physics (Halloun and Hestenes 1985).

Many of these objections can be made to just about any form of instruction. Even expert tutors and teachers have difficulty getting students to learn deeply. Therefore, these criticisms of intelligent tutoring systems should only encourage us to improve them, not reject them.

There are two common themes in this list of four criticisms. First, all four involve integrat-
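The model-tracing immediate-feedback scheme described for ANDES above (color an entry green if it matches a step in some correct solution found by the expert model, red otherwise) reduces, at its core, to a membership test. The sketch below is a deliberate simplification and not ANDES's actual code: it assumes steps can be compared as normalized strings, whereas a real tutor compares algebraic and semantic content.

```python
# Toy model tracing: flatten the expert model's solutions into a set
# of correct steps, then mark each student entry green or red by
# membership in that set.

def normalize(entry):
    # Crude canonical form: lowercase, whitespace removed. A real
    # tutor would parse and compare the underlying equations.
    return "".join(entry.lower().split())

def correct_steps(solutions):
    """Union of steps across all correct solutions produced by the
    expert model (each solution is a list of step strings)."""
    return {normalize(step) for solution in solutions for step in solution}

def give_feedback(entry, steps):
    """Immediate feedback: 'green' if the entry matches any step of
    any correct solution, 'red' otherwise."""
    return "green" if normalize(entry) in steps else "red"
```

The design point this illustrates is why the expert model must solve the problem in all correct ways: feedback is computed by matching against the union of solution steps, so any legitimate solution path the model misses would be wrongly colored red.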
spoken language dialogue systems (Jurafsky and Martin 2000), so it makes sense to start with them and see where they break down.

Our primary tool is the KCD editor (figure 5). In the upper-left window, the author selects a topic, which is deceleration in this case. This selection causes a shorthand form of the recipes (discourse plans) to appear in the upper-right window. Selecting a tutor-student interaction brings up windows for seeing the tutor's contribution (as with the lower-right window) and the student's expected answers (middle windows). The left-middle window is for correct answers. As in AUTOTUTOR, the student's expected answer is represented as a set of expectations (left and opposite in this case). The right-middle window is for incorrect answers. When one of these is selected, a subdialogue for handling it is displayed in the lower-left window. Notice that the author enters natural language text for the tutor contribution, the expectations, and almost everything else.

In a limited sense, the KCDs are intended to be better than naturally occurring dialogues. Just as most text expresses its ideas more clearly than informal oral expositions, the KCD is intended to express its ideas more clearly than the oral tutorial dialogues that human tutors generate. Thus, we need a way for expert physicists, tutors, and educators to critique the KCDs and suggest improvements. Because the underlying finite-state network can be complex, it is not useful to merely print it out and let experts pencil in comments. The second tool facilitates critiquing KCDs by allowing expert physicists and psychologists to navigate around the network and enter comments on individual states and arcs (figure 6). It presents a dialogue in the left column and allows the user to enter comments in the right column. Because there are many expected responses for each tutorial contribution, the user can select a response from a pull-down menu, causing the whole dialogue to adjust, opening up new boxes for the user's
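The finite-state network underlying a KCD, with states holding tutor contributions and arcs keyed by classes of expected student answers, can be encoded as a small table. The network and answer classes below are a hypothetical fragment invented for illustration; the actual KCD representation and its natural language answer classifier are considerably richer.

```python
# Hypothetical KCD fragment as a finite-state network: each state
# carries the tutor's contribution plus arcs keyed by the class the
# student's answer falls into ("correct", or a named anticipated
# wrong answer that routes into a remedial subdialogue).
KCD = {
    "ask": {"tutor": "Which current is composed of moving electrons?",
            "arcs": {"correct": "done",
                     "one-directional": "remediate"}},
    "remediate": {"tutor": "Actually, both currents are moving electrons.",
                  "arcs": {"correct": "done"}},
    "done": {"tutor": "Right. Let's move on.", "arcs": {}},
}

def run_kcd(network, start, classified_answers):
    """Walk the network: emit each state's tutor contribution and
    follow the arc matching the class of the student's answer, until
    a terminal state (no outgoing arcs) is reached."""
    transcript = [network[start]["tutor"]]
    state = start
    for cls in classified_answers:
        arcs = network[state]["arcs"]
        if not arcs:
            break
        state = arcs.get(cls, state)  # unanticipated answer: stay and re-ask
        transcript.append(network[state]["tutor"])
    return transcript, state
```

Representing the dialogue this way is what makes the critiquing tool described above possible: experts can attach comments to individual states and arcs because the KCD is an explicit data structure rather than free-form text.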
WHY2: Tutoring Qualitative Explanations

All tutoring systems have students perform a task, and they help the students do it. Some tutoring systems, such as ANDES and ATLAS, have the student solve problems. Other tutoring systems, such as AUTOTUTOR, ask the student deep questions and help them formulate a correct, complete answer. Recent work with human tutors (for example, Chi et al. [2001]) suggests that a good activity for teaching is to have students explain physical systems qualitatively. Although it is possible to have them express their explanations in formal or graphic languages (for example, CYCLEPAD [Forbus et al. 1998]), we believe that they will learn more if they can express their explanations in natural language. Thus, the goal of the WHY2 project is to coach students as they explain physical systems in natural language.

WHY2 is intended to be a successor of one of the first intelligent tutoring systems in the literature, the WHY system. WHY was envisioned and partially implemented by Albert Stevens and Alan Collins (Stevens and Collins 1977). They studied experts helping students articulate such explanations and tried to embed their tutorial strategies in the WHY system. Stevens and Collins discovered that students had a great many misconceptions about nature. These misconceptions would only surface when students expressed their ideas qualitatively because they could solve textbook quantitative problems correctly (Halloun and Hestenes 1985). Since this time, considerable effort has been expended by physics educators to discover, catalog, and invent remedies for student misconceptions. The remedies are usually intended for classrooms or laboratories and have had only moderate success (Hake 1998). By adapting them to the tutorial setting and embedding the tutorial strategies uncovered by Collins, Stevens, and others, WHY2 might be much more successful.

The basic idea of WHY2 is to ask the student to type in an explanation for a simple physical situation, such as the battery-bulb circuit shown in figure 7. WHY2 analyzes the student's explanation (line 1 in figure 7) to see if the student has any misconceptions. If it detects a misconception, it invokes a knowledge construction dialogue (KCD), such as the one shown in lines 2 through 9. During this dialogue, further misunderstandings may arise, which can cause another KCD to be selected and applied (see lines 10 onward).

WHY2 is a joint project involving both the AUTOTUTOR and ATLAS groups. It began recently and is still in the design stages. A corpus of explanations from students has been collected and is being analyzed to see what kinds of misconceptions and language the students are using. Our plan is to use a combination of the LSA technology from AUTOTUTOR and the semantic composition technology from ATLAS.

Question: A battery is connected to a bulb by two wires. The bulb lights. Why?

1. Student: There are two currents which come out, one current positive and one negative. They combine in the bulb to produce light and heat.
2. Tutor: I see. Which current is composed of moving electrons?
3. Student: I'm not sure. Both?
4. Tutor: Right, both are. An electrical current is just moving electrons. Which direction do the electrons move in your two currents?
5. Student: I think the electrons move out from the battery to the bulb. The two currents meet there and produce energy.
6. Tutor: Are the electrons actually destroyed in the bulb?
7. Student: I'm not sure.
8. Tutor: Actually, electrons are not destroyed or lost in an electrical current. Instead, they go out one side of the battery, go through the bulb and go back in the other side of the battery. Thus, your two currents are actually the same flow, only the electrons are leaving the battery for one and returning to the battery for the other. Given everything you know about batteries and bulbs, does this make sense?
9. Student: But I thought the battery dies when it runs out of electrons.
10. Tutor: Batteries die when they run out of the chemical energy that they use to pump electrons around the circuit. The battery is like a water pump that produces a powerful current, and the bulb is like a paddle wheel that is driven by the water current. When the water pump runs out of energy, the current slows to a stop. As this happens, does the number of electrons in the battery change?
11. Student: Maybe a little. But it stays about the same.
12. Tutor: Exactly right! I think you understand electrical current better now. See if you can write a better explanation.

Figure 7. A Hypothetical Dialogue between a Student and WHY2.
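The detection step in this loop, flagging candidate misconceptions in a typed explanation so that a KCD can be invoked, might be prototyped as a lookup against a misconception catalog. The catalog entries and keyword patterns below are illustrative assumptions only; as the text notes, the actual plan combines LSA with semantic composition rather than keyword matching.

```python
import re

# Hypothetical misconception catalog: a surface pattern that might
# appear in a student's explanation, mapped to the name of the KCD
# that remediates it. Real detection would use LSA and semantic
# analysis, not regular expressions.
MISCONCEPTIONS = {
    "clashing-currents": re.compile(r"two currents|positive .* negative", re.I),
    "electrons-consumed": re.compile(r"runs? out of electrons|used up", re.I),
}

def detect_misconceptions(explanation):
    """Return the names of all cataloged misconceptions whose pattern
    matches the student's explanation; each triggers a KCD."""
    return [name for name, pattern in MISCONCEPTIONS.items()
            if pattern.search(explanation)]
```

Applied to line 1 of figure 7, such a detector would flag the clashing-currents misconception and invoke the corresponding KCD; line 9 would then trigger a second KCD, matching the nested behavior described above.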
The KCDs of ATLAS will be generalized to incorporate elements of the DANs of AUTOTUTOR.

Our dialogue technology can be stressed by the complexity of the language and discourse we anticipate from the students. However, if we can make it work, the pedagogical payoffs will be enormous. Repairing the qualitative misconceptions of physics is a difficult and fundamentally important problem.

Conclusions

We discussed three projects that have several similarities. AUTOTUTOR, ATLAS, and WHY2 all endorse the idea that students learn best if they construct knowledge themselves. Thus, their dialogues try to elicit knowledge from the student by asking leading questions. They only tell the student the knowledge as a last resort. All three projects manage dialogues by using finite-state networks. Because we anticipate building hundreds of such networks, the projects are building tools to let domain authors enter these dialogues in natural language. All three projects use robust natural language-understanding techniques: LSA for AUTOTUTOR, CARMEL for ATLAS, and a combination of the two for WHY2. All three projects began by analyzing data from human tutors and are using evalua-

Acknowledgments

The AUTOTUTOR research was supported by grants from the National Science Foundation (SBR 9720314) and the Office of Naval Research (N00014-00-1-0600). The ANDES research was supported by grant N00014-96-1-0260 from the Cognitive Sciences Division of the Office of Naval Research. The ATLAS research is supported by grant 9720359 from the LIS program of the National Science Foundation. The WHY2 research is supported by grant N00014-00-1-0600 from the Cognitive Sciences Division of the Office of Naval Research.

Note

1. www.carnegielearning.com.

References

Aleven, V.; Koedinger, K. R.; and Cross, K. 1999. Tutoring Answer Explanation Fosters Learning with Understanding. In Artificial Intelligence in Education, eds. S. P. Lajoie and M. Vivet, 199–206. Amsterdam: IOS.

Anderson, J. R.; Corbett, A. T.; Koedinger, K. R.; and Pelletier, R. 1995. Cognitive Tutors: Lessons Learned. The Journal of the Learning Sciences 4(2): 167–207.

Cassell, J., and Thorisson, K. R. 1999. The Power of a Nod and a Glance: Envelope versus Emotional Feedback in Animated Conversational Agents. Applied Artificial Intelligence 13(3): 519–538.

Chi, M. T. H.; Feltovich, P.; and Glaser, R. 1981. Categorization and Representation of Physics Problems by Experts and Novices. Cognitive Science 5(2): 121–152.

Corbett, A. T., et al. 1999. Third Generation Computer Tutors: Learn from or Ignore Human Tutors? In Proceedings of the 1999 Conference of Computer-Human Interaction, 85–86. New York: Association of Computing Machinery.

Di Eugenio, B.; Jordan, P. W.; Thomason, R. H.; and Moore, J. D. 2000. The Agreement Process: An Empirical Investigation of Human-Human Computer-Mediated Dialogues. International Journal of Human-Computer Studies 53(6): 1017–1076.

Forbus, K. D.; Everett, J. O.; Ureel, L.; Brokowski, M.; Baher, J.; and Kuehne, S. E. 1998. Distributed Coaching for an Intelligent Learning Environment. Paper presented at the AAAI Workshop on Qualitative Reasoning, 26–29 May, Cape Cod, Massachusetts.

Freedman, R. 1999. ATLAS: A Plan Manager for Mixed-Initiative, Multimodal Dialogue. Paper presented at the 1999 AAAI Workshop on Mixed-Initiative Intelligence, 19 July, Orlando, Florida.

Freedman, R., and Evens, M. W. 1996. Generating and Revising Hierarchical Multi-Turn Text Plans in an ITS. In Intelligent Tutoring Systems: Proceedings of the 1996 Conference, eds. C. Frasson, G. Gauthier, and A. Lesgold, 632–640. Berlin: Springer.

Gertner, A.; Conati, C.; and VanLehn, K. 1998. Procedural Help in ANDES: Generating Hints Using a Bayesian Network Student Model. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, 106–111. Menlo Park, Calif.: American Association for Artificial Intelligence.

Gertner, A. S., and VanLehn, K. 2000. ANDES: A Coached Problem-Solving Environment for Physics. In Intelligent Tutoring Systems:
tions with human students through- Physics Problems by Experts and Novices. Fifth International Conference, ITS 2000, eds.
out their design cycle. Cognitive Science 5(2): 121–152. G. Gautheier, C. Frasson, and K. VanLehn,
Although the three tutoring systems Chi, M. T. H.; de Leeuw, N.; Chiu, M.; and 133–142. New York: Springer.
have the common objective of helping LaVancher, C. 1994. Eliciting Self-Explana- Graesser, A. C.; Person, N. K.; and
students perform activities, the specific tions Improves Understanding. Cognitive Magliano, J. P. 1995. Collaborative Dia-
tasks and knowledge domains are Science 18(3): 439–477. logue Patterns in Naturalistic One-on-One
rather different. AUTOTUTOR’S students Chi, M. T. H.; Siler, S.; Jeong, H.; Yamauchi, Tutoring. Applied Cognitive Psychology 9(4):
are answering deep questions about T.; and Hausmann, R. G. 2001. Learning 495–522.
computer technology, ATLAS’s students from Tutoring: A Student-Centered versus a Graesser, A. C.; Wiemer-Hastings, K.;
are solving quantitative problems, and Tutor-Centered Approach. Cognitive Science. Wiemer-Hastings, P.; Kreuz, R.; and the
Forthcoming. Tutoring Research Group 1999. AUTOTUTOR:
WHY2’s students are explaining physi-
cal systems qualitatively. We might Cohen, P. A.; Kulik, J. A.; and Kulik, C. C. A Simulation of a Human Tutor. Journal of
1982. Educational Outcomes of Tutoring: A Cognitive Systems Research 1(1): 35–51.
ultimately discover that the conversa-
Meta-Analysis of Findings. American Educa- Graesser, A. C.; Wiemer-Hastings, P.;
tional patterns need to be different for
tional Research Journal 19(2): 237–248. Wiemer-Hastings, K.; Harter, D.; Person, N.;
these different domains and tasks. That
Conati, C., and VanLehn, K. 1999. Teach- and the Tutoring Research Group. 2000.
is, dialogue styles might need to be dis-
ing Metacognitive Skills: Implementation Using Latent Semantic Analysis to Evaluate
tinctively tailored to particular classes and Evaluation of a Tutoring System to the Contributions of Students in AUTOTU-
of knowledge domains. A generic dia- Guide Self-Explanation While Learning TOR. Interactive Learning Environments 8(2):
logue style might prove to be unsatis- from Examples. In Artificial Intelligence in 129–148.
factory. Whatever discoveries emerge, Education, eds. S. P. Lajoie and M. Vivet, Hake, R. R. 1998. Interactive-Engagement
we suspect they will support one basic 297–304. Amsterdam: IOS. versus Traditional Methods: A Six-Thou-
claim: Conversational dialogue sub- Corbett, A.; Anderson, J.; Graesser, A.; sand Student Survey of Mechanics Test
stantially improves learning. Koedinger, K.; and VanLehn, K. 1999. Third Data for Introductory Physics Students.
50 AI MAGAZINE
Halloun, I. A., and Hestenes, D. 1985. Common Sense Concepts about Motion. American Journal of Physics 53(11): 1056–1065.

Hestenes, D.; Wells, M.; and Swackhamer, G. 1992. Force Concept Inventory. The Physics Teacher 30(3): 141–158.

Johnson, W. L.; Rickel, J. W.; and Lester, J. C. 2000. Animated Pedagogical Agents: Face-to-Face Interaction in Interactive Learning Environments. International Journal of Artificial Intelligence in Education 11(1): 47–78.

Jurafsky, D., and Martin, J. H. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, N.J.: Prentice Hall.

Koedinger, K. R.; Anderson, J. R.; Hadley, W. H.; and Mark, M. A. 1997. Intelligent Tutoring Goes to School in the Big City. International Journal of Artificial Intelligence in Education 8(1): 30–43.

Kulik, J. A., and Kulik, C. L. C. 1988. Timing of Feedback and Verbal Learning. Review of Educational Research 58(1): 79–97.

Landauer, T. K.; Foltz, P. W.; and Laham, D. 1998. An Introduction to Latent Semantic Analysis. Discourse Processes 25(2–3): 259–284.

Lesgold, A.; Lajoie, S.; Bunzo, M.; and Eggan, G. 1992. SHERLOCK: A Coached Practice Environment for an Electronics Troubleshooting Job. In Computer-Assisted Instruction and Intelligent Tutoring Systems, eds. J. H. Larkin and R. W. Chabay, 201–238. Hillsdale, N.J.: Lawrence Erlbaum.

Lester, J. C.; Voerman, J. L.; Townes, S. G.; and Callaway, C. B. 1999. Deictic Believability: Coordinating Gesture, Locomotion, and Speech in Life-Like Pedagogical Agents. Applied Artificial Intelligence 13(4–5): 383–414.

McKendree, J. 1990. Effective Feedback Content for Tutoring Complex Skills. Human-Computer Interaction 5: 381–413.

McKendree, J.; Radlinski, B.; and Atwood, M. E. 1992. The GRACE Tutor: A Qualified Success. In Intelligent Tutoring Systems: Second International Conference, eds. C. Frasson, G. Gauthier, and G. I. McCalla, 677–684. Berlin: Springer-Verlag.

Person, N. K.; Graesser, A. C.; Kreuz, R. J.; Pomeroy, V.; and the Tutoring Research Group. 2001. Simulating Human Tutor Dialogue Moves in AUTOTUTOR. International Journal of Artificial Intelligence in Education. Forthcoming.

Reiser, B. J.; Copen, W. A.; Ranney, M.; Hamid, A.; and Kimberg, D. Y. 2002. Cognitive and Motivational Consequences of Tutoring and Discovery Learning. Cognition and Instruction. Forthcoming.

Rosé, C. P. 2000. A Framework for Robust Semantic Interpretation. In Proceedings of the First Meeting of the North American Chapter of the Association for Computational Linguistics, 311–318. San Francisco, Calif.: Morgan Kaufmann.

Rosé, C. P., and Lavie, A. 2001. Balancing Robustness and Efficiency in Unification-Augmented Context-Free Parsers for Large Practical Applications. In Robustness in Language and Speech Technology, eds. J. C. Junqua and G. V. Noord, 239–269. Amsterdam: Kluwer Academic.

Rosé, C. P.; Di Eugenio, B.; and Moore, J. 1999. A Dialogue-Based Tutoring System for Basic Electricity and Electronics. In Artificial Intelligence in Education, eds. S. P. Lajoie and M. Vivet, 759–761. Amsterdam: IOS.

Shelby, R. N.; Schulze, K. G.; Treacy, D. J.; Wintersgill, M. C.; VanLehn, K.; and Weinstein, A. 2001. The Assessment of ANDES Tutor. Forthcoming.

Stevens, A., and Collins, A. 1977. The Goal Structure of a Socratic Tutor. In Proceedings of the National ACM Conference, 256–263. New York: Association of Computing Machinery.

VanLehn, K. 1996. Conceptual and Metalearning during Coached Problem Solving. In Proceedings of the Third Intelligent Tutoring Systems Conference, eds. C. Frasson, G. Gauthier, and A. Lesgold, 29–47. Berlin: Springer-Verlag.

VanLehn, K.; Freedman, R.; Jordan, P.; Murray, C.; Osan, R.; Ringenberg, M.; Rosé, C. P.; Schulze, K.; Shelby, R.; Treacy, D.; Weinstein, A.; and Wintersgill, M. 2000. Fading and Deepening: The Next Steps for ANDES and Other Model-Tracing Tutors. In Intelligent Tutoring Systems: Fifth International Conference, ITS 2000, eds. G. Gauthier, C. Frasson, and K. VanLehn, 474–483. Berlin: Springer-Verlag.

Arthur Graesser is a professor of psychology and computer science at the University of Memphis, codirector of the Institute for Intelligent Systems, and director of the Center for Applied Psychological Research. He has conducted research on tutorial dialogue in intelligent tutoring systems and is current editor of the journal Discourse Processes. His e-mail address is a-graesser@memphis.edu.

Kurt VanLehn is a professor of computer science and intelligent systems at the University of Pittsburgh, director of the Center for Interdisciplinary Research on Constructive Learning Environments, and a senior scientist at the Learning Research and Development Center. His main interests are applications of AI to tutoring and assessment. He is a senior editor for the journal Cognitive Science. His e-mail address is vanlehn@cs.pitt.edu.

Carolyn Rosé is a research associate at the Learning Research and Development Center at the University of Pittsburgh. Her main research focus is on developing robust language understanding technology and authoring tools to facilitate the rapid development of dialogue interfaces for tutoring systems. Her e-mail address is rosecp@pitt.edu.

Pamela Jordan is a research associate at the Learning Research and Development Center at the University of Pittsburgh. Her main interests are in computer-mediated natural language dialogue, the analysis of these dialogues to identify effective communication strategies, and the creation of dialogue agents to test strategies. Her e-mail address is pjordan@pitt.edu.

Derek Harter is a Ph.D. candidate in computer science at the University of Memphis. He holds a B.S. in computer science from Purdue University and an M.S. in computer science and AI from Johns Hopkins University. His main interests are in dynamic and embodied models of cognition and neurologically inspired models of action selection for autonomous agents. His e-mail address is dharter@memphis.edu.