Artificial Intelligence in Medicine
Volume 38, Issue 1, Pages 25-46, September 2006

An intelligent tutoring system that generates a natural language dialogue using dynamic multi-level planning

  • Chong Woo Woo, School of Computer Science, Kookmin University, 861-1 Chongnung-Dong, Sungbuk-Ku, Seoul, Republic of Korea
  • Martha W. Evens (corresponding author; Tel.: +1 312 567 5153; fax: +1 312 567 5067), Computer Science Department, Illinois Institute of Technology, Room 236, 10 West 31st Street, Chicago, IL 60616, USA
  • Reva Freedman, Northern Illinois University, De Kalb, IL 60115, USA
  • Michael Glass, Valparaiso University, Valparaiso, IN 46383, USA
  • Leem Seop Shim, HS Tech, Inc., 26500 Agoura Road, Suite #108, Calabasas, CA 91302, USA
  • Yuemei Zhang, Wells Fargo - N9301-01J, 255 Second Avenue South, Minneapolis, MN 55479, USA
  • Yujian Zhou, WebEx Communications, Inc., 3979 Freedom Circle, Santa Clara, CA 95054, USA
  • Joel Michael, Department of Molecular Biophysics and Physiology, Rush Medical College, 1750 West Harrison, Chicago, IL 60612, USA

Received 16 February 2005; received in revised form 14 October 2005; accepted 21 October 2005.

Summary 

Objective

The objective of this research was to build an intelligent tutoring system capable of carrying on a natural language dialogue with a student who is solving a problem in physiology. Previous experiments have shown that students need practice in qualitative causal reasoning to internalize new knowledge and to apply it effectively and that they learn by putting their ideas into words.

Methods

Analysis of a corpus of 75 hour-long tutoring sessions carried on in keyboard-to-keyboard style by two professors of physiology at Rush Medical College tutoring first-year medical students provided the rules used in tutoring strategies and tactics, parsing, and text generation. The system presents the student with a perturbation to the blood pressure, asks for qualitative predictions of the changes produced in seven important cardiovascular variables, and then launches a dialogue to correct any errors and to probe for possible misconceptions. The natural language understanding component uses a cascade of finite-state machines. The generation is based on lexical functional grammar.

Results

Results of experiments with pretests and posttests have shown that using the system for an hour produces significant learning gains and also that even this brief use improves the student's ability to solve problems more than reading textual material on the topic. Student surveys tell us that students like the system and feel that they learn from it. The system is now in regular use in the first-year physiology course at Rush Medical College.

Conclusion

We conclude that the CIRCSIM–Tutor system demonstrates that intelligent tutoring systems can implement effective natural language dialogue with current language technology.

Keywords: Intelligent tutoring system, Natural language dialogue, Instructional planning, Dynamic planning, Hierarchical planning, Reactive planning, Language understanding, Dialogue generation

 

1. Introduction 

1.1. Research goals 

The goal of this research was to develop an intelligent tutoring system capable of carrying on a natural language dialogue with students. Our system was originally conceived by two professors at Rush Medical College, Joel Michael and Allen Rovick, inspired by the conviction that natural language interaction was the most effective way for students to learn. They observed that students learn best when required to give explanations of their thinking and they instituted small group problem-solving sessions and individual tutoring sessions to supplement their own computer-aided instruction (CAI) systems [1], [2].

CIRCSIM–Tutor is designed to help first-year medical students learn to solve problems involving the baroreceptor reflex system, which stabilizes blood pressure in the human body. The system presents a perturbation to the cardiovascular system and asks the student to make qualitative predictions about changes in seven important cardiovascular parameters. It analyzes these predictions, identifies any errors, and assists the student in correcting them. The design of the system is based on the analysis of human tutoring sessions carried on by Michael and Rovick at Rush Medical College. This analysis convinced us that planning plays a central role in the generation of a tutorial dialogue.

When CIRCSIM–Tutor asks the student to make predictions about the behavior of various cardiovascular parameters it asks only whether that parameter rose or fell or stayed the same, because the focus is on qualitative causal reasoning. Practicing physicians do not generally need to know the numeric values of these parameters but they do need to use this kind of qualitative reasoning every day [3]. Michael and Rovick originally used a detailed mathematical model in their CAI system, but they found that students lost track of the underlying ideas when they tried to handle detailed models and tables of numbers [4]. So from the very beginning of this project one of the goals was to teach qualitative reasoning [5], [6], [7], an approach to problem solving that reasons about the causal relationships that structure our world. Anderson [8] argues that qualitative reasoning is the most demanding approach, one that is essential to a high performance tutoring system. He claims that it can also maximize pedagogical effectiveness, because it is human-like reasoning, although the implementation effort is much larger than that required for the traditional models.

1.2. Evolution of computer-based instruction at Rush Medical College

Michael and Rovick had a great deal of experience with CAI in the cardiovascular domain at Rush Medical College. Their systems evolved from HEARTSIM [1], to CIRCSIM [2], [9], to the CIRCSIM–Tutor prototype [10], [11] and finally to CIRCSIM–Tutor, which has itself evolved over almost 15 years [12]. HEARTSIM was a Plato program and CIRCSIM is a stand-alone Basic program. The CIRCSIM–Tutor prototype was a Prolog prototype of our intelligent tutoring system designed and implemented without any natural language capabilities [10]. CIRCSIM–Tutor uses many features from the CIRCSIM–Tutor prototype but it is written in Lisp and it includes much more complete student modeling, instructional planning, and natural language facilities.

1.3. Natural language dialogue and tutoring systems 

We made natural language dialogue the core of our system, because it is an especially powerful tool for learning. Chi et al. [13] have now produced scientific verification of our belief that putting ideas into your own words is a central part of learning. In an ingenious experiment Fox Tree [14] demonstrated that people remember ideas that they have heard discussed in dialogue better than ideas presented in a monologue on the same topic. Perhaps this is why Plato's Dialogues have survived for 2400 years while so much other Greek learning has been lost. Fox [15] pointed out that, in tutoring dialogues, the tutors and the students typically construct the answer together, so the students remain active participants and also share ownership in the result.

The builders of the first intelligent tutoring systems, Carbonell [16], Carr and Goldstein [17], Burton and Brown [18], Collins and Stevens [19], all assumed that tutoring should be carried out via natural language dialogue. Then the difficulty of natural language processing and the attractions of the new graphical user interfaces drew intelligent tutoring systems research in a different direction. When this project began, CIRCSIM–Tutor was alone in the field with Wilensky's [20], [21] Unix Consultant, which is really a coach and not a tutor. There was important research in dialogue-based intelligent tutoring systems carried on by Woolf [22], [23] and Cawsey [24], but they used template-based generation and limited (partly menu) input rather than trying to handle whatever the user typed and generating responses from scratch.

Happily, CIRCSIM–Tutor is not so lonely anymore. Increases in machine capability have made natural language dialogue much more manageable and knowledge of text generation has increased rapidly. One notable example is Atlas [25], [26], a physics tutor at the University of Pittsburgh. VanLehn started from Andes, a successful cognitive tutor for physics, and assembled a top team to provide natural language interaction. Freedman's Atlas Planning Environment carried out the dialogue planning [27]. Rosé's [26], [28] Carmel parser did the parsing and produced a logical representation of the student input. Jordan et al. [29], [30] used Tacitus-Lite for reasoning about the analysis of the student's essay and also developed knowledge creation dialogues for dialogue generation.

Graesser and his group at the University of Memphis built AutoTutor [31] based on their studies of human tutoring, which uses latent semantic analysis (LSA) to handle natural language understanding and generation. For natural language understanding, they used LSA to match the student input to one or several ideal answers. They used LSA in generation, as well, to pick out the most relevant answers to a question from a collection of texts generated by experts.

With the encouragement of a multi-university research grant from the Office of Naval Research, the Atlas and AutoTutor project teams joined forces to build tutors for qualitative physics using their two different approaches. The current generation of tutors resulting from this research, Why2-Atlas [32] and Why2-AutoTutor [33], has been shown to be more effective than reading text, the practical alternative for most university courses. Both tutors present the student with a problem and ask for a short essay giving an answer and an explanation of the student's reasoning. Then the system critiques the essay and helps the student to improve its content. VanLehn's team has also built the Pyrenees tutor [34], another physics tutor much like Atlas, except that it discusses problem-solving algorithms with the student in explicit terms, which gives a significant improvement. Making use of these results, Lane and VanLehn [35] have recently developed another dialogue-based tutor for introductory programming students that emphasizes the understanding of the algorithms involved. These tutors must analyze input with longer and more complex content than CIRCSIM–Tutor sees, but their dialogue is not as interactive.

CATO, developed by Ashley and Aleven, is designed to help law students learn techniques of argumentation. Ashley is a well-known expert in legal artificial intelligence, while Aleven contributes the natural language expertise [36], [37]. As a by-product of this research, the authors carried out an experiment demonstrating the efficacy of Socratic tutoring over more didactic tutoring.

A series of experiments by Di Eugenio [38] showed that improving the quality of the natural language generation of an existing system can make a significant difference in learning outcomes. Moore's BEETLE [39], [40], designed to teach basic electricity and electronics to Navy recruits, uses even more sophisticated generation techniques. Her group is using these excellent natural language capabilities to encourage students to participate more fully by responding to student affect and playing an effective part in the mixed-initiative dialogues that result. Forbus and Rosé have combined forces to give Forbus's well-known CyclePad tutor for engineering design [6] a new interface called CycleTalk [41], which has the ability to hold tutoring dialogues with considerable success.

The systems described so far, like CIRCSIM–Tutor, all make use of written interaction, but the age of speech-enabled tutoring has begun. Litman has added a speech front-end to Atlas to produce ITSpoke [42]. In a series of well-designed experiments, she has shown that ITSpoke produces even better learning outcomes than Atlas. Stanley Peters and his team at the Center for the Study of Language and Information at Stanford University have added speech to Wilkins’ Naval Damage Control simulation to produce SCoT [43]. Now that this tutor has been shown to be effective, it may suggest a way to provide training for various types of emergency response teams. In summary, there are now several groups making important contributions to our knowledge of dialogue-based tutoring.

1.4. Domain of CIRCSIM–Tutor—the baroreceptor reflex 

The cardiovascular system consists of many mutually interacting components, and it is important for the student to understand the cause and effect relationships between the individual components of the system. Fig. 1 shows a causal model of CIRCSIM–Tutor, called the “concept map,” designed by Michael and Rovick [44], [4]. Each box in the map represents a physiological variable, such as SV (Stroke volume) and MAP (mean arterial pressure). An arrow with a plus or a minus sign between two boxes tells the direction of the causal effects and whether the causal relationship between the connected variables is direct or inverse. For example, a qualitative change in one component of the system, a decrease in CVP (central venous pressure), directly causes a decrease in SV. This qualitative change propagates to other adjacent components of the system according to the propagation rule. It is important for the student to recognize that when the baroreceptors sense a change in MAP, the baroreceptor reflex kicks in and the central nervous system (CNS in the diagram) directly manipulates three neural variables, the heart rate (HR), the inotropic state (IS), and the total peripheral resistance (TPR), in order to regulate MAP.

Figure 1. The causal concept map. (An arrow from box A to box B means that parameter A immediately determines parameter B. A plus sign indicates that this relationship is direct; a minus sign indicates that it is inverse.) RV: venous resistance, PIT: intrathoracic pressure, CVP: central venous pressure, CBV: central blood volume, BV: blood volume, SV: stroke volume, CO: cardiac output, MAP: mean arterial pressure, BR: baroreceptor reflex, CNS: central nervous system, IS: inotropic state, HR: heart rate, TPR: total peripheral resistance.
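The propagation of a qualitative change along the signed arrows of the concept map can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CIRCSIM–Tutor problem solver: the link table below is a hypothetical subset of the map (the CVP to SV and IS to SV links come from the text, the others from the standard relations CO = HR × SV and MAP = CO × TPR), and the names `CAUSAL_LINKS` and `propagate` are invented for the example. Feedback loops and conflicting influences are ignored.

```python
from collections import deque

# Signed causal links: (cause, effect) -> +1 for a direct relationship,
# -1 for an inverse one. Illustrative subset only, not the full concept map.
CAUSAL_LINKS = {
    ("CVP", "SV"): +1,
    ("IS", "SV"): +1,
    ("SV", "CO"): +1,
    ("HR", "CO"): +1,
    ("CO", "MAP"): +1,
    ("TPR", "MAP"): +1,
}


def propagate(variable, change):
    """Propagate a qualitative change (+1 rise, -1 fall) along the links.

    Returns a dict mapping each affected variable to +1 or -1. The first
    value reached for a variable is kept; feedback and conflicting
    influences are deliberately left out of this sketch.
    """
    result = {variable: change}
    queue = deque([variable])
    while queue:
        cause = queue.popleft()
        for (source, target), sign in CAUSAL_LINKS.items():
            if source == cause and target not in result:
                result[target] = result[cause] * sign
                queue.append(target)
    return result


def as_symbol(value):
    """Render +1 / -1 / 0 in the notation of the prediction table."""
    return {1: "+", -1: "-", 0: "0"}[value]


# A decrease in CVP lowers SV, which lowers CO, which lowers MAP.
effects = propagate("CVP", -1)
print({name: as_symbol(value) for name, value in effects.items()})
```

An inverse relationship would simply carry a sign of -1 in the link table, so that a rise in the cause yields a fall in the effect.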

There are three stages in the human body's response to a perturbation in the system that controls blood pressure. The first stage is the direct response (DR), in which a perturbation in the system has an immediate physical, hemodynamic effect on the other parameters. The second stage is the reflex response (RR), in which other parameters are affected by the negative feedback mechanism to stabilize the blood pressure. The final stage is the steady state (SS), which is achieved as a balance between the changes directly caused by the initial perturbation and the further changes induced by the negative feedback process.

1.5. Organization of this paper 

In the next section, we describe what the system looks like from the user's point of view, display a sample fragment of dialogue and give a brief report of the system trial in November 1999, which demonstrated that an hour with the system produced larger learning gains than reading a carefully chosen piece of text for the same amount of time. In Section 3, we describe some of the special features of the CIRCSIM–Tutor system. The rest of the paper describes how the system works to produce the kind of dialogue shown in the example. In Section 4, we describe the system architecture and then in the subsequent sections we describe each major module in the system and how it functions. Section 5 discusses the core issue of planning and describes the many kinds of planning that are needed for expert tutoring. Section 6 describes the domain knowledge base and the problem solver and Section 7 describes the screen manager. Section 8 discusses some different approaches to understanding the student input. Section 9 describes the student modeler and the different types of assessment that the system makes of the student's performance. Section 10 describes our approach to generating output and Section 11 presents our conclusions.

2. CIRCSIM–Tutor in action 

2.1. How CIRCSIM–Tutor interacts with the student 

CIRCSIM–Tutor begins with a brief introductory message and then displays a list of eight available procedures (shown in Table 1). These procedures were developed by Michael and Rovick for use in the CIRCSIM program and were inherited by CIRCSIM–Tutor. Each procedure (so called because these procedures replaced experimental procedures with animals) describes a perturbation of the cardiovascular system. As soon as the student has made a choice, the system brings up the screen in Fig. 2 with a description of the procedure in the window on the upper right and the prediction table underneath. Table 2 shows a larger diagram of the prediction table. The first column is used to enter qualitative predictions for the DR phase, before the baroreceptor reflex kicks in. A popup menu allows the student to enter a “+” sign to indicate an increase, a “−” for a decrease and a “0” to indicate no change.

Table 1. List of available procedures
1. Decrease arterial resistance (Ra) to 50% of normal
2. Denervate the baroreceptors
3. Decrease Ra to 50% of normal in a denervated preparation
4. Hemorrhage: remove 0.5 liter of blood
5. Hemorrhage: remove an additional 1.0 liter of blood
6. Decrease cardiac contractility to 50% of normal
7. Increase venous resistance to 200% of normal
8. Increase intrathoracic pressure to 2 mm Hg

Table 2. The CIRCSIM–Tutor prediction table
Parameters | DR | RR | SS
Inotropic state | +
Central venous pressure | +
Stroke volume | +
Heart rate | 0 | +
Cardiac output | +
Total peripheral resistance | 0 | +
Mean arterial pressure | +

DR: direct response; RR: reflex response; SS: steady state.

CIRCSIM–Tutor asks the student to figure out which variable will change first and enter the change for that variable in the corresponding square. If the student has difficulty in doing this, the system gives the student a hint. If that hint does not work, it produces a broader hint. If the student's third try is still wrong, the system tells the student the answer. Once the student has succeeded in predicting the first variable, the system asks for predictions for the rest of the first column without giving any feedback until the student has predicted all six remaining variables. The system then marks any errors with a diagonal bar across the box and starts a remedial dialogue with the student about these errors, as shown in the figure. After the student has corrected all the errors in the DR column, the system asks for predictions for the RR phase, then again marks any errors, and begins another tutorial dialogue. Once the RR errors have been corrected, the system asks for predictions about the behavior of these parameters in the SS phase (the third and last column) and then again launches a tutorial dialogue.
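The escalation just described for the first variable (a hint, then a broader hint, then the answer) can be sketched as follows. This is a minimal illustration, not the system's actual code: the function name `respond_to_attempts` and the hint strings are hypothetical, and answers are compared as plain strings.

```python
def respond_to_attempts(correct, attempts, hints):
    """Replay the tutor's responses to successive student attempts.

    correct:  the expected answer (plain string in this sketch)
    attempts: the student's answers, in the order given
    hints:    hints ordered from narrow to broad; once they are used up,
              the next wrong answer makes the tutor give the answer
    """
    responses = []
    for wrong_so_far, attempt in enumerate(attempts):
        if attempt == correct:
            responses.append("Correct.")
            break
        if wrong_so_far < len(hints):
            # Still a hint left: give the next (broader) one.
            responses.append(hints[wrong_so_far])
        else:
            # Third wrong try when two hints are supplied: tell the answer.
            responses.append(f"The answer is {correct}.")
            break
    return responses


# Hypothetical hint texts, invented for the example.
hints = [
    "Think about which variable the procedure changes directly.",
    "Look again at the description of the procedure.",
]
print(respond_to_attempts("IS -", ["HR +", "SV -", "CO -"], hints))
```

With two hints supplied, a student who is wrong three times receives both hints and then the answer; a student who is right on the second try receives only the first hint.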

2.2. The prediction table/multiple simultaneous inputs 

CIRCSIM–Tutor begins with a prediction table, in which the student is asked to make qualitative predictions about the behavior of the system given a particular perturbation. CIRCSIM–Tutor inherited the prediction table from CIRCSIM [2], [9]. This very successful, widely used system asks the student to fill in all three columns of predictions at once, recognizes certain patterns of errors, and then delivers one of over 240 targeted remedial paragraphs stored in the system. Michael and Rovick were convinced that the prediction table was an important factor in the effectiveness of this older system. Although we believe that immediate feedback is valuable (which is why CIRCSIM–Tutor gathers only one column of predictions at a time), we feel that the advantages of using the prediction table outweigh that value. First, the prediction table provides the student with a simple mental model of the task and a way of keeping track of current progress in the solution process. Second, CIRCSIM–Tutor can make a much more detailed and sophisticated student model. It records errors and error patterns. Some error patterns violate fundamental equations; others suggest the possible presence of important misconceptions. Based on a careful analysis of these errors, the tutor can generate a lesson plan, and interactive tutoring begins by using a mixed-initiative Socratic strategy in natural language. Thus, the prediction table provides a qualitative simulation environment for the student by requiring multiple simultaneous inputs (multiple responses to different aspects of a problem provided by the student in a single uninterrupted turn) before interactive tutoring begins.

There are several benefits of adopting this kind of design strategy. First, the system receives enough initial knowledge about the student so that it can narrow the focus for tutoring. Second, it can also detect some common student misconceptions [45], [46] and probe for them further. Third, the presence of a simple mental model of the entire domain prevents the students from getting too far off the track. Elsom-Cook [47] argues that using multiple pedagogic strategies can provide a very powerful learning environment. CIRCSIM–Tutor begins with a coach-like environment during the Prediction Table entry, and then moves to Socratic tutoring for the interactive tutoring session. This kind of flexibility provides a fourth benefit: the system can adapt rapidly to the needs of individual students.
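Collecting a whole column of predictions before tutoring begins amounts to comparing seven entries against an answer key and recording the mismatches. The sketch below assumes a dictionary representation that is not taken from the paper; `check_column` is a hypothetical name, and the key values are placeholders rather than physiological claims (only the stroke volume entries mirror the dialogue in Section 2.3).

```python
VARIABLES = ["IS", "CVP", "SV", "HR", "CO", "TPR", "MAP"]
VALID_VALUES = {"+", "-", "0"}


def check_column(predictions, answer_key):
    """Return the variables whose prediction differs from the answer key.

    Both arguments map each of the seven variables to "+", "-" or "0".
    A missing or malformed prediction raises ValueError, because the
    column must be complete before any feedback is given.
    """
    for variable in VARIABLES:
        if predictions.get(variable) not in VALID_VALUES:
            raise ValueError(f"missing or invalid prediction for {variable}")
    return [v for v in VARIABLES if predictions[v] != answer_key[v]]


# Placeholder answer key for one column (illustrative values only).
answer_key = {"IS": "+", "CVP": "+", "SV": "-", "HR": "+",
              "CO": "+", "TPR": "+", "MAP": "+"}

# A student who gets everything right except stroke volume.
student = dict(answer_key, SV="+")

errors = check_column(student, answer_key)
print(errors)  # the entries to mark and tutor
```

The returned list is the natural starting point for a lesson plan: each erroneous variable becomes a topic for the remedial dialogue.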

2.3. A fragment of a CIRCSIM–Tutor dialogue from November 1999 

Here is a fragment of dialogue between a student and CIRCSIM–Tutor in November 1999, in which a student (Subject #2) is tutored about a prediction error in the change in SV in the RR phase. The procedure that the student is solving right now is “decrease inotropic state by 50%.” This situation can result from several common diseases. As it happens, this student had made all the DR predictions correctly. The prediction table in Table 2 shows the DR and RR predictions made by this student.

T1: What are the determinants of Stroke Volume?

S2: CVP and IS.

T3: Correct, Stroke Volume is determined by Central Venous Pressure and Inotropic State.

Which determinant is dominant in this case?

S4: CVP.

T5: Correct, the important determinant of Stroke Volume in this case is Central Venous Pressure. Is the relationship from Central Venous Pressure to Stroke Volume direct or is it inverse?

S6: direct.

T7: Good, you got the correct answer. What is the correct value of Stroke Volume?

S8: -

T9: Correct, the value of Stroke Volume is decreased. Please change the value of SV in the prediction table.

In this example, student #2 gives the expected answer at each stage as we carry out a tutoring strategy that was abstracted from expert tutoring sessions [48], as described in Section 5. Now let us look at a more typical example, where the student has some difficulty with some of these questions. This second example below comes from student #35, who made the exact same prediction error for SV in the RR phase. The system starts off with the same strategy in T11 as in T1, but the student comes up with only one of the determinants in S12, so the system asks for the other in T13. In S14, the student's answer is not too brief but too long. The system gives an explanation in T15 and follows up with a hint.

T11: What are the determinants of Stroke Volume?

S12: Inotropic state.

T13: Right, Inotropic State is one determinant of Stroke Volume. The other determinant represents preload. What is the other determinant of Stroke Volume?

S14: Preload, afterlaod and inotr opic state.

T15: Right, Inotropic State is one. MAP, or afterload, does affect SV, but the effect is not quantitatively important in this case. Think about what parameter in the prediction table represents preload.

S16: Preload.

T17: What parameter in the prediction table represents preload?

S18: Central Venous Pressure.

T19: Right, Central Venous Pressure determines preload, therefore determines Stroke Volume.

In this second example the student's answer “preload” in S16 is categorized as a “near miss.” The student is certainly not wrong and we do not want the system to respond with “wrong” in this situation as the first version of the system did. The system gives the student yet another nudge in the right direction. When the student comes up with the right parameter name, the system recapitulates what is happening and ties it all up. The student goes on to correct the prediction for SV. We discovered that hinting is an important strategy for human tutors and we have analyzed human hints [45] in some detail and implemented them in CIRCSIM–Tutor [49].
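The distinction between a wrong answer and a near miss can be illustrated with a small classifier. Only the "near miss" category is named in the text; the other labels, the lookup tables, and the function `classify` are assumptions made for this sketch, and the real system works on parsed logic forms rather than term lists.

```python
# Hypothetical lookup tables for the stroke volume example.
ABBREVIATIONS = {"cvp": "central venous pressure", "is": "inotropic state"}
# Terms that are not parameter names but point at one.
NEAR_MISSES = {"preload": "central venous pressure"}


def classify(answer_terms, expected):
    """Classify a student's answer against a set of expected parameters.

    answer_terms: terms extracted from the student's input
    expected:     set of full parameter names the tutor is looking for
    """
    terms = {ABBREVIATIONS.get(t.lower(), t.lower()) for t in answer_terms}
    if terms == expected:
        return "correct"
    if terms & expected:
        # At least one expected parameter is present (others may be
        # missing, or extra terms may have been added).
        return "partially correct"
    if any(NEAR_MISSES.get(t) in expected for t in terms):
        return "near miss"
    return "wrong"


determinants = {"central venous pressure", "inotropic state"}
print(classify(["CVP", "IS"], determinants))                # as in S2
print(classify(["Inotropic state"], determinants))          # as in S12
print(classify(["Preload"], {"central venous pressure"}))   # as in S16
```

Treating "preload" as a near miss lets the tutor respond with a nudge toward the parameter name instead of a flat "wrong."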

2.4. Brief description of the results of CIRCSIM–Tutor experiment in November 1999 

We carried out an extensive experiment to validate CIRCSIM–Tutor in November 1998, with 50 first-year medical students at Rush Medical College, which is described in detail in Michael et al. [50]. In November 1999, we carried out another experiment with a control group that shows that these students learn more about solving problems in an hour with CIRCSIM–Tutor than in reading carefully chosen text from a standard textbook for an hour. This experiment demonstrated that CIRCSIM–Tutor works and led to its routine use at Rush. It was carried out in a regularly scheduled 2-h laboratory. All of the students took a pretest. A control group containing 28 students read a specially edited chapter on the baroreceptor reflex, excerpted from Heller and Mohrman's Cardiovascular Physiology [51] by our experts. The experimental group (with 22 students) used CIRCSIM–Tutor. A third group of 23 students used CIRCSIM. All of the students took a posttest.

We had earlier developed two comparable tests, tests a and b. In each group half of the students took test a as the pretest and the other half took test b. The students who had taken test a as pretest took test b as posttest, while those who took test b as pretest took test a as posttest. Each test had three parts: relationship questions, problem-solving questions, and multiple-choice questions. A later analysis showed that the pairs of multiple-choice questions were not comparable and so we will not report those results. Finally, the students who had used CIRCSIM–Tutor filled out a brief survey form asking for their reactions to the system. More details about this experiment can be found in Evens and Michael [12].

The system performed well. It did not crash and 60% of the students completed all eight procedures. The students made 96 spelling errors and the system corrected 91 of them. It came up with something appropriate to say in response to all but six of the 1692 dialogue inputs. In those six cases, in spite of the inappropriate responses by the system, the student was able to figure out how to keep going and continue the session.

A summary of the test results appears in Table 3. Using one-tailed t-tests and assuming equal variance we can see that both the control students and the students who used CIRCSIM–Tutor learned a significant amount (p<0.05) and the effect sizes (calculated as the difference between the means divided by the variance of the gain scores) varied from moderate to large. While the students in the control group did a better job of memorizing the relationship information, the CIRCSIM–Tutor students did significantly better on the problem-solving task (p<0.001). One argument for comparing system results with those for students reading a targeted text is that this choice of assignments is a real problem for instructors teaching physiology. Our results are comparable with those found by Graesser's group [33]. The results of the survey were quite positive, as can be seen in Table 4. Students found the program easy to use; they felt that they learned a lot; and they would recommend the program to other students. As a result of their requests, the system was installed on a number of computers in the open student laboratory. More details of this study can be found in Evens and Michael [12]. Another question of interest is “How does the effect of using CIRCSIM–Tutor compare with the results of using the old CAI system, CIRCSIM?” As Table 3 shows, the students using CIRCSIM–Tutor have higher mean learning gains, but the difference is not significant. Since Michael's colleagues are now convinced that CIRCSIM–Tutor is the better system, it is in routine use and we have not been able to repeat a full-scale experiment.

Table 3. Results of the CIRCSIM–Tutor experiment at Rush Medical College in November 1999
Measure | Pretest mean (S.D.) | Posttest mean (S.D.) | Gain (pre–post) | p value | Effect size

Control (N=28)
Relationship points (max 24) | 14.1 (4.8) | 19.9 (4.5) | 5.8 | <0.001 | 1.27
Correct predictions (max 20) | 12.2 (3.0) | 13.8 (2.6) | 1.6 | 0.018 | 0.48

CIRCSIM (N=23)
Relationship points (max 24) | 11.0 (5.5) | 13.7 (6.8) | 2.7 | 0.071 | 0.54
Correct predictions (max 20) | 11.5 (5.1) | 16.4 (1.6) | 5.3 | <0.001 | 1.05

CIRCSIM–Tutor (N=22)
Relationship points (max 24) | 10.9 (5.5) | 14.2 (6.3) | 3.3 | 0.039 | 0.65
Correct predictions (max 20) | 11.5 (4.8) | 16.8 (1.8) | 4.9 | <0.001 | 1.24

S.D.: standard deviation.

Table 4. Survey and mean responses from November 1999
Your views on CIRCSIM–Tutor (1=Definitely YES, 2, 3, 4, 5=Definitely NO) | Mean response
1. The system was easy to use | 1.7
2. The introductory screens were helpful | 2.1
3. Entering predictions into the table was easy | 1.8
4. Entering answers to the tutor's questions was easy | 2.1
5. The system's use of language seemed varied and helpful | 1.9
6. The tutor's hints and explanations were informative | 1.85
7. I would prefer that the system always tell me about my mistakes immediately | 2.65
8. CIRCSIM–Tutor helped me understand the behavior of the baroreceptor reflex | 2.0
9. CIRCSIM–Tutor improved my ability to predict the cardiovascular responses to disturbances in blood pressure | 1.9
10. I would recommend the program to friends taking physiology | 1.9

3. Significant features of CIRCSIM–Tutor 

Here is a brief list of some of the areas where CIRCSIM–Tutor has pioneered. The system is modeled on the behavior of expert human tutors. The pedagogical knowledge was extracted from their expert tutoring sessions and represented explicitly as rules, lesson planning rules and discourse planning rules. The rules are used to generate lesson plans and to control discourse strategies. The system interprets the rules and builds the lesson plans or returns an appropriate discourse action. The discourse is also modeled on the experts so, like expert human tutors, CIRCSIM–Tutor asks questions and produces hints whenever possible, but almost never tells the student the answers. A detailed description of our observations on human tutors can be found in Evens and Michael [12].

The more time we spent analyzing human tutoring sessions, the more planning we found. As a result the planner in our system is the central controller. It combines two different instructional planning approaches: lesson planning and discourse planning. Lesson planning produces global lesson plans. The planner then puts together strategies for carrying out those plans and tactics for carrying out the strategies. Plans for carrying out the strategies are produced by the discourse planning stage. The planner plans dynamically based on the inferred student model; it generates plans, monitors the execution of the plans, and replans when the student interrupts with a question during the tutoring session. The planner plans at different levels of the hierarchy; the higher level is an abstraction of the plan (lesson goals) and the lower is a detailed plan (subgoals), sufficient to solve the problem. The planner supports some minimal student initiatives during the tutoring session. If the student asks a question the planner suspends the current plan, carries out the student request, and then resumes the suspended plan.
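The suspend-and-resume behavior of the planner can be pictured as a stack of plans: a student question pushes a short plan on top, and the suspended lesson plan continues once that plan is exhausted. The class below is a simplified sketch under that assumption; `DialoguePlanner` and its method names are hypothetical, and the real planner also replans from the student model, which is omitted here.

```python
class DialoguePlanner:
    """Two-level plan (lesson goals -> subgoals) with interruption support."""

    def __init__(self, lesson_plan):
        # lesson_plan: list of (goal, [subgoal, ...]) pairs.
        # Expand the higher-level goals into an ordered list of detailed steps.
        steps = [(goal, sub) for goal, subs in lesson_plan for sub in subs]
        self.stack = [steps]  # top of the stack is the plan being executed

    def student_question(self, question):
        """Suspend the current plan and handle the student's request first."""
        self.stack.append([("student request", f"answer: {question}")])

    def next_step(self):
        """Return the next step, resuming suspended plans as needed."""
        while self.stack and not self.stack[-1]:
            self.stack.pop()  # finished plan: fall back to the suspended one
        if not self.stack:
            return None
        return self.stack[-1].pop(0)


planner = DialoguePlanner([
    ("tutor SV", ["ask determinants", "ask dominant determinant"]),
])
first = planner.next_step()
planner.student_question("What is preload?")  # student takes the initiative
second = planner.next_step()                  # the request is served first
third = planner.next_step()                   # then the lesson plan resumes
print(first, second, third, sep="\n")
```

A stack keeps the suspended plan intact, so nothing needs to be regenerated when tutoring resumes after the student's question.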

The student modeler stores four different levels of assessment to feed plans at four different levels: curriculum, procedure, phase, and topic. The input understander does extensive spelling correction, and then uses a cascade of finite-state machines to handle free text answers, which are typically fragmentary. The natural language generation component generates sentences from logic forms using a lexical functional grammar approach.

4. Architecture of the CIRCSIM–Tutor system 

The typical intelligent tutoring system consists of four major components [52], [53]: the domain knowledge base, a collection of instructional strategies and an algorithm for applying them, a student modeler, and an interface. Since the major goal of CIRCSIM–Tutor was to carry on a natural language dialogue, we divided the interface module into three pieces, an input understander, a text generator, and a screen manager. Even a few minutes listening to an expert human tutor in action was enough to convince us that we needed a dynamic hierarchical planner as well as a domain problem solver. As a result, CIRCSIM–Tutor has seven major modules: the instructional planner, the domain knowledge base, the problem solver, the screen manager, the input understander, the student modeler, and the text generator. Fig. 3 shows the overall architecture of our system as described by Woo [54], [55]. The Instructional Planner is in the center because we discovered that planning is indeed the central issue in tutoring.

Most of our decisions about the architecture were driven by the need to respond to the students in two seconds or less, the longest period that we could hope the students would wait according to the authoritative literature on interfaces [56], [57], [58]. This meant that we had to parse the input, turn it into a logic form, update the student model with the results, plan a response, and turn it into a series of sentences in less than two seconds on the kind of Macintosh that was available in early 1991. As a result, we chose to use space to save time whenever a time/space tradeoff presented itself. We considered a couple of less conventional approaches to the system architecture, especially a blackboard architecture [59] and a more hierarchical type of distributed architecture, and decided against both of them, because of these same time constraints.

4.1. How the planner and the other modules work together 

We now give a brief description of how the modules in Fig. 3 work together, coordinated by the planner. As an example, we explain how these modules collaborate to produce the turns S11–S19 in the second dialogue in Section 2.3. This dialogue carries out the first part of the lesson goal: tutor the student about the value of SV in RR, because the student has made an incorrect prediction about the change in SV in RR.

1. Planner: picks the current topic from the subgoal stack, selects the discourse tactic, question, and combines them into one internal logical form.

   current topic: (determined-by SV?), discourse tactic: question.

   call Text Generator: (question (determined-by SV?))

2. Text generator: generates the sentence: “What are the determinants of Stroke Volume?” and returns it to the planner, which passes it to the screen manager.

3. Screen manager: displays the sentence T11 in the window.

4. Student input (S12): “inotropic state”.

5. Planner: passes the student's input with the current lesson topic to the input understander (question(determined-by SV?, answer(inotropic state))).

6. Input understander: parses the student's answer, checks its coherence with the dialog history, and calls the student modeler with the logic form (answer(determined-by SV (IS))).

7. Student modeler: calls the problem solver, which, in turn, uses the domain knowledge base to get the correct answer: (CVP, IS), compares the correct answer with the student answer, discovers that the student's answer is partially correct but is missing CVP, updates the student model, and then returns to the input understander with the information: (category partial (correct IS) (missing CVP)).

8. Input understander: returns to the planner with the logic form that it sent to the student modeler and the information returned by the student modeler. The planner puts onto the lesson goal stack the new goal (tutor (category partial (correct IS) (missing CVP))). Then it looks for a rule to tutor a partial answer.

Unfortunately, it does not recognize that this is a good time to respond with “and?” but instead it creates three subgoals: give a positive acknowledgment for IS, give a hint about CVP, and then ask for the other determinant. It chooses the top subgoal, calls the text generator to put it into words, then gives it to the screen manager to display. It then chooses the next subgoal and repeats the process. The result is T13:

T13: Right, Inotropic State is one determinant of Stroke Volume. The other determinant represents preload. What is the other determinant of Stroke Volume?

The planner deals with each of the student answers in the same way – it hands the answer to the input understander, which parses the input and calls the student modeler. It takes the information from the student modeler, creates a new lesson goal and one or more subgoals, and puts them at the top of the stack. It then executes those subgoals one-by-one. When it gets to the end of the output T19, it finally finishes all the new goals, uncovers the rest of the subgoals from the original lesson plan and continues with that plan. As you can see, all the modules in the system are involved with the production of each turn in the dialogue.
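The comparison in step 7 and the three subgoals behind T13 can be sketched roughly as follows. This is an illustrative simplification, not the system's code: the function names are hypothetical, answers are treated as plain sets of variable names, and any extra wrong variables in the student's answer are ignored.

```python
def categorize(student_answer, correct_answer):
    """Compare two collections of variable names, in the spirit of step 7."""
    student, correct = set(student_answer), set(correct_answer)
    right = sorted(student & correct)
    missing = sorted(correct - student)
    if student == correct:
        return ("correct", right, missing)
    if right:
        return ("partial", right, missing)
    return ("incorrect", right, missing)


def subgoals_for_partial(right, missing):
    """Acknowledge what was right, hint at what is missing, then ask again."""
    return ([("acknowledge", v) for v in right]
            + [("hint", v) for v in missing]
            + [("ask-other-determinant", v) for v in missing])


def push_on_top(goal_stack, new_goals):
    """Index 0 is the top of the stack, so new goals run before older ones."""
    return list(new_goals) + list(goal_stack)


category, right, missing = categorize(["IS"], ["CVP", "IS"])
stack = push_on_top(["relation", "value"], subgoals_for_partial(right, missing))
```

With the student answer "inotropic state" and the correct answer (CVP, IS), the sketch yields the category partial, with IS correct and CVP missing, and places the three new subgoals above the remaining subgoals of the original plan.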

5. The instructional planner 

The instructional planner is the central component of our intelligent tutoring system; it is responsible for making decisions about the content of the lesson and decisions about its presentation strategy. The planning component of CIRCSIM–Tutor must carry out both functions, since it needs to provide a global lesson plan, and it needs to carry on a natural language exchange with the goal of providing the most effective instruction possible to the student. The problem of decision-making in an intelligent tutoring system has long been viewed as a complex planning problem [59], [60], [61], [62], [63]. Adaptive planning techniques in the tutoring domain enable the generation of customized plans for individualized instruction [64], [65].

It was apparent from the beginning of this project [54], [55], [66] that the computer tutor required a dynamic hierarchical planner capable of producing plans just in time to use them. The planning must be dynamic because we cannot predict what the student will say. The system must be capable of deleting old plans and making new ones at any point. It must be hierarchical to handle multiple levels of planning. It must be capable of long-range planning to support lesson goals and multi-turn discourse moves like directed lines of reasoning (DLRs, see Section 5.6), but it must postpone lower-level planning until it is needed. Since the student model affects nearly all lower-level plans and since that model changes at every turn, it is important to generate lower-level plans just prior to use.

The instructional planner also serves as the main program of CIRCSIM–Tutor. It consists of three parts (Fig. 4): the lesson planner, the discourse planner, and the plan controller, which has already been briefly described in Section 4.1. The lesson planner sets the instructional goals and develops the tutoring strategies and tactics needed to carry them out. The discourse planner turns the tactics into discourse plans and calls the text generator to produce the actual sentences one at a time. These planners use explicit planning rules derived from the analysis of expert human tutoring sessions to build their plans.

5.1. The lesson planner 

The lesson planner decides on the contents of a lesson, based on its model of the student's current knowledge about the domain. The planner generates the lesson goals, sequences them, and selects the appropriate planning strategies to create a plan for the current lesson goal. Fig. 5 shows the architecture of the lesson planner including the necessary planning steps, the student model, and the lesson planning rules. The lesson planner is a rule-based system. The result of the lesson planning is a set of subgoals (a plan), each of which will be the topic for a dialogue with the student.

The initial lesson planning is done as soon as the student finishes filling in a column of predictions. The planner calls the student modeler, which, in turn, calls the problem solver to get the correct values for the column. The student modeler compares the student's entries with the correct values, identifies the errors, updates the student model, and returns to the lesson planner with a list of errors. The lesson planner calls the screen manager to turn the corresponding boxes in the prediction table red and draw diagonal lines across them. Then it looks for patterns of errors and creates a sequence of remedial goals. The student modeler also lists any potential misconceptions triggered by the student's patterns of errors. If there are any possible misconceptions, those goals go on the top of the lesson goal stack. Next on the goal stack are any errors in the variables controlled by the nervous system, since a discussion of the neural variables may raise other misconception flags. Last on the goal stack are any errors in the other variables, listed in the “logical order”—the order in which the solution algorithm determines the correct prediction, since Michael and Rovick always use that order.
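The ordering of the initial goal stack can be sketched as a simple three-way partition. Everything named here is an assumption made for illustration: the function `order_lesson_goals`, the goal labels, and the example values for the neural variables and the logical order are hypothetical (the text above confirms only that TPR is under neural control). Ordering the neural variables by the logical order is also an assumption, since the text does not say how they are ordered among themselves.

```python
def order_lesson_goals(misconceptions, errors, neural_vars, logical_order):
    """Return lesson goals with the first goal to tutor at index 0."""
    neural = [v for v in logical_order if v in errors and v in neural_vars]
    others = [v for v in logical_order if v in errors and v not in neural_vars]
    return ([("misconception", m) for m in misconceptions]  # top of the stack
            + [("neural", v) for v in neural]                # then neural variables
            + [("other", v) for v in others])                # then the rest, in logical order


goals = order_lesson_goals(
    misconceptions=["frank-starling-vs-inotropic-state"],
    errors={"SV", "TPR", "CO"},
    neural_vars={"TPR"},                       # illustrative value
    logical_order=["CVP", "SV", "CO", "TPR"],  # illustrative order only
)
```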

More lesson goals and subgoals are added as the dialogue proceeds. Most new goals are added at the top of the stack to be executed at once. This happens when the student makes errors that reveal a need for another lesson as in turns S12 and S14 in Section 4. It also happens when the student takes the initiative; the lesson goal is then to respond to the initiative, if at all possible. Finally, it happens when the system fails to understand the student's input. Then the lesson goal is to tell the student what kind of input the system was expecting and ask the student to try again.

In the experiments in November 1998 and 1999, if the student made no errors in a column, then no tutoring took place. In 1999 a majority of the students solved all eight procedures and a number of those made correct predictions for a whole column in the last three or four procedures. The result was that the best students were getting the least stimulation. In order to add stimulation and also to probe more effectively for student misconceptions, we developed a dozen open questions to ask the students in this situation, which we deployed in November 2002. Most students responded to the questions with longer, more thoughtful answers [67]. This gives us a strong motivation to expand the input understander to parse, categorize, and represent these answers so that the system can respond intelligently, as described in Section 9.

5.2. Lesson planning rules 

The lesson planner uses three sets of lesson planning rules: goal generation rules, strategy rules, and tactical rules. The rules are written in if-then form as explicit production rules and interpreted by the rule interpreter. The system has about 50 goal generation rules, 20 strategy rules, and 20 tactical rules.

The interpreter is built using Lisp macro functions, which understand and interpret the rules for the system. As a result the rules can be written, not as Lisp code, but in any free format as long as the rule interpreter can understand them. We designed the rules with three parts: the name part of the rule, the antecedent part, and the consequent part, in the form: (Rule_name: (antecedent)(consequent)). This approach makes the system efficient in representing the rules explicitly.

For example, assume that the student made an error in predicting the variable TPR. One of the goal generation rules applies: if the student does not know TPR, then build the lesson goal to tutor the neural control of TPR. This rule can be expressed as (G_Rule1: ((do-not-know TPR)(neural-control TPR))). If the current lesson goal is to teach the causal relationship between central venous pressure (CVP) and SV, and the student does not know the direction, then the strategy is tutor-causality; this rule can be written as (S_Rule1: ((causal-relation)(do-not-know direction))(tutor-causality)). This is the strategy rule for dealing with non-neural variables. If the strategy is tutor-causality, then the corresponding tactical rule is to teach determinants, actual-determinant, relation, and value. This rule can be written as (T_Rule1: ((tutor-causality)(determinants) (actual-determinant) (relation) (value))).
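The three-part rule form (name, antecedent, consequent) can be sketched as data plus a small interpreter. The original interpreter was built from Lisp macros; the Python below is only an analogy, and the `Rule` type, the encoding of conditions as tuples, and the function `first_applicable` are hypothetical. It picks the first rule whose antecedent conditions all hold, which matches the first-applicable-rule policy described for the plan controller in Section 5.7.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    antecedent: frozenset  # conditions that must all hold
    consequent: tuple      # what the rule produces


RULES = [
    Rule("G_Rule1",
         frozenset({("do-not-know", "TPR")}),
         ("neural-control", "TPR")),
    Rule("S_Rule1",
         frozenset({("causal-relation",), ("do-not-know", "direction")}),
         ("tutor-causality",)),
    Rule("T_Rule1",
         frozenset({("tutor-causality",)}),
         ("determinants", "actual-determinant", "relation", "value")),
]


def first_applicable(rules, facts):
    """Return the first rule whose antecedent is a subset of the known facts."""
    for rule in rules:
        if rule.antecedent <= facts:
            return rule
    return None


fired = first_applicable(RULES, {("do-not-know", "TPR")})
```

Keeping the rules as data rather than as code is what lets new rules be added without touching the interpreter.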

5.3. The goal generation process 

The generation of the goals is guided by a set of explicit goal generation rules designed by our experts, which ensures that the most serious misconception is selected and tutored first. For example, suppose the student made wrong predictions in the table for the variables TPR and SV. The student modeler has determined, from its analysis, that the student is confused about the mechanism controlling TPR and the causal relationships between CVP and SV and between SV and CO. So the lesson planner retrieves the information from the student model, applies the goal generation rules (see Table 5), and generates the lesson goals dynamically. The result is a set of lesson goals in the goal stack (see Table 6).

Table 5. Goal generation rules

1. If:   Current primary variable is IS and student answer is not nochange for TPR
   Then: Build lesson goal (neural-control (TPR))

2. If:   Current primary variable is CVP and student does not know causal-relationship between CVP and SV
   Then: Build lesson goal (causal-relation (CVP, SV))

3. If:   Current primary variable is CVP and student does not know causal-relationship between SV and CO
   Then: Build lesson goal (causal-relation (SV, CO))

IS: inotropic state; TPR: total peripheral resistance; CVP: central venous pressure; SV: Stroke Volume; CO: cardiac output.

Table 6. Generated lesson goals in the goal stack

Order   Lesson goal
1       Neural-control (TPR)
2       Causal-relation (CVP, SV)
3       Causal-relation (SV, CO)

TPR: total peripheral resistance; CVP: central venous pressure; SV: Stroke Volume; CO: cardiac output.

The goal generation is significant in many ways; the goals are generated dynamically and adaptively; the goals are sequenced in the order that the experts use to tutor this material; the goals provide a global context that remains coherent and consistent throughout the tutoring session, unless the goals are revised. New goals can also be generated, which tutor the student about a common misconception (a bug), if the Student Modeler detects such a misconception. The goals remain in force until they are changed by the planner dynamically.

5.4. The plan generation mechanism 

The second stage of the lesson planning is the plan generation mechanism, which creates the instructional plan by applying two sets of rules, rules for selecting tutorial strategies to achieve the selected goal and rules for selecting pedagogic tactics to execute those strategies. Strategy rules (Table 7) describe the tutorial approach from a domain-independent point of view. These include tutoring prerequisites before the material they underlie, reminding the student about relations between two parameters, explaining the definition before tutoring about it, and so on. Tactical rules (Table 8) can also be viewed as a domain-independent tutorial approach; they involve asking about concepts and relations between the concepts.

Table 7. Some strategy rules

1. If:   Goal is causal-relation and student does not know and direction is incorrect
   Then: Strategy is tutor-causality

2. If:   Goal is causal-relation and student does not know and direction is correct
   Then: Strategy is remind-relation

3. If:   Goal is neural-control and this is first procedure
   Then: Strategy is define-neural

Table 8. Some tactical rules

1. If:   Strategy is tutor-causality
   Then: Tactic is determinant, actual-determinant, relationship, value

2. If:   Strategy is tutor-neural-control
   Then: Tactic is mechanism, value

For instance, if the goal is to teach the causal relationship between two parameters, then the strategy rule fired is tutor-causality, and this then fires the tactical rule: ask about determinants, actual determinant, relationship, and correct value. The result is a hierarchy of goals. Thus, the current goal is ultimately refined into four subgoals by two-step goal transformations. In order to solve the current goal, all the subgoals must be solved. The subgoals generated at the tactical level are the current plan for the goal. These are kept in a subgoal stack (Table 9), which is used by the Discourse Planner to pick the next topic.

Table 9. The subgoal stack

Order   Subgoal
1       Determinants
2       Actual-determinants
3       Relation
4       Value
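The two-step refinement from goal to strategy to tactics can be sketched as follows. The conditions are read off Tables 7 and 8, but the function names, the keyword arguments, and the use of a dictionary for the tactical rules are assumptions made for this sketch.

```python
def choose_strategy(goal, knows=False, direction_correct=False, first_procedure=False):
    """Strategic step: pick a strategy for the current lesson goal."""
    if goal == "causal-relation" and not knows:
        return "remind-relation" if direction_correct else "tutor-causality"
    if goal == "neural-control" and first_procedure:
        return "define-neural"
    return None  # no strategy rule in this small sample applies


# Tactical step: each strategy expands into an ordered list of subgoals.
TACTICS = {
    "tutor-causality": ["determinants", "actual-determinant", "relation", "value"],
    "tutor-neural-control": ["mechanism", "value"],
}


def plan(goal, **student_state):
    """Refine a goal into (strategy, subgoal stack); index 0 is the next topic."""
    strategy = choose_strategy(goal, **student_state)
    return strategy, list(TACTICS.get(strategy, []))


strategy, subgoal_stack = plan("causal-relation", knows=False, direction_correct=False)
next_topic = subgoal_stack.pop(0)
```

Popping the first element plays the role of the discourse planner picking the next topic off the subgoal stack.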

5.5. An example of the lesson planning process 

Table 10 shows an example of the lesson planning process for the causal-relationship between CVP and SV. From the top of the table, the goal generation step is described with the other information that it uses: student model, rules used, goal stack, and current goal. Then the plan generation step is described in two steps, the strategic and the tactical steps. The lesson planner waits for the discourse planner to complete the current lesson plan, and when the plan controller sends a wake-up signal, the planner gets reactivated and continues with the next goal in the goal stack.

Table 10. An example of lesson planning

Goal generation (a)
  Student model:    Do-not-know (SV)
  Goal stack:       Causal-relation (CVP-SV)
                    Causal-relation (SV,CO)
  Current goal:     Causal-relation (CVP-SV)

Plan generation (b)
  Strategy:         Tutor-causality
  Tactics:          (Determinants) (Actual-determinant) (Relation) (value)

Discourse planner:  Executes “determinants of SV”
Plan monitoring:    Waits for the student response

(a) Rules used: DR_G_Rule8.
(b) Rules used: DR_S_Rule1, DR_T_Rule6.

5.6. The discourse planner 

The discourse planner in CIRCSIM–Tutor mainly controls interactions between the tutor and the student. It needs to decide how the tutor should respond to a student with a given problem. This discourse strategy must be planned explicitly by the discourse planner so that the system can enter into flexible and coherent interactions. The discourse planner picks a tactical subgoal off the subgoal stack and turns it into a discourse plan ready for the text generator. The plan controller monitors the execution of the plan and forces the discourse planner to suspend the current plan when the student takes control.

The discourse planner consists of two sets of discourse planning rules. The upper level decides when it is time to hint or to give the student the answer. It also chooses from a collection of stored multi-turn plans for tutoring misconceptions. These plans have been developed from years of experience by the expert tutors to attempt to counter certain misconceptions. They are organized as the kind of multi-turn structure named a directed line of reasoning [68], in which the tutor asks a sequence of questions to deliver an explanation or a summary. For example, the plan for the common misconception in which the student mixes up the Frank–Starling Law and inotropic state can be described as a three step plan:

1. Ask or state the Frank–Starling Law

2. Ask or define inotropic state

3. Ask or explain the relationship between them (an increase in inotropic state will shift the Frank–Starling curve to the right along the axis).

We have an example in the transcripts in which the tutor delivers a monologue, but most often the tutor tries to get the student to supply the first two parts and then explains the third. The discourse planner has a collection of plans of this type. The lower level discourse rules are made up of logic forms and the conditions in which to apply them.
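A stored multi-turn plan of this kind can be sketched as a list of steps, each with a preferred move. The representation is hypothetical: the step tuples, the `run_dlr` function, and the `student_can_answer` callback are assumptions for illustration. The sketch follows the most common pattern reported above, in which the tutor asks for the first two parts and explains the third, and falls back to stating a part when the student cannot supply it.

```python
DLR_FRANK_STARLING = [
    ("Frank-Starling Law", "ask"),
    ("inotropic state", "ask"),
    ("relationship between them", "explain"),
]


def run_dlr(steps, student_can_answer):
    """Return the tutor's moves for one pass through a directed line of reasoning."""
    moves = []
    for topic, preferred in steps:
        if preferred == "ask":
            moves.append(("ask", topic))
            if not student_can_answer(topic):
                # The student could not supply this part, so the tutor states it.
                moves.append(("state", topic))
        else:
            moves.append(("explain", topic))
    return moves


moves = run_dlr(DLR_FRANK_STARLING, lambda topic: topic == "inotropic state")
```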

5.7. The plan controller and executor 

As we indicated above in Section 4.1, the plan controller and executor spends most of its time picking a goal off the subgoal stack, finding the correct rule, and carrying it out. The first applicable rule is used (unless possibly that rule has been tried and observed to fail with the same student during this procedure), so ties are never a problem. In addition it plays a role in starting the system off at the beginning of the session and closing it down at the end. When there is no session underway, it waits for the screen manager to tell the system that a student is ready to start, and then calls each module in turn and tells it to flush out any remaining data from previous students. Whenever the student is ready to quit, it closes the student log, calls the student modeler to close and store the student model, and calls the other modules and tells them to close.

6. The domain knowledge base and the problem solver 

We describe the domain knowledge base and the problem solver together because they must work together constantly to solve domain problems and the design of each vitally affects the other.

6.1. The domain knowledge base 

The builder of a domain knowledge base faces two very important questions: what knowledge should it contain and how should that knowledge be encoded. Anderson [8] defines three different categories of knowledge encoding: the black box model, the glass box model, and the cognitive model. In a black box model the problem solving steps are hidden, only the final solutions are visible, as in many Bayesian systems. In a glass box model the rules and the problem-solving steps are visible to the user, but the problem-solving process does not always follow a human approach, as in an expert system that uses the Rete algorithm. In a cognitive model like ours, both the encoding of the knowledge and the problem solving process are deliberately modeled on human cognitive processes. The domain knowledge is decomposed into meaningful frames and the qualitative, causal reasoning is represented explicitly, so that the system can teach the student to solve problems in the same way that the expert tutors teach their students to do [69], [70]. Because our system is primarily designed to be used by students early in the first year, the underlying model of the cardiovascular system is definitely simplistic; for example, it totally neglects the pulmonary circulation.

Our domain knowledge is divided into four different types of knowledge needed for tutoring; each type has its own ontology [70], [71]: declarative knowledge about the cardiovascular system, procedural knowledge (the algorithms described above), knowledge of tutoring strategies and tactics, encoded as plans for the planner, a list of common misconceptions and the patterns of errors used to diagnose each one, and a hierarchy of computer concepts so that the system can communicate with the student about which window to look at and which button to press. Declarative knowledge includes variables like those mentioned above and causal relationships between them, and also some anatomy. Procedural knowledge involves the rules for using the concepts in solving problems. For example, in CIRCSIM–Tutor, one rule for figuring out the actual determinant of SV is: if the primary variable is CVP, then CVP is the actual determinant of SV in this procedure. Knowledge of tutoring heuristics must be extracted from the experience of domain experts; it involves ways of teaching the student about the particularly difficult points in the domain.

The basic knowledge is encoded in frames with slots for each aspect. Each variable has its own frame, which also includes the causal relationships involving it. Table 11 displays the frame for SV. Each relationship also has a frame, to make it easier to record information about the relationships in the student model. There are three conceptual levels in the domain knowledge [44]. The top level represents the knowledge in the concept map in Fig. 1. This is the knowledge that the expert tutors particularly want their students to internalize and use in problem solving. The middle layer contains this knowledge and further knowledge that the tutors use in explanations and hints. The bottom layer includes all the knowledge in the middle layer as well as anything else that we find the students bringing up in the tutoring sessions, which the tutor needs to be able to understand and respond to. After a long discussion with experts, some knowledge of anatomy was included at this lowest level. The experts objected to the inclusion of anatomical knowledge in a physiology tutor, but we were able to demonstrate a number of places in the human tutoring sessions where the students asked anatomy-based questions (they tend to find anatomy easier to understand than physiology) and the tutors answered. The experts agreed that we could include this knowledge as long as it was used only for responding to issues raised by the student.

Table 11. The frame for Stroke Volume in the domain knowledge base

(frame SV
  (frame-type           Variable
   Var-type             Physically affected
   Frame-name           SV
   Class                Instance
   Instance-of          Variable
   Name                 Stroke Volume
   Definition           Volume of blood ejected each Heart beat
   Part-of              Heart
   Anatomy              Ventricle
   Causal-relation-in   Causal-CVP-SV causal-IS-SV
   Causal-relation-out  Causal-SV-CO))

SV: Stroke Volume; CVP: central venous pressure; IS: inotropic state; CO: cardiac output.
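As a rough illustration of how a frame like the one in Table 11 supports a question such as "What are the determinants of Stroke Volume?", the frame can be written as a dictionary of slots. This is a sketch, not the system's representation: the dictionary encoding and the `determinants` helper are hypothetical, and reading the cause out of the relation name (assumed to follow the pattern causal-CAUSE-EFFECT) is a shortcut, since in the system each relationship has its own frame.

```python
SV_FRAME = {
    "frame-type": "variable",
    "var-type": "physically affected",
    "frame-name": "SV",
    "class": "instance",
    "instance-of": "variable",
    "name": "Stroke Volume",
    "definition": "Volume of blood ejected each heart beat",
    "part-of": "heart",
    "anatomy": "ventricle",
    "causal-relation-in": ["causal-CVP-SV", "causal-IS-SV"],
    "causal-relation-out": ["causal-SV-CO"],
}


def determinants(frame):
    """Variables that causally affect this one, read from the incoming relations."""
    return [relation.split("-")[1] for relation in frame["causal-relation-in"]]


def affected_variables(frame):
    """Variables that this one causally affects, read from the outgoing relations."""
    return [relation.split("-")[2] for relation in frame["causal-relation-out"]]


sv_determinants = determinants(SV_FRAME)
```

The result agrees with the correct answer (CVP, IS) that the problem solver retrieves in the walkthrough in Section 4.1.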

All of this domain knowledge is encoded in one network of frames. The goal was to include plans and algorithms in the frames so that the system could prompt the student to take the next step or discuss the algorithms explicitly with the student depending on the state of the student model. The current domain knowledge base [72], the problem solver [73], and the domain planner [54] were built some years ago. Although we have added knowledge to the knowledge base, invented new problems, and developed more plans, they remain essentially the same, unlike the other modules, which have been subject to major changes.

6.2. The problem solver 

Clancey [74] claims that the intelligence of an intelligent tutoring system comes from its ability to answer the questions it asks the student and the questions the student asks it. If the problem solver solves the problems but cannot explain how it solves them, it may just as well retrieve stored answers. The ability to solve the problem, using the expert's problem solving behavior, can be used to identify the student's misconceptions, to give an explanation, and to provide a basis for tutoring strategies. Problem solving in CIRCSIM–Tutor is carried out by two problem solvers: the main problem solver and the assistant solver. The main problem solver solves the problem, generates correct answers, and produces the same problem-solving path as an expert in the domain. This solution path can be used to monitor the student's problem solving behavior while the student is making entries in the predictions table. The assistant solver solves current problems generated by the planner, such as the determinant of x, the relationship between x and y, and also problems coming from the student questions. The other modules of the system consult these problem solvers to get any information they need. The problem solvers extract the information from the domain knowledge base for them.

7. The screen manager 

The screen manager mediates the interaction between the student and the system. When the student logs on it displays system messages through the introductory windows. Then it displays the list of procedures that the student can select. As soon as the student has selected a procedure, the screen manager paints the main window with the procedure description in the window. Below the procedure description, the screen manager displays the prediction table with instructions about how to use the mouse and how to make entries into the table. Then it picks up the qualitative predictions (+, −, 0) from the prediction table one by one from the mouse clicks and passes them to the planner. At first, when the student clicked in the wrong window or the wrong column of the prediction table, the system produced a beep along with its warning message. We soon learned that medical students do not like beeps that tell the person at the next computer that they have made an error. Flashing messages and color changes are much more acceptable.

Originally the CIRCSIM–Tutor dialogue was displayed in two windows, one for natural language input from the student and one for tutor output. We organized it this way because Fox [15] describes the importance of back channel or overlapped responses (given while the other person is speaking) from both tutor and student in keeping the tutorial dialogue going. Unfortunately, the students seemed to find this confusing, so in 1996 we built a new version of the screen manager with the contributions from the student and the tutor interleaved in the same window [75]. Colleagues who used the system just before this change told us that the system was much improved and refused to believe us when we told them that the only change was the new window with interleaved dialogue. Fox [15] also describes human tutors as co-constructing the answer with the student. It may be that interleaving the contributions from the two speakers made users feel that the system was more cooperative and interactive. This change also made it feasible to allow the student to scroll back through the dialogue, which some students find very helpful.

Clark [76] analyzes dialogue as a kind of joint action. We found this model intuitive. We used it to determine when we need an explicit positive acknowledgement and when that acknowledgment can be left implicit [75], which has helped to make the dialogue sound more natural. We also made a number of changes designed to help the users feel in control and to help them figure out what the system was doing. We made it easy to exit the system at any point; the result was that most users started to use our buttons to quit the system, which allowed us to gather some really useful information and close some files. We made the screen manager highlight the window where the system wants the user to enter information at the current moment. Similarly, when it is time for the student to make predictions, we highlighted the column where those predictions belong. As soon as the user finishes entering a full column of predictions, the system colors the boxes with errors red and draws diagonal lines across those boxes so the user can see immediately where the errors occur. As a result of many seemingly small changes like these, students can now use the system comfortably in any place at any time without anyone on hand to explain what is happening, or take a copy home with them.

8. The input understander 

The input understander is responsible for parsing the student's natural language input and producing a logic form representation, which it returns to the planner. Since the input understander must be able to cope with any input, however ill formed, it needs to be very robust. In the context of the question “How does SV change?” the responses “SV increased,” “increased,” “in”, “I,” “SV +,” “Stroke Volume rose” and “it went up” produce identical logic forms. This is only possible because the planner passes the context, as a logic form version of the question, to the Input Understander along with the student input. If the system has just asked “What are the determinants of SV?” then the planner will pass the logic form used to generate that question “(question (determined-by SV?))” to the understander. Suppose the student's answer is “The determinants of SV are CVP and CO,” or more frequently just “CVP and CO.” The input understander parses this input, and returns the logic form “(answer (determined-by SV (CVP, CO))).”
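The way the question context lets very different surface forms collapse into one logic form can be sketched as follows. This toy function is not the cascade of finite-state machines used by the system: the word lists, the function name, and the shape of the returned tuple are all hypothetical, and only increases are handled.

```python
INCREASE_WORDS = {"increased", "increase", "increases", "rose", "up", "+"}
# Very short abbreviations are trusted only when they are the whole answer.
INCREASE_ABBREVIATIONS = {"in", "i"}


def understand_change(variable, text):
    """Map a free-text answer to 'How does <variable> change?' onto one logic form."""
    tokens = [tok.strip(",.!?") for tok in text.lower().split()]
    tokens = [tok for tok in tokens if tok]
    whole_answer_abbrev = len(tokens) == 1 and tokens[0] in INCREASE_ABBREVIATIONS
    if whole_answer_abbrev or any(tok in INCREASE_WORDS for tok in tokens):
        # The variable comes from the question context, not from the answer text.
        return ("answer", ("change", variable, "+"))
    return None  # not understood; the planner would ask the student to try again


answers = ["SV increased", "increased", "in", "I", "SV +", "Stroke Volume rose", "it went up"]
forms = {understand_change("SV", a) for a in answers}
```

Because the variable is supplied by the planner, a bare "increased" or "+" is enough to produce a complete logic form.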

With help from the student modeler and the problem solver, the input understander identifies the answer as belonging to one of eight categories: correct, partially correct, near miss, “I don’t know,” “grain of truth” [cf. 22], misconception, totally incorrect, or a combination of the others [77], [78]. These categories are then used by the planner to define the response strategy.

Spelling correction was a central issue from the beginning of the development of the system. Typographical errors and spelling errors were sprinkled liberally through the student input. Since the tutoring domain is restricted, the intended interpretation is usually obvious to a human. For example, “gose down” can only mean “goes down” in our context, not “goose down.” The usual word-processor solution, providing the user with a number of alternatives to choose from, was not appropriate here. Students do not want to be bothered with alternative spellings when they are in the middle of solving a tough problem. It is essential for the system to determine a correction and respond immediately. As a result, our spelling correction routine [79], [80] attempts to match every unknown word to the nearest word in the lexicon. One of our goals in implementing natural language dialogue was to ensure that the students learned to use the sublanguage of physiology correctly, so we included a large number of official medical abbreviations in the lexicon before we began. But we had not expected the students to invent their own abbreviations all the time, as they do. They start typing and, when they think that the tutor ought to be able to understand what they mean, they stop typing.

The spelling correction routine looks for phrases first, as there are many phrases in our lexicon. Since students frequently abbreviate by shortening the word, the edit-distance algorithm that compares unknown words to the nearest lexicon entries heavily discounts letters toward the ends of words. The lexicon contains only about 4000 words, most collected from our human tutorial dialogues, plus a few manual additions. We have deliberately restricted the size to improve the spelling correction accuracy.
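
The idea of discounting letters toward the ends of words can be sketched as a position-weighted edit distance. The version below is only an illustration under stated assumptions: the linear weighting, the `floor` parameter, and the names `weighted_distance` and `nearest_entry` are hypothetical, the phrase lookup that the routine performs first is omitted, and the algorithm described in [79], [80] may differ.

```python
from typing import Iterable


def weighted_distance(typed: str, entry: str, floor: float = 0.2) -> float:
    """Edit distance from `typed` to a lexicon `entry`, cheaper near the word's end.

    Assumed weighting: an edit aligned with the first letter of `entry` costs
    1.0, falling linearly to `floor` at its last letter.
    """
    n, m = len(typed), len(entry)

    def weight(j: int) -> float:
        if m <= 1:
            return 1.0
        j = min(j, m - 1)
        return 1.0 - (1.0 - floor) * j / (m - 1)

    # dist[i][j] is the cost of turning typed[:i] into entry[:j].
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + weight(j - 1)
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + weight(0)
        for j in range(1, m + 1):
            substitute = 0.0 if typed[i - 1] == entry[j - 1] else weight(j - 1)
            dist[i][j] = min(
                dist[i - 1][j - 1] + substitute,  # keep or replace a letter
                dist[i - 1][j] + weight(j),       # drop an extra typed letter
                dist[i][j - 1] + weight(j - 1),   # supply a missing letter
            )
    return dist[n][m]


def nearest_entry(typed: str, lexicon: Iterable[str]) -> str:
    """Return the lexicon entry with the smallest weighted distance."""
    word = typed.lower()
    return min(lexicon, key=lambda entry: weighted_distance(word, entry))


if __name__ == "__main__":
    print(nearest_entry("incr", ["increase", "inverse", "decrease"]))
```

With this weighting, a word cut short by the student stays close to its full form, while an error in the first letters counts almost fully against the match.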

The original input understander contained a bottom-up chart parser [81]. This parser has now been replaced by a much more robust information extraction style parser [82], [83] that uses a cascade of small finite state transducers [84]. Each finite state machine is designed to handle a different input phenomenon. One of these machines deals with forms of the verb “to be” and distinguishes the parameter abbreviation “IS” (for inotropic state) from the copula. Another machine recognizes the physiological relationship types (direct or inverse—often called “indirect” by the students). This degree of specialization makes it possible for the input understander to recognize that “I” and “in” are abbreviations for “inverse” when the student is responding to a question about the type of relationship, while they signal “increase” when the student has been asked for a parameter change. Case frames [85] are used to distinguish word senses. In order to respond intelligently both to “CO increases” and to “CO increases CVP,” the system must recognize that in the first sentence it is CO that changes but in the second it is CVP that changes.
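
The context dependence described above can be illustrated with a table lookup keyed by the expected answer type. This is a deliberately simplified stand-in for the cascade of finite state transducers, not a reproduction of it; the `VOCABULARY` table, the function name `interpret`, and the entries marked as assumed are hypothetical.

```python
from typing import Optional

# One small vocabulary per expected answer type. Entries marked (assumed)
# are not taken from the text and only round out the example.
VOCABULARY = {
    "change": {
        "i": "increase", "in": "increase", "increased": "increase",
        "+": "increase", "rose": "increase", "up": "increase",
        "decreased": "decrease", "-": "decrease", "down": "decrease",  # (assumed)
    },
    "relationship": {
        "i": "inverse", "in": "inverse", "inverse": "inverse",
        "indirect": "inverse",  # the students' usual word for "inverse"
        "direct": "direct",
    },
}


def interpret(student_input: str, expected: str) -> Optional[str]:
    """Return the first recognized value for the expected answer type, if any."""
    table = VOCABULARY[expected]
    for raw in student_input.lower().split():
        token = raw.strip(".,!?")
        if token in table:
            return table[token]
    return None


if __name__ == "__main__":
    print(interpret("I", "change"), interpret("I", "relationship"))
```

The same token “I” is read as an increase after a question about a parameter change and as an inverse relationship after a question about the relationship type.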

There are a number of ways to refer to the action of the nervous system. The student may mention the sympathetic or the parasympathetic system, for example, in response to mechanism questions, when the system is expecting the student to type “nervous system” or “CNS.” A small ontology [82], illustrated in Fig. 6, is used by the mechanism finite state machine to recognize that the student is talking about a neural mechanism. One advantage of this approach is that it is easy to update. This approach keeps the system from labeling correct answers as wrong, which the students understandably find very frustrating.
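
A small ontology of this kind can be sketched as a set of child-to-parent links that the recognizer follows upward. The links below are hypothetical and chosen only from terms mentioned in the text; the actual ontology in Fig. 6 may be organized differently, and the names `PARENT` and `is_neural` are assumptions.

```python
from typing import Optional

# Hypothetical child -> parent links; the ontology in Fig. 6 may differ.
PARENT = {
    "sympathetic": "sympathetic system",
    "parasympathetic": "parasympathetic system",
    "sympathetic system": "nervous system",
    "parasympathetic system": "nervous system",
    "cns": "nervous system",
    "nervous system": "neural mechanism",
}


def is_neural(term: str) -> bool:
    """Follow parent links upward and report whether they reach 'neural mechanism'."""
    node: Optional[str] = term.lower().strip()
    seen = set()
    while node is not None and node not in seen:
        if node == "neural mechanism":
            return True
        seen.add(node)
        node = PARENT.get(node)
    return False


if __name__ == "__main__":
    print(is_neural("parasympathetic system"))
```

Updating the recognizer then amounts to adding one more link to the table, which is the ease of maintenance noted above.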

We made another improvement in the input understander that turned out to make the system much more usable. The first version of the input understander responded to any input that it did not recognize with the message “I’m sorry. I don’t understand you. Please rephrase.” Now the first thing that the input understander does is figure out whether it is expecting a parameter change (+, −, or 0), a parameter name, a parameter mechanism (neural or hemodynamic), the name of a stage (DR, RR, or SS), a relationship type, or some other kind of input. As a result it can respond to a student input it does not understand with a message about what type of answer is expected.
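
A minimal sketch of this fallback behavior follows. The answer types come from the paragraph above, but the wording of the messages, the `EXPECTED_ANSWERS` table, and the function name `fallback_message` are assumptions for illustration only.

```python
# Expected answer types from the text; the hint wording is hypothetical.
EXPECTED_ANSWERS = {
    "parameter change": "a change: +, -, or 0",
    "parameter name": "the name of a parameter",
    "mechanism": "a mechanism: neural or hemodynamic",
    "stage": "a stage: DR, RR, or SS",
    "relationship": "a relationship type: direct or inverse",
}

GENERIC_MESSAGE = "I'm sorry. I don't understand you. Please rephrase."


def fallback_message(expected: str) -> str:
    """Message shown when the input cannot be understood in the current context."""
    hint = EXPECTED_ANSWERS.get(expected)
    if hint is None:
        return GENERIC_MESSAGE
    return f"I'm sorry. I don't understand you. I was expecting {hint}."


if __name__ == "__main__":
    print(fallback_message("stage"))
```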

The recent addition of open questions has been successful in getting the student to give us longer, more complex input with some complete sentences. At the moment the system does not parse these answers, but we are happy to see that much of the new input is within the range of the current input understander. We are experimenting with two alternative approaches to handling the rest. The first involves adding more finite state machines. The second involves developing a new Head-Driven Phrase Structure Grammar (HPSG) parser [85], which would provide us with a more principled approach to combining the semantic information from the question context and the case frames with the grammar. The HPSG approach is more work but it would also be more portable to other tutors that we would like to build.

Most of the time, the student is trying to answer a question from the tutor, but occasionally the student has another plan in mind, for instance, to ask a question or to present a theory and ask for confirmation. We carried out a detailed study [86] of such student initiatives and the responses that expert human tutors make to them, but so far we have only implemented one simple variety. The biggest problem here seems to be distinguishing a student who is trying to answer the question in some unexpected fashion from a student who intends to take the initiative. If the student types “What is SV?” or “I don’t understand about SV,” or “I am confused about SV,” then the input understander returns the logic form “(question (explain (SV))).” When this happens, the planner examines the logic form and recognizes that this is a student initiative, so it puts a new goal at the top of the stack to satisfy the student request. It calls the text generator with the question logic form as its argument, and the text generator calls the problem solver to get the definition of SV from the knowledge base. The text generator puts the definition in a sentence and then returns that sentence to the planner, which asks the screen manager to display it.
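
The handling of this one initiative type can be sketched with an explicit goal stack. Everything named here is hypothetical: the `Planner` class, the tuple encoding of logic forms, and the `definitions` dictionary that stands in for the problem solver and its knowledge base. The text generator is collapsed into a single formatted string.

```python
from typing import Dict, List, Optional, Tuple

Goal = Tuple[str, str]        # e.g. ("explain", "SV")
LogicForm = Tuple[str, Goal]  # e.g. ("question", ("explain", "SV"))


class Planner:
    def __init__(self, goals: List[Goal], definitions: Dict[str, str]) -> None:
        self.stack = list(goals)        # the last element is the top of the stack
        self.definitions = definitions  # stands in for the knowledge base

    def handle(self, logic_form: LogicForm) -> Optional[str]:
        """Treat a question logic form as a student initiative."""
        kind, goal = logic_form
        if kind != "question":
            return None                 # ordinary answers are handled elsewhere
        self.stack.append(goal)         # new goal goes on top of the stack
        return self.satisfy_top_goal()

    def satisfy_top_goal(self) -> str:
        """Pop the top goal and produce the sentence to display."""
        action, topic = self.stack.pop()
        if action != "explain":
            raise ValueError(f"unsupported goal: {action}")
        return f"{topic} is {self.definitions[topic]}."


if __name__ == "__main__":
    planner = Planner([("tutor", "SV")], {"SV": "the volume of blood ejected per beat"})
    print(planner.handle(("question", ("explain", "SV"))))
```

Once the request has been satisfied, the stack is back to the goals the tutor was pursuing before the interruption, so the tutoring plan resumes where it left off.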

9. The student modeler 

The student model is a data structure that represents the tutor's estimate of the student's current state of knowledge: what the student knows, what the student does not know, and what misconceptions he or she may have [87]. This model is intended to be used by the tutor in individualizing the instruction given. There are two major approaches to student modeling. One approach, the overlay model [17], is designed to represent the student's knowledge state as a subset of an expert's knowledge state. Another approach, the buggy model [88], represents the student's misconceptions not as subsets of the expert's knowledge, but as variants of the expert's knowledge.

Our original student model integrated overlay and buggy strategies in a simple way [89]. For each prediction table parameter and each relationship in the concept map, it kept a record of the student's performance in making predictions and in answering questions about that entity. Then, for a few of the most common misconceptions, it also recorded whether the student had shown any evidence of that misconception. This approach made it possible to estimate, for any particular topic, the probability that the student has some knowledge about it, but it did not provide any overall picture of how the student is doing.

We carried out an extensive analysis of both misconceptions and hints in human tutoring sessions [45]. In this process we interviewed an expert while a tutoring session was in progress and persuaded him to talk about his personal student model, a combination of local and global assessments, both rough estimates of how the student is doing. In this interview we also realized that the implementation team had failed to recognize the extent of the hinting strategies employed by expert tutors. Over the next 2 years we carried out a detailed study of the hinting process and developed an extensive categorization of hints. We also built a table of common student misconceptions and the error patterns that may betray those misconceptions.

We expanded an early version of an overlay model into four levels of student modeling, one to provide the necessary information for each level of planning [77], [78]. The different levels of the model calculate an assessment of the student's performance on the variable currently being tutored, the performance on the current stage, the performance on the current procedure, and a global assessment of the overall performance. Each assessment is calculated by combining counts from the level below.
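
One way to picture the four levels is as counts of correct and total responses that roll up from variable to stage to procedure to the global level. The sketch below makes several assumptions: the assessment is taken to be a simple fraction correct, and the class name `StudentModel`, its methods, and the key layout are hypothetical rather than the scheme described in [77], [78].

```python
from collections import defaultdict
from typing import Optional


class StudentModel:
    def __init__(self) -> None:
        # counts[(procedure, stage, variable)] = [correct, total]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, procedure: str, stage: str, variable: str, correct: bool) -> None:
        """Store one student response at the lowest (variable) level."""
        entry = self.counts[(procedure, stage, variable)]
        entry[0] += int(correct)
        entry[1] += 1

    def assess(self, *prefix: str) -> Optional[float]:
        """Fraction correct for a level, found by combining the counts below it.

        assess(p, s, v) is the variable level, assess(p, s) the stage level,
        assess(p) the procedure level, and assess() the global level.
        """
        right = total = 0
        for key, (correct, seen) in self.counts.items():
            if key[:len(prefix)] == prefix:
                right += correct
                total += seen
        return right / total if total else None


if __name__ == "__main__":
    model = StudentModel()
    model.record("p1", "DR", "SV", True)
    model.record("p1", "DR", "SV", False)
    print(model.assess("p1", "DR", "SV"), model.assess())
```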

We used the new model to implement our research on hinting [78]. The information from the lowest level is used to decide whether to hint or not. The information from the two levels above that is used to estimate what the student knows and what kind of hint to give. We also replaced the buggy model with a list of misconceptions, so that the system can now diagnose misconceptions, ask further probing questions, and generate a remedial dialogue using one of our stored plans. The global assessment is used for keeping a record of misconceptions and for curriculum planning.

10. Text generator 

The text generator is responsible for turning each output of the discourse planner into a natural language sentence. It receives the necessary information from the planner as a logic form, a sentence at a time, and generates a natural language sentence to express the content of that logic form [73], [90], [91]. This information includes the current topic and a text style such as question, hint, or answer. For example, when the text generator is given a logic form from the planner such as “(question (affected-by SV?)),” it produces the English sentence “What are the determinants of SV?” and returns it to the planner, which displays the sentence to the student in the dialogue window.
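
The interface between planner and text generator, a logic form in and a sentence out, can be illustrated with a template lookup. This swaps in a much simpler technique than the lexical functional grammar the system actually uses, so it shows only the input and output of the component. The `TEMPLATES` table, the tuple encoding, the function name `generate`, and the “change” predicate are assumptions.

```python
from typing import Tuple

LogicForm = Tuple[str, Tuple[str, str]]  # (style, (predicate, topic))

# Template stand-in for the grammar-based generator (assumption).
TEMPLATES = {
    ("question", "affected-by"): "What are the determinants of {topic}?",
    ("question", "change"): "How does {topic} change?",  # predicate name assumed
}


def generate(logic_form: LogicForm) -> str:
    """Turn one logic form into one sentence."""
    style, (predicate, topic) = logic_form
    template = TEMPLATES.get((style, predicate))
    if template is None:
        raise KeyError(f"no template for {style} {predicate}")
    return template.format(topic=topic)


if __name__ == "__main__":
    print(generate(("question", ("affected-by", "SV"))))
```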

Before we started this project we chose to use a lexical functional grammar approach [92], because its models of syntax and semantics fit so well with what we know about psycholinguistics. The text generator converts the logic form into the functional structure (Fig. 7) that lexical functional grammar requires and then converts that functional structure into a constituent structure like that in Fig. 8.

When we started our project there were good grammars available for declarative sentences, even very complex ones, but not much available for simpler sentences that are very common in tutoring. We found that we needed to add rules for questions (“What are the determinants of SV?”) and imperatives (“Please predict the change in SV.” or “Remember that IS is a neural variable.”). With great generosity, Kaplan [93] arranged for Xerox PARC to give us a copy of his excellent tool, the Grammar Writer's Workbench, which we used extensively in our work.

The choice of appropriate ways to give the student feedback turned out to be quite complex. When Michael and Rovick switched from face-to-face tutoring to keyboard-to-keyboard tutoring, in order to provide keyboard-based language examples for our project, they were very conscious of the reduction in bandwidth. In response Rovick started to produce very enthusiastic positive acknowledgments like “super” and “marvelous.” To his surprise the students resented his efforts; they felt he was being patronizing. Rovick experimented with a CAI system under development at the time and discovered that students appreciated an enthusiastic comment when they finally reached a solution after struggling with a hard problem, but not at other times. Lepper et al. [94] describe elementary school students reacting negatively to compliments phrased too broadly. We have found Lepper's work very helpful in crafting appropriate ways to improve student affect.

11. Conclusion 

This paper describes the design and implementation of an intelligent tutoring system, CIRCSIM–Tutor, which carries out a natural language dialogue with first-year medical students, designed to help them learn to solve problems involving the baroreceptor reflex using qualitative causal reasoning. Our experiments show that student use of CIRCSIM–Tutor produces significant learning gains. The experiment reported here shows that medical students who use the system learn more about solving problems via qualitative reasoning than students who read a carefully edited text for an hour. What is more, survey results show that students like the system. We can conclude that CIRCSIM–Tutor works and that it works better than reading text, even for highly intelligent and highly motivated students.

11.1. Future research 

We want to improve the linguistic competence of our system by expanding the capabilities of the input understander and adding turn planning to the dialogue generation process. We also want to add to its planning domains at the top level, to carry out curriculum planning and protocol switching.

The input understander needs to be expanded to handle a wider range of student initiatives, expressions of frustration, and answers to open questions [67], [86]. At this point the system can only understand a few simple student initiatives, but human tutoring dialogues contain many different kinds of initiatives. The problem of understanding and responding to student affect and student questions is a major area of research. So far we have carried out an analysis of the input [67] and also have done a study of student initiatives evoked by human tutors [86] including a detailed analysis of the tutor responses.

As students use the system, more examples appear in which the approach of generating output a sentence at a time produces awkward-sounding text. We clearly need to plan the output a turn at a time [25], [27]. This would allow us to insert appropriate discourse markers [95], softeners, and anaphora, and make better lexical choices as well. Yang has written a turn planner but we are still building the system to support it [96], [97] and so we have not yet been able to test it with students.

Planning needs to be extended in two other new ways: curriculum planning and protocol switching. We have developed 40 or more problems for students to choose from. A curriculum of this size requires curriculum planning [98]. We also discovered that though the expert tutors start out with the standard protocol implemented in the current version of the system, they switch protocols to one that gives more immediate feedback when the student gets in trouble [99]. The ability to switch protocols will not only improve the tutor, it will allow us to carry out educational experiments as well, to study the advantages of different protocols.

Acknowledgments 

This research was partially supported by the Cognitive Science program of the Office of Naval Research under Grant Nos. N00014-94-1-0338 and N00014-02-1-0442 to Illinois Institute of Technology and under Grant N00014-00-1-0660 to Stanford University. The content does not reflect the position or policy of the government and no official endorsement should be inferred. This officially worded acknowledgment does not begin to express what our field owes to ONR. This research, like much other basic research in natural language dialogue and in tutoring systems, would not exist without the support provided by the Cognitive Science Program of the Office of Naval Research under the leadership of Dr. Susan Chipman. From Carbonell's first efforts in 1970 to today's experimental work on spoken dialogue, ONR has led the way in making the dream of effective natural language communication with computers a reality.

References 

  1. Rovick AA, Brenner L. HEARTSIM: a cardiovascular simulation with didactic feedback. The Physiologist. 1983;26:236–239
  2. Rovick AA, Michael JA. CIRCSIM: An IBM PC computer teaching exercise on blood pressure regulation, paper given at XXX IUPS Congress. Vancouver: IUPS; 1986;
  3. Hunink M, Glasziou P, Siegel J, Weeks J, Pliskin J, Elstein A, et al. Decision making in health and medicine: integrating evidence and values. Cambridge: Cambridge University Press; 2001;
  4. Michael JA, Rovick AA, Evens MW, Kim N. A smart tutor based on a qualitative causal model. In: Proceedings of the AAAI spring symposium on knowledge-based environments for learning and teaching. Menlo Park, CA: AAAI; 1990;p. 112–117
  5. Forbus KD. Qualitative process theory. Artif Intell. 1984;24:85–168
  6. Forbus KD. Articulate software for science and engineering education. In:  Forbus KD,  Feltovich PJ editor. Smart machines in education. Cambridge: MIT Press; 2001;p. 349–375
  7. Kuipers B. Commonsense reasoning about causality: deriving behavior from structure. Artif Intell. 1984;24:169–203
  8. Anderson JR. The expert module. In:  Polson MC,  Richardson JJ editor. Foundations of intelligent tutoring systems. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988;p. 21–53
  9. Rovick AA, Michael JA. The predictions table: a tool for assessing students’ knowledge. Am J Physiol. 1992;263:S33–S36
  10. Kim N. CIRCSIM–Tutor: an intelligent tutoring system for circulatory physiology. Unpublished doctoral dissertation, Chicago, IL: Illinois Institute of Technology; 1989.
  11. Kim N, Evens MW, Michael JA, Rovick AA. CIRCSIM–TUTOR: an intelligent tutoring system for circulatory physiology. In:  Mauer H editors. Proceedings of the international conference on computer assisted learning. Berlin: Springer-Verlag; 1989;p. 254–266
  12. Evens MW, Michael JA. One-on-one tutoring by humans and computers. Mahwah, NJ: Lawrence Erlbaum Associates; 2005;
  13. Chi MTH, de Leeuw N, Chiu MH, LaVancher C. Eliciting self-explanations improves understanding. Cognit Sci. 1994;18:439–477
  14. Fox Tree JE. Listening in on monologues and dialogues. Discourse Process. 1999;27:35–53
  15. Fox B. The human tutorial dialogue project. Hillsdale, NJ: Lawrence Erlbaum Associates; 1993;
  16. Carbonell JR. AI in CAI: an artificial intelligence approach to computer-aided instruction. IEEE Trans Man–Machine Syst. 1970;MMS-11:190–202
  17. Carr B, Goldstein IP. Overlays: a theory of modeling for computer aided instruction. MIT AI Memo 406. Cambridge: AI Laboratory; 1977;
  18. Burton RR, Brown JS. Toward a natural language capability for computer-assisted instruction. In:  O’Neill H editors. Procedures for instructional systems development. New York: Academic Press; 1979;p. 272–313
  19. Collins AM, Stevens AL. Goals and strategies of inquiry teachers. In:  Glaser R editors. Advances in instructional psychology. vol. 2:Hillsdale, NJ: Lawrence Erlbaum Associates; 1982;p. 65–119
  20. Wilensky R, Arens Y, Chin D. Talking to Unix in English. Commun Assoc Comput Mach. 1984;27:574–593
  21. Wilensky R, Chin D, Luria M, Martin J, Mayfield J, Wu D. The Berkeley Unix Consultant project. Comput Linguistics. 1988;14:35–84
  22. Woolf BP. Context dependent planning in a machine tutor. Unpublished doctoral dissertation, University of Massachusetts, Amherst, MA, 1984.
  23. Woolf BP. 20 years in the trenches: what have we learned?. In:  Goettl B,  Halff H,  Redfield C,  Shute VJ editor. Proceedings of the intelligent tutoring systems, ITS. Berlin: Springer; 1988;p. 33–39
  24. Cawsey A. Explanation and interaction: the computer generation of explanatory dialogues. Cambridge: MIT Press; 1992;
  25. Freedman R, Rosé CP, Ringenberg M, VanLehn K. ITS tools for natural language dialogue: a domain-independent parser and planner. In:  Gauthier G,  Frasson C,  VanLehn K editor. Proceedings of the Intelligent Tutoring Systems, ITS-2000. Berlin: Springer; 2000;p. 433–442
  26. Rosé CP, Jordan P, Ringenberg M, Siler S, VanLehn K, Weinstein A. Interactive conceptual tutoring in Atlas-Andes. In:  Moore JD,  Redfield CL,  Johnson WL editor. Proceedings of the artificial intelligence in education. Amsterdam: IOS Press; 2001;p. 256–266
  27. Freedman R. Using a reactive planner as the basis for a dialogue agent. In:  Etheredge J,  Manaris B editor. Proceedings of the 13th Florida artificial intelligence research symposium. Menlo Park, CA: AAAI Press; 2000;p. 52–59
  28. Rosé CP, Bhembe D, Roque A, Siler S, Srivastava R, VanLehn K. A hybrid language understanding approach for robust selection of tutoring goals. In:  Cerri SA,  Gouardères G,  Paraguaçu F editor. Proceedings of the intelligent tutoring systems, ITS 2002. Berlin: Springer; 2002;p. 552–561
  29. Jordan P, Makatchev M, VanLehn K. Abductive theorem proving for analyzing student explanations. In:  Verdego F,  Kay J,  Pain H,  Aleven V editor. Proceedings of the artificial intelligence in education. Amsterdam: IOS Press; 2003;p. 73–80
  30. Jordan P, Makatchev M, VanLehn K. Combining competing language understanding approaches in an intelligent tutoring system. In:  Lester JC,  Vicari RM,  Paraguaçu F editor. Proceedings of the intelligent tutoring systems, ITS 2004. Berlin, Germany: Springer-Verlag; 2004;p. 346–357
  31. Graesser AC, Wiemer-Hastings K, Wiemer-Hastings P, Kreuz R, the Tutoring Research Group. AutoTutor: a simulation of a human tutor. J Cognit Syst Res. 1999;1:35–51
  32. VanLehn K, Jordan PW, Rosé CP, Bhembe D, Bottner M, Gaydos A, et al. The architecture of Why2-Atlas: A coach for qualitative physics essay writing. In:  Cerri SA,  Gouardères G,  Paraguaçu F editor. Proceedings of the intelligent tutoring systems, ITS 2002. Berlin: Springer; 2002;p. 158–167
  33. Graesser AC, Jackson GT, Mathews EC, Mitchell HH, Olney A, Ventura M, et al. Why/AutoTutor: a test of learning gains from a physics tutor with natural language dialog. In:  Alterman R,  Hirsh D editor. Proceedings of the 25th annual conference of the cognitive science society. Hillsdale, NJ: Lawrence Erlbaum Associates; 2003;p. 1–5
  34. VanLehn K, Bhembe D, Chi M, Lynch C, Schulze K, Shelby R, et al. Implicit vs. explicit learning of strategies in a non-procedural cognitive skill. In:  Lester JC,  Vicari RM,  Paraguaçu F editor. Intelligent tutoring systems, ITS 2004. Berlin: Springer; 2004;p. 521–530
  35. Lane HC, VanLehn K. Teaching the tacit knowledge of programming to novices with natural language tutoring, in: Fitzgerald S, Guzdial M, editors, special issue of Computer Science Education.
  36. Aleven V. Using background knowledge in case-based legal reasoning: A computational model and an intelligent learning environment. Artif Intell. 2003;150:183–237
  37. Ashley K, Desai R, Levine JM. Teaching case-based argumentation concepts using dialectic arguments vs. didactic explanations. In:  Cerri SA,  Gouardères G,  Paraguaçu F editor. Proceedings of the ITS 2002. Berlin: Springer; 2002;p. 585–595
  38. Di Eugenio B, Fossati D, Yu D, Haller S, Glass M. Aggregation improves learning: experiments in natural language generation for intelligent tutoring systems. In: Proceedings of the annual meeting of the association for computational linguistics. East Stroudsburg, PA: ACL; 2005;p. 50–57
  39. Moore JD, Foster ME, Lemon O, White M. Generating tailored, comparative descriptions in spoken dialogue. In:  Barr V,  Markov Z editor. Proceedings of the 17th international Florida artificial intelligence research society conference. Menlo Park, CA: AAAI Press; 2004;p. 917–922
  40. Moore JD, Porayska-Pomsta K, Varges S, Zinn C. Generating tutorial feedback with affect. In:  Barr V,  Markov Z editor. Proceedings of the 17th international Florida artificial intelligence research society conference. Menlo Park, CA: AAAI Press; 2004;p. 923–928
  41. Rosé CP, Torrey C, Aleven V, Robinson A, Wu C, Forbus K. CycleTalk: Toward a dialogue agent that guides design with an articulate simulator. In:  Lester JC,  Vicari RM,  Paraguaçu F editor. Proceedings of the intelligent tutoring systems, ITS 2004. Berlin: Springer; 2004;p. 401–411
  42. Litman DJ, Rosé CP, Forbes-Riley K, VanLehn K, Bhembe D, Silliman S. Spoken versus typed human and computer dialogue tutoring. In:  Lester JC,  Vicari RM,  Paraguaçu F editor. Proceedings of the intelligent tutoring systems (ITS 2004). Berlin: Springer; 2004;p. 368–379
  43. Bratt EO, Clark B, Thomsen-Gray Z, Peters S, Treeratpituk P, Pon-Barry H, et al. Model-based reasoning for tutorial dialogue in shipboard damage control. In:  Cerri SA,  Gouardères G,  Paraguaçu F editor. Proceedings of the ITS 2002. Berlin: Springer; 2002;p. 63–69
  44. Khuwaja RA, Rovick AA, Michael JA, Evens MW. Knowledge representation for an intelligent tutoring system based on a multilevel causal model. In:  Frasson C,  Gauthier G,  McCalla GI editor. Proceedings of the intelligent tutoring systems (ITS 1992). Berlin: Springer; 1992;p. 217–224
  45. Hume G, Michael JA, Rovick AA, Evens M. Hinting as a tactic in one-on-one tutoring. J Learn Sci. 1996;5:23–47
  46. Michael JA, Rovick AA, Evens MW, Shim L, Woo CW, Kim N. The uses of multiple student inputs in modeling and lesson planning in CAI and ICAI programs. In:  Tomek I editors. Proceedings of the international conference on computer assisted learning. Berlin: Springer; 1992;p. 441–452
  47. Elsom-Cook M. Using multiple teaching strategies in an ITS. In:  Goettl B,  Halff H,  Redfield C,  Shute VJ editor. Proceedings of the intelligent tutoring systems (ITS-88). Berlin: Springer; 1988;p. 286–290
  48. Zhang Y. Knowledge-based discourse generation for an intelligent tutoring system. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, IL, 1991.
  49. Zhou Y, Freedman R, Glass M, Michael JA, Rovick AA, Evens MW. Delivering hints in a dialogue-based intelligent tutoring system. In: Proceedings of the 17th national conference on artificial intelligence. Menlo Park, CA: AAAI Press; 1999;p. 128–134
  50. Michael JA, Rovick AA, Glass MS, Zhou Y, Evens MW. Learning from a computer tutor with natural language capabilities. Interact Learn Environ. 2003;11:233–262
  51. Heller LJ, Mohrman D. Cardiovascular physiology. Stamford, CT: McGraw-Hill/Appleton and Lange; 1981;
  52. Kearsley G. Artificial intelligence and instruction: application and methods. Reading, MA: Addison-Wesley; 1987;
  53. Wenger E. Artificial intelligence and tutoring systems. Los Altos, CA: Morgan Kaufmann; 1987;
  54. Woo CW. Instructional planning in an intelligent tutoring system: combining global lesson plans with local discourse control. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, IL, 1991.
  55. Woo CW. A multi-level dynamic instructional planner for an intelligent tutoring system. ONR Technical Report, 1992.
  56. Norman DA. The design of everyday things. New York: Doubleday; 1990;
  57. Norman DA, Draper SW. User centered system design: new perspectives on human–computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates; 1986;
  58. Shneiderman B. Response time and display rate in human performance with computers. ACM Comput Surv. 1984;16:265–285
  59. Murray WR. Control for intelligent tutoring systems: a blackboard-based dynamic instructional planner. In:  Bierman D,  Breuker J,  Sandberg J editor. Proceedings of the fourth international conference on artificial intelligence in education, AI-ED 89. Amsterdam: IOS Press; 1989;p. 150–168
  60. Brecht B, McCalla G, Greer J, Jones M. Planning the content of instruction. In:  Biermann D,  Breuker J,  Sandberg J editor. Proceedings of the fourth international conference on AI and education. Amsterdam: IOS Press; 1989;p. 32–41
  61. Derry SJ, Hawkes LW, Ziegler U. A plan-based opportunistic architecture for intelligent tutoring. In:  Goettl B,  Halff H,  Redfield C,  Shute VJ editor. Proceedings of the intelligent tutoring systems (ITS-88). Berlin: Springer; 1988;p. 116–123
  62. Peachey DR, McCalla GI. Using planning techniques in intelligent tutoring systems. Int J Man-Mach Stud. 1986;24:77–88
  63. Russell DM. IDE: the interpreter. In:  Psotka J,  Massey LD,  Mutter SA editor. Intelligent tutoring systems: lessons learned. Hillsdale, NJ: Lawrence Erlbaum Publishers; 1988;p. 323–349
  64. Vassileva J. Dynamic course generation on the WWW. In:  du Boulay B,  Mizoguchi R editor. Proceedings of the eighth world conference of the artificial intelligence in education society. Amsterdam: IOS Press; 1997;p. 498–505
  65. Woo CW, Choi J, Evens MW. Web-based ITS for training system managers on the computer intrusion. In:  Cerri SA,  Gouardères G,  Paraguaçu F editor. Proceedings of the intelligent tutoring systems, ITS 2002. Berlin: Springer; 2002;p. 311–319
  66. Woo CW, Evens MW, Michael JA, Rovick AA. Dynamic instructional planning for an intelligent physiology tutoring system. In: Proceedings of the fourth IEEE symposium on computer-based medical systems. Piscataway, NJ: IEEE; 1991;p. 226–233
  67. Lee CH, Evens MW, Glass MS. Looking at the student input to a natural language-based tutoring system. In:  Heffernan N,  Wiemer-Hastings P editor. Proceedings of the ITS 2004 workshop on dialogue-based tutoring systems. Berlin: Springer; 2004;p. 15–22
  68. Sanders G. Generation of explanations and multi-turn discourse structures in tutorial dialogue based on transcript analysis. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, IL, 1995.
  69. Wielinga BJ, Breuker JA. Models of expertise. Int J Intell Syst. 1990;5:497–509
  70. Winkels R, Breuker J, Sandberg J. Didactic discourse in intelligent help systems. In:  Goettl B,  Halff H,  Redfield C,  Shute VJ editor. Proceedings of the intelligent tutoring systems, ITS-88. Berlin: Springer; 1988;p. 279–285
  71. Lee CH, Seu JH, Evens MW. Building an ontology for CIRCSIM–Tutor. In: Proceedings of the MAICS 2002. Chicago: IIT; 2002;p. 161–168
  72. Zhang Y, Evens MW, Michael JA, Rovick AA. Knowledge compiler for an expert physiology tutor. In: Proceedings of the ESD/SMI conference on expert systems. Dearborn, MI: ESD; 1987;p. 153–169
  73. Zhang Y, Evens MW, Michael JA, Rovick AA. Extending a knowledge base to support explanations. In: Proceedings of the third annual IEEE symposium on computer-based medical systems. Piscataway, NJ: IEEE; 1990;p. 259–266
  74. Clancey WJ. Intelligent tutoring systems: A tutorial survey. Report No. STAN-CS-87-1174, Stanford, CA, 1987.
  75. Brandle SS. Using joint actions to explain acknowledgments in tutorial discourse: application to intelligent tutoring systems. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, IL, 1998.
  76. Clark HH. Using language. Cambridge, UK: Cambridge University Press; 1996;
  77. Zhou Y, Evens M. Practical student model in an intelligent tutoring system. In: Proceedings of the 11th IEEE international conference on tools with artificial intelligence. Piscataway, NJ: IEEE Press; 1999;p. 13–18
  78. Zhou Y, Freedman R, Glass M, Michael JA, Rovick AA, Evens MW. What should the tutor do when the student cannot answer a question?. In:  Kumar A,  Russell I editor. Proceedings of the 12th international Florida AI research society conference (FLAIRS-99). Menlo Park, CA: AAAI Press; 1999;p. 187–191
  79. Elmi M, Evens M. Spelling correction using context. In: Proceedings of the 17th international conference on computational linguistics. East Stroudsburg, PA: ACL; 1998;p. 360–364
  80. Lee YH. Handling ill-formed natural language input for an intelligent tutoring system. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, 1990.
  81. Lee YH, Evens MW. Natural language interface for an expert system. Expert Syst Int J Knowledge Eng. 1998;15:233–239
  82. Glass MS. Broadening input understanding in a language-based intelligent tutoring system. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, 1999.
  83. Glass M. Processing language input in the CIRCSIM–Tutor intelligent tutoring system. In: Moore JD, Redfield CL, Johnson WL, editors. Proceedings of the artificial intelligence in education. Amsterdam: IOS Press; 2001;p. 210–221
  84. Roche E, Schabes Y. Finite state devices for natural language processing. Cambridge: MIT Press; 1997
  85. Lee CH, Evens MW. Using selectional restrictions to parse and interpret student answers in a cardiovascular tutoring system. In: Berkowitz E, editor. Proceedings of the midwest artificial intelligence and cognitive science conference, MAICS 2004. Schaumburg, IL: Roosevelt U; 2004;p. 63–67
  86. Shah F, Evens MW, Michael JA, Rovick AA. Classifying student initiatives and tutor responses in human keyboard-to-keyboard tutoring sessions. Discourse Process. 2002;33:23–52
  87. VanLehn K. Student modeling. In: Polson MC, Richardson JJ, editors. Foundations of intelligent tutoring systems. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988;p. 55–78
  88. Brown JS, Burton RR. Diagnostic models for procedural bugs in basic mathematical skills. Cognit Sci. 1978;2:155–192
  89. Shim L, Evens MW, Rovick AA, Michael JA. Effective cognitive modeling in an intelligent tutoring system for cardiovascular physiology. In: Proceedings of the fourth IEEE symposium on computer-based medical systems. Piscataway, NJ: IEEE; 1991;p. 338–345
  90. Chang RC. Surface level generation of tutorial dialogue using a specially developed lexical functional grammar and lexicon. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago, Illinois, 1992.
  91. Chang RC, Evens MW, Michael JA, Rovick AA. Surface generation in tutorial dialogues based on a sublanguage study. In: Chang YF, editor. Proceedings of the ICAST’94. Naperville, IL: CAPAMA; 1994;p. 113–119
  92. Kaplan RM, Bresnan J. Lexical-functional grammar: a formal system for grammatical representation. In: Bresnan J, editor. The mental representation of grammatical relations. Cambridge: MIT Press; 1982;p. 173–281
  93. Kaplan R. The formal architecture of lexical-functional grammar. J Informat Sci Eng. 1989;5:305–322
  94. Lepper MR, Woolverton M, Mumme DL, Gurtner JL. Motivational techniques of expert human tutors: lessons for the design of computer-based tutors. In: Lajoie S, Derry S, editors. Computers as cognitive tools. Hillsdale, NJ: Lawrence Erlbaum Associates; 1993;p. 75–105
  95. Kim JH, Glass MS, Freedman R, Evens MW. Learning the use of discourse markers in tutorial dialogue for an intelligent tutoring system. In: Gleitman LR, Joshi AK, editors. Proceedings of the cognitive science 2000. Mahwah, NJ: Lawrence Erlbaum Associates; 2000;p. 262–267
  96. Yang FJ, Kim JH, Glass MS, Evens MW. Lexical issues in the tutoring schemata of CIRCSIM–Tutor: analysis of variable references and discourse markers. In: Benedict M, editor. Proceedings of the human interfaces to complex systems. Urbana: Beckman Institute; 2000;p. 26–31
  97. Yang FJ, Kim JH, Glass MS, Evens MW. Turn planning in CIRCSIM–Tutor. In: Etheredge J, Manaris B, editors. Proceedings of the Florida artificial intelligence symposium. Menlo Park, CA: AAAI Press; 2000;p. 60–64
  98. Cho BI, Michael JA, Rovick AA, Evens MW. A curriculum planning model for an intelligent tutoring system. In: Kumar A, Russell I, editors. Proceedings of the 12th Florida artificial intelligence symposium (FLAIRS-99). Menlo Park, CA: AAAI Press; 1999;p. 197–201
  99. Cho BI, Michael JA, Rovick AA, Evens MW. An analysis of multiple tutoring protocols. In: Gauthier G, Frasson C, VanLehn K, editors. Proceedings of the ITS 2000. Berlin: Springer; 2000;p. 212–221

PII: S0933-3657(05)00109-0

doi:10.1016/j.artmed.2005.10.004
