Learning what to instruct: Acquiring knowledge from demonstrations and focussed experimentation
INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps.

Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

ProQuest Information and Learning
300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
800-521-0600

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

LEARNING WHAT TO INSTRUCT: ACQUIRING KNOWLEDGE FROM DEMONSTRATIONS AND FOCUSSED EXPERIMENTATION

by

Richard Harrington Angros, Jr.

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)

May 2000

Copyright 2000 Richard Harrington Angros, Jr.
UMI Number: 3054846

UMI Microform 3054846
Copyright 2002 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007

This dissertation, written by Richard Harrington Angros, Jr., under the direction of his Dissertation Committee and approved by all its members, has been presented to and accepted by The Graduate School in partial fulfillment of requirements for the degree of

DOCTOR OF PHILOSOPHY

Dean of Graduate Studies

Date: 2000

DISSERTATION COMMITTEE

Acknowledgements

Getting a PhD is a large undertaking and requires the help and support of many people. I would like to thank my advisor Lewis Johnson. Obviously, he provided a great deal of advice and guidance. He was careful to offer guidance in such a way that I had the freedom to make my own decisions. Having gone through this process, I am now much better able to make these decisions. I would like to thank Jeff Rickel, with whom I worked on the VET project. He was always available to offer comments and suggestions. I would like to thank my committee members for graciously providing time and effort. My committee consisted of Lewis Johnson, Jeff Rickel, Martin Frank, Allen Munro, Paul Rosenbloom and Skip Rizzo. They each provided insightful comments and suggestions. I also want to thank them for their patience and for reading two drafts of this document.
I would like to thank Jeff Rickel for the STEVE tutor. He was always willing to answer questions about it and to fix problems. I would also like to thank everyone else who worked on STEVE, especially Ben Moore and Marcus Thiebaux.

I would like to thank the people at BTL for the VIVIDS authoring tool. I would like to particularly thank Allen Munro and Quentin Pizzini. They not only answered questions, but they also made changes to VIVIDS in order to support my work. These changes included fixing software problems that only I experienced as well as the ability for an external program to save and restore the state of a VIVIDS model.

I would also like to thank the people at Lockheed Martin who worked on the VET project and on the VISTA Viewer, which managed the display of the simulated environment. Particularly, I would like to thank Randy Stiles and Laurie McCarthy.

A number of people helped me design the empirical evaluation. These people include Lewis Johnson, Jeff Rickel, Martin Frank, Paul Rosenbloom, Allen Munro, Skip Rizzo and Galen Buckwalter. I would especially like to thank Skip Rizzo and Galen Buckwalter because they work in a different part of the university and are not associated with the VET project.

When I first started attending USC, Seymour Ginsburg kindly gave of his time and offered me some very good advice.

I would like to thank the Information Sciences Institute's (ISI) Soar group. The Soar group was a place to explore new ideas and to hear a variety of perspectives. Some current and former members include Jonathan Gratch, Randy Hill, Gal Kaminka, Jihie Kim, Lewis Johnson, Jeff Rickel, Paul Rosenbloom, Ian Stobie, Ben Smith, Bonghan Cho and Karl Schwamb.

I would like to thank additional people here at ISI. Lorna Zorman helped me when I first started. I also want to thank Kate LaBore and Erin Shaw.
Additionally, I w ant to thank fellow grad students and officemates Ali Erdem and Chon Yi. I would like to thank Hughes Aircraft and Raytheon for supporting my efforts to get this degree. I wish to thank Hughes for a fellowship, and I want to thank some of my managers at Hughes: N atacha Estrada, Mike Stem ig, Dave Manz and Diana Chu. I would also like to thank the Office of Naval Research for funding the V ET project with grant N 00014-95-C 0179 and AASERT g ran t N00014-97-1-0598. Finally, I would like to th ank my family and parents. T hey’ve provided a great deal of support and encouragem ent. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Contents Acknowledgements ii List Of Tables xi List Of Figures xii Abstract xv 1 Introduction 1 1.1 M otivation ....................................................................................................................... 2 1.1.1 A Model of T each in g ........................................................................................ 2 1.1.2 W hat Activities Must a T utor P e r fo r m ? .................................................... 3 1.1.3 The Required K n o w le d g e ............................................................................... 4 1.1.4 Acquiring Knowledge from E x p e r ts ............................................................. 5 1.1.5 Heterogeneous Tutoring E n v iro n m e n ts...................................................... 5 1.1.5.1 Advantages of a Heterogeneous A r c h ite c tu r e .......................... 5 1.1.5.2 Lim itations of a Heterogeneous A r c h ite c tu r e .......................... 6 1.2 The P r o b le m ................................................................................................................... 6 1.3 Addressing the Knowledge Acquisition Bottleneck ........................................... 
7 1.4 The Basic A p p r o a c h ..................................................................................................... 8 1.5 Related W o rk ................................................................................................................... 9 l.G C o n trib u tio n s................................................................................................................... 10 1.7 Organization of the T h e s is ........................................................................................... 11 2 Using Diligent 12 2.1 Properties of a Simple P ro c e d u re ............................................................................... 12 2.2 W hat Does the A uthor Need to Know? .................................................................. 13 2.2.1 Concepts Needed for Basic U s e ..................................................................... 14 2.2.2 Concepts Needed for Advanced Use ........................................................... 15 2.3 Using Diligent to A uthor a P r o c e d u r e ................................................................... 18 2.3.1 The Procedure to be A u th o r e d ................................................................... 18 2.3.2 Authoring the Procedure .............................................................................. 19 2.3.2.1 C reating the P r o c e d u r e ................................................................... 19 2.3.2.2 Specifying the Initial S t a t e ............................................................ 19 2.3.2.3 D em onstrating the P ro c e d u re ........................................................ 20 iv Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2.3.‘ 2.4 C reating a Procedure from th e D e m o n s tra tio n ...................... 
23 2.3.2.5 T he Initial P r o c e d u r e ...........................................................................24 2.3.2.C Refining the Procedure with E x p e rim e n ts................................ 25 2.3.2.T Additional Authoring A c t iv i ti e s ......................................................28 2.4 S u m m a r y ............................................................................................................................ 28 3 Diligent — Its Problem and Approach 30 3.1 The P r o b le m ..................................................................................................................... 30 3.1.1 R e q u ire m en ts........................................................................................................ 30 3.1.1.1 Reducing the Instructor’s E f f o r t ......................................................31 3.1.2 C onstraints on the Domain K now ledge............................................................. 31 3.1.3 Interface with th e E n v iro n m e n t............................................................................ 33 3.2 Input and O utput .......................................................................................................... 37 3.2.1 Input ...................................................................................................................... 37 3.2.1.1 Definitions of Basic D ata T y p e s ..................................................... 38 3.2.2 O u tp u t .................................................................................................................. 40 3.2.2.1 P r o c e d u re s .......................................................................................... 41 3.2.2.2 O p e ra to rs ............................................................................................. 43 3.3 How Diligent W o r k s ....................................................................................................... 
46 3.3.1 Processing D e m o n s tra tio n s ..................................................................................46 3.3.2 H e u ris tic s............................................................................................................... 48 3.4 Where to Look for More In fo rm a tio n ........................................................................ 50 4 Processing D em onstrations 51 4.1 The A uthoring P ro c e ss................................................................................................... 51 4.2 Types of D em onstrations ............................................................................................ 52 4.3 D ata S tru c tu re s ................................................................................................................. 53 4.3.1 P r e f ix e s ................................................................................................................... 53 4.3.2 D em o n stratio n s..................................................................................................... 54 4.3.3 P a t h s ...........................................................................................................................55 4.3.4 Steps ...................................................................................................................... 55 4.3.5 Revisiting the Representation of P ro c e d u re s.................................................. 57 4.4 Assumptions about How the Instructor D e m o n s tr a te s ........................................ 57 4.5 About this C h ap ter’s Extended E x am p le ................................................................... 59 4.6 Authoring a New P ro c e d u re ........................................................................................ 59 4.6.1 C reating a New P ro c ed u re................................................................................ 
59 4.6.2 Setting Up the Initial S t a t e .................................................................................60 4.6.3 D em onstrating the P ro c e d u re ..............................................................................60 4.6.4 C reating Prim itive S te p s ........................................................................................60 4.6.5 Converting the D em onstration into a P a t h ..................................................... 64 4.6.6 A Second D em onstration ................................................................................ 66 4.6.6.1 Setting Up the D em onstration’s Initial S t a t e ......................... 67 4.6.6.2 Performing The D em o n stratio n ......................................................... 67 4.6.6.3 Processing the Dem onstration ......................................................... 68 4.6.7 G enerating a P l a n ............................................................................................. 69 4.6.7.1 Guessing the Procedure’s Goals ................................................. 70 v with permission of the copyright owner. Further reproduction prohibited without permission. 4.6.7.2 Deriving Step R elationships........................................................... 71 4.7 C reating a Hierarchical Procedure ........................................................................... 80 4.7.1 Internally Sim ulating A S u b p ro c e d u re ......................................................... 82 4.7.2 Continuing the Running Example ............................................................... 86 4.7.3 A Nested Procedure D e f in itio n ...................................................................... 87 4.7.4 Sensing A c tio n s ................................................................................................... 88 4.7.5 Dem onstrating the Nested P ro ced u re........................................................... 
90 4.8 The Completed P ro c e d u re ............................................................................................ 91 4.8.1 Information Provided by the I n s t r u c t o r .................................................... 94 4.8.1.1 G enerating default descriptions..................................................... 95 4.9 C o m p le x ity ......................................................................................................................... 95 4.10 Related W o rk ..................................................................................................................... 97 4.10.1 Natural Language Versus Direct M anipulation ...................................... 97 4.10.2 Program m ing By D e m o n s tra tio n ................................................................. 98 4.10.2.1 Procedure R ep resen tatio n ............................................................... 98 4.10.2.2 Basic T echniques............................................................................... 99 4.11 S u m m a r y ...............................................................................................................................100 5 L e arn in g O p e r a to r s 102 5.1 Additional R equirem ents.................................................................................................. 103 5.2 H e u ris tic s ...............................................................................................................................104 5.3 About this C hapter’s E x a m p le s ..................................................................................... 105 5.4 D ata S tru c tu re s.....................................................................................................................105 5.4.1 Preconditions as a Version S p a c e ................................................................... 
106 5.5 Creating a New O p e r a t o r ............................................................................................... 109 5.6 Positive and Negative E x am p les..................................................................................... 112 5.7 Refining P re c o n d itio n s...................................................................................................... 113 5.7.1 Refining Preconditions with Positive E x a m p le s ........................................... 114 5.7.2 Refining Preconditions with Negative E x a m p le s........................................... 116 5.7.2.1 D iscrim inating Between E ffects....................................................... 120 5.8 Putting it all T o g e th e r ...................................................................................................... 123 5.8.1 Determining How to Process E f f e c ts ................................................................124 5.8.2 Adding a New E ffe c t.............................................................................................. 126 5.8.3 Splitting an Effect in T w o ....................................................................................129 5.9 Complexity A n a ly s is ...........................................................................................................132 5.9.1 S calab ility ...................................................................................................................135 5.10 Related W o rk ........................................................................................................................ 
136 5.11 S u m m a r y ...............................................................................................................................137 6 E x p e rim e n tin g 139 6.1 The P r o b le m .......................................................................................................................140 6.1.1 R eq u irem en ts............................................................................................................140 6.2 B ack g ro u n d ............................................................................................................................141 6.2.1 Focused versus U n fo c u se d ....................................................................................142 6.2.2 Supervised versus U nsupervised.........................................................................142 6.2.3 Experimenting with P l a n s ................................................................................... 143 vi with permission of the copyright owner. Further reproduction prohibited without permission. 6.3 In p u t......................................................................................................................................144 6.4 Diligent’s Approach .............................................................................................145 6.5 The Procedure Being Used ..........................................................................................146 6.6 The A lg o r ith m ............................................................................................................... 147 6.6.1 W hat Was Learned From the E x p e rim e n t................................................151 6.7 Complexity A n a ly s is ...................................................................................................... 
153 6.7.1 S c a la b ility .............................................................................................................155 6.8 Related W o r k .................................................................................................................... 156 6.8.1 T he Self-Explanation E f f e c t............................................................................156 6.8.2 O ther S y s t e m s .................................................................................................... 156 6.9 S u m m a r y ...........................................................................................................................157 7 Empirical Evaluation 159 7.1 H y p o th e s e s ........................................................................................................................159 7.2 The Three Versions of Diligent ................................................................................ 160 7.3 Usability Analysis .........................................................................................................162 7.4 Experimental M ethod ..................................................................................................164 7.4.1 Independent Variable.... ..................................................................................... 164 7.4.2 Test S u b je c ts ...................................................................................................... 164 7.4.3 D ependent V a ria b le s.........................................................................................166 7.4.3.1 M easuring Errors in Plans ............................................................ 
167 7.4.4 Test P r o c e d u r e ...................................................................................................168 7.4.5 The Procedures Being A u t h o r e d ...................................................................171 7.4.6 D ata A nalysis...................................................................................................... 172 7.5 R esults................................................................................................................................ 173 7.5.1 Results of Background Q uestionnaire............................................................173 7.5.2 Tim e Spent T ra in in g ........................................................................................ 175 7.5.3 Logical E d i t s ...................................................................................................... 176 7.5.4 E r r o r s ....................................................................................................................179 7.5.4.1 Errors in Identifying S t e p s ............................................................179 7.5.4.2 Errors of O m issio n ............................................................................ 180 7.5.4.3 Errors o f C o m m issio n ......................................................................183 7.5.4.4 Total E r r o r s ...................................................................................... 185 7.5.5 Total Required Effort .....................................................................................189 7.5.6 Tim e Spent A u th o rin g .....................................................................................189 7.5.7 Subjective Im pressions.................................................................................... 
189 7.6 D iscussion.........................................................................................................................193 7.6.1 A ssum ptions About Test S u b j e c t s ...............................................................193 7.6.2 Discussion of Background Q u e stio n n a ire .....................................................194 7.6.3 Discussion of Training T im e ............................................................................ 195 7.6.4 Discussion of Logical Edits ............................................................................ 196 7.6.5 Discussion of Errors in Identifying S t e p s .....................................................196 7.6.6 Discussion of Errors of O m is s io n .................................................................. 197 7.6.7 Discussion of Errors of C om m ission...............................................................198 7.6.8 Discussion of Total E r r o r s ............................................................................... 198 vii Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 7.6.9 Discussion of Total Required E ffo rt................................................................. 199 7.6.10 Discussion of Tim e Spent A u th o rin g ..............................................................199 7.6.11 Discussion of Subjective Im pressions............................................................. ‘ 200 7.7 Reviewing the Claim s ........................................................................................................201 7.8 Observations..... ......................................................................................................................‘ 204 7.9 S u m m a r y .............................................................................................................................. 
‘ 204 8 A n aly sis a n d F u t u r e W o rk 206 8.1 Perspectives for U nderstanding D em onstrations.........................................................‘ 206 8.2 Assumptions ..................................................................................................................... ‘ 208 8.2.1 Easier to R e la x ......................................................................................................‘ 208 8.2.2 H arder to Relax .................................................................................................. ‘ 211 8.3 L im ita tio n s ...........................................................................................................................‘ 214 8.3.1 C oordinated Simultaneous A c tio n s ................................................................. 214 8.3.2 W hen P re-S tate and Post-State Values are In d e p e n d e n t.........................‘ 214 8.3.3 T ransitive D ep en d en cies.....................................................................................‘ 216 8.4 E xtensions.............................................................................................................................. 217 8.4.1 Procedural R epresentation...........................................................................‘ 217 8.4.1.1 M ultiple M ethods for Performing a P r o c e d u r e ...........................‘ 217 8.4.1.2 Conditional P l a n s ...............................................................................‘ 219 8.4.1.3 Disjunctive Goal C o n d itio n s ........................................................... ‘ 219 5.4.2 A u th o r in g ................................................................................................................‘ 2*20 8.4.2.1 Additional Types of D em onstrations............................................. 220 8.4.2.2 Continuous/Param eterized Actions ............................................. 
2*21 8.4.2.3 Types of M ental A ttrib u te s.............................................................. ‘ 221 8.4.2.4 Inferred A ttrib u te s...............................................................................‘ 222 8.4.3 L e a rn in g .................................................................................................................... ‘ 222 8.4.3.1 Simple E x te n sio n s...............................................................................22*2 8.4.3.*2 M ore Involved E xtensions................................................................. 223 8.4.4 E x p e rim e n ta tio n ...................................................................................................'224 8.4.4.1 Simple E x te n sio n s.............................................................................. ‘ 224 8.4.4.*2 M ore Involved E xtensions................................................................. '225 8.5 S u m m a r y ............................................................................................................................. ‘ 225 0 R e la te d W o rk 226 9.1 The Presentation of E x am p les.......................................................................................'226 9.1.1 Felicity C o n d itio n s ............................................................................................... ‘ 226 9.1.2 Presenting a Sequence of E xam ples................................................................. ‘ 229 9.2 Intelligent T utoring S y ste m s.......................................................................................... ‘ 232 9.2.1 C om puter Aided Instruction ........................................................................... 
‘ 232 9.2.2 W ho is th e A u th o r ...............................................................................................233 9.2.3 Approach to A u th o r in g .....................................................................................'234 9.2.4 Easier D a ta E n t r y ...............................................................................................‘ 235 9.3 Learning From D e m o n s tra tio n s................................................................................... ‘ 237 9.3.1 Program m ing By D e m o n stra tio n .................................................................... ‘ 237 9.3.2 Detailed Dom ain M odels.....................................................................................‘ 238 viii with permission of the copyright owner. Further reproduction prohibited without permission. 9.3.3 Procedure R ecognition.........................................................................................'239 9.3.4 University of Michigan Soar G r o u p .................................................................'239 9.3.5 Approach to E x p erim en tatio n ...........................................................................'240 9.3.6 Systems th a t Learn O perators ....................................................................... * 2 4 1 9.3.7 Other W o rk ............................................................................................................ 24'2 10 Conclusion 244 10.1 Sum m ary of the A p p ro a c h ..............................................................................................'244 10.2 C o n trib u tio n s......................................................................................................................'245 10.3 E valuation............................................................................................................................ 
........ 246
10.4 Future Work ........ 246
Reference List 248
Appendix A
Implementation ........ 261
A.1 Architecture ........ 261
A.2 Maintenance of Agenda ........ 262
A.3 Providing Feedback About Diligent's Beliefs ........ 264
Appendix B
Evaluation Materials ........ 266
B.1 Background Questionnaire ........ 266
B.2 Procedure Representation Description ........ 268
B.3 The Procedure Representation Worksheet ........ 274
B.4 Worksheet Answers ........ 276
B.5 The Post-Test ........ 277
B.6 The Directions Given Subjects ........ 279
B.7 The List of Attribute Values ........ 283
B.8 Labeled Pictures of the HPAC ........ 287
B.9 Procedure Descriptions ........ 292
B.9.1 High Condensate Level Shutdown ........ 292
B.9.2 Overload Relay Tripped ........ 293
B.10 Desired Procedures ........ 294
B.10.1 High Condensate Level Shutdown ........ 294
B.10.2 Overload Relay Tripped ........ 297
B.11 Practice Procedure ........ 299
B.12 Practice Procedure Solution ........ 300
Appendix C
Evaluation Data ........ 302
C.1 Background Questionnaire ........ 303
C.2 Impressions of Diligent ........ 304
C.2.1 Experimental Condition EC1 ........ 305
C.2.2 Experimental Condition EC2 ........ 306
C.2.3 Experimental Condition EC3 ........ 308
C.3 Authoring ........ 309
C.3.1 Experimental Condition EC1 ........ 311
C.3.2 Experimental Condition EC2 ........ 314
C.3.3 Experimental Condition EC3 ........ 317
C.4 Session Log ........ 320
Appendix D
How to Use Diligent ........ 329
D.1 Starting to Specify a Procedure ........ 330
D.2 Demonstrations ........ 331
D.2.1 Chapter Goals ........ 331
D.2.2 Setting the Initial Environment State ........ 331
D.2.3 Adding Steps ........ 332
D.2.4 Operator Descriptions
........ 335
D.2.5 Add More Steps ........ 335
D.2.6 End Demonstration ........ 335
D.2.7 Additional Demonstrations ........ 337
D.2.8 Choosing a Previous Step ........ 338
D.3 Adding Steps to a Procedure ........ 339
D.3.1 Chapter Goals ........ 339
D.3.2 Adding Steps ........ 339
D.3.3 Choosing a Previous Step ........ 340
D.3.4 Selecting an Action ........ 340
D.3.5 Operator Descriptions ........ 342
D.3.6 Selecting Operator Effects ........ 343
D.3.7 Adding Operator Effects ........ 345
D.3.8 Selecting Operator Effects Revisited ........ 350
D.3.9 Add a Couple More Steps ........ 350
D.4 Editing a Procedure ........ 351
D.4.1 Chapter Goals ........ 351
D.4.2 Review: Reaching the Procedure Modification Menu ........ 351
D.4.3 Procedure Graphs ........ 352
D.4.4 Looking at a Step ........ 354
D.4.5 Operator Effect Menu ........ 356
D.4.6 Precondition Window ........ 357
D.4.7 State Change Window ........ 357
D.4.8 Modifying Preconditions ........ 358
D.4.8.1 Using the Operator Effect menu ........ 358
D.4.8.2 Using the Step Prerequisites menu ........ 358
D.4.9 Updated Procedure Graph ........ 358
D.4.10 Updated Step Modification Menu ........ 361
D.4.11 Dependencies Menu ........ 363
D.4.12 Looking at the Causal Link Menu ........ 364

List Of Tables

3.1 Where Topics are Covered ........ 50
6.1 Changes to procl's Preconditions ........ 152
7.1 Distribution of Subjects Based on Sex and Language
........ 165
7.2 Activities Performed By Subjects ........ 169
7.3 Background ANOVA Tests ........ 174
7.4 Background Means and Standard Deviations ........ 174
7.5 Background ANOVA Tests ........ 175
7.6 Training Time Means and Standard Deviations ........ 176
7.7 Linear Regression on Total Training Time ........ 177
7.8 Logical Edit Analysis ........ 177
7.9 Means and Standard Deviations on Invalid Steps ........ 179
7.10 Errors of Omission Analysis ........ 180
7.11 Errors of Commission Analysis ........ 183
7.12 Total Error Analysis
........ 185
7.13 Total Required Effort Analysis ........ 187
7.14 Analysis of Time Spent Authoring ........ 190
7.15 Subjective Impressions ........ 192
7.16 Summary of Results ........ 203
A.1 Status Values Used by Diligent ........ 265
C.1 EC1 Impressions about Authoring ........ 305
C.2 EC1 Impressions about Demonstrations and Experiments ........ 305
C.3 EC2 Impressions about Authoring ........ 306
C.4 EC2 Impressions about Demonstrations and Experiments ........ 306
C.5 EC3 Impressions about Authoring ........ 308
C.6 EC1 Procedure 1 Authoring Information ........ 311
C.7 EC1 Procedure 2 Authoring Information ........ 312
C.8 EC1 Time Spent on Activities ........ 313
C.9 EC2 Procedure 1 Authoring Information ........ 314
C.10 EC2 Procedure 2 Authoring Information ........ 315
C.11 EC2 Time Spent on Activities
........ 316
C.12 EC3 Procedure 1 Authoring Information ........ 317
C.13 EC3 Procedure 2 Authoring Information ........ 318
C.14 EC3 Time Spent on Activities ........ 319

List Of Figures

2.1 Procedure Description Menu ........ 19
2.2 Resetting the Environment's State ........ 19
2.3 The Front of the HPAC ........ 21
2.4 Operator Description Menu ........ 23
2.5 Hypothesized Goal Conditions ........ 24
2.6 Hypothesized Goal Condition Details ........ 25
2.7 The Initial Dependencies ........ 26
2.8 The Dependencies After Experimentation ........ 27
3.1 Input/Output ........ 36
3.2 Syntax of Basic Data Types ........ 38
3.3 An Action-Example
........ 40
3.4 Example Plan procA ........ 42
3.5 Example Operator toggle-motor ........ 44
3.6 Processing a Demonstration ........ 47
4.1 First Demonstration's Action-Examples ........ 61
4.2 Creating a Primitive Step ........ 62
4.3 First Demonstration ........ 64
4.4 Initializing a Path ........ 65
4.5 The Initial Path ........ 65
4.6 Using a Prefix ........ 67
4.7 The Second Demonstration's Prefix ........ 68
4.8 The Second Demonstration ........ 68
4.9 Adding a Demonstration to a Path ........ 69
4.10 Updated Path ........ 69
4.11 Deriving Goals from a Path ........ 72
4.12 Goal Conditions Derived from Path
........ 72
4.13 Computing Step Relationships ........ 73
4.14 The Operators ........ 74
4.15 Identifying a Path's Effects ........ 76
4.16 Skeleton of Procedure ........ 77
4.17 Computation of Causal Links ........ 78
4.18 Causal Links ........ 79
4.19 Computation of Additional Ordering Constraints ........ 81
4.20 Ordering Constraints ........ 82
4.21 The Plan for Procedure procl ........ 83
4.22 Simulating a Subprocedure ........ 84
4.23 Results from Simulating Step procl-6 ........ 87
4.24 The Subprocedure's Prefix ........ 88
4.25 Computing State Changes Caused by Earlier Steps ........ 89
4.26 Subprocedure Demonstration ........ 92
4.27 The Plan for Subprocedure proc2
........ 93
4.28 The Top Level Procedure ........ 94
5.1 Preconditions for Starting a Car ........ 104
5.2 Relationship between the Precondition Concepts ........ 106
5.3 An Operator ........ 107
5.4 Algorithm for Creating New Operator ........ 110
5.5 Input for Creating New Operator ........ 111
5.6 A New Operator ........ 111
5.7 Some Positive and Negative Examples ........ 112
5.8 Refining Preconditions with a Positive Example ........ 114
5.9 Using a Positive Example ........ 115
5.10 Potentially Needed Conditions ........ 116
5.11 Refining Preconditions with Negative Example ........ 117
5.12 Using Negative Examples ........ 118
5.13 Discriminating Between Effects ........ 121
5.14 An Example of Discriminating Between Effects
........ 122
5.15 Refining an Operator with an Example ........ 125
5.16 An Example for Assigning Delta-State Conditions to Effects ........ 126
5.17 Creating a New Effect ........ 127
5.18 An Example of Creating a New Effect ........ 129
5.19 Splitting an Effect ........ 130
5.20 An Example of Creating a New Effect ........ 131
6.1 A Hierarchical Procedure ........ 147
6.2 The Top Level Experimentation Algorithm ........ 147
6.3 Generating Skip-Step Experiments ........ 148
6.4 Performing Experiments ........ 149
6.5 The Stack of Actions to Perform ........ 150
7.1 Graphs of Logical Edits ........ 178
7.2 Graphs of Errors of Omission ........ 181
7.3 Graphs of Errors of Commission ........ 184
7.4 Graphs of Total Errors
........ 186
7.5 Graphs of Total Required Effort ........ 188
7.6 Graphs of Time Spent Authoring ........ 191
8.1 An Attribute whose Post-State is Independent of its Pre-State ........ 215
8.2 Incompatible Paths ........ 218
A.1 The VET Software Architecture ........ 261
A.2 The STEVE Tutoring Agent ........ 263
B.1 Procedure with Steps in Specification Order ........ 269
B.2 Procedure with Steps Ordered by Dependencies ........ 270
B.3 Operator with Two Effects ........ 271
B.4 Example Steps
........ 272
B.5 Procedure Example2's Dependencies ........ 273
D.1 Main Learning Menu ........ 330
D.2 Main Learning Menu "Editing" Options ........ 330
D.3 Procedure Description Menu ........ 330
D.4 Simulation Configuration Menu ........ 331
D.5 Communications Bus Monitor Window ........ 332
D.6 Additional Environment Changes ........ 332
D.7 Demonstration Menu ........ 333
D.8 Environment before Demonstration ........ 333
D.9 Environment after Demonstration ........ 334
D.10 Operator Description Window ........ 335
D.11 Soar Processing an Action ........ 336
D.12 Demonstration Version of Procedure Modification Menu ........ 337
D.13 Demonstration Type Menu ........ 338
D.14 Previous Step Menu ........ 339
D.15 Manual Editor Version of Procedure Modification Menu
........ 340
D.16 Previous Step Menu ........ 341
D.17 Action Selection Menu ........ 341
D.18 Operator Description Window ........ 342
D.19 Effect Selection Menu Before Effects Defined ........ 342
D.20 Initial Operator Effect Menu ........ 344
D.21 Precondition Attribute List ........ 345
D.22 Attribute Value Input Window ........ 345
D.23 Precondition Value Window ........ 346
D.24 Updated Operator Effect Menu ........ 347
D.25 Updated Effect Selection Menu ........ 349
D.26 Main Learning Menu ........ 351
D.27 Procedure Graph from "Ordering relationships" ........ 352
D.28 Procedure Graph showing "execution order" ........ 353
D.29 Step Modification Menu ........ 354
D.30 Operator Effect Menu
........ 356
D.31 Precondition Window ........ 357
D.32 State Change Window ........ 357
D.33 Step Prerequisites Menu ........ 359
D.34 Incorrect Procedure Graph ........ 360
D.35 Step Modification Menu with Error ........ 361
D.36 Dependencies Menu ........ 363
D.37 Causal Link Menu ........ 364

Abstract

One way that people can learn procedural tasks (e.g. machine maintenance) is by performing them in a simulation of the domain. Simulation-based training not only allows repetition, but also allows exposure to situations that would be dangerous or expensive in the real world. However, for a simulation-based training system to teach students more effectively, the training system needs to know the procedures that students are learning. Unfortunately, it has been difficult to acquire this type of knowledge. This dissertation looks at using machine learning techniques to acquire procedures that can be used for tutoring human students. The work focuses on exploiting access to a simulation of the domain.
Because we assume that an automated tutoring program will already know general principles for teaching, we focus on learning "what" to teach rather than "how." The approach learns general-purpose operators and outputs knowledge of what to teach in the form of hierarchical partially ordered plans.

The approach is implemented by a system called Diligent. Procedures are specified by having human authors demonstrate them. To demonstrate, a human performs the procedure by directly manipulating a graphical representation of the simulation. This technique could be used by domain experts, who may not be programmers or expert knowledge engineers. However, there is a problem: a single demonstration may be insufficient for understanding the causal relationships between a procedure's steps, while requiring many demonstrations would take too much time and effort.

Diligent attempts to overcome this problem by using a novel approach for understanding demonstrations. Although Diligent may start with no domain knowledge, it can reduce the requisite number of demonstrations by performing autonomous experiments that exploit the domain knowledge embedded in the simulation. These experiments identify the causal dependencies between a demonstration's steps by removing a step from the demonstration and observing how this affects later steps. These experiments are augmented by heuristics that focus on the dependencies between steps.

We performed an evaluation on human subjects that tested the benefits of our approach. As dependent variables, we measured the amount of work performed and the number of errors. The results suggest that using demonstrations and experiments together is better than using just demonstrations. The results also suggest that demonstrating a procedure is better than using an editor to declaratively specify it.
The benefits of both demonstrations and experiments appeared to be greater on more complicated procedures.

Chapter 1

Introduction

A common activity that people engage in is performing procedures. Procedures include such things as programming a VCR or following a cooking recipe. Knowledge of how to perform procedures is called procedural knowledge. Besides the ability to follow fixed recipes, procedural knowledge also includes knowing how to adapt procedures to a given situation. For example, when repairing an engine, a part might break or an unrelated problem might be discovered.

For procedures that involve manipulation of the real world (e.g. machine maintenance), there are advantages to having human students learn procedures by performing them in a simulation of the domain. Not only can the students gain exposure to many training problems, but they can also experience a variety of unusual situations. Training with simulations is very useful when real-world training episodes are expensive or dangerous (e.g. machine maintenance or surgery).

However, not all types of training are equally useful. One effective method of training is tutoring. When compared to conventional classroom instruction, studies have shown that tutoring students one-on-one can improve their achievement by two standard deviations [Blo84, Ana83, Bur83]. Unfortunately, human tutors are expensive, and their availability is usually very limited. The limited supply of human tutors can be overcome by using a computer program as an automated tutor (e.g. STEVE [RJ99]). An automated tutor can free a human instructor for more specialized instruction by assuming many of his normal duties. In performing the instructor's normal duties, an automated tutor can use general-purpose knowledge about teaching.
However, general-purpose knowledge alone is insufficient because an automated tutor also needs domain-specific knowledge about the procedures being taught. Unfortunately, it has not been easy to acquire this type of knowledge. In fact, the general problem of acquiring domain knowledge from experts has been called the knowledge acquisition bottleneck [Hof87]. Fortunately, in the case of simulation-based training, considerable domain knowledge may already be contained in a simulation. By exploiting access to the domain knowledge embedded in the simulation, the acquisition of procedures can be made easier. This dissertation looks at how a software program (or agent) can use machine learning techniques and access to a simulation in order to learn the procedural knowledge necessary for an automated tutor to teach human students.

This chapter is roughly organized as follows. First, we will motivate our research: this involves discussing what we mean by teaching, what knowledge is needed, and the difficulties in acquiring this knowledge. Second, we will discuss an approach for learning procedures. We will then finish by discussing contributions and providing a high-level overview of related work.

1.1 Motivation

1.1.1 A Model of Teaching

In this thesis, we assume that an automated tutor will use an apprenticeship style of teaching [CBN89]. Apprenticeship is a traditional method for learning procedural knowledge and involves "learning-through-guided-experience" [CBN89]. During an apprenticeship, an apprentice (or student) learns by observing, copying and interacting with a master (or tutor). Learning a procedure typically has two phases:

1. A student observes a tutor demonstrate a procedure. Observing the demonstration allows the student to create an initial conceptual model of the procedure.

2.
The tutor then monitors the student as the student performs the procedure. While the student is performing the procedure, the tutor can provide help and reminders. At this stage, a potential problem is that the student may only be able to repeat the procedure by rote. To make sure that the student knows how to adapt the procedure to different situations, he may perform it several times from different initial states.

As a student gains proficiency in a domain, the procedures that the student performs tend to become more complex. Moreover, because the student is more capable, the tutor provides less help and direction.

1.1.2 What Activities Must a Tutor Perform?

To teach a procedure, an automated tutor must be able to perform a number of activities. The ability to use knowledge for multiple purposes is a characteristic of Intelligent Tutoring Systems (ITS) [Wen87]. The activities that we are interested in will be illustrated with the following example.

When an air compressor is running, the condensation of water can cause pressure to build up internally. When the pressure is too high, the compressor shuts down in order to avoid damage. To restart the compressor, a human operator does the following. He resets the compressor and opens a valve that allows the water to drain. He then restarts the motor. After the pressure is relieved, he shuts the valve.

A tutor needs to be able to demonstrate procedures to students. A demonstration involves performing a sequence of actions that will achieve the procedure's goals. Consider the above example. The tutor could demonstrate restarting the compressor by performing the following sequence of actions: open the valve, press the reset button, toggle the start button, and close the valve.

A tutor also needs to be able to monitor students as they perform a procedure.
A human student might legitimately perform the actions in a different order than any demonstration. Thus, the tutor needs to know more than the sequences of actions used in demonstrations; the tutor also needs the knowledge to recognize valid sequences of actions. For example, when the tutor demonstrated restarting the compressor, the tutor opened the valve before pressing the reset button. However, the student could have instead pressed the button before opening the valve. If the tutor didn't recognize that the two actions were independent, the tutor might believe that the student has to press the button again after opening the valve.

As a tutor is demonstrating a procedure or monitoring a student, the tutor needs to be able to answer the student's questions. To avoid confusing the student, the tutor should not provide incomplete or incorrect answers. Questions that students might ask include ones about how to perform a procedure and why an action is needed [Dav84]. Questions of this type include the following.

• As a student performs the procedure, he might ask which actions are currently applicable. For example, when starting the restart procedure, a student might ask what actions he can do now. The tutor could then tell the student that he could press the reset button or open the valve.

• A student might also ask why an action is being performed. For example, a student might ask why he needs to toggle the start button. The tutor might give two reasons: it turns on the motor, and it results in a normal internal pressure.

• A student could also ask why the state changes produced by an action are important.
For example, if a student asked why a normal internal pressure is important, the tutor might say that the pressure needs to be normal before closing the valve and that having normal pressure is one of the goals of the procedure.

The tutor also needs some capability to respond to student errors, to unusual situations, and to unexpected state changes. An example of an unusual situation is when the procedure is started with the valve already open. An example of a student error is when a student opens the valve but shuts it before starting the motor. In this case, the student would have to reopen the valve before he could start the motor. An example of an unexpected state change is when a student opens the valve but then spends a few minutes taking care of more urgent business. In the meantime, someone could walk by, notice the valve is open, and shut it. The tutor should be able to react to the valve being unexpectedly shut.

1.1.3 The Required Knowledge

In this dissertation, the knowledge of how to perform procedures will be represented using hierarchical partially ordered plans [RN95]. Not only is this representation commonly used by the Artificial Intelligence (AI) community, but it has often been used in work that focuses on describing procedures to humans. This research on describing procedures has focused on topics such as providing concise descriptions [You97], selecting rhetorical relations to express the relationships between actions [VM95, Van93], multilingual instruction generation [DHP+94, PV96, PVF+95], and representing relations that hold between pairs of actions [Pol90, Di94, Bal93].

This procedural representation has several properties that are needed to provide adequate explanations [ME89, You97].

• Large, complicated procedures can be decomposed into a sequence of smaller, simpler and logically coherent subprocedures. For example, consider a procedure that teaches someone to drive to the store.
One subprocedure might involve starting a car, and another subprocedure might involve stopping at a stop light.

• The representation describes causal dependencies between steps that can be used to answer questions about how to perform a procedure or why an action is needed. Recall that these are the types of questions that we discussed in Section 1.1.2.

1.1.4 Acquiring Knowledge from Experts

Before a procedure can be used, it needs to be acquired from a domain expert. Unfortunately, domain experts often have great difficulty encoding knowledge in a form that a computer program can use [Mus93, Gai87, EEMT87, Hof87]. The process tends to be tedious, time consuming and error prone. One problem is that programs often require a very structured and formal representation. This representation may be very different from how the expert thinks about the domain. Additionally, the techniques used for specifying a procedure may be very different from how the instructor would teach a human student. For these reasons, the process of specifying procedures may seem awkward and unnatural.

1.1.5 Heterogeneous Tutoring Environments

Instead of addressing the general knowledge acquisition problem, this thesis addresses knowledge acquisition in heterogeneous, simulation-based tutoring environments. By heterogeneous, we mean that the system contains multiple software components (e.g., automated tutor and simulation) and that these components may have been created by different people and organizations. This problem is easier than the general knowledge acquisition problem because of access to a simulation, which embodies knowledge of the domain. Although a simulation provides an executable model of the domain, simulations do not know the procedures that we are trying to teach.
1.1.5.1 Advantages of a Heterogeneous Architecture

This type of modular, heterogeneous architecture has advantages. Consider the situation when the interfaces between components are stable. Old components can be replaced by newer ones without modifying the other components. Because components can be developed separately, it may be possible to spend more time on each, especially if it appears that a component can be reused.

The modularity of components should make transferring the tutoring environment to a new domain easier. For example, the simulation for the old domain might be replaced with a simulation for the new domain. Because simulations are often written during product design and evaluation, an existing simulation for the new domain might be available. The system's modularity should also allow easier transfer of general-purpose components (e.g., automated tutors). One reason for transferring an automated tutor to another domain is that the tutor may contain a lot of reusable, general knowledge about teaching. However, without knowledge of a domain's procedures, a tutor could not teach.

1.1.5.2 Limitations of a Heterogeneous Architecture

An important characteristic of heterogeneous systems is that individual components may contain specialized knowledge and have little access to the internal knowledge of other components. Different components are likely to have been created by different people and organizations, and some of these people may have little or no contact with each other. Moreover, different components may use different representations and programming languages.

The limited access to another component's knowledge can impose restrictions on the types of operations that another component (e.g., an authoring tool) can perform on a simulation. Ideally, another component (e.g.,
an authoring tool) could extract knowledge from the simulation by putting the simulation in a particular state, performing the type of action that a human would (e.g., pressing a button), and observing the result. While a simulation should be able to support the types of actions that humans perform, the simulation may not allow another component to independently set the values of individual attributes. The rationale for this limitation is that other components may not know which states of the simulation are valid. Arbitrary changes to the simulation's state could result in inconsistent attribute values. Even worse, arbitrary state changes could put the simulation in an unsupported, invalid state and result in invalid behavior. This limitation is more important when a simulation models something (e.g., a machine) with many constraints on valid attribute values.

1.2 The Problem

This thesis deals with acquiring procedures from domain experts in a heterogeneous, simulation-based tutoring environment. To do this, we will want to meet a few general requirements.

• We want to reduce the effort required from the domain expert. This includes using techniques that will reduce the number of questions that the user is asked.

• We want to use methods that could be used by a large class of users. In particular, we are interested in someone who teaches human students (i.e., an instructor). Although an instructor is a domain expert, he may not have expertise in programming or knowledge engineering.

• We want an approach that can be easily transferred to a new domain. This means that the approach should require little explicit encoding of domain knowledge.

1.3 Addressing the Knowledge Acquisition Bottleneck

To address these requirements, we will make some assumptions about the components of the heterogeneous tutoring system.
In particular, we assume that an instructor has access to a simulation of the "real-world" domain. In this document, we will use the term environment when referring to the components of the tutoring system that simulate the domain. We assume that the environment provides two important capabilities: it provides a graphical representation of the domain, and it allows the software program that learns procedures to interact with the simulation that controls the environment.

This thesis focuses on how a software program (or agent) can use machine learning techniques and access to the environment in order to reduce the effort required from an instructor. The approach addresses the knowledge acquisition bottleneck in two ways.

• The instructor can communicate with the agent at a high level. The communication is at a high level because the graphical representation of the domain allows the instructor to demonstrate procedures in a manner similar to how he would demonstrate them to a human student.

• The agent's ability to observe and interact with the simulation allows the agent to use machine learning techniques for extracting information from the simulation. The ability to observe demonstrations allows the agent to observe how a demonstration changes the simulation's state. Furthermore, the ability to interact with the simulation allows the agent to improve its understanding by performing experiments.

One limitation of this approach is that the agent may not have enough data to learn an entirely correct procedure. Instead, the approach provides a heuristic aid to the instructor by learning procedures that are "reasonably" correct.¹ Given such a procedure, an instructor can then use his domain knowledge to refine it.

¹A procedure is "reasonably" correct in the sense that it should be close to the correct procedure.
Because observing a demonstration may not provide enough knowledge to learn a correct procedure, it is desirable for an agent to acquire more knowledge. One approach is to ask the instructor questions that clarify areas of uncertainty. However, instead of asking the instructor questions, the agent could attempt to answer its own questions by performing experiments. This thesis leaves questioning the instructor as an area for future work and instead focuses on understanding demonstrations by performing experiments.

1.4 The Basic Approach

The basic approach for learning procedures involves integrating demonstrations and experiments. The agent initially learns a procedure by observing a demonstration. The agent then refines the procedure by performing experiments that are derived from the demonstration.

The instructor demonstrates procedures by using a mouse to directly manipulate a graphical representation of the domain. When demonstrating, the instructor performs a sequence of actions (e.g., pressing a button) that manipulate the domain. These actions are referred to as the demonstration's steps, and this approach is called Programming By Demonstration [C+93]. It should be easy for an instructor to provide demonstrations because he performs the procedure just like a human student would.

While the instructor is demonstrating, the agent is constantly learning. However, the agent may not have enough data to learn the correct procedure. In particular, the agent may not have enough data to correctly identify how earlier steps in a demonstration establish preconditions of later steps. To compensate for its lack of data, the agent identifies likely preconditions. Even though likely preconditions may be incorrect, they aid the instructor by identifying a small set of items on which he should focus.

To provide more data for learning, the agent then attempts to improve its understanding of a demonstration by experimenting.
Experiments provide data by transforming one demonstration into multiple similar demonstrations. The agent experiments by replaying a demonstration while skipping a step. This allows the agent to observe how eliminating the step affects later steps. The observations are used to identify missing or incorrect preconditions. By extracting knowledge from the simulation, experiments reduce the amount of data that the instructor needs to provide.

Because there may be relatively little data, the instructor may need to interact further with the agent. This interaction may include looking at the agent's knowledge, testing the procedure or providing additional demonstrations.

As an artifact of learning procedures, the approach learns operators. An operator models an action that the instructor performs during a demonstration. To do this, an operator identifies the preconditions necessary for the action to produce a given state change. Operators are used to contain the reusable domain knowledge associated with a demonstration's steps. Unlike steps, which belong to one procedure, operators can be reused in many procedures. To help understand how operators differ from steps, consider a procedure that uses a pump to drain a tank of water. One of the steps needed to drain the tank is pressing the start button, which starts the motor. While pressing the button is a step in this specific procedure, knowledge of how pressing the button affects the pump could possibly be reused in other procedures. Diligent stores this reusable knowledge in operators.

This approach for authoring procedures has been implemented in a system called Diligent [AJR97]. The system is called Diligent because of its tenacity in attempting to understand demonstrations.
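The step-omission experiment described above can be illustrated with a minimal, self-contained sketch. The toy state representation, action format and function names here are illustrative assumptions, not Diligent's actual implementation.

```python
# Sketch of a Diligent-style step-omission experiment: replay a
# demonstration with one step removed and record which later steps no
# longer produce the effects observed in the original demonstration.

def execute(state, action):
    """Apply `action` to `state` if its preconditions hold; return the
    effects actually produced (empty dict if the preconditions fail)."""
    if all(state.get(k) == v for k, v in action["pre"].items()):
        state.update(action["eff"])
        return action["eff"]
    return {}

def omission_experiment(initial_state, demo, skip_index):
    """Replay `demo` without step `skip_index`; return the indices of
    later steps whose observed effects differ from the original run,
    i.e. steps that likely depend on the skipped step."""
    state = dict(initial_state)
    dependents = []
    for i, step in enumerate(demo):
        if i == skip_index:
            continue  # omit the step under study
        observed = execute(state, step["action"])
        if observed != step["observed_effects"]:
            dependents.append(i)
    return dependents

# Toy demonstration: insert the key, then turn it (turning depends on inserting).
insert = {"pre": {}, "eff": {"key": "in_ignition"}}
turn = {"pre": {"key": "in_ignition"}, "eff": {"motor": "running"}}
demo = [
    {"action": insert, "observed_effects": {"key": "in_ignition"}},
    {"action": turn, "observed_effects": {"motor": "running"}},
]
# Skipping step 0 reveals that step 1 depends on it.
print(omission_experiment({"key": "in_pocket"}, demo, 0))  # → [1]
```

In this way each omission run turns the single demonstration into an additional, slightly different demonstration from which dependency evidence can be gathered.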
1.5 Related Work

Diligent differs from earlier work because it solves a different problem: it is designed to exploit the presence of a simulation so that it can learn the knowledge necessary to provide good explanations while requiring minimal data from the user.

Some systems learn to perform procedures by only learning how to react to the current state (e.g., Instructo-Soar [HL95], IMPROV [Pea96] and Metamouse+ [MWM94]). While it is possible for systems that learn this type of knowledge to provide good explanations, this type of system is unlikely to be able to explain how a procedure's steps depend on each other. One reason for this limitation is that a system might not be able to directly examine some of its knowledge (e.g., production memory in Soar [LNR87]). Another reason is that the knowledge may not be structured in a way that supports good explanations [Cla86].

Some machine learning systems require a lot of data. This may mean that the learning algorithm requires a lot of data (e.g., MSDD [OC96]), but it may also mean that the system does not make the most effective use of the data that it is given. For example, OBSERVER [Wan96c], like Diligent, learns operators; but OBSERVER doesn't consider why a demonstration's steps are sequenced in a given order.

Unlike Diligent, some systems do not need access to a simulation (e.g., Disciple [TH96], KidSim [SCS94], MARVIN [SB86], and ALVIN [KW88]). These systems may have a very effective interaction with users, but they do not extract knowledge from a simulation with autonomous experiments. If these systems were used on Diligent's problem, users would have to answer questions that provide the same knowledge that Diligent learns from experiments.
Other approaches require the ability to make arbitrary changes to the state of a simulation while ignoring the constraints on valid attribute values (e.g., PET [PK86]). In machine maintenance domains, this type of change to a simulation's state is either not allowed or is likely to put the simulation in an invalid state.

Another problem is that systems can require an initial domain theory, which may be difficult to acquire (e.g., EXPO [Gil92], ASK [Gru89], CELIA [Red92], LEAP [MMS90], ARMS [Seg87], LEX [MUB83]). Although a domain theory can make learning easier and allow systems to have additional capabilities, someone must spend the requisite amount of time necessary to encode and validate the domain theory.

Finally, some systems use a representation for preconditions that is inappropriate for our problem. The representation may be complicated or difficult to understand (e.g., IMPROV [PL96]). Even if the representation is simpler, the knowledge may still be represented in an unnecessarily complex manner (e.g., LIVE [She93]). The complexity of the representation is important because an instructor is less likely to accept a system if he is uncomfortable with how it represents the domain knowledge.

1.6 Contributions

This work focuses on using machine learning techniques to integrate demonstrations and experiments. This integration not only reduces the amount of data that the user has to provide but also makes it easier to provide the data. In this way, the approach supports more efficient authoring of procedures for intelligent tutoring systems. Because the approach exploits access to a simulation, the approach can also be used with little explicit encoding of domain knowledge. The major contributions are as follows.

• A method that balances the strengths and weaknesses of demonstrations and experiments. Experiments are used to identify missing or unnecessary preconditions, but can more easily identify unnecessary preconditions.
For this reason, operators are created during demonstrations using heuristics that have a bias towards creating unnecessary preconditions. While creating operators, the system uses a novel heuristic that focuses on how earlier steps in a demonstration establish preconditions for later steps. Because experiments compensate for the bias towards creating unnecessary preconditions, Diligent can learn a great deal from a single demonstration.

• A method for performing useful and focused experiments while requiring only minimal knowledge. The approach only needs to know the sequence of steps in a demonstration. The approach exploits the simulation to focus on how the state changes of early steps in a demonstration affect later steps. This approach effectively transforms one demonstration into multiple related demonstrations.

1.7 Organization of the Thesis

The rest of this dissertation is organized as follows.

Chapter 2 presents a high-level description of how to use Diligent. It also discusses the basic concepts that an author would need to know.

Chapter 3 describes the problem more formally and provides an overview of how Diligent works. The problem statement includes high-level requirements and Diligent's inputs and outputs. The discussion of how Diligent works describes the main heuristics and the major data flows.

Chapter 4 discusses interaction with instructors and how demonstrations are transformed into plans. This includes the assumptions that Diligent makes about demonstrations. The algorithms in this chapter provide support for learning operators and performing experiments.

Chapter 5 discusses how Diligent learns operators.

Chapter 6 discusses how Diligent generates additional training data by performing autonomous experiments.

Chapter 7 provides an empirical evaluation of Diligent being used by humans.
Chapter 8 brings together the material discussed in the earlier chapters. It discusses how demonstrations, experiments and machine learning are integrated. It also talks about Diligent's assumptions and how easily they could be relaxed. It finishes by discussing limitations and potential extensions.

Chapter 9 discusses related work.

Chapter 10 provides a short summary.

Appendix A describes implementation details. Appendix B contains materials used for evaluating Diligent. Appendix C contains the data found during the evaluation. Appendix D contains tutorial material that describes how to use Diligent.

Chapter 2

Using Diligent

Diligent is a tool that supports authoring of procedures that perform tasks such as machine maintenance. The steps in these tasks include actions such as reading gauges, pushing buttons and turning handles. While some of these procedures might seem simple and intuitive, there is a need for training because there is a lot of room for human error. Obviously, automating this type of training requires someone to specify procedures with a software tool.

One obstacle to authoring procedures is that all tools require knowledge of some concepts. One advantage of Diligent is that it only requires an author to know relatively few concepts. This chapter illustrates how to use Diligent and describes the concepts that an author would need to know.

2.1 Properties of a Simple Procedure

In order to illustrate the basic properties of procedures, we will consider a simple procedure for starting a car. To start a car, a person needs to do the following.

1. Open the door.
2. Get in.
3. Shut the door.
4. Put on the seat belt.
5. Put the key in the ignition.
6. Start the car by turning the key.
However, it is not that simple; sometimes the car will not start. Perhaps the battery is bad, or perhaps the transmission is in drive rather than park.

How could people learn this type of task? One approach is simple trial and error, but this is not likely to be effective with an automobile. In fact, it could be dangerous. Another approach is to repeat a demonstration of the task by rote. However, this approach is fragile and inflexible. Consider what happens in a slightly different situation. What if one of the steps is not needed? For example, if the key is already in the ignition, do you need to take the key out and put it back in? What if one of the steps is counterproductive? For example, suppose that you always turn on the radio when you start the car. If the radio is already on, pressing the radio's on/off button will turn it off.

A better approach for learning a task is to build a simple model; this model could then be used to identify the purpose of each step and the relationships between steps. Consider a simple model of starting a car. The abstract goal is to start the car in a manner that allows you to drive safely. To achieve this abstract goal, you need to satisfy a number of conditions: the door needs to be shut, you need to be wearing the seat belt, and the motor needs to be running. There are also goals for each step. For example, you turn the key in order to start the car. If the motor is already running, you don't need to turn the key. Individual steps may also have preconditions. For example, turning the key will not start the motor if the transmission is in drive or the battery is bad. There are also dependencies between the steps because some steps may influence later steps. For example, you need to put the key in the ignition before turning the key. However, other steps can be performed in any order. For example, it doesn't matter whether you first shut the door or put on the seat belt.
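The simple car-starting model just described can be sketched as code: each step has preconditions and effects, and an ordering of steps is valid if every step's preconditions hold when it is performed and the goal conditions hold at the end. The step names, condition names and function below are illustrative assumptions, not part of Diligent.

```python
# A toy model of the car-starting procedure: steps with preconditions
# and effects over a simple attribute/value state.

STEPS = {
    "insert_key":  {"pre": {}, "eff": {"key": "in_ignition"}},
    "turn_key":    {"pre": {"key": "in_ignition", "gear": "park"},
                    "eff": {"motor": "running"}},
    "shut_door":   {"pre": {}, "eff": {"door": "shut"}},
    "fasten_belt": {"pre": {}, "eff": {"belt": "fastened"}},
}
GOAL = {"door": "shut", "belt": "fastened", "motor": "running"}

def valid_order(initial, order):
    """Check that each step's preconditions hold when it runs and that
    the goal conditions are satisfied at the end."""
    state = dict(initial)
    for name in order:
        step = STEPS[name]
        if any(state.get(k) != v for k, v in step["pre"].items()):
            return False  # a precondition was violated
        state.update(step["eff"])
    return all(state.get(k) == v for k, v in GOAL.items())

start = {"gear": "park"}
# Shutting the door and fastening the belt may come in either order...
print(valid_order(start, ["shut_door", "fasten_belt", "insert_key", "turn_key"]))
print(valid_order(start, ["fasten_belt", "shut_door", "insert_key", "turn_key"]))
# ...but the key must be inserted before it is turned.
print(valid_order(start, ["shut_door", "fasten_belt", "turn_key", "insert_key"]))
```

The first two orderings succeed while the third fails, which mirrors the point above: some step pairs are interchangeable while others are causally ordered.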
2.2 What Does the Author Need to Know?

By authoring a procedure with Diligent, an author creates a model similar to the one described in the previous section. However, to do this, the author (or instructor) needs to understand certain concepts. As in most software systems, knowledge of more concepts is needed by an expert user than by a minimally competent user.

2.2.1 Concepts Needed for Basic Use

A minimally competent user needs to know about the types of knowledge needed to represent a procedure.¹

• The state of the world is represented by conditions. A condition indicates that a given attribute has a specific value. For example, the position of the door is shut, or the seat belt is fastened.

• A procedure has an abstract description, which is used to describe the procedure to students. For example, the previous section's procedure allowed you to "start the car in a manner that allows you to drive safely."

• The purpose of a procedure is to achieve a set of goal conditions. This means that some of the environment's attributes need to have certain values. For example, the goal conditions of the car-starting procedure are "the door is shut, the seat belt is fastened, and the motor is running."

• Performing a procedure involves performing a sequence of actions that are called steps. For example, starting the car involves "opening the door, getting in ..."

• Each step has an English description that is used to describe the step to students. For example, the act of opening a door could be described as "opening the door."

• Steps have preconditions, which are conditions that need to be true immediately before the step is performed. For example, before turning the key, the transmission should be in park.

• There are dependencies between steps because some steps cause changes that are preconditions of later steps.
These dependencies are called causal links. A causal link identifies an attribute value that is a precondition for one step and is established by an earlier step. For example, the act of putting the key in the ignition causes the key's location to be in the ignition, and the presence of the key in the ignition is a precondition for turning the key. Thus, there should be a causal link between the step that inserts the key and the step that turns the key.

• Each causal link has an English description that can be given to students. For example, the presence of the key in the ignition might be described as "the key is in the ignition."

• The concept of causal links can be extended to include the initial state in which the procedure starts and the end of the procedure when all the procedure's goal conditions are satisfied. A causal link involving the initial state indicates that a precondition of a step is satisfied when the procedure starts. For example, the car-starting procedure depends on the transmission being in park at the start of the procedure. A causal link involving the end of the procedure indicates that a step establishes a goal condition. For example, putting on the seat belt establishes the goal condition that the seat belt is fastened.

A minimally competent user also needs a couple of concepts that involve Diligent's method of authoring.

• A user demonstrates a procedure by manipulating objects in the environment's graphical interface.

• Diligent can use demonstrations to generate and perform experiments that are likely to improve the correctness of the preconditions of a procedure's steps.

¹This discussion ignores concepts needed by system components other than Diligent. For example, we will not consider the concepts needed to manipulate the environment's graphical interface.
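The causal links just described, including links from the initial state and links to the goal, can be sketched as a small computation: for each precondition of each step, find the latest earlier establisher (a prior step, or the initial state), and link each goal condition to the step that achieves it. The representation and names below are illustrative assumptions, not Diligent's internals.

```python
# Toy extraction of causal links from an ordered demonstration whose
# steps have known preconditions and effects.

def causal_links(initial, steps, goals):
    """`steps` is an ordered list of (name, preconds, effects) triples,
    where conditions are (attribute, value) pairs. Returns links as
    (producer, condition, consumer) triples."""
    links = []
    # The initial state is the default establisher of its conditions.
    producers = {cond: "initial-state" for cond in initial}
    for name, pre, eff in steps:
        for cond in pre:
            # Assumes every precondition was established earlier in the demo.
            links.append((producers[cond], cond, name))
        for cond in eff:
            producers[cond] = name  # this step is now the latest establisher
    for cond in goals:
        links.append((producers[cond], cond, "goal"))
    return links

demo = [
    ("insert-key", [], [("key", "in_ignition")]),
    ("turn-key", [("key", "in_ignition"), ("gear", "park")],
     [("motor", "running")]),
]
links = causal_links(initial=[("gear", "park")], steps=demo,
                     goals=[("motor", "running")])
for link in links:
    print(link)
```

For instance, the link ('insert-key', ('key', 'in_ignition'), 'turn-key') records that inserting the key establishes a precondition of turning it, while ('initial-state', ('gear', 'park'), 'turn-key') records a precondition satisfied when the procedure starts.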
2.2.2 Concepts Needed for Advanced Use

To get beyond minimal competence, a user should know additional concepts. Unfortunately, Diligent's implementation does not always distinguish between basic and advanced concepts. This makes a minimally competent user's job more difficult because Diligent's user interface shows advanced concepts that are not needed for basic use.

The first group of additional concepts relates to the representation of procedures.

• The actions performed by users are modeled by operators. Operators describe the preconditions necessary for an action to produce some desired state changes. Operators differ from steps in that operators are independent of a given step or procedure. The use of operators allows Diligent to reuse the knowledge contained in an operator by associating the operator with potentially many steps. Each step is associated with one operator, and Diligent uses the step's operator to identify the step's preconditions. Suppose the user wants to author procedures for starting a car in different initial states. For example, there might be one procedure for when the battery is bad and another procedure for when the transmission is initially in drive. In this case, all procedures for starting the car would share the operator for turning the key in the ignition.

Diligent's user interface expects users to know about operators during demonstrations. To simplify the interaction with the user, Diligent generates a step's name using its operator. If there is no operator for an action, Diligent creates one and asks the user for a name and an English description.

• If an action is performed in different situations, it can cause different state changes. Operators model this by having multiple conditional effects (or effects).
Each effect identifies preconditions that need to be satisfied for the action to produce a given set of state changes. For example, suppose a radio has an on/off button. When the radio is on, pressing the button turns it off, and when the radio is off, pressing the button turns it on. An operator that models pressing the button should have one effect for turning the radio on and one for turning it off.

• A step can also have preconditions that are specific to that step. Because these preconditions are independent of the step's operator, the preconditions are associated with the step rather than the operator. For example, a person could read a car's fuel gauge at any time. However, the value shown on the gauge may be invalid when the motor is off. Therefore, if you have a step that reads the fuel gauge, you may want the step to have a precondition that requires the motor to be running.

• When Diligent initially learns a procedure, some preconditions may be incorrect. Diligent can help the user identify problems with preconditions by indicating its confidence in a given precondition. In this way, a user can distinguish between preconditions that are very likely and those that are somewhat likely. (The measures of belief are the s-rep, h-rep and g-rep, which are described in Chapter 3.)

• Sometimes the preconditions of several steps may be satisfied simultaneously. When this happens, it may not matter which of the steps is performed first, but sometimes it does. This type of dependency between two steps is called an ordering constraint. An ordering constraint indicates which of the two steps to perform first. For example, an ordering constraint might indicate that a car's key needs to be inserted into the ignition before the key is turned. Normally, there is one ordering constraint for every causal link, but sometimes additional ordering constraints are needed. Suppose that
a procedure involved unlocking the glove compartment, removing a map, locking the glove compartment, and then inserting the key in the ignition. It might be possible to take the key out of the lock and put it in the ignition without locking the glove compartment. This problem can be avoided by adding an ordering constraint to prevent the key from being inserted into the ignition before the glove compartment is locked.

Although Diligent's user interface assumes that all users know about ordering constraints, the user interface need not have made this assumption. Users should almost always use the ordering constraints that Diligent automatically identifies.

• Diligent allows users to author a large procedure by hierarchically composing it from smaller procedures. When this happens, the small procedures are treated as steps in the large procedure. When a procedure is used as a step, like any other step, it is considered to have preconditions and produce state changes. Suppose the user wants to create a procedure for driving to a grocery store. In this procedure, the procedure for starting a car might be the first step.

An advanced user should also have some knowledge of how Diligent learns.

• Diligent assumes that changes to the state during a demonstration are likely to be important. Sometimes state changes of early steps in a demonstration are proposed as preconditions of later steps. This approach sometimes causes Diligent to propose unnecessary dependencies between steps. (Diligent can later attempt to correct these unnecessary dependencies by performing experiments.) An author can use Diligent's bias towards state changes to provide demonstrations that promote more effective learning. The steps in a demonstration should be closely related, and closely related steps should be demonstrated together.
If each step in a procedure comes from a different one-step demonstration, Diligent does not learn as effectively.

• Diligent's learning can be improved if a given action (e.g. fastening the seat belt) is demonstrated in very different situations. Diligent has Clarification demonstrations that support this type of input (Section 4.2).

2.3 Using Diligent to Author a Procedure

The above discussion describes the concepts that authors should know, but it doesn't describe how an author would use Diligent. This section illustrates how to use Diligent.2 To do this, it uses a procedure from the High Pressure Air Compressor (HPAC) domain.3

2.3.1 The Procedure to be Authored

Sometimes high levels of condensation can build up inside the compressor. To avoid damaging the machine, the compressor's condensate drain monitor turns off the motor. To restart the motor, the condensation needs to be drained. This condensation is built up in one of the separator drain manifold's five valves. There is one valve for each of the compressor's stages. If too much condensation builds up in one of the stages, the alarm light for that stage is illuminated.

For the procedure's initial state, we will assume that the motor has stopped and that the alarm lights for the first and second stages are illuminated. The procedure to restart the compressor has the following steps.

1. Open the first stage valve. This is done by turning a handle that is used to open and shut valves.

2. Move the handle to the second stage valve.

3. Use the handle to open the second stage valve.

4. Reset the compressor by pressing the condensate drain monitor reset button.

5. Start the motor by pressing the motor button on the HPAC's control door panel. This starts the motor and drains the first and second stages. Once the stages are drained, the alarm lights will turn off.

6.
Use the handle to close the second stage valve.

7. Move the handle to the first stage valve.

8. Use the handle to close the first stage valve.

2 Appendix D contains a more complete discussion of how to use Diligent.
3 Section B.9.1 contains a plan for this procedure.

2.3.2 Authoring the Procedure

Suppose that Diligent initially has no knowledge of the domain.

2.3.2.1 Creating the Procedure

Figure 2.1: Procedure Description Menu

When authoring a procedure, the first thing the instructor does is tell Diligent that he wants to author a procedure. Diligent then brings up the Procedure Description Menu (Figure 2.1). The procedure name is used by the instructor to identify the procedure, while the description contains English text that is presented to students.

In our example, the instructor calls the procedure "re-start" and types in the description "restart the motor when high levels of condensation." Afterwards, the instructor continues by selecting the "accept" button.

2.3.2.2 Specifying the Initial State

Figure 2.2: Resetting the Environment's State

Before starting the demonstration, the simulation environment needs to be put in an initial state. To do this, the instructor resets the environment to a known configuration, which is identified by a text string (Figure 2.2).

Resetting the environment to a known state may seem strange, but it is very useful in an educational setting. It supports both teaching and authoring. It allows an automated tutor to reset the environment before demonstrating a procedure to students. It also allows students to start procedures from pre-specified initial states. The ability to reset the state is also useful when providing additional demonstrations and when Diligent experiments with a procedure.
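The reset-then-demonstrate flow described above can be sketched as follows. This is an illustrative Python sketch: restoring the environment's state and observing actions mirror the environment interface described in Chapter 3, while `demonstration_finished` is a hypothetical helper standing in for the instructor's "done" signal.

```python
def record_demonstration(env, configuration_id):
    """Reset the environment to a named configuration, then collect
    action examples until the instructor signals completion.  Each
    example pairs the action performed with the environment's state
    before and after it."""
    env.restore_environment_state(configuration_id)
    examples = []
    # demonstration_finished() is a hypothetical stand-in for the
    # instructor indicating that the demonstration is done.
    while not env.demonstration_finished():
        examples.append(env.observe_action())
    return examples
```

Remembering the configuration identifier alongside the recorded actions is what later lets Diligent restore the demonstration's initial state for experiments or additional demonstrations.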
In our example, the instructor enters the configuration "config" and then selects the "Ok" button. This causes Diligent to reset the state of the simulation environment. The instructor will use this new state as the procedure's initial state.

2.3.2.3 Demonstrating the Procedure

Once the environment is in the desired initial state, the instructor can start demonstrating the procedure. Figure 2.3 shows the environment's graphical representation of the procedure's initial state. To demonstrate, the instructor directly manipulates objects in the graphical interface by selecting them with a mouse. Here is how an instructor could demonstrate the example procedure.

1. The instructor opens the first stage valve by selecting the handle (the cross-shaped object) that is on top of the valve. This causes the object that represents the handle to turn. This new position indicates that the valve is open.4

2. The instructor moves the handle to the second stage valve by selecting the second stage valve.

3. The instructor opens the second stage valve by selecting the handle that is on top of the valve.

4. The instructor resets the compressor by selecting the condensate drain monitor reset button.

5. The instructor interacts with the simulation to move the view of the HPAC to the control door. (In the current implementation, this can be done by pressing a button that is in a different window.) This is not recorded as one of the demonstration's steps.

6. The instructor starts the motor by selecting the motor button.

4 One idiosyncrasy of the environment's graphical interface is that whether a valve is open or shut can only be seen when the handle is on top of the valve.

Figure 2.3: The Front of the HPAC
7. The instructor interacts with the simulation to move the view back to the front of the HPAC. (This can also be done by pressing a button that is in a different window.) This is not recorded as one of the demonstration's steps.

8. The instructor closes the second stage valve by selecting the handle that is on top of the valve.

9. The instructor moves the handle to the first stage valve by selecting the first stage valve.

10. The instructor closes the first stage valve by selecting the handle that is on top of the valve.

At this point, the instructor indicates that he is done with the demonstration.

The above discussion of the demonstration does not discuss the creation of operators. Diligent associates each step in the demonstration with an operator, which Diligent uses to model the step's preconditions and state changes. By associating a reusable operator with a step, Diligent is better able to identify the step's preconditions because it can learn from other steps that use the same operator.

Ideally, a minimally competent user would not see or care about operators. For example, a user might focus on the preconditions and state changes of the steps while ignoring how they were identified. However, Diligent's user interface exposes the concept of operators to instructors. This happens the first time that an instructor demonstrates a given action (e.g. selects the first stage valve). At this point, Diligent asks the instructor to name the operator and to modify or approve Diligent's default description. Once operators have been given names, Diligent uses the names to automatically generate step names. (Note that default names and descriptions could have been generated without consulting the instructor.)

Figure 2.4 shows the naming of the operator for the example procedure's first step. The default description was generated by considering the type of action (i.e. turning the handle) and the object acted upon (i.e.
the handle that is used for manipulating valves). In the example procedure, Diligent learns operators that model the following actions: turning the handle on top of a valve, moving the handle to the second stage valve, pressing the reset button, pressing the motor button, and moving the handle to the first stage valve.

Figure 2.4: Operator Description Menu

2.3.2.4 Creating a Procedure from the Demonstration

So far, we have identified a sequence of steps that make changes to the environment's state. However, we have not yet identified what the procedure is supposed to accomplish or how later steps depend on earlier steps.

The first thing the instructor needs to do is tell Diligent to identify some likely goal conditions. A goal condition indicates the value a given attribute must have in order to finish the procedure (e.g. the compressor's motor is running). Because the purpose of a procedure is to change the environment's state, Diligent hypothesizes that the final values of all attributes that change value during the procedure are goal conditions.

After Diligent identifies some likely goal conditions, the instructor is shown a menu (Figure 2.5) where he can approve or reject the conditions. In our example, we will assume that the instructor approves all hypothesized goal conditions. Unfortunately, the goal conditions in Figure 2.5 are a little cryptic because attribute names are being shown. This problem can be overcome by selecting a goal condition with the mouse and getting a more detailed description (Figure 2.6).

Once the goal conditions have been identified, Diligent is able to identify which steps achieve which goal conditions and to determine how later steps are dependent on earlier steps.
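The goal-hypothesis heuristic above, proposing the final value of every attribute that changed during the demonstration, can be sketched as follows. This is an illustrative Python sketch in which a state is just an attribute-to-value dictionary, not Diligent's actual representation; the attribute names are invented for the example.

```python
def hypothesize_goal_conditions(initial_state, final_state):
    """Propose as goal conditions the final values of all attributes
    whose value changed during the demonstration."""
    return {attr: value
            for attr, value in final_state.items()
            if initial_state.get(attr) != value}

# A simplified snapshot of the HPAC restart demonstration:
initial = {"motor": "off", "alarm-light-1": "on", "valve-1": "shut"}
final   = {"motor": "running", "alarm-light-1": "off", "valve-1": "shut"}
hypothesize_goal_conditions(initial, final)
# → {"motor": "running", "alarm-light-1": "off"}
```

Unchanged attributes (like valve-1 above, which was opened and then closed again) are filtered out, which is why the heuristic proposes only the net changes as goals.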
This calculation could have been done automatically, but for implementation reasons, Diligent requires the instructor to explicitly request this calculation. Once the goal conditions and the dependencies between steps have been identified, Diligent is able to generate a procedure from the demonstration.

Figure 2.5: Hypothesized Goal Conditions

2.3.2.5 The Initial Procedure

Figure 2.7 shows the dependencies between the procedure's steps when the procedure is first created. The lines represent dependencies between pairs of steps. The steps that the instructor demonstrated are shown as ovals. The ovals contain step names, which are numbered sequentially in the order that the steps were demonstrated. The rectangle labeled begin-re-start represents the procedure's initial state. Not shown in the figure is a rectangle labeled end-re-start that represents the end of the procedure. A line between a step and the end of the procedure indicates that the step establishes one of the procedure's goal conditions.

Figure 2.6: Hypothesized Goal Condition Details

To edit the procedure, the instructor can bring up a menu by selecting an oval or a rectangle with the mouse. The details of these menus won't be shown because they emphasize details of the user interface's implementation.5

2.3.2.6 Refining the Procedure with Experiments

Because the initial procedure (Figure 2.7) was created with very little data, Diligent had to use heuristics. One heuristic is that state changes caused by earlier steps are likely to be preconditions of some later steps. While the dependencies between steps identified by the heuristics are often valid, the dependencies are not always valid.
Diligent can help alleviate this problem by generating more data for identifying the preconditions of steps. Diligent does this by performing experiments where it performs the procedure while skipping a step. These experiments allow Diligent to observe how the missing step affects later steps. Some advantages of experiments are that the instructor does not have to perform any additional work and that an experiment's results are gathered from the simulation's executable model of the domain.

In order to keep Diligent's user interface more responsive, Diligent only experiments when told to do so by the instructor. The problem is that experiments require dedicated use of the simulation environment. If Diligent automatically decided to initiate an experiment, then during the experiment, the instructor could not use the environment's graphical interface to perform demonstrations or to test procedures. Furthermore, experiments can change the data shown on menus, and Diligent's implementation does not support dynamically updating menus with the results of experiments.

5 Appendix D discusses Diligent's menus.

Figure 2.7: The Initial Dependencies

Figure 2.8: The Dependencies After Experimentation

At this point, let us suppose that Diligent experiments with our example procedure. The improved procedure would now look like Figure 2.8.
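The skip-a-step experiments described above can be sketched as the following loop. This is an illustrative Python sketch: `restore_environment_state` and `perform_action` mirror the environment interface described in Chapter 3, and the analysis that actually prunes dependencies from the collected results is omitted.

```python
def run_skip_step_experiments(env, configuration_id, steps):
    """For each step, replay the procedure with that step omitted and
    record how every remaining step behaved.  Comparing these outcomes
    with the full demonstration lets the learner drop hypothesized
    dependencies on a step whose omission changed nothing."""
    results = {}
    for skipped in steps:
        # Restore the demonstration's initial state before each trial.
        env.restore_environment_state(configuration_id)
        outcomes = []
        for step in steps:
            if step == skipped:
                continue
            # Each action example captures the state before and after.
            example = env.perform_action(step)
            outcomes.append((step, example))
        results[skipped] = outcomes
    return results
```

Because each trial starts from the same restored configuration, the only difference between a trial and the original demonstration is the omitted step, which is what makes the comparison informative.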
The experiment allowed Diligent to remove unnecessary dependencies, such as the one between steps turn-1 and turn-3.

2.3.2.7 Additional Authoring Activities

At this point, the instructor could refine the example by performing additional authoring activities. He could use menus to edit the procedure, he could test the procedure with the STEVE tutor, and he could provide Diligent with additional demonstrations. Additional demonstrations can be used to add new steps to an existing procedure, but demonstrations do not have to add steps. Diligent also supports demonstrations that are only used to help learn operator preconditions. However, demonstrations that focus on learning preconditions without adding steps would probably only be used by an expert user. Diligent allows editing, testing and demonstrating to be interleaved and repeated. This allows an instructor to iteratively refine a procedure.

2.4 Summary

This chapter presented a high-level discussion of the basic concepts involved in using Diligent. We first discussed a simple procedure for turning on a car's motor. This example was used to identify properties of procedures. The purpose (or goal) of a procedure is to make changes to the world. This is done by performing a sequence of steps where some later steps may be dependent on changes caused by earlier steps.

We then discussed what concepts are needed for simple use of Diligent. Users need to understand the basic representation (but not operators). Users should also understand that Diligent uses demonstrations and can improve its knowledge with experiments. For more advanced use, a user should understand the operator representation and have some idea of what types of examples will improve learning. Users should also understand Diligent's heuristic assumption that the state changes of earlier steps are likely to be preconditions of some later steps. Users should also understand extensions to the basic representation.
For example, an advanced user should understand that a large procedure can be composed hierarchically by using other procedures as steps in the large procedure.

Finally, we went through an example that illustrated what an instructor would have to do in order to author a procedure.

Chapter 3

Diligent - Its Problem and Approach

As discussed previously, this thesis presents an approach for learning procedures for use in a heterogeneous, simulation-based tutoring system. In our approach, a human instructor authors procedures by demonstrating them in a manner similar to how he would demonstrate them to a human student. This thesis focuses on understanding these demonstrations. The demonstrations are understood by first observing them and then refining the system's understanding with experiments.

The approach was implemented in a system called Diligent, which was used to learn procedures for machine maintenance. However, the approach is independent of any particular domain.

This chapter describes how Diligent works and the problem that it is addressing. First, we will characterize the problem. Second, we will discuss inputs and outputs. Third, we will describe the basic heuristics and how demonstrations are turned into procedures. The chapter finishes by indicating where various algorithms are discussed in greater detail.

3.1 The Problem

The problem that Diligent addresses can be described by high-level requirements, constraints on the nature of the domain knowledge, and the functionality provided by the interface to the simulated domain (or environment).

3.1.1 Requirements

Diligent has several high-level, general requirements.

Produce a procedure. Diligent needs to learn procedures that can be used for teaching human students.
Understand demonstrations. An instructor specifies the steps of procedures with demonstrations. Therefore, in order to understand procedures, Diligent needs to understand demonstrations.

3.1.1.1 Reducing the Instructor's Effort

Diligent has a general requirement to reduce the effort required from an instructor. This requirement has a number of aspects.

Maximize the utility of each demonstration. Not only do demonstrations require some effort on the part of an instructor, but the number of demonstrations that an instructor can provide is limited. Therefore, it behooves Diligent to extract as much information as possible from each demonstration.

Save time. A goal of a system like Diligent is to reduce the amount of time required by an instructor. One way to do this is to get the maximum use out of each demonstration.

Reduce the difficulty. Diligent should make authoring easier. We assume that easier authoring helps improve the quality of the resulting procedures because an instructor is less likely to have a lapse of concentration and because he can focus his attention more effectively. Authoring should be easier if it takes less time and if the system gets the maximum use out of each demonstration.

Exploit the environment. Diligent should gather as much information as possible from the environment in order to avoid asking the instructor unnecessary questions. By exploiting the environment, Diligent can learn more from demonstrations and make the instructor's job less difficult while taking less time.

Provide aid when possible. Whenever possible, Diligent should aid the instructor by doing tasks that it can easily automate. For example, given knowledge of a procedure, Diligent can easily output the procedure in the form of a plan. However, given the same information, the instructor may have difficulty creating a plan.
3.1.2 Constraints on the Domain Knowledge

The available domain knowledge constrains which hypotheses Diligent proposes and how Diligent approaches learning.

Little initial knowledge. When Diligent starts, it may have no knowledge of the domain. Of course, Diligent gains knowledge as it interacts with the instructor, but Diligent still needs to be able to function with very little knowledge. This thesis looks at general-purpose techniques that would allow an authoring tool to be easily ported to a different domain. For this to happen, little explicit encoding of domain knowledge should be required from users. Because little knowledge may be available, the techniques need to be robust enough to provide an instructor with "useful" feedback when only given a small amount of data. For example, Diligent should be able to identify likely dependencies between a procedure's steps after only a single demonstration of that procedure.

Idiosyncratic objects. In machine maintenance domains, every member of a class of objects (e.g. buttons or switches) may behave differently. This happens because they have different functions (e.g. start the motor versus turn off the lights). Thus, actions performed on the same class of object may have very different preconditions and produce very different state changes. Because objects of the same class may behave so differently, Diligent makes no attempt to generalize an action's preconditions so that the preconditions can be applied to actions on similar objects.1

Unstructured environment. Diligent may have an unstructured environment. The state of an unstructured environment only contains a set of attribute values. An unstructured environment contains no information about the relationships between attributes, between objects, or between attributes and objects.
Because structural knowledge might be easy to acquire, why doesn't Diligent require it? First, someone would have to explicitly encode the knowledge if it were unavailable. Second, because objects may have idiosyncratic behavior, it would be difficult to use structural knowledge to make generalizations that apply to a class of objects rather than to a specific object. Third, the machine maintenance domains that we have looked at contain a lot of constraints between objects, and many of these constraints are unknown. This means that an action on one object may have an effect on another object that appears to be totally unrelated to the first object. For example, pressing a button could drain an overflow tank in another room. Since structural knowledge is not of obvious benefit in our domains, not requiring it makes Diligent's techniques more general.

Can observe all relevant attributes. Diligent can see every attribute in the environment that is relevant. Relevant attributes include those that are used for teaching students as well as attributes that are required to make the actions performed in the environment appear deterministic. Because Diligent can see all relevant attributes, it does not need to detect or model missing attributes.

Most attributes are irrelevant. In a complex domain, there may be hundreds if not thousands of attributes. For example, the simulation in our Gas Turbine Engine domain internally used over 10,000 attributes.

1 We considered learning generalized preconditions that could apply to a class of objects. If a particular object were to have unusual behavior, then the object's behavior could have been modeled differently than the other objects in the class. While this approach is a possible extension, it did not appear very useful in the domains that we were using.
In Diligent's main domain (High Pressure Air Compressor), the simulation internally used approximately 4,500 attributes, of which approximately 70 appeared to have potential use in teaching.2 However, for any given procedure, most attributes are very likely to be irrelevant. This means that the learning algorithm will need to focus on filtering out irrelevant attributes.3

2 Diligent and the STEVE tutor had access to approximately 40 attributes.
3 Identifying attributes that are relevant or likely to be relevant is a major focus of the algorithms discussed in the chapters on learning operators and experimentation.

3.1.3 Interface with the Environment

In this document, the parts of the heterogeneous tutoring system that simulate the domain are called the environment. Because this work focuses on using the environment, we need to discuss the functionality provided to Diligent by the environment. Diligent places very few demands on the interface to the environment, and because the interface is very general, it could be supported by a wide variety of simulations. Because our purpose is to convey the basic functionality provided by the environment, we will use informally defined data types. These data types will be defined formally in Section 3.2.1.1.

procedure Current-State
output: state-vector

The procedure Current-State returns a state-vector, which contains the current values of the attributes that Diligent can observe. This functionality is necessary when an automated tutor interacts with human students. It is also useful when an instructor tests a procedure.

procedure Observe-Action
output: action-example

The procedure Observe-Action allows Diligent to observe the instructor as he performs a demonstration.
The action-example identifies which action the instructor performed and how it changed the state of the environment. The change in state is represented by the environment's state before and after the instructor performed the action. It is assumed that Diligent can see all actions performed by the instructor and that Diligent is notified whenever an action is performed.

procedure Perform-Action
input: action-id
output: action-example

The procedure Perform-Action allows Diligent to perform experiments. The action-id identifies which action should be performed (e.g. press the red button). The action-example identifies the environment's state before and after the action was performed.

procedure Restore-Environment-State
input: configuration-id

The procedure Restore-Environment-State allows Diligent to restore the environment to a known state. The configuration-id is a text string that the instructor provides, and it identifies a known configuration of the environment. Other than this text string, Diligent has no knowledge about what data is required to restore the environment's state.

The ability to reset the environment is reasonable in a system that tutors human students. This ability allows students to start working on procedures from known, pre-specified initial states. Diligent uses Restore-Environment-State before starting a demonstration. This allows Diligent to later restore the demonstration's initial state so that Diligent can perform experiments or the instructor can provide additional demonstrations.

One procedure that is not listed is Save-Environment-State. While instructors require this capability in order to create configurations for Restore-Environment-State, Diligent does not dynamically create new environment configurations because it takes too
To see why saving the environment's state could take a long time, consider that a simulation could use thousands of attributes, and the values of each of these attributes would need to be saved. As will be discussed later, Diligent gets around its inability to save the environment's state by remembering an existing configuration and the subsequent actions used to modify the environment's state.

Besides the functionality to observe and manipulate the environment, Diligent requires some functionality to map the identifiers used by the environment to English descriptions. This has a couple of advantages in a heterogeneous tutoring system. First, it allows different components of the system to use the same terms when discussing the same objects. Second, with a centralized repository of descriptions, problems caused by inconsistencies or errors in redundant descriptions are less likely.

procedure Action-Type-Description
input: action-type
output: description

The procedure Action-Type-Description describes the type of action that the instructor performed. Some examples are "pressing" a button or "toggling" a switch.

procedure Object-Description
input: environment-object
output: description

The procedure Object-Description provides a name for domain objects (e.g. "reset button" or "alarm light").

procedure Attribute-Description
input: attribute-name
output: description

The procedure Attribute-Description describes attributes of the environment. Attributes can represent things such as whether a valve is open or whether a light is illuminated. For example, a description of attribute valve1 might be "first stage valve." There is no procedure Attribute-Value-Description because Diligent assumes that attribute values are English text strings that can be understood by students.
For example, the state of a light might be "on," or the position of a valve might be "off." One consequence of this relatively limited interface is the following constraint.

[Figure 3.1: Input/Output — a diagram showing the data flows (commands, demonstrations, current state, English descriptions, procedures, operators, hypotheses) among the instructor, the graphical interface, the simulation, and Diligent.]

Cannot directly change attribute values. Although Diligent has the ability to restore the environment's entire state, we do not assume that Diligent has the ability to independently set the value of individual attributes (e.g. set the value of valve1 to open). This restriction might be unnecessary in domains with few constraints on valid attribute values, but in some domains (e.g. machine maintenance domains), there tend to be a large number of constraints.

Consider a simulation of an automobile. One source of constraints is consistency between attribute values. For example, depressing the gas pedal causes more gasoline to flow to the engine. The simulation would be in an inconsistent state if Diligent were allowed to increase the depression of the pedal without increasing the flow of gasoline.

Another source of constraints is avoiding states that the environment does not support. For example, the simulation may not correctly model the behavior of the automobile's suspension at high speeds. If Diligent were able to change the automobile's speed to a high value, then the behavior of the simulation would be invalid.

3.2 Input and Output

Figure 3.1 shows the input and output involved in authoring procedures.
In the figure, the lines represent data flows and the rectangles represent programs. While Diligent and the instructor treat the environment as a single entity, the environment actually consists of a graphical interface and a simulation. The simulation controls the state of the graphical interface. The instructor interacts with the environment by manipulating and examining the graphical interface, while Diligent interacts directly with the simulation.⁴

3.2.1 Input

When learning a procedure, Diligent gathers input from both the instructor and the environment.

The instructor uses the environment's graphical interface to demonstrate a procedure. Diligent observes the demonstration as a sequence of action-examples that it receives from the environment. (As will be described below, an action-example indicates what action was performed and the environment's state before and after the action.) As the instructor works on a procedure, he can examine the state of the environment by looking at the graphical interface. However, as in the "real world," the instructor may be unable to directly observe the values of some of the environment's attributes (e.g. whether the internal pressure is normal).

The instructor uses Diligent's menus to provide additional knowledge beyond demonstrations. He provides text strings for names and descriptions. For example, a procedure might be called "procA" and have the description "shut down the device." He can also verify Diligent's hypotheses by making discrete choices. A discrete choice may involve indicating "yes" or "continue" (e.g. accepting hypothesized procedure goals), or it may involve selecting one of several possibilities (e.g. rejecting a precondition). Additionally, the instructor controls the interaction with Diligent through commands (e.g. start a demonstration).

Besides demonstrations, the environment also provides Diligent with English descriptions of attributes, objects, and actions.
One use of these descriptions is mapping the environment's internal representation to something that the instructor can understand. Another use is building default descriptions that could be used with human students.

⁴ The simulation is implemented with VIVIDS, which is a version of the University of Southern California's Behavior Technology Laboratory's RIDES [MJP+97], and the graphical interface is implemented with Lockheed Martin's Vista Viewer [SMP95]. Additional details of the implementation are located in Appendix A.

Diligent also has the ability to send two types of commands to the environment. One type resets the environment to a known state. This allows Diligent to set up and restore a demonstration's initial state. Besides restoring the state, Diligent can also perform the same types of actions (e.g. pressing a button) as the instructor. This ability allows Diligent to experiment.

3.2.1.1 Definitions of Basic Data Types

identifier → string of ASCII characters
description → string of ASCII characters that a human should be able to understand
attribute-name, environment-object, action-type, configuration-id → identifier
attribute-value → description
condition → ( attribute-name attribute-value )
action-id → action-type environment-object
state-vector, partial-state → set of conditions
pre-state, post-state → state-vector
delta-state → partial-state
action-example → action-id pre-state post-state delta-state

Figure 3.2: Syntax of Basic Data Types

Figure 3.2 shows the syntax of the basic data types used for input. Two types of ASCII strings are used: identifiers and descriptions. Descriptions contain text that a human student should be able to understand.
For example, an attribute-value is a description and should be something like "open" or "on" rather than something like "x25" or "LtPurp."⁵ In contrast, identifiers are not given to students. Because Diligent uses some identifiers internally, an instructor may only be aware of a few types of identifiers (i.e. attribute-name and configuration-id).

⁵ If attribute values could not be understood by students, the environment would have to provide Diligent with a means of translating a value into a human-understandable form.

The interface uses identifiers for a variety of purposes. An attribute-name identifies an attribute in the environment. An environment-object is an object in the environment that can be manipulated (e.g. reset button). An action-type indicates what type of action is performed (e.g. pressing a button versus toggling a switch). A configuration-id identifies an internal state of the environment. A configuration-id allows Diligent and the instructor to communicate about known states. Configuration-ids are also needed because Diligent may only be aware of a small fraction of the environment's internal state.

The portion of the environment's state that Diligent can see is represented as a conjunction of conditions. A condition contains an attribute and its value, and represents an equality relation. A condition is said to be satisfied or true if the attribute has the given value; otherwise, the condition is said to be unsatisfied or false. For example, the condition (valve1 open) indicates that attribute valve1 has the value open. Diligent only supports conditions that involve an equality relation between an attribute and a small set of discrete values (e.g. "solid," "liquid," or "gas"). Attribute values cannot be continuous (e.g.
real numbers), and the relation between an attribute and its value cannot be a relation such as "less than," "greater than," or "not equal." For example, a condition cannot represent the relation "temperature < 100 degrees."⁶

⁶ Relaxing the assumption that all conditions represent equality relations is briefly discussed in Section 8.2.

An instructor performs actions in the environment by selecting objects in the graphical interface with the mouse. An action is represented with an action-id. An action-id identifies the type of action (e.g. pressing a button) and the object acted upon (e.g. the reset button). (How Diligent uses action-ids will be discussed below when we cover operators, which are used to model actions.) While Diligent could have allowed an instructor to perform an action by using a menu to select from a list of actions, Diligent does not assume that it knows all possible types of actions or all objects that could be acted upon.⁷

⁷ When we evaluated Diligent, we created a version of Diligent that allowed users to specify an action by using a menu to select the action from a list of predefined actions.

The interface with the environment uses sets of conjunctive conditions. One type of set is a state-vector, which contains all attributes that Diligent can see. Another type of set is a partial-state, which may contain zero or more conditions.

An action-example indicates how an action (action-id) affected the environment. The state before the action is called the pre-state, and the state afterwards is called the post-state. The delta-state contains the post-state conditions of attributes that changed value during the action.

To support identification of the pre-state and post-state, it is assumed that actions are discrete rather than continuous. This means that there is an identifiable time before an
action starts and after it finishes. Actions that have delayed effects, which appear after subsequent actions have been performed, are beyond the scope of this thesis and are an area for future work. Furthermore, it is assumed that actions are deterministic in that an action will produce the same post-state when performed in the same pre-state.

action-id: turn handle1 (i.e. turn the handle on a valve)
pre-state: (valve1 open)(valve2 open)(HandleOn valve2)
post-state: (valve1 open)(valve2 shut)(HandleOn valve2)
delta-state: (valve2 shut)

Figure 3.3: An Action-Example

Figure 3.3 shows an action-example. In the example, an instructor selects a handle that is used to open or shut valves. When the instructor does this, the valve underneath the handle (valve2) is shut. The only change in the state (i.e. delta-state) is that valve2 is now shut.

3.2.2 Output

Diligent produces two types of objects (i.e. procedures and operators) that could be given to an automated tutor. Procedures describe how to perform tasks, such as restarting an air compressor, and operators model how the actions performed by an instructor affect the state of the environment. While only procedures can be directly used for teaching, a tutor could use operators for more robust error recovery and for modifying existing procedures.⁸

To see how operators could help a tutor, consider the following example. Suppose a student is trying to learn how to check out an air compressor, and he mistakenly turns off the compressor's power. The procedure used by the tutor should allow the tutor to deal with most unexpected events and student errors, but what happens if the student performs an action that is not described by the procedure? If the tutor only used the procedure, it might not be able to provide the student with help, but if the tutor used operators, it might be able to modify the procedure so that it could provide help.
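Returning to the action-example of Figure 3.3, the bookkeeping behind the delta-state is mechanical: it is simply the set of post-state conditions whose attributes changed. The sketch below illustrates this with the figure's attribute values; the function name is an illustrative assumption.

```python
# Sketch of action-example bookkeeping (cf. Figure 3.3).
# The function name is an assumption; the attribute names and
# values are taken from the figure.

def delta_state(pre_state: dict, post_state: dict) -> dict:
    """Post-state conditions of attributes that changed during the action."""
    return {attr: value for attr, value in post_state.items()
            if pre_state.get(attr) != value}

pre = {"valve1": "open", "valve2": "open", "HandleOn": "valve2"}
post = {"valve1": "open", "valve2": "shut", "HandleOn": "valve2"}

# An action-example bundles the action-id with the observed states.
action_example = ("turn handle1", pre, post, delta_state(pre, post))
# delta_state(pre, post) == {"valve2": "shut"}, matching Figure 3.3
```

Because the delta-state is derivable from the pre-state and post-state, it carries no extra information; including it in the action-example merely saves recomputation.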
⁸ Extensions to the basic procedure and operator representation are discussed in Chapter 8.

3.2.2.1 Procedures

As was discussed in the first chapter, Diligent outputs procedures in a form (i.e. hierarchical partially ordered plans [RN95]) that is commonly used by the Artificial Intelligence community and has often been used in research on explaining procedures to humans. This research has looked at topics such as providing concise descriptions [You97], selecting rhetorical relations to express the relationships between actions [VM95, Van93], multilingual instruction generation [DHP+94, PV96, PVF+95], and representing relations that hold between pairs of actions [Pol90, Di 94, Bal93].

Research has identified features of this representation that are needed to provide adequate explanations [ME89, You97].

• Large, complicated procedures can be decomposed into a sequence of smaller, simpler, and logically coherent subprocedures. For example, consider a procedure that teaches someone to drive to the store. One subprocedure might involve starting a car, and another subprocedure might involve stopping at a stop light.

• The representation describes causal dependencies between steps that can be used to answer questions about how to perform a procedure or why an action is needed.⁹

To clarify the discussion, we will use the example plan in Figure 3.4. The components of a procedure that Diligent outputs include

• Name. Names are used with an instructor to identify a procedure. Diligent's implementation requires that each procedure have a distinct name. In the example, the plan is named procA.

• Set of steps. Each step corresponds to an action performed in the environment or to another procedure.
A procedure embedded inside another procedure as a step is called a subprocedure, and the procedure containing a subprocedure is called the parent procedure. Steps representing a subprocedure are called abstract, while steps representing an action are called primitive. For implementation reasons, procedures are trees in that no procedure can recursively be a step inside itself or any of its subprocedures.

⁹ Questions about how to perform a procedure and why to perform a step were briefly discussed in Section 1.1.2. However, this topic is not a focus of this dissertation.

Name: procA
Steps: begin-procA, turn-1, press-system-test-2, end-procA
Goal conditions: (valve1 shut)(CdmStatus test)
Causal links:
begin-procA establishes (HandleOn valve1) for turn-1
begin-procA establishes (valve1 open) for turn-1
begin-procA establishes (CdmStatus normal) for press-system-test-2
turn-1 establishes (valve1 shut) for press-system-test-2
turn-1 establishes (valve1 shut) for end-procA
press-system-test-2 establishes (CdmStatus test) for end-procA
Ordering constraints: turn-1 before press-system-test-2

Figure 3.4: Example Plan procA

For implementation reasons, each step has to have a distinct name. To prevent two steps from having the same name, most steps have a number appended to their name (e.g. turn-1 and press-system-test-2). To simplify processing, each procedure has two additional steps that represent the procedure's initial and final state (e.g. begin-procA and end-procA). Because the names of these steps are created using the procedure's name (e.g. procA), the step names are already distinct and do not need a number appended to them.

• Set of goal conditions. The purpose of the procedure is to establish a conjunctive set of goal conditions. The procedure terminates when all goal conditions are satisfied. When the procedure terminates, the environment is said to be in the goal state. The example procedure has the goal condition (valve1 shut). This means that valve1 needs to be shut at the end of the procedure.

• Set of causal links. A causal link [MR91] indicates that performing a step causes a precondition for another step to be true. The example procedure has the causal link turn-1 establishes (valve1 shut) for press-system-test-2. This means that step turn-1 shuts valve1 and that valve1 being shut is a precondition of step press-system-test-2.

For bookkeeping purposes, there are also causal links involving the initial and goal states. In the example, conditions that are part of the procedure's initial state are identified with causal links involving the step begin-procA. Conditions that are part of the goal state are identified with causal links involving the step end-procA. Because causal links identify fine-grained dependencies between steps, they have been found to be very useful when describing procedures to humans [ME89].

• Set of ordering constraints. An ordering constraint indicates the relative order for performing a pair of steps. For example, the ordering constraint turn-1 before press-system-test-2 indicates that step turn-1 should be performed before step press-system-test-2. By definition, all steps are performed after the start of the procedure (i.e. begin-procA) and before the end of the procedure (i.e. end-procA).

• Text descriptions. The procedure also contains English descriptions that can be given to human students. There are descriptions for the procedure, individual steps, and causal links.

In this document, the term step relationships is used to describe both causal links and ordering constraints.¹⁰ Throughout this thesis, plans will be represented in the format shown in Figure 3.4.
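To make the plan components above concrete, the procA plan of Figure 3.4 could be represented as a simple record. This is a sketch under stated assumptions: the class and field names are illustrative and do not reflect Diligent's internal representation.

```python
from dataclasses import dataclass, field

# Sketch of a partially ordered plan record, populated with the procA
# example of Figure 3.4. Names are illustrative assumptions.

@dataclass
class CausalLink:
    producer: str      # step that establishes the condition
    condition: tuple   # (attribute-name, attribute-value)
    consumer: str      # step whose precondition is established

@dataclass
class Plan:
    name: str
    steps: list
    goal_conditions: list
    causal_links: list = field(default_factory=list)
    ordering_constraints: list = field(default_factory=list)  # (before, after)

proc_a = Plan(
    name="procA",
    steps=["begin-procA", "turn-1", "press-system-test-2", "end-procA"],
    goal_conditions=[("valve1", "shut"), ("CdmStatus", "test")],
    causal_links=[
        CausalLink("turn-1", ("valve1", "shut"), "press-system-test-2"),
    ],
    ordering_constraints=[("turn-1", "press-system-test-2")],
)
```

Only one of the figure's six causal links is shown; the bookkeeping links for begin-procA and end-procA would be further CausalLink entries of the same shape.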
The order of steps, causal links, and ordering constraints is meant to improve readability and is not important. Steps are listed in the order that the instructor demonstrated them. Causal links and ordering constraints are ordered so that those involving the procedure's earlier steps are listed before those involving later steps.

¹⁰ Diligent's implementation and its training documentation used the term ordering relationships instead of step relationships. This document uses the term step relationships so that the reader does not confuse ordering relationships with ordering constraints.

3.2.2.2 Operators

Besides procedures, Diligent also learns operators. Although operators are not part of a plan, Diligent internally associates one operator with each primitive step. Diligent uses a step's operator to identify the step's preconditions, which are used when generating the plan's step relationships.

Name: toggle-motor
Action-id: press motor-button
Effect: effect1
Preconditions:
g-rep: (motor on)
h-rep: (motor on)(valve1 open)
s-rep: (motor on)(valve1 open)(valve2 open)(HandleOn valve1)
State changes: (motor off)
Effect: effect2
Preconditions:
g-rep: (motor off)
h-rep: (motor off)(valve1 closed)
s-rep: (motor off)(valve1 closed)(valve2 open)(HandleOn valve1)
State changes: (motor on)

Figure 3.5: Example Operator toggle-motor

Diligent uses operators to model reusable knowledge of how an action affects the environment. Because operators model the environment, they are not specific to a given step or procedure. Operators model an action by identifying the set of conditions (or preconditions) necessary so that performing the action will achieve specified state changes.
A state change indicates that an action caused an attribute to change value and is represented by a condition that contains the attribute's new value.¹¹

An operator consists of the following components. To clarify the discussion, we will use the example operator in Figure 3.5.

• Name. The name used to identify the operator to the instructor. For implementation reasons, Diligent allows operators to have duplicate names. In the example, the operator is called toggle-motor.

• Action-id. The action-id uniquely identifies the operator. The operator in the example models pressing the motor-button. An operator is only associated with one action-id. Diligent treats action-ids as atomic and does not compare action-ids to look for common objects or common types of actions. As was discussed in Section 3.1.2, operators do not model the same type of action on multiple objects (e.g. all buttons) because different objects of the same type (e.g. motor button versus reset button) may affect the environment in very different ways. Similarly, different types of actions on the same object are modeled independently because the actions may perform totally different activities (e.g. "pressing" the button versus "removing" it).

• Description. The description provides a default English description for the plan steps associated with the operator.

• Conditional effects (or effects). In different situations, performing an action can produce different state changes. These differences in behavior are modeled with effects. Each effect identifies the preconditions necessary for the action to produce specific state changes. The example operator has two effects: one effect models turning on the motor, and the other models turning off the motor. Each effect consists of the following components.

- One or more state changes. State changes identify how the action changes the state of the environment.
Each state change identifies an attribute whose value is changed by the action. A state change is represented by a condition that contains the attribute's new value. The example's effect1 has the state change (motor off), which means that the motor is turned off.

¹¹ Chapter 5 discusses operators and motivates their representation in greater detail.

Because a state change models changes to the environment's state, each effect has implicit preconditions that the attributes in its state changes cannot have their post-action (or post-state) value before the action takes place. For example, effect1's state change (motor off) gives effect1 the implicit precondition that the motor is not turned off before the action is performed.

- Preconditions. For an action to produce an effect's state changes, the effect's preconditions should be true. An effect has three sets of conjunctive preconditions: g-rep, h-rep, and s-rep.

The g-rep and s-rep represent a version space [Mit78]. The version space bounds the correct precondition between the most general candidate precondition (g-rep) that is consistent with the data and the most specific candidate precondition (s-rep) that is consistent with the data. The g-rep contains all preconditions that have been proven to be necessary, while the s-rep contains all potential preconditions that have not been proven to be unnecessary.

For example, in effect1, the g-rep contains (motor on) because Diligent has proven that the motor needs to be turned on. In effect1, the s-rep contains all conditions ((valve1 open)(valve2 open)(motor on)(HandleOn valve1)) that Diligent has not proven to be unnecessary.

Because there may be little data for learning, the g-rep is likely to be too general, and the s-rep to be too specific.
To overcome this lack of data, Diligent has a third set of preconditions, the h-rep, which represents Diligent's heuristic, best guess about the "real" preconditions. The h-rep contains every condition in the g-rep and some conditions in the s-rep. For example, the h-rep for effect1 contains two conditions. One condition has been shown to be necessary ((motor on)), while the other ((valve1 open)) is only likely to be necessary.

3.3 How Diligent Works

3.3.1 Processing Demonstrations

Figure 3.6 summarizes how Diligent learns procedures. The rectangles represent data objects and the ovals represent activities.

[Figure 3.6: Processing a Demonstration — a data-flow diagram linking the demonstration and its path, initial (heuristic) operator creation, version-space refinement of existing operators, experimentation, the proof of the procedure, and the resulting partially ordered plan and operators.]

Initially, the instructor demonstrates the procedure. Diligent observes one action-example for each step in the demonstration, and it uses the action-examples to create a path for the demonstration. The path can be thought of as containing the demonstration's sequence of action-examples. The path does not indicate any causal relationships between the steps.

During a demonstration, Diligent creates and refines operators for each of the demonstration's steps. If Diligent has not seen a step's action before, it creates a new operator. Otherwise, an existing operator is refined.

Diligent stores operator preconditions in a version space representation [Mit78]. The version space bounds the "real" precondition between a most specific and a most general candidate precondition.
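The version-space bookkeeping for conjunctive preconditions can be sketched in a few lines. This is a simplified illustration only, assuming equality conditions as described in Section 3.2.1.1; Diligent's actual refinement rules are described in Chapter 5, and the function names here are assumptions.

```python
# Simplified sketch of version-space bookkeeping for conjunctive
# preconditions (after Mitchell [Mit78]). Function names are
# assumptions; refinement rules are simplified for illustration.

def shrink_s_rep(s_rep: dict, successful_pre_state: dict) -> dict:
    """An example where the effect occurred removes s-rep conditions
    that the pre-state does not satisfy (they cannot be necessary)."""
    return {attr: val for attr, val in s_rep.items()
            if successful_pre_state.get(attr) == val}

def grow_g_rep(g_rep: dict, proven_necessary: dict) -> dict:
    """Conditions proven necessary (e.g. via experiments) join the g-rep."""
    return {**g_rep, **proven_necessary}

# effect1 of Figure 3.5: the s-rep starts as the full observed pre-state.
s_rep = {"motor": "on", "valve1": "open",
         "valve2": "open", "HandleOn": "valve1"}

# A later successful example with valve2 shut proves (valve2 open)
# unnecessary; proving (motor on) necessary moves it into the g-rep.
s_rep = shrink_s_rep(s_rep, {"motor": "on", "valve1": "open",
                             "valve2": "shut", "HandleOn": "valve1"})
g_rep = grow_g_rep({}, {"motor": "on"})
```

The h-rep sits between these two sets: everything in the g-rep plus whichever s-rep conditions the heuristics judge likely to matter.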
Because version space learning may converge slowly, Diligent has a heuristic precondition that lies between the most general and most specific candidate preconditions. This heuristic precondition is used for generating plans.

Once a procedure has been demonstrated and operators are defined, the instructor can tell Diligent to generate a proof of the procedure. This data structure is called a proof because it records how the preconditions and state changes of the path's steps are used to transform the procedure's initial state into its goal state. Given a proof, it is trivial to transform the proof into a partially ordered plan.

However, if the operators are not very refined, a plan can have missing or unnecessary step relationships. The instructor can help correct this problem by telling Diligent to experiment. When experimenting, Diligent uses the environment to repeatedly perform the demonstration while skipping a step. Eliminating a step exposes how the missing step affects later steps. After the operators are more refined, Diligent can then generate a new version of the plan.

3.3.2 Heuristics

Diligent provides authors with heuristic aid when authoring procedures. The aid is "heuristic" because Diligent does not receive enough data to guarantee the correctness of what it learns. Instead of correctness, Diligent attempts to provide an instructor with a "reasonable" procedure.¹²

Most of the machine learning in this thesis focuses on learning operator preconditions. As mentioned earlier, in any procedure, most of the environment's attributes are probably irrelevant. This means that Diligent's learning algorithms need to understand demonstrations well enough to identify the subset of attributes that are likely to be used in preconditions.
By identifying a "likely" set of preconditions, Diligent can make an instructor's job easier because it allows him to focus on a smaller set of candidate preconditions.

One heuristic that Diligent uses to identify preconditions is the logical fallacy "post hoc, ergo propter hoc," which means "after this, therefore because of this" [She97]. In other words, things that happened earlier in a demonstration are likely to cause things later in a demonstration. This is a fallacy because correlation does not equal causation. The heuristic has two relevant aspects.

Earlier steps establish preconditions of later steps. The steps in a demonstration are related, and the instructor probably has reasons for demonstrating the steps in a given order. A likely reason for this ordering is that some state changes of earlier steps establish preconditions of later steps.

Focus on attributes that change value. Attributes with a constant value will not affect whether a demonstration's final state is achieved when the demonstration's steps are performed in a given order. In contrast, attributes that change value may differentiate between orders of steps that achieve the demonstration's final state and those that don't.

¹² By "reasonable" procedure, we mean that it should be close to correct.

Another way to look at this is to consider an action that produces different state changes when performed in two different states. Attributes with the same value in both states did not cause the difference in the state changes. In contrast, at least one difference between the states is a precondition.
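The comparison just described can be sketched directly: given two pre-states in which the same action behaved differently, only the attributes on which the pre-states differ can account for the difference. The names below are illustrative assumptions.

```python
# Sketch of the "focus on attributes that change value" heuristic:
# when the same action produces different state changes in two
# pre-states, the attributes on which those pre-states differ are
# the candidate precondition attributes. Names are illustrative.

def candidate_precondition_attributes(pre_a: dict, pre_b: dict) -> set:
    """Attributes differing between two pre-states with different outcomes."""
    return {attr for attr in set(pre_a) | set(pre_b)
            if pre_a.get(attr) != pre_b.get(attr)}

# Suppose pressing the motor-button started the motor in the first
# state but did nothing in the second.
pre_when_motor_started = {"motor": "off", "valve1": "open"}
pre_when_nothing_happened = {"motor": "off", "valve1": "closed"}
# Only valve1 differs, so (valve1 open) is the candidate precondition.
```

Attributes shared by both pre-states (here, motor) are not ruled out as preconditions; the comparison merely provides no evidence about them, which is the weakness the text notes next.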
Just because Diligent concentrates on attributes that change value does not mean that an attribute with a constant value is irrelevant; it just means that there is only weak evidence that the attribute is used in a precondition.

When Diligent has different levels of knowledge, different techniques may be appropriate. Initially, Diligent may have little domain knowledge. To compensate for its lack of knowledge, Diligent uses general heuristics. However, as Diligent gains more knowledge, the same heuristics may no longer be appropriate. To handle this situation, Diligent has a heuristic to deal with existing knowledge.

Favor existing knowledge. Diligent favors existing knowledge and hypotheses over knowledge derived from general heuristics. In this aspect, Diligent is very simple because it does not consider the quality of existing knowledge.

Large procedures tend to be hierarchical in that some of a procedure's steps represent subprocedures. Furthermore, a subprocedure may itself contain its own subprocedures. This nesting of subprocedures can cause the total number of steps in a hierarchical procedure and its subprocedures to become very large. As procedures become larger, run-time overhead becomes more of a concern. For hierarchical procedures, Diligent has a heuristic to reduce run-time overhead by limiting the amount of knowledge that its algorithms use.

Focus on the current procedure. When Diligent interacts with an instructor, interaction is focused on the steps of the current procedure rather than the steps in a subprocedure. In fact, a subprocedure is treated as just another step that has preconditions and establishes state changes.¹³ To simplify processing, Diligent assumes that the step relationships of subprocedures are correct. This assumption allows Diligent to ignore many of the internal details of subprocedures.

¹³ When authoring, an instructor works on a current procedure, which he has explicitly selected.
As will be explained in Chapter 4, an instructor can explicitly include subprocedures as steps in the current procedure. The instructor must explicitly specify all subprocedures because Diligent cannot automatically decompose a large procedure into subprocedures.

As will be explained later, this heuristic is used to reduce the overhead of deriving step relationships and performing experiments.

3.4 Where to Look for More Information

Chapter                     Chapter Number   Topic
Processing Demonstrations   4                Interaction with the instructor
                                             Types of demonstrations
                                             Assumptions about demonstrations
                                             Derivation of goal conditions
                                             Derivation of step relationships
                                             Creating hierarchical procedures
                                             Generation of default English descriptions
Learning Operators          5                Operator creation
                                             Operator refinement
Experimenting               6                Generation and performance of experiments

Table 3.1: Where Topics are Covered

Table 3.1 shows where the algorithms discussed in this chapter are covered in more detail. The chapter on demonstrations (Chapter 4) covers how an instructor interacts with Diligent to perform demonstrations and how the demonstrations are then turned into plans. The later chapters on learning operators (Chapter 5) and experimentation (Chapter 6) concentrate on machine learning rather than authoring.

Chapter 4

Processing Demonstrations

Demonstrations by human instructors are Diligent's primary source of input. Yet, a demonstration is not a procedure. A demonstration doesn't identify the procedure's goals or how the demonstration's steps depend on each other. This chapter describes the processing involved in transforming demonstrations into procedures. The chapter addresses a number of issues:

• The interaction between the user (or instructor) and Diligent.
This includes assumptions about how the instructor demonstrates.

• The algorithms used to transform demonstrations into procedures that Diligent can output.

• How to construct a hierarchical procedure out of other procedures. A hierarchical procedure is a procedure that contains another procedure as one of its steps.

This chapter focuses on the interaction between Diligent and the instructor. We will start by briefly discussing authoring with Diligent. We will then discuss the data structures used to record demonstrations. Afterwards, we will discuss how to demonstrate a simple procedure and generate a plan for it. We will then discuss how to construct hierarchical procedures. We will also discuss how to incorporate steps into a procedure that gather information without changing the state of the simulated domain (or environment). We will then discuss complexity, and finish the chapter with related work on basic Programming By Demonstration (PBD) techniques.

4.1 The Authoring Process

In Chapter 2, we discussed how an instructor could use Diligent to author a procedure. We will briefly review this material.

Authoring a procedure involves specifying the procedure's steps and making sure that Diligent understands the relationships between the steps. After creating a new procedure, an instructor provides demonstrations for the procedure. Demonstrations can identify the procedure's steps as well as provide data for learning the preconditions of steps. After the instructor has defined the procedure's steps, Diligent is able to heuristically derive goal conditions for the procedure and to perform experiments that attempt to identify the preconditions of steps. At some point after the goals have been specified, the instructor can tell Diligent to derive the dependency relationships (i.e.
step relationships) between the procedure's steps. Whether Diligent derives goal conditions, derives step relationships, or experiments is controlled by the instructor. The instructor controls when Diligent experiments to prevent experiments initiated by Diligent from causing the instructor undesired delays. When the instructor is satisfied with the procedure, he can give it to an automated tutor where it can be tested.

In order to make authoring easier, Diligent allows instructors to perform many iterations of these activities.

4.2 Types of Demonstrations

So far we've treated one demonstration as if it were analogous to one procedure. While that is the expected case, Diligent allows multiple demonstrations to be associated with a single procedure. Diligent supports the following types of demonstrations.1

• Add-step. Add-step demonstrations add steps to a procedure. This type of demonstration is used when a procedure is created, and it can also be used to add additional steps to an existing procedure. Additional steps are added to a procedure by inserting the new demonstration's steps in between a pair of the procedure's existing steps. Besides augmenting existing procedures, the ability to perform additional demonstrations supports error recovery.2

1 Section 8.4.2.1 discusses extending Diligent to support additional types of demonstrations. These types of demonstrations did not appear important for the types of procedures that we used, but it appears that they would be useful on more complicated procedures.

2 An instructor might detect errors by using menus to look at dependencies between steps, or he might detect errors by testing a procedure with an automated tutor.

• Clarification. This type of demonstration lets the instructor illustrate how the domain works without adding steps to the procedure.
Instead of adding steps, clarification demonstrations provide more data for machine learning. Clarification demonstrations can be used to show what happens if the demonstration is not performed properly, and they can be used to provide additional correct, but slightly different, demonstrations of the procedure. In Chapter 2, it was mentioned that only an expert user was likely to use this type of demonstration.

Diligent differs from most Programming By Demonstration systems by not requiring multiple, correct demonstrations of a procedure. Diligent can do this because it has access to the environment, which contains an executable model of the domain. Access to an executable model allows Diligent to perform experiments that can reveal information that would normally be provided by additional demonstrations.

As will be discussed later, both types of demonstrations are used to generate experiments.

4.3 Data Structures

This section presents the data structures used to process demonstrations. This discussion assumes some knowledge of how procedures are represented as plans (Section 3.2.2.1). The data structures use the basic data types that were defined for the interface to the environment (Section 3.2.1.1). The data structures will be illustrated later as we discuss processing demonstrations.

4.3.1 Prefixes

Each demonstration starts in a particular initial state, and Diligent remembers how to restore this initial state. Diligent restores the initial state when performing experiments and when the instructor provides additional demonstrations of the procedure. The data structure used to store an initial state is called a prefix. Prefixes have the following components:

• Configuration. A configuration is a text string (i.e. configuration-id) that is used by the instructor and Diligent to communicate about known states of the environment.
• Additional-actions. A sequence of actions (i.e. action-ids) that modify the state of the environment that is specified by the configuration. Additional actions are useful for a couple of reasons.

  - Additional-actions can reduce the need for creating additional configurations of the environment. Saving the state of the environment in order to get a new configuration-id might be expensive: not only could it take a long time, but it could also use a lot of memory.

  - Besides reducing the cost of saving configurations, additional-actions are used when embedding one demonstration inside another demonstration. This is useful when adding steps to an existing procedure. It is also useful when defining a new subprocedure as a step in another procedure.

4.3.2 Demonstrations

Demonstrations are the major source of input that Diligent receives from the instructor. To demonstrate, an instructor needs to provide an initial state and use the environment's graphical interface to perform a series of actions. A demonstration has the following components:

• Prefix. The prefix contains the information necessary to restore the environment to the demonstration's initial state.

• Previous-step. The previous-step is useful in an add-step demonstration that adds steps to an existing procedure. The previous-step is a step defined by a previous demonstration. (For a procedure's first demonstration, the previous-step is the step representing the procedure's initial state.) A new demonstration's steps will be inserted into the procedure between the previous-step and the step immediately after it.

• Steps. The sequence of steps that the instructor demonstrates. A step is either a subprocedure or an action performed in the environment. A step in a demonstration is the same as a step in a procedure.
How a step changes the environment is recorded in the action-example that was produced when the step was demonstrated.

• Type. As discussed in Section 4.2, Diligent supports two types of demonstrations: add-step and clarification. (The steps of a clarification demonstration are not added to the procedure, but are used when Diligent experiments.)

4.3.3 Paths

One problem with the demonstration data structure is that it can be awkward for Diligent to use. Not only can a procedure contain steps from several add-step demonstrations, but a demonstration also references a previous step outside itself.3 To simplify processing, demonstrations are converted into a data structure called a path. Once the instructor has finished a demonstration, Diligent adds the demonstration's data to a path and no longer uses the demonstration. A path is easier to use than a demonstration because, unlike a demonstration, a path does not reference any steps outside itself.4 A path contains the following components:

• Prefix. Specifies the procedure's initial state.

• Steps. The sequence of steps to be performed.

• Generates-plan. Yes or no. Should the path be used to generate a plan? A “no” indicates that the path represents a clarification demonstration.

4.3.4 Steps

In order to create a plan for a procedure, Diligent needs to identify the preconditions of steps and the state changes produced by steps. To illustrate the data associated with a step, we will use the following example: Consider a step where a valve is opened. Suppose the environment allows the valve to be opened whenever the valve is shut, but in the procedure being learned, the valve should only be opened if the alarm light is illuminated.

A step contains the following components:

• Name. Each step has a distinct name.

• Type. Abstract, primitive or special.
An abstract step represents a subprocedure, and a primitive step represents an action performed by the instructor. A special step indicates either the beginning or the end of the procedure.

3 If one considers the step that represents the start of a procedure (e.g. begin-procA) as outside a demonstration, then demonstrations always reference a step outside themselves.

4 Diligent uses paths rather than demonstrations to perform experiments.

• Subprocedure. The name of the subprocedure performed by an abstract step. This field is empty if the step isn't abstract.

• Operator. The operator that models the action performed by a primitive step. This field is empty if the step isn't primitive. An operator models how an action changes the environment. An operator does this by identifying the preconditions necessary for the action to produce a set of state changes. Because operators can be reused in other procedures, an operator's preconditions are independent of the current procedure. In the above example, the operator would indicate that the valve can be opened whenever it is shut. However, the operator should not contain procedure-specific preconditions such as requiring the alarm light to be illuminated.

• Control-preconditions. Control preconditions are procedure-specific preconditions for performing the step. In the above example, a control precondition should indicate that the valve should not be opened unless the alarm light is illuminated. The precondition is needed because the environment allows the valve to be opened whenever it is shut rather than when the light is illuminated. It appears that control preconditions are likely to refer to indicators such as lights and gauges that humans look at for visual cues.

• Mental-conditions.
A mental condition is a condition that contains a mental attribute, and a mental attribute is an attribute that is internal to Diligent rather than present in the environment. Diligent creates mental-conditions for sensing actions. A sensing action [AIS88, RN95] gathers information from the environment without changing the state of the environment. For example, a sensing action might involve checking to see whether a light is illuminated or checking the value of a gauge. A human student might perform a sensing action on a light by looking at the light or selecting it with a mouse.

Diligent uses mental-conditions to guarantee that a step is performed. Diligent's heuristics do this by putting all mental-conditions into the procedure's goal conditions. (Of course, the instructor can reject these goal conditions.5) If a mental-condition were not in the procedure's goal conditions, then the mental-condition's step would only be performed if the step's changes to the environment's state were needed to complete the procedure. For example, if a sensing action's mental-condition was not part of the goal conditions or preconditions of other steps, then the step would never be needed because it does not change the environment's state.

5 Because he is a domain expert, an instructor should be able to determine which goal conditions seem valid or reasonable.

• Action-example. An action-example (Section 3.2.1.1) is an example of an action being performed and identifies the state before the step (pre-state) and after the step (post-state). The portion of the post-state that changed is called the delta-state. The action-example associated with a step comes from the instructor's demonstration of the step.

4.3.5 Revisiting the Representation of Procedures

A procedure consists of the following components:

• A plan.
Plans are discussed in Section 3.2.2.1. Diligent outputs a procedure in the form of a plan.

• Set of paths. Diligent uses paths to generate plans. Diligent only allows one of a procedure's paths to generate a plan. However, it would not be difficult to extend Diligent so that multiple paths can generate plans. The path that generates the plan contains every add-step demonstration, while each clarification demonstration has its own path. Each clarification demonstration has its own path because clarification demonstrations are meant to be used only for learning and may not correctly perform the procedure.

When Diligent experiments on a procedure, Diligent uses the procedure's paths to generate experiments. This includes paths that generate plans and those that don't.

4.4 Assumptions about How the Instructor Demonstrates

Before presenting some example demonstrations, we will discuss the nature of the demonstrations presented to Diligent. Diligent makes the following assumptions about demonstrations.

Correct demonstrations. Diligent assumes that an instructor knows how to correctly demonstrate a procedure. Diligent uses this assumption when it assumes that a path's sequence of steps is correct. Diligent also uses this assumption when it uses the action-example from the demonstration of a step.
This assum ption is used when considering the run-time overhead of som e algorithm s used to perform experiments or create plans. First demonstration contains all steps. Because Diligent assumes correct dem onstra tions of small m odular procedures, Diligent assum es th a t the first dem onstration of a procedure probably contains all the procedure’s steps. This assum ption is used by Diligent when it considers only the current dem onstration when creating the preconditions of a new operator. Because of this assum ption, we did not focus on undesirable interactions between steps in different dem onstrations of the sam e pro cedure. Because this assum ption is not always correct, Diligent allows the instructor to add steps to a procedure with additional dem onstrations. M ultiple Demonstrations are Consistent. Suppose a procedure has multiple add- step dem onstrations. Diligent assumes that the steps of a new dem onstration will not remove preconditions th a t are required by steps later in the path. This assum ption allows Diligent to use the action-example from a ste p ’s dem onstration even though newer dem onstrations may have changed the pre-state in which the step will be performed. We did not focus on violations of this assum ption because the procedures th a t we were looking a t did not seem to require m ultiple dem onstrations. This made to it difficult to find typical examples of how users would inconsistently perform m ultiple dem onstrations. Instead, we focused on understanding small, m odular procedures th at were correctly dem onstrated on the first dem onstration. Recovery from a violation of this assumption is sim ilar to to recovery from an in correct dem onstration. M aintaining consistency of action-examples between dem on strations is an area for future work. Logically related steps grouped together. Diligent assum es that the instructor groups logically related steps together in the same small procedure. 
Violating this assumption not only causes the problems associated with misleading demonstrations, but also raises questions about whether the procedure being learned is usable. To some degree, Diligent's plans assume that students will finish one subprocedure before starting on the next subprocedure. When deriving a plan's step relationships, Diligent does not consider what would happen if the steps of two subprocedures were interleaved. It is unclear whether authoring interleaved subprocedures is of any relevance. When an instructor provides demonstrations to Diligent, all subprocedures are performed sequentially, and thus, it is impossible to provide a demonstration with interleaved subprocedures. In any case, interleaved subprocedures are beyond Diligent's scope.

4.5 About this Chapter's Extended Example

This chapter's extended example shows how to author procedures by providing a series of demonstrations. The demonstrations will be performed on a device called the High Pressure Air Compressor (HPAC). To improve clarity of presentation, the domain has been simplified by reducing the number of attributes and changing the names of attributes. This chapter will not discuss the details of Diligent's user interface, which can be found in Chapter 2 and Appendix D.

4.6 Authoring a New Procedure

Initially, we will assume that Diligent has no knowledge of other procedures or about the domain.

4.6.1 Creating a New Procedure

The first thing that the instructor does is to create a new procedure. When creating a procedure, the instructor needs to give it a name and to provide a description. The name identifies the procedure to the instructor, and the description is used to describe the procedure to students.
The instructor calls the procedure “proc1” and gives it the description “shut a few valves.”

4.6.2 Setting Up the Initial State

Before demonstrating the new procedure, the instructor needs to put the environment into the demonstration's initial state. Since Diligent remembers this initial state, Diligent can restore the initial state for experiments and additional demonstrations.

Diligent asks the instructor for a configuration-id that identifies a known state of the environment. A configuration-id is a text string that Diligent and the instructor use for communication. Let the instructor specify the configuration with the string “config1.” Diligent then uses “config1” to reset the state of the environment with Restore-Environment-State (Section 3.1.3).

Because creating a new configuration of the environment may be slow or use a lot of memory, the instructor may wish to reuse an existing configuration while modifying the state associated with the configuration. For this reason, Diligent now asks the instructor if he'd like to modify the configuration “config1” by performing some additional actions. In our case, the instructor indicates that he doesn't want to perform any additional actions.

4.6.3 Demonstrating the Procedure

Now that the environment is in the desired initial state, the instructor can demonstrate the procedure. The demonstration contains three steps, and its purpose is to shut two valves. The steps are as follows.

1. The instructor uses the mouse to select the handle (handle1) that opens and shuts valves. This causes the handle to turn, which shuts the valve (valve1) that is underneath the handle.

2. The instructor moves handle handle1 to valve valve2 by selecting valve2 with the mouse.

3. The instructor selects handle handle1, which shuts valve2.
The instructor then indicates that he has finished the demonstration.

4.6.4 Creating Primitive Steps

In the above demonstration, none of the steps represents performing a subprocedure. Instead, every step represents an action that the instructor performs. When a step represents an action performed by the instructor, the step is called primitive.

Action-example: example1
  Action-id: turn handle1
  Pre-state: (valve1 open) (valve2 open) (HandleOn valve1) (AlarmLight1 off) (CdmStatus normal)
  Delta-state: (valve1 shut)

Action-example: example2
  Action-id: move valve2
  Pre-state: (valve1 shut) (valve2 open) (HandleOn valve1) (AlarmLight1 off) (CdmStatus normal)
  Delta-state: (HandleOn valve2)

Action-example: example3
  Action-id: turn handle1
  Pre-state: (valve1 shut) (valve2 open) (HandleOn valve2) (AlarmLight1 off) (CdmStatus normal)
  Delta-state: (valve2 shut)

Figure 4.1: First Demonstration's Action-Examples

Diligent gets information about how an action affects the environment in the form of action-examples. The action-examples for the demonstration's three steps are shown in Figure 4.1 and are used by Create-Primitive-Step (Figure 4.2) to create steps.

For the demonstration's first step, the instructor turns handle handle1. Diligent uses Observe-Action (line 1 in Figure 4.2) to get the first step's action-example (example1). The action-example's action-id identifies which action was performed by indicating the type of action (turn) and the object being acted upon (handle1).

Diligent models how an action affects the environment with operators. Operators represent reusable, procedure-independent knowledge of the environment and can be used with multiple steps. In order to reuse existing knowledge, Diligent searches for an existing operator that matches the action-id of the step's action-example.
Diligent searches with the action-id because there is a one-to-one correspondence between operators and action-ids.

procedure Create-Primitive-Step
Input: demo: The current demonstration.
Result: stp: A step in the procedure.

1. Get the step's action-example ex from the environment with Observe-Action (Section 3.1.3).
2. Find if an operator op already exists for action-id(ex).
3. If an operator was found, refine op using the action-example ex (Chapter 5).
4. Otherwise, no operator was found. (Need to create a new operator.)
5. Ask the user for an operator name and description.
6. Use the operator name and description to create a new operator op. Creating the new operator requires the action-example ex and the current demonstration demo. (demo is used to create heuristic preconditions.) (Chapter 5).
7. Initialize the components of the new step stp. (The <integer> is used to give the step a distinct name.)
   name(stp) ← concatenate: name(op) <integer>
   description(stp) ← description(op)
   action-example(stp) ← ex
   operator(stp) ← op
8. If the step represents a sensing action, then initialize the components of the step involving control preconditions and mental attributes (Section 4.7.4).

Figure 4.2: Creating a Primitive Step

At this point, Diligent doesn't know any operators. Therefore, Diligent needs to create an operator. First, Diligent asks the instructor to give the operator a name and a description.
The instructor names the operator “turn” and approves the default description that Diligent has generated, “turn the valve handle.”6 Once the operator has a name and a description, the action-example and the current demonstration are used to initialize the new operator.7

Now that the step has an operator, Diligent uses the operator to create a name for the step. Since an operator could be used multiple times in a procedure, each step has a distinct name. The first step is called turn-1 and inherits the operator's description.

The last thing to do when creating a primitive step is to check whether it represents a sensing action (line 8 in Figure 4.2). A sensing action (e.g. checking a light) gathers information from the environment without changing it. Line 8 is skipped because none of the steps in this demonstration involve sensing actions.

For the second step, the instructor selects valve2. This moves the handle from valve1 to valve2. The step's action-example is example2. Once again, a new operator is created. The instructor calls the operator “move-2nd” and approves the default description “move to the second stage valve.” This results in a step called move-2nd-2.

The operator is called move-2nd rather than move because different operators are needed to move the handle to each valve. An operator only models actions performed on one object, and moving to a valve involves selecting that valve. As far as Diligent can observe, the only commonality in moving the handle to different valves involves the type of action (move) and the attribute (HandleOn) whose value is changed. The problem is more difficult than it appears because the values of attribute HandleOn are actually descriptions of a valve rather than the name of a valve (e.g. “separator drain 1st stage valve” versus “valve1”). However, for clarity, we will use valve names (e.g. “valve1”) as values of attribute HandleOn.

6 The generation of default descriptions is described in Section 4.8.1.1.

7 See Chapter 5 for details of how operators are created and then later refined.
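The reuse-or-create decision in Create-Primitive-Step (Figure 4.2) hinges on the one-to-one correspondence between operators and action-ids. The following sketch illustrates that lookup; the Python names are hypothetical and the registry is a stand-in for Diligent's full operator objects.

```python
# Hypothetical sketch of the operator lookup in Create-Primitive-Step:
# a mapping keyed on the action-id decides between refining an existing
# operator (line 3 of Figure 4.2) and creating a new one (lines 4-6).

operators = {}  # action-id -> operator name (stands in for full operators)

def find_or_create_operator(action_id, name_if_new):
    """Return (operator_name, created_flag) for the given action-id."""
    if action_id in operators:
        # Reuse: the existing operator would now be refined with the
        # step's new action-example.
        return operators[action_id], False
    # Create: a new operator is initialized from the action-example
    # and the current demonstration.
    operators[action_id] = name_if_new
    return name_if_new, True

# Replaying the first demonstration's three action-examples:
print(find_or_create_operator("turn handle1", "turn"))     # created
print(find_or_create_operator("move valve2", "move-2nd"))  # created
print(find_or_create_operator("turn handle1", "turn"))     # reused
```

The third call finds the existing “turn” operator, mirroring how the demonstration's third step reuses the operator created for its first step.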
For the third step (turn-3), the instructor selects the handle again, which now shuts valve2. The step's action-example is example3. Unlike the first step, Diligent finds an operator (i.e. turn) that matches the action. Diligent then uses the step's action-example to refine the operator (line 3).

4.6.5 Converting the Demonstration into a Path

As the instructor performs the new procedure's first demonstration, Diligent records it in the data structure shown in Figure 4.3. The type of demonstration is add-step because the demonstration adds steps to the procedure. Because this is the procedure's first demonstration, the previous-step (begin-proc1) represents the start of the procedure. The prefix records how the demonstration's initial state was created and allows the initial state to be restored.

To make other processing easier, the demonstration is converted into a data structure called a path.8 Figure 4.4 shows the algorithm Initialize-Path, which is used to convert a procedure's first demonstration into a path and to convert clarification demonstrations into paths.

Demonstration:
  Type: add-step
  Prefix: prefix1
  Previous-step: begin-proc1
  Steps: turn-1 → move-2nd-2 → turn-3

Prefix prefix1:
  Configuration: config1
  Additional-actions: none

Step turn-1:
  Operator: turn
  Action-example: example1

Step move-2nd-2:
  Operator: move-2nd
  Action-example: example2

Step turn-3:
  Operator: turn
  Action-example: example3

Figure 4.3: First Demonstration

8 The data structures for both paths and demonstrations are defined in Section 4.3.
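The demonstration record in Figure 4.3 can be sketched as a set of nested structures. This is an illustrative Python rendering (Diligent's own implementation is not shown in this form); the field names simply follow the component descriptions in Section 4.3.

```python
# Hypothetical sketch of the Figure 4.3 data structures; field names follow
# the component descriptions in Section 4.3, not Diligent's actual code.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Prefix:
    configuration: str
    additional_actions: List[str] = field(default_factory=list)

@dataclass
class Step:
    name: str
    operator: str
    action_example: str

@dataclass
class Demonstration:
    type: str            # "add-step" or "clarification"
    prefix: Prefix
    previous_step: str
    steps: List[Step]

# The first demonstration of proc1, as recorded in Figure 4.3:
first_demo = Demonstration(
    type="add-step",
    prefix=Prefix(configuration="config1"),   # no additional-actions
    previous_step="begin-proc1",
    steps=[
        Step("turn-1", "turn", "example1"),
        Step("move-2nd-2", "move-2nd", "example2"),
        Step("turn-3", "turn", "example3"),
    ],
)
print([s.name for s in first_demo.steps])
```

Note how the two turn steps share one operator but carry distinct names and action-examples, which is exactly the separation the step data structure is designed to capture.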
procedure Initialize-Path
Input: demo: A demonstration
       pname: The procedure's name
Result: pth: A new, initialized path.

1. If the demonstration is of type add-step, then the path will be used to create a plan for the procedure:
      generates-plan(pth) ← yes
   Otherwise, the path will not be used to create a plan:
      generates-plan(pth) ← no
2. Copy the information necessary to restore demo's initial state:
      prefix(pth) ← prefix(demo)
3. Copy the demonstration's steps:
      steps(pth) ← steps(demo)
4. Use the procedure name pname to create step names for the procedure's beginning (begin-pname) and end (end-pname).
5. Adjust the path's steps so that the step representing the beginning of the procedure is the first step and the step representing the end of the procedure is the last step.

Figure 4.4: Initializing a Path

Generates-Plan: Yes
Prefix: prefix1
Steps: begin-proc1 → turn-1 → move-2nd-2 → turn-3 → end-proc1

Figure 4.5: The Initial Path

The path created from the demonstration in Figure 4.3 is shown in Figure 4.5. In Figure 4.5, the step begin-proc1 represents the start of the procedure, and the step end-proc1 represents the end of the procedure.

4.6.6 A Second Demonstration

So far, Diligent has recorded information about the new procedure in a path, but there may be problems with this information. To correct any problems, the instructor needs to be able to modify a path. Diligent allows instructors to modify paths by performing additional demonstrations that add steps to the path.9 Some reasons for adding additional steps to a procedure include:

• The instructor wants to elaborate the procedure by adding more steps.

• The instructor wants to correct an error or a problem with the existing steps.
The use of additional demonstrations is limited by Diligent's assumption that the steps in a path represent a linear sequence of actions that transform the path's initial state into its final state. This assumption supports plans where unnecessary steps can be skipped at run-time, but the assumption doesn't support plans containing alternative steps for different initial states. Nevertheless, the assumption is used because it simplifies the derivation of the plan's step relationships and because the assumption reflects Diligent's assumptions about demonstrations (Section 4.4).

To illustrate the algorithms for combining demonstrations, we will add a step to the running example. (Of course, such a simple procedure should only need one demonstration.) To augment the procedure, the instructor could have shut additional valves. However, to simplify the procedure, the instructor will only add a single additional step. We will assume that the handle (handle1) that is used to shut valves should be stored in a standard location (i.e. on top of valve1). This means that the instructor will need to move the handle to valve1. Now suppose that the instructor starts a new demonstration and indicates that it is an add-step demonstration.

9Diligent also allows an instructor to delete steps, but deleting steps is an editing feature that we will not discuss.

4.6.6.1 Setting Up the Demonstration's Initial State

The first problem is specifying the new demonstration's initial state. One approach would be to restore the path's initial state and then have the instructor perform steps that put the environment in a state where the new step could be performed. However, this approach has a few problems. The instructor has to duplicate steps from the previous demonstration.
This not only takes time but is also a potential source of errors. Instead, Diligent takes a different approach. Diligent has the instructor specify an existing step that is before the new demonstration. Diligent then performs the procedure through the specified step.

Now suppose that the instructor indicates that the new demonstration should start after the last step (turn-3) in the procedure's path. This means that the new demonstration will start in the previous demonstration's final state. To do this, Diligent uses Replay-Prefix (Figure 4.6) and the path's prefix (prefix1 in Figure 4.3) to restore the path's initial state.

procedure Replay-Prefix
Input: pre: A prefix
Result: Resets the state of the environment
1. Use configuration(pre) and Restore-Environment-State (Section 3.1.3) to restore the environment to a known state.
2. Now make additional changes using the sequence of actions in additional-actions(pre). This is done by invoking Perform-Action (Section 3.1.3).

Figure 4.6: Using a Prefix

After restoring the path's initial state, Diligent performs all three of the previous demonstration's steps (i.e. turn-1, move-2nd-2 and turn-3). This results in the new demonstration having the prefix shown in Figure 4.7. The prefix's additional-actions represent the action-ids needed to perform the path's existing steps.

Prefix: prefix2
  Configuration: config1
  Additional-actions: turn handle1 -> move valve2 -> turn handle1

Figure 4.7: The Second Demonstration's Prefix

4.6.6.2 Performing The Demonstration

The instructor then performs the demonstration by doing the following. The instructor moves the handle from valve2 to valve1 by selecting valve1 with the mouse. Since no matching operator is found, Diligent creates a new operator.
The instructor calls the operator "move-1st" and approves the default description "move to the first stage valve." The new step is called move-1st-4. At this point, the instructor ends the demonstration.

4.6.6.3 Processing the Demonstration

Demonstration:
  Type: add-step
  Prefix: prefix2
  Previous-step: turn-3
  Steps: move-1st-4

Step move-1st-4:
  Operator: move-1st
  Action-example: example4
    Action-id: move valve1
    Pre-state: (valve1 shut) (valve2 shut) (HandleOn valve2) (AlarmLight1 off) (CdmStatus normal)
    Delta-state: (HandleOn valve1)

Figure 4.8: The Second Demonstration

As Diligent observes the demonstration, it records the data shown in Figure 4.8. Diligent then uses this data to insert the demonstration's step into the path. The algorithm for doing this is shown in Figure 4.9.

In Figure 4.9, line 1 is used when a demonstration adds steps to the start of the procedure. In this case, the path's prefix is replaced by the demonstration's prefix because the new steps might be dependent on the demonstration's prefix. The demonstration could have a different prefix than the path because the instructor could have added additional actions to the path's prefix. For example, the instructor might want students to perform an additional step. He could do this by modifying the path's prefix so that an additional step was required to successfully perform the procedure.

procedure Add-Demo-To-Path
Input: demo: An add-step demonstration
       pth: A path that is used to generate a plan.
Result: The demonstration is incorporated into the path.
1. If the demonstration's previous-step is the step representing the procedure's initial state (e.g. begin-proc1), then the demonstration adds steps to the start of the procedure. In this case, replace the path's initial state (i.e. prefix) with the demonstration's.
     prefix(pth) <- prefix(demo)
2. Insert the demonstration's sequence of steps x1...xj into the path's sequence of steps s1...sn. If the demonstration's previous-step is sj then
     steps(pth) <- s1...sj x1...xj sj+1...sn

Figure 4.9: Adding a Demonstration to a Path

When using the example demonstration, line 1 in Figure 4.9 is skipped because the demonstration adds steps to the end of the procedure. The updated path is shown in Figure 4.10.

Generates-Plan: Yes
Prefix: prefix1
Steps: begin-proc1 -> turn-1 -> move-2nd-2 -> turn-3 -> move-1st-4 -> end-proc1

Figure 4.10: Updated Path

4.6.7 Generating a Plan

We now have a path that defines the procedure's steps, but a path is not usable as a procedure. A path only contains a linear sequence of steps and does not indicate how the steps are related to each other.

As mentioned in Section 4.3.5, a procedure consists of a set of paths and a plan (Section 3.2.2.1). In the following sections, we will discuss how the data in a path is transformed into a plan.10

4.6.7.1 Guessing the Procedure's Goals

The procedures learned by Diligent attempt to put the environment into a given state. When the state is reached, the procedure is finished. This state is called the goal state and is defined by a set of goal conditions that need to be satisfied. A goal condition, like any other condition, is specified by an attribute and its value. Procedures that terminate when the environment is put into a given state are said to have goals of attainment [Wel94].

Besides attributes that are present in the environment, goal conditions can also include conditions that represent the values of mental attributes. A mental attribute is internal to Diligent and contains information that Diligent has collected during the procedure.
For example, a mental attribute might record that the instructor explicitly checked whether a light was illuminated.11 Since Diligent finishes a procedure when all the procedure's goals have been attained, mental attributes allow Diligent to perform the steps in a path even if the steps cause no net change in the environment's state. Thus, other systems that only use goals of attainment but do not have mental attributes (e.g. Instructo-Soar [HL95]) cannot learn this type of procedure.

Diligent attempts to aid the instructor by identifying likely goal conditions. Diligent can do this because it is learning goals of attainment and because the action-examples associated with each step indicate how the environment changed during that step. Diligent hypothesizes that attributes that changed value during one of the procedure's steps are involved in a goal condition.

This heuristic technique ignores attributes whose values are constant during a procedure. Although the values of these attributes could be goal conditions, there is no evidence to indicate that they are goal conditions.

The technique for identifying goals was borrowed from Instructo-Soar [HL95]. However, Instructo-Soar only looks for attributes with different values in the initial and goal states. In contrast, Diligent looks for attributes that change value during at least one step.

10As mentioned before, all paths but one represent clarification demonstrations. Clarification demonstrations provide additional data for machine learning without adding steps to the procedure's plan.
11Section 4.7.4 discusses sensing actions and mental attributes in more detail.
Diligent’s technique has som e advantages over Instructo-Soar’s. Diligent can identify a larger set of candidate goal conditions. Furtherm ore, if an instructor makes an effort to undo state changes from earlier in the path, then the values of the attrib u tes involved might be im portant. Consider an example from a m achine maintenance dom ain. When diagnosing a problem, a device might be kept in standard state. During a diagnostic procedure, a human might perform actions to g a th e r information about the state of the device before returning the device to the standard state. In this case, the conditions involved in the standard state are im portant even if they are the same in the initial and goal states. Potential goal conditions are calculated from a p a th using Derive-Path-Goals (Fig ure 4.11). All the steps in our running example’s path are primitive (line 4). The goal conditions derived from our path are shown in F igure 4.12. The condition (HandleOn valvel) is a goal condition even though the value o f a ttrib u te HandleOn is the sam e at the beginning and end of the path. 4.6.7.2 Deriving Step Relationships Once the procedure’s goals are known, Diligent can a tte m p t to determine how each step supports establishing the procedure’s goal conditions. Steps can do this by directly satis fying goal conditions or satisfying preconditions of later steps. Diligent records the relationships between steps in w hat we will call step relationships. Step relationships consist of causal links and ordering constraints. A causal link indicates th at a state change of an earlier step is a precondition for a later step, and an ordering constraint indicates the relative order for performing a pair of steps.1 2 Step relationships are updated with Update-Step-Relationships (Figure 4.13). The data available for com puting step relationships consists of the procedure’s goal conditions and a path, which contains a linear sequence of steps. 
Steps contain the following information:

• An operator that is independent of the procedure.
• An action-example that indicates the environment's state before and after the step.
• Step-specific control-preconditions that may not be required by the operator.

12The plan representation, including causal links and ordering constraints, is discussed in Section 3.2.2.1.

procedure Derive-Path-Goals
Input: pth: A path that is used to generate a plan.
Output: goals: A set of goal conditions.
1. For each step stp in the path do the following. Start with the last step and iterate backwards through the sequence of steps.
2. If the step represents the procedure's initial or goal states then do nothing.
3. If the step represents an abstract step (i.e. subprocedure), then add the subprocedure's goal conditions to goals if there is not any condition in goals with the same attribute.
     goals <- goals ∪ {c1 | c1 ∈ subprocedure-goals(stp) ∧ ¬∃ c2 ∈ goals where attribute(c1) = attribute(c2)}
4. If the step stp represents a primitive step
     goals <- goals ∪ conditions generated by stp that involve mental attributes.
   Also add any delta-state conditions of the step's action-example that do not have the same attribute as a condition in goals.
     goals <- goals ∪ {c1 | ex = action-example(stp) ∧ c1 ∈ delta-state(ex) ∧ ¬∃ c2 ∈ goals where attribute(c1) = attribute(c2)}

Figure 4.11: Deriving Goals from a Path

(valve1 shut) (valve2 shut) (HandleOn valve1)

Figure 4.12: Goal Conditions Derived from Path

procedure Update-Step-Relationships
Input: proc: The procedure.
       pth: The path containing the procedure's steps.
Result: A procedure with updated causal links and ordering constraints.
1. Use path pth and Derive-Path-Effect-Skeleton to create a skeleton for the path. The skeleton indicates which operator effects are associated with each step. The skeleton is an intermediate calculation.
2. Use the path's skeleton and Derive-Causal-Links to generate a set of candidate causal links (cl-cand). The procedure also creates a proof, which identifies which operator effects achieve the procedure's goals.
3. Use the proof and Derive-Ordering-Constraints to generate a set of candidate ordering constraints (ord-cand).
4. For every causal link in cl-cand, add an ordering constraint between the causal link's two steps to ord-cand.

Figure 4.13: Computing Step Relationships

• Conditions containing mental attributes (mental-conditions) that are established by the step. Mental attributes are internal to Diligent and are not part of the environment.

Unfortunately, the data associated with steps is not in a form that can easily be used. Therefore, Diligent simplifies the data representation with Derive-Path-Effect-Skeleton (line 1). The procedure Derive-Path-Effect-Skeleton combines the data for a step's operator, action-example and mental-conditions in order to identify the step's preconditions and state changes. This data structure is called a skeleton because it is in an unfinished state and because it provides a framework that identifies the procedure's sequence of steps, their preconditions, and their state changes.
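As an illustration, the skeleton for the running example can be pictured as an ordered list pairing each step with the preconditions and state changes of its chosen effect. Conditions are written as attribute-value pairs; the layout is ours, not Diligent's, and the values anticipate the operators of the running example:

```python
# A minimal sketch of the skeleton data structure: the path's step order
# is preserved, and each step carries the preconditions ("pre") and
# state changes ("changes") of the operator effect it exhibited.

skeleton = [
    ("turn-1",     {"pre": [("valve1", "open")],
                    "changes": [("valve1", "shut")]}),
    ("move-2nd-2", {"pre": [("valve1", "shut"), ("HandleOn", "valve1")],
                    "changes": [("HandleOn", "valve2")]}),
    ("turn-3",     {"pre": [("valve1", "shut"), ("valve2", "open"),
                            ("HandleOn", "valve2")],
                    "changes": [("valve2", "shut")]}),
    ("move-1st-4", {"pre": [("HandleOn", "valve2")],
                    "changes": [("HandleOn", "valve1")]}),
]
```

Later calculations (causal links, ordering constraints) can then treat every step uniformly, whether it came from a primitive action or a subprocedure.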
Operator: turn
  Action-id: turn handle1
  Effect: effect1
    H-rep preconditions: (valve1 open)
    State changes: (valve1 shut)
  Effect: effect2
    H-rep preconditions: (valve1 shut) (valve2 open) (HandleOn valve2)
    State changes: (valve2 shut)

Operator: move-1st
  Action-id: move valve1
  Effect: effect3
    H-rep preconditions: (HandleOn valve2)
    State changes: (HandleOn valve1)

Operator: move-2nd
  Action-id: move valve2
  Effect: effect4
    H-rep preconditions: (valve1 shut) (HandleOn valve1)
    State changes: (HandleOn valve2)

Figure 4.14: The Operators

The algorithm for Derive-Path-Effect-Skeleton will be illustrated with our running example. During the example's two demonstrations, operators were created. These operators are shown in Figure 4.14. Operators were defined in Section 3.2.2.2, but we will briefly review them. An operator models how an action affects the environment. Since actions can produce different state changes in different situations, an operator models different state changes with different conditional effects (or effects). Each effect identifies a set of preconditions that must be met for the given state changes to appear. While an effect has three sets of preconditions, Diligent only uses the best guess, heuristic set of preconditions (h-rep) when creating a plan.

A problem with the operators in Figure 4.14 is that Diligent has only observed each effect performed once. Because of this lack of data, the preconditions contain some errors. For example, effect1 is missing the precondition (HandleOn valve1). Unfortunately, missing preconditions can cause missing step relationships, and unnecessary preconditions can cause unnecessary step relationships. (In Chapter 6, we will discuss how to correct preconditions by performing experiments.)

We are now ready to discuss Derive-Path-Effect-Skeleton (Figure 4.15).
The procedure traverses the path sequentially, going from the path's first step to its last step. For each step, the algorithm identifies operator effects that transform the state before the step (pre-state) into the state after the step (post-state).

Notice that steps that represent an action (primitive steps) are treated differently than subprocedures (abstract steps). On line 5, Diligent simulates performing a subprocedure in order to determine which of its steps are performed when starting in the abstract step's pre-state. Given the subprocedure's steps, Diligent can compute the abstract step's preconditions. Diligent simulates the subprocedure each time the skeleton is created because the instructor may have modified the subprocedure. Another concern is that a subprocedure can have state changes that are incidental and unimportant. For this reason, lines 6 and 7 only use the subprocedure's goal conditions. By creating an effect (line 7) and then adding it to the skeleton (line 8), subsequent processing can treat a subprocedure like a primitive step. Finally, line 12 incorporates conditions involving mental attributes. Because mental attributes are internal to Diligent, they are not stored in action-examples, which record the state of the environment.

The computation of the skeleton assumes that each action-example has the correct delta-state because the instructor demonstrated all steps correctly. That the instructor demonstrates steps correctly seems a reasonable assumption, especially if most procedures are relatively short. However, if the instructor makes a mistake and has to provide another demonstration, some step's action-example may be incorrect.13

Figure 4.16 shows a skeleton for our procedure using the path in Figure 4.10, the operators in Figure 4.14, and the action-examples in Figures 4.1 and 4.8.

13The problem of correcting a step's action-example is not addressed by Diligent.
procedure Derive-Path-Effect-Skeleton
Input: pth: A path
Result: skeleton: Identifies the operator effects used by each of the path's steps. The order of the path's steps is maintained.
1. Initialize skeleton as empty.
2. For each step stp in the path do the following.
3. If the step represents the beginning or end of the procedure, do nothing.
4. If the step represents a subprocedure then
5.   Use Internally-Simulate-Subprocedure to determine the subprocedure's preconditions. (Section 4.7.1)
6.   Get the subprocedure's goal conditions.
7.   Create an effect using the preconditions from 5 with the state changes of 6.
8.   Associate the effect with the step in skeleton.
9. Else the step represents an action.
10.  Identify the effects effs of the step's operator op that match the delta-state of the step's action-example ex.
       effs <- {e1 | e1 ∈ effects(op) ∧ state-changes(e1) ⊆ delta-state(ex)}
11.  Associate effs with the step in skeleton.
12. If stp produces conditions containing mental attributes, then create a new effect that has the conditions as its state changes. The new effect has no preconditions. Add the effect to skeleton.

Figure 4.15: Identifying a Path's Effects
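The matching test in line 10 is a simple subset check: an effect is compatible with a step when the effect's state changes are contained in the delta-state that the step's action-example recorded. A Python sketch (the effect records are hypothetical stand-ins):

```python
def matching_effects(operator_effects, delta_state):
    """Line 10 of Figure 4.15 as a subset test: keep the effects whose
    state changes are a subset of the action-example's delta-state."""
    return [eff["name"] for eff in operator_effects
            if set(eff["changes"]) <= set(delta_state)]

# The two effects of the turn operator from the running example.
turn_effects = [
    {"name": "effect1", "changes": [("valve1", "shut")]},
    {"name": "effect2", "changes": [("valve2", "shut")]},
]

# Step turn-3's action-example changed only valve2, so effect2 matches.
print(matching_effects(turn_effects, [("valve2", "shut")]))  # ['effect2']
```

This is how turn-1 and turn-3 can share one operator yet be associated with different effects in the skeleton.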
Order of steps: turn-1 -> move-2nd-2 -> turn-3 -> move-1st-4

Step: turn-1
  Effect: effect1
  Preconditions: (valve1 open)
  State changes: (valve1 shut)

Step: move-2nd-2
  Effect: effect4
  Preconditions: (valve1 shut) (HandleOn valve1)
  State changes: (HandleOn valve2)

Step: turn-3
  Effect: effect2
  Preconditions: (valve1 shut) (valve2 open) (HandleOn valve2)
  State changes: (valve2 shut)

Step: move-1st-4
  Effect: effect3
  Preconditions: (HandleOn valve2)
  State changes: (HandleOn valve1)

Figure 4.16: Skeleton of Procedure

Notice that steps turn-1 and turn-3 are associated with the same operator but are compatible with different effects.

Once Diligent has identified the effects used by the path's steps, it can determine which effects help achieve the goal conditions. This is important because effects can also produce irrelevant state changes.

Diligent identifies the effects that achieve the procedure's goal conditions while calculating the causal links. These useful effects are stored in a data structure called a proof. It is called a proof because it records how the preconditions and state changes of the path's steps transform the path's initial state into its goal state.

Diligent does this calculation with Derive-Causal-Links (Figure 4.17). The algorithm treats the goal conditions as preconditions of the goal state step (line 1). The algorithm iterates backwards over the path's sequence of steps starting at the end of the procedure (line 2). When iterating over the steps, preconditions of later steps are used as indices into the array dstnam. Because the path's sequence of steps is known to achieve the goal conditions, Diligent identifies earlier steps that establish the preconditions of later steps.

When a state change of an earlier step is found to establish a precondition of a later step, a causal link is created (line 5). Because the precondition has been established, it
is removed from dstnam (line 6).

procedure Derive-Causal-Links
Input: skeleton: Identifies effects produced by the path's steps.
       proc: The procedure.
Result: cand: Set of candidate causal links.
        proof: Similar to skeleton but only contains the effects that help achieve the procedure's goal conditions.
(The following uses the array dstnam that is indexed by a condition. Each element contains a set of steps that have the condition as a precondition.)
1. For each of the procedure's goal conditions gcnd, add the goal state step to dstnam(gcnd).
2. Iterate over each step stp in skeleton starting with the path's last step and working backwards to the first step.
3. For each effect eff of stp in skeleton do the following.
4. If a condition cnd in eff's state change has an element in dstnam, then eff is needed to achieve the procedure's goal conditions. In this case, do the following.
5.   Add causal links to cand for condition cnd between step stp and later steps that have cnd as a precondition. These later steps are identified by dstnam(cnd).
6.   After adding the causal links, remove dstnam(cnd) in order to prevent spurious causal links.
7.   Add eff to proof for step stp.
8. Create an effect for the stp's control-preconditions and add it to proof. The new effect will have preconditions but no state changes.
9. Now add stp's preconditions to dstnam. (For each effect eff of stp in proof and for each precondition pcond of eff, add stp to dstnam(pcond).)
10. Any elements left in dstnam are dependent on the procedure's initial state. For each element of dstnam add a causal link for that condition from the initial state step to each of the steps listed for that condition in dstnam.

Figure 4.17: Computation of Causal Links
The algorithm also adds effects that produce useful state changes to the proof (line 7). Line 8 adds the step's control-preconditions to the proof. Control preconditions control when the step is applicable, but may not be required by the environment. For example, a control precondition might require that a light be turned on before opening a valve. After processing a step's state changes, Diligent adds the preconditions of the step's useful effects to dstnam (line 9). After all the steps have been processed, any preconditions that haven't been established must rely on the initial state (line 10).

In Figure 4.17, the use of the array dstnam greatly reduces the run-time overhead. Each of a step's state changes is checked against one array element rather than against the preconditions of each of the path's later steps.

Causal links:
a) begin-proc1 establishes (valve1 open) for turn-1
b) begin-proc1 establishes (HandleOn valve1) for move-2nd-2
c) begin-proc1 establishes (valve2 open) for turn-3
d) turn-1 establishes (valve1 shut) for move-2nd-2
e) turn-1 establishes (valve1 shut) for turn-3
f) turn-1 establishes (valve1 shut) for end-proc1
g) move-2nd-2 establishes (HandleOn valve2) for turn-3
h) move-2nd-2 establishes (HandleOn valve2) for move-1st-4
i) turn-3 establishes (valve2 shut) for end-proc1
j) move-1st-4 establishes (HandleOn valve1) for end-proc1

Figure 4.18: Causal Links

Now suppose that Derive-Causal-Links is used with the skeleton in Figure 4.16 and the goal conditions in Figure 4.12. Because all the operator effects in the skeleton are needed, the proof produced by the skeleton is the same as the skeleton (Figure 4.16). The resulting causal links are shown in Figure 4.18. The steps begin-proc1 and end-proc1 represent the procedure's initial and goal states, respectively.
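The backward sweep of Derive-Causal-Links can be sketched in Python as follows. The sketch omits the proof, control-preconditions, and mental attributes, records every step's preconditions (the real algorithm only records those of needed effects), and hard-codes begin-proc1 and end-proc1 as the boundary steps. Run on the skeleton of Figure 4.16 and the goals of Figure 4.12, it reproduces the ten links of Figure 4.18:

```python
def derive_causal_links(skeleton, goals):
    """Simplified sketch of Derive-Causal-Links (Figure 4.17)."""
    dstnam = {}                                  # condition -> steps needing it
    for g in goals:                              # line 1: goals go to the end step
        dstnam.setdefault(g, []).append("end-proc1")
    links = []
    for step, eff in reversed(skeleton):         # line 2: iterate backwards
        for cond in eff["changes"]:
            for later in dstnam.pop(cond, []):   # lines 5-6: link, then clear
                links.append((step, cond, later))
        for cond in eff["pre"]:                  # line 9: record the step's needs
            dstnam.setdefault(cond, []).append(step)
    for cond, steps in dstnam.items():           # line 10: the rest come from
        for s in steps:                          # the initial state
            links.append(("begin-proc1", cond, s))
    return links

skeleton = [
    ("turn-1",     {"pre": [("valve1", "open")],
                    "changes": [("valve1", "shut")]}),
    ("move-2nd-2", {"pre": [("valve1", "shut"), ("HandleOn", "valve1")],
                    "changes": [("HandleOn", "valve2")]}),
    ("turn-3",     {"pre": [("valve1", "shut"), ("valve2", "open"),
                            ("HandleOn", "valve2")],
                    "changes": [("valve2", "shut")]}),
    ("move-1st-4", {"pre": [("HandleOn", "valve2")],
                    "changes": [("HandleOn", "valve1")]}),
]
goals = [("valve1", "shut"), ("valve2", "shut"), ("HandleOn", "valve1")]
links = derive_causal_links(skeleton, goals)     # the ten links of Figure 4.18
```

Popping a condition from dstnam as soon as it is linked (line 6) is what keeps the closest earlier producer, rather than every producer, as the establishing step.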
In Figure 4.18, row a) indicates that the procedure's initial state (begin-proc1) establishes the condition (valve1 open), which is a precondition for step turn-1.

Once Diligent has created a proof of the path, Diligent can compute the ordering constraints between the steps. As mentioned earlier, ordering constraints indicate the relative order for performing a pair of steps. Diligent's calculation of ordering constraints is simpler than what would be seen in a partial-order planner [Wel94] because Diligent
The ordering constraints associated with causal links are calculated in Update-Step-Relationships (line 8 in Figure 4.13). The ordering constraints for our running example are shown in Figure 4.20. Any order ing constraints involving the procedure’s initial state and goal state are ignored because, by definition, the initial sta te is before all steps and the goal state is after all steps. The or dering constraints created with the procedure Derive-Ordering-Constraints are listed as being created by promotion. At this point, the instructor is finished with procedure procl. The plan is shown in Figure 4.21. 4.7 Creating a Hierarchical Procedure The techniques th at we've looked a t so far have problems scaling to larger procedures. Y V e need to be able to divide procedures into modular tasks, and we should be able to reuse existing procedures. Diligent addresses this issue with hierarchical procedures. A hierarchical procedure uses another procedure as one o f its steps. A procedure used as a step in another procedure is called a subprocedure, and the procedure containing the subprocedure is called the parent 80 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. proced u re Derive-Ordering-Constraints Input: proof: C ontains the effects needed by each step to achieve the procedure’s goals. O utput: cand: Set of candidate ordering constraints. (The following uses the array clobberstp th a t is indexed by an attribute. Each element contains a set of steps th a t change the a ttrib u te ’s value. The array is used to reduce searching.) 1. Iterate over each step stp in proof sta rtin g with the p ath ’s last step and working backwards to the first step. (Check stp against the steps later in the path.) 2. For each precondition pcond of stp in proof do the following. 3. 
3. If the pcond is not equal to a condition for the same attribute in a later step's (stp2's) state changes then add an ordering constraint between the two steps to cand.
     cand <- cand ∪ {ord1 | where ord1 is an ordering constraint between steps stp and stp2 ∧ eff is an effect of stp in proof ∧ pcond ∈ precondition(eff) ∧ attr = attribute(pcond) ∧ stp2 ∈ clobberstp(attr) ∧ cond ∈ state-change(stp2) ∧ attr = attribute(cond) ∧ value(cond) ≠ value(pcond)}
(Prepare to check stp against steps earlier in the path.)
4. For each state change condition of stp in proof add stp to the set of steps in clobberstp using the condition's attribute as an index.

Figure 4.19: Computation of Additional Ordering Constraints

Ordering constraints:
Created by promotion:
  turn-3 before move-1st-4
Created by causal links:
  turn-1 before move-2nd-2
  turn-1 before turn-3
  move-2nd-2 before turn-3
  move-2nd-2 before move-1st-4

Figure 4.20: Ordering Constraints

A step representing a subprocedure is called an abstract step, while other steps are called primitive steps.

4.7.1 Internally Simulating A Subprocedure

When a procedure is created, its steps reflect the initial state of its path. However, when a procedure is used as a subprocedure, it may have a different initial state. This means that some of the procedure's steps may no longer be needed. To overcome this problem, Diligent can internally simulate performing a subprocedure.

Diligent also internally simulates the performance of subprocedures for other purposes. Diligent simulates a subprocedure when computing step relationships in order to determine the preconditions of the subprocedure's abstract step. Diligent also simulates a subprocedure when figuring out which subprocedure steps to perform during one of its experiments (Chapter 6).

A subprocedure has the same semantics as a STRIPS macro-operator [RN95], and the criteria used by Diligent for determining when to perform a step were developed by Jeff Rickel for the STEVE tutor [RJ99].14 STEVE examines the current state and determines which steps are needed to achieve the goal conditions. However, unlike STEVE, Diligent cannot assume that a step's preconditions are correct. If a primitive step's operator is not very refined, then the step could have unnecessary or missing preconditions. A missing precondition could cause Diligent to skip the step that establishes the precondition, and an unnecessary precondition could prevent a necessary step from being performed because the precondition is never satisfied.

14Diligent and STEVE were developed as part of the same project [JRSM98].
Diligent also simulates a subprocedure when figuring out which subprocedure steps to perform during one of its experiments (Chapter 6). A subprocedure has the same semantics as a STRIPS macro-operator [RN95], and the criteria used by Diligent for determining when to perform a step were developed by Jeff Rickel for the STEVE tutor [RJ99].14 STEVE examines the current state and determines which steps are needed to achieve the goal conditions. However, unlike STEVE, Diligent cannot assume that a step's preconditions are correct. If a primitive step's operator is not very refined, then the step could have unnecessary or missing preconditions. A missing precondition could cause Diligent to skip the step that establishes the precondition, and an unnecessary precondition could prevent a necessary step from being performed because the precondition is never satisfied.

14 Diligent and STEVE were developed as part of the same project [JRSM98].

Steps: begin-proc1, turn-1, move-2nd-2, turn-3, move-1st-4, end-proc1
Goal conditions: (valve1 shut)(valve2 shut)(HandleOn valve1)
Causal links:
  begin-proc1 establishes (valve1 open) for turn-1
  begin-proc1 establishes (HandleOn valve1) for move-2nd-2
  begin-proc1 establishes (valve2 open) for turn-3
  turn-1 establishes (valve1 shut) for move-2nd-2
  turn-1 establishes (valve1 shut) for turn-3
  turn-1 establishes (valve1 shut) for end-proc1
  move-2nd-2 establishes (HandleOn valve2) for turn-3
  move-2nd-2 establishes (HandleOn valve2) for move-1st-4
  turn-3 establishes (valve2 shut) for end-proc1
  move-1st-4 establishes (HandleOn valve1) for end-proc1
Ordering constraints:
  turn-1 before move-2nd-2
  turn-1 before turn-3
  move-2nd-2 before turn-3
  move-2nd-2 before move-1st-4
  turn-3 before move-1st-4

Figure 4.21: The Plan for Procedure proc1
procedure Internally-Simulate-Subprocedure
Input: pre-state: The subprocedure's initial state.
       proc: The subprocedure.
Result: used-steps: Sequence of steps that achieves proc's goals.
        pcond: The preconditions in pre-state.
(In the following, a needed precondition, goal condition or step is called relevant. A step is only assumed to have a precondition when the step is enabled by a causal link that establishes that precondition.)
1. If proc does not yet have any causal links, add all of proc's steps to used-steps, set pcond to be empty, and return.
2. Compute the steps that are needed to achieve proc's goal conditions. This is done by iterating backwards over the procedure from the goal conditions to the start of the procedure.
  i) All goal conditions are relevant.
  ii) A step is relevant if it establishes an unsatisfied goal condition or an unsatisfied but relevant precondition.
  iii) The conditions of all causal links that enable a relevant step are relevant preconditions of that step.
  iv) Relevant preconditions are satisfied when the causal link associated with the precondition is established by either another step or the subprocedure's pre-state.
3. Add all relevant steps to used-steps. While adding steps, maintain the same step order as the procedure.
4. If a relevant precondition or goal condition is not established by a relevant step and the condition is true in the subprocedure's pre-state, then add the condition to the subprocedure's pcond.

Figure 4.22: Simulating a Subprocedure

The algorithm to simulate performing a subprocedure is shown in Figure 4.22. The calculation is called "simulation" rather than "planning" because it uses a linear sequence of steps (i.e., a path) and determines which steps will achieve the subprocedure's goal conditions.
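A hedged Python sketch of the backwards relevance pass in Figure 4.22 follows. The causal-link triples, state dictionaries, and in particular the "no other step drives the attribute elsewhere" test used to decide when the pre-state satisfies a condition are illustrative assumptions, not Diligent's actual representation or STEVE's exact test.

```python
# Illustrative sketch of Internally-Simulate-Subprocedure (Figure 4.22).
# Causal links are (producer, condition, consumer) triples, a condition is an
# (attribute, value) pair, and states map attributes to values.

def simulate_subprocedure(pre_state, steps, links, changes, end_step):
    """changes: step -> {attr: value} state changes. Returns (used_steps, pcond)."""
    if not links:                                   # line 1: no relationships yet
        return list(steps), []

    def satisfied_by_pre_state(cond, consumer):
        attr, value = cond
        if pre_state.get(attr) != value:
            return False
        producer = next((p for p, c, u in links if c == cond and u == consumer), None)
        # Trust the pre-state only if no other step changes the attribute away.
        return not any(changes[s].get(attr, value) != value
                       for s in steps if s not in (producer, consumer))

    relevant, pcond = set(), set()
    agenda = [(c, u) for _p, c, u in links if u == end_step]   # goal conditions
    while agenda:                                   # line 2: backwards relevance
        cond, consumer = agenda.pop()
        if satisfied_by_pre_state(cond, consumer):
            pcond.add(cond)                         # line 4
            continue
        producer = next((p for p, c, u in links if c == cond and u == consumer), None)
        if producer in changes and producer not in relevant:   # a real, new step
            relevant.add(producer)
            agenda += [(c, producer) for _p, c, u in links if u == producer]
    used = [s for s in steps if s in relevant]      # line 3: keep path order
    return used, sorted(pcond)
```

Replaying the running example with proc1's causal links and the post-state of turn-5 reproduces Figure 4.23: turn-1 is skipped because (valve1 shut) already holds, while move-2nd-2, turn-3 and move-1st-4 remain relevant.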
Line 1 deals with the situation when a subprocedure does not yet have any step relationships. One solution is to force the instructor to define goal conditions and step relationships. However, this approach is intrusive, and the goal conditions and step relationships may not yet be necessary. To keep the interaction with the instructor simple, Diligent assumes all the path's steps should be performed if a procedure has no causal links. In this case, because preconditions depend on causal links, no preconditions can be found.

Line 2 determines which steps are needed to achieve the procedure's goal conditions. This calculation is similar to the calculation used by STEVE [RJ99]. Line 3 just gathers the steps that Step 2 identified as relevant (i.e., that need to be performed). Line 4 identifies the preconditions of the subprocedure, but differs from what STEVE would do. STEVE would include all preconditions that were marked as relevant, while Diligent only includes preconditions that are satisfied in the subprocedure's initial state. To see why Diligent took this approach, suppose that an existing procedure is reused as a subprocedure. In this case, some of the subprocedure's preconditions might be unsatisfied. Since the preconditions are unsatisfied in the subprocedure's initial state, they cannot be used as preconditions of the subprocedure. (If the subprocedure can achieve its goals from this initial state, these unsatisfied preconditions were unnecessary.)

The major problem with simulating a subprocedure is compensating for possible errors in the preconditions of steps, especially the initial preconditions created with heuristics. Diligent handles this problem by utilizing the fact that its heuristics for learning preconditions favor creating unnecessary preconditions over skipping potentially necessary ones.15 For this reason, it is sometimes reasonable for Diligent to ignore unsatisfied preconditions (line 4).
Another issue is dealing with abstract steps embedded within a subprocedure. In this case, Diligent assumes that the causal links involving the abstract steps are reasonable. This allows Diligent to treat abstract steps the same as primitive steps and reduces the overhead of simulating the abstract steps inside a subprocedure.

15 The reasons that the heuristics favor unnecessary preconditions will be discussed in Chapter 5.

From this discussion it may seem that the reuse of subprocedures is undesirable. However, reusing a subprocedure saves time, and performing a subprocedure under different initial states helps refine the preconditions of the subprocedure's steps.

A problem that Diligent does not address is when the internal simulation does not correctly identify the steps needed to achieve the subprocedure's goal conditions. Ideally, Diligent would notice this, notify the instructor, and interact with him in order to fix the problem. This type of dialog is supported by Instructo-Soar [HL95].16

4.7.2 Continuing the Running Example

Now let us return to our running example. We will create a hierarchical procedure that contains three steps, two of which are abstract. We will look at the two ways of inserting subprocedures into a parent procedure.

• An existing procedure can be inserted as a subprocedure. This can save an instructor time and effort.
• A new procedure can be defined as a subprocedure inside a demonstration of the parent procedure. This can be a convenient way of authoring a subprocedure in the desired initial state.

Suppose the instructor now authors a hierarchical procedure that shuts some valves and checks whether a light works. The instructor will use the same initial state as our first procedure.
The instructor calls the procedure "top-level" and gives it the description "perform a hierarchical procedure." The instructor then demonstrates the new procedure.

1. The instructor turns the handle and shuts valve1. This step is called turn-5.
2. The second step reuses our first procedure proc1 (Figure 4.21). The instructor uses proc1 by selecting it from a menu of potential subprocedures. This step is called proc1-6.
3. The third step is a new procedure that checks whether an alarm light is working. The new procedure is defined inside the demonstration of its parent procedure (top-level). The instructor calls the new procedure "proc2" and gives it the description "check the alarm light." The instructor finishes demonstrating and defining proc2 before continuing the demonstration of procedure top-level. This step is called proc2-7.

16 Extensions to support unexpected behavior in subprocedures are discussed in Chapter 8.

Step turn-5 is a redundant step that performs the work of the first step (turn-1) in procedure proc1 (step proc1-6). Step turn-5 is used to show why subprocedures need to be internally simulated. Even though step turn-5 is a primitive step, it is meant to illustrate the situation where the state changes of one subprocedure interact with the preconditions of a later subprocedure.

Because step turn-5 performs the first step of subprocedure proc1, Diligent needs to simulate proc1 so that it can determine which steps to perform and identify the preconditions of step proc1-6. Diligent simulates the subprocedure using the post-state of step turn-5 and Internally-Simulate-Subprocedure (Figure 4.22). As expected, Diligent determines that the subprocedure's step turn-1 is unnecessary because the condition (valve1 shut) has already been established.
After doing the simulation, Diligent performs the abstract step proc1-6 by performing the steps shown in Figure 4.23. In Figure 4.23, the abstract step proc1-6 is also associated with an action-example. Diligent creates an action-example for an abstract step by recording the state before and after performing the step.

Steps to perform: move-2nd-2 → turn-3 → move-1st-4
Preconditions: (valve1 shut)(valve2 open)(HandleOn valve1)
Action-example:
  Pre-state: (valve1 shut)(valve2 open)(HandleOn valve1)(AlarmLight1 off)(CdmStatus normal)
  Delta-state: (valve2 shut)

Figure 4.23: Results from Simulating Step proc1-6

4.7.3 A Nested Procedure Definition

After Diligent has performed abstract step proc1-6, the instructor defines a new procedure inside the current demonstration. The concept of nested procedure definitions has been borrowed from Instructo-Soar [HL95].

To construct the new subprocedure's prefix, the prefix of the parent procedure has appended to it every action necessary to reach the subprocedure's initial state.17 When constructing the prefix, abstract steps are represented by their primitive steps, and primitive steps are represented by their associated actions. Figure 4.24 shows the prefix for the new subprocedure (proc2).

Prefix: prefix3
  Configuration: config1
  Additional-actions: turn handle1 → move valve2 → turn handle1 → move valve1

Figure 4.24: The Subprocedure's Prefix

17 Instructo-Soar does not use prefixes.

4.7.4 Sensing Actions

The new procedure (proc2) checks whether a light is working. The procedure illustrates the use of a step that performs an information-gathering (or sensing) action [AIS88, RN95]. A sensing action gathers information about the environment without changing it. This raises three immediate issues:

1. What are the step's preconditions?
The environment may place no restrictions on when the sensing action can be performed.
2. How does Diligent indicate that the step has been performed? Because the sensing action does not change the environment, what is to prevent the step from being repeated indefinitely?
3. Since the step doesn't change the state, what is to prevent Diligent from just skipping the step?

Diligent addresses these issues by creating preconditions that control when the step is performed and by creating internally maintained mental attributes. A mental attribute is an attribute that is maintained inside Diligent and is not present in the environment. A sensing action creates a condition involving a new mental attribute, and the condition is incorporated into the procedure's goal conditions [RJ99]. Adding the goal condition ensures that the sensing action is performed once.

To control when the sensing action is performed, Diligent uses heuristics to create provisional preconditions for the sensing action's step. While creating the preconditions, Diligent focuses on the current demonstration of the current procedure. Diligent assumes that attributes that change value are likely to be important. Since earlier steps are likely to establish preconditions for later steps, the state changes caused by earlier steps are likely preconditions. The preconditions of sensing actions are calculated with Compute-Changes-in-Demo (Figure 4.25), which is invoked during a demonstration when a sensing action is performed.

procedure Compute-Changes-in-Demo
Input: demo: A demonstration.
       cur-state: The environment's current state.
Output: chgs: A set of state changes.
(The set attrs contains the names of attributes that change value during the demonstration.)
1.
For each step in demo, add the attribute of every condition in the delta-state of the step's action-example to attrs.
2. For each attribute that changed value (i.e., in attrs), add the attribute's condition in the current state (cur-state) to chgs.

Figure 4.25: Computing State Changes Caused by Earlier Steps

In Compute-Changes-in-Demo, the action-examples of abstract steps are treated the same as the action-examples of primitive steps. This means that attributes that change value in the subprocedure but have the same initial and final value are ignored. This approach simplifies processing and doesn't require a subprocedure's goal conditions or causal links to be defined. Besides, the algorithm for computing state changes only provides heuristic preconditions.

Because a sensing action might be performed at any time and because a procedure may contain several sensing actions, we use the convention that a sensing action performed by a human student will not be recognized unless all of its preconditions are satisfied. Otherwise, a sensing action might not be performed in the proper situation, which means the sensing action would not be performed properly.

Diligent only identifies preconditions for a sensing action when the sensing action is demonstrated in an add-step demonstration. Afterwards, the preconditions only change when the instructor edits them. In a future system, machine learning techniques could be used to refine a sensing action's preconditions if a sensing action could be demonstrated multiple times. The system might then look for commonality between the demonstrations. However, beyond multiple demonstrations, it is unclear how to use machine learning techniques with sensing actions because they don't affect the state of the environment.
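The two-pass computation in Compute-Changes-in-Demo above is small enough to sketch directly; the dictionary-based step and state shapes below are illustrative assumptions, not Diligent's internal types.

```python
# Illustrative sketch of Compute-Changes-in-Demo (Figure 4.25).
def compute_changes_in_demo(demo, cur_state):
    """demo: steps, each with the delta-state ({attr: value}) recorded in its
    action-example; cur_state: {attr: value}. Returns the current condition of
    every attribute that changed value earlier in the demonstration."""
    attrs = set()
    for step in demo:                      # step 1: collect changed attributes
        attrs.update(step["delta_state"])
    # step 2: pair each changed attribute with its current value
    return {(attr, cur_state[attr]) for attr in attrs}
```

Invoked right after press-test-8 in the proc2 demonstration, the sketch returns (AlarmLight1 on) and (CdmStatus test), the control preconditions recorded for check-light-9 in Figure 4.26.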
Perhaps it might be possible to use the placement and type of a sensing action to make inferences about other aspects of a procedure.

While this approach for identifying sensing-action preconditions worked on the procedures that we looked at, it became clear during Diligent's evaluation (which did not use sensing actions) that the approach would have been more robust if it had also looked at attributes that changed value after the sensing action. Using the state changes of earlier steps places the sensing action after the earlier steps, and using attributes that change value later in the procedure would have placed the sensing action in front of later steps. Consider the following example of why both sources of preconditions are important. Suppose a procedure involves pressing the reset button, checking if a light is illuminated, and turning off the system by pressing the power button. In this case, checking the light has no value if it is checked before the reset button has been pressed or after the power button has been pressed. By using state changes of steps both before and after the sensing action, the sensing action could have been positioned so that it was correctly performed as the second step.

4.7.5 Demonstrating the Nested Procedure

Now let us look at the demonstration of the procedure (proc2) containing the sensing action. During the demonstration, the instructor performs the following steps.

1. The instructor presses the function-test button, which causes the alarm light to turn on. The instructor calls the operator "press-test" and approves the default description "press the system test button." The step is called press-test-8.
2. The instructor performs a sensing action on the light by selecting the light with the mouse. The instructor calls the operator "check-light" and approves the default description "check the alarm light." This step is called check-light-9.
3. The instructor turns off the light by pressing the reset button. The instructor calls the operator "press-reset" and approves the default description "press the system reset button." This step is called press-reset-10.

Figure 4.26 shows information for proc2's demonstration. The conditions found by Compute-Changes-in-Demo are listed as the control preconditions of step check-light-9. In contrast to the preconditions of an effect, which are required by the environment to produce the effect's state changes, control preconditions are specific to a step and need to be true before the step is performed. For this reason, control preconditions are associated with the step rather than with the operator. The mental attribute (AlarmLight1-result) created by step check-light-9 is added to the step's mental-conditions because Diligent associates each mental attribute with a distinct step. The value of the mental attribute is not considered important (i.e., <any value>) because none of the procedures used with Diligent could utilize the mental attribute's value. A more sophisticated use of mental attributes and sensing actions will be discussed when we talk about potential extensions (Section 8.4.2.3).

At this point, the instructor derives the procedure's goal conditions and step relationships. The plan for proc2 is shown in Figure 4.27.

4.8 The Completed Procedure

After the instructor has finished subprocedure proc2, he finishes demonstrating its parent procedure (top-level). The plan for top-level is shown in Figure 4.28. The plans for the abstract steps proc1-6 and proc2-7 have already been shown in Figures 4.21 and 4.27, respectively.

One thing to note about top-level's plan is that subprocedures are treated as black boxes that achieve their goal conditions. This is done because subprocedures do not terminate until their goal conditions are satisfied.
Demonstration:
  Type: add-step
  Prefix: prefix3
  Previous-step: begin-proc2
  Steps: press-test-8 → check-light-9 → press-reset-10

Step: press-test-8
  Action-example:
    Pre-state: (valve1 shut)(valve2 shut)(HandleOn valve1)(AlarmLight1 off)(CdmStatus normal)
    Delta-state: (AlarmLight1 on)(CdmStatus test)

Step: check-light-9
  Action-example:
    Pre-state: (valve1 shut)(valve2 shut)(HandleOn valve1)(AlarmLight1 on)(CdmStatus test)
    Delta-state: <empty>
  Control-preconditions: (AlarmLight1 on)(CdmStatus test)
  Mental-conditions: (AlarmLight1-result <any value>)
  Operator: check-light
    Effect:
      Preconditions: <empty>
      State changes: <empty>

Step: press-reset-10
  Action-example:
    Pre-state: (valve1 shut)(valve2 shut)(HandleOn valve1)(AlarmLight1 on)(CdmStatus test)
    Delta-state: (AlarmLight1 off)(CdmStatus normal)

Figure 4.26: Subprocedure Demonstration

Furthermore, a subprocedure's plan supports some ability to adjust to different initial states. Moreover, treating subprocedures as black boxes simplifies processing on hierarchical procedures (e.g., computing step relationships).

Treating subprocedures as black boxes affects top-level's plan in several ways. One way is using subprocedure proc1's goal conditions for the state changes of its step when computing top-level's goal conditions (line 3 in Figure 4.11). That is why (HandleOn valve1) is a goal condition of top-level even though the condition is true in top-level's initial and goal states. Another way that subprocedures are used as black boxes is when the preconditions and goal conditions of a subprocedure (proc1) are used to create an effect
when computing the path's skeleton (line 7 in Figure 4.15). This is why step proc1-6 rather than step turn-5 establishes the goal condition (valve1 shut) with causal link g).

Steps: begin-proc2, press-test-8, check-light-9, press-reset-10, end-proc2
Goal conditions: (AlarmLight1 off)(CdmStatus normal)(AlarmLight1-result <any value>)
Causal links:
  begin-proc2 establishes (AlarmLight1 off) for press-test-8
  begin-proc2 establishes (CdmStatus normal) for press-test-8
  press-test-8 establishes (AlarmLight1 on) for check-light-9
  press-test-8 establishes (CdmStatus test) for check-light-9
  press-test-8 establishes (AlarmLight1 on) for press-reset-10
  press-test-8 establishes (CdmStatus test) for press-reset-10
  check-light-9 establishes (AlarmLight1-result <any value>) for end-proc2
  press-reset-10 establishes (AlarmLight1 off) for end-proc2
  press-reset-10 establishes (CdmStatus normal) for end-proc2
Ordering constraints:
  press-test-8 before check-light-9
  press-test-8 before press-reset-10
  check-light-9 before press-reset-10

Figure 4.27: The Plan for Subprocedure proc2
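The black-box treatment can be pictured as collapsing a subprocedure into a single effect whose preconditions are the subprocedure's preconditions and whose state changes are its goal conditions. The helper below is a hypothetical illustration of that idea, not Diligent's skeleton code; the names and data shapes are assumptions.

```python
# Hypothetical illustration: an abstract step contributes one "black box"
# effect built from its subprocedure's preconditions and goal conditions.
def black_box_effect(sub_preconditions, sub_goal_conditions):
    return {"preconditions": set(sub_preconditions),
            "state_changes": set(sub_goal_conditions)}

# Step proc1-6: preconditions from Figure 4.23, goals from Figure 4.21.
proc1_6 = black_box_effect(
    {("valve1", "shut"), ("valve2", "open"), ("HandleOn", "valve1")},
    {("valve1", "shut"), ("valve2", "shut"), ("HandleOn", "valve1")})
```

Because (valve1 shut) appears among the abstract step's state changes, the skeleton computation sees proc1-6, not turn-5, as the last step to assert it, which is why causal link g) originates at proc1-6.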
Steps: begin-top-level, turn-5, proc1-6, proc2-7, end-top-level
Goal conditions: (valve1 shut)(valve2 shut)(HandleOn valve1)(AlarmLight1 off)(CdmStatus normal)(AlarmLight1-result <any value>)
Causal links:
  a) begin-top-level establishes (valve1 open) for turn-5
  b) begin-top-level establishes (valve2 open) for proc1-6
  c) begin-top-level establishes (HandleOn valve1) for proc1-6
  d) begin-top-level establishes (AlarmLight1 off) for proc2-7
  e) begin-top-level establishes (CdmStatus normal) for proc2-7
  f) turn-5 establishes (valve1 shut) for proc1-6
  g) proc1-6 establishes (valve1 shut) for end-top-level
  h) proc1-6 establishes (valve2 shut) for end-top-level
  i) proc1-6 establishes (HandleOn valve1) for end-top-level
  j) proc2-7 establishes (AlarmLight1 off) for end-top-level
  k) proc2-7 establishes (CdmStatus normal) for end-top-level
  l) proc2-7 establishes (AlarmLight1-result <any value>) for end-top-level
Ordering constraints:
  turn-5 before proc1-6

Figure 4.28: The Top Level Procedure

4.8.1 Information Provided by the Instructor

To summarize the previous sections, when an instructor creates a procedure, he needs to provide demonstrations and names for procedures and operators. He must also provide English descriptions that can be used to describe procedures to human students. Descriptions of procedures are entered entirely by the instructor, but for other types of descriptions, Diligent can generate a default description. Of course, default descriptions still need to be approved (and possibly modified) by the instructor.

4.8.1.1 Generating default descriptions

Diligent provides default descriptions for operators, steps, causal links and goal conditions. These descriptions exploit Diligent's ability to query the environment for English descriptions of action-types, objects and attributes (Section 3.1.3).
Diligent uses the information returned by the environment to fill in templates.

• Causal links. The template for a causal link is "the <attribute name> to be <value>." In the template, <attribute name> and <value> represent the description of the attribute and the attribute's value, respectively. The template does not start with a complete sentence so that the tutor has flexibility in how it starts sentences. For example, the tutor might say, "Now we want the 'first valve' to be 'open'."
• Goal conditions. Goal conditions are represented by causal links that establish conditions for the plan's goal state step.
• Operators. The template is "<type of action> the <object>." For example, the tutor could use the template to say "We will now 'toggle' the 'first valve'." Of course, additional templates would be needed if operators modeled actions that involved multiple objects.
• Steps. By default, steps use their operator's description.

The templates are simple, but they provide the instructor with a great deal of help. They correctly identify the objects and attributes involved. Because they usually produce reasonable descriptions, they save the instructor a great deal of typing. Reducing typing not only saves time but also prevents errors.

4.9 Complexity

Because Diligent is an interactive system, its algorithms should have reasonable run-time efficiency. In this section, we will discuss the run-time complexity of simulating subprocedures and deriving step relationships. These calculations involve identifying connections between steps, and the algorithms center on the processing of individual steps. For this reason, we will consider the processing of a step as the basic operation. We will assume that each step has the maximum number of preconditions and that these result in the maximum number of causal links and ordering constraints. For this reason, we will consider the processing of each step as approximately the same.
95 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. We will also ignore the access times of associative arrays. An associative array is indexed by a symbolic value (e.g. “blue” ) and can be implemented as a hash table. The worst case time for accessing an element of an associative array is linear in the num ber of elements in the array. L et n = th e n u m b er o f s t e p s in th e c u r r e n t p r o c e d u r e w ith o u t c o n sid e r in g th e s t e p s in sid e s u b p r o c e d u r e s , m = th e m a x im u m n u m b er o f s t e p s in a su b p r o c e d u re w ith o u t considering the steps inside a subprocedure’s subprocedures, (m = n) s = the number of subprocedures in the current procedure, p = the maximum num ber of preconditions or state changes for a step. In Diligent’s algorithm s, the causal links and ordering constraints are derived from the preconditions of steps. The calculations revolve around a step’s preconditions rather than around causal links or ordering constraints. T hus, in the following algorithms, we will expect to process 0 (p ) preconditions every tim e we process a step. When we discuss the steps in a procedure, we m ean the steps in the im m ediate pro cedure. By immediate procedure, we mean only the primitive and abstract steps in a procedure and not the steps inside the subprocedures associated with abstract steps. First, we will look at sim ulating a subprocedure (Figure 4.22). The algorithm uses the subprocedure’s causal links. This results in a b stra c t steps inside the subprocedure being treated exactly like other steps. Determining the relevant preconditions and steps (line 2) involves visiting each of the m steps once. L ater, the m steps are visited once again to identify and store the relevant steps (line 3). Because there are O(p) preconditions, we expect to process 0 (p ) causal links for every step . Thus, the run-tim e complexity is O (pm ). 
Next, we will look at deriving step relationships (Figure 4.13). The majority of the time is spent in the algorithms that compute the path skeleton, the causal links and the ordering constraints. Each algorithm computes intermediate results that are used by the next algorithm.

The first algorithm (Figure 4.15) creates a skeleton of a path. The skeleton identifies which operator effects were used in the path. If a procedure does not contain any subprocedures, then each step is visited once (lines 9-12) and O(p) preconditions and state changes are processed. Thus, the complexity is O(pn). However, any subprocedures will need to be simulated (lines 4-8). If there are at most s subprocedures with a length of at most m, then the run-time complexity is O(pn + spm).

The second algorithm (Figure 4.17) takes the skeleton and computes causal links. Each step is processed once, and associative arrays are used to hold intermediate results. Because O(p) preconditions are considered, the complexity is O(pn).

The third algorithm (Figure 4.19) computes ordering constraints. The algorithm looks at preconditions of steps and compares them to state changes of later steps (lines 2-3). In the worst case, every step would change the same attribute. This would result in a run-time complexity of O(pn²). However, the algorithm uses an associative array to record which attributes are changed by which steps (line 4). This reduces the expected number of comparisons. In practice, the algorithm has been very fast.

Combining the complexity of the various algorithms results in a complexity of O(pn² + spm). The algorithms have been used on procedures as long as 10 to 12 steps, and none of the algorithms has been observed to take more than a few seconds.

Diligent gains efficiency from its focus on the immediate procedure.
The cost of simulating subprocedures is limited because Diligent uses the causal links inside subprocedures. Once an abstract step's subprocedure has been simulated, overhead is reduced because the abstract step is treated like a primitive step. Another source of efficiency is hierarchical procedures. The hierarchy allows instructors to create relatively small and modular procedures, and the run-time overhead of creating small and modular procedures is small.

4.10 Related Work

4.10.1 Natural Language Versus Direct Manipulation

When using Diligent, the instructor demonstrates a procedure by directly manipulating the environment. However, a natural language (e.g., English) could have been used to specify the procedure's steps. Humans find natural languages flexible and easy to use. Unfortunately, computers have difficulty understanding natural languages. One problem is ambiguity. For example, what does "it" or "the button" mean? Another problem is indirection. Instead of simply performing an action, a human needs to provide an abstract description. A system that receives demonstrations with similar content to Diligent's, but in English, is Instructo-Soar [HL95].

Although direct manipulation avoids many of the problems of ambiguity inherent in natural languages, a problem when using direct manipulation is handling abstraction. Input is very concrete because it deals with individual objects. This raises the issue of how to specify quantification, negation and sets [Coh92]. Because input is directly entered in the current state, it is also difficult to specify hypothetical situations. These are areas where natural language could complement direct manipulation.

4.10.2 Programming By Demonstration

This section will discuss related work on basic techniques for Programming By Demonstration (PBD) [C+93].
A PBD system learns how to perform some task by observing a user perform it. The difference between PBD and learning a macro is that PBD involves a generalization of the task instead of a rote repetition of the user's actions. Diligent can be classified as a PBD system because it observes demonstrations and uses them to create plans and operators.

Many PBD systems learn how to perform procedures. These systems typically rely on a helpful user in order to learn how to perform simple procedures after only a few demonstrations. Diligent differs from a typical PBD system because it has the ability to experiment¹⁸ and the ability to learn the relationships between steps (i.e., step relationships). Additionally, few PBD systems can learn hierarchical procedures.

4.10.2.1 Procedure Representation

An important aspect of this chapter is that it provides algorithms that transform demonstrations into hierarchical partially ordered plans. This plan representation has fine-grained ordering constraints and causal links that have been shown to be useful for providing good explanations to humans [ME89].

Some robotic PBD work [FMD+96] has also produced hierarchical partially ordered plans, but the robotic work learns a very different representation. Each step in a procedure has a set of disjunctive preconditions that indicate when the step can be performed if zero to all of the procedure's later steps are not performed. If a procedure is long, then these preconditions could become very complicated. Thus, it appears that human users could have difficulty understanding or verifying the preconditions of steps.

¹⁸ Diligent experiments by replaying a procedure, skipping a step, and observing the result.

4.10.2.2 Basic Techniques

The PBD literature provides a number of useful basic techniques.
Other than some preliminary work on sensing actions, Diligent used basic PBD techniques instead of creating new ones.

One basic technique is focusing the agent's attention. This can be done by pointing to objects [Mau94] and by identifying important objects through extraneous actions performed on them [Lew92]. Using extraneous actions probably has limited value for Diligent because apparently extraneous actions are likely to indicate either that one of the actions is a sensing action, that Diligent cannot see a relevant attribute, or that Diligent is missing knowledge of step relationships. Other work on focusing has looked at spatial distance and how quickly actions are performed [Hei89]. However, using spatial distance may have little value on a device with buttons and switches. Using speed of instruction as a focus may also be inappropriate because hurrying an instructor may negatively impact the quality of a demonstration.

Besides focusing, another basic PBD technique is asking the user to provide clarification. The user is asked to select between a set of hypotheses in the PRODEGE+ graphics editor [BS93]. In contrast, Metamouse asks the user to toggle on "thumbtacks" which indicate potentially important features [MW93]. Diligent uses this technique when it presents an instructor with hypothesized preconditions, goal conditions and step relationships.

Another technique is providing a graphical history or storyboard [KF93]. A graphical history shows, in a sequence of small windows, how the window used for instruction varied throughout a demonstration. One problem with graphical histories is that support for them might need to be explicitly designed into a graphical interface. Diligent could not use graphical histories because it did not have enough control over the environment's graphical interface.

One basic technique is learning hierarchical procedures [KM93].
This promotes reuse because existing procedures can be used as components of larger procedures. This improves scalability because it takes less work by a user to enter a large procedure. Diligent's hierarchical procedures are unusual because of the causal links in subprocedures. Causal links provide a great deal of flexibility because they indicate which steps are necessary when starting from a variety of initial states.

Still another basic technique is adding textual annotations (e.g., an object's name) to a graphical representation [Lie94]. A problem with the annotation approach is that Diligent does not have the ability to insert text into the environment's graphical interface. If Diligent had this ability, it might be able to communicate more clearly with the instructor, but it is unclear how much effect annotations would have.

Another technique is creating graphical rewrite rules. A graphical rewrite rule transforms a graphical pattern into another pattern. This technique works best when the user can create the new pattern by making fine-grain changes to a graphical environment. A system that uses this approach is KidSim [SCS94, CS95], which allows young children to create simulations. Diligent does not use this technique because it is designed to be used in environments where it has limited control of the environment.

Work by Bimbo and Viario has addressed an issue that Diligent does not consider, which is training multiple agents in a virtual environment [BV96]. They do this by having all but one agent replay a fixed sequence of actions. The system learns how to react to situations based on spatial and temporal constraints. However, the system does not learn the knowledge necessary for teaching. An issue with this approach is synchronizing the actions of all agents.
Synchronization is a problem because the agent and the instructor may be engaged in a dialog that conflicts with the time line. Synchronization is also an issue because an agent's actions could cause another agent to deviate from the fixed sequence of actions being replayed.

While the other basic PBD techniques used by Diligent have been discussed in the PBD literature, no other system appears to incorporate actions that actively gather information (i.e., sensing actions) and then use this information to influence a procedure's flow of control. Since Diligent learns procedures for the types of domains where test results are gathered, sensing actions are important.

Most PBD systems merely accept the data provided by the user, but some systems actively identify data that can be used to refine their knowledge. A system is said to engage in active learning when it identifies data that can help refine its knowledge. One such system is Disciple [TK98], which finds an example and asks the user whether it belongs to a given class. However, other than Diligent, there appears to be no system that uses direct manipulation and then uses the environment to perform experiments that will reduce the need for the user to answer questions.

4.11 Summary

The main importance of this chapter is that it provides algorithms that transform demonstrations into hierarchical partially ordered plans. While many of the algorithms are original, they tend to be fairly simple or derived from standard planning techniques. This chapter is also important because its algorithms create the basic structure used to learn operators (Chapter 5) or to perform experiments (Chapter 6). We will now briefly review what this chapter covered.

This chapter discussed how Diligent transforms demonstrations into procedures.
To process a demonstration, Diligent can combine multiple demonstrations into a path. Because the path contains all the procedure's steps, Diligent uses the path to derive the procedure's plan. By default, a procedure's goal conditions contain the final values of attributes whose values changed during the procedure. Once the goals are known, step relationships can be derived using the path's sequence of steps and the preconditions and state changes of each step.

To promote scalability, modularity and ease of authoring, procedures can be hierarchical. Subprocedures can be specified by inserting existing procedures into a demonstration or by creating a new subprocedure inside a demonstration of the parent procedure. However, when reusing an existing procedure as a subprocedure, Diligent needs to internally simulate the subprocedure because the subprocedure's initial state may require skipping some of the subprocedure's steps.

Another issue is how to incorporate sensing actions into a procedure. Because sensing actions do not change the environment, Diligent needs to ensure that they are not skipped. Diligent does this by creating a mental attribute that doesn't exist in the environment and then using the mental attribute in a goal condition. Diligent also ensures that a sensing action is performed in the proper state by adding preconditions that control when it is performed.

Chapter 5

Learning Operators

The previous chapter discussed constructing procedures from demonstrations. However, demonstrations, by themselves, are not useful because they do not explicitly indicate the dependencies between steps (i.e., step relationships). Without knowledge of dependencies, an automated tutor could perform the procedure by rote, but could not answer questions about which steps to perform or how steps depend on each other.
Diligent corrects for this problem by learning operators. An operator models actions performed in the environment by indicating which preconditions will cause an action to produce given state changes. Diligent associates the operators that it learns with the steps of procedures. This allows Diligent to use operator preconditions when calculating the dependencies between steps.

One of Diligent's contributions is how it balances the techniques used to learn operators with how it performs experiments. Experiments, which will be discussed in the next chapter, can more easily remove unnecessary preconditions than identify missing ones. In contrast, the techniques that Diligent uses to learn operators have a bias favoring likely but potentially unnecessary preconditions. This bias is important because little data may be available for learning. Part of this bias is Diligent's novel focus on the heuristic that the state changes of earlier steps in a demonstration are likely to establish preconditions for later steps. This heuristic is used when creating new operators.

This chapter discusses how Diligent learns operators. First, we will present requirements for the learning problem. We will then discuss heuristics and data structures. Afterwards, we will discuss how to create new operators and refine existing operators. The chapter will finish with a discussion of run-time complexity and related work.

5.1 Additional Requirements

Earlier, in Section 3.1, we described the authoring problem in terms of requirements, constraints and the interface to the environment. Since then, the discussion of how demonstrations are processed has made the problem more constrained and concrete. Factors that have constrained the problem include the procedural representation (i.e.
plans), how operators are used to generate plans, and the number and types of demonstrations provided by the instructor. These additional constraints allow us to define additional requirements that focus on the problem of learning operators. Most of these additional requirements arise from the general requirements to make the instructor's job easier and to maximize the utility of a procedure's few demonstrations. The new requirements are as follows.

Requires very little domain knowledge. Diligent may start with no domain knowledge. This means that the learning algorithm cannot rely on detailed domain knowledge.

Quick competence from few action-examples. Diligent needs to find reasonable preconditions quickly because it may have seen only a few demonstrations. If Diligent can find reasonable preconditions, then the instructor's job should be easier.

Incremental, or appears incremental. The learning algorithm needs to appear incremental for a number of reasons. First, the data arrives incrementally. Second, instructors would be confused if preconditions looked very different each time an operator was updated. Third, because Diligent is interactive, the algorithm cannot perform slow batch processing.

Supports error recovery. Because there needs to be quick competence and because learning is incremental, early preconditions may be incorrect. Thus, Diligent needs to be able to recover from errors that could include both missing and unnecessary preconditions.

Humans can understand the precondition representation. An instructor needs to understand and verify preconditions. Unless the preconditions are concise and explicit, he will not be able to do so. An instructor must also be able to determine whether or not a specific condition is a precondition.

One issue is what representations an instructor could understand.
This is a difficult question because there are degrees of understandability. There is evidence that humans have difficulties with some types of simple logical statements [New90]. Because preconditions are a type of logical statement, we will give the intuitive argument that simpler representations should be easier to understand. We are also going to argue that, to avoid problems, the representation should be as simple as feasible.

As an example, consider turning on a car's engine by turning the key. The preconditions for this might be that the key is in the ignition, the seat belt is fastened, and the door is closed. Two ways of representing these preconditions are shown in Figure 5.1. The conjunctive representation used in a) would be used by Diligent and anecdotally appears similar to what humans would use. In contrast, humans appear unlikely to use b), which might be learned by CDL [She93].

a) (keyLocation ignition) ∧ (seatBelt fastened) ∧ (door closed)
b) (keyLocation ignition) ∧ (¬(seatBelt open) ∨ (door closed))

Figure 5.1: Preconditions for Starting a Car

Important attributes need to be identified. The environment may have hundreds, if not thousands, of attributes, and in a given procedure, most attributes will probably not change value and will probably be irrelevant. Therefore, the learning algorithm needs to help distinguish important attributes from unimportant attributes. In contrast, the learning algorithm could also have required generalizing object classes and replacing attribute values by variables.

Bound a precondition's uncertainty. The instructor should receive some indication of the system's certainty about whether a condition is or is not a precondition. By indicating its confidence in a precondition, Diligent can help focus the instructor's attention on areas of uncertainty.
5.2 Heuristics

The algorithms in this chapter use some of the heuristics from Chapter 3: 1) focus on attributes that change value; 2) the state changes produced by earlier steps are likely to be preconditions of later steps; and 3) favor existing knowledge and hypotheses. This chapter also uses a new heuristic.

Prefer extra preconditions over missing ones. In the algorithms that will be used, it is easier to remove an invalid precondition than to identify a missing precondition. It should also be easier for humans to spot a mistake among a few proposed preconditions than in a large set of unused conditions.

5.3 About this Chapter's Examples

Like the other chapters, this chapter's examples are taken from the HPAC domain. The domain has been simplified in order to illustrate the algorithms. Despite the similarity, the examples in this chapter do not correspond to the extended example of Chapter 4.

5.4 Data Structures

The relevant data structures are the learning algorithm's input and output. The inputs are action-examples and demonstrations, and the outputs are operators.

The action-examples used for learning operators were defined in Section 3.2.1.1. An action-example records the state of the environment before and after an action is performed. The state before the action is called the pre-state, and the state after is called the post-state. The part of the post-state that changes is called the delta-state. States are composed of conjunctive sets of conditions. A condition contains an attribute and its value. For example, the condition (valve1 open) means that attribute valve1 has the value open.

The current demonstration is also used when creating new operators. Demonstrations were defined in Section 4.3. Demonstrations contain a sequence of steps, each of which is associated with an action-example.
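As an illustration, these structures might be sketched as follows. The class and field names are hypothetical, chosen here for readability; they are not Diligent's actual identifiers.

```python
# Sketch: a state is a dict {attribute: value}; an action-example records
# the pre-state and post-state, and the delta-state is derived as the
# conditions whose values actually changed.

class ActionExample:
    def __init__(self, action_id, pre_state, post_state):
        self.action_id = action_id
        self.pre_state = dict(pre_state)
        self.post_state = dict(post_state)

    @property
    def delta_state(self):
        """Post-state conditions whose attributes changed value."""
        return {attr: val for attr, val in self.post_state.items()
                if self.pre_state.get(attr) != val}

ex = ActionExample("turn handle1",
                   pre_state={"valve1": "open", "HandleOn": "valve1"},
                   post_state={"valve1": "shut", "HandleOn": "valve1"})
print(ex.delta_state)   # {'valve1': 'shut'}
```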
The representation of operators was defined in Section 3.2.2.2, but because this chapter focuses on learning operators, we will spend some time discussing and motivating the representation. Operators model how actions performed by the instructor in the environment affect the state of the environment. Operators identify the preconditions necessary for an action to produce a given set of state changes. Because an action may produce different state changes in different states, an operator's preconditions and state changes are described by one or more conditional effects. Each conditional effect (or effect) has its own set of preconditions and state changes. Preconditions and state changes are described by conjunctive sets of conditions. When the preconditions are all satisfied in an action-example's pre-state, the associated state changes should be observed in the post-state.

Let c be a condition.
  c ∈ g-rep ⟹ c ∈ h-rep ∧ c ∈ s-rep
  c ∈ h-rep ⟹ c ∈ s-rep

Let S_G, S_H and S_S be the sets of environment states that satisfy the g-rep, h-rep and s-rep, respectively.
  S_S ⊆ S_H ⊆ S_G

Figure 5.2: Relationship between the Precondition Concepts

Effects have three sets of preconditions (or precondition concepts). In keeping with the terminology used by Wang [Wan96c], the precondition sets are called the s-rep, h-rep and g-rep. However, Wang only used an s-rep and g-rep. The relationship between the precondition sets is shown in Figure 5.2. The most specific precondition, s-rep, is a superset of the other precondition sets. Because the s-rep contains the most conditions, it matches the fewest environment states. The heuristic, best-guess precondition (h-rep) is a subset of the s-rep and matches at least as many environment states as the s-rep. The most general precondition, g-rep, is a subset of the other sets and matches at least as many states as the other sets.
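A minimal sketch of these containment properties, with precondition concepts represented as dicts of conditions (all names here are hypothetical illustrations, not Diligent's identifiers):

```python
# Sketch: a precondition concept is satisfied by a state when every one
# of its conditions holds in that state. Because g-rep ⊆ h-rep ⊆ s-rep
# as condition sets, the sets of satisfying states nest the other way:
# S_S ⊆ S_H ⊆ S_G.

def satisfies(state, concept):
    """True if every condition in the concept holds in the state."""
    return all(state.get(attr) == val for attr, val in concept.items())

s_rep = {"valve1": "open", "valve2": "open", "HandleOn": "valve1"}
h_rep = {"valve1": "open", "valve2": "open"}
g_rep = {"valve1": "open"}

state = {"valve1": "open", "valve2": "shut", "HandleOn": "valve1"}
print(satisfies(state, g_rep))   # True:  g-rep is the most general
print(satisfies(state, h_rep))   # False: h-rep also requires valve2 open
print(satisfies(state, s_rep))   # False: s-rep is the most specific
```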
Although effects have three sets of preconditions, Diligent only uses the h-rep when deriving a plan's step relationships.

Figure 5.3 shows an operator. The operator's action-id indicates that the operator models turning handle handle1. The operator has only one effect, which means that only one set of state changes has been seen. In this case, turning the handle shuts valve1. The h-rep and s-rep contain the g-rep's only condition, (valve1 open), while the s-rep contains a condition, (HandleOn valve1), that is absent from the h-rep and g-rep.

Action-id: turn handle1
Effect:
  Preconditions:
    s-rep (most specific concept): (valve1 open) (valve2 open) (HandleOn valve1)
    h-rep (intermediate, heuristic concept): (valve1 open) (valve2 open)
    g-rep (most general concept): (valve1 open)
  State changes: (valve1 shut)

Figure 5.3: An Operator

5.4.1 Preconditions as a Version Space

Before proceeding, we will discuss the representation of preconditions as three conjunctive concepts.

An obvious question is whether conjunctive concepts can adequately represent preconditions. An examination of more than 30 domains implemented in PRODIGY showed that more than 90% of the operators had only conjunctive preconditions. In the remaining 10%, operators with disjunctive preconditions could be split into multiple operators that have conjunctive preconditions ([Wan96c], page 12). Work on PRODIGY [VCP+95] has tended to focus on using general-purpose operators for planning, while Diligent focuses on learning a few specific procedures and does not generalize operators across multiple objects of the same class. Therefore, Diligent is less likely than the work on PRODIGY to need disjunctive preconditions.

The idea of having three concepts for each precondition (i.e., s-rep, h-rep and g-rep) is based on Mitchell's version spaces [Mit78].
In a version space, there is a most general concept, G, and a most specific concept, S. G and S correspond to Diligent's g-rep and s-rep, respectively. G and S are used to classify whether an example belongs to a category. In our case, the "category" is an effect's state changes. Examples rejected by S do not belong to the category, and examples accepted by both S and G belong to the category. Ideally, training with action-examples should cause S and G to converge to a single concept.

Unfortunately, version space algorithms have had run-time complexity problems. Mitchell's Candidate Elimination algorithm [Mit78, Mit82] learns conjunctive conditions where G and S may each contain multiple sets of hypothesized conditions. Unfortunately, Haussler [Hau88] shows that S and G can have a size that is exponential in the number of training examples. The complexity problems can be partially overcome by using Focusing algorithms [BSP85, YPL77], which learn conjunctive tree-structured concepts.¹ Focusing allows S to be represented as a single conjunctive concept, but G may still contain many candidate concepts. Haussler [Hau88] shows that G is still exponential. An exponential-size G can be avoided by using the INBF algorithm [SR90], a Focusing algorithm that represents G as a single concept because G is conservatively specialized. A key idea of INBF, which Diligent uses, is delaying the use of training examples until they can be used and discarded. More recently, Hirsh, Mishra and Pitt [HMP97] have identified efficient version space algorithms for more general classes of concepts; they avoid complexity problems by not explicitly storing S and G. Instead, they determine whether classifying an example as an instance of the concept is consistent with the training examples.
However, the lack of an explicit G and S prevents the representation from identifying specific attribute values to use as preconditions. Unfortunately, this violates one of our requirements.

Given that there are many version space algorithms, we will select one for comparison. We will look at OBSERVER's algorithms [Wan95, Wan96a, Wan96c] because OBSERVER, like Diligent, learns conjunctive operator preconditions. OBSERVER's algorithm is similar to INBF. However, instead of learning INBF's tree-structured concepts, OBSERVER generalizes its precondition concepts by unifying training examples with operators. This unification results in variables being introduced into the operator.

Unlike OBSERVER, Diligent does not introduce variables into operators through unification. OBSERVER's unification algorithm requires explicit relations between objects and their attributes, but Diligent's unstructured environment does not contain these relations. Additionally, Diligent and OBSERVER have different learning tasks: OBSERVER learns general operators for planning, while Diligent learns a few specified procedures in domains where many objects in a given class (e.g., buttons) may have idiosyncratic behavior. For example, one button may turn on the power, while another starts the motor.

Still, Diligent could have provisionally generalized operators to act on objects of the same class. This generalization could then have been withdrawn if an object was shown to have idiosyncratic behavior. However, in the domains that Diligent has used, too many objects (e.g., buttons and switches) have idiosyncratic behavior for generalization to be an important capability.

¹ In a tree-structured concept, concepts lower in the tree are specializations of concepts higher in the tree. For example, birch and elm are specializations of tree and plant.
Another issue is the convergence of the s-rep and g-rep to a single concept, especially when there are limited numbers of training examples. What if the s-rep and g-rep don't converge? Which one should be used as the precondition? The g-rep is likely to be too general, while the s-rep is likely to be too specific. Choosing between the s-rep and g-rep is especially problematic immediately after the version space is created: the s-rep and g-rep are useless because the s-rep matches only one state and the g-rep matches any state. This issue is complicated by the fact that Diligent is unlikely to get enough examples for the s-rep and g-rep to converge.

To avoid problems with version space convergence, Diligent creates plans using the h-rep, which is a heuristic, best-guess precondition. The h-rep, which is not present in OBSERVER, is more specific than the g-rep and more general than the s-rep. Thus any state that satisfies the s-rep also satisfies the h-rep, and any state that satisfies the h-rep also satisfies the g-rep.

The h-rep serves a number of purposes. The h-rep provides a usable precondition when there isn't enough data to make the s-rep and g-rep usable. The h-rep also provides a working hypothesis to actively investigate. The idea of a working hypothesis is apparent when you view the three precondition concepts as representing sufficient (s-rep), likely (h-rep) and necessary (g-rep) preconditions. Even though Diligent uses the h-rep, the s-rep and g-rep are still valuable. As will be shown, the s-rep and g-rep can be used to detect problems with the learning algorithm. The s-rep and g-rep are also used to add missing conditions to the h-rep.

5.5 Creating a New Operator

Figure 5.4 shows the algorithm for creating a new operator. The current demonstration and the action-example of the new operator's action are used to create the operator's first effect.
On line 2, the g-rep is set to the empty set, and on line 3, the s-rep is set to the action-example's pre-state. Thus, the g-rep is satisfied by any state, and the s-rep is satisfied only by the pre-state. At this point, the g-rep and s-rep are not very useful, but they do bound the uncertainty in the preconditions. On line 4, the initial h-rep is set to the pre-state values of attributes that have changed value during the demonstration.² Line 5 gathers the pre-state conditions of attributes whose value changed in the action-example, and line 6 adds these conditions to the h-rep. Because the h-rep reflects changes during the demonstration, the h-rep is a better initial precondition than either the g-rep or the s-rep. Finally, on line 7, the effect's state changes are set to the action-example's delta-state, which contains the post-state values of attributes whose values were changed by the action.

procedure Create-New-Operator
Given:
  demo: A demonstration.
  ex: An action-example.
Learn:
  op: A new operator.
1. Create operator op with effect eff.
2. g-rep(eff) ← ∅
3. s-rep(eff) ← pre-state(ex)
4. h-rep(eff) ← Compute-Changes-in-Demo with demo and pre-state(ex).
   (This identifies attributes that have already changed value in the demonstration.)
5. h-rep-cand ← conditions in pre-state(ex) that have the same attributes as conditions in delta-state(ex).
   (Each condition c1 such that c1 ∈ pre-state(ex) and there exists a condition c2 ∈ delta-state(ex) where attribute(c1) = attribute(c2).)
6. h-rep(eff) ← h-rep(eff) ∪ h-rep-cand
7. state-changes(eff) ← delta-state(ex)

Figure 5.4: Algorithm for Creating a New Operator

² The algorithm for Compute-Changes-in-Demo is in Section 4.7.4.
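A runnable sketch of this construction, under simplified assumptions: states are dicts, the demonstration's changes-so-far are passed in directly rather than computed by Compute-Changes-in-Demo, and all names are hypothetical.

```python
# Sketch of Create-New-Operator: build an operator with one effect whose
# g-rep is empty, s-rep is the full pre-state, and h-rep combines the
# demonstration's prior state changes with the pre-state values of the
# attributes the action changed.

def create_new_operator(changes_in_demo, pre_state, post_state):
    delta = {a: v for a, v in post_state.items() if pre_state.get(a) != v}
    h_rep = {a: pre_state[a] for a in changes_in_demo if a in pre_state}
    # Add the pre-state values of attributes the action itself changed.
    h_rep.update({a: pre_state[a] for a in delta})
    return {"g_rep": {},                 # satisfied by any state
            "h_rep": h_rep,              # heuristic best guess
            "s_rep": dict(pre_state),    # satisfied only by the pre-state
            "state_changes": delta}

op = create_new_operator(
    changes_in_demo={"HandleOn"},        # (HandleOn valve1) changed earlier
    pre_state={"valve1": "open", "valve2": "open",
               "valve3": "open", "HandleOn": "valve1"},
    post_state={"valve1": "shut", "valve2": "open",
                "valve3": "open", "HandleOn": "valve1"})
print(op["h_rep"])           # {'HandleOn': 'valve1', 'valve1': 'open'}
print(op["state_changes"])   # {'valve1': 'shut'}
```

On the input of Figure 5.5, this sketch reproduces the operator of Figure 5.6: the h-rep contains (HandleOn valve1) from the demonstration's earlier state change and (valve1 open) from the changed attribute's pre-state value.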
This approach has some similarity with the method used by Instructo-Soar [HL95] to induce the conditions under which an action should be performed. Instructo-Soar looks at two groups of conditions: the first group contains the attributes whose values were changed by the action, and the second group contains relations between the objects being acted upon and the objects associated with the procedure's goal conditions. In contrast, the preconditions of Diligent's operators attempt to model the environment in a way that is independent of a given step or procedure. That is why Diligent doesn't need the procedure's goals when learning preconditions, and that is why Diligent looks at the state changes of the demonstration's earlier steps, which are likely to be preconditions of later steps.

Demonstration: demo1, with the following state change earlier in the demonstration:
  (HandleOn valve1)
Action-example:
  Action-id: turn handle1
  Pre-state: (valve1 open) (valve2 open) (valve3 open) (HandleOn valve1)
  Delta-state: (valve1 shut)

Figure 5.5: Input for Creating a New Operator

Figure 5.5 shows the input for creating a new operator, and Figure 5.6 shows the resulting operator. In Figure 5.5, the only state change from earlier steps in the demonstration is that the handle was moved to valve1 ((HandleOn valve1)); this condition is added to the new operator's h-rep by line 4 of Figure 5.4. Additionally, the only attribute in the delta-state (valve1) has its pre-state condition (valve1 open) added to the h-rep by line 5 of Figure 5.4.

Operator: turn-handle
Action-id: turn handle1
Effect 1:
  Preconditions:
    g-rep: ∅
    h-rep: (valve1 open) (HandleOn valve1)
    s-rep: (valve1 open) (valve2 open) (valve3 open) (HandleOn valve1)
  State changes: (valve1 shut)

Figure 5.6: A New Operator
5.6 Positive and Negative Examples

Because Diligent may receive little input, it needs to learn quickly. One way of learning faster is to learn from both success and failure. Success means that an action produces the desired result, and failure means that an action doesn't produce the desired result. Diligent learns from success and failure by comparing an operator's effects to action-examples. To use an action-example, the action-example's action-id is matched with the operator that models that action. The action-example is only used to refine that one operator.

To refine one of the operator's effects with the action-example, Diligent uses the common machine learning technique of classifying each action-example as either a positive or negative training example. A positive example contains the effect's state changes in its delta-state, and a negative example does not contain the effect's state changes in its post-state. It is indeterminate whether an action-example should be classified as positive or negative if the action-example contains the effect's state changes in both its pre-state and post-state. It is indeterminate because it is unknown whether the action did not change the attributes in the effect's state changes or whether the action did change the values but back to their pre-state values.

Desired state change: (valve1 open)
  Positive example:  Pre-state: (valve1 closed)  Post-state: (valve1 open)    Delta-state: (valve1 open)
  Negative example:  Pre-state: (valve1 closed)  Post-state: (valve1 closed)  Delta-state: ∅
  Indeterminate:     Pre-state: (valve1 open)    Post-state: (valve1 open)    Delta-state: ∅

Figure 5.7: Some Positive and Negative Examples

Figure 5.7 illustrates how to classify examples for an effect that opens valve1. In the negative example, the attribute in the effect's state change
Further reproduction prohibited without permission. (valvel) doesn’t have the desired value in the action-exampie’s post-state. In indetermi nate example, the attrib u te has the desired value in both the pre-state and post-state, and Diligent only looks a t a ttrib u te s th a t clearly changed value (i.e. are in the delta-state). 5.7 Refining Preconditions Once action-exam pies have been classified, Diligent uses the techniques of Incremental Non-Backtracking Focusing (IN BF) [SR90] to generalize precondition concepts with pos itive examples and specialize precondition concepts with negative exam ples. The most specific concept (s-rep) is generalized if it incorrectly classifies a positive example. The s-rep is generalized by rem oving attrib u tes whose pre-state values d o n ’t m atch the values in the s-rep. The most general concept (g-rep) can be specialized if the g-rep incorrectly classifies a negative exam ple. The g-rep is specialized by adding a condition from the s-rep whose attribute has a different value in the s-rep than in the negative example's pre-state. Because the g-rep now contains an additional condition, it can correctly classify the example as negative. Because of the difficulty identifying which condition to add, the g-rep is only updated if th ere is a near-miss between the s-rep and the negative example. There is a near-miss when only one s-rep condition does not match the negative example’s pre-state. Requiring a near-m iss is a conservative approach th a t only adds conditions to the g-rep when they have been shown to be necessary. Because a negative example may not be a near-miss, a negative example is kept until it achieves a near-m iss or the g-rep correctly classifies it as negative. In a sim ilar manner, the h-rep can be generalized like the s-rep or specialized like the g-rep. Because the g-rep and s-rep provide an upper and lower bound for the h-rep, the h-rep doesn’t have to be updated as conservatively as the g-rep and s-rep. 
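The classification rule of Section 5.6 can be sketched as follows. This is a minimal illustration, not Diligent's implementation; the dict-based state representation is an assumption.

```python
# A minimal sketch of classifying an action-example for one effect.
# States are dicts mapping attribute names to values; an effect's
# state changes are (attribute, value) pairs.

def classify(effect_changes, pre_state, delta_state, post_state):
    """Return 'positive', 'negative', or 'indeterminate' for one effect."""
    if all(delta_state.get(a) == v for a, v in effect_changes):
        return "positive"            # desired changes clearly observed
    in_pre = all(pre_state.get(a) == v for a, v in effect_changes)
    in_post = all(post_state.get(a) == v for a, v in effect_changes)
    if in_pre and in_post:
        # Value may have been untouched, or changed and changed back.
        return "indeterminate"
    return "negative"                # desired result not produced
```

Running this on the three examples of Figure 5.7 yields the labels shown there.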
The g-rep and s-rep are conservatively updated because they represent the most general and most specific candidate preconditions. Because the g-rep is only specialized and the s-rep is only generalized, changes to the g-rep and s-rep cannot be undone. In contrast, the h-rep has a capacity for error recovery, since it can be both specialized and generalized. Error recovery may be necessary for the h-rep because it only represents a "best" working hypothesis.

In the following sections, we will look at refining preconditions with positive and negative examples.

procedure Refine-Positive-Example
Given:
  op: An operator.
  eff: An effect of op.
  ex: An action-example that is a positive example of eff.
Learn: Refined preconditions for eff.
1. collapse-list ← ∅
2. diff ← s-rep conditions whose attributes have different values than in the action-example's pre-state. (The conditions in diff appear unnecessary.)
3. For each condition cond ∈ diff,
   a) If cond is in the effect's g-rep, then add cond to collapse-list.
4. If collapse-list ≠ ∅ then
   a) The version space has collapsed, and the elements of collapse-list appear to be incorrect. Ask the instructor to update the preconditions of eff using collapse-list.
   b) Return
5. Remove from s-rep any conditions contained in diff.
6. Remove from h-rep any conditions contained in diff.
7. For all unused negative examples neg-ex of eff,
   a) Use Refine-Negative-Example on op and eff with neg-ex. (Because conditions have been removed from s-rep, there are fewer conditions that could distinguish positive and negative examples.)

Figure 5.8: Refining Preconditions with a Positive Example

5.7.1 Refining Preconditions with Positive Examples

Positive action-examples are used to remove unnecessary conditions from an effect's precondition concepts.
The algorithm for refining an effect's preconditions with a positive action-example is shown in Figure 5.8.³ To process an example, we need to identify unnecessary preconditions. This is done on line 2, which identifies conditions in the most specific precondition concept (s-rep) that do not match the pre-state of the action-example.⁴ The unnecessary conditions from line 2 are removed from the s-rep and h-rep on lines 5 and 6.

³For clarity, some minor efficiency improvements have been removed.
⁴We assume that no attributes were added to or removed from the state.

Besides removing unnecessary conditions, we need to check that the preconditions are consistent with the training data; this is done in lines 3 and 4. A key idea is that the g-rep's conditions have already been shown to be necessary. Line 3 identifies necessary conditions that now appear unnecessary, and line 4 indicates an interaction with the instructor to correct the problem. When a condition is shown to be both necessary and unnecessary, the version space is said to collapse. There are several reasons for a version space to collapse: 1) the instructor introduced errors when editing preconditions; 2) Diligent cannot see a necessary environment attribute; or 3) the precondition needs to be represented as a disjunction of conjunctive conditions. All three of these cases need further interaction with the instructor and are beyond the scope of our present discussion.

After unnecessary preconditions have been removed by lines 5 and 6, the differences between the preconditions and negative examples might be smaller. For this reason, the effect is checked against negative examples that previously produced far-misses (line 7). A far-miss indicates that two or more attributes in the s-rep have different values than in the action-example's pre-state.
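The core of Figure 5.8 can be sketched as follows. This is a hedged sketch, not Diligent's code: the set-of-pairs representation is an assumption, and the re-check of stored negative examples (line 7) is omitted.

```python
# A simplified sketch of refining an effect's preconditions with a
# positive example (Figure 5.8).  Reps are sets of (attribute, value)
# conditions; the pre-state maps attributes to values.

def refine_positive(g_rep, h_rep, s_rep, pre_state):
    """Return (collapse_list, g_rep, h_rep, s_rep) after one positive example."""
    # Line 2: s-rep conditions whose attribute has a different value here.
    diff = {(a, v) for a, v in s_rep if pre_state.get(a, v) != v}
    # Lines 3-4: a g-rep condition that now appears unnecessary collapses
    # the version space; the instructor must resolve it.
    collapse_list = sorted(diff & g_rep)
    if collapse_list:
        return collapse_list, g_rep, h_rep, s_rep
    # Lines 5-6: drop the unnecessary conditions from s-rep and h-rep.
    return [], g_rep, h_rep - diff, s_rep - diff
```

On the data of Figure 5.9 below, diff is {(valve3 open)}, no collapse occurs, and the condition is removed from both the h-rep and s-rep.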
Positive example pre-state: (valve1 open) (valve2 open) (valve3 shut) (HandleOn valve1) (AlarmLight1 off)

Preconditions before:
  g-rep: (valve1 open)
  h-rep: (valve1 open) (valve3 open) (HandleOn valve1)
  s-rep: (valve1 open) (valve2 open) (valve3 open) (HandleOn valve1)

Preconditions after:
  g-rep: (valve1 open)
  h-rep: (valve1 open) (HandleOn valve1)
  s-rep: (valve1 open) (valve2 open) (HandleOn valve1)

Figure 5.9: Using a Positive Example

The algorithm for processing positive examples (Figure 5.8) is illustrated by Figure 5.9. Line 2 computes the set of differences between the s-rep and the action-example (diff). In this case, the set contains (valve3 open) but not (AlarmLight1 off). The condition (AlarmLight1 off) is ignored because the s-rep doesn't contain a condition involving the attribute AlarmLight1. The version space would collapse (lines 3 and 4) only if (valve1 open) was not in the action-example's pre-state. On lines 5 and 6, the condition (valve3 open) is removed from the s-rep and h-rep.

5.7.2 Refining Preconditions with Negative Examples

Before proceeding, we will discuss potentially needed conditions, which are derived from INBF [SR90]. Potentially needed conditions are defined in Figure 5.10.⁵ At least one of the potentially needed conditions must distinguish a given negative example from positive examples. In the HPAC domain, there are several dozen conditions in an action-example's pre-state, but usually only a few potentially needed conditions. There are so few conditions because Diligent focuses on learning the procedures specified by the instructor rather than exploring the environment, and this creates a tendency for positive and negative examples to have similar pre-states.
Potentially-Needed-Conditions
  = s-rep conditions whose attributes have different values in an action-example's pre-state
  = { c1 | c1 ∈ s-rep ∧ c2 ∈ pre-state ∧ attribute(c1) = attribute(c2) ∧ value(c1) ≠ value(c2) }

Figure 5.10: Potentially Needed Conditions

Negative examples are used to add conditions to an effect's preconditions. This is done by looking for a one-condition, or near-miss, mismatch between the s-rep and an action-example's pre-state. The algorithm is shown in Figure 5.11⁶ and will be illustrated by the action-examples in Figure 5.12. Note that action-examples are added to a set of unused negative examples (line 2) and then removed when nothing more can be learned from them (lines 4a and 6b).⁷

⁵Instead of potentially needed conditions, INBF used potentially guilty conditions, which contain conditions from the example's pre-state rather than the effect's s-rep.
⁶In Diligent, incrementally storing and updating potentially needed conditions greatly reduced the number of conditions checked. However, for clarity, these simple changes to the algorithms are not shown.
⁷Because the s-rep is used to identify potentially needed conditions, both the s-rep and g-rep are necessary for identifying missing preconditions.

procedure Refine-Negative-Example
Given:
  op: An operator.
  eff: An effect of op.
  ex: An action-example of eff.
Learn: Refined preconditions for eff.
1. If state-changes(eff) ⊆ post-state(ex) then (This is true when state-changes(eff) ⊆ pre-state(ex).)
   a) return (The example should be classified as indeterminate rather than negative.)
2. Add ex to the set of unused negative examples of eff. (Keep ex until it is rejected by the g-rep(eff).)
3. needed-cond ← Potentially-Needed-Conditions of ex for eff (These conditions distinguish ex from positive examples.)
4.
If needed-cond ∩ g-rep(eff) ≠ ∅ then
   a) Nothing can be learned from ex because g-rep(eff) classifies it as negative. Remove ex from the set of unused negative examples.
   b) return
5. If needed-cond = ∅ then
   a) collapse-list ← conditions in eff's original s-rep that are not in the current s-rep.
   b) Some of the conditions in collapse-list are required; ask the instructor to update the preconditions of eff using collapse-list.
   c) return
6. If needed-cond has only one condition then
   a) Add the condition to eff's g-rep and h-rep.
   b) Nothing more can be learned from ex. Remove it from the set of unused negative examples.
   c) return
7. If needed-cond ∩ h-rep(eff) ≠ ∅ then
   a) return (h-rep(eff) classifies ex as negative, but we are uncertain which conditions distinguish ex from positive examples.)
8. h-rep(eff) classifies ex as a positive example. Attempt to refine h-rep(eff) with ex by invoking Discriminate-With-Other-Effects.

Figure 5.11: Refining Preconditions with a Negative Example

Effect:
  State changes: (valve1 shut)

Action-example 1:
  Pre-State: (valve1 shut) (valve2 open) (valve3 open) (HandleOn valve1)
  Delta-State: (valve2 shut)

Action-example 2:
  Pre-State: (valve1 open) (valve2 open) (valve3 open) (HandleOn valve2)
  Delta-State: (valve2 shut)

Action-example 3:
  Pre-State: (valve1 open) (valve2 open) (valve3 shut) (HandleOn valve2)
  Delta-State: (valve2 shut)

Action-example 4:
  Pre-State: (valve1 shut) (valve2 shut) (valve3 open) (HandleOn valve1)
  Delta-State: (valve1 open)

Preconditions before:
  g-rep: ∅
  h-rep: (valve1 open)
  s-rep: (valve1 open) (valve2 open) (valve3 open) (HandleOn valve1)

Preconditions after:
  g-rep: (HandleOn valve1)
  h-rep: (valve1 open) (HandleOn valve1)
  s-rep: (valve1 open) (valve2 open) (valve3 open) (HandleOn valve1)

Figure 5.12: Using Negative Examples
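The core tests of Figure 5.11 can be sketched compactly. This is a hedged sketch, not Diligent's code: the indeterminate check (line 1), the collapse case (line 5), and the call to Discriminate-With-Other-Effects (line 8) are omitted, and the set-of-pairs representation is an assumption.

```python
# A simplified sketch of refining one effect with a negative example.
# Reps are sets of (attribute, value) conditions; pre_state maps
# attributes to values.

def potentially_needed(s_rep, pre_state):
    """s-rep conditions whose attribute has a different value in pre_state."""
    return {(a, v) for a, v in s_rep
            if a in pre_state and pre_state[a] != v}

def refine_negative(g_rep, h_rep, s_rep, pre_state):
    """Return (g_rep, h_rep, keep), where keep says to store the example."""
    needed = potentially_needed(s_rep, pre_state)
    if needed & g_rep:
        return g_rep, h_rep, False        # g-rep already rejects the example
    if len(needed) == 1:                  # a near-miss: condition is necessary
        return g_rep | needed, h_rep | needed, False
    # Far-miss: keep the example; it may become a near-miss later.
    return g_rep, h_rep, True
```

Feeding action-examples 2, 3 and 4 of Figure 5.12 through this sketch reproduces the "Preconditions after" shown there: example 2 adds (HandleOn valve1), example 3 is rejected by the g-rep, and example 4 is retained for later use.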
Action-example 1 is rejected by line 1 of the algorithm because Diligent cannot determine whether it is a negative or positive example. The effect's state change, (valve1 shut), is satisfied in the action-example's pre-state and post-state. Diligent cannot determine whether attribute valve1's value was constant or was changed back to the attribute's pre-state value. Because Diligent cannot correctly classify the action-example as either positive or negative, using the action-example could introduce errors into the effect's preconditions.

Action-example 2 adds a condition to the g-rep and h-rep. The preconditions before and after processing the action-example are shown at the bottom of Figure 5.12. On line 3, only one potentially needed condition is found ((HandleOn valve1)). Since the condition is not part of the g-rep, the g-rep misclassifies the example as positive (line 4). Since there is only one potentially needed condition, line 6 specializes the g-rep and h-rep by adding the condition to them. At this point, the algorithm cannot learn anything more from the action-example, and the action-example is removed from the set of unused negative examples.

Action-example 3 is rejected because the g-rep correctly classifies it as a negative example. Line 3 identifies the potentially needed conditions ({(valve3 open) (HandleOn valve1)}). Line 4 then checks if any of these conditions are in the g-rep. One of the conditions ((HandleOn valve1)) is in the g-rep. At this point, the action-example is rejected because nothing can be learned from it. On line 4a, the action-example is removed from the set of unused negative examples.

Action-example 4 is rejected by the h-rep but not by the g-rep. On line 3, two potentially needed conditions ({(valve1 open) (valve2 open)}) are found.
The test on line 4 fails because neither condition is in the g-rep.⁸ Because there is more than one potentially needed condition, no condition is added to the g-rep and h-rep (line 6). Finally, on line 7, the action-example is rejected because the h-rep condition (valve1 open) is also one of the potentially needed conditions. However, unlike action-examples 2 and 3, action-example 4 remains in the set of unused negative examples because it can still be used for identifying preconditions as necessary (i.e., in the g-rep).

Line 5 deals with the collapse of the version space. The version space collapses when a condition needed for distinguishing between positive and negative examples has been shown to be unnecessary. The reasons for a collapse were discussed in Section 5.7.1.

⁸Since the effect's state change is (valve1 shut), one might expect (valve1 open) to be in the g-rep. However, condition (valve1 open) hasn't been shown to be necessary, and attribute valve1 potentially could have many values.

Line 8 is used when the h-rep misclassifies a negative example as positive. In this case, the h-rep will be compared to the preconditions of the operator's other effects. Although a condition might be added to the h-rep, no condition will be added to the g-rep. This processing will be discussed in the next section.

5.7.2.1 Discriminating Between Effects

There is an additional opportunity to learn when an effect's h-rep misclassifies a negative example as positive. In this situation, the action-example always has at least two potentially needed conditions, but none of them are in the h-rep. At least one of these potentially needed conditions should be in the h-rep. Fortunately, the operator's other effects are likely to have similar preconditions.
That is because the preconditions need to differentiate between situations where different state changes are observed, especially when two effects cause the same attribute to have different values. For example, consider a button that toggles whether the power is on or off. The preconditions for turning the power on need to reject every pre-state where pressing the button will turn the power off.

This means that we might identify a precondition by examining the preconditions of other effects. In particular, we are interested in incompatible effects. Two effects are incompatible if they have a state change for the same attribute but with different values. Comparing incompatible effects is a reasonable approach because their preconditions must differentiate their state changes.

When comparing incompatible effects, Diligent requires the action-example to be positive for one incompatible effect and negative for the other. We will call the effect with the negative example N and the effect with the positive example P. Diligent adds a condition to effect N's h-rep when there is a near-miss between N's potentially needed conditions and effect P's preconditions. Requiring a near-miss provides more evidence for the condition; without this evidence, an attribute in one effect's preconditions might get unnecessarily added to all the others. When looking for a near-miss, Diligent checks all three of P's precondition concepts (i.e., s-rep, h-rep and g-rep).

The algorithm in Figure 5.13 will be illustrated with the action-example in Figure 5.14. Recall from the previous section that procedure Refine-Negative-Example invokes procedure Discriminate-With-Other-Effects when the h-rep misclassifies a negative example as positive.
Unless the instructor has edited the preconditions, this can only happen if an attribute in the effect's state changes can take three or more values because, by default, the pre-state values of attributes in the state change are in the h-rep.

procedure Discriminate-With-Other-Effects
Given:
  op: An operator.
  eff: An effect of op.
  ex: An action-example that is a negative example of eff.
Result: Refine effect eff's h-rep.
1. For each incompatible effect (incomp-eff) of effect eff for operator op,
   (Effect incomp-eff is incompatible when there exist conditions c1 and c2 such that c1 ∈ state-change(eff) ∧ c2 ∈ state-change(incomp-eff) ∧ attribute(c1) = attribute(c2) ∧ value(c1) ≠ value(c2).)
   a) If ex is a positive example of incomp-eff, then attempt to refine h-rep(eff) with Discriminate-Between-Effects.

procedure Discriminate-Between-Effects
Given:
  eff: An effect.
  incomp-eff: An effect that is incompatible with eff.
  ex: A negative example of eff and a positive example of incomp-eff.
  cands: Candidate conditions for h-rep(eff). These are the potentially needed conditions of ex for eff.
Result: Refine effect eff's h-rep.
2. For each of incomp-eff's precondition concepts (rep) (i.e., s-rep, h-rep or g-rep), do the following:
   a) Find all conditions in cands that are not in rep, but have a common attribute with a condition in rep. Call this set cands2.
      cands2 ← { c1 | c1 ∈ cands ∧ ∃ c2 ∈ rep where attribute(c1) = attribute(c2) ∧ value(c1) ≠ value(c2) }
   b) If cands2 contains one condition,
      i) Add the condition to eff's h-rep.
      ii) Return

Figure 5.13: Discriminating Between Effects
Action-example:
  Pre-state: (valve1 open) (valve2 shut) (pressure high) (status test)
  Delta-state: (status halted)

Incompatible effect:
  State changes: (status halted)
  Preconditions:
    g-rep: ∅
    h-rep: (pressure high) (status test)
    s-rep: (valve1 open) (valve2 shut) (pressure high) (status test)

Effect:
  State changes: (status normal)
  Preconditions before:
    g-rep: ∅
    h-rep: (valve1 open)
    s-rep: (valve1 open) (valve2 open) (pressure normal)
  Preconditions after:
    g-rep: ∅
    h-rep: (valve1 open) (pressure normal)
    s-rep: (valve1 open) (valve2 open) (pressure normal)

Figure 5.14: An Example of Discriminating Between Effects

The h-rep condition can then be removed if the attribute has a different pre-state value in a positive example.

The procedure Discriminate-With-Other-Effects (Figure 5.13) first identifies incompatible effects that merit further processing. Line 1 finds all incompatible effects for which the given action-example is a positive example. For these incompatible effects, procedure Discriminate-Between-Effects is invoked. In our example (Figure 5.14), only one appropriate incompatible effect is found.

In procedure Discriminate-Between-Effects, the potentially needed conditions (cands) of the first effect (eff) are compared against the preconditions of the incompatible effect (incomp-eff). In the example, the potentially needed conditions are {(valve2 open) (pressure normal)}. When the needed conditions are checked against the s-rep of the incompatible effect (line 2a), both potentially needed conditions match. Because checking the s-rep failed, the h-rep is checked. In this case, the h-rep and the potentially needed conditions have a one-condition match. This condition, (pressure normal), is then added to effect eff's h-rep. The updated effect is shown in the lower right portion of Figure 5.14.
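The search for a one-condition match in Figure 5.13 can be sketched as follows. This is a hedged sketch, not Diligent's code; the list-of-reps calling convention is an assumption.

```python
# A sketch of Discriminate-Between-Effects (Figure 5.13): look for a
# one-condition (near-miss) overlap between the negative example's
# potentially needed conditions and one of the incompatible effect's
# precondition concepts.  Conditions are (attribute, value) pairs.

def discriminate_between_effects(cands, incomp_reps):
    """cands: potentially needed conditions for the negative effect.
    incomp_reps: the incompatible effect's reps, checked in order
    (s-rep, h-rep, g-rep).  Return the condition to add, or None."""
    for rep in incomp_reps:
        rep_attrs = dict(rep)
        # Line 2a: conditions in cands whose attribute appears in rep
        # with a different value.
        cands2 = {(a, v) for a, v in cands
                  if a in rep_attrs and rep_attrs[a] != v}
        if len(cands2) == 1:          # line 2b: a near-miss is enough evidence
            return cands2.pop()
    return None
```

On the data of Figure 5.14, the s-rep check yields two candidates (no decision), and the h-rep check isolates (pressure normal).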
Another system that compares the preconditions of different state changes is LIVE [She93, She94], but its algorithm is inappropriate for Diligent. LIVE's learning algorithm, Complementary Discrimination Learning (CDL), corrects for the misclassification of an action-example by adding additional conditions to a potentially complicated set of disjunctive preconditions. Unfortunately, a complicated precondition can be created when a simple one could have expressed the same concept. A problem with CDL is that it creates both disjuncts and negated preconditions. A negated precondition indicates that an attribute cannot have a given value. For example, a normal condition may indicate that valve1 is shut, while a negated condition might indicate that valve1 is not open. Because preconditions may be unnecessarily complicated, the preconditions may not be suitable for teaching and may not seem reasonable to a human instructor.⁹

⁹Figure 5.1 contrasted two preconditions for the same effect. The simple one used Diligent's representation, and the complicated one is typical of what CDL would learn.

5.8 Putting it all Together

So far we have discussed how to create an operator and its first effect. We have also discussed how to refine an existing effect with positive and negative examples. However, we have not discussed the higher-level processing that deals with operators and action-examples.

The next few sections discuss using an action-example to refine an operator. First, we will cover the high-level processing that determines how to treat each effect. Second, we will discuss adding a new effect to an existing operator. Third, we will discuss splitting an effect with multiple state changes into two effects.
5.8.1 Determining How to Process Effects

A comparison between an action-example's delta-state and the state changes of the operator's effects determines the type of processing performed on the action-example. If the state changes match an effect's delta-state, the action-example is positive for that effect; if the state changes and delta-state don't match, the action-example is negative or indeterminate. However, the match might only be partial. Additionally, some of the delta-state's conditions may not match any effect's state changes. These cases need to be taken into account. Operators are refined by procedure Refine-Operator (Figure 5.15).

For an operator to properly model an action-example, the operator needs to predict all the action-example's delta-state conditions. Diligent does this by matching each delta-state condition with some effect's state changes. Initially, all delta-state conditions are added to the set of unmatched delta-state conditions (the set delta on line 1). As each effect is processed, any delta-state conditions that match the effect's state changes are removed from the set of unmatched conditions (line 2b). Finally, if any conditions remain unmatched, a new effect is created that has the unmatched conditions as its state changes (line 3).

To discuss the processing of an effect, we will use the data in Figure 5.16. On line 1 (Figure 5.15), the initially unmatched delta-state conditions are {(valve1 shut) (AlarmLight1 on) (AlarmLight3 on)}. Consider effect 1: all its state changes match the delta-state (line 2a), so the example is positive (line 2c). Consider effect 2: none of its state changes match the delta-state, so the example is negative or indeterminate (line 2d). Consider effect 3: some of its state changes match ((AlarmLight1 on)) and some do not ((AlarmLight2 on)), so the effect is split into two effects (line 2e).
Finally, one delta-state condition ((AlarmLight3 on)) is unmatched by any effect. In this case, Diligent creates a new effect for the unmatched condition (line 3).

procedure Refine-Operator
Given:
  op: An operator.
  ex: An action-example for the operator.
Result: Refine operator op.
1. delta ← action-example ex's delta-state
2. For each effect eff of operator op,
   a) Identify conditions in eff's state changes that match (meff) and do not match (feff) the action-example ex.
      meff ← state-changes(eff) ∩ delta
      feff ← state-changes(eff) \ meff
   b) Remove each condition from delta that matches one of the effect's (eff) state changes.
   c) If all state changes match the action-example (feff = ∅),
      i) Refine effect eff with positive example ex by invoking Refine-Positive-Example.
   d) Else if no state changes match the action-example (meff = ∅),
      i) Example ex is either negative or indeterminate. Refine effect eff with example ex by invoking Refine-Negative-Example.
   e) Else the action-example only matches some state changes,
      i) Split the effect eff in two with Split-Effect. Use the action-example ex and the matching (meff) and mismatching (feff) state changes.
3. If some conditions in the action-example's delta-state haven't been matched (delta ≠ ∅),
   a) Create a new effect by invoking Create-New-Effect and using the action-example ex and the unused delta-state conditions delta.

Figure 5.15: Refining an Operator with an Example
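The partitioning step of Figure 5.15 can be sketched compactly. This is a hedged sketch, not Diligent's code: the calls to the refinement procedures are replaced by returned labels, and the set-of-pairs representation is an assumption.

```python
# A sketch of how Refine-Operator partitions an action-example's
# delta-state among an operator's effects.

def assign_delta(effects, delta_state):
    """effects: list of sets of (attribute, value) state changes.
    Return (labels, unmatched): one label per effect, plus leftover
    delta-state conditions that would seed a new effect (line 3)."""
    delta = set(delta_state.items())
    labels = []
    for changes in effects:
        matched = changes & delta
        delta -= matched                        # line 2b of Figure 5.15
        if matched == changes:
            labels.append("positive")           # line 2c
        elif not matched:
            labels.append("negative-or-indet")  # line 2d
        else:
            labels.append("split")              # line 2e
    return labels, delta
```

On the data of Figure 5.16 this labels the three effects positive, negative-or-indeterminate, and split, leaving (AlarmLight3 on) to seed a new effect.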
Action-example:
  Pre-state: (valve1 open) (AlarmLight1 off) (AlarmLight2 off) (AlarmLight3 off)
  Delta-state: (valve1 shut) (AlarmLight1 on) (AlarmLight3 on)

Effect 1: State Changes: (valve1 shut)
Effect 2: State Changes: (valve1 open)
Effect 3: State Changes: (AlarmLight1 on) (AlarmLight2 on)

Figure 5.16: An Example for Assigning Delta-State Conditions to Effects

5.8.2 Adding a New Effect

In previous sections, we have discussed how to create operators and refine them with action-examples, but we have not discussed adding new effects to existing operators. A new effect is added when some condition in an action-example's delta-state matches no condition in any existing effect's state changes.

When creating a new effect, the assumptions used to create the first effect's preconditions may be inappropriate. For instance, the action-example might occur while Diligent is performing an experiment rather than during a carefully constructed demonstration. During a demonstration, an instructor is likely to group related steps together so that earlier steps establish preconditions of later steps. In contrast, a new effect might be seen during an experiment because a precondition of an existing effect was not satisfied. Fortunately, the preconditions of existing effects are good sources of knowledge because they have probably undergone some refinement. Therefore, when an operator already has an effect, Diligent uses the knowledge already contained in the operator rather than the state changes of previous steps. The algorithm for creating the new effect is shown in Figure 5.17 and will be discussed in the next few paragraphs.

procedure Create-New-Effect
Given:
  op: An operator.
  ex: An action-example of that operator.
  delta: A set of state changes.
Result: Create a new effect for operator op.
1. For operator op, create a new effect new-eff.
2. Set the effect's state changes to delta.
3. Since action-example ex is new-eff's first positive example, initialize the version space bounds with ex.
   s-rep(new-eff) ← pre-state(ex)
   g-rep(new-eff) ← ∅
4. Find the operator's earlier action-example (similar-ex) that is most similar to ex. Similarity is measured by the fewest differences between action-example pre-states.
5. Find the conditions (h-rep1) in ex's pre-state that are different than conditions in similar-ex's pre-state.
   h-rep1 ← { c1 | c1 ∈ pre-state(ex) ∧ c2 ∈ pre-state(similar-ex) ∧ attribute(c1) = attribute(c2) ∧ value(c1) ≠ value(c2) }
6. Select an earlier effect (old-ce) whose h-rep will be used to help initialize the h-rep for new-eff. (Diligent chooses the operator's first effect.)
7. Create a partial h-rep (h-rep2) by making the earlier effect's (old-ce) h-rep consistent with the action-example's (ex) pre-state.
   h-rep2 ← { c1 | c1 ∈ pre-state(ex) ∧ c2 ∈ h-rep(old-ce) ∧ attribute(c1) = attribute(c2) }
8. Initialize the new effect's best-guess precondition concept (h-rep).
   h-rep(new-eff) ← h-rep1 ∪ h-rep2
9. For each previous action-example (old-ex) of the operator,
   a) Refine the new effect new-eff by invoking Refine-Negative-Example with action-example old-ex.

Figure 5.17: Creating a New Effect

The new effect's most general (g-rep) and most specific (s-rep) precondition concepts are initialized with the same method as the operator's first effect. Diligent uses the same method because incorrect conditions cannot be removed from the g-rep and missing conditions cannot be added to the s-rep. Thus, the initial g-rep contains no conditions, and the initial s-rep contains every condition in the action-example's pre-state (line 3).
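The h-rep initialization of lines 4-8 can be sketched as follows. This is a hedged sketch, not Diligent's code: the difference-count similarity measure is our reading of "fewest differences between action-example pre-states", and the dict/tuple representation is an assumption.

```python
# A sketch of initializing a new effect's h-rep (Figure 5.17, lines 4-8):
# conditions that differ from the most similar earlier example (h-rep1),
# plus pre-state conditions whose attributes appear in an earlier
# effect's h-rep (h-rep2).  States are attribute->value dicts.

def init_new_effect_h_rep(pre_state, earlier_pre_states, old_h_rep):
    # Line 4: the earlier example whose pre-state differs least from ours.
    similar = min(earlier_pre_states,
                  key=lambda p: sum(p.get(a) != v for a, v in pre_state.items()))
    # Line 5: conditions distinguishing ex from that earlier example.
    h_rep1 = {(a, v) for a, v in pre_state.items()
              if a in similar and similar[a] != v}
    # Line 7: reuse the attributes of the earlier effect's h-rep.
    old_attrs = {a for a, _ in old_h_rep}
    h_rep2 = {(a, v) for a, v in pre_state.items() if a in old_attrs}
    return h_rep1 | h_rep2              # line 8
```

On the data of Figure 5.18 this produces the new effect's h-rep shown there.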
The initialization of the h-rep exploits knowledge of earlier action-examples and other effects by finding similarities and differences. Although the current action-example is positive for the new effect, all earlier action-examples are negative. Because the h-rep needs to distinguish between positive and negative examples, conditions that distinguish the current action-example from the closest negative example are likely preconditions (lines 4 and 5).

The initialization of the h-rep also exploits knowledge of other effects by finding similarities between them and the current action-example. Because the preconditions of different effects need to distinguish between various state changes, the attributes used in one effect's h-rep are likely to be useful in the new effect's h-rep (lines 6 and 7). For example, in the HPAC domain, the attribute that indicates which valve a handle is residing on is equally important when opening or shutting the valve.

One problem with using existing preconditions is that they may not be very refined. The lack of refinement can result in missing and unnecessary h-rep conditions. To avoid this problem, the h-rep belonging to the first effect is used because Diligent assumes that the first effect is probably the most refined and accurate (line 6).

Once the new effect has been initialized, Diligent refines the effect with the operator's earlier action-examples. Since the earlier action-examples are all negative or indeterminate, Diligent attempts to add conditions to the new effect's g-rep and h-rep (line 9).

The creation of a new effect is illustrated by Figure 5.18. The "closest earlier action-example" represents similar-ex on the algorithm's line 4 (Figure 5.17), and the "first effect" represents old-ce on line 6.¹⁰ The differences between the current and previous action-example (h-rep1 on line 5) are {(HandleOn valve1) (AlarmLight1 on)}.
The earlier effect's h-rep and the current action-example's pre-state are compared to produce h-rep2 on line 7. The set h-rep2 contains two conditions: one condition, (HandleOn valve1), matches the earlier effect's h-rep and one condition, (valve1 shut), does not. Finally, the two sets, h-rep1 and h-rep2, are combined on line 8 to form the new effect's h-rep.

¹⁰Diligent does not care whether similar-ex is a positive example of old-ce.

Closest earlier example:
  Pre-state: (valve1 shut) (valve2 shut) (HandleOn valve2) (alarm-light1 off) (alarm-light2 off)
  Delta-state: (valve2 open)

First effect:
  State changes: (valve1 shut)
  Preconditions:
    g-rep: Don't care
    h-rep: (valve1 open) (HandleOn valve1)
    s-rep: Don't care

Current example:
  Pre-state: (valve1 shut) (valve2 shut) (HandleOn valve1) (alarm-light1 on) (alarm-light2 off)
  Delta-state: (valve1 open)

New effect:
  State changes: (valve1 open)
  Preconditions:
    g-rep: ∅
    h-rep: (valve1 shut) (HandleOn valve1) (alarm-light1 on)
    s-rep: (valve1 shut) (valve2 shut) (HandleOn valve1) (alarm-light1 on) (alarm-light2 off)

Figure 5.18: An Example of Creating a New Effect

5.8.3 Splitting an Effect in Two

In the previous section, we discussed how to create a new effect from an action-example's delta-state by using conditions that are unmatched by any effect. However, we have not yet discussed what to do when an effect's state changes only match part of the delta-state. In this case, the effect is split into two effects: the action-example is positive for one effect and negative or indeterminate for the other effect. The positive and negative examples of the original effect are still positive and negative examples of the new effects. This means that preconditions of the original effect can be used to initialize the preconditions of the new effects.
procedure Split-Effect
Given:
  op: An operator.
  eff: An effect of op.
  ex: An action-example of that operator.
  meff: State changes of eff that match ex.
  feff: State changes of eff that do not match ex.
Result: Split eff into two effects.

1. For operator op, create a new effect new-eff.

2. Copy the preconditions of the original effect eff to new-eff.
   s-rep(new-eff) ← s-rep(eff)
   h-rep(new-eff) ← h-rep(eff)
   g-rep(new-eff) ← g-rep(eff)

3. Copy the unused negative examples from eff to new-eff.

4. Set the state changes of the effects so that the action-example ex is a positive example of eff and a negative example of new-eff.
   state-changes(eff) ← meff
   state-changes(new-eff) ← feff

5. Refine eff with the action-example ex by invoking Refine-Positive-Example.

6. Refine new-eff with the action-example ex by invoking Refine-Negative-Example.

Figure 5.19: Splitting an Effect

The algorithm for splitting effects is shown in Figure 5.19 and illustrated with the data in Figure 5.20. In Figure 5.20, the action-example is a positive example of new effect 1 and a negative example of new effect 2. When new effect 1 is refined with the positive example, the h-rep and s-rep have one condition, (valve2 open), removed. When new effect 2 is refined with the negative example, the g-rep has one condition, (valve2 open), added.

The above discussion of splitting effects and reusing the original preconditions begs the question: why doesn't each effect's state change contain only one condition? This would remove the need to split effects. However, Diligent is an interactive system, and it takes less work for an instructor to examine and maintain one effect's preconditions than it would if several effects had duplicate preconditions.
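The Split-Effect procedure can be sketched as follows. This is a hypothetical illustration, not Diligent's code: an effect is modeled as a dict holding its state changes and three precondition concepts, and the refinement calls of steps 5 and 6 are noted but omitted.

```python
# Hypothetical sketch of Split-Effect (Figure 5.19): the original effect
# keeps the matched state changes, and a copy that inherits all three
# precondition concepts takes the unmatched state changes.
import copy

def split_effect(eff, matched, unmatched):
    """Split eff into (positive_eff, negative_eff) for the new example.

    eff       -- dict with 'changes', 's_rep', 'h_rep', 'g_rep'
    matched   -- state changes of eff that match the action-example
    unmatched -- state changes of eff that do not match it
    """
    new_eff = copy.deepcopy(eff)          # step 2: copy the preconditions
    eff['changes'] = dict(matched)        # step 4: example is positive here
    new_eff['changes'] = dict(unmatched)  # ... and negative here
    # Steps 5-6 (refining each effect with the example) would follow
    # in the full system.
    return eff, new_eff
```

Because the copy inherits the original bounds, the positive and negative examples of the original effect remain usable for both new effects, as the text above explains.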
Action-example:
  Pre-state: (valve1 open) (valve2 shut) (HandleOn valve1) (AlarmLight1 off)
  Delta-state: (valve1 shut)

Original effect:
  State changes: (valve1 shut) (AlarmLight1 on)
  Preconditions:
    g-rep: (valve1 open)
    h-rep: (valve1 open) (valve2 open)
    s-rep: (valve1 open) (valve2 open) (HandleOn valve1) (AlarmLight1 off)

New effect 1:
  State changes: (valve1 shut)
  Preconditions:
    g-rep: (valve1 open)
    h-rep: (valve1 open)
    s-rep: (valve1 open) (HandleOn valve1) (AlarmLight1 off)

New effect 2:
  State changes: (AlarmLight1 on)
  Preconditions:
    g-rep: (valve1 open) (valve2 open)
    h-rep: (valve1 open) (valve2 open)
    s-rep: (valve1 open) (valve2 open) (HandleOn valve1) (AlarmLight1 off)

Figure 5.20: An Example of Splitting an Effect

5.9 Complexity Analysis

This section analyzes the complexity of the learning algorithms. Let

  a = number of attributes = maximum number of conditions in action-example pre-states, post-states and delta-states
  c = maximum length of space to represent a condition
  i = maximum length of an identifier that represents a condition (i should be a lot smaller than c)
  v = maximum number of values for each attribute
  m = maximum number of steps in a demonstration
  t = maximum number of action-examples for an operator
  w = maximum number of unused negative examples per effect
  o = maximum number of operators
  e = maximum number of effects in an operator

In the following, sets of conditions are represented as lists. The elements of these lists are ordered by attribute name. A list can contain at most one condition for any one attribute. In order to avoid discussing the merits of different list implementations, the following discussion will make some assumptions. It is assumed that lists are implemented with pointers and that many list operations take O(1) time.
These include deleting an element, appending an element to the end and inserting an element in the middle. Of course, finding where to delete an element or where to insert an element may require traversing the list and take O(a) time. We will also assume that lists of action-examples are stored using identifiers and that copying them takes negligible time.

Comparing ordered lists of conditions. In the following, we will repeatedly compare two ordered lists of O(a) conditions in order to extract some elements from the lists or to merge the lists. This takes O(a) time. We will depend on the lists being ordered by attribute name. Consider finding the common conditions in two lists. The lists are compared by traversing them and comparing the current element in each list. If the elements are equal, a match is found, and the condition in one list can be appended to the list of matching conditions in O(1) time. If one element is less than the other, the lesser element is not in the other list. When a list's current element is found to be missing from the other list, the list's current element is changed to the list's next element. Since each list has at most O(a) elements, there are at most O(a) comparisons.

Action-Examples. We will first look at time complexity. The action-example needs to be created and the conditions ordered. The pre-state, post-state and delta-state each have O(a) conditions. The conditions need to be first copied (O(a)) and then sorted (O(a log a)). Thus, the complexity is O(a + a log a) = O(a log a).

Now look at space complexity. The pre-state, post-state and delta-state each have O(a) conditions. It takes O(c) space to represent each condition. There are O(t) action-examples for O(o) operators. A naive method is to store each condition with each action-example.
In this case, the space required for all operators and action-examples is O(acto). A better approach is to assign each condition a distinct identifier of length O(i) and use identifiers in action-examples. This takes O(aito) space. Representing conditions by identifiers requires one identifier for each attribute value or O(acv) space. Thus, the space required for all action-examples is O(aito + acv).

Creating Operators. Here we are only concerned about time complexity. The new operator has one effect. The effect's state changes and s-rep can have O(a) conditions. The g-rep is empty (O(1)). The h-rep can get O(a) conditions from the action-example's delta-state, but the attribute values in the conditions are incorrect. Thus, the conditions from the delta-state need to be checked against the action-example's pre-state with at most O(a) comparisons. The h-rep can also get conditions from the delta-states of action-examples for the demonstration's earlier steps. Since each earlier step provides at most O(a) conditions, merging lists for the m earlier steps takes O(ma). Thus, the time complexity for creating an operator is O(ma).

Comparing incompatible effects. Sometimes the preconditions of incompatible effects are compared. We will look at time complexity. There are O(e) incompatible effects. For each incompatible effect, there are at most three comparisons between pairs of ordered lists of preconditions that contain O(a) attributes. Since the comparison with each list takes O(a), the time complexity is O(ae).

Processing a negative example. We will look at time complexity. A negative example's potentially needed conditions are compared to an effect's g-rep. There are O(a) attributes, and the comparison takes O(a).
If a near-miss is found, one condition is inserted in the g-rep and h-rep. Inserting the condition requires O(a) comparisons to find where to insert the condition. If a condition is not added, the current effect may be compared against incompatible effects, which takes O(ae) (see above). Thus, the time complexity is O(ae).

Processing a positive example. We will look at time complexity. The three precondition concepts (i.e. s-rep, h-rep and g-rep) are compared against an action-example's pre-state. There are O(a) attributes, and the comparison takes O(a). Afterwards, the O(w) unused negative examples are processed. Since processing a negative example takes O(ae), the processing of O(w) negative examples takes O(wae). Thus, the time complexity is O(wae).

Splitting an effect. We will look at time complexity. The O(a) attributes in the existing effect's s-rep, h-rep, g-rep and state changes are copied in O(a) time. Then one effect is refined with a negative example (O(ae)), and the other one is refined with a positive example (O(wae)). Thus, the time complexity is O(wae).

Creating a new effect. We will look at time complexity. The new effect's state changes come from the action-example's delta-state and contain O(a) conditions. The s-rep initially has O(a) conditions, and the g-rep is empty (O(1)). In the same manner as when the operator was created, some h-rep conditions are found in the action-example's delta-state (O(a)). Additional h-rep conditions are found using the h-rep of the operator's first effect (O(a) conditions). Comparing the first effect's preconditions against the action-example's pre-state takes O(a), and then merging the conditions with the partial h-rep takes O(a). More h-rep conditions come from differences between the pre-states of the action-example and the most similar negative example.
Finding the negative example takes O(at) comparisons because it involves comparing O(a) attributes within O(t) action-examples. The differences between the action-examples are then merged with the partial h-rep in O(a) time. Thus, the time complexity is O(at).

Refining an operator with an action-example. Diligent is an interactive system and cannot spend too much time processing any one action-example. Therefore, we will look at the time complexity to update an operator with one action-example. First, the action-example needs to be created (O(a log a)). Second, the O(a) conditions in the action-example's delta-state are compared against the state changes of O(e) effects (O(ea) comparisons). Finally, the O(e) effects are refined with the action-example. Let R represent refining an operator with an action-example.

O(R) = O(cost of creating an action-example)
     + O(cost of comparing the action-example to each effect's delta-state)
     + O(e × (cost of refining a positive example))
     + O(e × (cost of refining a negative example))
     + O(e × (cost of splitting a conditional effect))
     + O(cost of creating a new effect)
     = O(a log a) + O(ea) + O(wae²) + O(ae²) + O(wae²) + O(at)
     = O(a log a + wae² + at)

5.9.1 Scalability

Diligent's approach is scalable because operators are learned for a particular object with relatively few action-examples. Because there are so few action-examples, it's reasonable to maximize learning by spending a little extra time on each action-example. We will discuss the scalability issues from the previous section that appear most important. They are the space required to store action-examples, the time for creating new effects, the time for processing a positive example, and the time for splitting an effect.
The space for storing action-examples is greatly reduced by associating identifiers with conditions and storing the identifier rather than the condition in the action-example. The savings in space increases as more action-examples are created because most conditions appear in many action-examples. Furthermore, the same identifiers can be used in action-examples for all operators.

The space saved by using identifiers to represent conditions also enables the storage of action-examples in a hash table. Storing action-examples in a hash table allows Diligent to check for duplicate action-examples before creating and storing a new action-example. This is important because Diligent tends to receive duplicate action-examples. If the space required by action-examples becomes an issue, a limit could be placed on the number of previous action-examples stored.

Another scalability issue is the time it takes to create a new effect. Creating a new effect involves identifying h-rep conditions by comparing the current positive example against the operator's previous action-examples. This is reasonable because the operator represents the manipulation of one object and has relatively few action-examples. If time became an issue, the number of previous action-examples examined could be limited.

A third scalability issue is the time to refine an effect with a positive example. The time spent on the positive example is not the issue. Instead, it's the time spent processing the unused negative examples. These negative examples differ from the s-rep in two or more conditions but are still classified as positive by the g-rep. However, processing these action-examples is not a problem because there tend to be only a few of them.
Furthermore, the number of action-examples doesn't get large because, as more negative examples are seen, the g-rep gets more refined and rejects more negative examples.

The final scalability issue is the time to split an existing effect in two. This has the same time complexity as processing a positive example. Splitting an effect happens much less often than processing a positive example, and the time complexity of splitting an effect is dominated by the time complexity of processing a positive example, which we have already discussed.

5.10 Related Work

Throughout this chapter, related work has been discussed where applicable. However, some other work should be mentioned.

Diligent can learn in an unstructured environment that does not have any explicit representation of the relationships between objects and attributes. Another algorithm for learning in this type of environment is MSDD [OC96], which learns probabilistic state changes. However, MSDD requires much more data than is available to Diligent.

Diligent is a Programming By Demonstration (PBD) system that focuses on determining which attributes are important. However, many PBD systems for manipulating graphical objects have a different type of environment. Instead, these systems have structured environments, which contain explicit relationships between objects and attributes. Learning in these systems tends to focus on identifying which relationships are important and generalizing object classes. An example of this type of system is Metamouse+ [MWM94].

Disciple has been used in a variety of domains [TK90, THD95, TH96, TK98]. Like Diligent, Disciple uses a version space algorithm with a single conjunctive concept for its upper and lower bounds (i.e. g-rep and s-rep). Unlike Diligent's g-rep and s-rep, Disciple's initial upper and lower bounds are heuristically altered so that they are only probable
upper and lower bounds. These heuristic bounds define what is called a Plausible Version Space [Tec92]. To create these bounds, Disciple uses information about its structured environment that is unavailable to Diligent. Disciple overcomes errors in its bounds by allowing conditions to be added and removed from both the upper and lower bounds.

Like Diligent, a few PBD systems have used a version space algorithm for learning preconditions. Metamouse+ [MWM94] learns graphical editing procedures in an environment that is very different than Diligent's. Disciple [TH96], which is described above, has also been taught by demonstration. Recently, Lau and Weld [LW99] used an e-mail processing domain for comparing algorithms that learn preconditions. They looked at a version space and an inductive logic algorithm; however, their environment is very different than Diligent's, and their version space algorithm only learned a single precondition for an entire procedure.

Utgoff [Utg86] has looked at speeding up version space learning by dynamically creating attributes whose values are inferred from other attributes. This is inappropriate for Diligent because there is little data and because humans may not find the inferred attributes either understandable or reasonable.

5.11 Summary

This chapter discussed how Diligent learns operators. It focused on how Diligent identifies the preconditions necessary for an action to produce desired state changes. Good preconditions are important because Diligent uses them to derive a plan's step relationships.

First, we covered some requirements specific to learning operators. Diligent needs to quickly and incrementally learn operators with potentially little data. For these reasons, Diligent needs to be able to correct errors in the operators. Additionally, the operator representation needs to be usable by the human instructor.
He needs to be able to understand the preconditions and to determine whether Diligent believes that a specific condition is a precondition. It would also be useful to provide him with some measure of confidence in a precondition. Finally, Diligent's environment contains many attributes, most of which are not needed by a given procedure. Thus, Diligent's learning methods need to identify attributes that are likely to be important.

Two types of data are provided to support learning: examples of actions being performed and the sequence of steps in the current demonstration.

Diligent processes the data using three heuristics. One heuristic assumes that attributes that changed value earlier in the demonstration are likely preconditions. This heuristic is used for creating new operators. The second heuristic favors existing knowledge. This means that Diligent should use what it already knows rather than general heuristics. This heuristic is used throughout the learning algorithm, but it particularly influences the creation of new effects when the operator already has an effect. The third heuristic favors extraneous preconditions over missing ones because it is easier to remove unnecessary preconditions than to add missing ones.

Preconditions are associated with effects, and an effect represents preconditions using a modified version space that has three sets of conjunctive conditions. The version space still has a most general bound (g-rep) and a most specific bound (s-rep), but Diligent augments the version space with an intermediate, best guess precondition (h-rep). The h-rep supports learning reasonable preconditions quickly and is used when calculating a plan's step relationships. The s-rep and g-rep are used for incremental learning, error recovery, and indicating Diligent's confidence in a particular precondition.
If Diligent is very confident, the precondition is in the g-rep, and if Diligent strongly believes a precondition is unnecessary, then the precondition is not even contained in the s-rep.

We also discussed how Diligent refines preconditions using action-examples. Positive examples remove unnecessary conditions from the s-rep and h-rep, while negative examples add conditions to the h-rep and g-rep. Finally, we looked at the algorithm's complexity and argued that the approach is scalable.

Chapter 6

Experimenting

In the previous chapter, we discussed learning operators. Operators are associated with each step in a procedure and identify the step's preconditions and state changes. Diligent uses these preconditions and state changes to derive the dependencies (i.e. step relationships) between steps.

Procedures containing these dependencies will be used by an automated tutor to teach human students. Consequently, errors in the dependencies may mislead students. One source of errors is the lack of training data. Because the instructor has limited time, Diligent may only see a step demonstrated a few times. This forces Diligent to use heuristics when creating preconditions. Unfortunately, heuristic preconditions may contain mistakes, and the quality of the preconditions determines the likelihood of errors in a procedure's step relationships.

One method for refining preconditions is to perform a step in several different states and observe what happens. Diligent does this when it performs experiments. Besides performing steps in multiple states, experiments need to meet a variety of objectives. They should minimize the work performed by the instructor. They should exploit Diligent's access to the environment and focus attention on the procedure being learned.
Experiments should also compensate for the bias in the heuristics used for creating preconditions.

Diligent meets these objectives with a novel technique: Diligent performs steps in a variety of states during autonomous experiments that are generated from demonstrations of a procedure. Demonstrations are useful because they specify a sequence of steps that can be used to perform a procedure. When experimenting, Diligent performs the procedure but skips a step. Diligent then observes how skipping the step affects subsequent steps. Since the heuristics used for creating preconditions assume that the state changes of earlier steps are likely preconditions for later steps, skipping steps helps compensate for the heuristic bias.

This chapter discusses using experiments to refine the preconditions of operators. First, we will describe the problem in terms of requirements. Afterwards, we will discuss issues that motivate Diligent's approach. We will then discuss Diligent's approach. Finally, we will end with a discussion of the run-time complexity and related work.

6.1 The Problem

Earlier in Section 3.1, we described the authoring problem in terms of requirements, constraints and the interface to the environment. Because the problem has become more constrained and concrete, we will define some new requirements.

6.1.1 Requirements

Let us consider how the requirements in Chapter 3 relate to experiments. The experimentation approach needs to help understand demonstrations by getting the most out of each demonstration. Experiments should save the instructor time and reduce the difficulty of authoring. When experimenting, Diligent should exploit its ability to access and manipulate the environment. We will also define the following additional requirements.

Generate more examples of steps being performed.
The goal of experimentation is to better understand the dependencies between a procedure's steps. To do this, the operator learning algorithms need examples of the steps being performed in a variety of states so that operator preconditions can be refined.

Compensate for operator learning bias. Some errors in operator preconditions may result from the bias that favors attributes that change value during a demonstration. The bias is reasonable because changes caused by earlier steps are often preconditions for later steps. However, some of these preconditions may be incorrect.

Positive and negative examples should be similar. A positive example is when an action produces the desired state changes, and a negative example is an example that is not positive. Positive examples help eliminate unnecessary preconditions, while negative examples identify necessary preconditions.

In Diligent's learning algorithm, it is harder to process negative examples than positive ones. Unlike positive examples, a negative example requires a near-miss (or one condition) difference between its pre-state and the most specific candidate precondition (i.e. s-rep). The likelihood of finding a near-miss increases when negative and positive examples are similar.

Interactive system should be fast. Diligent is an interactive system for instructors with a limited amount of time. If experiments force instructors to wait long periods of time or even stop, then instructors may have difficulties because of a loss of concentration and focus. Thus, general purpose techniques that do not focus on understanding the procedure could take a prohibitive amount of time.

A related concern is why Diligent's experiments are performed interactively when they could have been done off-line.
The reason is that Diligent's experiments focus on understanding demonstrations and interactive experiments make the tool easier to use. If experiments were performed off-line, an instructor might have to wait a long time to see what an experiment learned. In contrast, an instructor can quickly see the results of interactive experiments. Because an instructor waits less time, it should be easier for him to concentrate and focus on the procedures being authored.

Bounded number of steps in an experiment. The time required to perform experiments can be controlled only if a limited number of steps are performed. While it is reasonable to perform additional steps in response to an unexpected observation, the number of additional steps should not be too large or unpredictable.

The requirements for being fast and bounding the number of steps argue against using autonomous discovery algorithms that may perform a large, unpredictable number of steps. This also argues against using techniques that may require a bounded but large number of steps. This includes systems that attempt to build a correct finite state automaton of the environment [Ang87b, Ang87a, RS90, She94].

6.2 Background

This section discusses issues relevant to Diligent's experimentation approach. We will also discuss other approaches that are inappropriate for Diligent, but might complement Diligent's approach in a future system.

6.2.1 Focused versus Unfocused

An important issue is why a system performs experiments. Diligent experiments because it's attempting to understand a given procedure well. Experiments that concentrate on gaining general knowledge may learn to do a lot of things well, but are likely to take more time and be less useful than those that focus on understanding the steps of the given procedure.
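Under the assumption of a simple list-of-steps path representation (hypothetical, not Diligent's actual structures), the focused, skip-one-step experiments introduced at the start of this chapter could be enumerated as:

```python
# Hypothetical sketch of the chapter's core technique: experiments are
# generated directly from a demonstrated path by skipping one step at a
# time and replaying the rest, so each run yields action-examples in
# states the demonstration never visited. Names are illustrative.

def generate_skip_experiments(path_steps):
    """Yield one experiment per step of a demonstrated path.

    path_steps -- ordered list of step names from one demonstration path.
    Each experiment pairs the skipped step with the remaining sequence.
    """
    for skip in range(len(path_steps)):
        yield (path_steps[skip],                        # the skipped step
               path_steps[:skip] + path_steps[skip+1:])  # steps to replay
```

Note how the number of experiments, and the number of steps in each, is bounded by the length of the demonstrated path, which matches the requirement above that experiments perform a limited, predictable number of steps.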
6.2.2 Supervised versus Unsupervised

Experiments can be viewed as generating a question, asking an "oracle" the question, and waiting for the oracle to provide the correct answer. When the oracle is human, the experiment is supervised, and when the oracle is automated (e.g. Diligent's environment), the experiment is unsupervised.

Some systems perform supervised experiments by generating potential examples of a concept and then asking the user whether they are examples of the concept. Humans often find this type of yes or no question easy to answer. Systems that use structural domain knowledge (i.e. class hierarchies and relations between objects) for generating examples include ALVIN [KW88], MARVIN [SB86] and Disciple [TK98, TH96]. This approach is inappropriate for Diligent because Diligent solves a different problem. Not only does Diligent try to minimize the effort required by the instructor, but Diligent's unstructured environment does not provide class hierarchies or relations between objects.

Still, Diligent could be viewed as performing supervised experiments when it asks the instructor to verify goal conditions, ordering constraints and causal links.¹ However, these questions verify information after it has been computed rather than providing input for machine learning algorithms.

In contrast to supervised experiments, unsupervised experiments reduce the instructor's work by letting the environment answer the questions. Unsupervised experiments also reduce the possibility of instructor error. Systems that perform unsupervised experiments include EXPO [Gil92], OBSERVER [Wan96b] and LIVE [She93]. The method used by these systems to experiment will be discussed in the next section.

¹See Chapter 4.
6.2.3 E xp erim en tin g w ith P lans Some systems experim ent by building plans th a t transform an initial state into a goal state. This can be done two ways: practice problem s and explicit experiments. A practice problem requires a system to create a plan th a t transform s an initial state into a goal state. An explicit experim ent has two com ponents: an action to be performed and a desired state in which to perform the action. Placing the environment into the desired sta te often involves solving a practice problem where the current state is transform ed into the desired state. Both practice problems and explicit experiments allow a system to learn by observing how various actions affect th e environm ent. Systems th at learn by creating and performing plans include LEX [MUB83], LIVE [She93, She94], OBSERVER [Wan96c], EXPO [Gil92, CG90] and IMPROV [Pea96]. When creating plans several issues need to be addressed. • W hat knowledge does a system utilize when creating a plan? OBSERVER only utilizes knowledge of operators, and it learns by deliberately not satisfying som e potential preconditions. In contrast, E X P O uses sophisticated domain independent techniques. EXPO identifies missing preconditions by favoring hypothesized precon ditions th at involve 1) attributes of o b jects involved in the action, 2) predicates th a t appear in all successful past situations, and 3) operators similar to the one being examined. EXPO also restricts the space of plans with seventeen rules th a t favor certain types of plans (e.g. avoid long plans). • When is an experim ent finished? Does it require the goal state to be reached or does it merely involve performing the steps? A ttem pting to reach the goal after the initial plan fails could take a large number of steps. • How are practice problems generated? Are they autom atically generated as in EXPO, or does someone need to generate them as in OBSERVER? 
An advantage of automatically generating problems is that the system has some control over a problem's apparent difficulty, while an advantage of user-selected problems is that the user can guide learning.

Another issue is whether the system is doing an extensive search during planning or whether it is doing more limited and controlled planning. An extensive search could take a long time, while more limited planning might just involve small changes to an existing plan.

For Diligent, an extensive search is inappropriate. If the system were busy experimenting, the instructor could not provide additional demonstrations. Furthermore, if each demonstration resulted in a long delay, instructors might be hesitant to provide additional demonstrations. Recall that one of Diligent's objectives is allowing instructors to easily author procedures with demonstrations.

Diligent's potentially limited knowledge is more compatible with limited planning. However, limited planning raises issues beyond Diligent's scope. Diligent would need to identify which plans to create and when they should be created. This may be difficult when Diligent has seen only a few demonstrations and knows only a few poorly understood operators. Depending on how plans are created, Diligent might not yet have the minimum knowledge necessary for planning. Instead, Diligent needs a basic approach that can be reliably used even when the system has very minimal knowledge. At this stage, Diligent could have difficulty solving practice problems whose solutions differ only slightly from a demonstration. That is why the more knowledge-intensive planning techniques of EXPO and OBSERVER are not used. Nevertheless, planning could complement Diligent's experimentation approach.
Diligent's experiments could be used as an initial phase to refine operators and to identify situations that merit the use of planning. Later, when enough knowledge has been built up, more planning-intensive techniques could be used.

6.3 Input

This section describes the input used for performing experiments and defines terms that should make the following discussion easier to understand.

Diligent experiments on procedures, whose basic structure was described in Section 4.3. Procedures contain one or more paths. A path describes an initial state and a sequence of steps. The sequence of steps in each path is specified by one or more demonstrations. A path can represent multiple demonstrations because a demonstration can add steps to an existing path. Diligent actually uses paths rather than demonstrations to generate experiments. The specification of a path's initial state is called a prefix. A prefix identifies a known configuration of the environment (Section 3.1.3) and a sequence of actions that alters the configuration.

A step represents a portion of the procedure. Most steps are primitive. A primitive step represents an action performed in the environment. If a step is not primitive, it is abstract. An abstract step represents a subprocedure that contains its own steps. A procedure containing an abstract step is called hierarchical. When Diligent performs an abstract step, it attempts to establish the goal conditions of the abstract step's subprocedure. A goal condition indicates the value an attribute should have when the subprocedure is finished. To establish the goal conditions, some of the subprocedure's steps are performed.

Associated with each primitive step is an operator (Section 3.2.2.2). An operator represents an action performed in the environment and identifies the preconditions needed to produce given state changes.
Because actions can produce different state changes when performed in different states, some of an operator's state changes may have different preconditions. The purpose of experiments is to refine the preconditions of operators. The preconditions of operators are refined with action-examples, which contain the state of the environment before (pre-state) and after (post-state) performing an operator's action.

6.4 Diligent's Approach

Diligent experiments by repeatedly performing a procedure but altering it so that a different step is skipped each time. Before performing the procedure, Diligent uses its ability to reset the environment so that it, like a student learning the procedure, starts the procedure from a specified initial state. As Diligent performs the procedure, it observes how skipping the step affects later steps. This examination of how the state changes of earlier steps affect later steps helps compensate for bias used when creating operators. When all steps have been performed, the experiment is finished. The experiment is finished because its purpose is generating action-examples of the procedure's steps rather than achieving some goal state. This approach should be quick because it bounds the number of steps performed in an experiment.

Because a procedure's steps are specified by demonstrations, experiments are really generated from demonstrations. Generating unsupervised experiments from demonstrations serves a number of purposes. It doesn't require accurate domain knowledge. It addresses Diligent's requirements to understand demonstrations and to make the instructor's job easier by making more use of each demonstration. It also uses Diligent's heuristic focus on attributes that change value, and it exploits Diligent's ability to interact with the environment, which includes the ability to reset the environment's state and to perform actions.
The approach focuses on validating the preconditions created by the heuristics used to create operators. One heuristic is that attributes that changed value earlier in a demonstration are likely to be preconditions of later steps. This results in a bias towards creating unnecessary preconditions. Experiments remove these unnecessary preconditions when they show that a later step is not dependent on an earlier step. Experiments may also identify when a later step is dependent on an earlier step. When this happens, experiments not only provide evidence that preconditions are correct but also support the identification of missing preconditions. As mentioned earlier, positive examples remove preconditions, and negative examples add and verify preconditions. Because experiments focus so closely on the procedure, positive and negative examples tend to be similar. This similarity should be beneficial because there may be few action-examples, and using a negative example requires a one-condition difference between it and potential preconditions.

This approach is straightforward for primitive steps, but how does it handle steps that represent subprocedures? In this case, Diligent uses a heuristic that focuses on the current procedure. This means that, as much as possible, abstract steps (i.e., subprocedures) should be treated like other steps. In other words, an abstract step is treated as a black box that achieves the goal conditions of its subprocedure. To allow a subprocedure to achieve its goal conditions, Diligent internally simulates performing the subprocedure in order to determine which of the subprocedure's steps to perform. Of course, when performing an experiment, an abstract step, like other steps, may sometimes fail to establish the desired state changes. Diligent's focus on the current procedure reduces the number of steps in an experiment.
Because there are fewer steps, the instructor doesn't have to wait as long.

6.5 The Procedure Being Used

As a procedure, we will use the extended example from the chapter on processing demonstrations (Chapter 4). Figure 6.1 shows the extended example. The steps representing a procedure's beginning and end (e.g., begin-proc1 and end-proc1) are not shown because the experimentation algorithm ignores those steps. The procedure that we will experiment on is top-level, which uses procedures proc1 and proc2 as subprocedures. The steps and procedures do the following. Procedure top-level shuts two valves and checks an alarm light. Procedure proc1 shuts two valves, and procedure proc2 checks an alarm light while in test mode. In our later discussion, we will use the fact that steps turn-5 and turn-1 both shut valve1.

Procedure top-level: Steps: turn-5 → proc1-6 → proc2-7
Procedure proc1: Steps: turn-1 → move-2nd-2 → turn-3 → move-1st-4
Procedure proc2: Steps: press-test-8 → check-light-9 → press-reset-10

Figure 6.1: A Hierarchical Procedure

6.6 The Algorithm

procedure Experiment-On-Procedure
Given: proc: a procedure.
Result: Perform experiments on the procedure's paths.
1. Initialize the stack of experimental commands expr-stack as empty.
2. For each path pth of the procedure do the following.
3. If path pth has not been updated since it was in an experiment, then generate experiments for pth and add them to expr-stack. Do this with Gen-Skip-Step-Experiment.
4. Perform the experiments contained in expr-stack using Perform-Experiment.

Figure 6.2: The Top Level Experimentation Algorithm

Diligent performs experiments using procedure Experiment-On-Procedure (Figure 6.2). On line 1, the stack of experimental actions to perform is emptied; this merely puts the stack into a known state.
On lines 2–3, experiments are generated for each of the procedure's paths. The experiments are stored in the stack expr-stack. Afterwards, on line 4, experiments are actually performed.

The experiments are generated by procedure Gen-Skip-Step-Experiment (Figure 6.3). In an experiment, the path's initial state is reset and all but one of the procedure's steps are performed. This is done for every step but the last step in the path.

procedure Gen-Skip-Step-Experiment
Given: proc: A procedure.
pth: A path with n steps.
expr-stack: A stack of experimental commands to perform.
Result: expr-stack: Updated stack of commands.
1. Loop over i where i goes from 1 to (n − 1).
2. Compute the sequence seq of steps to perform; include all the path's steps except the ith step. (If the path has steps s₁…sₙ, then seq = s₁…sᵢ₋₁sᵢ₊₁…sₙ.)
3. Push each of seq's steps onto expr-stack as perform-step commands. Start with the last step in seq and work backwards to the first step. (Pushing the steps in reverse order causes the path's earlier steps to be performed before the path's later steps.)
4. Push the path pth's prefix onto expr-stack as a reset-environment command. (The command will be used to reset the path's initial state.)

Figure 6.3: Generating Skip-Step Experiments

Two types of commands are placed in the stack of experimental commands: perform-step and reset-environment. A perform-step command performs one of the path's steps, and a reset-environment command resets the environment's state to the path's initial state. Notice that a path's steps are pushed onto the stack in reverse order so that a path's later steps are performed after its earlier steps.

The stack of experimental commands looks like a) of Figure 6.5 after Gen-Skip-Step-Experiment has processed procedure top-level, which has only one path.
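The stack-building loop of Gen-Skip-Step-Experiment can be sketched in Python. This is a paraphrase for illustration, not Diligent's actual code; the tuple representation of commands and the function name are assumptions.

```python
def gen_skip_step_experiments(path_steps, prefix):
    """Build a stack of experiment commands for one path.

    For each step except the last, push commands that reset the
    environment and then perform every step but the skipped one.
    The stack pops LIFO, so each experiment's steps are pushed in
    reverse order and its reset command is pushed last (on top).
    """
    stack = []
    n = len(path_steps)
    for i in range(n - 1):                 # the last step is never skipped
        seq = path_steps[:i] + path_steps[i + 1:]
        for step in reversed(seq):         # reverse so earlier steps pop first
            stack.append(("perform-step", step))
        stack.append(("reset-environment", prefix))
    return stack

# Procedure top-level from Figure 6.1: turn-5, proc1-6, proc2-7.
stack = gen_skip_step_experiments(["turn-5", "proc1-6", "proc2-7"], "prefix-1")
# Popping the stack executes, in turn:
#   reset, turn-5, proc2-7    (proc1-6 skipped)
#   reset, proc1-6, proc2-7   (turn-5 skipped)
```

Note that with three steps only two experiments are generated, matching the loop bound of (n − 1).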
The stack indicates that the procedure will be performed twice: once skipping the first step (turn-5) and once skipping the second step (proc1-6). As we discussed before, abstract steps proc1-6 and proc2-7 (i.e., subprocedures proc1 and proc2) are treated the same as primitive step turn-5.

The procedure Perform-Experiment is shown in Figure 6.4. Until the stack (exper-stack) is empty, the procedure keeps popping off and processing the top command in the stack (lines 1 and 2). When Perform-Experiment is invoked, the stack looks like a) in Figure 6.5, and when it finishes, the stack is empty.

The procedure Perform-Experiment first processes a reset-environment command (line 4 of Figure 6.4). Performing this command restores the path's initial state.

procedure Perform-Experiment
Given: exper-stack: Stack of experimental commands to perform.
Result: Perform all the commands in exper-stack.
1. While exper-stack is not empty
2. Pop the top command off of exper-stack.
3. Based on the type of command, do one of the following:
4. If the command is a reset-environment command, then restore the path's initial state using Replay-Prefix (Section 4.6.6.1) and the prefix associated with the command.
5. If the command is a perform-step command and the step is primitive, do one of the following:
6. If the step just senses the environment without changing it (i.e., a sensing action), do nothing.
7. Otherwise, perform the step's action. This is done with the action-id of the step's operator and Perform-Action (Section 3.1.3). This produces an action-example that is used to update the step's operator with Refine-Operator (Section 5.8.1).
8. If the command is a perform-step command and the step is abstract (i.e., a subprocedure), then
9. Compute the sequence seq of steps needed to perform the subprocedure from the current state with Internally-Simulate-Subprocedure (Section 4.7.1).
10. Push each step in seq onto exper-stack as a perform-step command. Start with the last step in seq and work backwards to the first step. By working backwards, steps that are earlier in seq will be performed before later steps.

Figure 6.4: Performing Experiments

a) After skip-step experiments have been generated
reset → proc1-6 → proc2-7 → reset → turn-5 → proc2-7
b) Before processing step proc1-6
proc1-6 → proc2-7 → reset → turn-5 → proc2-7
c) After processing step proc1-6
turn-1 → move-2nd-2 → turn-3 → move-1st-4 → proc2-7 → reset → turn-5 → proc2-7

Figure 6.5: The Stack of Actions to Perform

If the type of command is perform-step and the associated step is primitive, then Diligent performs the step's action in order to refine the action's operator (lines 5–7). Line 6 deals with sensing actions, which are actions that gather knowledge from the environment without changing it (e.g., check whether a light is illuminated). Diligent assumes that the environment allows sensing actions to be performed successfully in any state.

Because sensing actions do not change the environment and can be performed successfully in any state, it is unclear whether Diligent can learn anything from them. Instead of changing the environment, sensing actions create mental attributes that record the current values of environment attributes. However, Diligent only checks for the existence of mental attributes and does not consider their values. Given that mental attribute values are ignored, Diligent could only potentially refine a sensing action's preconditions.² Consider a procedure (e.g.,
proc2) that checks the state of a light while the system is in test mode; outside of test mode, it is irrelevant whether the light is on or off. How could a system with limited knowledge, such as Diligent, know that being in test mode is mandatory? For this reason, Diligent ignores sensing actions during experiments. If a future system used the values of mental attributes, then performing sensing actions during experiments might be useful. This is an area for future work.

²A sensing action's preconditions are used to control when the sensing action's step is performed. For this reason, a sensing action's hypothesized preconditions are associated with the sensing action's step rather than its operator. The assumption is that a sensing action's preconditions are specific to the given step and procedure rather than independent of a procedure like the preconditions of operators.

If the type of command is perform-step and the associated step is abstract, then Diligent treats the step as a black box that achieves the goal conditions of the step's subprocedure. Diligent does this by simulating the subprocedure (line 9). Simulating a subprocedure involves looking at the current state and determining which of the subprocedure's steps need to be performed. This means that Diligent may perform steps in the subprocedure that it would normally skip, or skip steps that it would normally perform.

Consider b) and c) in Figure 6.5. In b), the top command in the experimental stack is the abstract step proc1-6, which performs the subprocedure proc1. In c), step proc1-6 has been replaced by the steps of procedure proc1. Normally, when performing proc1-6, proc1's first step (turn-1) is skipped because step turn-5 has already shut valve1. However, during this experiment, step turn-5 is skipped.
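The stack-draining loop of Figure 6.4, including the in-place expansion of abstract steps, can be sketched as follows. This is an illustrative paraphrase: the callback names (reset_env, perform_action, is_abstract, is_sensing, simulate_subprocedure) are invented stand-ins for Diligent's internals, and the simulated subprocedure contents below are hard-coded for the skip-turn-5 scenario.

```python
def perform_experiment(stack, reset_env, perform_action,
                       is_abstract, is_sensing, simulate_subprocedure):
    """Drain the experiment stack (a sketch of Figure 6.4).

    simulate_subprocedure returns the subprocedure steps needed to
    reach its goal conditions from the current state.
    """
    while stack:
        kind, arg = stack.pop()
        if kind == "reset-environment":
            reset_env(arg)                      # Replay-Prefix
        elif is_abstract(arg):                  # black box: expand in place
            for step in reversed(simulate_subprocedure(arg)):
                stack.append(("perform-step", step))
        elif not is_sensing(arg):               # sensing actions do nothing
            perform_action(arg)                 # yields an action-example

# The skip-turn-5 experiment on top-level: with valve1 still open,
# simulating proc1 includes turn-1, a step normally skipped.
performed = []
perform_experiment(
    [("perform-step", "proc2-7"), ("perform-step", "proc1-6"),
     ("reset-environment", "prefix-1")],
    reset_env=lambda p: performed.append("reset"),
    perform_action=performed.append,
    is_abstract=lambda s: s in ("proc1-6", "proc2-7"),
    is_sensing=lambda s: s == "check-light-9",
    simulate_subprocedure={
        "proc1-6": ["turn-1", "move-2nd-2", "turn-3", "move-1st-4"],
        "proc2-7": ["press-test-8", "check-light-9", "press-reset-10"],
    }.get)
# performed: ['reset', 'turn-1', 'move-2nd-2', 'turn-3', 'move-1st-4',
#             'press-test-8', 'press-reset-10']   (check-light-9 skipped)
```

The sensing step check-light-9 is popped but produces no action, mirroring line 6 of the figure.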
Because Diligent attempts to achieve the goal conditions of proc1, proc1's first step (turn-1) is performed.

6.6.1 What Was Learned From the Experiment

As mentioned earlier, the purpose of experiments is to refine operator preconditions. Therefore, we will briefly review how operators are represented. In an operator, each state change is associated with three conjunctive sets of preconditions. The most general set of preconditions, g-rep, contains conditions that have been shown to be necessary. The best-guess set of preconditions, h-rep, contains likely preconditions, and the most specific set of preconditions, s-rep, contains unlikely preconditions. All conditions in the g-rep are contained in the h-rep and s-rep, and all conditions in the h-rep are contained in the s-rep. Changes to the three sets of preconditions impact the procedure differently. It is desirable to remove conditions that are only in the s-rep, but the s-rep is not used when deriving a plan's step relationships. In contrast, changes to the h-rep or g-rep are important because Diligent uses them to derive step relationships.

We will now discuss what is learned when experimenting on the above procedures. The above experiments illustrate how experiments are performed on hierarchical procedures. Experiments on hierarchical procedures focus on the current procedure and assume that subprocedures are already refined. However, experiments on a hierarchical procedure can refine subprocedures by performing them in different initial states. The experiments can also reveal unexpected dependencies between subprocedures.³

³Extensions that deal with unexpected behavior in subprocedures will be discussed in Chapter 8.

In our example, Diligent learned little when experimenting on the hierarchical procedure top-level. That is because top-level's three steps are relatively independent of each
other. Additionally, few steps were performed in new pre-states. The first step turn-5 is not needed by the second step proc1-6 because procedure proc1 contains step turn-1, which is equivalent to turn-5. The steps of subprocedure proc2 had some s-rep preconditions removed when step proc1-6 was skipped, but Diligent doesn't use s-rep preconditions when building plans.

If the instructor had experimented on procedure proc2, nothing would be learned because the procedure has only three steps and one of them represents a sensing action. Remember that sensing actions are ignored during experiments.

Step        Old Preconditions                New Preconditions                State changes
turn-1      (valve1 open)                    (valve1 open) (HandleOn valve1)  (valve1 shut)
move-2nd-2  (valve1 shut) (HandleOn valve1)  (HandleOn valve1)                (HandleOn valve2)
turn-3      (valve1 shut) (valve2 open)      (valve2 open) (HandleOn valve2)  (valve2 shut)
            (HandleOn valve2)
move-1st-4  (HandleOn valve2)                (HandleOn valve2)                (HandleOn valve1)

The preconditions in italics have been identified as necessary.
Table 6.1: Changes to proc1's Preconditions

In contrast, experimenting on procedure proc1 would have updated preconditions and caused Diligent to derive different step relationships. The changes to the preconditions are shown in Table 6.1. The preconditions shown in italics are in the g-rep, while the others are in the h-rep. Notice that the preconditions are much better after the experiments.

However, the final preconditions in Table 6.1 are not perfect. The steps move-2nd-2 and move-1st-4 should have no preconditions. Diligent is unable to remove the attribute for the pre-state location of the handle (i.e., HandleOn) because it has only seen the handle move between the two valves. The error would have been corrected if the instructor had demonstrated moving the handle from other valves.
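The pruning that produces Table 6.1 can be illustrated with a small sketch. This is hypothetical code, not Diligent's Refine-Operator (which also maintains a g-rep and processes negative examples); the condition tuples and the extra ("light", "off") candidate are invented for illustration.

```python
def refine_with_positive(h_rep, s_rep, pre_state):
    """Prune candidate preconditions falsified by a positive example.

    If an action's state change occurred even though a hypothesized
    precondition did not hold in the pre-state, that condition cannot
    be necessary, so it is dropped from the h-rep and s-rep.
    """
    holds = lambda c: pre_state.get(c[0]) == c[1]
    return ({c for c in h_rep if holds(c)},
            {c for c in s_rep if holds(c)})

# Skip-step experiment on proc1: when turn-1 is skipped, turn-3 still
# shuts valve2 even though valve1 is open, so ("valve1", "shut") is
# pruned from turn-3's hypothesized preconditions (cf. Table 6.1).
h = {("valve1", "shut"), ("valve2", "open"), ("HandleOn", "valve2")}
s = h | {("light", "off")}                       # an extra s-rep candidate
h2, s2 = refine_with_positive(h, s, {"valve1": "open", "valve2": "open",
                                     "HandleOn": "valve2", "light": "off"})
# h2 == {("valve2", "open"), ("HandleOn", "valve2")}
```

Conditions that hold in every observed pre-state, such as (HandleOn valve2) here, survive every positive example, which is exactly why the handle-location attribute could not be removed above.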
In any case, this is a subtle error that might escape the notice of an instructor and human students.⁴

⁴When evaluating Diligent (Chapter 7), none of the test subjects appeared to spot this error.

One potential concern is that the upper and lower bounds of the version space (i.e., g-rep and s-rep) have not converged to the same concept. However, this convergence is highly unlikely given a potentially large number of attributes and the few action-examples. Even OBSERVER [Wan96c], which had a lot more data than Diligent, did not expect convergence. Nevertheless, the version space's upper and lower bounds are still useful because they provide the instructor with a measure of Diligent's uncertainty. Besides, Diligent's objective is to provide the instructor with an h-rep containing reasonable preconditions.

6.7 Complexity Analysis

This section discusses the run-time complexity of the experimentation algorithm. In the following,

• We will consider a procedure to have one path.

• We will ignore the cost of resetting the environment's state and instead focus on the path's primitive steps. (The environment is reset before performing the procedure's steps.)

• Because sensing actions are not performed during experiments, we will not consider them.

Consider a one-level procedure (i.e., without subprocedures). If the procedure has n steps, the procedure is performed (n − 1) times while skipping steps. Each performance of the procedure takes (n − 1) steps. Thus, experiments on a one-level procedure perform O(n²) steps.

Unfortunately, when experimenting on a one-level procedure, about half of the steps may not provide any information. The problem is that the steps before the skipped step merely perform the procedure.
However, performing the procedure once might be useful if the procedure's path was created from multiple demonstrations, because the path's steps may not have been performed sequentially from start to finish. Experiments could avoid these unnecessary steps if the environment's state before the last skipped step could be quickly saved and restored. This capability would allow each performance of the procedure to start at a later step and a different initial state.

When hierarchical procedures are considered, the time complexity improves. A procedure can be viewed as a tree, where the procedure is the root node and each primitive step is a leaf node. The direct descendents of a procedure are its primitive and abstract steps, while its descendents are all the nodes in the tree whose root is the procedure. The length of the path from the root to a node is called its depth. The direct descendents of the root node have a depth of 1. The height of a tree is the maximum depth of any node. A procedure containing only primitive steps has a height of 1. A procedure of height 2 contains abstract steps, but the procedures performed by the abstract steps contain only primitive steps.

Let all procedures contain at most b direct descendents. Because a tree of a given height contains more nodes when it is balanced, we will assume that all procedures have b direct descendents and that a procedure's direct descendents are either all primitive or all abstract. An upper bound on the number of leaves of a tree of height h is b^h [CLR90]. In other words, a procedure of height h contains at most b^h primitive steps.

Consider an experiment performed on the hierarchical procedure at the root node with height h. Diligent only experiments on a given procedure's direct descendents.
Assume that the number of steps performed by subprocedures does not change because earlier steps were skipped. In this case, Diligent performs the procedure (b − 1) times while skipping a step. Each performance of the procedure uses (b − 1) direct descendent steps. Because each direct descendent is an abstract step, each of the direct descendents is realized by b^(h−1) primitive steps. Thus, the total number of primitive steps performed in an experiment on the root procedure is at most

b^(h−1) (b − 1)² = b^(h+1) − 2b^h + b^(h−1)    (X)

Now consider the case where the root procedure and every descendent subprocedure have experiments performed on them. We will prove by mathematical induction that this involves performing h(b^(h+1) − 2b^h + b^(h−1)) primitive steps, where h ≥ 1. Let G(h) = h(b^(h+1) − 2b^h + b^(h−1)).

Consider a procedure with only primitive steps (i.e., h = 1). In this case, (b − 1) of the procedure's steps are performed (b − 1) times. Because G(1) = (b − 1)², G(h) holds for h = 1.

Now assume that G(h) is correct for procedures of height h − 1. Consider a root procedure of height h. Each of the root procedure's direct descendents represents a procedure of height h − 1. Since there are b direct descendents, the number of steps performed while experimenting on procedures other than the root procedure is

bG(h − 1) = b(h − 1)(b^h − 2b^(h−1) + b^(h−2))
          = (h − 1)(b^(h+1) − 2b^h + b^(h−1))
          = G(h) − (b^(h+1) − 2b^h + b^(h−1))    (Y)

From (X), we know that the number of primitive steps performed while experimenting on only the root procedure is b^(h+1) − 2b^h + b^(h−1). Now if we combine the steps performed for the root procedure with the steps performed for the subprocedures (Y), we get a total of G(h) primitive steps for performing all experiments.
Thus, by mathematical induction it has been shown that G(h) bounds the number of primitive actions performed during experiments on a multi-level procedure of height h.

6.7.1 Scalability

Diligent's techniques are meant to be used with short procedures that can be combined into modular, hierarchical procedures. If procedures are authored in a hierarchical manner, the number of primitive steps performed by experiments decreases rapidly. Consider a procedure containing 125 primitive steps. If the procedure were authored without subprocedures, experiments would perform over 15,000 steps. However, the same procedure could be authored in a hierarchical manner with a height of 3 and a branching factor of 5. In this case, experiments would only perform 1,200 steps.

However, we have never seen any procedures close to this length. In the two domains that we've looked at, the HPAC has the longest procedures. If we ignore sensing actions, almost all HPAC procedures take less than about 15 steps. The longest procedure appears to be about 45 steps, but most very long procedures use common subsequences of steps, such as checking all 14 temperature sensors or opening and closing all 5 separator drain manifold valves. These common subsequences could easily be modeled by reusable subprocedures. Furthermore, our experience has been that the 1 or 2 minutes spent experimenting is a small portion of the authoring process.

Experiments on the hierarchical procedure may not learn as much about the procedure as experiments on a one-level procedure, but Diligent's focus is not autonomous exploration. Instead, Diligent's experiments should provide a bounded, heuristic aid for identifying operator preconditions. Although using a hierarchy of procedures helps, Diligent's approach to experimentation is probably inappropriate for very large procedures.
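The step counts quoted in this section follow directly from the formulas above and can be checked with a short script (a sanity check for the reader, not part of Diligent):

```python
def flat_steps(n):
    """One-level procedure with n steps: (n - 1) runs of (n - 1) steps."""
    return (n - 1) ** 2

def hierarchical_steps(b, h):
    """G(h) = h(b^(h+1) - 2b^h + b^(h-1)): primitive steps when every
    procedure in a balanced hierarchy (branching factor b, height h)
    has experiments performed on it."""
    return h * (b ** (h + 1) - 2 * b ** h + b ** (h - 1))

print(flat_steps(125))            # 15376  ("over 15,000 steps")
print(hierarchical_steps(5, 3))   # 1200   (height 3, branching factor 5)
```

Note that a height-3, branching-factor-5 hierarchy has 5³ = 125 primitive leaves, so the two calls describe the same 125-step procedure authored two different ways.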
As Diligent gains more experience, experiments are likely to provide little additional knowledge because by then both operators and subprocedures are likely to be very refined. Instead, Diligent's approach appears more appropriate for small subprocedures that can be used to construct large procedures.

6.8 Related Work

When appropriate, related work has been mentioned throughout this chapter. However, some other work should be mentioned.

6.8.1 The Self-Explanation Effect

The self-explanation effect [CBL+89, CV91, CLCL94, Ren97] describes the phenomenon where human students can solve procedural problems better if they study a few problem solutions in detail rather than many solutions briefly. The term "self-explanation" is used because students need to make a conscious and deliberate effort to justify each of the solution's steps. Besides better problem solving, Chi et al. [CBL+89] found that students who produced self-explanations when studying physics had a better understanding about gaps in their knowledge.

Although Diligent does not model human cognition, the self-explanation effect motivates Diligent's experimentation technique of examining each demonstration in detail. Diligent's demonstrations are comparable to the problem solutions given human students. To explain a demonstration, Diligent tries to understand how state changes caused by earlier steps affect later steps.

The self-explanation effect is modeled by CASCADE [VJC92, Van99], which models human students learning to solve physics problems by studying the solutions of problems. Instead of experimenting with a simulation like Diligent, CASCADE uses knowledge of domain theorems (e.g., physics laws) and problem-modeling concepts. CASCADE has been used as the basis for acquiring knowledge for an automated tutoring system [GCV98].
If a knowledge acquisition system has easy access to a well-defined domain theory, then CASCADE's approach might be appropriate. Unlike CASCADE, Diligent does not require direct access to a well-defined domain theory.

6.8.2 Other Systems

A system that experiments by systematically analyzing demonstrations is PET [PK86]. Unlike Diligent, PET has complete control of the state. PET attempts to understand a sequence of actions by systematically changing the state and then performing actions. However, Diligent cannot use this approach because Diligent has limited control over the environment's state.

A system that uses demonstrations for generating experiments is CAP [HS91]. CAP observes another agent and creates a theory to describe a sequence of actions. Unlike Diligent, CAP uses inverse resolution to create new concepts and to generalize the object classes. As discussed in Chapter 5, Diligent solves a different learning problem. Furthermore, CAP reactively experiments when the environment is in an opportunistic state rather than systematically resetting the state and performing experiments.

If you consider a successful plan as equivalent to a demonstration, then some case-based systems can also use demonstrations to generate experiments. For example, CHEF [Ham89] performs experiments by adapting and repairing plans for Szechwan cooking. CHEF experiments by creating a plan and then getting feedback about plan failure from a simulation. The feedback consists of faults and reasons. A fault is an undesired attribute value, and a reason is a causal explanation for the fault. Instead of repairing plans, Diligent learns the type of causal knowledge that is returned to CHEF by the simulation.
6.9 Summary

In this chapter we discussed how Diligent performs autonomous experiments to help it understand the preconditions of a procedure's steps. We first looked at how this problem fits into the general requirements: experiments should make the instructor's job easier while maximizing the use of the limited number of demonstrations.

We also discussed some specific requirements. Experiments should generate action-examples for refining the operators associated with the procedure's steps. The action-examples should help compensate for the bias used in creating operator preconditions. To promote learning, a step's action-examples should have similar pre-states so that positive and negative examples have similar pre-states. Because the system is interactive, it should be fast and should attempt to bound the number of steps in an experiment.

We then discussed why other approaches were inappropriate: they perform too many steps, require too much domain knowledge, require too much interaction with the instructor, and do not focus on understanding the demonstrations of the given procedure.

We then discussed Diligent's approach to experimentation. Diligent performs the procedure while skipping a step and observing how this impacts later steps. Diligent's approach does not require interaction with the instructor and focuses on understanding the given procedure's demonstrations. Furthermore, because Diligent does not attempt to achieve any goal state, each experiment has a bounded number of steps. The number of steps is further limited because experiments treat abstract steps the same as primitive steps. In other words, Diligent skips steps in the current procedure, but does not perform similar experiments in the subprocedures associated with abstract steps.
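The skip-one-step strategy summarized above can be illustrated with a small sketch. This is not Diligent's actual implementation: the toy step model (preconditions and effects over a flat attribute/value state) and all names are hypothetical. It shows how omitting one step exposes which later steps depend on it, and why each experiment is bounded by the procedure's length.

```python
from collections import namedtuple

# Hypothetical toy model: a step has preconditions and effects over a flat
# attribute/value state (echoing valve-style values such as "shut"/"open";
# this is an illustration, not Diligent's representation).
Step = namedtuple("Step", ["name", "pre", "eff"])

def run_with_skip(steps, initial_state, skip_index):
    """Replay the procedure, skipping one step, and record which later
    steps fail because their preconditions are no longer satisfied."""
    state = dict(initial_state)
    failures = []
    for i, step in enumerate(steps):
        if i == skip_index:
            continue  # the experiment: omit exactly one step
        if all(state.get(attr) == val for attr, val in step.pre.items()):
            state.update(step.eff)  # step succeeds; apply its state changes
        else:
            failures.append(step.name)  # evidence it depends on the skipped step
    return failures

def skip_experiments(steps, initial_state):
    # One experiment per step; each performs at most len(steps) - 1 actions,
    # so the cost is bounded without any goal-directed search.
    return {steps[i].name: run_with_skip(steps, initial_state, i)
            for i in range(len(steps))}

procedure = [
    Step("open cutout valve", pre={}, eff={"cutout": "open"}),
    Step("open main valve", pre={"cutout": "open"}, eff={"valve": "open"}),
]
print(skip_experiments(procedure, {"cutout": "shut", "valve": "shut"}))
# {'open cutout valve': ['open main valve'], 'open main valve': []}
```

In the example, skipping the cutout-valve step makes the main-valve step fail, which is the kind of evidence that a precondition links the two steps; note that the sketch never recurses into subprocedures, mirroring how abstract steps are treated the same as primitive steps.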
We finished by showing that hierarchical composition of larger procedures from smaller procedures can greatly reduce the number of steps performed during experiments.

Chapter 7

Empirical Evaluation

So far, we've discussed how Diligent understands demonstrations and how Diligent can be used for authoring. But is Diligent an effective tool for authoring? This chapter addresses this question. Specifically, a study was conducted where people authored procedures with different versions of Diligent. In the study, everyone authored the same procedures, but each subject only used a single version of Diligent. The different versions were then compared using variables such as accuracy, effort, time and subjective evaluation.

This chapter is organized as follows. First, we discuss the testable hypotheses and the three versions of Diligent that were used to test the hypotheses. We then discuss how we tested the usability of Diligent and its tutorial materials. Afterwards, we discuss the experimental method, the experimental results, and how the results support the hypotheses.

7.1 Hypotheses

In the evaluation, we were concerned with two hypotheses that dealt with the benefits of demonstrations and of experiments. One hypothesis is that demonstrations are beneficial even if Diligent does not perform experiments. To test this hypothesis, we compared subjects who used demonstrations without experiments against subjects who only used an editor. The other hypothesis is that using both experiments and demonstrations is better than using only demonstrations. To test this hypothesis, we compared subjects who used both demonstrations and experiments against subjects who only used demonstrations. When testing these hypotheses, all subjects could use an editor.
The subjects who only used an editor differed from the others in that they had to specify a procedure's steps with the editor. In contrast, the other subjects had to specify steps with demonstrations.

To measure these hypotheses, a number of testable claims were created. Each claim corresponds to one of the dependent variables.¹

Claim 1: Subjects require less work to create a procedure when using demonstrations and experiments than when using only demonstrations. Work in this case means deliberative changes to Diligent's knowledge base rather than time spent authoring. For example, adding a step is a deliberative change while looking at a menu is not.

Claim 2: Subjects require less work to create a procedure when using only demonstrations than when using only an editor.

Claim 3: Using demonstrations and experiments results in fewer errors than when using only demonstrations.

Claim 4: Using only demonstrations results in fewer errors than when using only an editor. Demonstrations should be helpful because Diligent uses them to identify preconditions. When identifying preconditions, Diligent uses a heuristic bias that favors likely but potentially unnecessary preconditions. Thus, subjects who use demonstrations can focus on a small set of likely preconditions, while subjects who use an editor have to consider a large set of potential preconditions.

Claim 5: Subjects require less work to create a correct procedure when using demonstrations and experiments than when using only demonstrations.

Claim 6: Subjects require less work to create a correct procedure when using only demonstrations than when using only an editor.

Claim 7: Subjects can author in less time using demonstrations and experiments than when using only demonstrations.

Claim 8: Subjects can author in less time using only demonstrations than when using only an editor.
Because it did not seem feasible, we did not test the benefits of hierarchical procedures or the reuse of existing procedures.

¹Section 7.4.3 describes the dependent variables.

7.2 The Three Versions of Diligent

In order to test the experimental hypotheses, three versions of Diligent were created. The versions support different methods for adding steps and for specifying preconditions and state changes. All versions allow subjects to edit an existing plan. The three versions are described below.²

• Demonstrations and Experiments. Subjects can demonstrate procedures, and Diligent can experiment on the procedures.

• Demonstrations. Subjects can demonstrate procedures, but Diligent cannot perform experiments.

• Editor Only. Subjects cannot demonstrate and Diligent cannot experiment, but subjects can use an editor to create a declarative specification. A subject adds a step by selecting an action to perform. The subject then specifies preconditions and state changes associated with the step by selecting attributes and typing in their values. The menus for specifying actions and attribute values are only available in this version of the system.

Requiring subjects to enter attribute values by typing is reasonable because Diligent does not know which attribute values are legal. Furthermore, typing isn't that onerous because most attribute values are short (e.g., "shut") and because subjects are given a list containing each attribute's legal values (see Appendix B). Furthermore, avoiding typing errors is a benefit of demonstrating. Because the subjects were given a list of all legal attribute values, one could argue that it would be little effort to provide menus containing all legal values. However, the list of legal attribute values was only provided because it was necessary for subjects who used this version of Diligent.
However, this discussion about whether or not subjects should type in values appears to be moot because subjects appeared to make so few errors in typing that these errors had little or no effect.

Because this version does not allow demonstrations, this version does not allow interaction with the environment while steps are being added. While steps are being added, this version ignores actions performed in the environment, does not perform actions, and ignores the state of the environment. This version is meant to correspond to declaratively specifying a procedure using a text editor, but unlike a text editor, this version automatically collects evaluation data, guarantees syntactic correctness and allows the system to check for consistency.

²Appendix D describes how to use the different versions.

For example, the system checks for consistency when deriving a procedure's step relationships. A subject is warned about inconsistency when the state changes of an earlier step establish an attribute value that is different than the value in a later step's precondition.

The menus used for all three versions are very similar. All versions use the same procedure and operator representation. The versions also use the same algorithms to derive goal conditions and step relationships. However, because the editor only version lacks knowledge of the environment's state, that version uses the preconditions and state changes of steps to create a pre-state and post-state for each step.

7.3 Usability Analysis

Prior to conducting the study, an informal analysis of Diligent's usability was conducted to ensure that the user interface and the training documentation were adequate. For the user interface, this meant that subjects could author procedures with Diligent and knew how to find various types of information.
For training documentation, this meant that subjects could cover the material in 30 to 40 minutes.

In order to avoid using all potential subjects, usability was tested on only three subjects (1 graduate student and 2 research staff) over a period of a couple of months. We planned on performing the test in the following manner. Subjects would be videotaped using Diligent and would vocalize their thoughts. However, unlike a formal protocol analysis [Chi97, ES84, JH95, GMAB93], the subjects' vocalizations would not be systematically analyzed. Subjects were to use the same training material as the formal evaluation. Additionally, the subjects were to learn all three versions of Diligent.

However, the test did not go as planned. Because of problems in the documentation and, to a lesser extent, the user interface, none of the subjects completed all the training material. Of the training material for the study's two sessions, the subjects only covered the first session's material.³ Furthermore, subjects had difficulty vocalizing their thoughts. The most important finding was that the first day's training took at least 50 minutes. The long training period is important because it limited the number of people willing to be test subjects.

³As will be explained in greater detail (Section 7.4.4), the study took two sessions, one in which the subjects learned how to use Diligent, and a second in which they reviewed what they had previously learned and then carried out the test.

Although only a few subjects were used, each subject caused substantial improvements in the materials given to the next subject. This phase of the evaluation resulted in the following changes:

• Too many of the windows looked alike. One subject did not notice the window titles on the window borders.
This created confusion about which menu was being viewed and about the functionality of different menus. Giving the menus large titles seemed to solve this problem.

• The interfaces of too many programs were used. Subjects interacted with four programs that each had a different look and feel. The most problems were caused by the differences between Diligent and the STEVE tutor. At the time, STEVE's interface was used for testing procedures, but the interface inconsistencies between STEVE and Diligent made the training more difficult. Therefore, Diligent was given control of testing. Although Diligent uses STEVE's functionality, subjects initiated testing inside Diligent. This has a number of advantages. A lot of debugging activities are combined onto one menu. The approach also supports easier instrumentation and allows Diligent to disable testing during activities such as experiments and demonstrations. Earlier, the possibility that subjects might simultaneously test and demonstrate was a major concern.

This issue has architectural implications for this type of heterogeneous system. Either the disparate software components must conform to a common user interface look and feel, or the components need to support the use of their functionality by other components. Given that components may come from very different sources, exporting functionality may be easier than enforcing a common look and feel.

• Use as many forcing functions as possible. A forcing function [Nor88] prevents a user from performing actions that are unwanted in a given context. For example, Diligent's windows contain buttons that will close them, but one subject kept using the X-window exit command. This behavior was unanticipated and caused inconsistent data. This problem was solved by preventing the X-window exit command from closing the window. Another example of a forcing function is disabling testing during a demonstration.
• The user's manual was transformed into a tutorial. Originally, the user's manual gave instructions for a running example while extensively describing each window. Unfortunately, subjects had difficulty remembering the important points. Therefore, the user's manual was transformed into a tutorial by trimming unnecessary descriptions, adding summaries and ignoring unnecessary windows.

Surprisingly, most of the effort to improve usability involved improving the tutorial.⁴

7.4 Experimental Method

After analyzing and improving Diligent's usability, a study was conducted to evaluate Diligent. The study had a between-subjects design where each subject authored two procedures and used only one version of Diligent.

7.4.1 Independent Variable

The independent variable was the method of authoring. Each of the three versions of Diligent (Section 7.2) represented a different method of authoring. Thus, there were three experimental conditions.

• EC1: Authoring with demonstrations and experiments.

• EC2: Authoring with only demonstrations.

• EC3: Authoring with only an editor.

As mentioned earlier, all three experimental conditions allowed subjects to edit existing procedures.

7.4.2 Test Subjects

Test subjects were recruited by asking computer science graduate students and sending email to the staff at the Information Sciences Institute. Sixteen subjects started the study, and all but one finished it.⁵ Of the fifteen subjects who completed the study, fourteen were computer science graduate students and one was a member of the technical staff.⁶ Most subjects work in areas related to artificial intelligence. Subjects were paid 20 dollars.

An effort was made to balance the subjects' sex, education and whether they were native English speakers.
However, this proved difficult because few subjects were available and willing, because subjects cancelled, and because of problems that resulted in some lost data for the first procedure. (We will discuss these problems later.)

⁴While the tutorial covered authoring an example procedure in a keystroke-by-keystroke manner, Diligent was not used to directly author the tutorial because Diligent cannot capture screen snapshots of its own menus.

⁵The subject who quit felt that he was too busy to finish the study.

⁶It was initially thought that the subject was a graduate student.

The fifteen subjects were placed in the three groups in an uneven manner. Groups EC1, EC2 and EC3 had 4, 6 and 5 subjects, respectively. One factor influencing this was the inability to collect some data for EC2 subjects. Another factor was that few subjects were available later in the evaluation. One subject (subject 11) was switched from EC1 to EC2 because the subject used demonstrations but no experiments.⁷

Type of subject              EC1   EC2   EC3
male native speaker           0     1     3
male non-native speaker       3     3     2
female native speaker         1     1     0
female non-native speaker     0     1     0

Table 7.1: Distribution of Subjects Based on Sex and Language

Table 7.1 shows the distribution of subjects based on sex and language. The major balancing effort was attempting to get enough subjects in each group. The next criterion was trying to balance English ability and then sex. Furthermore, if it was felt that a subject knew that Diligent uses programming by demonstration, then the subject was put in group EC1 or EC2. This was done to avoid preconceptions from biasing the control group (EC3).⁸ Because subjects had to cover around 90 pages of training material, it was felt that native English speakers would find the training easier.
For this reason, subjects were distributed so that no group had more English speakers than the control group (EC3).

One problem with the methodology is that the background questionnaire was filled out after the subjects were assigned to an experimental condition. This means that only sex and English ability were immediately obvious. Thus, the number of years of education could only be roughly estimated and was, therefore, difficult to use for assigning subjects to groups.

⁷In order to keep Diligent's user interface responsive, Diligent only experiments when asked to do so by the user.

⁸For the last few subjects, whether a subject was likely to know that Diligent uses programming by demonstration was not considered.

7.4.3 Dependent Variables

The goal of the experiment was to get some measure of the difficulty of authoring. The idea is that authoring should be faster and more accurate if there is less burden placed on the instructor. The dependent variables were

• Time. Time was measured three ways. One time was the training time, which includes the few minutes used to fill out the background questionnaire. The second time was the time spent authoring before the subject started testing the procedure, and the third time was the total time spent authoring a procedure. After a subject started testing, the subject could still provide demonstrations and edit procedures, and Diligent could still perform experiments.

• Logical Edits. A logical edit is an authoring activity that requires knowledge of the procedure or the domain. Logical edits can be thought of as deliberative changes to Diligent's knowledge base. Logical edits were used to factor out time-related user interface efficiency issues that were highly dependent on the structure and layout of menus.
The following items were counted as logical edits:⁹

- Adding or demonstrating a step.
- Performing an action as part of a demonstration's prefix.
- Deleting a step from a procedure.
- Editing preconditions, state changes, goal conditions, and step relationships (i.e., causal links and ordering constraints).
- Edits to a filter. A filter allows a subject to prevent a given attribute from being used in causal links or ordering constraints.¹⁰

Logical edits did not include more passive activities, such as looking at menus or approving data derived by Diligent. (Diligent uses its knowledge of preconditions and state changes to derive a procedure's goal conditions and step relationships.) Logical edits were recorded in two places: immediately before a subject started testing a procedure and when a subject was finished with a procedure. During authoring, Diligent automatically collected the metrics used for counting logical edits. After a procedure was finished, the metrics and Diligent's knowledge base were saved to files. After saving the data, Diligent prepared for the next procedure by erasing its knowledge base and clearing the counters used for gathering metrics.

⁹Edits to associate an effect with a step were also measured for EC3 but were not used because these edits usually required little thought.

¹⁰Filters are meant to remove "nuisance" attributes that an author doesn't care about. Filters were not needed in the procedures being authored, and none of the subjects used them.

• Errors. Another metric was the number of errors in a procedure's plan. Each plan was compared against an ideal target plan. Each additional or missing piece of knowledge was counted as one error. See Section 7.4.3.1 for details on how errors were measured.

• Total required effort. This was the amount of work needed to make a procedure correct.
This was the sum of logical edits and errors. For simplicity, we assumed that each error could be corrected by one logical edit.

• Qualitative Impressions. After authoring both procedures, subjects filled out a questionnaire about their subjective impressions of Diligent.

7.4.3.1 Measuring Errors in Plans

When a subject finished authoring a procedure, the procedure's plan was saved to a log file. After all subjects had completed the study, the subjects' plans were compared against idealized target plans (Appendix B). This comparison identified errors in the subjects' plans. Errors occur when a plan has missing or unnecessary steps or step relationships.

The problem is that it is sometimes difficult to count errors. For example, a plan may have a necessary step that is repeated several times. Obviously, the step should only be counted once. However, the subject's plan may contain all the causal links for the target plan's step without containing a single step that is associated with all the causal links. The issue is how to decide which step relationships are correct. Errors were calculated as follows.

• Each difference from the target plan counted as one error. In other words, each incorrect or missing step, causal link or ordering constraint counted as one error.

• A step was correct if the target plan contained a step with the same action. However, each step in the target could only match a single step in the subject's plan. When multiple steps in the subject's plan mapped to a step in the target, one of the subject's steps was selected based on its relative position compared to the plan's other steps. In particular, the step was chosen to preserve as many dependencies between steps as possible.
In several instances with the editor only version (EC3), the state changes of several target steps were produced by a single step. When this happened, the subject's step was associated with the target step that seemed most reasonable.

• Causal links and ordering constraints were checked by comparing corresponding steps in the target and subject's plans. Causal links and ordering constraints could also be matched if only one of their two steps mapped to a step in the target procedure. In this case, the action of the excluded step must have matched an action of one of the steps in the target procedure. However, a step relationship (i.e., causal link or ordering constraint) in the target plan could only match one step relationship in the subject's plan. Counting a step relationship when only one of its steps matched helped when a subject's procedure contained an unnecessary repetition of one of the target procedure's steps. This most often helped plans by subjects that only used an editor (EC3).

• When authoring a procedure, subjects sometimes authored several plans. When this happened, the plans were inspected and the most complete and correct plan was used. (This always appeared to be the most recent plan.) However, the logical edits for all the plans were counted.

• The final versions of the plans were also inspected to see if STEVE could demonstrate them. Things that prevented successful demonstrations include

- Missing steps.
- Some of the step relationships used in the plan would not be satisfied by the environment.

A demonstration was considered possible if the order of the steps in the procedure was valid, even if some of the necessary step relationships were missing.

7.4.4 Test Procedure

To perform the study, the subjects completed the activities listed in Table 7.2. The activities took place over two consecutive days. Each day's activities took approximately two hours.
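As a rough sketch of the error metric, the following compares a subject's plan to a target plan and counts each missing or extra step, causal link, or ordering constraint as one error. It is a simplification: it matches by multiset difference and omits the positional tie-breaking and the single-endpoint matching of step relationships described above; the plan encoding and all names are hypothetical.

```python
from collections import Counter

def count_plan_errors(subject, target):
    """Count one error per step, causal link, or ordering constraint that
    is missing from, or extra with respect to, the target plan. Steps are
    matched by action name; each target step can absorb at most one
    subject step, so duplicate steps count as errors."""
    errors = 0
    for part in ("steps", "links", "orderings"):
        subj = Counter(subject.get(part, []))
        targ = Counter(target.get(part, []))
        # (subj - targ) = unnecessary extras, (targ - subj) = missing items
        errors += sum(((subj - targ) + (targ - subj)).values())
    return errors

target = {
    "steps": ["open cutout", "open valve", "start pump"],
    "links": [("open cutout", "open valve")],
    "orderings": [],
}
subject = {
    "steps": ["open cutout", "open cutout", "start pump"],  # duplicate + missing step
    "links": [],                                            # missing causal link
    "orderings": [],
}
print(count_plan_errors(subject, target))  # 3
```

The multiset arithmetic (`Counter` subtraction keeps only positive counts) makes the "each target step matches at most one subject step" rule fall out automatically: the second "open cutout" survives as an extra and counts as an error.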
Participation of each subject covered two days so that subjects could assimilate the first day's training. All subjects appeared more proficient on the second day.

Day  Activity                                                    Time Limit (minutes)
1    Fill out background questionnaire.
1    Work through the first day tutorial. Read short system
     overview. Manipulate the environment's graphical
     interface. Read about procedural representation and
     fill out procedural representation worksheet. Create a
     procedure. Edit and test the procedure. Review summary
     of how to use Diligent.
2    Review the first day tutorial by focusing on the summary    10
2    Work through the second day tutorial. Review the first
     day's material by authoring a simple procedure. Learn
     how to delete unwanted steps.
2    Solve a practice problem, which involves creating and       10
     testing a procedure.
2    Look at practice problem solution
2    Start experiment by authoring the first procedure.          30
2    Author the second procedure                                 30
2    Fill out questionnaire about impressions of Diligent

Table 7.2: Activities Performed By Subjects

The training received by all experimental conditions was deliberately very similar. The training for the group with both demonstrations and experiments (EC1) was nearly identical to that for the group that allowed demonstrations but no experiments (EC2). Even the training for the group who could only use an editor (EC3) was similar to the other groups. In fact, the tutorial for EC3 differed from the other groups only in how steps were added to a procedure and how preconditions and state changes were specified.

The tutorial starts out very specific but becomes more abstract after an activity has been described. When first encountering an activity, the tutorial describes each action on a button click by button click basis. Associated with these detailed instructions were dozens of screen snapshots.
Later, after an activity has already been covered, the tutorial only provides a high level description of what needs to be done. The initial detail provides scaffolding that promotes initial understanding, and the later removal of the scaffolding promotes competence by reducing the reliance on detailed instructions. For example, the first day tutorial for EC1 is 77 pages¹¹ and has over 48 figures and tables. In contrast, the second day tutorial reviews much of the same material in only 7 pages.

Before authoring a procedure on the first day (Table 7.2), subjects read about the procedural representation and filled out a worksheet on it (Sections B.2 and B.3). This separation was critical for training. Otherwise, users would be required to author a procedure before they understood the procedural representation. If subjects were to focus on learning the representation, they might pay less attention to learning the user interface. The material on procedure representation was believed to be much more important for the group that used an editor instead of demonstrations (EC3).

The practice problem at the end of training helped ensure that subjects were ready to perform the experiment. The problem let subjects use the system without directions, and the solution allowed them to check if they had misconceptions about how they should author.

Originally, the test monitor was to have little or no communication with a subject that was not part of the test script. Questions would be answered by pointing to windows or pre-printed answers, such as "yes" or "no." However, this proved very awkward. Therefore, during training, pointing to windows and tutorial pages was used, but verbal answers (e.g., "yes") were sometimes given. An effort was made to make verbal answers as short and as specific as possible.
Questions were only answered if they were relevant to the tutorial material that the subject was working on. Because of the detail in the tutorial, questions were infrequent. In contrast, during the experiment, pre-printed directions were used and questions could not be answered. However, if there was a software problem during an experiment, the test monitor spoke to the subject and attempted to put the system back into a usable state.

Before authoring each procedure, Diligent's existing knowledge of the domain was erased. The environment was then placed in the procedure's initial state, and the subjects were then given the following information:

• A functional description of the procedure without an explicit specification of which steps to perform. The steps were not included because there was concern that subjects would simply transcribe the description into Diligent's representation.

• A set of pictures with labels for relevant objects.

¹¹Many pages contain a great deal of whitespace.

• A list of all HPAC attributes and their legal values. (The list was needed by the control group (EC3), which could not use demonstrations.)

Although both the training and the experiment used the HPAC domain, the HPAC objects used in the experiment were not used for authoring procedures during training. Additional information on the test procedure is contained in the appendices. Appendix D contains some of the tutorial material.¹² Appendix B contains the other evaluation materials (e.g., directions). Appendix C contains deviations from the test procedure as well as the other data collected during the study.

7.4.5 The Procedures Being Authored

During the experiment, subjects authored two procedures.¹³
The procedures are derived from real procedures in the HPAC domain, but have been adapted to the portion of the HPAC that is supported by the graphical interface. The simulation that controlled the environment was modified so that the procedures were supported and were partially ordered.¹⁴ The procedures were chosen for the following reasons:

• They were partially ordered.
• Knowledge of one procedure should provide little or no help on the other procedure.
• Each procedure was logically one procedure rather than a concatenation of two procedures.
• They had between 6 and 8 steps.

The two procedures have slightly different properties. The first procedure has a deliberately more abstract description and is more complex than the second procedure (8 steps, 13 ordering constraints and 30 causal links versus 6 steps, 7 ordering constraints and 16 causal links). The procedures were authored in this order because reversing the order might have caused subjects to include "unwanted" attributes in the more complex procedure, which would have made scoring the procedure more difficult.

¹²The training material for each of the groups contains approximately 90 pages. Because of the length and the similarity of the material between groups, Appendix D combines training material for the three groups.

¹³The procedures that were authored and the test materials are in Appendix B.

¹⁴Diligent is designed for partially ordered procedures. The experiments performed by Diligent may not learn much if a procedure is totally ordered. A procedure is totally ordered if there is only a single valid order for performing the steps.

There were a number of problems when subjects authored the first procedure. First, there was a memory leak that would cause the environment's graphical interface (i.e., the Vista Viewer) to become less responsive and sometimes crash.
This problem was fixed with a software upgrade. Second, subjects would sometimes demonstrate steps too quickly. This caused the steps to appear to be simultaneous, and simultaneous steps cause problems with Diligent's operator learning algorithms. This problem was made more likely as the Vista Viewer became less responsive. The problem was addressed by fixing the memory leak and by reminding subjects not to demonstrate too quickly. An additional problem was that the description of the first procedure was unclear. This caused some subjects to have difficulties identifying the correct steps. This problem was addressed by clarifying the description. Because of these problems, different numbers of subjects are used when analyzing the two procedures. Only the final 6 subjects are used for the first procedure, while all subjects are used for the second procedure.

7.4.6 Data Analysis

Section 7.1 contains testable claims about differences between groups EC1 and EC2 and between groups EC2 and EC3. To test for differences between groups, we used Analysis of Variance (ANOVA) [WW72], which tests for differences between all groups. ANOVA compares variance within groups to variance between groups.[15] Because ANOVA depends on groups having a normal distribution and similar variances, we also used the Kruskal-Wallis test. The Kruskal-Wallis test is a non-parametric test, which means that it does not depend on the distribution. Instead of using a dependent variable's values, the Kruskal-Wallis test sorts the values and uses their relative order. Of course, a non-parametric test requires greater differences than an ANOVA test.

Because ANOVA and Kruskal-Wallis compare all groups, we performed post hoc tests to compare pairs of groups. A post hoc test simultaneously compares pairs of groups to identify significant differences while maintaining a 95% probability that all comparisons are true.
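The two omnibus tests just described can be sketched with SciPy. The per-group values below are invented for illustration; they are not the study's measurements:

```python
from scipy.stats import f_oneway, kruskal

# Hypothetical per-subject scores for the three experimental conditions.
ec1 = [9, 10, 8, 11]
ec2 = [14, 17, 20, 16]
ec3 = [24, 27, 25, 28]

# ANOVA: compares variance between groups to variance within groups.
f_stat, f_p = f_oneway(ec1, ec2, ec3)

# Kruskal-Wallis: rank-based, so it makes no normality assumption.
h_stat, h_p = kruskal(ec1, ec2, ec3)

print(f"ANOVA: F = {f_stat:.2f}, p = {f_p:.4f}")
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {h_p:.4f}")
```

Both tests only report that *some* group differs; identifying which pairs differ is the job of a post hoc test.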
The post hoc test used was Scheffé's F [Sch53], which requires a significant ANOVA F value but is robust with regard to heterogeneous variances. For statistical significance, we used the .05 probability level.

[15] All statistical calculations were performed with version 5.0 of StatView for Windows by SAS Institute Inc.

A word of caution: the statistical significance of this chapter's results should be viewed with a little skepticism. Significance was difficult to establish because there were few subjects. Furthermore, because there were so few data points, the results are too sensitive to individual data points. In fact, some researchers do not consider data statistically significant unless there are several times as many subjects as in this study. Nevertheless, the results are valuable because they suggest patterns and trends.

7.5 Results

This section presents the data collected during the study.[16] (The data will be discussed in Section 7.6.) In the following tables, a few conventions are used. The number of digits shown may not indicate the number of significant digits. The "pre-test" values are the values when subjects started testing their procedures. If a subject didn't test a procedure, the final value was used. As mentioned earlier, because there were relatively few subjects, the following data are used to suggest trends and patterns rather than to provide solid statistical proof.

7.5.1 Results of Background Questionnaire

At the beginning of training, subjects filled out a background questionnaire. Their responses were then analyzed to look for patterns in the distribution of subjects between groups. The experimental condition (e.g., EC1) was used as an ANOVA factor for this analysis. The results are shown in Tables 7.3 and 7.4.[17]
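Scheffé's test is straightforward to compute by hand. The sketch below is a simplified textbook version of the procedure, not StatView's implementation, and the data are hypothetical:

```python
import numpy as np
from scipy.stats import f as f_dist

def scheffe_pairwise(groups, alpha=0.05):
    """All-pairs Scheffé post hoc test after a one-way ANOVA.

    Returns a dict mapping each pair (i, j) to True when the group
    means differ significantly at the given alpha level.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    total_n = sum(sizes)
    means = [float(np.mean(g)) for g in groups]
    # Within-group (error) mean square from the one-way ANOVA.
    ss_within = sum(((np.asarray(g, dtype=float) - m) ** 2).sum()
                    for g, m in zip(groups, means))
    mse = ss_within / (total_n - k)
    # Scheffé criterion: (k - 1) times the critical F value.
    crit = (k - 1) * f_dist.ppf(1 - alpha, k - 1, total_n - k)
    result = {}
    for i in range(k):
        for j in range(i + 1, k):
            stat = (means[i] - means[j]) ** 2 / (mse * (1 / sizes[i] + 1 / sizes[j]))
            result[(i, j)] = bool(stat > crit)
    return result

# Hypothetical data: the first group clearly differs from the other two.
print(scheffe_pairwise([[1, 2, 1, 2], [10, 11, 10, 11], [10, 12, 11, 11]]))
```

Because the criterion holds the family-wise error rate at alpha across all pairwise comparisons, Scheffé's test is conservative, which matches its use here only after a significant omnibus F.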
The only significant difference is the typical time spent browsing per week. The group that demonstrated without experiments (EC2) spent the most time browsing (13 hours).

The variable education represents years of education. This includes 12 years for graduating from high school. The group that demonstrated and experimented (EC1) had the most education, and the group that only demonstrated (EC2) had the least. However, the standard deviation of the group that only used the editor (EC3) is several times larger than the standard deviations of the other groups.

The variable English ability indicates a subject's rating of his English proficiency. The subject's rating was converted into a numeric value: good (1), excellent (2), native (3).

[16] Appendix C contains a more detailed presentation of the data collected during the study.
[17] The ANOVA values were computed with 12 and 2 degrees of freedom, except for previous week's computer use, which had 11 and 2 degrees of freedom.

  Dependent Variable                            F       Probability
  Education                                     1.542   .2535
  English ability                               1.346   .2969
  Sex                                           0.912   .4278
  Age                                           0.863   .4466
  Machine learning knowledge                    0.340   .7187
  Artificial intelligence planning knowledge    0.340   .7187
  Programming ability                           0.167   .8484
  Typical browsing                              4.851   .0286
  Programming last week                         3.806   .0554
  Typical hours/week                            2.211   .1522
  Browsing last week                            1.928   .1916
  Total hours last week                         1.774   .2149

Table 7.3: Background ANOVA Tests

                                                EC1             EC2             EC3
  Dependent Variable                            Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Education                                     21.9  0.85      18.9  1.5       19.5  4.3
  English ability                               1.7   0.96      2.0   0.89      2.6   0.55
  Sex                                           0.75  0.5       0.67  0.5       1.0   0
  Age                                           33.7  2.5       30.0  3.2       35.0  10.6
  Machine learning knowledge                    0.5   0.58      0.3   0.52      0.6   0.55
  Artificial intelligence planning knowledge    0.5   0.58      0.3   0.52      0.6   0.55
  Programming ability                           2.0   0.0       2.0   0.63      2.2   0.84
  Typical browsing                              8     6         13    5         5     0
  Programming last week                         8     9         20    10        31    15
  Typical hours/week                            40    0         49    11        44    9
  Browsing last week                            10    7         7     6         3     2
  Total hours last week                         30    8         37    14        45    7

Table 7.4: Background Means and Standard Deviations

The variable sex indicates whether a subject is male (1) or female (0). Because several female subjects canceled, the distribution of females is skewed.

The variable age is the subject's age in years. Because the questionnaire asked subjects to circle a range of ages, the top age in the interval was used.[18] The reason that group EC3 had the largest age is that the group had the oldest subject (50).

The variables machine learning knowledge and artificial intelligence planning knowledge represent a yes (1) or no (0) about whether a subject felt he had significant knowledge in that area.

The variable programming ability contains a subject's self rating. A subject's rating was converted into a numeric value: intermediate (1), good (2), expert (3).

The "typical" computer use numbers represent typical hours per week spent using a computer, and the "last week" numbers reflect the hours spent on a computer during the previous week.

7.5.2 Time Spent Training

We looked for correlations between the subjects' backgrounds and training time. The data are shown in Tables 7.5 and 7.6. We expected all groups to have similar training times because all groups received very similar training. As expected, no significant difference between groups was found for training time. The first day's training time had more variation than the second day's training. The decrease in variation on the second day was expected because less material was covered and because the subjects were already familiar with the system.
  Dependent Variable    F       Probability
  Day 1                 0.916   .4265
  Day 2                 0.281   .7598
  Total time            0.762   .4881

Table 7.5: Training Time ANOVA Tests

In order to find correlations between training time and background variables, a multiple linear regression was performed. During the regression, a subject's experimental condition was ignored, and the total training time was used as the dependent variable. The best fit that was found is shown in Table 7.7. Three independent variables were identified: years of education, artificial intelligence (AI) planning knowledge, and English proficiency. Of the independent variables, only English proficiency was expected, and unlike the other independent variables, English proficiency is not statistically significant (P-Value). The regression coefficients (Std. Coeff.) indicate that English proficiency and AI planning knowledge decrease training time, while more education increases training time. The R² (R Squared) indicates that the independent variables only predict 61 percent of the variation in training time.

[18] A range was used because one usability test subject complained about asking for an exact age.

                      EC1             EC2             EC3
  Dependent Variable  Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Day 1 (min.)        115   25        97    19        97    26
  Day 2 (min.)        44    8         43    14        39    10
  Total time (min.)   159   23        140   28        136   35

Table 7.6: Training Time Means and Standard Deviations

7.5.3 Logical Edits

While subjects authored procedures, Diligent recorded the number of logical edits that they performed. A logical edit is an authoring activity that requires knowledge of the procedure or the domain (e.g., demonstrating a step). Logical edits do not include passive activities, such as looking at menus or approving data derived by Diligent. Instead, an edit is a deliberate change to Diligent's knowledge base.
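The multiple linear regression used above for training time is an ordinary least-squares fit. A minimal sketch with invented data (not the study's measurements) shows how such coefficients and an R² value are obtained:

```python
import numpy as np

# Invented data in the spirit of Table 7.7: total training time (minutes)
# regressed on years of education, English ability, and planning knowledge.
predictors = np.array([
    # education, English ability, planning knowledge
    [22.0, 1.0, 1.0],
    [19.0, 2.0, 0.0],
    [20.0, 3.0, 1.0],
    [18.0, 2.0, 0.0],
    [21.0, 1.0, 0.0],
    [17.0, 3.0, 1.0],
])
training_time = np.array([160.0, 150.0, 125.0, 145.0, 170.0, 110.0])

# Ordinary least squares with an explicit intercept column.
design = np.column_stack([np.ones(len(training_time)), predictors])
coef, _, _, _ = np.linalg.lstsq(design, training_time, rcond=None)

residuals = training_time - design @ coef
r_squared = 1.0 - residuals.var() / training_time.var()  # variance explained

print("coefficients (intercept, education, English, planning):", coef.round(2))
print(f"R squared = {r_squared:.3f}")
```

A statistics package such as StatView additionally reports standard errors, standardized coefficients, and per-coefficient t-tests, as in Table 7.7.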
The data from the analysis are shown in Table 7.8, and graphs of the data are shown in Figure 7.1. The "pre-test" value is the value when the subjects started testing their procedures. Procedure 1's results are weak because only 6 subjects were used. No significant differences between groups were found, but the values for EC1 (demonstrations and experiments) are much smaller than for the other two groups. Procedure 2's results are stronger. There is a significant difference between the groups both before and after testing (ANOVA and Kruskal-Wallis). There is also a significant difference between groups EC1 and EC3 and between groups EC2 and EC3. The group that used demonstrations and experiments (EC1) had the lowest values, while the group that used an editor (EC3) had the highest values.

Regression Summary: total training vs. 3 independents
  R                   .781
  R Squared           .609
  Adjusted R Squared  .503
  RMS Residual        20.483

ANOVA Table: total training vs. 3 independents
              DF   Sum of Squares   Mean Square   F-Value   P-Value
  Regression  3    7195.687         2398.562      5.717     .0131
  Residual    11   4615.247         419.568
  Total       14   11810.933

Regression Coefficients: total training vs. 3 independents
                      Coefficient   Std. Error   Std. Coeff.   t-Value   P-Value
  Intercept           83.291        54.617       83.291        1.525     .1555
  Education           4.915         2.185        .471          2.249     .0460
  English ability     -11.171       7.706        -.321         -1.450    .1751
  Planning knowledge  -28.569       11.356       -.506         -2.516    .0287

Table 7.7: Linear Regression on Total Training Time

Means and Standard Deviations
                                EC1             EC2             EC3
  Dependent Variable            Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 total edits       9.5   2.1       35.0  12.7      37.5  7.8
  Procedure 2 pre-test edits    8.7   2.1       11.8  5.1       24.6  4.6
  Procedure 2 total edits       9.0   2.2       16.8  6.4       26.0  3.8

ANOVA Results
  Dependent Variable            F        Probability
  Procedure 1 total edits       6.346    .0836
  Procedure 2 pre-test edits    18.048   (*) .0002
  Procedure 2 total edits       14.021   (*) .0007

Kruskal-Wallis Results
  Dependent Variable            Probability
  Procedure 1 total edits       .1801
  Procedure 2 pre-test edits    (*) .0079
  Procedure 2 total edits       (*) .0070

Post Hoc Test Probabilities
  Dependent Variable            EC1,EC2   EC1,EC3     EC2,EC3
  Procedure 1 total edits       .1316     .1064       .9601
  Procedure 2 pre-test edits    .5599     (*) .0006   (*) .0014
  Procedure 2 total edits       .0785     (*) .0008   (*) .0273

Table 7.8: Logical Edit Analysis

[Per-subject scatter plots of Procedure 1 total edits, Procedure 2 pre-test edits, and Procedure 2 total edits for the editor (EC3), demonstration (EC2), and experiment (EC1) groups.]

Figure 7.1: Graphs of Logical Edits

7.5.4 Errors

We will now present data on the errors in the subjects' procedures. This has several aspects: how well the subjects were able to determine a procedure's steps; the number of components (e.g., causal links) missing from a procedure; and the number of unnecessary components in a procedure. We will finish by looking at the total errors.

7.5.4.1 Errors in Identifying Steps

An important influence on the number of errors is how many of the procedure's steps are incorrect. This is important because any step relationships involving a missing or unnecessary step will be counted as errors. It was expected that subjects would have little difficulty in correctly identifying steps.
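The error categories examined below can be expressed as set differences between an authored procedure and a reference procedure. This is a hypothetical sketch, not Diligent's actual scoring code, and the component names are invented:

```python
# Each procedure component is modeled as a tagged tuple: steps, ordering
# constraints, and causal links would all be scored the same way.
reference = {
    ("step", "open-valve"),
    ("step", "check-pressure"),
    ("link", "open-valve -> check-pressure"),
}
authored = {
    ("step", "open-valve"),
    ("step", "flush-line"),                     # unnecessary step
    ("link", "open-valve -> check-pressure"),
}

omissions = reference - authored      # errors of omission: missing components
commissions = authored - reference    # errors of commission: unnecessary components
total_errors = len(omissions) + len(commissions)

print(f"omissions={len(omissions)} commissions={len(commissions)} total={total_errors}")
```

Note how a single wrong step propagates: any ordering constraint or causal link that mentions a missing or unnecessary step is itself counted as an error.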
                                        EC1             EC2             EC3
  Dependent Variable                    Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 final missing steps       0     0         0     0         3.5   .7
  Procedure 1 final unnecessary steps   0     0         1.5   .7        1     1.4
  Procedure 2 final missing steps       1.0   .8        0     0         .2    .4
  Procedure 2 final unnecessary steps   0     0         .2    .4        .4    .9
  Procedure 1 final invalid steps       0     0         1.5   .7        4.5   .7
  Procedure 2 final invalid steps       1.0   .8        .2    .4        .6    .9
  Procedure 1 works                     1     0         1     0         0     0
  Procedure 2 works                     .250  .5        .833  .4        .4    .6

Table 7.9: Means and Standard Deviations on Invalid Steps

How well the subjects were able to determine which steps to perform is shown in Table 7.9. The values in the table represent the final versions of the procedures. The "invalid steps" are the total missing and unnecessary steps. The last two rows (e.g., Procedure 1 works) indicate whether a valid sequence of steps was specified (1 means yes and 0 means no).

The biggest difference in the number of errors was for Procedure 1. A significant difference between the groups was found (ANOVA at a 1% level). The post hoc tests indicate a significant difference between EC3 and the other groups. The groups that used demonstrations (EC1 and EC2) had fewer invalid steps.

The differences between groups for Procedure 2 are relatively minor, and no significant differences were found. When considering the percentage of procedures that would have worked, the subjects in EC2 did a better job of demonstrating than the subjects in EC1.

7.5.4.2 Errors of Omission

If a plan is missing a component (i.e., a step relationship or step), the error is called an error of omission. The data from the analysis are shown in Table 7.10, and graphs of the data are shown in Figure 7.2. Because Diligent's heuristics favor errors of commission, one would expect the group that used an editor (EC3) to have more errors of omission.

Means and Standard Deviations
                                EC1             EC2             EC3
  Dependent Variable            Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 final errors      4.5   .7        9     4         44    .7
  Procedure 2 pre-test errors   11.5  8.5       3.5   4         11.6  5
  Procedure 2 final errors      11.5  8.5       6     7         11.8  5

ANOVA Results
  Dependent Variable            F         Probability
  Procedure 1 final errors      151.605   (*) .0010
  Procedure 2 pre-test errors   3.388     .0681
  Procedure 2 final errors      1.268     .3164

Kruskal-Wallis Results
  Dependent Variable            Probability
  Procedure 1 final errors      .1017
  Procedure 2 pre-test errors   .1054
  Procedure 2 final errors      .3371

Post Hoc Test Probabilities
  Dependent Variable            EC1,EC2   EC1,EC3     EC2,EC3
  Procedure 1 final errors      .3368     (*) .0013   (*) .0018
  Procedure 2 pre-test errors   .1504     .9978       .1157
  Procedure 2 final errors      .4739     .9978       .3949

Table 7.10: Errors of Omission Analysis

[Per-subject scatter plots of errors of omission for the editor (EC3), demonstration (EC2), and experiment (EC1) groups.]

Figure 7.2: Graphs of Errors of Omission

In Procedure 1, the groups are significantly different (ANOVA).
Group EC3 has more errors and is significantly different from the other groups. Group EC1 has slightly fewer errors than group EC2. In Procedure 2, there is no significant difference between the groups. However, group EC2 did better than the other groups. This was unexpected because EC1 and EC2 both used demonstrations.

Means and Standard Deviations
                                EC1             EC2             EC3
  Dependent Variable            Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 final errors      6.5   .7        26    18        8.5   .7
  Procedure 2 pre-test errors   4     4         12    5         5     8
  Procedure 2 final errors      4     4         10    6         6     8

ANOVA Results
  Dependent Variable            F       Probability
  Procedure 1 final errors      2.037   .2762
  Procedure 2 pre-test errors   2.843   .0975
  Procedure 2 final errors      1.015   .3916

Kruskal-Wallis Results
  Dependent Variable            Probability
  Procedure 1 final errors      .1017
  Procedure 2 pre-test errors   .0712
  Procedure 2 final errors      .3566

Post Hoc Test Probabilities
  Dependent Variable            EC1,EC2   EC1,EC3   EC2,EC3
  Procedure 1 final errors      .3236     .9826     .3808
  Procedure 2 pre-test errors   .1524     .9577     .2020
  Procedure 2 final errors      .4122     .8765     .6750

Table 7.11: Errors of Commission Analysis

7.5.4.3 Errors of Commission

If a plan has an unnecessary component (i.e., a step relationship or step), the error is called an error of commission. The data from the analysis are shown in Table 7.11, and graphs of the data are shown in Figure 7.3. Diligent's heuristics favor errors of commission over errors of omission because it should be easier for an instructor to identify a mistake among a small set of unnecessary items than among a large set of missing items. Thus, we would expect group EC2 to have the most errors. Group EC1 should have fewer errors than EC2 because experiments should remove unnecessary conditions.
Group EC3 should also have few errors because subjects have to explicitly specify each unnecessary item.

[Per-subject scatter plots of errors of commission for the editor (EC3), demonstration (EC2), and experiment (EC1) groups.]

Figure 7.3: Graphs of Errors of Commission

Means and Standard Deviations
                                EC1             EC2             EC3
  Dependent Variable            Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 final errors      11    1         35    14        53    1
  Procedure 2 pre-test errors   15    5         16    3         17    11
  Procedure 2 final errors      15    6         16    1         18    10

ANOVA Results
  Dependent Variable            F        Probability
  Procedure 1 final errors      13.059   (*) .0331
  Procedure 2 pre-test errors   .039     .9617
  Procedure 2 final errors      .240     .7906

Kruskal-Wallis Results
  Dependent Variable            Probability
  Procedure 1 final errors      .1017
  Procedure 2 pre-test errors   .9972
  Procedure 2 final errors      .9842

Post Hoc Test Probabilities
  Dependent Variable            EC1,EC2   EC1,EC3     EC2,EC3
  Procedure 1 final errors      .1338     (*) .0334   .2402
  Procedure 2 pre-test errors   .9923     .9626       .9850
  Procedure 2 final errors      .9992     .8432       .8337

Table 7.12: Total Error Analysis

Procedure 1 has a large number of step relationships. Thus, one would expect a large number of errors of commission for group EC2, while groups EC1 and EC3 would have few errors. This is what was found. However, there were no significant differences between the groups.

Procedure 2 is simpler than Procedure 1. Thus, all groups should have fewer errors. This is what was found.
Although group EC2 had slightly more errors than the other groups, there were no significant differences between the groups.

7.5.4.4 Total Errors

A plan's total errors are the sum of its errors of omission and commission. The data from the analysis are shown in Table 7.12, and graphs of the data are shown in Figure 7.4. Because Procedure 1 is the more complicated procedure, we expect the differences between groups to be larger. The groups are significantly different (ANOVA), and groups EC1 and EC3 are significantly different.

[Per-subject scatter plots of total errors for the editor (EC3), demonstration (EC2), and experiment (EC1) groups.]

Figure 7.4: Graphs of Total Errors

Means and Standard Deviations
                                        EC1             EC2             EC3
  Dependent Variable                    Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 final required effort     20    4         70    1         90    9
  Procedure 2 pre-test required effort  24    5         28    6         41    16
  Procedure 2 final required effort     24    6         32    7         44    13

ANOVA Results
  Dependent Variable                    F        Probability
  Procedure 1 final required effort     78.490   (*) .0026
  Procedure 2 pre-test required effort  3.803    .0526
  Procedure 2 final required effort     5.370    (*) .0216

Kruskal-Wallis Results
  Dependent Variable                    Probability
  Procedure 1 final required effort     .1017
  Procedure 2 pre-test required effort  .0775
  Procedure 2 final required effort     (*) .0238

Post Hoc Test Probabilities
  Dependent Variable                    EC1,EC2     EC1,EC3     EC2,EC3
  Procedure 1 final required effort     (*) .0077   (*) .0028   .0833
  Procedure 2 pre-test required effort  .8565       .0771       .1301
  Procedure 2 final required effort     .4162       (*) .0237   .1517

Table 7.13: Total Required Effort Analysis
The group that demonstrated and experimented (EC1) did better than the other groups. Most errors for group EC2 were errors of commission, while most errors for groups EC1 and EC3 were errors of omission.

In Procedure 2, each group had roughly the same number of total errors, and no significant differences between the groups were detected. Groups EC1 and EC3 had mostly errors of omission, while group EC2 had mostly errors of commission. Group EC1 did a little better than the other groups even though its subjects did the worst job of identifying the procedure's steps. (Poor demonstrations caused group EC1 to have errors of omission.)

[Per-subject scatter plots of total required effort for the editor (EC3), demonstration (EC2), and experiment (EC1) groups.]

Figure 7.5: Graphs of Total Required Effort

7.5.5 Total Required Effort

The total required effort is a measure of the amount of work required to produce a correct plan. The required effort includes the work that has been done as well as the work that needs to be done. The previous work is measured by logical edits, and the additional work is estimated by the total errors. Total errors is used because each error can be corrected by an edit. The data from the analysis are shown in Table 7.13, and graphs of the data are shown in Figure 7.5.

Procedure 1, the more complex procedure, has significant differences (ANOVA) between the groups.
Group EC1 is significantly different from and better than groups EC2 and EC3. Group EC2 is not as good as EC1 but better than EC3. The differences between the groups result from differences in both the logical edits and the total errors.

In Procedure 2, the differences between groups are close to significant before testing and are significant (ANOVA and Kruskal-Wallis) after testing. Like Procedure 1, group EC1 is significantly different from and better than group EC3, while group EC2 is worse than EC1 but better than EC3. The differences between the groups result from the fewer logical edits required by the groups that used demonstrations (EC1 and EC2).

7.5.6 Time Spent Authoring

When subjects authored procedures, two times were measured: when testing started and when the subject finished. After testing had started, subjects could still use Diligent's full capabilities for demonstrating, experimenting and editing. The data from the analysis are shown in Table 7.14, and graphs of the data are shown in Figure 7.6. The pre-test times for Procedure 1 are included even though none of the procedures was modified after testing started. There was a 30 minute time limit placed on each procedure, and subjects often seemed to run out of time.

None of the groups are significantly different. However, the times for group EC1 are slightly less than the times for EC2, and the times for EC2 are slightly less than the times for EC3.

7.5.7 Subjective Impressions

After the subjects finished authoring the two procedures, they filled out a questionnaire about their impressions of Diligent. The results are shown in Table 7.15.
Means and Standard Deviations
                              EC1             EC2             EC3
  Dependent Variable          Mean  Std.Dev   Mean  Std.Dev   Mean  Std.Dev
  Procedure 1 pre-test time   27    4         29    2         30    .9
  Procedure 1 total time      29    2         30    .4        30    .9
  Procedure 2 pre-test time   21    3         22    10        25    8
  Procedure 2 total time      25    6         26    9         28    5

ANOVA Results
  Dependent Variable          F      Probability
  Procedure 1 pre-test time   .886   .4377
  Procedure 1 total time      .777   .4816
  Procedure 2 pre-test time   .217   .8077
  Procedure 2 total time      .212   .8118

Kruskal-Wallis Results
  Dependent Variable          Probability
  Procedure 1 pre-test time   .6191
  Procedure 1 total time      .9370
  Procedure 2 pre-test time   .7952
  Procedure 2 total time      .4338

Post Hoc Test Probabilities
  Dependent Variable          EC1,EC2   EC1,EC3   EC2,EC3
  Procedure 1 pre-test time   .6178     .4567     .9356
  Procedure 1 final time      .4969     .6646     .9618
  Procedure 2 pre-test time   .9980     .8506     .8536
  Procedure 2 final time      .9526     .8140     .9294

Table 7.14: Analysis of Time Spent Authoring

[Per-subject scatter plots of Procedure 1 pre-test and total authoring times and Procedure 2 pre-test and total authoring times for the editor (EC3), demonstration (EC2), and experiment (EC1) groups.]

Figure 7.6: Graphs of Time Spent Authoring
  Question                                          Group   Response counts   Mean
  Like the system                                   EC1     1, 1, 2           5.2
                                                    EC2     2, 2, 2           4.0
                                                    EC3     1, 1, 1, 1, 1     4.0
  Easy to use                                       EC1     1, 1, 1, 1        3.7
                                                    EC2     2, 1, 2, 1        3.3
                                                    EC3     2, 2, 1           2.8
  Easy to specify a step                            EC1     1, 1, 1, 1        4.7
                                                    EC2     1, 2, 1, 2        4.8
                                                    EC3     1, 1, 2           4.4
  Easy to identify preconditions                    EC1     1, 1, 1, 1        4.2
                                                    EC2     2, 2, 1, 1        4.8
                                                    EC3     1, 1, 3           4.7
  Easy to identify state changes                    EC1     1, 1, 1, 1        4.7
                                                    EC2     1, 1, 3, 1        4.7
                                                    EC3     1, 1, 3           4.8
  Easy to identify how operators influence
  preconditions and state changes                   EC1     2, 2              5.0
                                                    EC2     4, 2              3.7
                                                    EC3     1, 1, 2, 1        2.8
  Easy to demonstrate                               EC1     1, 1, 2           5.0
                                                    EC2     1, 2, 1, 1        4.2
  Additional demonstrations useful                  EC1     1, 1, 1           4.3
                                                    EC2     1, 2, 2, 1        4.5
  Like experimenting                                EC1     1, 1, 1, 1        4.2
  Experiments quick enough                          EC1     1, 1, 2           5.5
  Experiments saved work                            EC1     1, 2              6.0
  Experiments caught errors that would have
  been missed                                       EC1     1, 2              3.0

Table 7.15: Subjective Impressions

Subjects rated Diligent on a seven-point scale: 1 means not at all, 4 means somewhat, and 7 means a great deal. Each response count indicates how many subjects gave a particular answer on that scale.

7.6 Discussion

The previous section (Section 7.5) presented the study's results. In this section, we will analyze the results and discuss their meaning.

7.6.1 Assumptions About Test Subjects

Our research attempts to identify techniques that could assist domain expert instructors. By instructor, we mean someone who teaches these procedures to human students. However, instructors were not available as test subjects. Instead, graduate students were used because they were the most available pool of subjects. In particular, we used computer science graduate students who worked mostly in fields related to artificial intelligence. This raises the question of how similar graduate students and instructors are.
To address this issue, consider the assumptions that we have made about the people who author with Diligent.

• An author is a domain expert. A graduate student is not a domain expert, but he has access to a functional description of the procedure.

• An author knows a valid order for performing a procedure's steps. A graduate student has to identify a valid order of steps when given a functional description, which does not explicitly specify the order of steps. In this sense, a graduate student has a more difficult task than an instructor. In fact, some test subjects ordered steps incorrectly. The problem with an invalid step order is that it interferes with the heuristics that Diligent uses to create initial operator preconditions (i.e., the h-rep). Because the heuristics assume that the state changes of earlier steps are likely preconditions for later steps, mistakes in the order that steps are demonstrated interfere with the identification of likely preconditions. This suggests that the groups who used these heuristics (EC1 and EC2) were more likely to be negatively affected by disordered steps.

• An author may not be familiar with the simulation that models the domain. This means that he may have problems mapping his knowledge to simulation attributes. Additionally, the simulation may have some idiosyncrasies that are not obvious to the author. A graduate student doesn't know the domain, but his functional description should describe the necessary attributes. Like an instructor, a graduate student needs to map domain attributes to simulation attributes.

• The author is an instructor who can articulate dependencies between steps. However, he may forget to mention some dependencies. This is not applicable to graduate students.

• The author may not be a programmer, and he may have difficulty understanding the simulation's code.
He may also have problems using the rigid syntax required for declaratively specifying a procedure. This does not apply to graduate students: although they have no access to the simulation's code, they all program, and many of them have been exposed to declarative plan representations.

Because of their prior exposure to plan representations, graduate students should learn how to use Diligent more quickly than instructors. Familiarity with the representation may also allow graduate students to use the editor-only version (EC3) more easily than instructors.

Overall, when comparing graduate students to instructors, the students should have a harder time authoring but should learn how to use the system more quickly. Furthermore, graduate students should have an easier time using the editor-only version than would instructors. The difficulty that subjects had correctly identifying a procedure's steps (Section 7.5.4.1) lends support to the idea that the authoring task would have been easier for instructors.

7.6.2 Discussion of Background Questionnaire

The first activity that subjects performed during the study was to fill out a questionnaire about their background.

Several questions asked subjects to rate themselves in an area. A subject's answers seemed to depend heavily on the subject's modesty. For example, a non-native speaker's self-rated English ability did not appear to correspond to the subject's "real" English ability. Additionally, a subject's rating of his programming ability did not seem reliable, but this cannot be determined.

The number of hours spent on a computer, both last week and typically, also appeared questionable because a common value was 40, which is the number of hours in a standard work week. This value seems unlikely for graduate students, who tend to keep irregular hours.
A pair of yes-or-no questions asked whether a subject had knowledge of machine learning or artificial intelligence planning. Some subjects did not provide the expected answers: subjects who should have answered yes sometimes answered no. In hindsight, a range of values would have been better than a simple yes or no.

The only significant variable was typical hours spent browsing. There is no obvious reason why computer-use numbers should influence the results of the experiment, especially since all subjects are experienced computer users.

7.6.3 Discussion of Training Time

Multiple linear regression identified three variables that seemed to influence training time: English ability, knowledge of AI planning techniques, and years of education.

It is not surprising that English ability and knowledge of AI planning techniques reduced training time. Better English proficiency should increase reading speed, and knowledge of AI planning techniques should make Diligent's plan representation easier to understand. In contrast, the correlation with years of education was unexpected. It is unclear why more education should increase training time. Perhaps more experienced subjects study more carefully.

While the training times had a large variance, the group that used demonstrations and experiments (EC1) had a larger mean training time than the other groups. This might suggest that group EC1 received better training. However, subjects followed detailed instructions during training, and the instructions for groups EC1 and EC2 were almost identical. The difference between EC1 and EC2 can be explained by the subjects in EC1 having the worst English proficiency and the most education.

7.6.4 Discussion of Logical Edits

When comparing groups EC1 and EC2, the results suggest that experiments reduce the number of logical edits.
Group EC1 requires fewer edits than EC2 for both procedures, but the difference is greater for the more complex Procedure 1. The number of edits for EC1 remains fairly constant, while EC2 requires many more edits on Procedure 1. This increase in EC2's edits on Procedure 1 probably results from Diligent's operator-creation heuristics creating more preconditions. Group EC1's more constant number of edits suggests that experiments can help remove unnecessary preconditions.

When comparing groups EC2 and EC3, the results suggest that demonstrations help on simpler procedures. Group EC2 requires fewer edits than EC3 for the simpler procedure, but requires the same number of edits on the more complex procedure. The difference between groups EC2 and EC3 does not seem to be influenced by the fact that subjects in EC3 had to type in attribute values. For group EC3, entering or changing an attribute value was only counted as one edit. Additionally, most attribute values were one word, and spelling errors did not appear to be a problem.

This result for groups EC2 and EC3 seems to reflect a tradeoff between the benefits of demonstrations and Diligent's bias towards creating unnecessary preconditions. (It is easier for Diligent to remove unnecessary preconditions than to identify missing preconditions; see Chapter 5.) Demonstrating a step saves edits because it takes one edit and identifies the step, its preconditions, and its state changes. It appears that subjects who demonstrated (EC2) spent their time removing unnecessary preconditions, while those who used an editor (EC3) spent their time adding missing preconditions and state changes.

7.6.5 Discussion of Errors in Identifying Steps

An initial concern was that the procedures were too easy, but it turns out that they were too difficult. The procedures were meant to be challenging, but not to the point where some subjects had difficulty figuring out which steps to perform. For this reason, the differences between groups in identifying steps were unexpected.
Because Diligent's heuristics for learning operators assume correct demonstrations, mistakes in identifying steps probably affected the groups that learned preconditions from demonstrations (EC1 and EC2) more severely than the group that specified preconditions with an editor (EC3).

On Procedure 1, the groups that demonstrated (EC1 and EC2) have fewer errors. The difference between group EC3 and the other groups may be influenced by several factors. One potential factor is the abstraction of Procedure 1's description. However, the descriptions of the two procedures do not seem very different; Procedure 1's description is simply less explicit in describing the ordering of the steps. Another potential factor is the complexity of Procedure 1, which has many more step relationships than the other procedure. For Procedure 1, maybe using only an editor (EC3) is more cognitively challenging than demonstrating the steps (EC1 and EC2). Procedure complexity seems a more likely explanation than the description's abstraction, but this is an area for further study.

Because Diligent assumes that authors know a procedure's steps, group EC3's performance suggests performing a future study that eliminates the influence of invalid steps. This could be done by giving the subjects a valid sequence of steps. The subjects would then have to determine the causal links and ordering constraints between the steps, which is what Diligent is designed to learn.

The differences between groups on Procedure 2 are minor. However, groups EC1 and EC2 should have had similar values because both groups use demonstrations.
The subjects in EC2 did a better job of demonstrating the procedure, because that group produced a higher percentage of procedures that would have worked. Group EC1's problems demonstrating might have counteracted the benefits of Diligent's experiments, because group EC1's procedures had more errors after demonstrations.

7.6.6 Discussion of Errors of Omission

Diligent's heuristics are designed to avoid errors of omission. If a procedure is demonstrated correctly, there should be few errors of omission.

In Procedure 1, mean errors for all groups are better than they appear because 4 of those errors are step-specific control preconditions that are not required by the environment and, thus, are not learned by Diligent. (In Procedure 1, two valves should be opened when their alarm lights are illuminated and should be shut when their lights turn off; the environment requires the valves to be opened because of internal pressure.)

In Procedure 1, group EC3 is much worse than the other groups. However, the subjects in group EC3 did a much worse job of identifying the procedure's steps. This poor identification of steps might have exaggerated the differences between groups.

Because Procedure 2 has fewer step relationships than Procedure 1, the number of errors of omission should be lower. This is what was found for all groups but EC1. Although the subjects in group EC1 did a poor job of identifying the procedure's steps, they still did as well as the subjects in group EC2, who did a better job of identifying the steps.

One striking result is the large number of errors of omission for the group that used only an editor (EC3). This was not entirely unexpected, because subjects in EC3 have to explicitly specify all steps, their preconditions, and their state changes.
It seems unlikely that group EC3's large number of errors results from the group's being required to type in attribute values. Most attribute values were one word, and there were not that many preconditions and state changes. Perhaps spelling errors caused problems? But when examining the subjects' procedures, spelling errors were not an issue, and, when asked, several subjects indicated that spelling errors were not a problem.

When comparing groups EC1 and EC2, experiments did not reduce the number of errors of omission. This was expected because Diligent has a bias towards errors of commission.

When comparing groups EC2 and EC3, demonstrations reduced the number of errors of omission in the complex procedure but appeared to have little benefit in the simple procedure. This suggests that, when using an editor (EC3), the number of step relationships is correlated with the number of errors of omission.

7.6.7 Discussion of Errors of Commission

When comparing groups EC1 and EC2, the results suggest that experiments reduce the errors of commission. The benefits of experiments are greater in the more complex procedure.

When comparing groups EC2 and EC3, the results suggest that demonstrations result in more errors of commission. More errors were committed in the complex procedure, where Diligent's heuristics created more unnecessary preconditions.

7.6.8 Discussion of Total Errors

When comparing groups EC1 and EC2, the results suggest that experiments only reduce the number of errors on complex procedures. If both groups had had equally good demonstrations for the second procedure, then experiments might also have shown a benefit on the simple procedure.

When comparing groups EC2 and EC3, the results suggest that demonstrations only reduce the number of errors on complex procedures. This reduction occurred even though the groups had a similar number of logical edits.
7.6.9 Discussion of Total Required Effort

The total required effort is a measure of the amount of work required to produce a correct plan. The required effort includes the work that has been done as well as the work that still needs to be done. Work is measured by logical edits, and future work is estimated by total errors.

On Procedure 2, the increase in total required effort after testing suggests that total errors underestimates the additional work that needs to be done. Because domain experts might be better able to identify errors, it is unclear whether total errors would underestimate the future work for domain experts.

When comparing groups EC1 and EC2, the results suggest that experiments reduce the total required effort and have a greater effect on more complex procedures. When comparing groups EC2 and EC3, the results suggest that demonstrations reduce the total required effort and have a greater effect on more complex procedures.

7.6.10 Discussion of Time Spent Authoring

There was a 30-minute time limit placed on each procedure. It was assumed that the subjects in group EC1 would be able to author each procedure in approximately 15 minutes. However, the results indicate that 30 minutes was not enough time. This probably results from the subjects being unfamiliar with the domain and from the fact that the subjects had to determine which steps to perform. Oftentimes, subjects would spend around 10 minutes studying the procedure description before starting to author.

In Procedure 2, the last 6 subjects started testing much earlier than most previous subjects. It is unclear why this is so. Maybe there was a change in the experimental setup. Perhaps the subjects were able to understand the directions better because they had a better description for the first procedure.
However, the statistics involving logical edits and errors don't appear different for these subjects.

If subjects had been given more time, they might have produced better procedures; that is, they might have made fewer errors and performed more logical edits.

The fact that subjects in group EC3 had to type in attribute values may have increased the times for this group slightly. However, relatively little typing was needed, and subjects appeared to make few spelling mistakes. Therefore, the impact of typing is probably relatively minor.

Because of the time limit placed on subjects, we cannot draw any conclusions about whether demonstrations or experiments reduce the amount of time spent authoring.

7.6.11 Discussion of Subjective Impressions

The last thing the subjects did during the study was fill out a questionnaire about their subjective impressions of Diligent. The subjective impressions focus on aspects of the user interface. The impressions provide some indication of the usability of the three versions of Diligent and of how subjects perceived various features. There is relatively little data and a lot of variation between subjects.

A number of factors probably influenced the subjects' impressions. The impressions likely reflect both the training and the evaluation. The subjects' ratings may reflect ambiguities in the menus and difficulties with the environment's graphical interface; for example, some subjects who experienced software problems gave lower ratings. Additionally, the difficulty of the procedures being authored was probably also a factor.

All groups indicated that they liked the system somewhat, and the group that experimented (EC1) liked it a little better than the others.
Unfortunately, this question is ambiguous because it does not indicate whether it is asking about only Diligent or about all the software used for authoring (e.g., the environment's graphical interface).

The subjects found the system a little difficult to use. However, this may reflect the difficulty of the procedures being authored. Subjects who experimented (EC1) found it a little easier than subjects who only demonstrated (EC2). Subjects who only used the editor (EC3) found it the most difficult. This pattern was expected because using only an editor is difficult.

All groups indicated that it was somewhat easy to specify steps, preconditions, and state changes. This is very desirable: it is an indication that all three systems are reasonable and that the editor-only version (EC3) is not a straw man.

The ratings for the ease in identifying "how operators influence preconditions and state changes" are confusing; it is unclear why the groups differ so much. The group that experimented (EC1) found it easier than the group that only demonstrated (EC2). Maybe this is a reflection of how experiments improve preconditions. The group that only used the editor (EC3) had the most difficulty. Maybe this indicates that representing preconditions and state changes with operators is more difficult when using an editor, or that it is harder to determine the correct preconditions and state changes when using an editor.

Subjects found it somewhat easy to demonstrate. Although the groups that demonstrated have slightly different means, both groups used the same techniques for demonstrating. It is surprising that the rating is this high given the problems during the first half of the study, when a memory leak caused the environment's graphical interface to be unresponsive and slow.
The subjects also found the ability to provide additional demonstrations of a procedure somewhat useful. Before starting the evaluation, it was assumed that few subjects would need this capability. It is unclear whether this rating reflects the training or the subjects' difficulty in identifying a procedure's steps.

Subjects only somewhat liked having Diligent perform autonomous experiments. Some subjects seemed to feel that experimenting was a strange feature and were not sure what to think about it. Subject 12 indicated a strong dislike of experiments even though the subject felt that experiments were useful and quick.

Subjects felt that experiments were quick enough. This provides support for the arguments in Chapter 6 that the experimentation approach has a reasonable run-time complexity.

There was strong support for the correct conclusion that experiments save work, but subjects had only a moderate belief that experiments would have caught errors that they would have missed. This contradicts the data for errors of commission on Procedure 1, which suggest that experiments prevented many errors in the final procedure.

7.7 Reviewing the Claims

Now that we have discussed the results, we will look at how well the results support the hypotheses presented in Section 7.1. Because there were few subjects, statistical significance was rarely achieved. Moreover, the probabilities should be viewed with a little skepticism because they are too sensitive to individual data points. However, the results are still valuable because they indicate patterns and trends.

The claims compare group EC1 against EC2 and group EC2 against EC3. However, group EC2 often has the intermediate value. This means that statistically significant differences between EC1 and EC3 are not used to justify the claims. The various claims are addressed by data in the following sections.
Claims 1 and 2 are addressed by the data for logical edits (Section 7.5.3), which is discussed in Section 7.6.4. Claims 3 and 4 are addressed by the data for total errors (Section 7.5.4.4), which is discussed in Section 7.6.8. Claims 5 and 6 are addressed by the data for total required effort (Section 7.5.5), which is discussed in Section 7.6.9. Claims 7 and 8 are addressed by the data for the time spent authoring (Section 7.5.6), which is discussed in Section 7.6.10.

• Claim 1: Subjects require less work to create a procedure when using demonstrations and experiments than when using only demonstrations.

This claim is supported. However, there appears to be less benefit on simpler procedures.

• Claim 2: Subjects require less work to create a procedure when using only demonstrations than when using only an editor.

This claim is partially supported. On complicated procedures, there does not appear to be a difference between using demonstrations or an editor. However, demonstrations appear to provide an advantage on simpler procedures.

• Claim 3: Using demonstrations and experiments results in fewer errors than using only demonstrations.

This claim is partially supported. On complicated procedures, experiments appear to be beneficial. However, experiments do not appear to be that useful on simpler procedures. One problem with this claim is that the subjects who experimented did a poor job of demonstrating the simpler procedure. This caused the group that experimented to have more errors of omission than the group that didn't experiment.

• Claim 4: Using only demonstrations results in fewer errors than using only an editor.

This claim has weak partial support. On the complicated procedure, demonstrations seemed to help, but demonstrations did not appear to have much effect on the simpler procedure.
• Claim 5: Subjects require less work to create a correct procedure when using demonstrations and experiments than when using only demonstrations.

This claim is supported. However, the benefits of experiments are less on simpler procedures.

• Claim 6: Subjects require less work to create a correct procedure when using only demonstrations than when using only an editor.

This claim is supported. However, the benefits of demonstrations are less on simpler procedures.

• Claim 7: Subjects can author in less time using demonstrations and experiments than when using only demonstrations.

The data are inconclusive. The time spent authoring indicates only a small difference, and most subjects appeared to have run out of time before they were finished.

• Claim 8: Subjects can author in less time using only demonstrations than when using only an editor.

The data are inconclusive. The time spent authoring indicates only a small difference, and most subjects appeared to have run out of time before they were finished.

Dependent Variable   Relation     Holds on           Direction of
                                  Simple   Complex   increased difference
Edits                EC1 > EC2    Yes      Yes       complex
                     EC2 > EC3    Yes      No        simple
Errors               EC1 > EC2    No       Yes       complex
                     EC2 > EC3    No       Yes       complex
Total Effort         EC1 > EC2    Yes      Yes       complex
                     EC2 > EC3    Yes      Yes       complex
Time                 EC1 > EC2    -        -
                     EC2 > EC3    -        -

Table 7.16: Summary of Results

These results are summarized in Table 7.16. The relations compare the groups that experimented (EC1), only demonstrated (EC2), and only used an editor (EC3). The relation A > B means that A does better than B. The results indicate that experiments help more on complex procedures. An interesting result is that subjects who only demonstrated had as many edits on the complex procedure as those who used the editor, but the subjects who demonstrated produced fewer errors.
Neither experiments nor demonstrations appeared to reduce errors in simple procedures, but they did reduce errors in complex procedures. The total effort required to produce a correct procedure includes both edits and errors; the total required effort was reduced by both experiments and demonstrations. Because of time restrictions, no conclusions could be made about time spent authoring.

In hindsight, the patterns found in this study appear reasonable, and it seems likely that the patterns would be maintained if the test subjects were domain experts rather than graduate students.

7.8 Observations

During the study, a few miscellaneous issues were observed.

• After subjects finished the evaluation, they were given a demonstration of Diligent. One remark that was heard several times was that they hadn't realized how to use Diligent effectively. One reason for this is that Diligent has a very unusual user interface. The subjects indicated that they would have liked to see a demonstration of Diligent at the start of training. However, demonstrating the system separately for each subject would have introduced a great deal of variation in the training of subjects. One way to deal with this issue, when testing systems with unusual types of user interfaces, is to play a video that illustrates how to use the system.

• There is a tradeoff between asking test subjects to perform simple versus complicated tasks. A simple task is more likely to yield statistically significant results, but if a task is too simple, the results may be trivial because the task is too much of a toy problem. The tradeoff is relevant to this study because Diligent focuses more on understanding demonstrations than on the usability of its user interface.
By not telling subjects a valid sequence of steps, the authoring task was made more challenging, but one of Diligent's assumptions was violated. The challenging procedures introduced more variability into the study and placed more emphasis on the user interface.

Before the study, some user interface features were thought to be insurance rather than necessities (e.g., the ability to delete steps). However, subjects used these features quite often. Additional features that were deemed unnecessary were sometimes requested by subjects (e.g., a dynamically updated graph of a procedure).

A related issue is the amount of flexibility allowed by the user interface. The usability testing identified the need to use forcing functions to prevent very undesirable behavior. The formal evaluation also indicated a need to disable features that are irrelevant to the task. For example, although the ability to create hierarchical procedures was not discussed during training, one subject created a hierarchical procedure.

7.9 Summary

This chapter discussed an empirical evaluation of Diligent. Instead of focusing on how well Diligent could understand demonstrations, the study focused on how Diligent's techniques help a human author.

The study had a between-subjects design in which the subjects were divided into three groups. The subjects in a given group had similar training and used the same version of Diligent. After approximately two hours of training, the subjects authored two procedures. One of the procedures could be considered more complicated because it is a little longer and has many more step relationships. Finally, subjects gave their impressions of Diligent in a post-test.

The differences between the three versions of Diligent involved demonstrations and experiments.
One version supported both demonstrations and experiments, while another version used demonstrations but did not allow experiments. A third version provided an editor and did not support demonstrations. The user interface for the three versions was as similar as possible. The versions that used demonstrations were basically identical. The version that only provided an editor differed from the others in how steps were added to a procedure and in how preconditions and state changes were specified. The results of the post-test suggest that subjects felt that the editor-only version was reasonable and fair.

The study identified benefits of using demonstrations and experiments. Using experiments and demonstrations appeared to be better than just using demonstrations, and using demonstrations without experiments appeared to be better than using only an editor. The differences between the groups appear greater on complex procedures. Experiments reduced the number of edits that subjects performed, while demonstrations only appeared to reduce the number of edits on simpler procedures. Although neither experiments nor demonstrations appear to reduce errors in simple procedures, both appear to reduce errors in complicated procedures. When considering both edits and errors, both experiments and demonstrations appear beneficial for both simple and complex procedures. Because of time restrictions, the study could not determine how experiments and demonstrations influenced the time spent authoring. The responses to the post-test suggest that Diligent's experimentation approach is acceptably fast on procedures of 6 to 8 steps, which is approximately the expected size of non-hierarchical procedures.

Chapter 8

Analysis and Future Work

In Chapter 3, we discussed Diligent at a high level.
The subsequent chapters then focused on individual topics, such as processing demonstrations, learning operators, and experimenting. This background enables us to have a more unified discussion of Diligent, including its limitations and potential extensions.

This chapter is organized in the following manner. We first discuss how Diligent's methods address the problem of understanding demonstrations by describing several perspectives for viewing demonstrations. We then talk about assumptions and how easily they can be relaxed. Afterwards, we discuss limitations and potential extensions.

8.1 Perspectives for Understanding Demonstrations

One way that Diligent addresses the problem of understanding demonstrations is by viewing a demonstration from multiple perspectives. Each perspective asks a different question, and by focusing on each question, demonstrations can be better understood. One could view Diligent as a set of methods that address the following four questions.

When should a step be performed? Under what conditions should a step be performed in order to achieve the procedure's goals? This perspective deals with identifying knowledge for controlling when steps are performed. In contrast, the other perspectives deal with how the environment functions independently of the procedure's goals. Diligent's methods address this question in the following ways.

Knowing when to perform a step requires knowledge of the procedure's goal conditions. Diligent proposes to the instructor a set of goal conditions containing the final values of attributes that change value during the procedure. When a procedure's goal conditions are satisfied, the procedure terminates because no more steps are necessary.

Diligent computes when to perform steps by analytically deriving step relationships (i.e.,
causal links and ordering constraints) between a procedure's steps. Later, when performing the procedure, the step relationships identify which steps are currently applicable.

Some preconditions of steps come from operators, which reflect how the environment functions independent of the procedure, but other preconditions are associated with individual steps. Consider sensing actions (e.g., examining a gauge), which gather information from the environment without changing its state. Because the environment may allow a sensing action to be performed at any time, a sensing action's operator might not have any preconditions. For this reason, a sensing action's preconditions are associated with its step. Sensing actions need preconditions to ensure that they are performed in the proper place within a procedure. By default, a sensing action's preconditions contain the attributes that have changed value before the sensing action during the demonstration.

What pre-state conditions are common when a given state change is seen? This is an instance of the standard concept-learning question. Diligent addresses this issue with its version space algorithm for learning operator preconditions. An advantage of this approach is that Diligent can learn from both positive and negative examples.

What is different when different state changes are seen? This question deals with comparing the preconditions of effects that produce different state changes. This perspective is used when creating an effect for an operator that already has an effect. When identifying the heuristic preconditions (the h-rep), Diligent uses the h-rep of an existing effect but adjusts it with the current action-example's pre-state. The new h-rep also contains conditions in the current action-example's pre-state that differ from ones in the most similar earlier action-example. (Only the current example is positive for the new effect, while earlier examples are negative.)
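The core concept-learning idea behind this kind of precondition learning — keep the pre-state conditions shared by all positive examples (the specific boundary of a version space) and check that the resulting hypothesis rejects each negative example — can be sketched roughly as follows. This is a minimal illustration with a dictionary-based state representation and hypothetical attribute names, not Diligent's actual implementation:

```python
# Minimal sketch of learning an effect's preconditions from examples.
# States are dicts mapping attribute names to values; every name and
# value below is hypothetical.

def learn_preconditions(positive_prestates, negative_prestates):
    """Return the most specific precondition set consistent with the examples.

    Start with the (attribute, value) conditions common to every positive
    pre-state, then verify that this hypothesis rejects each negative
    pre-state, i.e. that at least one learned condition fails in it.
    """
    # Conditions shared by all positive examples (specific boundary).
    common = set(positive_prestates[0].items())
    for state in positive_prestates[1:]:
        common &= set(state.items())

    # A negative example that satisfies every learned condition is
    # unexplained: the hypothesis cannot say why the effect was absent.
    unexplained = [s for s in negative_prestates
                   if all(s.get(attr) == val for attr, val in common)]
    return dict(common), unexplained

pos = [{"pump": "on", "valve": "open", "light": "off"},
       {"pump": "on", "valve": "open", "light": "on"}]
neg = [{"pump": "off", "valve": "open", "light": "on"}]
precond, unexplained = learn_preconditions(pos, neg)
# precond -> {"pump": "on", "valve": "open"}; the negative example is
# rejected because pump != "on", so unexplained is empty.
```

An unexplained negative example signals that the current condition language is too weak to separate the cases, which is roughly the situation in which Diligent borrows a discriminating condition from another effect's preconditions.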
Diligent also compares effects with different state changes. When the h-rep of one effect cannot correctly reject a negative example, conditions from the example's pre-state may be compared to the preconditions of other effects for which the example is positive. If a one-condition match is found, that condition is added to the h-rep.

Why isn't a step earlier? The instructor probably has reasons for demonstrating steps in a given order. One reason is that the state changes of some earlier steps are likely to be preconditions of some later steps. Diligent is novel in how it emphasizes this question. Diligent has a couple of heuristics that deal with this perspective: focus on attributes that change value, and earlier steps are likely to establish preconditions of later steps.

Diligent uses this perspective when creating an operator's first effect. The initial h-rep contains attributes that have changed value during the current demonstration. A similar approach is used to create preconditions for sensing actions. Diligent's experiments also focus on this perspective by skipping a step and observing how later steps are impacted.

8.2 Assumptions

This section discusses the assumptions used by Diligent and the difficulty of relaxing them.

8.2.1 Easier to Relax

Relaxing the following assumptions appears to be relatively easy.

No attributes are added or removed from the environment. In Diligent's domains, the number of attributes in the environment's state is constant. This can be reasonable in a tutorial domain because students might get confused if attributes were being added or removed. In any case, there hasn't been a need to relax this assumption when using Diligent. Previous work by Wang [Wan96a] has relaxed this assumption, and her technique could be incorporated into Diligent.
However, there are a couple of special cases where relaxing this assumption would be more difficult.

• If the domain is under development, then new attributes could be added to the environment and existing attributes could be used differently. If new attributes don't affect existing procedures, then this might not be a major problem; but if the new attributes do affect existing procedures, then it may be difficult to use previously learned knowledge.

• The domain is so large that agents (e.g., Diligent) are given a limited view of the domain. For example, agents in each simulated room might see different sets of attributes. If an agent's current view did not include all relevant attributes, then there could be difficulties. (This issue is discussed below under relaxing the assumption that all relevant attributes are visible.)

No generalized conditions. When Diligent uses a condition, the condition always refers to a specific attribute and a specific value.1 An alternative would have been to introduce variables into preconditions and state changes. Operators containing variables could then apply to multiple objects of the same class. Diligent's approach was used for three reasons: there is relatively little input data; the environment's lack of structure hides relationships between objects and attributes; and many objects (e.g., switches) have idiosyncratic behavior. As an example of idiosyncratic behavior, consider two switches: one switch may turn on some lights, while another switch may start the motor. If many objects of a class have similar behavior, then introducing variables into preconditions and state changes could allow more generally applicable operators. This type of approach is also examined by OBSERVER [Wan96c].

1 An exception is conditions involving mental attributes. These conditions indicate that their value is unimportant.

Qualitative attribute values. Attributes are assumed to have only a few discrete values rather than continuous or numeric values. For example, a temperature sensor might only have the values ok and too-hot. Qualitative attribute values have several advantages. They are easy to use with machine learning algorithms, and they may provide descriptions that humans find conceptually easy to understand (e.g., too-hot). However, sometimes qualitative attributes are not appropriate. It might be difficult to identify meaningful qualitative values, or there might be a large number of values associated with qualitatively different behavior. Moreover, sometimes it may be important for people to know numeric values or relations between values (e.g., height < 5).

Whether qualitative or quantitative attribute values are used, an important issue is what is the most effective authoring method. Determining this may involve considering both the ease of authoring and the quality of student remediation.

Qualitative attribute values are required by Diligent's learning algorithms, but this restriction could be overcome by associating a range of numeric values with a single qualitative value. This technique also could be used with numeric formulas or conditions involving relations other than equality (e.g., temp < 5) [Wd90]. Providing the ability to assign numeric ranges to a qualitative value appears easy. However, it is unclear whether Diligent would ever get enough data to automatically generate quantitative boundaries (e.g., numeric formulas) that specify a qualitative attribute value (e.g., too-hot).

Conjunctive preconditions and goals. Although a step might produce state changes from several of its operator's effects, Diligent assumes that the preconditions of a step are conjunctive. Diligent also assumes that a procedure terminates when its conjunctive goal conditions are met.

Allowing disjunctive preconditions raises two issues. First, learning disjunctive preconditions may take more data than learning conjunctive preconditions. Second, the system would need to determine which disjuncts correspond to each step. This should not pose a problem if the preconditions are very refined, but could be problematic when the preconditions are less refined. Disjunctive preconditions would probably require more interaction with the instructor. Presently, disjunctions can only be detected when the version space collapses.

Disjunctive goal conditions seem more problematic. Specifying disjunctive goal conditions does not seem difficult, but would require at least one demonstration of each disjunct. However, using a subprocedure with disjunctive goals appears more difficult. If a subprocedure's abstract step could have multiple distinct post-states, then each post-state might require a different sequence of subsequent steps in the parent procedure.

Instructor correctly demonstrates procedures. If the instructor doesn't correctly demonstrate a procedure, the procedure's path will not produce a correct plan. Moreover, Diligent's heuristics assume that there is a good reason for the sequencing of a procedure's steps. Correcting a path poses no problem, but an invalid sequencing of steps might lead to worse heuristic preconditions. Although experiments might help, correcting the preconditions might require that the instructor provide more training data. An aspect of this problem is that Diligent's learning algorithms can more easily remove unnecessary hypothesized preconditions than identify missing ones.

8.2.2 Harder to Relax

Relaxing the following assumptions appears to be relatively difficult. Relaxing most of these assumptions does not appear particularly important.

One action at a time. Diligent assumes that only one action takes place at a time. This helps in identifying preconditions and state changes. It also helps in determining the sequence of a path's steps. If the instructor were able to perform several actions simultaneously, and if these actions could have been performed sequentially, the action-examples of these actions might be misleading because a post-state could contain the results of several actions. To handle this situation, Diligent could either use a more robust operator learning algorithm or delay learning until it has had a chance to replay the demonstration with the actions separated in time.

Deterministic actions. In a given pre-state, Diligent needs to know which state changes will be caused by a given action. Actions appear non-deterministic when a relevant environment attribute is not seen [She94]. Sometimes an action appears non-deterministic when it needs to be repeated several times. An example from the HPAC domain is a dipstick which needs to be selected several times when being extracted. When the dipstick is in its intermediate position, Diligent cannot tell whether selecting it will move the dipstick into or out of its hole. An action, like selecting the dipstick, that is repeated several times could be modeled by considering the state changes after the last action is performed. Understanding repeated sequences of actions is an important issue for robotic programming by demonstration systems [Hei93, FMD+96]. Handling non-determinism in actions that can be repeated until they produce the desired result appears to be easy and important, but it is unclear how the system should handle other cases of non-determinism.
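The repeated-action idea above (model a repeatable action by the state changes remaining after the last repetition) could be sketched as follows. This is a hypothetical illustration, not Diligent's code; the trace format is an assumption of the sketch.

```python
def collapse_repeats(trace):
    """trace: list of (action, pre_state, post_state) action-examples.
    Merge consecutive repetitions of the same action into one example,
    pairing the first pre-state with the last post-state."""
    merged = []
    for action, pre, post in trace:
        if merged and merged[-1][0] == action:
            same_action, first_pre, _ = merged[-1]
            merged[-1] = (same_action, first_pre, post)  # extend the run
        else:
            merged.append((action, pre, post))
    return merged
```

In the dipstick example, two consecutive "select" actions (in to intermediate, intermediate to out) would collapse into one action-example whose net state change is from in to out.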
One problem with non-deterministic actions is handling non-determinism during experiments. How does the system detect non-determinism? Perhaps experiments could be repeated several times.

Can tell when an action begins and ends. It is implicit in Diligent's interface with the environment that action-examples will identify an action's pre-state and post-state. This knowledge is required to identify preconditions and state changes.

Relaxing this assumption in general appears fairly difficult and may not be very important. However, delayed state changes (or delayed effects) could be important. A delayed state change occurs when an action is finished but a future state change has not yet happened. For example, a copy machine may not finish warming up until a minute after it is started. Consider the following cases:

• The delayed state change happens before the next action. In this case, the system could notice the change and associate it with the previous action.

• The delayed state change happens after subsequent unrelated actions and changes to the state, but does not happen during a later action. In this case, the system might look for the last action that changed the state change's attribute. For example, when starting a copy machine, the machine may take one minute to warm up before it is ready. In this case, the state of the machine might go from off to warming-up when the machine is started, and after one minute to ready. A system might then infer that starting the machine eventually caused it to become ready.

• The delayed state change happens after subsequent unrelated actions and changes to the state, and happens during a later action. In this case, the environment would appear non-deterministic. It is unclear how this should be handled. Perhaps a system could detect this if enough training data were available.
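The second case above (look for the last action that changed the delayed attribute) could be sketched like this. The code is an illustration under assumed data structures, not part of Diligent: each history entry pairs an action with its observed delta-state.

```python
def attribute_delayed_change(history, delayed_attr):
    """history: list of (action, delta_state) in execution order, where a
    delta_state maps only the attributes an action changed to their new
    values. Return the most recent action that changed `delayed_attr`,
    as a candidate cause of a delayed change to that attribute."""
    for action, delta in reversed(history):
        if delayed_attr in delta:
            return action
    return None  # no recorded action touched the attribute
```

For the copier example, a delayed change of the power attribute to ready would be attributed to the earlier start action, since that was the last action whose delta-state touched the power attribute.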
Can see all relevant attributes. An attribute is considered relevant when it is needed for teaching or for operators to appear deterministic. Besides non-determinism, which we've discussed, this assumption impacts teaching. An attribute is useless for teaching when neither Diligent nor an automated tutor can see it. Relaxing this assumption appears difficult. Perhaps missing attributes could be represented by mental attributes, but it is unclear how well this would work.

Noise-free sensors. Diligent assumes that the data it gets from the environment contains no errors. This is important because little data is received, and the lack of data would make recovering from errors more difficult. A method for relaxing this assumption would be to replay demonstrations and repeat experiments. Multiple action-examples for each step could then be compared. Of course, this approach would take more time.

Can see all actions. Diligent's ability to record demonstrations depends on its ability to observe all actions performed in the environment. Relaxing this assumption appears fairly difficult.

No exogenous events. Exogenous events are things that happen to the simulated domain that are not caused by the user or by the authoring tool (e.g., Diligent). For example, exogenous events include actions performed by other agents or special events in the simulated world (e.g., a fire starting in the engine room). If the authoring tool knew that an exogenous event was an exogenous event, then it should not be that difficult to model it. Otherwise, handling exogenous events is similar to not being able to see all actions.

Partially ordered procedures. If a procedure has only one valid sequence of steps, then Diligent's experiments might not learn anything useful. Experiments attempt to produce new action-examples for refining the preconditions of desired state changes. In an experiment on a totally ordered procedure, all examples might be negative. These negative examples might identify necessary preconditions, but the examples would not remove any unnecessary preconditions. Relaxing this assumption appears difficult.

Modular procedures. Diligent performs fewer actions in experiments when large procedures are divided into modular subprocedures. It's unclear how to relax this assumption. Perhaps a system could perform experiments when the instructor was not present. However, if the instructor is present, the number of actions performed might be reduced by examining operator preconditions and only performing experiments likely to refine preconditions.

Non-interleaved plans. Interleaved plans [RN95]2 interleave the performance of the steps of two subprocedures. When the instructor uses subprocedures in demonstrations, he uses the subprocedures sequentially. This makes Diligent incapable of learning a procedure whose subprocedures can only achieve their goals by interleaving their steps. Relaxing this assumption appears difficult.

2 These types of plans have also been called non-linear.

Diligent can reset the environment. Diligent's techniques assume that it can reset the state of the environment. Although some of the ideas that Diligent uses to understand demonstrations might be useful, Diligent's algorithms are probably inappropriate for an agent that cannot reset the state of the environment.

8.3 Limitations

8.3.1 Coordinated Simultaneous Actions

Besides the limitations inherent in a direct manipulation interface [Coh92], Diligent's use of a single manipulation device (i.e., the mouse) caused problems in the HPAC domain.
In particular, the HPAC's Temperature Monitor requires the user to perform pairs of actions simultaneously: the read reset and trip temperature buttons need to be depressed simultaneously to view the temperature at which the currently selected sensor will illuminate an alarm light. People can do this with two hands, but it is unclear how to do this with only one mouse.

A related issue is when to consider similar types of actions finished. In the above example, the temperature displayed on the gauge disappears when the buttons are released. Thus, Diligent would not even see the temperature because its action-examples treat depressing and releasing a button as an atomic action and hide intermediate states. An alternative is having separate action-examples (and steps) for pressing and releasing a button. However, this alternative is likely to irritate humans. This raises the question of how to uniformly process a given type of action (e.g., pressing buttons).3 Extending Diligent to handle coordinated simultaneous actions might require modeling a set of simultaneous actions with a single operator.

3 Because Diligent gets action-examples from the environment (Section 3.1.3), it's the environment's responsibility to make decisions on when to create action-examples.

8.3.2 When Pre-State and Post-State Values are Independent

Diligent has problems learning operators when an attribute's post-state value does not depend on its pre-state value. When this happens, the attribute may have its value reset, but to the same value as in the pre-state. The problem is distinguishing between situations where the attribute's value is and is not reset.

This indeterminism reduces the number of positive examples available for learning. If an attribute has its value reset to its pre-state value, the example cannot be classified as positive because Diligent cannot tell that it was reset. If the value wasn't reset, then some necessary preconditions were unsatisfied; in this case, treating the example as positive could eliminate necessary preconditions and cause the version space to collapse. This problem is worse for attributes that take only two values. If both pre-state values are equally likely and do not affect the post-state value, then one half of the "real" positive examples cannot be used.

This situation is illustrated by an example from the HPAC domain. In figure 8.1, the attribute CurrentValveIsOpen indicates whether the valve under the handle that manipulates valves is open. If the handle is moved to valve2, CurrentValveIsOpen changes its value without appearing to change. Therefore, the attribute is not listed in the example's delta-state (delta-state 1). In contrast, if the handle is moved to valve3, the attribute's value changes from true to false (delta-state 2).

Action-example:
  Pre-state: (CurrentValveIsOpen true) (valve1 open) (valve2 open) (valve3 shut) (HandleOn valve1)
  Delta-state 1 (when moving to valve2): (HandleOn valve2)
  Delta-state 2 (when moving to valve3): (CurrentValveIsOpen false) (HandleOn valve3)

Figure 8.1: An Attribute whose Post-State is Independent of its Pre-State

In this case, the problem results from an attribute (i.e., CurrentValveIsOpen) that contains redundant information. Instead, this fact could have been inferred from other observable attributes.4 This suggests that a program that learns preconditions can have problems when the simulation that controls the environment uses certain modeling techniques, but this topic is beyond our present scope.

4 The STEVE tutor [RJ99] uses an attribute like CurrentValveIsOpen for determining when a handle has been turned. When evaluating Diligent, the attribute was filtered out so that neither students nor Diligent saw it. In this case, the attribute's change in value could have been successfully modeled with disjunctive preconditions.

One way to deal with this problem is to classify an action-example as positive for only one effect. Thus, even if an attribute didn't appear to change value, the action-example could be classified as positive because of changes in other attributes. Of course, this approach only works when effects contain multiple state changes.

Unfortunately, this approach must deal with misclassified examples. Because an action-example is a positive example of only one effect, multiple effects could change the same attribute. As a result, it could be difficult to determine which effect has the positive example. Later, when effects are more refined, it might be discovered that an action-example was misclassified as a positive example of a given effect. After detecting a misclassification, there is the overhead of recalculating two effects: the effect with the false positive and the effect with the false negative. Even worse, the scope of the recalculation is unclear because recomputing one effect may identify a misclassification with a third effect. An algorithm of this type was implemented for Diligent. The algorithm worked well in the HPAC domain, but was removed out of concern about the worst-case performance in domains where misclassifications are likely.5

5 Diligent's user interface provides some support for attributes like CurrentValveIsOpen. Instructors can edit preconditions and can filter out unwanted attributes so that they do not appear in plans.

The type of operators we've just described are called relational (or sometimes rewrite rules). For relational operators, the entire pre-state as a whole is transformed into the post-state. An example of a relational operator is a mathematical transformation such as performing symbolic integration on an integral. Relational operators have been discussed in work by Langley [Lan80] and by Porter and Kibler [PK86]. Their approaches, however, use domain-dependent state transformation rules.

8.3.3 Transitive Dependencies

Diligent's experimentation approach may not work well when a procedure's set of steps is totally ordered. A set of steps is totally ordered when there is only one valid order for performing the steps. The problem is that skipping a step early in the procedure impacts each of the later steps.

The problem could be classified as involving transitive dependencies. A step Z has a transitive dependency on an earlier step X when Z depends on intermediate step Y and step Y depends on step X. In other words, there are causal links from X to Y and from Y to Z. Skipping step X in an experiment interferes with Y, and anything that interferes with Y also interferes with Z. Thus, Diligent cannot determine whether Z has a causal link with X or whether Z is indirectly dependent on X through Y's causal link with X.

Other than experimenting with multiple paths, Diligent's experimentation technique does not address this problem. This appears to be a general problem for systems, like Diligent, that learn procedure-independent knowledge (e.g., operators) by observing sequences of steps. In contrast, it may not be a problem for systems that learn when to perform steps (i.e., learn control knowledge) without understanding the dependencies between steps (i.e., causal links).

8.4 Extensions

In this section, we will discuss extensions that could enhance Diligent. We will first discuss extensions to the procedural representation because they motivate some of the extensions to authoring. We will then finish the section by discussing extensions to learning and experimentation.

8.4.1 Procedural Representation

Every procedure has one or more paths, but only one path is actually used to generate a plan. If Diligent allowed multiple paths to be used for generating plans, then instructors could author a larger set of procedures. Multiple paths could support starting a procedure in a variety of initial states. Multiple paths could also support conditionally performing steps based on the state earlier in the procedure. The following sections discuss ways to use multiple paths.

8.4.1.1 Multiple Methods for Performing a Procedure

Originally, Diligent allowed the instructor to specify different orders of steps for performing a procedure. A different order of steps resulted in an additional path. This capability not only supported different initial states, but also allowed the relative order of some actions to be reversed in different paths. However, this capability was removed because of the problems described below and because none of the procedures authored with Diligent required this capability.

When a procedure has different paths for achieving its goals, the relative order of some actions in different paths might be reversed. If the paths are used to create a single plan, the plan could contain circular dependencies (i.e., step relationships) between its steps. This actually happened with the paths shown in figure 8.2. In the figure, M2 represents moving the handle to the second valve, and S2 represents shutting the second valve. There were two demonstrations, and each created a different path. In one demonstration, the handle was initially moved to the second valve, while in the other demonstration, it was initially moved to the first valve.
path A: move to second valve (M2) -> shut second valve (S2) -> move to first valve (M1) -> shut first valve (S1)
path B: M1 -> S1 -> M2 -> S2
Desired plan: go to first or second valve -> shut the valve -> go to the other valve -> shut the other valve
The problem is step relationships: ... -> M2 -> S2 -> M1 -> S1 -> M2 -> S2 -> ...

Figure 8.2: Incompatible Paths

If the identifiers used for the steps in one path are reused for the equivalent steps in the other path, the procedure will only contain four steps (i.e., S1, M1, S2 and M2). When the step relationships for the two paths are used in the same plan, there is a circular dependency between steps (i.e., the second valve needs to be shut before the first valve, and the first valve needs to be shut before the second valve). Because of this circularity, the plan cannot be executed without violating some of the dependencies.

One approach is to create ordering constraints that favor one path over another. However, it was unclear which was the best method for doing this. An approach used by Instructo-Soar [HL95] involves asking the user which step to prefer when multiple steps are applicable. However, this approach was not used by Diligent because we were focusing on machine learning rather than on complex interaction with the instructor.

A different problem appears when equivalent steps in different paths use different identifiers. In this case, the plan would contain eight steps. The problem with the resulting plan is that the first step of each path removes the state changes of the first step in the other path. Thus, a system using the plan could indefinitely move the handle back and forth between the first and second valves without ever shutting either one. One solution is associating each step with a distance from the goal state. If multiple steps are applicable, then the system could choose the step that is closest to the goal state [PK86]. However, using this approach would have required us to use a non-standard plan representation.

Another solution is to create several plans, or methods, for the procedure. For our purposes, a method is a plan of the procedure. When a procedure is started, an automated tutor would select the appropriate method. If there was a student error or an unexpected problem, the tutor might recover by switching to another method.

8.4.1.2 Conditional Plans

Diligent cannot learn conditional plans [PS92, DHW94, RN95]. A conditional plan contains branch steps, and different sequences of later steps are performed based on a decision made at a branch step. A branch step looks at the current state and determines which subsequent steps to perform based on whether its preconditions are satisfied. Branch steps can be thought of as creating a mental attribute whose value is a precondition for the steps following it. Consider the procedure "If the light is on, press buttons B and C; otherwise, just shut valve D." In this instance, the branch step checks whether the light is on.

Diligent could produce conditional plans by having two paths for every branch step. One path would represent an unsatisfied branch condition, while the other path would represent a satisfied branch condition. Some of the issues involved include:

• If a procedure already has multiple paths, how to incorporate demonstrations of each branch into multiple paths? Otherwise, the instructor may have to demonstrate the steps in a branch multiple times.

• Identifying the conditions that control which branch is performed. One heuristic is using the pre-state differences between the demonstrations of the two branches.

8.4.1.3 Disjunctive Goal Conditions

Diligent assumes that a procedure has conjunctive goal conditions.
However, disjunctive goals are sometimes desirable, especially in conditional plans. For example, a plan might have one goal state for successful execution and another for unsuccessful execution.

An important issue is how to handle subprocedures that have disjunctive goals. When inserting a subprocedure into a parent procedure, the parent needs to handle all the subprocedure's goal states. Furthermore, after adding a disjunct to a procedure's goals, any use of that procedure as a subprocedure may require updating each parent procedure.

8.4.2 Authoring

This section discusses extensions that could make authoring easier, especially if procedures or domains are complicated.

8.4.2.1 Additional Types of Demonstrations

Diligent supports two types of demonstrations (Section 4.2): one type adds steps to a procedure's plan, and the other type provides data for machine learning without adding steps to the plan. Only two types of demonstrations were needed because only simple procedures were required by the portion of the HPAC domain that was implemented. However, if the procedure representation were more complicated, then the following types of demonstrations might also be useful.

Alternative-step-order. This type of demonstration allows instructors to demonstrate a procedure's steps in a different order or from different initial states. These demonstrations would support more robust procedures and provide more data for learning. This type of demonstration was implemented and then later removed. (Section 8.4.1.1 discusses some of the issues.)

Branch. This type of demonstration would support conditional plans. The demonstration would start at the branch step and perform a sequence of steps based on the branch conditions. Because a branch requires at least two alternative sequences of steps, the pre-state of each sequence could help identify the branch conditions.

Undesirable-action. This type of demonstration would teach control knowledge. The environment would be put in a desired state and an undesirable action performed. The system could then compare pre-states where the action should be avoided to the pre-states where the action is applicable. One issue is how to incorporate this knowledge into the plan.

Applicable-state. This is similar to undesirable-action demonstrations, but in this case, performing the action in the pre-state is desirable. This type of demonstration would be useful for refining branch conditions and the preconditions of sensing actions.

8.4.2.2 Continuous/Parameterized Actions

Diligent supports actions where the only parameter associated with an action is the object selected by the instructor. However, successfully modeling some types of actions requires associating more parameters with the action. In the two domains used with Diligent, several actions have this property.

• There is a temperature gauge that shows the temperature associated with one of about a dozen sensors. The actual sensor shown is determined by the position of a rotary selector switch.

• The thrust of the ship's engines is determined by the position of a throttle.

Actions involving the selector switch and the throttle attempt to move the object into a desired position. These types of actions could be modeled by associating the desired position with the action.

8.4.2.3 Types of Mental Attributes

A mental attribute is an attribute that is stored internally by Diligent, or an automated tutor, and is not present in the environment. The type of information represented by a mental attribute is an important issue.
At least three types of mental attributes seem reasonable.

• The attribute is global. It represents the agent's knowledge of the world independent of which step sets its value. An example from a medical domain is whether someone's throat is obstructed.

• The attribute is specific to sensing actions in one procedure. For example, a conditional plan may test a light and repair it if it doesn't work. Because the state of the device is uncertain, this may involve several sensing actions that test whether the light is turned on. The fact that the light turns on might be treated the same regardless of which step actually observed that the light was on.

• The attribute is specific to one step. Diligent supports this type of attribute. For example, in Chapter 4, a mental attribute was used to represent that an alarm light was checked while the HPAC was in test mode. Later in the same procedure, another sensing action could have created a different mental attribute to store the result of checking the alarm light when the system was no longer in test mode.

Another issue is what to do with mental attributes that are created by reused subprocedures. If the same subprocedure is reused multiple times in the same procedure, how should Diligent distinguish between mental attributes that are created by the different abstract steps that represent this subprocedure?

8.4.2.4 Inferred Attributes

An inferred attribute represents an attribute whose value is inferred from the values of other attributes. Inferred attributes could help the instructor create fewer, more abstract attributes. Consider a subprocedure that checks four alarm lights. Instead of returning a mental attribute for each sensing action, the instructor might create a single mental attribute that indicates whether all four lights work.
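The four-lights example can be sketched in a few lines of code. This is an illustrative sketch only, not code from Diligent; the attribute names and the dictionary-based state representation are invented for the example. The inferred attribute is simply a function of the values of the underlying attributes:

```python
# Hypothetical sketch (not Diligent's implementation): an inferred attribute
# computed on demand from the values of other attributes in the state.
def all_alarm_lights_ok(state):
    """Inferred attribute: true only if every individual light check passed."""
    light_checks = ["light-1-ok", "light-2-ok", "light-3-ok", "light-4-ok"]
    return all(state.get(name) == "yes" for name in light_checks)

state = {"light-1-ok": "yes", "light-2-ok": "yes",
         "light-3-ok": "yes", "light-4-ok": "no"}
print(all_alarm_lights_ok(state))  # one failed check, so this prints False
```

The instructor would then work with the single derived attribute rather than the four sensing results it summarizes.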
Inferred attributes could also be used to create qualitative attributes that assign the state of the environment to one of several categories.6

One approach for creating inferred attributes is using Kelly's Personal Construct Psychology [Kel55], which has been used to acquire knowledge for expert systems [SG88, Boo85]. This approach is also known as a repertory grid. The basic idea is for the author to create an attribute that differentiates between several examples. As the author creates more attributes, he constructs a framework for viewing the domain.

8.4.3 Learning

We will first discuss some simpler extensions before discussing some more involved extensions.

8.4.3.1 Simple Extensions

There are a few simple ways that learning could be improved.

Negated preconditions. In contrast to Diligent's conditions, which specify the value an attribute must have, negated conditions specify the value an attribute cannot have. Negated conditions are occasionally useful as preconditions.

Negated conditions have already been used with version spaces by OBSERVER [Wan96c]. In OBSERVER, a negated precondition is added if a negative example's pre-state has an attribute that was missing from the environment in earlier positive examples. OBSERVER, however, will not detect that a negated precondition is required if the attribute is present in earlier positive examples.

With attributes that are constantly present in the environment, negated conditions are only needed if an attribute can take more than two values. Suppose that an attribute only takes values X and Y. If the value X was undesirable, the condition could simply specify that the value has to be Y.

6Diligent assumes that attributes have qualitative values.
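One simple detection heuristic can be sketched directly: flag an attribute value as a candidate negated precondition when that value never occurs in any positive example's pre-state while two or more other values of the same attribute do occur. The sketch below is illustrative only and is not Diligent's code; the attribute and value names are invented:

```python
# Illustrative sketch only; attribute and value names are invented.
def candidate_negated_preconditions(positive_prestates, all_values):
    """Flag (attribute, value) pairs that never occur in positive examples
    while two or more *other* values of the same attribute do occur."""
    candidates = []
    for attr, values in all_values.items():
        seen = {s[attr] for s in positive_prestates if attr in s}
        for v in values:
            if v not in seen and len(seen - {v}) >= 2:
                candidates.append((attr, v))
    return candidates

positives = [{"mode": "normal"}, {"mode": "standby"}]
domain = {"mode": ["normal", "standby", "test"]}
print(candidate_negated_preconditions(positives, domain))  # → [('mode', 'test')]
```

Here "test" never occurs in a positive pre-state while two other values of the attribute do, so "mode is not test" is a candidate negated precondition. With only one other observed value, the condition could simply require that value instead.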
Because Diligent's environment doesn't have attributes added or removed, Diligent couldn't use OBSERVER's approach for learning negated preconditions. However, negated preconditions could still be detected. A negated precondition might be needed if a specific attribute value is never present in positive examples, while two or more other values are present in positive examples.

Correlating attribute values between effects. Sometimes an attribute value is highly correlated with positive examples and poorly correlated with negative examples. Diligent could be extended to infer that these attribute values had a higher likelihood of being preconditions. If an attribute value gets a high enough likelihood, it could even be added to Diligent's heuristic preconditions (i.e., h-rep).

Disjunctive preconditions. Assuming that all relevant attributes are visible, a disjunctive precondition can be inferred when the version space collapses. Supporting disjunctive preconditions would probably require interaction with the instructor in order to identify the conditions that differentiate the disjuncts and to associate each positive example with the appropriate disjunct.

8.4.3.2 More Involved Extensions

The above extensions are reasonably simple. In this section, we will talk about extensions that would involve larger changes to Diligent.

Use structural knowledge. Diligent may have an unstructured environment. An unstructured environment contains a set of attribute values without any indication of the relations between attributes and objects. While Diligent's techniques can work in a structured environment, the techniques do not take advantage of knowledge about the environment's structure. If Diligent used knowledge of how attributes
were associated with objects and how objects were related, it might be able to do a better job of learning preconditions. This knowledge would allow Diligent to focus on the attributes of objects being manipulated by actions; this might be useful because some of these attributes are likely to be important. But more importantly, structural knowledge would make it easier to generalize operators so that they could apply to a class of objects, contain variables, or even use relations between objects.

Use a deeper domain model. When Diligent starts working on a new domain, it has no knowledge of the domain. It would be interesting to see how Diligent's approach to understanding demonstrations could be modified to exploit access to a deeper domain model.

8.4.4 Experimentation

In the chapter on experimentation, experiments were loosely defined as activities initiated by the system that acquire more knowledge. These activities included autonomously manipulating the environment as well as querying the user for more information.

The extensions to Diligent's techniques fall into two categories. One group contains extensions that follow naturally from Diligent's approach. The other group contains more involved extensions that could complement Diligent's approach.

8.4.4.1 Simple Extensions

The following extensions follow naturally from Diligent's approach.

• If the system has not seen enough examples of an action producing a desired state change, then ask the instructor for more data. This extension is inspired by Galdes' [Gal90] study of expert human tutors.

• Notice when a subprocedure's internal step relationships change. This can happen when the instructor explicitly works on the subprocedure, but it can also happen during experiments or during demonstrations when a subprocedure is inserted into another procedure. These situations are a good source of experiments and a good place to interact with an instructor.
• Notice when a subprocedure unexpectedly misses its goal conditions. At this point, Diligent needs information about where to add steps or why unused steps are necessary. This extension is also inspired by Galdes' [Gal90] study of expert human tutors.

• When all else fails, interact with the instructor. For example, if some necessary preconditions are missing, the system may continue to classify a negative example as positive. In this case, the instructor could be presented with several attribute values and asked about their importance. Interacting with the instructor to classify examples is explored in much greater depth by MOLE [EEMT87], which learns diagnostic knowledge from an expert by focusing on how to classify situations and differentiate between hypotheses. Other work has looked at engaging the instructor in a dialog in order to determine which action to perform in a given situation [HL95, Gru89].

8.4.4.2 More Involved Extensions

The following extensions are more involved than the ones in the previous section, but could complement Diligent's approach in some future system.

• Diligent avoids asking the instructor questions. However, the instructor's assistance could help when the system is confused and the number of questions and answers is limited. The ProDeGE+ graphics editor [BS93] explores this type of dialog.

• After Diligent has experimented on demonstrations, the system has better knowledge of operators. At this point, it may be appropriate for the system to experiment by creating plans. This could involve explicit experiments where the environment is put in a specific state so that an action can be tested, or it could involve solving practice problems where the environment is transformed from some initial state into a specified goal state.7 With human students, a similar approach is often used.
They examine the solutions of a few problems before solving some related problems.

8.5 Summary

We started the chapter by discussing Diligent's methods in terms of different perspectives for viewing demonstrations. We then discussed assumptions and how easily they could be relaxed. We finished by discussing various limitations and potential extensions.

7This use of plans was discussed in Section 6.2.3.

Chapter 9

Related Work

Throughout this document we've discussed related work where appropriate. This chapter covers other work that hasn't been discussed. The chapter focuses on three somewhat separate topics. The first topic is how to present examples in order to promote learning. The second topic is intelligent tutoring systems. The third topic is systems that learn from demonstrations. (Although many systems that learn from demonstrations have already been discussed, they have not been discussed as a group or as complete systems.)

9.1 The Presentation of Examples

Because demonstrations are the primary input that Diligent receives from instructors, we will briefly look at other work that deals with the presentation of similar types of data. We will first discuss properties of good instruction and then discuss how to present examples.

9.1.1 Felicity Conditions

Good instruction of human students follows a set of conventions. VanLehn [Van83] characterizes some of these conventions and calls them felicity conditions. VanLehn uses the felicity conditions in SIERRA [Van83, Van87], a system that models human students learning subtraction.

One difference between SIERRA and Diligent is the nature of their inputs. SIERRA receives an ordered sequence of lessons where each lesson can contain solutions to multiple similar problems.
In contrast, Diligent receives a sequence of demonstrations, and each demonstration corresponds to a lesson that contains the solution to only one problem.1

In the following discussion, keep in mind some of the differences between Diligent and human students. Humans require reinforcement and repetition of what they have learned, while Diligent never forgets. One advantage Diligent has is its access to a simulation, which can be used to perform experiments. Usually, human students don't have access to the equivalent of Diligent's simulation (e.g., when they are learning subtraction). Determining the beliefs of human students is much more difficult for a teacher than it is for Diligent's instructor, who can use menus to look directly at Diligent's knowledge.

A natural question is how well the relationship between Diligent and the instructor matches VanLehn's felicity conditions. Let us consider each of the felicity conditions.

• Assimilation. A procedure is incrementally improved by adding to the existing procedure without revising large portions of it. VanLehn writes, "incremental learning is an important and nearly universal feature of human skill acquisition" ([Van83], page 10).

Diligent's add-step demonstrations guarantee this felicity condition because the instructor indicates where to insert steps in an existing procedure. However, demonstrations can have a large impact when they alter step relationships.

Diligent's clarification demonstrations do not have an equivalent in SIERRA. A clarification demonstration provides data for machine learning without adding steps to a procedure's plan. However, clarification demonstrations also incrementally improve a procedure by refining operator preconditions.

1The types of demonstrations supported by Diligent are described in Section 4.2.
Because Diligent never forgets and does not get confused when switching between contexts, Diligent does not have the problems that humans do when large portions of a procedure are changed. This means that a machine learning system, like Diligent, may not need to follow this felicity condition. (However, following it may seem natural to an instructor.)

• Generalization. During a lesson, one way that a student learns is by generalizing the lesson's example solutions.

Diligent also does this when it uses machine learning techniques to learn preconditions. However, unlike a human student, Diligent does not make generalizations for whole classes of objects (e.g., how to log into all computers).

• Show work. At least in introductory lessons, all work should be shown.2

Diligent's instruction meets this condition. Diligent sees all relevant attributes of the environment and observes all actions performed in the environment. In fact, it is easier for Diligent's instructor to meet this felicity condition than it is for someone who teaches human students.

• One disjunct per lesson. In each lesson, the student should need to add at most one disjunction to his mental model of a procedure. A disjunct contains a sequence of actions and a conditional test to decide whether to perform the actions. VanLehn sometimes refers to disjuncts as subprocedures.

Because of different procedural representations, this is harder to characterize. One of Diligent's add-step demonstrations could be considered a disjunct because an add-step demonstration contains a sequence of steps that are inserted between existing steps.

2SIERRA also has lessons that optimize an existing procedure by showing how to eliminate unnecessary work, but this type of lesson may not be necessary.
However, Diligent learns preconditions for each step rather than one for the entire demonstration.

This felicity condition does not apply to clarification demonstrations because they don't add steps to the procedure's plan and because they can contain arbitrary sequences of steps. Unlike a human, Diligent can use clarification demonstrations because it does not forget and does not get confused when switching between contexts. Clarification demonstrations can be thought of as exploratory demonstrations in which an instructor illustrates the behavior of the environment.

Other people have adapted VanLehn's felicity conditions. When discussing felicity conditions, Wenger [Wen87] includes the condition minimal set of examples. This means that example solutions are sufficient to learn the new subprocedure.

However, Diligent does not assume that the instructor provides a minimal set of examples. Instead, Diligent uses heuristics to create a "reasonable" procedure that the instructor can then examine, edit and test. In this sense, Diligent, without instructor input or critique, is not expected to achieve the mastery or proficiency of human students, which makes Diligent's task much easier.

The felicity conditions have also been adapted for Programming By Demonstration (PBD) systems [CKM93]. Because Diligent uses PBD, these conditions are relevant.

• Be consistent. The steps in a demonstration need to be performed consistently in the same order.

Consistency is important for typical PBD systems because they only need to know how to automate a procedure. Because these systems do not usually have access to a simulation, they use induction to learn how to sequence a procedure's steps. In contrast, Diligent attempts to acquire the knowledge necessary for teaching, which requires more knowledge about the dependencies between steps.
Diligent not only needs to be able to answer questions, but it must also be able to monitor students as they perform a procedure. Because students may legitimately perform steps in a different order than any demonstration, Diligent needs to be able to recognize whether an alternative sequence of steps will achieve a procedure's goals.

Diligent can violate this felicity condition because it uses a simulation to induce operator preconditions that are independent of the current procedure. Later, when creating a plan, Diligent uses these preconditions to analytically derive the dependencies between steps. Because operator preconditions are not procedure specific, Diligent's clarification demonstrations do not cause a problem when they violate the "be consistent" felicity condition. In fact, clarification demonstrations are meant to violate this felicity condition.

• Correctness. The procedure is correctly demonstrated.

Diligent makes this assumption. If this assumption is violated, then Diligent can still learn, but the preconditions of its operators may not be as good.

• No extraneous activity. An extraneous step might not be incorrect, but it doesn't contribute to the goal.

One problem is that extraneous activities are likely to confuse or mislead a typical PBD system. Diligent can compensate for extraneous activity because it has access to a simulation. While extraneous steps in add-step demonstrations are undesirable, Diligent learns to skip them through its experiments. Thus, extraneous steps are not usually a problem for Diligent. Furthermore, this issue is not relevant for clarification demonstrations because they do not add steps to plans. In fact, extraneous activities are probably beneficial in clarification demonstrations because they provide more data for learning.

9.1.2 Presenting a Sequence of Examples

Other work by Mittal has focused on how to present sequences of examples to humans [Mit93, MP93].
This raises two issues: how well the inputs that make up Diligent's instruction meet these criteria, and how Diligent's abilities compare to a human student's.

In Mittal's work, "example" is used to describe the training data, and both Diligent's action-examples and demonstrations (i.e., sequences of action-examples) would be considered "examples." Mittal [Mit93] looked at many issues involved in the presentation of examples. Of these issues, the following appear to be relevant.

• Minimum detail. Studies have shown that people learn best when examples contain a minimum number of irrelevant features. Mittal calls this the minimum detail principle.

Diligent's action-examples do not meet this requirement; Diligent learns in an environment with a constant number of attributes. However, Diligent uses the minimum detail principle when it heuristically focuses on a small number of likely operator preconditions. For example, when creating an operator, Diligent assumes that the state changes of the demonstration's earlier steps are good candidate preconditions for the new operator.

The principle of minimum detail applies to Diligent's add-step demonstrations, which should not contain unnecessary steps. The principle also applies to Diligent's environment during demonstrations because only the instructor is performing actions. Like a human, Diligent learns best when there are few irrelevant details, but unlike a human, Diligent does not forget and can save negative action-examples until it is able to process them.

• Number of examples. If humans are given too many examples, they tend to have lapses of attention. Lapses of attention are not an issue with automated systems, and Diligent learns best with many examples.

• Order of presentation. The order of presentation helps avoid confusion and focuses the student's attention.
Simpler, more easily understood examples should be presented before more complicated and difficult examples. This is closely related to VanLehn's assimilation and one disjunct per lesson felicity conditions (Section 9.1.1).

Diligent learns best when its demonstrations represent small, modular and logically coherent procedures. However, since Diligent uses state changes from within a demonstration to identify likely preconditions, Diligent should learn better when it has a long logically coherent demonstration rather than several incrementally more complicated procedures.

• Pairing of examples. An example should highlight some feature. This means that there is a relationship between an example and the principle being taught. Mittal identifies three types of examples: a positive example is an instance of the concept being taught; a negative example is not an instance of the concept; and an anomalous example represents a special case or an exception.

Diligent processes positive and negative examples, but makes no special provision for anomalous examples. Anomalous examples are treated like any other example. If Diligent were to make special provision for anomalous examples, it would probably have to support disjunctive preconditions. For example, a device might normally be reset by pressing the reset button, but while in test mode, it might only be reset by pressing the system test button. In this example, resetting the device in test mode is an anomalous or special case.

Mittal discusses how pairing different types of examples teaches different principles. Pairing two positive examples allows students to identify unnecessary (or variable) features. Pairing a positive and a negative example allows students to identify necessary (or critical) features.
Furthermore, pairs of positive examples should be as dissimilar as possible, while a positive and negative example should be as similar as possible. In fact, Mittal writes that studies [Fel72, HMD73, KGF74, MT69] suggest that the most effective pairings of examples are minimally different positive and negative examples.

Like human students, Diligent's algorithms for refining operator preconditions also do best with maximally different positive examples and minimally different positive and negative examples. However, Diligent's demonstrations do not necessarily provide this type of data. An add-step demonstration only provides one action-example for each step. For a clarification demonstration, it is entirely dependent on the instructor whether the demonstration provides dissimilar positive examples or similar pairs of positive and negative examples.

Diligent overcomes its lack of action-examples by performing experiments. Experiments are derived from demonstrations and tend to produce similar pairs of positive and negative examples.

However, without the help of the instructor, Diligent cannot create very dissimilar pairs of positive examples. One obstacle is the minimal assumptions that Diligent makes about its ability to manipulate the environment. This problem might be addressed by using planning techniques to create more elaborate experiments.

Mittal also addresses a couple of issues that don't map well to Diligent. One issue is how advanced the material is. Because Diligent receives action-examples with a fixed number of attributes rather than increasing numbers of attributes, Diligent's input doesn't correspond well to the increasingly detailed training given humans. Another issue is the type of knowledge being taught. While Diligent can learn about relationships between inputs and outputs (i.e.
operators) and sequences of relationships (i.e., procedures), Diligent does not learn the types of concepts that a human learns (e.g., apples grow on trees).

9.2 Intelligent Tutoring Systems

Because Diligent creates procedures for a tutoring system, we need to discuss tutoring systems and authoring systems for tutoring systems. We will first discuss computer systems that provide instruction, and we will then discuss issues and approaches for authoring.

9.2.1 Computer Aided Instruction

Traditional forms of Computer Aided Instruction (CAI) require authors to create a fully specified presentation of the material, including questions and answers [Wen87, Ric89, Mur97]. This includes specifying the flow of control through the material. Because the material is grouped into fixed blocks or "frames" of knowledge, traditional CAI has been referred to as "electronic page turning" [Ric89].

While CAI is useful for some types of instruction, it has problems. CAI systems tend to be inflexible and allow only limited tailoring of instruction to individual students. The problem is that CAI systems know little or nothing about what is contained in the frames.

A reaction to traditional CAI is Intelligent Tutoring Systems (ITS) [Wen87, Ric89]. A main distinction between CAI and ITS is that, instead of using CAI frames, ITSs use the knowledge that was used to compose the frames [Wen87]. A primary characteristic of ITS is using this knowledge for multiple purposes. For example, the same piece of knowledge might be used for presenting material, formulating a question and answering it. Consider the STEVE tutor, which is used with Diligent. STEVE uses plans to demonstrate procedures, monitor students as they perform procedures, answer student questions, and recover from student errors. STEVE couldn't do this if it just knew about a fixed sequence of steps.
ITS research has focused in a number of areas. One area is modeling the student's knowledge. The model may include what students have seen as well as what the system believes about their knowledge [SS98, Wen87, Sel74, Car70]. Another area is modeling different teaching strategies; this includes how to present material, what type of questions to ask, and when to intervene [MAW97, Maj95, Hil94, SJ91, Wen87]. And a third area is using simulations to provide students with a richer, more complex and interactive learning environment [MJP+97, VD96, Wen87].

In this thesis, we have focused on authoring procedures for use with a simulation. We have ignored student modeling and teaching strategies because we have assumed that an automated tutor would already have knowledge of these activities.

Another problem with CAI systems is that authoring these systems takes a long time. According to Woolf and Cunningham, each hour of instruction typically requires 200 hours of development [WC87]. Ideally, by reusing knowledge, authoring knowledge for ITSs should be simpler than for a CAI, but this is not so. Not only do ITSs have additional capabilities, which require additional knowledge, but their knowledge also needs to be more structured. In fact, Murray [Mur97] has written that one of the biggest problems with ITS research is that ITSs are "difficult and expensive to build." For this reason, ITS authoring is an active area of research. The next few sections will discuss ITS authoring issues.

9.2.2 Who is the Author

A primary concern when considering an ITS authoring tool is what type of person will do the authoring. Is a tool designed primarily for experienced, expert users, or is it designed for a wider class of users? This is important because different tools are designed for different types of users.
When considering various approaches to ITS authoring, we will consider two types of authors.

• An instructional designer provides materials for many teachers and students. An instructional designer may have specialized training in instructional design and in the use of authoring tools. However, an instructional designer might have little interaction with teachers or students.

• A teacher authors material for his class. The teacher is unlikely to have the same specialized training as an instructional designer and will most likely have limited time for authoring. However, unlike an instructional designer, a teacher should have a lot of interaction with students.

Diligent focuses on quick and easy authoring of procedures so that its techniques could be used by a large class of users that includes both instructional designers and teachers.

9.2.3 Approach to Authoring

Because of the difficulty in creating an ITS, researchers have tried different approaches. Below are some of the basic ITS authoring approaches.

• Monolithic/evolutionary. These systems contain everything needed for instruction. This type of system attempts to incrementally evolve the state of the art of commercial CAI authoring tools. The system is usually targeted towards instructional designers. For example, EON adds improved modularity and abstraction to a CAI approach [Mur98]. In contrast, the IRIS Shell [AFCFG97] structures authoring around Gagne's theory of instructional design [GBW88].

A problem with this type of approach is the time involved. For example, Murray [Mur98] reports the success of an earlier ITS authoring tool that supported authoring an hour of instruction in 100 hours. He compares this favorably to the 100 to 300 hours of a traditional CAI approach. However, using this type of system doesn't have to be laborious.
REDEEM [MAW97, MA97] is targeted towards teachers rather than instructional designers. REDEEM allows teachers to reuse the content of an existing CAI course and to tailor the teaching strategies used with individual students. When given the CAI data, REDEEM appears easy to use.

• Framework. This type of system asks the instructor to provide data for use in a predefined instructional framework. The author will provide predefined types of data, and the system will reuse predefined pedagogical knowledge. This type of system is also monolithic.

Much of the work in this area has been done at Northwestern and has focused on Goal Based Scenarios (GBS) [Sch94, JK97, Bel98, DR98]. GBS systems have students work on several scenarios using the method determined by the given framework. For example, the Investigate and Decide framework requires students to make a decision after investigating the situation with a set of tools. Another framework, Persuade, lets students interact with simulated characters and build a consensus or change the positions of the simulated characters.

To author a GBS, the author provides scenarios, tools, video clips, questions and answers. The GBS framework will then integrate the data when providing instruction. Unfortunately, authoring with these systems can take several weeks [Bel98] or from 5 to 10 months [DR98]. Because of the time involved, a teacher is unlikely to author with existing GBS frameworks.

In contrast, XAIDA allows quick authoring (i.e., minutes to hours) [Red97, HHR99]. Little work is needed because XAIDA has a great deal of knowledge about how to present machine maintenance training. XAIDA focuses on what to present rather than how the domain works. For example, to teach a device's physical characteristics, the author labels portions of a picture and provides simple knowledge (e.g.
the function of a part). However, XAIDA is self-contained and cannot interact with a complex simulation of the device.

• Component. In this framework, a heterogeneous group of tools interact [RK96, RB98, JRSM98]. Not only can high-quality components be developed independently, but components can potentially be reused in other systems. However, when using this approach, knowledge is localized inside the components. For example, one component may know a great deal about teaching but little about the domain. This dissertation deals with the component framework and focuses on helping an author exploit the domain knowledge already contained in other components.

9.2.4 Easier Data Entry

All ITS authoring research focuses on making ITSs easier to author, but most work has focused on supporting the additional capabilities not found in CAI. Relatively few systems have focused on data acquisition with machine learning techniques or on extremely quick authoring. Systems that acquire knowledge quickly can do so because they focus on acquiring shallow knowledge about well-defined and constrained activities. One system that we've discussed is XAIDA [HHR99], which knows a great deal about generating instruction. XAIDA uses data provided by the instructor to instantiate a generic instruction template.

Another system, DIAG [Tow97b, Tow97a], focuses on teaching fault diagnosis. DIAG generates a probabilistic table of faults by modifying a simulation. Unlike Diligent, DIAG is contained in the simulation and can directly access and modify it.3

Demonstr8 [Ble97] can author an ACT tutor [A+95]. Demonstr8 allows the author to create the student's interface using a Graphical User Interface (GUI). Demonstr8 also induces expert behavior from examples. However, the version of Demonstr8 described in the paper can only create simple arithmetic tutors.
It is unclear how easily the system can be scaled to domains where functions are not simply lookup tables.

Recently, Disciple [TH96, TK90] has been used to inductively learn how to classify examples of a given concept [TK98]. Disciple first has the author build a semantic net of object classes and relationships. Then, the author provides Disciple with examples of a concept. Finally, Disciple asks the author whether other examples are members of the concept's class. Although Diligent learns procedures rather than individual concepts, the preconditions of Diligent's operators are similar to Disciple's concepts. Unlike Disciple, Diligent does not use a semantic net and can perform experiments that query the environment rather than the author.

Work at the University of Pittsburgh's Learning Research and Development Center has looked at using human-style reasoning to learn how to solve procedural problems (e.g., physics problems) [GCV98].4 This approach requires access to well-defined domain rules (e.g., physics laws) and problem-modeling techniques. In contrast, Diligent is made for domains where this type of knowledge is not readily available.

Although not strictly an ITS authoring tool, ODYSSEUS [Wil90, Cla86, Wen87] learns knowledge about medical diagnosis that can be used by the GUIDON family of ITSs. ODYSSEUS learns best by observing a physician make a diagnosis. It then attempts to explain the diagnosis using a domain model and a diagnostic strategy model. If an explanation is not found, it uses heuristics to make inductive changes to the domain model. ODYSSEUS is able to update the domain model because it uses a known problem-solving strategy and assumes that the domain model is almost correct. The validity of the approach was demonstrated in an experiment [Wil90]. After observing only two diagnoses, ODYSSEUS showed a 37% improvement in its ability to make a correct diagnosis.
This improvement occurred even though the physician misdiagnosed one of the two cases. However, ODYSSEUS differs from the other systems in this section because it uses a deep domain model. For example, the model used in the experiment was acquired over a seven-year period [Wil90].

3 DIAG is implemented in RIDES, which is an ITS authoring tool for simulations. After a simulation has been built, RIDES also supports quick authoring of instruction. RIDES is discussed in Section 9.3.1.
4 The work at the University of Pittsburgh has explored the use of the human Self-Explanation Effect, which was discussed in Section 6.8.1.

9.3 Learning From Demonstrations

This section talks about systems that learn from demonstrations. Specifically, it discusses systems that learn from traces. A trace is a record of the procedure being performed. If this type of system learns by observing users carry out their normal activities, the system is called a Learning Apprentice System (LAS) [MMS90].

9.3.1 Programming By Demonstration

Diligent's use of demonstrations to learn procedures is called Programming By Demonstration (PBD) [C+93]. Unlike simply recording a macro, PBD by definition requires some generalization. Diligent is an unusual PBD system in that it generates data by performing autonomous experiments.

PBD has been used for several purposes, such as creating user interfaces and learning procedures. We will focus on PBD systems that learn procedures. A traditional PBD system learns procedures in order to automate tasks. This involves making procedures work on multiple objects and determining which conditions indicate a change in a procedure's flow of control. A condition that indicates a change in the flow of control is called a branch condition. An example of a branch condition is a condition that indicates whether to exit a loop.
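The loop case can be made concrete. The sketch below is illustrative only (the function and the trace are hypothetical, not drawn from any of the PBD systems cited here): it shows the kind of generalization that mere macro recording cannot provide, namely detecting that a block of actions in a single trace repeats, which is the first step towards hypothesizing a loop and its branch (exit) condition.

```python
# Hypothetical sketch: find an immediately repeated block of actions in a
# recorded trace.  A PBD system could treat such a block as a loop body
# and then look for the branch condition that ends the repetition.

def find_loop(trace):
    """trace: list of action names.  Return (start, body, repeats) for the
    longest immediately repeated block, or None if nothing repeats."""
    n = len(trace)
    for size in range(n // 2, 0, -1):          # prefer the longest body
        for start in range(0, n - 2 * size + 1):
            body = trace[start:start + size]
            repeats = 1
            # count how many times the body repeats back to back
            while trace[start + repeats * size:start + (repeats + 1) * size] == body:
                repeats += 1
            if repeats > 1:
                return start, body, repeats
    return None

trace = ["open_valve", "pump", "check_gauge", "pump", "check_gauge",
         "pump", "check_gauge", "close_valve"]
print(find_loop(trace))   # -> (1, ['pump', 'check_gauge'], 3)
```

Having found the repeated block, a PBD system would then compare the environment state at each repetition boundary with the state at loop exit; an attribute that differs only at exit (e.g., a gauge finally reaching its target reading) becomes the candidate branch condition.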
However, traditional PBD systems do not attempt to learn in detail how steps depend on each other (i.e., step relationships). This means that they would not be able to recognize whether a different sequence of steps was valid or to provide explanations about the dependencies between steps.

The actions in Diligent's demonstrations bear a lot of similarity to those of robotic PBD systems [FMD+96, Hei93, Hei89, And85]. However, these systems concentrate on eliminating sensor noise and finding loops and branch conditions. Like traditional PBD systems, these systems learn to perform a task without learning the step relationships required for the type of teaching that Diligent supports.

A system that can use demonstrations to learn similar types of procedures as Diligent is the RIDES [MJP+97, MJSW93] authoring system.5 RIDES is one of the most used ITS authoring systems. It supports authoring of graphical simulations without a great deal of programming expertise. RIDES also supports the ability to enter many types of training exercises, and it is the training exercises that are relevant to Diligent.6 While training exercises use the executable simulation model, training exercises are separate objects that contain little knowledge about the model. Unlike Diligent's plans, these training exercises do not contain detailed knowledge about the dependencies between steps (i.e., causal links). Because less has to be known about the procedure, it is much easier to demonstrate in RIDES than it is in Diligent. Authoring with RIDES involves demonstrating the procedure and interacting a little with menus. However, because RIDES' exercises lack causal links, RIDES can only provide limited help and remediation.

9.3.2 Detailed Domain Models

An early robotic demonstration system that requires only one demonstration is ARMS [Seg87].
Unlike Diligent, ARMS relies on a detailed domain model and a geometric reasoner to deduce a procedure's structure. Like ARMS, another system that uses a detailed domain model is LEAP [MMS90]. LEAP uses its theoretical knowledge of circuits to learn how to implement components of a circuit. LEAP relies on Explanation Based Learning (EBL) [MKKC86, DM86] and can only learn when its domain theory can explain a training example. In contrast, Diligent starts with little domain knowledge and focuses on acquiring the domain theory necessary for explaining a procedure.

LEX [MUB83] does not learn procedures; instead, it is given a set of operators and learns when to perform them. LEX starts knowing a set of mathematical transforms that it uses to solve symbolic integration problems. These transforms are analogous to the operators that Diligent learns. Instead of receiving traces as input, LEX uses the solutions to problems that it has solved. LEX is relevant because it tries to maximize the use of its limited problem solutions by minimally modifying a problem and then attempting to solve it.

5 Diligent's environment is controlled by a version of RIDES called VIVIDS.
6 RIDES' procedure and goal-patterned exercises are similar to Diligent's procedures.

CELIA [Red92] can learn machine maintenance procedures similar to those learned with Diligent. However, instead of learning procedures for teaching humans, CELIA models human performance and learning. Because CELIA contains a detailed but possibly incomplete domain model, CELIA, unlike Diligent, is able to learn complicated troubleshooting tasks. CELIA receives high-level English descriptions of diagnostic procedures and can ask the user questions when it gets confused or discovers problems in its domain model.
Because CELIA emphasizes reducing gaps in its knowledge, CELIA only learns when a failure identifies missing knowledge. In contrast, Diligent can learn from both success and failure because it attempts to reduce the uncertainty in its knowledge. CELIA focuses on learning how the diagnostic goals of a procedure's steps are related rather than the low-level preconditions that Diligent uses for creating step relationships. Like many case-based systems, CELIA's indexing of the steps of a procedure is very dependent on the order in which CELIA receives training examples.

9.3.3 Procedure Recognition

Two systems that require traces of slightly increasing complexity are SIERRA [Van87, Van83] and NODDY [And85]. SIERRA models children learning subtraction and creates procedures in the form of AND-OR graphs, while NODDY, an early PBD system for two-dimensional robots, learns procedures in the form of flow charts. Both systems learn incrementally but non-interactively from traces. These systems learn by matching their model of a procedure against a trace to find differences. Because these systems need to match existing procedures, they rely on the user adding little complexity per trace. For example, in Section 9.1.1 we discussed SIERRA's assumption that the instructor adds only one disjunct per lesson. Diligent avoids this problem by requiring the instructor to specify the position where steps are inserted. Because SIERRA and NODDY learn the control knowledge necessary to perform a procedure rather than operators that model the domain, they can only use traces that illustrate how to perform a procedure and could not use traces similar to Diligent's clarification demonstrations. Additionally, neither system refines its knowledge with experiments.

9.3.4 University of Michigan Soar Group

Soar [LNR87] is a production system that implements a unified theory of human cognition [New90]. Diligent was written in the environment of the Soar community.
In fact, the tutor used with Diligent, STEVE, is implemented primarily as Soar productions. The work on instructable agents in Soar at Michigan heavily influenced Diligent's interaction with the instructor.

Instructo-Soar [HL95, Huf94, HL93] receives tutorial instruction in a manner similar to Diligent but in English rather than by direct manipulation. Unlike Diligent, a user can tell Instructo-Soar what to do in hypothetical situations (e.g., "when the light is red, press the green button"). Unlike Diligent, which learns operators, Instructo-Soar is given a set of general-purpose operators that model actions performed in its domain. Unlike Diligent, Instructo-Soar does not modify its operators and does not refine its knowledge by performing autonomous experiments. Instead of learning plans, Instructo-Soar uses its operators to learn when to reactively perform actions (i.e., operator proposal rules). If Instructo-Soar's operators were correct, it could generate the step relationships that Diligent learns. However, if Instructo-Soar's operators were incomplete or incorrect, then it would have problems generating Diligent's step relationships. If Instructo-Soar's operators cannot explain a demonstration, it uses heuristics to create operator proposal rules that allow it to correctly perform a procedure.

Instructo-Soar has been extended by IMPROV [PL96, Pea96], which refines its knowledge with experiments. IMPROV performs procedures and refines its knowledge when failure is detected. Unlike Diligent, IMPROV can handle noise and work in dynamic domains whose properties change. IMPROV experiments by performing actions in the environment during a search for a sequence of steps that achieves a procedure's goals. In contrast, Diligent can learn without failure and doesn't care whether its experiments achieve the procedure's goals.
When IMPROV finds a successful plan, it learns to perform the plan's steps in the same order as the successful plan. IMPROV does this by learning reactive rules to propose operators. The problem with this approach is that it doesn't learn about alternative orders of steps that would also achieve the goals. This means that IMPROV's approach doesn't learn good preconditions for deriving Diligent's step relationships. Another problem is how IMPROV represents its reactive rules. IMPROV never forgets a rule even though it may have missing or unnecessary preconditions; instead, IMPROV creates a patchwork of overlapping, prioritized rules. It appears likely that a human instructor would find this representation difficult to comprehend and verify.

9.3.5 Approach to Experimentation

Diligent's approach to experimentation is most similar to PET's approach [PK86]. Unlike Diligent, PET learns relational rules, which use arbitrarily complex domain-dependent transformations to change the state before the action into the state after the action. In contrast to Diligent, which modifies a demonstration by changing the actions that are performed, PET modifies a demonstration by changing the state. PET's approach to experimenting requires complete control of the state and involves repeatedly performing an action after making fine-grained changes to the state. Because Diligent does not have complete control over the state, it could not use PET's approach.

9.3.6 Systems that Learn Operators

Operators model actions performed in the environment and identify the preconditions necessary to produce various state changes. Diligent is unusual in that it learns operators that are only applied to a few instructor-specified procedures. In contrast, other systems learn operators for solving general planning problems.
These systems experiment by solving practice planning problems, where an initial state is transformed into a goal state. In contrast, Diligent experiments by modifying its demonstrations. Diligent doesn't care about an experiment's final state because its experiments focus on identifying dependencies between the given procedure's steps.

A system that systematically refines its operators is EXPO [Gil92, CG90]. EXPO refines operator preconditions when an unexpected state change is observed while solving planning problems. Unlike Diligent, EXPO is given a set of incomplete operators with their preconditions partially specified. EXPO then refines its operators by adding preconditions. Unlike Diligent, EXPO cannot remove incorrect preconditions. EXPO introduces general heuristics for proposing preconditions that rely on the similarity of objects and the relationship between objects and actions. Unlike Diligent, EXPO can also learn a new procedure by analogy to an existing procedure that uses similar classes of objects. In contrast, Diligent does not have a hierarchy of object classes, and many of Diligent's objects (e.g., switches) have idiosyncratic behavior that prevents reuse of operators with different objects (e.g., switch1 turns on the motor, while switch2 turns on a light).

A system that heavily influenced Diligent is OBSERVER [Wan96c, Wan96a, Wan95, Wan96b]. Unlike Diligent, OBSERVER generalizes the objects and attributes in its operators. Diligent doesn't do this because it has less knowledge of its environment and many objects in its environment have idiosyncratic behavior. OBSERVER learns operators by observing traces of many demonstrations and solving many planning problems. In contrast, Diligent has only a few demonstrations and does not solve planning problems. Unlike Diligent, OBSERVER does not consider the relationship between steps in a demonstration when hypothesizing preconditions.
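The contrast drawn above can be illustrated with a small sketch. The names and states below are hypothetical, and this is the general intersection idea rather than Diligent's or EXPO's actual algorithm: a conjunctive precondition set that starts over-specific (every attribute of the observed pre-state) is pared down, because each additional successful execution of the step under a different pre-state discards candidates shown to be unnecessary — the direction of refinement that EXPO, which only adds preconditions, does not support.

```python
# Hypothetical sketch of refining a conjunctive precondition set.
# States and preconditions are attribute -> value mappings.

def initial_preconditions(pre_state):
    """Most-specific hypothesis: the entire observed pre-state is assumed
    necessary, deliberately over-committing (errors of commission)."""
    return dict(pre_state)

def refine(preconditions, observed_pre_state):
    """Keep only candidates that still held when the step succeeded;
    anything that differed is demonstrably unnecessary."""
    return {attr: val for attr, val in preconditions.items()
            if observed_pre_state.get(attr) == val}

# Demonstration: 'press_start' succeeded with this pre-state.
pre = initial_preconditions({"power": "on", "cover": "closed", "light": "red"})

# Experiment: skipping the step that turned the light red still let
# 'press_start' succeed, so 'light = red' is not a real precondition.
pre = refine(pre, {"power": "on", "cover": "closed", "light": "green"})
print(pre)   # -> {'power': 'on', 'cover': 'closed'}
```

Under this representation, an unnecessary precondition (error of commission) is removed by a single contradicting success, whereas a missing precondition (error of omission) would require re-expanding the hypothesis — one reason errors of commission are the easier kind to eliminate.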
9.3.7 Other Work

A system that learns a different type of operator than Diligent is TRAIL [Ben95]. TRAIL processes demonstrations and uses inductive logic techniques to learn reactive teleo-operator proposal rules. Teleo-operators [BN95] model actions that can have a duration. Unfortunately, TRAIL learns only one definite state change per operator. The operator's other state changes have a probability of appearing. This would be unacceptable for teaching procedures in a domain where an action can change the values of multiple attributes. It might be possible to learn different conditional effects for different state changes; however, because a conditional effect's state change is definite, it is unclear whether TRAIL's probabilistic learning algorithm would still be useful.

Recent work by Bauer [Bau98] takes a different approach to understanding traces. Unlike Diligent, which focuses on the attributes in the environment, Bauer looks at acquiring plans using relationships between arguments of different actions. For a number of reasons, this approach is inappropriate for Diligent. The program that is learning procedures (e.g., Diligent) may not know how some objects are related to each other. This means that performing an action may cause changes in distant objects that appear to be unrelated to the object acted upon. For example, pressing a button may turn on a fan in another room. Furthermore, in Diligent's procedures, every step may manipulate a different object; thus, the arguments to a procedure's actions often have little commonality. Finally, Diligent produces procedures for a type of tutoring that requires causal links between steps, and causal links are derived from the preconditions of steps.
Some of these preconditions may involve the environment's state rather than the properties of the objects being manipulated.

LIVE [She93, She94] is a system that uses autonomous exploration and experiments by creating plans. Because LIVE doesn't focus on learning user-specified procedures, it is unclear how well it would scale to more complex domains, because of the time involved and the lack of focus. Because LIVE doesn't process traces, its main relevance is its machine learning techniques. Besides experimenting with plans, LIVE learns rules to predict when an action will produce given state changes; these prediction rules are similar to the preconditions of Diligent's conditional effects (or effects). Unlike Diligent, LIVE requires structural domain knowledge and only learns from prediction failure. LIVE's approach to learning prediction rules, Complementary Discrimination Learning (CDL), updates prediction rules by comparing the prediction rules for different effects. However, the updated rules can contain both disjuncts and negated conditions. When compared to Diligent's simple conjunctive preconditions, the representation of prediction rules may seem overly complex to a human instructor.

Chapter 10

Conclusion

In this last chapter, we will summarize this thesis and its contributions. We will also discuss some potential future work.

10.1 Summary of the Approach

This thesis looks at the problem of authoring procedures for an automated tutor that is used in a heterogeneous, simulation-based training environment. To teach, the automated tutor needs certain capabilities.
It must be able to demonstrate procedures for human students, monitor students as they perform procedures, answer questions about a procedure, and recover from student errors and unusual environment states. Monitoring students is difficult because students may use a valid sequence of steps that is different from what was demonstrated, and answering questions is difficult because missing or incorrect information causes confusion. It is assumed that the tutor has general knowledge of how to teach but is missing knowledge of the procedures that it teaches.

Unfortunately, acquiring knowledge from domain experts (e.g., instructors) can be difficult. Domain experts may not be programmers or expert knowledge engineers. Therefore, Diligent exploits the presence of a simulation to make authoring easier. The techniques explored in this thesis could potentially allow non-programmers to author procedures by demonstrating them with a graphical interface that represents the state of a simulation.

Less work is required from an instructor because Diligent uses the simulation to perform experiments. These experiments allow Diligent to get answers to questions from the simulation instead of the instructor. Because Diligent can answer its own questions, not only is there less chance of instructor error, but Diligent also needs fewer demonstrations. Because less data is required from the instructor, the difficulty of authoring is also reduced.

One way that Diligent's techniques could help an instructor is by providing feedback about its beliefs. For example, Diligent uses three sets of preconditions (i.e., s-rep, h-rep and g-rep), and each set represents a different level of confidence. When users look at preconditions, Diligent indicates its level of confidence that a given precondition is necessary.
For example, preconditions that only appear likely (in h-rep but not in g-rep) have a lower level of confidence than preconditions that have been shown to be necessary (in g-rep).

Because Diligent may have very little knowledge, it uses heuristics to speed up learning. It assumes that the state changes of earlier steps are likely to be preconditions of later steps. It also uses a heuristic, best-guess precondition concept (i.e., h-rep) that is in between the upper and lower bounds of its version space. Unlike the version space bounds, the h-rep supports error recovery by allowing preconditions to be both added and removed.

Diligent also bounds the cost of experimentation. Its experiments change the order of a procedure's steps by skipping a step and observing what happens to later steps. Because the purpose of an experiment is to perform the steps rather than to achieve some goal state, experiments perform a bounded number of steps. Additionally, in experiments on hierarchical procedures, Diligent only experiments on the current procedure and treats subprocedures of the current procedure as single steps.

A nice aspect of Diligent's approaches to experimentation and to learning operators is that they balance each other. When operators are created, the preconditions tend to have errors of commission (i.e., unnecessary preconditions). On the other hand, by skipping steps, experiments tend to identify errors of commission. Furthermore, in Diligent's version space learning algorithm, errors of commission are easier to eliminate than errors of omission (i.e., missing preconditions).

10.2 Contributions

The main contributions are the following.

• A method that balances the strengths and weaknesses of demonstrations and experiments. Experiments are used to identify missing or unnecessary preconditions, but can more easily identify unnecessary preconditions.
For this reason, operators are created during demonstrations using heuristics that have a bias towards creating unnecessary preconditions. While creating operators, the system uses a novel heuristic that focuses on how earlier steps in a demonstration establish preconditions for later steps. Because experiments compensate for the bias towards creating unnecessary preconditions, Diligent can learn a great deal from a single demonstration.

• A method for performing useful and focused experiments while requiring only minimal knowledge. The approach only needs to know the sequence of steps in a demonstration. The approach exploits the simulation to focus on how the state changes of early steps in a demonstration affect later steps. This approach effectively transforms one demonstration into multiple related demonstrations.

A lesser highlight of the thesis is that it also presents algorithms that show how to transform demonstrations into hierarchical, partially ordered plans. These algorithms, additionally, provide the framework that supports learning operators and performing experiments.

10.3 Evaluation

An empirical evaluation using human subjects was performed (Chapter 7). The evaluation looked at the benefits of both demonstrations and experiments. The analysis of the study focused on contrasting a simple versus a complex procedure. The study suggested that both experiments and demonstrations help, and that they help more on complex procedures.

10.4 Future Work

Earlier, in Chapter 8, we discussed a number of extensions. Some of the extensions for demonstrations required multiple paths (or sequences of steps) for performing a procedure. This would allow additional types of demonstrations and more complicated procedural representations, including conditional plans.
Some of the extensions for machine learning include supporting disjunctive preconditions, using structural knowledge and using a deeper domain model. Some of the extensions for experiments include practice problems and modifying experiments in response to unexpected events.

However, the techniques discussed here could be used for other purposes. Diligent's techniques could help systems that learn general-purpose operators for planning by helping them to better understand demonstrations and the solutions to practice problems, which could be treated like demonstrations.

In Diligent's current project, procedures are learned for a tutor that uses a virtual environment. However, Diligent only requires a graphical interface and not a three-dimensional virtual environment. One potential application is creating procedures that teach people how to run a factory using a two-dimensional display of various controls and indicators.

Diligent could also be used by students who are attempting to understand a device. Students could identify the state changes produced by manipulating various controls. Students could also use Diligent to learn preconditions and to learn procedures.

Another use is debugging simulations, especially when a simulation is developed by an external organization. A major problem with simulations is that it is often difficult to determine what type of calculations they perform internally. This means that it is difficult to know how normal results are reached or what the simulation will do in unusual situations. A non-programmer domain expert could test a simulation by authoring procedures and looking at the preconditions. This could identify missing and unnecessary preconditions. It could also allow the domain expert to identify situations where the simulation behaves in an undesirable manner.
Reference List

[A+95] John R. Anderson et al. Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4(2):167-207, 1995.
[AFCFG97] A. Arruarte, I. Fernadez-Castro, B. Ferrero, and J. Greer. The IRIS shell: "How to build ITSs from pedagogical and design requisites". International Journal of Artificial Intelligence in Education, 8:341-348, 1997.
[AIS88] Jose A. Ambros-Ingerson and Sam Steel. Integrating planning, execution and monitoring. In AAAI 1988, pages 735-740, 1988.
[AJR97] Richard Angros, Jr., W. Lewis Johnson, and Jeff Rickel. Agents that learn to instruct. In AAAI 1997 Fall Symposium Series: Intelligent Tutoring System Authoring Tools, pages 1-8. AAAI Press, November 1997. Technical Report FS-97-01.
[Ana83] J. Anania. The influence of instructional conditions on student learning and achievement. Evaluation in Education, 7:1-92, 1983.
[And85] Peter Merrett Andreae. Justified Generalization: Acquiring Procedures From Examples. PhD thesis, MIT, 1985.
[Ang87a] D. Angluin. Learning regular sets from queries and counter-examples. Information and Computation, 75(2):87-106, 1987.
[Ang87b] D. Angluin. Queries and concept learning. Machine Learning, 2(4):319-342, 1987.
[Bal93] Cecile Balkanski. Actions, Beliefs and Intentions in Multi-Action Utterances. PhD thesis, Harvard University, May 1993.
[Bau98] Mathias Bauer. Acquisition of abstract plan descriptions for plan recognition. In Fifteenth National Conference on Artificial Intelligence, pages 936-941, Madison, Wisconsin, July 1998. The AAAI Press / The MIT Press.
[Bel98] Benjamin Bell. Investigate and decide learning environments: Specializing task models for authoring tool design. The Journal of the Learning Sciences, 7(1):65-105, 1998.
[Ben95] Scott Benson. Inductive learning of reactive action models. Machine Learning: Proceedings of the 12th International Conference, pages 47-54, 1995.
[Ble97] Stephen B. Blessing. A programming by demonstration authoring tool for model-tracing tutors. International Journal of Artificial Intelligence in Education, 8, 1997.
[Blo84] B. S. Bloom. The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, pages 4-16, June/July 1984.
[BN95] Scott Benson and Nils J. Nilsson. Reacting, planning, and learning in an autonomous agent. In Koichi Furakawa, Donald Michie, and Stephen Muggleton, editors, Machine Intelligence, volume 14, pages 29-64. Oxford University Press, 1995.
[Boo85] J. H. Boose. A knowledge acquisition program for expert systems based on personal construct psychology. International Journal of Man-Machine Studies, 23(5):495-525, 1985.
[BS93] Michael S. Bocionek and Siegfried B. Sassin. Dialog-based learning (DBL) for adaptive interface agents and programming-by-demonstration systems. Technical Report CMU-CS-93-175, School of Computer Science, Carnegie Mellon University, July 1993.
[BSP85] Alan Bundy, Bernard Silver, and Dave Plummer. An analytical comparison of some rule-learning programs. Artificial Intelligence, 27:137-181, 1985.
[Bur83] A. J. Burke. Students' potential for learning contrasted under tutorial and group approaches to instruction. PhD thesis, University of Chicago, 1983.
[BV96] Alberto Del Bimbo and Enrico Viario. Visual programming of virtual worlds animation. IEEE Multimedia, 3(1), 1996.
[C+93] Allen Cypher et al., editors. Watch What I Do: Programming by Demonstration. The MIT Press, 1993.
[Car70] J. R. Carbonell. AI in CAI: An artificial intelligence approach to computer-assisted instruction. IEEE Transactions on Man-Machine Systems, 11(4):190-202, 1970.
[CBL+89] M. T. H. Chi, M. Bassok, M. W. Lewis, P. Reimann, and R. Glaser. Self-explanations: How students study and use examples to solve problems. Cognitive Science, 13:145-182, 1989.
[CBN89] Allan Collins, John Seely Brown, and Susan E. Newman. Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In Lauren B. Resnick, editor, Knowing, learning, and instruction: essays in honor of Robert Glaser, pages 453-494. L. Erlbaum Associates, Hillsdale, N.J., 1989.
[CG90] Jaime G. Carbonell and Yolanda Gil. Learning by experimentation: The operator refinement method. In Yves Kodratoff and Ryszard S. Michalski, editors, Machine Learning: An Artificial Intelligence Approach, volume III, pages 191-213. Morgan Kaufmann, San Mateo, CA, 1990.
[Chi97] Michelene T. H. Chi. Quantifying qualitative analysis of verbal data: A practical guide. The Journal of the Learning Sciences, 6(3):271-315, 1997.
[CKM93] Allen Cypher, David S. Kosbie, and David Maulsby. Characterizing PBD systems. In Allen Cypher et al., editors, Watch What I Do: Programming by Demonstration. The MIT Press, 1993.
[Cla86] William J. Clancey. From GUIDON to NEOMYCIN and HERACLES in twenty short lessons: ONR final report 1979-1985. AI Magazine, 7(3):40-60, August 1986.
[CLCL94] Michelene T. H. Chi, Nicholas De Leeuw, Mei-Hung Chiu, and Christian LaVancher. Eliciting self-explanations improves understanding. Cognitive Science, 18:439-477, 1994.
[CLR90] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, 1990.
[Coh92] Philip R. Cohen. The role of natural language in a multimodal interface. In UIST '92, pages 143-149, Monterey, California, 1992.
[CS95] Allen Cypher and David Canfield Smith. KIDSIM: End user programming of simulations. In SIGCHI '95, pages 27-34, Denver, Colorado, May 1995. ACM SIGCHI.
[CV91] Michelene T. H. Chi and Kurt A. VanLehn. The content of physics self-explanations. The Journal of the Learning Sciences, 1(1):69-105, 1991.
[Dav84] Randall Davis. Interactive transfer of expertise. In Bruce G. Buchanan and Edward H. Shortliffe, editors, Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project, pages 171-205. Addison-Wesley Publishing Company, 1984.
[DHP+94] Judy Delin, Anthony Hartley, Cecile Paris, Donia Scott, and Keith Vander Linden. Expressing procedural relationships in multilingual instructions. In Proceedings of the Seventh International Workshop on Natural Language Generation, pages 61-70, Kennebunkport, ME, 1994.
[DHVV94] Denise Draper, Steve Hanks, and Daniel Weld. Probabilistic planning with information gathering and contingent execution. In Proceedings of the Second International Conference on Artificial Intelligence Planning Systems, pages 31-36, Chicago, Illinois, 1994. AAAI Press.
[Di94] Barbara Di Eugenio. Action representation for interpreting purpose clauses in natural language instructions. In Proceedings of the Fourth International Conference on Knowledge Representation and Reasoning, 1994.
[DM86] Gerald DeJong and Raymond Mooney. Explanation-based learning: An alternative view. Machine Learning, 1(2):145-176, 1986.
[DR98] Wolff Daniel Dobson and Christopher K. Riesbeck. Tools for incremental development of educational software interfaces. In CHI 98, pages 384-391, Los Angeles, CA, 1998.
[EEMT87] Larry Eshelman, Damien Ehret, John McDermott, and Ming Tan. MOLE: A tenacious knowledge-acquisition tool. International Journal of Man-Machine Studies, 26:41-54, 1987.
[ES84] K. A. Ericsson and H. Simon. Protocol Analysis: Verbal reports as data.
[Fel72] [FMD+96] [Gai87] [GaI90] [GBW88] [GCV98] [Gil92] [GMAB93] [Gru89] [Ham89]
MIT Press, Cambridge, MA, 1984. Katherine Voerwerk Feldman. The effects of the num ber of positive and negative instances, concept definitions, and em phasis of relevant attributes on the attainm ent of m athem atical concepts. In Proceedings o f the Annual Meeting o f the Am erican Educational Research Association, Chicago, Illinois, 1972. H. Friedrich, S. M unch, R. Dillman, S. Bocionek, and M. Sassin. Robot programming by dem onstration (RPD): Supporting th e induction by human interaction. Machine Learning, 23:163-189, 1996. Brian R. Gains. An overview' of knowledge-acquisition and transfer. Int. J. Man-Machine Studies, 26:453-472, 1987. Deborah Krawczak G aldes. A n Empirical study o f H um an Tutors: The Im plications for Intelligent Tutoring Systems. PhD thesis, The Ohio State Uni versity, 1990. R. M. Gange, L. J. Briggs, and Wager W. W . Principles o f Instructional Design. Holt, Rinehart and Winston, third edition, 1988. Abigail S. G ertner, C ristina Conati, and K urt VanLehn. Procedural help in andes: G enerating hints using a bayesian network stu d en t model. In Fifteenth National Conference on Artificial Intelligence (A A A I 1998). pages 106-111, Madison, Wisconson, 1998. Yolanda Gil. Acquiring Domain Knowledge fo r Planning by Experimentation. PhD thesis, Carnegie Mellon University, 1992. S. Goldin-Meadow, M. W . Alibali, and R. Breckinridge Church. Transitions in concept acquisition: Using the hand to read the mind. Psychological Review, 100:279-298, 1993. Thomas R. G ruber. A utom ated knowledge acquisition for strategic knowl edge. Machine Learning, 4:293-336, 1989. Kristian J. Hammond. C H E F. In C. Reisbeck and R. Shank, editors, Inside Case-Based Reasoning. Lawrence Erlbaum Associates, Hillsdale, NJ, 1989. 251 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [Hau88] [Hei89] [Hei93] [HHR99] [Hil94] [HL93] [HL95] [HMD73] [HMP97] [Hof87] [HS91] [Huf94] [JH95] David Haussler. 
Q uantifying inductive bias: Artificial intelligence learning algorithm s and valiant’s learning framework. Artificial Intelligence, 36:177— •221, 1988. Rosanna Heise. D em onstration instead of of programming: focussing a t tention in robot task acquisition. Technical Report Research Report No. 89/360/22, University of Calgary, September 1989. Rosanna Heise. Program m ing robots by example. International Journal o f Intelligent System s, 8:685-709, 1993. Patricia Y. Hsieh, Henry M. Halff, and Carol L. Redfield. Four easy pieces: Development system s for knowledge-based generative instruction. Interna tional Journal o f Artificial Intelligence in Education, 10, 1999. Randall W . Hill, J r . Impasse-driven tutoring for reactive skill acquisition. Technical Report J P L Publication 94-9, Je t Propulsion Laboratory. Califor nia Institute of Technology, April 1994. Reprint of University of Southern California PhD thesis. Scott B. Huffman and John E. Laird. Learning procedures from interactive natural language instructions. In P. Utgoff, editor, M achine Learning: Pro ceedings o f the Tenth International Conference, volume 15, page : a total of 12. ?, Am hearst, M ass., June 1993. Scott B. Huffman and John E. Laird. Flexibly instructable agents. Journal of Artificial Intelligence Research, 3:271-324, 1995. John C. Houtz, J. W illiam M oore, and J. Kent Davis. Effects of different types of positive and negative examples in learning ” non-dimensioned" concepts. Journal o f Educational Psychology, 64(‘ 2):206-211, 1973. Haym Hirsh, Nina M ishra, and Leonard P itt. Version spaces w ithout bound ary sets. In Proceedings o f the Fourteenth National Conference on Artificial Intelligence, pages 491-496. AAAI Press/The M IT Press, 1997. Robert R. Hoffman. T he problem of extracting the knowledge of experts from the perspective of experim ental psychology. A I Magazine, 8(‘ 2):53-67, 1987. David Hume and Claude Sam m ut. 
Using inverse resolution to learn relations from experim ents. In Proceedings of the Eighth Machine Learning Workshop, Evanston, II, July 1991. Scott B. Huffman. Instructable Autonomous Agents. PhD thesis, University of Michigan, 1994. B. Jordan and A. Henderson. Interaction analysis: Foundations and practice. The Journal o f the Learning Sciences, 4:39-103, 1995. •252 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [JK97] [JRSM98] [Kel55] [KF93] [KGF74] [KM93] [I<ri95] [KW88] [Lan80] [Lew92] [Lie94] [LNR87] [LW99] [MA97] Menachem Jo n a and Alex Kass. A full-integrated approach to authoring learning environm ents: Case studies and lessons learned. In AAAI 1997 Fall Symposium Series: Intelligent Tutoring System Authoring Tools, pages 39-43. AAAI Press, Novem ber 1997. Technical R eport FS-97-01. W . Lewis Johnson, Jeff Rickel, R. Stiles, and Allen M unro. Integrating peda gogical agents in to virtual environm ents. Presence: Teleoperators and Virtual Environm ents, 7(6):523-546, December 1998. G . A. Kelly. The psychology o f personal constructs. Norton, New York, 1955. David K urlander and Steven Feiner. A history o f editable graphical histo ries. In Allen C ypher et al., editors, Watch What I Do: Programming by Demonstration, pages 405-413. The M IT P ress, 1993. Herbert J. Klausm eier, E. S. G hatala, and D. A. Frayer. Conceptual Learning and Development, a Cognitive View. Academic Press, New York, 1974. David S. Kosbie and Brad A. Myers. A system -w ide macro facility based on aggregate events: A proposal. In Allen C ypher e t al., editors, Watch What I Do: Programming by Demonstration, pages 433-444. The MIT Press, 1993. Balachander Krishnam urthy, editor. Practical Resusable UNIX Software. John Wiley & : Sons, New York, NY, 1995. Brent J. Krawchuk and Ian H. W itten. On asking the right questions. In 5th International M achine Learning Conference, pages 15-21. Morgan Kaufmann, June 1988. P. Langley. 
Finding common paths as a learning mechanism. In Third Con ference o f the Canadian Society fo r C om putational Studies of Intelligence, pages 12-19, 1980. John D. Lewis. Task acquisition from instruction. M aster's thesis, University of Calgary, 1992. Henry Lieberman. A user interface for knowledge acquisition form video. In Twelfth N ational Conference o f the A m erican Association for Artificial Intelligence, A ugust 1994. John E. Laird, Allen Newell, and Paul S. Rosenbloom. Soar: An architecture for general intelligence. Artificial Intelligence, 3 3 (l):l-6 4 , 1987. Tessa A. Lau and Daniel S. Weld. Program m ing by demonstration: An in ductive learning form ulation. In 1999 International Conference on Intelligent User Interfaces, pages 145-152, Redondo Beach, CA, January 1999. Nigel M ajor and Shaaron Ainsworth. Developing intelligent tutoring systems using a psychologically motivated authoring environm ent. In AAAI 1997 Fall Symposium Series: Intelligent Tutoring S ystem Authoring Tools, pages 53-59. AAAI Press, Novem ber 1997. Technical R eport FS-97-01. 253 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [Maj95] [Mau94] [MAW97] [ME89] [Mit78] [Mit82] [Mit93] [MJP+97] [MJSVV93] [MKKC86] [MMS90] [MP93] [MR91] Nigel M ajor. Modeling teacliing strategies. Journal o f Artificial Intelligence in Education, 6 (2 /3 ):117-152, 1995. David Maulsby. Instructable Agents. PhD thesis, University of Calgary, Ju n e 1994. Nigel M ajor, Shaaron Ainsworth, and David W ood. REDEEM : Exploiting the symbiosis between psychology and authoring environments. International Journal o f Artificial Intelligence in Education, 8:317-340, 1997. Chris Mellish and Roger Evans. N atural language generation from plans. Com putational Linguistics, 15(4), 1989. Tom M. Mitchell. Version Spaces: A n Approach to Concept Learning. PhD thesis, Stanford University, 1978. Tom M. Mitchell. Generalization as search. 
Artificial Intelligence, 18:203- 226, 1982. Vibhu O. M ittal. Generating n atural language descriptions with integrated text and examples. Technical R eport ISI/RR-93-392, U SC/Inform ation Sci ences Institute, Septem ber 1993. Allen M unro, M ark C. Johnson, Quentin A. Pizzini, David S. Surmon, D ou glas M. Towne, and Jam es L. Wogulis. A uthoring simulation-centered tu to rs with RIDES. International Journal o f Artificial Intelligence in Education, 8:284-316, 1997. A. M unro, M. C. Johnson, D. S. Surmon, and J. L. Wogulis. A ttribute- centered sim ulation authoring for instruction. In Proceedings o f the A I-E D 93 World Conference o f Artificial Intelligence in Education, pages 82-89. E d inburgh, Scotland, 1993. Tom M. Mitchell, Richard M. Keller, and Sm adar T. Kedar-Cabelli. Explanation-based generalization: A unifying view. Machine Learning, 1(1):47— 80, 1986. Tom M. Mitchell, Sridbar M abadevan, and Louis I. Steinberg. LEAP: A learning apprentice for VLSI design. In Machine Learning An Artificial Intel ligence Approach, volume III, pages 271-289. Morgan Kaufmann, San M ateo, CA, 1990. Vibhu O. M ittal and Cecile L. Paris. Generating natural language descrip tions with examples: Differences between introductory and advanced texts. In Proceedings o f the Eleventh N ational Conference on on Artificial Intelligence, pages 271-276, W ashington, DC, Ju ly 1993. David McAllester and David Rosenblitt. System atic nonlinear planning. In Proceedings o f the Ninth N ational Conference on Artificial Intelligence (AAAI-91), pages 634-639, Menlo Park, CA, 1991. AAAI Press. 254 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [MT69] [MUB83] [Mur97] [Mur98] [Mus93] [MW93] [MWM94] [New90] [Nor88] [OC96] [Ous94] [Pea96] [PK86] [PL96] S. M. M arkle and P. W. Tiem ann. Really Understanding Concepts. Stipes Press, U rbana, Illinois, 1969. Tom M. M itchell, Paul E. Utgoff, and Ranan Banerji. 
Learning by experimen tation: Acquiring and refining problem-solving heuristics. In R. Michalski, J. Carbonell, and T. Mitchell, editors, M achine Learning An Artificial Intel ligence Approach, volume I. M organ Kaufm ann, San Mateo, CA . 1983. Tom M urray. Expanding the knowledge acquisition bottleneck for intelligent tutoring system s. International Journal o f A rtificial Intelligence in Educa tion, 8:2*22-232, 1997. Tom M urray. Authoring knowledge-based tutors: Tools for content, instruc tional strategy, student model, and interface design. The Journal o f the Learning Sciences, 7(l):5-64, 1998. M ark A. M usen. An overview of knowledge acquisition. In J. M . David, J. P. Krivine, and R. Simmons, editors, Second Generation Expert System s. pages 405-427. Springer-Verlag, 1993. David M aulsby and Ian H. W itten. M etam ouse: An instructible agent for pro gram m ing by dem onstration. In What What I Do: Programming by Demon stration. T he M IT Press, 1993. Antonija M itrovic, Ian H. W itten, and David L. Maulsby. An experiment in the application of similarity-based learning to programming by example. International Journal o f Intelligent System s, 9:341-364, 1994. Allen Newell. Unified Theories o f Cognition. H arvard University Press, 1990. Donald A. N orm an. The Psychology o f Everyday Things. Basic Books, New York, 1988. Tim Oates and Paul R. Cohen. Searching for planning operators with context- dependent and probabilistic effects. In Proceedings o f the Thirteenth National Conference on Artificial Intelligence, pages 863-868, 1996. John K. O usterhout. Tel and the Tk Toolkit. Addison-Wesley, Reading, M assachusetts, 1994. Douglas John Pearson. Learning Procedural Planning Knowledge in Complex Environm ents. PhD thesis, University of M ichigan, 1996. Bruce W. P orter and Dennis F. Kibler. Experim ental goal regression: A method for learning problem-solving heuristics. Machine Learning, 1:249— *286, 1986. Douglas J. Pearson and John E. Laird. 
Toward incremental knowledge cor rection for agents in complex environm ents. In S. Muggleton, D. Michie. and K. Furukawa, editors, Machine Intelligence, volume 15. Oxford University Press, 1996. 255 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [Pol90] [PS92] [PV96] [PVF+95] [RB98] [Red92] [Red97] [Ren97] [Ric89] [RJ99] [RK96] [RN95] [RS90] M artha Pollack. Plans as complex mental attitu d es. In Phil Cohen, Jerry Morgan, and M artha Pollack, editors, Intention in Communication. MIT Press, 1990. Mark A. Peot and David E. Smith. Conditional nonlinear planning. In Pro ceedings o f the First International Conference on A rtificial Intelligence Plan ning System s, pages 189-197, College Park, M aryland, 1992. Morgan Kauf man n. Cecile Paris and Keith Vander Linden. An interactive support tool for writing multilingual manuals. IE E E Computer, 29(7):49-56, 1996. Cecile Paris, Keith Vander Linden, M arkus Fischer, Anthony Hartley, Lyn Pemberton, Richard Power, and Donia Scott. A support tool for writting multilingual instructions. In Proceedings o f the Fourteenth International Joint Conference on Artificial Intelligence, pages 1398-1404, Montreal, C anada, 1995. Steven R itter and Stephen B. Blessing. A uthoring tools for component-based learning environm ents. The Journal o f the Learning Sciences. 7(1):107-132, 1998. Michael A. Redmond. Learning by observing and understanding expert prob lem solving. PhD thesis, Georgia Institute of Technology, 1992. Carol Luckhardt Redfield. An ITS authoring tool: Experim ental advanced instructional design advisor. In A A A I 1997 Fall Sym posium Series: Intelli gent Tutoring System Authoring Tools, pages 72-78. AAAI Press, November 1997. Technical Report FS-97-01. Alexander Renkl. Learning from worked-out examples: A study on individual differences. Cognitive Science, 21(1):1— 29, 1997. Jeff W. Rickel. 
Intelligent computer-aided instruction: A survey organized around system com ponents. IEEE Transactions on System s, Man and Cy bernetics, 19(l):40-57, 1989. J. Rickel and W. L. Johnson. Anim ated agents for procedural training in virtual reality: Perception, cognition, and m otor control. Applied Artificial Intelligence, 1999. Steven R itter and Kenneth R. Koedinger. An architecture for plug-in tutor agents. Journal o f Artificial Intelligence in Education, 7(3/4):315-347, 1996. Stuart J. Russell and Peter Norvig. Artificial Intelligence A M odem Ap proach. Prentice Hall Series in artificial intelligence. Prentice Hall, 1995. Ronald L. Rivest and Robert E. Schapire. A new approach to unsupervised learning in determ inistic environments. In Yves K odratoff and Ryszard S. Michalski, editors, Machine Learning: An Artificial Intelligence Approach. volume III, pages 670-684. Morgan Kaufmann, San M ateo, CA, 1990. 256 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [RS97] [SB86] [Sch53] [Sch94] [SCS94] [Seg87] [Sel74] [SG88] [She93] [She94] [She97] [SJ91] [SMP95] Charles Rich and C andace L. Sidner. COLLAGEN: W hen agents collabo rate with people. In Proceedings of the First International Conference on Autonom ous A gents, pages 284-291, February 1997. Claude Sam m ut and Ranan B. Banerji. Learning concepts by asking ques tions. In R. S. M ichalski, J. G. Carbonell, and T. M. Mitchell, editors, Ma chine Learning: A n Artificial Intelligence Approach, volume II, pages 167- 191. Morgan K aufm ann, Los Altos, CA, 1986. H. SchefFe. A m ethod for judging all contrasts in the analysis of variance. Biometrika, 40:87-104, 1953. Roger C. Schank. Goal-based scenarios: A radical look at education. The Journal o f the Learning Sciences, 4(3):429-453, 1994. David Canfield Sm ith, Allen Cypher, and Jim Spoher. KIDSIM: Program ming agents w ithout a programming language. CACM , 94(7):55-67, July 1994. A lberto M aria Segre. 
A learning apprentice system for mechanical assembly. In IE E E Third Conference on Artificial Intelligence Applications, pages 112— 117, 1987. J.A . Self. Student models in computer-aided instruction. International Jour nal o f M an-M achine Studies, 6:261-276, 1974. Mildred L. G. Shaw and Brian R. Gaines. An interactive knowledge-elicitation technique using personal construct technology. In Knowledge Acquisition fo r Expert System s: A Practical Handbook, pages 109-136. Plenum Press, New York, 1988. Wei-Min Shen. Discovery as autonomous learning from the environment. M achine Learning, ll(4):250-265, 1993. Wei-Min Shen. A utonom ous Learning From The Environm ent. W. H. Free m an, New York, NY, 1994. Michael Shermer. Why People Believe Wierd Things. W. H. Freeman and Company, New York, NY, 1997. Roger C. Schank and Menachem Y. Jona. Empowering the student: New perspectives on th e design of teaching systems. The Journal o f the Learning Sciences, l(l):7 -3 5 , 1991. Randy Stiles, Laurie M cCarthy, and Michael Pontecorvo. Training studio: A virtual environm ent for training. In Workshop on Simulation and Interaction in Virtual Environm ents (SIVE-95), Iowa City, IW, July 1995. ACM Press. 257 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [SR90] [SS98] [Tec92] [TH96] [THD95] [TK90] [TK98] [Tow97a] [Tow97b] [Utg86] [Van83] Benjamin D. Smith and Paul S. Rosenbloom. incremental non-backtracking focusing: A polynomial bounded generalization algorithm for version spaces. In Proceedings o f the Eighth National Conference on Artificial Intelligence, pages 848-853, 1990. Raymund Sison and Masamichi Shim ura. Student modeling and machine learning. International Journal of Artificial Intelligence in Education, 9:128— 158, 1998. Gheorghe Tecuci. A utom ating knowledge acquisition as extending, updating, and im proving a knowledge base. IEEE Transactions on System s, Man and Cybernetics, 22(6):1444-1460, 1992. 
Gheorghe Tecuci and Michael R. Hieb. Teaching intelligent agents: the Disciple approach. International Journal o f Human-Computer Interaction, 8(3):259-285, 1996. G heorghe Tecuci, Michael R. Hieb, and Tom asz Dybala. Building an adaptive agent to m onitor and repair the electrical power system of an orbital satellite. In Goddard Conference on Space Applications o f Artificial Intelligence and Emerging Information Technologies, pages 57-71, NASA G oddard, Greenbelt, M aryland, 1995. Gheorghe Tecuci and Yves Kodratoff. Apprenticeship learning in imperfect domain theories. In Machine Learning A n Artificial Intelligence Approach, volume III, pages 514-552. Morgan K aufm ann, San M ateo, CA , 1990. Gheorgie Tecuci and Harry Keeling. Delevoping intelligent educational agents with the Disciple learning agent shell. In Barry P. G oettl, Henry M. Halff, Carol L. Redfield, and Valerie J. Shute, editors, Intelligent Tutoring Systems: 4th international conference, pages 454-463. Springer-Verlag, Berlin, 1998. Douglas M. Towne. Approximate reasoning techniques for intelligent diagnos tic instruction. International Journal o f Artificial Intelligence in Education. 8:262-283, 1997. Douglas M. Towne. Diagnostic tutoring using qualitative sym ptom infor m ation. In A A A I 1997 Fall Symposium Series: Intelligent Tutoring System Authoring Tools, pages 86-95. AAAI Press, November 1997. Technical Report FS-97-01. Paul E. Utgoff. shift of bias for inductive concept learning. In Machine Learning A n Artificial Intelligence Approach, volume II. pages 107-148. Mor gan K aufm ann, Los Altos, CA, 1986. Kurt VanLehn. Felicity conditions for hum an skill acquisition: Validating an Al-based theory. Research report no. CIS-21, Xerox Palo Alto Research Center, 1983. 258 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
[Van87] [Van93] [Van99] [VCP+95] [VD96] [VJC92] [VM95] [Wan95] [Wan96a] [Wan96b] [Wan96c] [WC87] [Wd90] [Wel94] Kurt VanLehn. Learning one subprocedure per lesson. Artificial Intelligence. 31:1-40, 1987. Keith Vander Linden. Speaking o f Actions: Choosing Rhetorical Status and Grammatical Form in Instructional Text Generation. PhD thesis, University of Colorado, D epartm ent of C om puter Science, 1993. K urt VanLehn. Rule-learning events in the acquisition of a complex skill: An evaluation of Cascade. The Journal o f the Learning Sciences, 8( 1):71— 125, 1999. M anuela Veloso, Jaim e G . Carbonell, M. Alicia Perez, Daniel B orrajo, Eugene Fink, and Jim Blythe. Integrating planning and learning: The PRODIGY architecture. Journal o f Experim ental and Theoretical Artificial Intelligence, 7(1), January 1995. W .R. VanJoolingen and T . DeJong. Design and im plem entation of simulation-based discovery environments: the SMISLE solution. Interna tional Journal o f A rtificial Intelligence in Education, 7:253-276, 1996. K urt VanLehn, R andolph M. Jones, and Michelene T. H. Chi. A model of the self-explanation effect. The Journal of the Learning Sciences, 2(1): 1— 59, 1992. Keith Vander Linden and Jam es H. Martin. Expressing rhetorical relations in instructional test: A case study of the purpose relation. Computational Linguistics, 21:29-57, M arch 1995. Xuemai Wang. Learning by observation and practice: An increm ental ap proach for planning o p erato r acquisition. In The 12th International Confer ence on Machine Learning, 1995. Xuemai Wang. A m ultistrategy learning system for planning o perator ac quisition. In The Third International Workshop on Multistrategy Learning, Harpers Ferry, West Virginia, M ay 1996. Xuemai Wang. Planning while learning operators. In The Third International conference on artificial planning systems, May 1996. Xuemei Wang. Learning Planning Opemtors by Observation and Practice. PhD thesis, Carnegie Mellon University, 1996. 
[WC87] Beverly Woolf and Patricia A. Cunningham. Multiple knowledge sources in intelligent teaching systems. IEEE Expert, 2(2):41-54, 1987.
[Wd90] Daniel S. Weld and Johan de Kleer, editors. Readings in Qualitative Reasoning About Physical Systems. Morgan Kaufmann, San Mateo, CA, 1990.
[Wel94] Daniel S. Weld. An introduction to least commitment planning. AI Magazine, pages 27-61, Winter 1994.
[Wen87] Etienne Wenger. Artificial Intelligence and Tutoring Systems: Computational and Cognitive Approaches to the Communication of Knowledge. Morgan Kaufmann Publishers, Inc., Los Altos, California, 1987.
[Wil90] David C. Wilkins. Knowledge base refinement as improving an incorrect and incomplete domain theory. In Machine Learning: An Artificial Intelligence Approach, volume III, pages 493-513. Morgan Kaufmann, San Mateo, CA, 1990.
[WW72] Thomas H. Wonnacott and Ronald J. Wonnacott. Introductory Statistics for Business and Economics. John Wiley & Sons, 1972.
[You97] R. Michael Young. Generating Descriptions of Complex Activities. PhD thesis, University of Pittsburgh, 1997.
[YPL77] Richard M. Young, Gordon D. Plotkin, and Reinhard F. Linz. Analysis of an extended concept-learning task. In Proceedings of the Fifth International Conference on Artificial Intelligence, page 285, 1977.

Appendix A

Implementation

A.1 Architecture

[Figure A.1: The VET Software Architecture — a schematic showing the Diligent and STEVE modules inside the Soar agent, together with the audio effects and visual interface components]

Diligent was implemented in the context of the Virtual Environments for Training (VET) project [JRSM98]. For purposes of modularity, the different components run as separate processes on possibly different machines. The project uses Silicon Graphics workstations running versions 6.2-6.5 of the IRIX operating system.
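The process-per-component design can be illustrated with a minimal sketch of the dispatcher pattern the architecture relies on: components register interest in message types, and the dispatcher routes each message to its subscribers. This is a hypothetical Python illustration, not VET's actual ToolTalk-based code; all class, method, and message names here are invented for exposition.

```python
# Hypothetical sketch of a message-dispatcher architecture like VET's.
# Components subscribe to message types; the dispatcher routes each
# message to every handler that subscribed to its type.
from collections import defaultdict

class MessageDispatcher:
    def __init__(self):
        self._subscribers = defaultdict(list)   # message type -> handlers

    def subscribe(self, msg_type, handler):
        self._subscribers[msg_type].append(handler)

    def dispatch(self, msg_type, payload):
        for handler in self._subscribers[msg_type]:
            handler(payload)

# Example: a "simulation" component notifies a "tutor" component.
dispatcher = MessageDispatcher()
log = []
dispatcher.subscribe("action-performed", lambda payload: log.append(payload))
dispatcher.dispatch("action-performed", "press-power-button")
```

Because components only share message types, each one can run in its own process and be replaced independently, which is the modularity the VET project was after.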
Figure A.1 shows a schematic of the VET architecture.

Message Dispatcher. The software components talk to each other via the message dispatcher. For this we are using Sun's ToolTalk.

Visual Interface. The visual interface is the graphical representation of the environment, provided by Lockheed Martin's Vista Viewer [SMP95]. On the VET project, two types of visual interfaces are supported: a browser on the computer console and an immersive virtual reality environment that uses a head-mounted display and data gloves. Because of the need to use the keyboard and to interact with Diligent's menus, Diligent only supports authoring with the browser. However, once a procedure has been authored, the procedure can be used to teach students with either the browser or the immersive environment.

Audio Effects. Human students have the ability to hear various sound effects on their head-mounted display. Diligent does not deal with this capability.

Speech Generation. STEVE is able to speak to students. Diligent's test subjects used this capability when testing procedures. This capability is provided by Entropic's TrueTalk.

Speech Recognition. This component allows students to communicate with STEVE agents. The capability is provided by Entropic's GrapHvite. Diligent does not deal with actions that involve communication.

Simulation. The simulation controls the environment. It is implemented with VIVIDS, which is a version of RIDES [MJP+97]. RIDES was developed at the USC Behavior Technology Laboratory (BTL). The people at BTL modified VIVIDS so that Diligent was able to save and restore environment configurations.

Soar Agent. The Soar agent [LNR87] is a production system that contains both the STEVE (Soar Training Expert for Virtual Environments) tutor [RJ99] and the Diligent authoring program.
STEVE and Diligent are separate modules that behave like separate programs. STEVE uses a synthetic body (Figure A.2) to interact with students, performing activities such as demonstrating procedures and pointing to or looking at objects. STEVE is primarily implemented as Soar productions and uses tksoar version 7.0.0.beta, TCL version 7.4 and TK version 4.0. Diligent is primarily implemented in TCL/TK [Ous94]. Most of Diligent resides in the same process as STEVE, but the code that produces graphs of procedures has its own process. The graph process uses the tkdot portion of the Graph Visualization tools from AT&T Laboratories and Bell Laboratories (Lucent Technology) [Kri95], along with TCL version 7.6, TK version 4.2 and the TK Dash patch by Jan Nijtmans.

[Figure A.2: The STEVE Tutoring Agent]

A.2 Maintenance of Agenda

One of the problems faced by a system like Diligent is properly sequencing its input. Input can come from both the environment and the instructor; furthermore, the environment and the instructor can be sending input at the same time. Additionally, some activities may involve a sequence of behaviors, some of which can take variable amounts of time. For example, consider initializing the environment before the start of a procedure's second demonstration. The following activities take place.

1. The instructor is asked for an initial environment configuration. (Assume that the configuration matches the first configuration.)

2. The environment is reset.

3. At this point, Diligent may have records of actions performed in the environment that have not yet been processed. In order to prevent confusion, Diligent deletes these records.
4. Actions in the path's prefix are replayed.

5. The instructor can now add additional actions to the prefix.

6. The instructor indicates that the demonstration should start.

Because we want the instructor to have maximum flexibility for interleaving activities, it is inappropriate to encode a fixed sequence of activities in a procedure or grammar. An approach is needed that allows maximum flexibility and minimizes the code that handles special cases. To solve this problem, Diligent manages the interaction with an agenda. Diligent's agenda is a stack of lists. Each level in the stack corresponds to one procedure, and each list contains activities to perform for that procedure. The dialog with the instructor is focused on the procedure at the top of the stack. To prevent confusion and to avoid problems, some activities are only allowed on the top procedure. The restricted activities include demonstrations, experiments and using STEVE to test the procedure. (Using an agenda greatly reduced the user interface's complexity.)

This approach was influenced by other work. The idea of a stack of procedures where the agent focuses on the top procedure was borrowed from Instructo-Soar [HL95]. The idea for each level of the agenda to contain a list of activities was inspired by COLLAGEN [RS97], whose agenda is a stack of plans for managing user interaction.

A.3 Providing Feedback About Diligent's Beliefs

Another problem faced by Diligent is providing feedback about its confidence in aspects of its knowledge base (e.g., how certain is Diligent that a causal link is correct?). By providing feedback, Diligent indicates what it believes strongly as well as areas of uncertainty where the instructor could focus. To compute its confidence, Diligent could have used a formal numeric approach, such as certainty factors, fuzzy logic or Dempster-Shafer theory [RN95]. However, Diligent may have little data from which to calculate numeric values.
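Returning to the agenda of Section A.2, its stack-of-lists discipline — one stack level per procedure, with certain activities permitted only on the top procedure — can be sketched roughly as follows. This is an illustrative Python sketch, not Diligent's actual TCL/TK implementation; the class, method, and activity names are invented for exposition.

```python
# Illustrative sketch (not Diligent's real code) of an agenda kept as a
# stack of lists: each stack level is one procedure, holding the pending
# activities for that procedure.  Restricted activities (demonstrations,
# experiments, testing) are only permitted on the top-of-stack procedure.
RESTRICTED = {"demonstrate", "experiment", "test-with-steve"}

class Agenda:
    def __init__(self):
        self._stack = []   # each entry: (procedure name, list of activities)

    def push_procedure(self, name):
        self._stack.append((name, []))

    def top(self):
        return self._stack[-1][0]   # the dialog focuses here

    def add_activity(self, procedure, activity):
        for name, activities in self._stack:
            if name == procedure:
                if activity in RESTRICTED and name != self.top():
                    raise ValueError(
                        f"{activity} is only allowed on the top procedure")
                activities.append(activity)
                return
        raise KeyError(procedure)

agenda = Agenda()
agenda.push_procedure("start-compressor")
agenda.push_procedure("open-valve")              # now the dialog's focus
agenda.add_activity("open-valve", "demonstrate") # allowed: top of stack
```

Attempting `agenda.add_activity("start-compressor", "demonstrate")` would raise an error, mirroring Diligent's rule that demonstrations may only target the top procedure.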
Furthermore, it was hypothesized that numeric values might confuse users who are not experts in this area. Instead, Diligent uses a small set of symbolic status values to describe its beliefs. These values were not described earlier because they are a minor part of the system and are not based on a rigorous theory. Nevertheless, the status values are important for two reasons.

• The status values are used in the user interface, which is described in Appendix D.

• The status values are useful when using multiple paths to generate a plan.³

²Using an agenda greatly reduced the user interface's complexity.

Status value   Used in a plan?   Meaning
required       yes               The instructor has indicated that it must be used.
suspect        yes               The instructor has indicated that it must be used, but he appears to have made a mistake. (The object will still be used.)
provisional    yes               Likely to be correct.
ignored        no                Likely to be correct but not needed.
unlikely       no                Appears to be incorrect.
useless        no                Evidence strongly suggests that it is incorrect.
rejected       no                The instructor has indicated that it should not be used.

Table A.1: Status Values Used by Diligent

The status values used by Diligent are shown in Table A.1. The types of objects that have status values are preconditions, goal conditions, causal links and ordering constraints. By default, Diligent gives a status value of provisional to objects that it believes to be needed. The status values required and rejected are only used when the instructor explicitly indicates whether or not the object should be used when building a plan. The status value ignored is only used with ordering constraints involving a step that represents the procedure's initial state or goal state. These status values are similar in concept to the three sets used to contain operator preconditions (i.e.
s-rep, h-rep and g-rep) but are not the same; preconditions in the h-rep and g-rep have a status of provisional unless the instructor indicates that they should be required.

As mentioned earlier, the status values are useful when hypothesizing goal conditions using multiple paths. A useful heuristic is that a condition is a goal condition when the condition's attribute value changes during at least one path and the condition is present in the final state of every path. These hypothesized goal conditions are given a status of provisional. The heuristic also identifies conditions that appear to be goal conditions in some paths but not in others. These conditions have a status of unlikely or suspect. Conditions with a status of unlikely indicate that the instructor may have made an error in one of the paths, while conditions with a status of suspect indicate an error because a previously required goal condition is not satisfied in the final state of at least one path.

³The capability to generate a plan from multiple paths was removed from Diligent. Some of the issues are described in Section 8.4.1.1.

Appendix B

Evaluation Materials

This appendix contains material used for evaluating Diligent.

B.1 Background Questionnaire

The first thing subjects did was fill out this questionnaire.

Name: Date:

1) Educational background. How many total years of education do you have (e.g. 12 years of high school + 4 years of college + 2 years of graduate school)? What degrees do you have and in what subjects? If you are a graduate student, when did you start graduate school?

2) How old are you? <25 <30 <35 <40 <50 >50

3) Are you male or female?

4) Are you color-blind? If so, in what way?

5) Are you right or left handed?

6) Do you have a personal computer at home?

7) Do you use a computer at work?

8) What is your occupation?
9) How many hours a week do you typically use a computer?

10) During a typical week, what are your primary activities on a computer and how many hours do you spend on each?
programming:
word processing:
browsing:
using a spreadsheet:
other (name the activities):

11) What are the main activities you have performed on a computer in the last week? About how many hours have you spent on each?
1.
2.
3.

12) Which programming languages do you have a lot of experience with?

13) How would you rate yourself as a computer programmer?
a) not a programmer b) novice c) intermediate d) good e) expert

14) Circle the following topics for which you feel that you have significant knowledge.
a) AI planning techniques b) machine learning induction techniques c) programming by demonstration d) high pressure air compressor maintenance e) machine maintenance in general f) Diligent: the system we are testing

15) How would you rate your ability to read English?
a) poor b) moderate c) good d) excellent e) English is my first language

B.2 Procedure Representation Description

This section was read by subjects near the start of the first day's training. The subjects then filled out the worksheet on Diligent's procedure representation (Section B.3). Note that the tutorial uses the term "ordering relationships" instead of the term "step relationships" that is used in this thesis.

In this section, we'll discuss how procedures are represented. First, we need to define some terminology. The environment is represented by a set of attributes. Each attribute has a value. A condition contains an attribute and its value. A condition is true, or satisfied, when the attribute has the value, and false when the attribute doesn't have the value.
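The terminology above (attributes, values, conditions) amounts to a small data model. The following is a minimal sketch in Python with illustrative names of our own choosing; Diligent itself is implemented in TCL/TK, so this is not the system's actual code:

```python
# Minimal sketch of the environment/condition terminology described above.
# Names here are illustrative, not Diligent's actual implementation.

# The environment: a set of attributes, each with a value.
environment = {"Valve1": "open", "Alarm-light1": "on"}

def satisfied(environment, attribute, value):
    """A condition (attribute, value) is true when the attribute has that value."""
    return environment.get(attribute) == value

print(satisfied(environment, "Valve1", "open"))  # True
print(satisfied(environment, "Valve1", "shut"))  # False
```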
A procedure transforms an initial environment state to a desired goal state. The state is transformed through a sequence of steps, where each step represents some action that is performed in the environment. A procedure is finished when all its goal conditions are true.

Figure B.1: Procedure with Steps in Specification Order

Figure B.1 shows a procedure called "example". The "begin-example" step represents the initial state, and the "end-example" step represents the goal state. The steps "press-button-1" and "turn-handle-2" represent the actions performed during the procedure. The steps are ordered by the sequence in which they were specified. However, a procedure's steps don't have to be performed in the order that they were specified. Instead, the steps in a procedure can be performed in any order that satisfies the preconditions of each step.

Figure B.2: Procedure with Steps Ordered by Dependencies

Figure B.2 shows procedure "example" where the steps are ordered by dependencies of later steps on earlier steps.

In order to keep track of the preconditions and state changes of each step, every step is associated with an operator. An operator models an action performed in the environment. An operator can have multiple effects. Each effect has a set of preconditions and a set of state changes. If an effect's preconditions are satisfied, the effect's state changes will be observed. Because operators model actions, each operator can be associated with multiple steps. Because an operator can have multiple effects, each step is associated with a subset of an operator's effects.
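The operator model just described (an operator with multiple effects, each effect pairing preconditions with state changes) can be sketched as follows. This is a hedged illustration in Python with names of our own choosing, not Diligent's code:

```python
# Sketch of an operator with multiple effects, as described above.
# An effect's state changes are observed only when its preconditions hold.

def apply_operator(state, effects):
    """Apply every effect whose preconditions are satisfied in `state`."""
    new_state = dict(state)
    for preconditions, changes in effects:
        if all(state.get(attr) == val for attr, val in preconditions.items()):
            new_state.update(changes)
    return new_state

# A toggle-valve operator in the spirit of the figures below: two effects.
toggle_valve = [
    ({"Valve1": "open"}, {"Valve1": "shut"}),  # Effect 1: open -> shut
    ({"Valve1": "shut"}, {"Valve1": "open"}),  # Effect 2: shut -> open
]

print(apply_operator({"Valve1": "open"}, toggle_valve))  # {'Valve1': 'shut'}
print(apply_operator({"Valve1": "shut"}, toggle_valve))  # {'Valve1': 'open'}
```

Because the two effects' preconditions are mutually exclusive, exactly one effect fires on each application, which is what lets a single operator be associated with multiple steps.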
Operator: toggle-valve1
Action: toggling valve Valve1

Effect 1:
  Preconditions: (Valve1 = open)
  State changes: (Valve1 = shut)

Effect 2:
  Preconditions: (Valve1 = shut)
  State changes: (Valve1 = open)

Figure B.3: Operator with Two Effects

Figure B.3 shows an operator that models the toggling of valve Valve1. The operator has two effects: if the valve is open, it becomes shut (Effect 1), and if the valve is shut, it becomes open (Effect 2).

Procedure: Example2
The specified order of steps: toggle-valve-1 toggle-valve-2

Step: toggle-valve-1
  Operator: toggle-valve
  Operator Effects: Effect 1
  Step preconditions: (Valve1 = open)
  Step state changes: (Valve1 = shut)

Step: toggle-valve-2
  Operator: toggle-valve
  Operator Effects: Effect 2
  Step prerequisites: (Alarm-light1 = on)
  Step preconditions: (Valve1 = shut) (Alarm-light1 = on)
  Step state changes: (Valve1 = open)

Figure B.4: Example Steps

Figure B.4 shows the steps in procedure Example2. Both steps use the operator in Figure B.3. The first step (toggle-valve-1) shuts Valve1, and the second step (toggle-valve-2) opens Valve1. All preconditions of the first step (toggle-valve-1) come from the operator effect (Effect 1). Although the second step (toggle-valve-2) also gets preconditions from an operator effect (Effect 2), the second step has an additional precondition (Alarm-light1 = on). The additional precondition is a step prerequisite. A step prerequisite is a precondition that belongs only to the step and not to the operator effects associated with the step. A step prerequisite allows you to specify additional preconditions that are not required by the operator effects associated with the step.

Unfortunately, to actually perform a procedure, we need to know more than the preconditions of each step; we need to know how the steps depend on each other.
This involves knowing which steps establish preconditions of other steps. It also involves knowing if the state changes of one step will interfere with the preconditions of other steps. Because the dependencies between steps contain the preconditions of each step, only the dependencies will be given to the Steve tutor.

Figure B.5 shows the dependencies between the steps in procedure Example2 (Figure B.4). In this document, these dependencies will be called ordering relationships because they order a procedure's steps. Diligent uses two types of ordering relationships: causal links and ordering constraints.

A causal link is an attribute value caused by one step that is a precondition for a later step. Each step precondition can have a causal link. In the example, the first two causal links are actually dependencies on the procedure's initial state (begin-Example2).

Procedure's initial state (begin-Example2): (Valve1 = open) (Alarm-light1 = on)
Procedure goals (end-Example2): (Valve1 = open)

Causal links:
  begin-Example2 (Valve1 = open) toggle-valve-1
  begin-Example2 (Alarm-light1 = on) toggle-valve-2
  toggle-valve-1 (Valve1 = shut) toggle-valve-2
  toggle-valve-2 (Valve1 = open) end-Example2

Ordering constraints:
  toggle-valve-1 before toggle-valve-2

Figure B.5: Procedure Example2's Dependencies

The last causal link is between the last step (toggle-valve-2) and the procedure's goal (end-Example2). A causal link with the procedure's goal indicates that the step establishes one of the procedure's goal conditions. Causal links are used to represent the preconditions of steps and to provide explanations of how earlier steps affect later steps.

An ordering constraint indicates the relative order for performing a pair of steps. In the example, the first step (toggle-valve-1) should be performed before the second step (toggle-valve-2).
There are no ordering constraints involving the procedure's initial state (begin-Example2) and goals (end-Example2) because all steps are performed after the initial state and before the end of the procedure. Ordering constraints are used to determine which step to perform when all the preconditions of multiple steps are satisfied.

You may have noticed that the procedure's goal condition is satisfied in the initial state. This means that none of the procedure's steps would normally be performed. However, if the procedure was started when Valve1 was shut, then the second step would be performed.

Because understanding this chapter will prevent confusion, stop reading the tutorial. Please fill out the worksheet on the next page in your directions. When you are satisfied with your answers, continue reading the tutorial and verify that your answers are correct.

B.3 The Procedure Representation Worksheet

After subjects read the tutorial chapter on the procedural representation, they filled out this questionnaire. When they were done, they checked their answers against those in Section B.4.

In the following, circle the correct answers. (More than one answer may be correct.) If you discover that you've made a mistake, just change your answer.

1. Do the steps in a procedure change the state of the environment? True or False

2. A procedure is finished when
(a) All its steps are executed
(b) All its goal conditions are true (or satisfied)

3. Steps have to be performed in the order that they are specified. True or False

4. An operator models an action performed in the environment. True or False

5. Each step
(a) Is associated with only one operator
(b) Can be associated with multiple operators
(c) Is associated with only one operator effect
(d) Can be associated with multiple operator effects

6.
Each operator
(a) Is associated with an action performed in the environment
(b) Is associated with multiple actions performed in the environment
(c) Can have multiple effects
(d) Is associated with a single step
(e) Can be associated with multiple steps

7. Each operator effect
(a) Has preconditions
(b) Has state changes
(c) Has causal links
(d) Has ordering constraints
(e) Produces the given state changes if the preconditions are satisfied

8. Step preconditions
(a) Include all preconditions from the associated operator effects
(b) Do not include the preconditions from the associated operator effects
(c) Can include step-specific preconditions called step prerequisites

9. Dependencies between steps
(a) Are called ordering relationships
(b) Include step preconditions
(c) Include operator preconditions
(d) Include causal links
(e) Include ordering constraints

10. Causal links
(a) Indicate that an earlier step establishes a precondition of a later step
(b) Indicate the relative order for performing a pair of steps
(c) Are used to provide explanations about the dependencies between steps
(d) Can involve more than two steps

11. Ordering constraints
(a) Indicate that an earlier step establishes a precondition of a later step
(b) Indicate the relative order for performing a pair of steps
(c) Are used to provide explanations about the dependencies between steps
(d) Can involve more than two steps

12. You may want ordering constraints between a pair of steps when
(a) There is a causal link between the steps
(b) The state changes of the later step interfere with the preconditions of the earlier step
(c) The later step is specified immediately after the first step

13. What is given to the Steve tutor?
(a) A set of steps
(b) A set of ordering relationships (i.e.
causal links and ordering constraints)
(c) A set of causal links that establish the procedure's goal conditions
(d) A set of step preconditions
(e) A set of operators

B.4 Worksheet Answers

These answers were contained in the tutorial.

1. True. 2. (b). 3. False. 4. True. 5. (a), (d). 6. (a), (c), (e). 7. (a), (b), (e). 8. (a), (c). 9. (a), (d), (e). 10. (a), (c). 11. (b). 12. (a), (b). 13. (a), (b), (c).

Do you have any questions about these answers?

B.5 The Post-Test

The last thing that subjects did was answer the following questions.

How did you like it

In the following, please provide answers from 1 to 7. (1 means not at all, 4 somewhat, and 7 means a great deal.) If you cannot answer a question, write N/A.

The following questions were only given to subjects who only used an editor.

Authoring
a) Did you like the system?
b) Was it easy to use?
c) Was it easy to specify a procedure's steps?
d) Was it easy to identify a step's preconditions?
e) Was it easy to identify a step's state changes?
f) Was it easy to identify how operators influenced causal links and ordering constraints?
g) Any other comments about authoring?

The following questions were only given to subjects who demonstrated.

Demonstrating
a) Did you like the system?
b) Was it easy to use?
c) Was it easy to demonstrate a procedure?
d) Did you find additional demonstrations useful?
e) Was it easy to specify a procedure's steps?
f) Was it easy to identify a step's preconditions?
g) Was it easy to identify a step's state changes?
h) Was it easy to identify how operators influenced causal links and ordering constraints?
i) Any comments about demonstrations?

The following questions were only given to subjects who experimented.

Experiments
a) Did you like experimenting?
b) Did experiments take too long?
c) Did experiments save you work?
d) Did experiments find errors that you would have missed?
e) Any comments about experiments?

Were there any other aspects of the system that were useful or worth mentioning?

Thank you!

B.6 The Directions Given Subjects

This packet contains your directions for authoring procedures using Diligent. Please go to the next page and answer the questions.

At this point, the subjects filled out the background questionnaire.

Please indicate that you are ready to continue.

First Day Directions

You will be given the Diligent tutorial. Please open the tutorial and read through the first chapter and stop when you've finished it. Indicate that you are done and ask to continue.

Now work through the rest of the tutorial. Since some menus are visited several times, please follow the directions rather than explore the system.

At this point, the subjects filled out the Procedure Representation Worksheet.

Continue with the tutorial when you have finished the above worksheet. Remember to follow the directions instead of exploring the system. When you have finished the tutorial, stop. Indicate that you are done and ask to continue.

Please look over the tutorial's synopsis. Do you have any questions?

End of the first day

Second Day Directions

Please review the tutorial's synopsis (chapter 9) and the worksheet on procedure representation. You should focus on those two sections but you can look at other parts of the tutorial. Do not spend more than ten minutes. Stop when you are finished. Indicate that you are done and ask to continue.

Now go over the second day tutorial. Stop when you are finished.
Indicate that you are done and ask to continue.

At the end of the second day tutorial, subjects solve the practice problem (Section B.11) and look at its solution (Section B.12).

Do you have any questions?

Authoring

Now you will author two procedures. For each procedure, you should
1. Enter the procedure and modify it until you're satisfied.
2. Test it.
3. Indicate when you are finished with it.
You cannot spend more than 30 minutes on a procedure.

Instructions

These are the directions given to the subjects who both demonstrated and experimented. If a subject did not demonstrate or experiment, then directions that mention demonstrations or experiments were removed.

Remember to consult the tutorial's synopsis chapter if you have questions.

Please do not change status values between "provisional" and "required." Both values indicate that the object will be used.

When authoring, remember that we are primarily concerned with attributes that change value during the procedure. A procedure should only contain necessary ordering relationships.

When demonstrating a procedure, make sure that a step has been processed before demonstrating the next step. This can be done by making sure that the text "wait2" and "wait3" is scrolling in the Soar window. Demonstrating the next step too quickly can cause serious problems. A good rule of thumb is at least 5 to 10 seconds between steps.

Also remember to let Diligent "experiment" with the procedure. After experimenting, you need to derive the ordering relationships so that they reflect what was learned during the experiments.

Because your activities are being monitored, focus on authoring procedures rather than exploring the system out of curiosity.

Assume that each procedure starts in the state shown in the Vista window. The procedure's description assumes that you start in that state.
Please give each procedure a distinct name.

You will now be given
• A description of the procedures to be authored.
• Pictures of the device with labels identifying the names of various objects.
• A description of all attributes and their legal values.

Stop and indicate that you are ready to continue. The person helping you will prepare the system for the first procedure.

Now author the "High Condensate Level Shutdown" procedure. Stop when you have finished with the procedure. Indicate that you are done and ask to continue. The person helping you will prepare the system for the second procedure.

Now author the "Overload Relay Tripped" procedure. Stop when you have finished with the procedure. Indicate that you are done and ask to continue. The person helping you will save the second procedure.

Go to the next page and fill out the questionnaire.

At this point, the subjects filled out the post-test.

B.7 The List of Attribute Values

This list of attribute values was given to all subjects, but was only required by subjects who only used an editor. The list provides some indication of the size and complexity of the domain.

# The following list provides descriptions and values for the HPAC
# attributes.
Attribute Name             Description                         Values
cdm_chnl1_lt_state         "first stage alarm light"           "on" "off"
cdm_chnl2_lt_state         "second stage alarm light"          "on" "off"
cdm_chnl3_lt_state         "third stage alarm light"           "on" "off"
cdm_chnl4_lt_state         "fourth stage alarm light"          "on" "off"
cdm_power_state            "condensate drain monitor power"    "on" "off"
cdm_status                 "condensate drain monitor status"   "system reset" "function test" "halted"
cp_oil_level               "oil level"                         "normal" "low" "high"
ctrl_mon_sel_state         "compressor mode"                   "monitored" "unmonitored"
ctrl_motor_status          "motor"                             "on" "off"
ctrl_power_status          "power"                             "on" "off"
ctrl_relayreset_state      "overload relay"                    "ok" "tripped"
dipstick_position          "dipstick position"                 "in" "halfway" "out"
gb_air1_state              "first air intake valve"            "open" "shut"
gb_air2_state              "second air intake valve"           "open" "shut"
gb_covstg1_state           "first cutout valve"                "open" "shut"
gb_covstg2_state           "second cutout valve"               "open" "shut"
gb_covstg3_state           "third cutout valve"                "open" "shut"
gb_covstg4_state           "fourth cutout valve"               "open" "shut"
gb_covstg5_state           "fifth cutout valve"                "open" "shut"
sdm_handle_location        "location of the handle"            "separator drain 1st stage valve" "separator drain 2nd stage valve" "separator drain 3rd stage valve" "separator drain 4th stage valve" "separator drain 5th stage valve"
sdm_sep_drnvlv1_pressure   "first stage pressure"              "high" "normal"
sdm_sep_drnvlv1_state      "first stage valve"                 "open" "shut"
sdm_sep_drnvlv2_pressure   "second stage pressure"             "high" "normal"
sdm_sep_drnvlv2_state      "second stage valve"                "open" "shut"
sdm_sep_drnvlv3_pressure   "third stage pressure"              "high" "normal"
sdm_sep_drnvlv3_state      "third stage valve"                 "open" "shut"
sdm_sep_drnvlv4_pressure   "fourth stage pressure"             "high" "normal"
sdm_sep_drnvlv4_state      "fourth stage valve"                "open" "shut"
sdm_sep_drnvlv5_state      "fifth stage valve"                 "open" "shut"
student_speaking           "student speaking"                  "true" "false"
surge_tank_level           "surge tank level"                  "empty" "normal" "full"
tm_ltcrkcsoil_state        "indicator light"                   "on" "off"
tm_ltdis1_state            "indicator light"                   "on" "off"
tm_ltdis2_state            "indicator light"                   "on" "off"
tm_ltdis3_state            "indicator light"                   "on" "off"
tm_ltdis4_state            "indicator light"                   "on" "off"
tm_ltdis5_state            "indicator light"                   "on" "off"
tm_ltfindis_state          "indicator light"                   "on" "off"
tm_ltjkvtrout_state        "indicator light"                   "on" "off"
tm_ltsuc1_state            "indicator light"                   "on" "off"
tm_ltsuc2_state            "indicator light"                   "on" "off"
tm_power_state             "temperature monitor power"         "on" "off"
tm_status                  "temperature monitor status"        "reset" "testing" "test 100" "test 350" "test trip temperature" NONE

B.8 Labeled Pictures of the HPAC

The following pictures were given to test subjects.

Front of the HPAC (labeled: gauges and valves, separator drain manifold, condensate drain monitor)

Gauges and Valves (labeled: air intake valve 1, air intake valve 2)

Separator Drain Manifold (labeled: 4th stage valve)
Condensate Drain Monitor (labeled: 1st stage alarm light, 2nd stage alarm light, 3rd stage alarm light, 4th stage alarm light, system reset button, function test button)

Control Door (labeled: power light (gray off & white on), motor light (dark green off & bright green on), power on/off button, overload relay reset switch, motor start/stop button)

LaTeX is not formatting the picture properly.

B.9 Procedure Descriptions

This section contains the procedure descriptions that were given to test subjects. The first procedure authored is High Condensate Level Shutdown, and the second procedure authored is Overload Relay Tripped.

B.9.1 High Condensate Level Shutdown

Sometimes high levels of condensation can build up inside the compressor. To avoid damaging the machine, the compressor's condensate drain monitor turns off the motor. At this point, some alarm lights on the drain monitor's panel turn red. The alarm lights will turn off only after the pressure is relieved. For each alarm light that is red, the student can relieve the pressure by opening the separator drain manifold valve that corresponds to that alarm light. Once the pressure is relieved, valves should be shut for normal operations. Once the motor has been started with the control door panel's motor button, the pressure will be relieved and the alarm lights will turn off. Before starting the motor, the student should reset the drain monitor by pressing the drain monitor panel's system reset button.

The procedure's initial state can be seen in the Vista window. Initially, high levels of condensation have caused the motor to turn off and two alarm lights to turn red. When performing the procedure, the student will need to both open valves and turn on the motor.
When you are finished, the alarm lights should be off, the valves should be shut, and the motor should be running.

Reminder: you will only be asked to name an operator the first time the operator's action is used. In other words, if the action is used again, you will not be asked for an operator name.

B.9.2 Overload Relay Tripped

When the compressor gets overloaded, a relay will trip and turn off the motor. At this point, the compressor's electronics may be in an anomalous state. The student can correct the state by turning off the power with the power button on the control door panel. The button is a toggle that turns the power on or off. One reason for overload is too much air pressure. To limit the air pressure, the student should shut the two air intake valves. Once the relay is tripped, the compressor will not work until the relay switch on the control door panel is toggled. In order to make sure that the power has been turned off, the relay will not reset unless the power is off. Once the relay has been reset, the student should turn the power on and then start the motor with the control door panel's motor button.

When you are finished, the air intake valves should be shut and the motor should be running.

B.10 Desired Procedures

This section contains the desired procedures against which the subjects are evaluated.

B.10.1 High Condensate Level Shutdown

The first procedure restarts the motor after high condensate pressure has shut it down. The desired procedure has the following steps:

1. Turn the handle and open the second stage valve. Do this by selecting the handle.
2. Move the handle to the first stage valve. Do this by selecting the first stage valve.
3. Turn the handle and open the first stage valve.
4.
Press the reset button.
5. Press the motor button and turn the motor on.
6. Turn the handle so that the first stage valve is shut.
7. Move the handle to the second stage valve.
8. Turn the handle so that the second stage valve is shut.

The plan for the procedure is as follows:

Steps: begin-clsd, turn-1, move-1st-2, turn-3, reset-4, motor-5, turn-6, move-2nd-7, turn-8, end-clsd

Ordering Constraints:
1. turn-1 before move-1st-2
2. turn-1 before motor-5
3. turn-1 before turn-8
4. move-1st-2 before turn-3
5. move-1st-2 before turn-6
6. turn-3 before motor-5
7. turn-3 before turn-6
8. turn-3 before move-2nd-7
9. reset-4 before motor-5
10. motor-5 before turn-6
11. motor-5 before turn-8
12. turn-6 before move-2nd-7
13. move-2nd-7 before turn-8

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Causal Links:
1. begin-clsd establishes (cdm_chnl2_lt_state on) for turn-1
2. begin-clsd establishes (sdm_handle_location "separator drain 2nd stage valve") for turn-1
3. begin-clsd establishes (sdm_sep_drnvlv2_state shut) for turn-1
4. begin-clsd establishes (cdm_chnl1_lt_state on) for turn-3
5. begin-clsd establishes (sdm_sep_drnvlv1_state shut) for turn-3
6. begin-clsd establishes (cdm_status halted) for reset-4
7. begin-clsd establishes (cdm_chnl1_lt_state on) for motor-5
8. begin-clsd establishes (cdm_chnl2_lt_state on) for motor-5
9. begin-clsd establishes (ctrl_motor_status off) for motor-5
10. begin-clsd establishes (sdm_sep_drnvlv1_pressure high) for motor-5
11. begin-clsd establishes (sdm_sep_drnvlv2_pressure high) for motor-5
12. turn-1 establishes (sdm_sep_drnvlv2_state open) for motor-5
13. turn-1 establishes (sdm_sep_drnvlv2_state open) for turn-8
14. move-1st-2 establishes (sdm_handle_location "separator drain 1st stage valve") for turn-3
15. move-1st-2 establishes (sdm_handle_location "separator drain 1st stage valve") for turn-6
16. turn-3 establishes (sdm_sep_drnvlv1_state open) for motor-5
17. turn-3 establishes (sdm_sep_drnvlv1_state open) for turn-6
18. reset-4 establishes (cdm_status "system reset") for motor-5
19. reset-4 establishes (cdm_status "system reset") for end-clsd
20. motor-5 establishes (cdm_chnl1_lt_state off) for turn-6
21. motor-5 establishes (cdm_chnl2_lt_state off) for turn-8
22. motor-5 establishes (cdm_chnl1_lt_state off) for end-clsd
23. motor-5 establishes (cdm_chnl2_lt_state off) for end-clsd
24. motor-5 establishes (ctrl_motor_status on) for end-clsd
25. motor-5 establishes (sdm_sep_drnvlv1_pressure normal) for end-clsd
26. motor-5 establishes (sdm_sep_drnvlv2_pressure normal) for end-clsd
27. turn-6 establishes (sdm_sep_drnvlv1_state shut) for end-clsd
28. move-2nd-7 establishes (sdm_handle_location "separator drain 2nd stage valve") for turn-8
29. move-2nd-7 establishes (sdm_handle_location "separator drain 2nd stage valve") for end-clsd
30. turn-8 establishes (sdm_sep_drnvlv2_state shut) for end-clsd

B.10.2 Overload Relay Tripped

The second procedure restarts the motor after high air pressure has caused the relay to trip. The desired procedure has the following steps:
1. Shut the first air intake valve.
2. Shut the second air intake valve.
3. Turn off the power.
4. Toggle the relay reset switch.
5. Turn on the power.
6. Turn on the motor.

The plan for the procedure is as follows:

Steps: begin-rlytp, air1-1, air2-2, power-3, reset-4, power-5, motor-6, end-rlytp

Ordering Constraints:
1. air1-1 before motor-6
2. air2-2 before motor-6
3. power-3 before power-5
4. power-3 before reset-4
5. reset-4 before power-5
6. reset-4 before motor-6
7. power-5 before motor-6

Causal Links:
1. begin-rlytp establishes (gb_air1_state open) for air1-1
2. begin-rlytp establishes (gb_air2_state open) for air2-2
3. begin-rlytp establishes (ctrl_power_status on) for power-3
4. begin-rlytp establishes (ctrl_relayreset_status tripped) for reset-4
5. begin-rlytp establishes (ctrl_motor_status off) for motor-6
6. air1-1 establishes (gb_air1_state shut) for motor-6
7. air1-1 establishes (gb_air1_state shut) for end-rlytp
8.
air2-2 establishes (gb_air2_state shut) for motor-6
9. air2-2 establishes (gb_air2_state shut) for end-rlytp
10. power-3 establishes (ctrl_power_status off) for reset-4
11. power-3 establishes (ctrl_power_status off) for power-5
12. reset-4 establishes (ctrl_relayreset_status ok) for motor-6
13. reset-4 establishes (ctrl_relayreset_status ok) for end-rlytp
14. power-5 establishes (ctrl_power_status on) for motor-6
15. power-5 establishes (ctrl_power_status on) for end-rlytp
16. motor-6 establishes (ctrl_motor_status on) for end-rlytp

B.11 Practice Procedure

At the end of the second day's training, subjects solved the following problem.

Up to this point you have been following very specific instructions. In this section you are going to author with only general directions. To author the procedure, you need to define and test it. Like the above procedure, the practice procedure will toggle two cutout valves. However, instead of toggling the first and second cutout valves, you will now toggle the third and fourth cutout valves, which are to the right of the second cutout valve. The attributes "gb_covstg3_state" and "gb_covstg4_state" should initially be "open" and should be "shut" when the procedure is finished. You have 10 minutes to finish this task.

B.12 Practice Procedure Solution

Looking at the practice problem's solution was the last thing that subjects did during training. Only the subjects who used demonstrations or experiments saw the parts of the solution that mention demonstrations or experiments. The following is a solution for the practice procedure. You should compare your procedure against the solution. Ask questions if you do not understand why this is a reasonable solution.
Let the procedure be called "practice."

Steps: (execution order) (The order that the steps are toggled doesn't matter.)
begin-practice, toggle-3rd-3, toggle-4th-4, end-practice

Causal links:
begin-practice establishes "gb_covstg3_state = open" for toggle-3rd-3
begin-practice establishes "gb_covstg4_state = open" for toggle-4th-4
toggle-3rd-3 establishes "gb_covstg3_state = shut" for end-practice
toggle-4th-4 establishes "gb_covstg4_state = shut" for end-practice

Ordering Constraints: None. (All are "ignored" because they involve the steps begin-practice and end-practice.)

Operators:
toggle-3rd
  Preconditions: (gb_covstg3_state = open)
  State changes: (gb_covstg3_state = shut)
toggle-4th
  Preconditions: (gb_covstg4_state = open)
  State changes: (gb_covstg4_state = shut)

Step prerequisite preconditions: None are necessary.

Number of demonstrations: Only one is necessary.

Experiments resulted in: The removal of one incorrect precondition, one incorrect causal link, and one incorrect ordering constraint.

Appendix C

Evaluation Data

The following contains data for the three experimental conditions. Experimental condition EC1 allows demonstrations and experiments. Condition EC2 allows demonstrations but not experiments. Condition EC3 uses only an editor. In order to protect the privacy of subjects, the masculine pronoun "he" will always be used when referring to a subject. The use of "he" does not indicate whether the subject was male or female.

C.1 Background Questionnaire

This data has been withheld to protect the privacy of the subjects. The data is summarized in section 7.5.1.
C.2 Impressions of Diligent

This section contains the data describing the subjects' impressions of Diligent. The last activity that subjects performed was answering these questions, which are located at the end of the subjects' directions (appendix B). Because some of the questions are inappropriate for some of the experimental conditions, the subjects in each experimental condition answered a different subset of questions. The answers listed in the following tables represent the following questions. An answer of 1 means not at all, 4 means somewhat, and 7 means a great deal.

I1 = General questions about authoring
  I1a = Like the system
  I1b = Easy to use
  I1c = Easy to specify a step
  I1d = Easy to identify a step's preconditions
  I1e = Easy to identify a step's state changes
  I1f = Easy to identify how operators influenced a step's preconditions and state changes
I2 = Questions about demonstrating
  I2a = Easy to demonstrate
  I2b = Were additional demonstrations useful
I3 = Questions about experiments
  I3a = Did you like experimenting
  I3b = Were experiments quick enough
  I3c = Did experiments save work
  I3d = Did experiments find errors that would have been missed

Item I3b is different than what the questionnaire asked. The questionnaire asked, "Did experiments take too long?" So that the data is easier to interpret, the question was reformulated so that a lower answer is less positive. When transforming the answers from the questionnaire to I3b, the following mappings were used: 1 to 7, 2 to 6, 3 to 5, and 4 to 4.
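The I3b transformation described above is standard reverse-coding on a 1-to-7 scale: each answer x maps to 8 - x. A minimal sketch of the mapping (the function name is illustrative, not part of Diligent's tooling):

```python
def reverse_code(answer, scale_max=7):
    """Reverse-code a 1..scale_max questionnaire answer so that a
    higher value is always the more positive response."""
    return (scale_max + 1) - answer

# The mappings listed above: 1 -> 7, 2 -> 6, 3 -> 5, 4 -> 4
print([reverse_code(a) for a in (1, 2, 3, 4)])  # [7, 6, 5, 4]
```

Answers of 5, 6, and 7 map to 3, 2, and 1 under the same rule.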
C.2.1 Experimental Condition EC1

Subject   I1a   I1b   I1c   I1d   I1e   I1f
3          6     2     3     2     4     4
6          5     3     4     5     2     4
12         4     4     5     4     7     6
14         6     6     7     6     6     6

Table C.1: EC1 Impressions about Authoring

Subject   I2a   I2b   I3a   I3b   I3c   I3d
3          2     5     4     6     7     1
6          4    N/A    5     2    N/A   N/A
12         1     7     1     7     4     4
14         7     1     7     7     7     4

Table C.2: EC1 Impressions about Demonstrations and Experiments

Subject 3 didn't feel that error recovery was covered well enough in training. The subject also felt that the user interface was confusing.
Subject 6 had difficulty demonstrating while remembering the initial state. The subject also felt that the ambiguous use of terms made things more difficult.
Subject 12 answered the questions about experiments (i.e., I3a-I3d) with yes or no rather than a number. In the table, 1 is used for no, and 4 is used for yes. Because the data for I3b was transformed, the subject's "no" became a 7.
Subject 14 answered question I3d with yes rather than a number. In the table, 4 is used for yes. Subject 14 didn't understand how helpful experiments can be during training. Instead, the subject learned how helpful experiments can be during the evaluation's first procedure.

C.2.2 Experimental Condition EC2

Subject   I1a   I1b   I1c   I1d   I1e   I1f
2          4     4     7     7     5     5
5          5     2     4     5     5     3
8          3     4     1     3     4     3
9          4     5     7     3     3     3
10         3     3     6     6     6     5
11         5     2     4     5     5     3

Table C.3: EC2 Impressions about Authoring

Subject   I2a   I2b   I3a   I3b   I3c   I3d
2          4     4     -     -     -     -
5          6     6     -     -     -     -
8          1     4     -     -     -     -
9          4     5     -     -     -     -
10         7     3     -     -     -     -
11         3     5     4     2    N/A    4

Table C.4: EC2 Impressions about Demonstrations and Experiments

Subject 2 complained that the environment was slow to react.
Subject 5 had a few complaints.
The descriptions of the procedures authored during the experiment were "somewhat unclear." The environment was unresponsive; the subject felt that manipulating an object required the mouse to be clicked in a small region. The subject also had to be told whether lights in the environment were turned on.
Subject 8 had difficulty correcting mistakes. The subject felt that there were a lot of distractions. The subject also felt that the system was slow and unresponsive. (This comment appears to be directed at the environment.) However, the subject liked the GUI and wrote that the GUI "allowed a feel of ease of use that didn't always come across in [training]."
Subject 9 would have really preferred to use an editor (e.g., EC3) instead of demonstrating procedures. The subject wanted to specify all the steps before dealing with preconditions and state changes. The subject wrote, "The system seems to have several features to automatically do several things, but they are not very useful. I would have liked to specify my initial state and final state option and then go on to define my steps, so I need not be concerned with preconditions."
Subject 10 couldn't figure out how to make a step optional so that it would be skipped if it wasn't needed. This is a misunderstanding of the procedural representation. The subject had difficulty with the experiment's procedure descriptions; it was "not easy" to determine the steps or the minimal dependencies between the steps. The subject also had problems removing extraneous dependencies because the removed dependencies didn't immediately disappear. (The comment on removed dependencies may be related to the fact that the
window containing a procedure's graphical representation does not update its graph. To update the graph, a subject needs to close and re-open the window.)
Subject 11 was in EC1, but never used experiments. For this reason, the subject was moved to EC2. The subject said that he didn't know that experiments would remove excess causal links.

C.2.3 Experimental Condition EC3

Subject   I1a   I1b   I1c   I1d   I1e   I1f
1          4     2     2     2     2     2
4          5     3     5     6     6     5
7          3     3     5     6     6     3
13         2     2     4     4     4     1
16         6     4     6     6     6     3

Table C.5: EC3 Impressions about Authoring

Subject 1 was confused in a number of areas. The subject didn't understand operators and steps; he didn't understand why both were needed or whether the relationship was one-to-one or many-to-one. (This is the only subject that did not fill out the procedural representation worksheet during the first day's training.) The subject was also confused about how to insert a step in front of another step.
Subject 4 would have liked to use templates for steps with similar preconditions and state changes. The subject wrote, "The testing and explanation components were very good."
Subject 7 had quite a few problems. The subject had difficulty familiarizing himself with the domain's attribute names. The similarity of these names made things more difficult. The subject wrote, "Having predetermined names for the actions actually disoriented me when solving the problems." The subject also had problems with the environment. It was difficult to zoom in or out. It was also difficult to determine where to click the mouse when manipulating an object.
Subject 13 had a number of user interface problems. The subject wanted a button to press for help. The subject was frustrated because he couldn't figure out how to delete unwanted conditional effects (you can't). The subject felt that step preconditions could only be added when the procedure is graphically displayed.
(This probably reflects the fact that the procedure's graph is not updated until the graph's window is closed and re-opened.) The subject was also irritated that windows didn't open more quickly.
Subject 16 had problems with ambiguously ordered steps. In the second procedure, the subject felt that he had to order the steps "illogically." However, the subject felt that the system was "easy to use once understood." The subject also felt that the second procedure was easier because the first procedure involved a "steep learning curve" on "parts of the environment and the linking of more complex sets of steps."

C.3 Authoring

This section contains the data describing how the subjects authored during the experiment. The answers listed in the following tables represent the following data. Except for some of the time values, each of the two procedures has the following data. Sometimes a subject would abandon a flawed procedure and create a new procedure. When this happens, the edits for the abandoned procedure are still counted.

Edits:
ed1 = Steps added in normal demonstration
ed2 = Steps added in clarification demonstration (EC1 and EC2 only)
ed3 = Actions in prefix before start of demonstration (EC1 and EC2 only)
ed4 = Deleted steps
ed5 = Edits to causal links
ed6 = Edits to ordering constraints
ed7 = Edits to goal conditions
ed8 = Edits to filter attributes out of causal links
ed9 = Edits to filter attributes out of ordering constraints
ed10 = Edits to conditional effect preconditions
ed11 = Edits to conditional effect state changes (EC3 only)
ed12 = Edits to control preconditions (associated with steps rather than conditional effects)
ed13 = Edits to associate conditional effects with steps (EC3 only)
ed14 = Total logical edits. This is the sum of ed1 through ed12. ed13 is ignored out of concerns for fairness.
Experiments: (EC1 only)
exp1 = Prefix actions performed preparing experiments
exp2 = Steps performed during experiments

Errors:
er1 = Missing ordering constraints
er2 = Unnecessary or incorrect ordering constraints
er3 = Missing causal links
er4 = Unnecessary or incorrect causal links
er5 = Missing steps
er6 = Unnecessary or incorrect steps
er7 = Total errors of omission (i.e., missing objects: er1 + er3 + er5)
er8 = Total errors of commission (i.e., unnecessary or incorrect objects: er2 + er4 + er6)
er9 = Total errors (er7 + er8)

a1 = Number of steps in the procedure
a2 = Could the final procedure be demonstrated
a3 = Total effort (total edits (ed14) + total errors (er9))

Time: (in minutes)
t1 = First day training time
t2 = Second day training time
t3 = Total training time
t4 = 1st procedure time before testing
t5 = 1st procedure total time
t6 = 2nd procedure time before testing
t7 = 2nd procedure total time

In the following tables, the authoring data represents two times: when testing starts and when the procedure is finished. If only one value is given, then both are the same. When a value is of the form A/B, the two values are different. The value at the start of testing is A, and the value at the end is B.

The times are derived from both log files and notes taken while the subject was training and authoring. Some of the times may be off by ± two minutes. The error is this large because some of the times had to be explicitly logged and because some of the times came from the notes. When a time was logged, sometimes the procedure had to be put into the proper state, which involved closing windows and deriving the procedure's goals and causal links from the current database. Times that were explicitly logged include starting training, finishing training, starting a procedure, and ending a procedure.
However, the total time allowed for authoring a procedure was measured with an alarm clock. The start of the training time for the first session is when the subject sits down. This means that the first session's training time includes the 5 to 10 minutes required to fill out the background questionnaire.

C.3.1 Experimental Condition EC1

            Subject
Topic      3      6     12    14
ed1       11      5      8     8
ed2        0      0      0     0
ed3        4      0      0     0
ed4        0      2      0     0
ed5        0      0      1     0
ed6        0      0      0     0
ed7        0      0      1     0
ed8        0      0      0     0
ed9        0      0      0     0
ed10       0      0      0     0
ed11       -      -      -     -
ed12       0      5      1     0
ed13       -      -      -     -
ed14      15     12     11     8
exp1       0      0      0     0
exp2      25      0     49    49
er1       10     12      0     0
er2        3      2      3     3
er3       24     28      5     4
er4        6      7      4     3
er5        3      6      0     0
er6        2      1      0     0
er7       37     46      5     4
er8       11     10      7     6
er9       48     56     12    10
a1         7      3      8     8
a2        no     no    yes   yes
a3        63     68     23    18

Table C.6: EC1 Procedure 1 Authoring Information

            Subject
Topic      3       6      12    14
ed1        5       4       5     6
ed2        0       0       0     0
ed3        0       0       0     0
ed4        0       0       0     0
ed5        0       0       0     0
ed6        0       1       0     0
ed7        1       1       0     0
ed8        0       0       0     0
ed9        0       0       0     0
ed10       0       0       0     0
ed11       -       -       -     -
ed12       0      3/4      4     5
ed13       -       -       -     -
ed14       6      9/10     9    11
exp1       0       0       0     0
exp2      16       9      16    25
er1        6       5       4     0
er2        0       0       4     4
er3        8      13       6     0
er4        0      1/2      2     4
er5        1       2       1     0
er6        0       0       0     0
er7       15      20      11     0
er8        0      1/2      6     8
er9       15     21/22    17     8
a1         5       4       5     6
a2        no      no      no   yes
a3        21     30/32    26    19

Table C.7: EC1 Procedure 2 Authoring Information

            Subject
Topic      3      6     12    14
t1        93    100    149   120
t2        40     48     34    53
t3       133    148    183   173
t4        28     30     30    21
t5        30     30     30    25
t6        22     26     18    20
t7        29     30     18    22

Table C.8: EC1 Time Spent on Activities
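The derived quantities in the authoring tables (er7-er9, ed14, and a3) are simple sums of the raw counts defined above. A minimal sketch, checked against Table C.6's subject 14 column (the function and dictionary layout are illustrative, not part of the study's tooling):

```python
def derived_totals(ed, er):
    """Compute the derived totals from raw edit (ed1..ed13) and
    error (er1..er6) counts, per the definitions in section C.3."""
    er7 = er["er1"] + er["er3"] + er["er5"]   # errors of omission
    er8 = er["er2"] + er["er4"] + er["er6"]   # errors of commission
    er9 = er7 + er8                           # total errors
    ed14 = sum(ed.get(f"ed{i}", 0) for i in range(1, 13))  # ed1..ed12; ed13 ignored
    a3 = ed14 + er9                           # total effort
    return er7, er8, er9, ed14, a3

# Subject 14, procedure 1 (Table C.6): ed1 = 8, all other edits 0;
# er1..er6 = 0, 3, 4, 3, 0, 0.
ed = {"ed1": 8}
er = {"er1": 0, "er2": 3, "er3": 4, "er4": 3, "er5": 0, "er6": 0}
print(derived_totals(ed, er))  # (4, 6, 10, 8, 18)
```

The result matches the er7, er8, er9, ed14, and a3 rows of that column.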
C.3.2 Experimental Condition EC2

            Subject
Topic      2     5     8     9    10    11
ed1       17     -     -     -    29    10
ed2        0     -     -     -     0     0
ed3        0     -     -     -     1     0
ed4        0     -     -     -     0     0
ed5        0     -     -     -     0     0
ed6        0     -     -     -     4    14
ed7        0     -     -     -     2     0
ed8        0     -     -     -     0     0
ed9        0     -     -     -     0     0
ed10       0     -     -     -     8     0
ed11       -     -     -     -     -     -
ed12       0     -     -     -     0     2
ed13       -     -     -     -     -     -
ed14      17     -     -     -    44    26
exp1       -     -     -     -     -     -
exp2       -     -     -     -     -     -
er1        8     -     -     -     4     3
er2       18     -     -     -     7    12
er3       11     -     -     -     8     3
er4       26     -     -     -     5    25
er5        3     -     -     -     0     0
er6        5     -     -     -     1     2
er7       22     -     -     -    12     6
er8       49     -     -     -    13    39
er9       71     -     -     -    25    45
a1        10     -    19     -     9    10
a2        no     -   yes    no   yes   yes
a3        88     -     -     -    69    71

Table C.9: EC2 Procedure 1 Authoring Information

Subject 5 demonstrated the steps too quickly. Diligent's implementation could not determine which state changes were caused by a given action. To correct this, the subject would have had to empty Diligent's knowledge base and start over.
Subject 8 didn't understand the directions. The subject tried to move the valve handle to every valve. The procedure is so bad that it cannot be easily graded. The procedure had 51 edits and at least 65 errors.
Subject 9 authored a hierarchical procedure with several subprocedures. This makes it difficult to compare the procedure to those of the other subjects, who did not attempt a hierarchical procedure. The procedure has two problems: 1) the subject performed unnecessary steps that moved the handle between the valves, and 2) the subject forgot to turn on the motor at the end of the procedure.
            Subject
Topic      2      5       8     9     10      11
ed1        7     11       6     7      6       6
ed2        0    0/6       2     0      0       0
ed3        0      0       0     0      0       0
ed4        0    0/5       0     0      0       0
ed5        0      0       0     0      1     0/1
ed6        0      0      10     4      2     0/3
ed7        0      1       0     0      2       0
ed8        0      0       0     0      0       0
ed9        0      0       0     0      0       0
ed10       0      0       0     0   3/12       0
ed11       -      -       -     -      -       -
ed12       0      0       2     1      0     0/6
ed13       -      -       -     -      -       -
ed14       7  12/23      20    12  14/23    6/16
exp1       -      -       -     -      -       -
exp2       -      -       -     -      -       -
er1        0      0       3     3    0/7     0/1
er2        7      7       3     4    4/0     7/6
er3        0    1/0       7     5   2/10       0
er4        8      8       4     3    5/0     8/7
er5        0      0       0     0      0       0
er6        1    5/0       0     0      0       0
er7        0    1/0      10     8   2/17     0/1
er8       16  20/15       7     7    9/0   15/13
er9       16  21/15      17    15  11/17   15/14
a1         7   11/6       6     6      6       6
a2       yes    yes      no   yes    yes     yes
a3        23  33/38      37    27  25/40   21/30

Table C.10: EC2 Procedure 2 Authoring Information

The final procedure for subject 10 is marked as working because the steps are in the correct order and all ordering constraints are reasonable. However, the final procedure is basically unordered. (The version before testing was much better.)

            Subject
Topic      2     5     8     9    10    11
t1       119    75    84    91    91   123
t2        38    27    35    66    39    54
t3       157   102   119   157   130   177
t4        24    30    30    30    30    30
t5        29    30    30    30    30    30
t6         5    15    30    30    17    19
t7         8    30    30    30    30    29

Table C.11: EC2 Time Spent on Activities

Subject 9 skipped a day and took longer to train on the second day because of software problems.
C.3.3 Experimental Condition EC3

            Subject
Topic      1     4     7    13    16
ed1        3     4    10    11     7
ed2        -     -     -     -     -
ed3        -     -     -     -     -
ed4        0     0     6     4     3
ed5        0     0     0     0     0
ed6        0     0     0     0     0
ed7        0     0     0     0     0
ed8        0     0     0     0     0
ed9        0     0     0     0     0
ed10       3    24     0    21    11
ed11       1    12     2     7    11
ed12       0     0     0     0     0
ed13       3    12     6    11    11
ed14       7    40    18    43    32
exp1       -     -     -     -     -
exp2       -     -     -     -     -
er1       13    13    13    13    13
er2        0     1     0     0     2
er3       30    30    29    29    27
er4        1     5     0     7     6
er5        7     6     5     3     4
er6        2     2     0     2     0
er7       50    49    47    45    44
er8        3     8     0     9     8
er9       53    57    47    54    52
a1         3     4     3     7     4
a2        no    no    no    no    no
a3        60    97    63    97    84

Table C.12: EC3 Procedure 1 Authoring Information

The correct procedure has 8 steps. Subject 1 didn't have any ed13 data, so the value of ed13 equals the number of steps. For subject 7, it is not clear why ed1 - ed4 = 4 rather than 3 (a1). This discrepancy was not reproducible. Subject 7 did so well because he didn't do much authoring. For example, the subject did not specify a single precondition (ed10 and ed12).
Topic Subject 1 4 7 13 16 tl 117 114 117 67 72 t 2 46 47 43 2 2 35 t3 163 161 160 89 107 t4 30 30 30 28 30 t5 30 30 30 28 30 t 6 30 30 30 14 19 t7 30 30 30 19 30 Table C.14: E C 3 Tim e Spent on Activities Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. C.4 Session Log This section contains d a ta collected during each su b ject’s two sessions. The section also mentions changes to the system and training to correct problem s with earlier subjects. The changes were m eant to correct problems with the study. First, it was impor tan t that subjects understood how to correctly use Diligent. Second, subjects needed to understand w hat steps were needed in the two procedures being authored. Two changes th at dealing w ith how subjects authored are not mentioned. One change is repeatedly reminding E C i and E C 2 subjects to avoid dem onstrating too quickly. Demon strating too quickly caused problem s with Diligent’s im plem entation. In particular, it caused pairs of actions to ap p ear simultaneous, and Diligent does not handle simultaneous actions. The other change is telling E C 1 subjects to experim ent with their procedures. One EC \ subject, who did n ’t experim ent, was switched group E C 2 . The potential for sim ultaneous actions was aggravated by a memory leak involving the VIVIDS simulation and the V ista browser. As more m em ory was lost, the V ista would get progressively slower and less responsive. Shortly after subject 7, updated versions of VIVIDS and Vista were installed. This fixed many of the performance problems th at subjects experienced with V ista. The material in this section is derived from notes rath e r than the answers to the ques tionnaire on the subject’s impressions of Diligent. In the following, the experim enter/author is referred to as the test m onitor. Minor errors in m anuals, such as typographical and grammatical errors, are not m entioned. • Subject 1 . 
- Session 1 The subject had questions about using V ista (the environm ent’s graphical in terface) . The subject looked a t menus th at hadn’t been discussed yet. The test m onitor told the subject, “it will become clear later o n .” The subject was confused that the graph of the procedure was not updated when a step was added. (T he graph is not updated after the window is opened.) The subject had difficulty understanding the concepts involved in a authoring procedure. P art of the reason is th at he d id n ’t know w hat he was trying to produce. He also had difficulty connecting a graph of a procedure with STE V E’s explanation. The subject felt th a t he was having to sim ultaneously learn the procedural representation and how to use Diligent. T he subject felt that he could do this, but other subjects might have more problem s. (This comment caused the creation of the procedural representation section and worksheet.) - Session 2 : training The subject read the procedural representation section (which later subjects read during the first session). Problems zooming in and out in Vista. During the practice problem, the subject was told to test his procedure. 320 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. — Session 2: 1 st procedure Confused about which steps to perform and their order. Expressed a desire for a list of available actions. (No subject was given this list. For this group (F C 3 ), the available actions are listed in one of Diligent’s menus.) • Changes The procedural representation section and worksheet were added to the first day’s tutorial. • Subject 2 — Session I The subject was confused about how he could tell whether a precondition is correct or not. The training m aterial ju st said a precondition was incorrect. (This question couldn’t be answered because it depends on the domain.) 
— Session 2: training The subject authored th e tutorial’ s procedure with separator drain manifold values rather than cutout valves. — Session 2: 1 st procedure The subject was confused ab o u t the procedure’s description. The test m onitor pointed to a description o f the procedure’s goals. The subject was surprised when a m enu for the operator name did not ap pear the second time the subject performed operator’s action (i.e. turning the handle). The test m onitor had to show the subject how to get to the control door. When the subject indicated th a t he was finished, he was told to test the proce dure. • Changes The domain attribute “sdm_handle_open” is no longer available to subjects. This attribute interferes with learning, but is needed by Steve for determ ining th a t the handle has finished turning. Subjects th at only use the editor (E C 3 ) can now add control preconditions directly to steps. Before these subjects had to add the preconditions to a conditional effect. The groups using dem onstrations (E C 1 and E C 2 ) already had this capability. Modified the description of the first procedure by adding a paragraph. The paragraph reminded the subject th at Diligent only asks for an operator’s name once. The second time th at the operator’s action is seen, Diligent does not ask for the name. In the first procedure, the operator for turning a handle is used multiple times. • Subject 3 321 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. — Session 1 Subject asked if he could play with the system while reading the tutorial. T he subject was told to follow the directions. The subject thought the procedural representation worksheet questions were confusing. — Session 2: training The subject had problems zooming in w ith Vista. The subject forgot to s ta rt testing the tu to ria l’s procedure. The subject then asked questions about options th at are only available during testing. 
During the practice problem , the subject asked questions. When asked about preconditions, the subject was told, “w hatever you think is best.” The subject asked if he should test his procedure and was told yes. — Session 2: 1 st procedure The subject was not told th a t he could w rite on the sheet containing the pro cedure’s description. T he subject didn’t see the picture identifying the separator drain manifold valves. — Session 2: 2nd procedure The subject was told th at he could write on the sheet containing the procedure’s description. The subject asked about the am ount of tim e left when there were 12 and 5 m inutes left. — Session 2: later comments The subject thought V ista was too slow. The subject didn’t feel th a t he knew the system well enough to recover from errors. The subject tried to turn lights on/off by selecting them with the mouse. (Of course, this did not work.) • Subject 4 — Session 1 Showed the subject how to zoom in with V ista. Vista sometimes responded a little slowly. — Session 2: training Stopped after finishing the tutorial instead of reading the directions. The sub ject was told to continue. — Session 2: 1 st procedure The subject had problems with inconsistent procedure goals. (The subject used the E C 3 editor.) — Session 2: later comments 322 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The subject said th at having to spell a ttrib u te values was not a problem when using the editor. During the experim ent, the subject asked if he could ask questions. He was told “no.” • Changes The second day tutorial now shows experim ental groups ECy and E C i how to give a second dem onstration. This helps with error recovery. During the experim ent, subjects are told to s ta rt the procedures from the state shown in the Vista window. Sheets with pre-printed statem ents were created. They are used during training and for preparing subjects for authoring the experim ent’s procedures. 
Sensing actions are disabled in Diligent. This should not impact subjects because students shouldn’t use sensing actions.
Subjects are now told to test the practice problem. (So far, the subjects have tested it.)
Limit the review at the start of the second session to 10 minutes.
• Subject 5
— Session 1
The subject was confused about the use of pseudo-steps that represent the procedure’s initial and goal states.
The subject accidentally started defining a subprocedure and was told to abort it.
Sometimes the procedure’s graph looks different than what is shown in the tutorial. This confused the subject.
The subject was a little confused about why causal links and ordering constraints are rejected independently. The subject was told that an author may want an ordering constraint without a causal link when he doesn’t want to show the causal link’s condition to students.
— Session 2: training
At the start of the session, the subject was told to focus on the synopsis and procedural representation worksheet. However, the subject could look at other parts of the tutorial.
Told the subject to do “whatever you think is best” during the practice problem.
— Session 2: 1st procedure
The subject had a serious error when he demonstrated the procedure too quickly and experienced the simultaneous actions problem. This hurt the final procedure. The test monitor told him what caused the problem.
The subject thought that the second stage valve would turn off the first stage light. (The first stage valve turns off the first stage light.)
In the middle of a demonstration, the subject suspended the demonstration. However, this prevents learning and is undesirable in the experiment.
— Session 2: 2nd procedure
STEVE did nothing while testing the procedure. The test monitor told the subject to abort the test.
The test monitor appears to have made a mistake because the symptoms indicated that the procedure was bad and that STEVE could not find any appropriate actions to perform.
— Session 2: later comments
The subject did not like the procedure descriptions.
• Changes
Changed the color of the control door power on and motor on lights. Before this, subjects were told what color was on and off.
Disabled Diligent’s suspend demonstration command. Subjects should not use this feature.
The description of the experiment’s first procedure was changed. It was made explicit that each alarm light can be turned off by opening the corresponding separator drain manifold valve. This change was made because subject 5 thought that opening the second stage valve would turn off both the first and second stage alarm lights.
• Subject 6
— Session 1
The subject had to be shown how to reset the view of the device with the simulation (i.e., VIVIDS).
The subject tried to think about plans in terms of finite state machines.
The subject had to be shown STEVE’s control panel.
— Session 2: training
The subject had difficulty specifying the step after which a new step is inserted.
The subject was told that experiments interacted with the environment.
In the practice problem, the subject was confused about step-specific preconditions and conditional effect preconditions. (The subject had obvious misconceptions during the practice problem.)
— Session 2: 1st procedure
The subject demonstrated actions too quickly twice. This problem could not be fixed.
— Session 2: later comments
The subject’s nearsightedness caused real problems in training and in using the system.
The subject was frustrated because Vista was slow and moving around in Vista was difficult. (Subjects don’t need to zoom or pan during the experiment.)
• Changes
Created a solution for the practice problem. The solution allows subjects to verify that they understand how to author.
The description of the experiment’s first procedure was changed. It was made explicit that subjects should focus on turning off alarm lights that are red.
The description of the experiment’s second procedure was changed. It now says to shut the “two air intake valves” rather than the “air intake valves.”
• Subject 7
— Session 1
The subject refused to follow directions. He read the synopsis at the end of the tutorial first.
The subject was a planning expert who believed that a causal link implies an ordering constraint. The subject didn’t care about the representation worksheet’s answers.
Showed the subject how to associate an effect with a step.
The subject couldn’t finish because Vista crashed. The subject’s data was reloaded, but testing with STEVE didn’t work. (STEVE couldn’t be used because Diligent was not providing STEVE with some low-level knowledge.) The subject finished the testing section by reading the tutorial.
— Session 2: training
The subject was told that a procedure’s graph was not updated after the window was opened.
Showed the subject how to answer questions with STEVE’s control panel. This was the portion of the first session that was skipped after Vista crashed.
— Session 2: 1st procedure
Wanted to know about checking the condensation, but the test monitor couldn’t say anything.
• Subject 8
— Session 2: 1st procedure
The subject had problems with his procedure and wanted to start over. The subject was told to create a new procedure.
— Session 2: later comments
The subject didn’t understand that the ordering relationships shown in a procedure’s graph are not updated. The subject felt that this was Diligent’s biggest problem.
The subject thought that each demonstration should contain only one step. This makes learning preconditions more difficult.
• Changes
Subjects in EC3 can now only add one step at a time. Before, they could specify the previous step and add several sequential steps. This change removed a menu from the editor that is very similar to the Demonstration menu used by EC1 and EC2. However, by skipping a menu, the editor is a little simpler to use.
The practice problem solutions for groups EC1 and EC2 now say that only one demonstration is necessary.
The description of the experiment’s first procedure was changed. The description now mentions the initial state. It said that the motor is turned off, two alarm lights are red, and the initial state can be seen in the Vista window.
• Subject 9
— Session 1
The subject had problems with Vista. The subject was shown how to select objects. The subject was also shown how to reset Vista’s view of the environment.
— Session 2: training
This is the only subject to skip a day between the two sessions.
The subject had some problems manipulating Vista.
The subject had problems with the practice problem, which had to be restarted twice. Because of these problems, the practice problem took 20 minutes rather than 10. One problem is that the subject performed actions too quickly and experienced the simultaneous actions problem. Another problem is that the subject closed a window with an X-window command instead of using the button provided for the task. When using the X-window command, the subject ignored a window that warned her about closing a window in that manner.
— Session 2: 1st procedure
The subject was confused about the procedure’s description. He wasn’t sure whether he needed to open the valves. He was told that he needed to open the valves.
— Session 2: 2nd procedure
The subject asked if the power had to be turned off.
He was told, “yes.”
— Later comments
This is the only subject who tried authoring with subprocedures, which is a topic that was not covered during training.
• Changes
The creation and use of subprocedures was disabled.
The directions for the experiment’s first procedure were changed. They now explicitly say that the valves need to be opened and the motor turned on. This is meant to prevent subjects from thinking that either the valves can be opened or the motor turned on.
• Subject 10
— Session 1
The subject performed actions too quickly at the start and experienced the simultaneous actions problem. Afterwards, the subject seemed to have no problems.
The subject appeared to be familiar with moving around in Vista.
— Session 2: 1st procedure
The subject expressed concern about her inability to turn off lights, but the subject did eventually figure this out.
— Session 2: later comments
The subject said that editing was hard, but testing with STEVE was easy.
The subject was also trying to put in optional steps so that the steps could be performed in different orders. (Presently, this is unsupported.)
• Subject 11
— Session 1
Initially, the subject had problems zooming out too far with Vista.
— Session 2: training
For the practice problem, the subject was shown how to access causal link information.
— Session 2: 2nd procedure
The subject was told that the power light is white rather than gray at the start of the procedure.
— Session 2: later comments
The subject felt that the environment was unusual, and it was difficult getting used to it.
The subject didn’t realize that experiments would remove dependencies. For this reason, the subject was moved from group EC1 to group EC2.
• Changes
The practice problem solution for group EC1 now lists how experiments correct the plan.
• Subject 12
— Session 1
The subject zoomed in too fast in Vista. The subject was shown how to reset the view.
The subject was very meticulous when covering the tutorial.
— Session 2: 1st procedure
Experimented without recomputing ordering relationships.
— Session 2: 2nd procedure
After the 1st procedure but before starting the 2nd, the subject was told to recompute the ordering relationships after testing.
— Session 2: later comments
The subject felt that Vista zoomed in or out too fast.
The subject also didn’t think that testing was necessary.
• Subject 13
— Session 1
The subject demonstrated the steps in the wrong order. The subject was told to edit the procedure so that it resembled the tutorial’s procedure. While the out-of-order problem was being discovered, the subject saw the test monitor use menus to identify the problem.
— Session 2: later comments
The subject said that he did not have any problems with Vista.
• Subject 14
— Session 1
Explained to the subject that the Soar window’s “wait2” and “wait3” meant that nothing else was happening.
• Subject 15
Quit after the first session.
• Subject 16
— Session 1
The subject was familiar with STEVE but not Diligent. (The procedures being authored during the experiment would not work in the versions of the environment that the subject had seen.)
Appendix D
How to Use Diligent
This section contains selected parts of the first day’s tutorial. It focuses on how to create a procedure, add steps to it, and edit it. These are the areas where the three versions of Diligent used in the empirical evaluation differed. To limit this section’s length, some things have not been shown.
Things not shown include deriving goal conditions, deriving ordering relationships, experimenting, and testing. The chapter and tutorial summaries are also not shown. Most of the following sections represent the version that was given to subjects who could both demonstrate and experiment. This material is probably identical to the material given to subjects who could demonstrate but not experiment. Section D.3 describes how steps, preconditions, and state changes are added by the subjects who could only use an editor.
As mentioned earlier, this thesis uses the term “step relationships” while the tutorial uses the term “ordering relationships.” In order to maintain consistency with screen snapshots, the term “ordering relationships” will be used in this appendix.
Because Diligent used a whole suite of software components, it was not feasible to include everything in this document. If you would like to get a copy of the system, please contact:
Center for Advanced Research in Technology for Education
Information Sciences Institute
University of Southern California
4676 Admiralty Way, Suite 1001
Marina del Rey, California 90292
D.1 Starting to Specify a Procedure
Figure D.1: Main Learning Menu
Option: Description
Update existing procedure: Select and change an existing procedure.
Create procedure: Create a new procedure.
Which attributes are used: Allows attributes to be ignored when computing ordering relationships.
Figure D.2: Main Learning Menu “Editing” Options
Now that we can manipulate the Vista browser, we are ready to start defining a procedure. We will be using Diligent’s Main Learning Menu. Figure D.1 shows Diligent’s Main Learning Menu, and figure D.2 shows the submenu options available under “Editing.”
Select the “Create new procedure” option on the Main Learning menu’s “Editing” submenu.
Figure D.3: Procedure Description Menu
The menu shown in figure D.3 will appear. Each procedure has a name, which is used to identify it, and a description, which is given to the human students who are to learn it.
Please enter the procedure name “foo” and the description “demonstrate how to author a procedure”. Indicate that you want to continue defining a procedure by selecting the “Accept” button.
D.2 Demonstrations
This version of the chapter is for when demonstrations are used. The next chapter contains the material that was used for the evaluation’s control group, which was not allowed to demonstrate.
At this point we have started a procedure and given it a name and description. We are now ready to define the procedure’s steps. A step is another procedure or an action performed in the simulated environment. We are going to specify actions by performing (or demonstrating) them in the Vista window.
D.2.1 Chapter Goals
• Learn how to demonstrate a procedure.
• Learn to provide more than one demonstration.
• Learn about different types of demonstrations.
D.2.2 Setting the Initial Environment State
Figure D.4: Simulation Configuration Menu
Before we demonstrate the procedure, we need to put the environment in the proper initial state. After defining our procedure’s name and description, you will see the Simulation Configuration menu (figure D.4), which specifies an initial state for the environment.
Select “Ok” to choose the default configuration.
Resetting the environment takes several seconds. The state has been reset when the text stops scrolling in the Communications Bus Monitor window (figure D.5). After resetting the environment, you could make additional changes to the environment. Steps will not be added to the procedure until we indicate that we are done making additional changes (figure D.6).
Indicate that we are ready to start adding steps by selecting the “Ready” button (figure D.6).
Figure D.5: Communications Bus Monitor Window
Figure D.6: Additional Environment Changes
D.2.3 Adding Steps
At this point the Demonstration menu will appear (figure D.7). The menu has 3 options that need to be understood.
1. “Define new subprocedure” will start the definition of a brand new procedure as a step in the current procedure.
2. “Insert” allows use of an existing procedure as a step in the current procedure.
3. “End demonstration” will end our demonstration and add the steps we have demonstrated to the procedure.
Before the demonstration, the Vista window should look like figure D.8, and afterwards, it should look like figure D.9.
Now start the demonstration by toggling the leftmost valve. Toggle the valve by putting the cursor over it, holding down the SHIFT key, and pressing the left mouse button.
Figure D.7: Demonstration Menu
Figure D.8: Environment before Demonstration
Figure D.9: Environment after Demonstration
D.2.4 Operator Descriptions
Figure D.10: Operator Description Window
A window will appear that asks for operator information (figure D.10). What is an operator? Operators describe the preconditions and state changes for actions that are performed in the simulated environment.
The preconditions and state changes will be useful for computing the ordering relationships between steps.
The operator’s name is used to identify it. Give the operator the name “toggle-1st”. The operator’s description is given to human students. Use the default description, “toggle the first cutout valve.” Close the window by selecting “Accept.”
When performing an action, always make sure Soar has finished processing it. You can tell that Soar is finished when the Soar window looks something like figure D.11. When the processing is finished, “wait2” and “wait3” will be scrolling in the Soar window.
Wait for Soar to finish processing the action.
D.2.5 Add More Steps
To elaborate our example, we will add two more steps to the procedure. This will give you a chance to practice.
Now manipulate the second valve from the left. Do this by pressing the left mouse button on the valve while holding down the SHIFT key. Call the operator “toggle-2nd”.
Next, manipulate the third valve from the left and call the operator “toggle-3rd”.
At this point, the picture in the browser should look like figure D.9.
D.2.6 End Demonstration
To end our demonstration and add the steps to the procedure, select “End demonstration” on the Demonstration menu (figure D.7). The Demonstration menu will disappear.
Figure D.11: Soar Processing an Action
D.2.7 Additional Demonstrations
Figure D.12: Demonstration Version of Procedure Modification Menu
After you finish demonstrating a procedure, you can provide additional demonstrations. This is done using the Procedure Modification menu (figure D.12), which is activated when you finish a demonstration.
Start a new demonstration by selecting the “Demonstration” option on the Procedure Modification menu.
Figure D.13: Demonstration Type Menu
A window will appear that asks you to indicate what type of demonstration you want to perform (figure D.13).
• “Additional steps” This option allows you to insert additional steps between two steps that are already in a procedure.
• “Clarify without adding steps” This option allows you to demonstrate how the environment works without adding any steps. This type of demonstration helps Diligent discover the preconditions of a procedure’s steps. Since Diligent assumes the order in which steps are performed is significant, a good heuristic for this type of demonstration is to change the order of the steps as much as possible. For example, our previous demonstration toggled the 1st cutout valve before toggling the 2nd and 3rd cutout valves. A good clarifying demonstration would be to toggle the 3rd cutout valve before toggling the 2nd and 1st cutout valves.
Indicate that you want to give a clarification demonstration by selecting the diamond next to “Clarify without adding steps”. Then select “Ok” to continue.
D.2.8 Choosing a Previous Step
Once a procedure has some steps, you need to specify which existing step precedes the first step in a new demonstration. Figure D.14 shows how the previous step is specified. The upper window contains a graph that shows the order of execution for the procedure’s existing steps. The lower window allows you to specify the previous step.
Cancel the demonstration by selecting the “Cancel” button in the lower window. Also close the graph’s window by selecting “Ok”.
Figure D.14: Previous Step Menu
D.3 Adding Steps to a Procedure
The previous chapter discussed how to demonstrate a procedure.
This chapter describes how to add steps to a procedure using only an editor.
At this point we have started a procedure and given it a name and description. We are now ready to define the procedure’s steps. A step is another procedure or an action performed in the simulated environment.
D.3.1 Chapter Goals
• Learn how to add steps to a procedure.
• Learn how to associate operator effects with a step.
• Learn how to define operator effect preconditions and state changes.
D.3.2 Adding Steps
After defining our procedure’s name and description, you will see the Procedure Modification menu (figure D.15), which is the main menu for modifying a procedure.
To add steps to the procedure, select the Procedure Modification menu’s “Add a step” option.
Figure D.15: Manual Editor Version of Procedure Modification Menu
D.3.3 Choosing a Previous Step
Before we can add a step, we need to specify which existing step goes before the new step. Figure D.16 shows the windows that help you specify the previous step. The upper window in figure D.16 contains a graph that shows the order of execution for the procedure’s existing steps. Initially, there are two steps, which indicate the procedure’s beginning and end. The lower window in figure D.16 allows you to specify the previous step. You could change the previous step by selecting the box containing “begin-foo”. Since the procedure is new, “begin-foo” has to be the previous step.
Agree to continue adding a step by selecting “Ok” in the lower window. Also close the graphical view of the procedure by selecting “Ok” in the upper window.
D.3.4 Selecting an Action
The Action Selection menu will appear (figure D.17). The menu describes the actions that can be added to the procedure. We want to toggle the first cutout valve.
Select “toggle the first cutout valve.” Then approve the action by selecting “Ok”.
Figure D.16: Previous Step Menu
Figure D.17: Action Selection Menu
D.3.5 Operator Descriptions
Figure D.18: Operator Description Window
A window will appear that asks for operator information (figure D.18). What is an operator? Operators describe the preconditions and state changes for actions that are performed in the simulated environment. The preconditions and state changes will be useful for computing the ordering relationships between steps.
The operator’s name is used to identify it. Give the operator the name “toggle-1st”. The operator’s description is given to human students. Use the default description, “toggle the first cutout valve.” Close the window by selecting “Accept.”
Figure D.19: Effect Selection Menu Before Effects Defined
D.3.6 Selecting Operator Effects
When adding a step, not only does the action need to be associated with an operator, but the step must also be associated with some of the operator’s effects. The Effect Selection menu will appear (figure D.19). Unfortunately, the new operator has no defined effects.
Define an effect by selecting “Add effect to operator”.
Figure D.20: Initial Operator Effect Menu
D.3.7 Adding Operator Effects
The Operator Effect menu will appear (figure D.20) for operator “toggle-1st”’s first effect.
Let us first add some preconditions by selecting “Modify preconditions”, which allows us to add, delete, and modify the effect’s preconditions.
Figure D.21: Precondition Attribute List
A window will appear that contains a list of environment attribute names that can be used in preconditions (figure D.21). If an attribute has a defined value for preconditions, the checkbox (little square box) next to the attribute name will be selected.
Scroll down the list and select the checkbox next to the attribute “gb_covstg1_state”.
Figure D.22: Attribute Value Input Window
The Attribute Value Input window will appear (figure D.22). Figure D.22 shows that attribute “gb_covstg1_state” is described as the “first cutout valve”.
Enter the attribute value “open” and close the window by selecting “Ok”.
In the Precondition Attribute list, the square next to the attribute’s name is now red. Let us look at the precondition that we just defined.
Select the rectangle containing the attribute name (“gb_covstg1_state”).
Figure D.23: Precondition Value Window
A window containing information about the precondition will appear (figure D.23). The attribute’s description is “first cutout valve,” and the attribute’s value is “open”.
We now want to go back to the Operator Effect menu.
Close the Precondition Value window and the Precondition Attribute List window by selecting “Ok”.
Now that we are back on the Operator Effect menu, we will add a state change. The process is exactly like that used to add preconditions.
Add a state change to the effect by selecting “Modify state changes”, which allows us to add, delete, and modify the effect’s state changes. Indicate that the attribute “gb_covstg1_state” should have the value “shut”. When you are done, close the State Change Attribute List and go back to the Operator Effect menu.
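The effect assembled above pairs a set of preconditions with a set of state changes. As a rough sketch of that idea in code (hypothetical Python for illustration only; Diligent’s real representation is Soar-based, and the class and function names here are invented):

```python
# Hypothetical sketch of an operator and one of its effects. Names are
# illustrative, not Diligent's actual (Soar-based) representation.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Effect:
    preconditions: Dict[str, str]   # attribute -> value required beforehand
    state_changes: Dict[str, str]   # attribute -> value afterwards

@dataclass
class Operator:
    name: str          # identifies the operator, e.g. "toggle-1st"
    description: str   # text shown to human students
    effects: List[Effect] = field(default_factory=list)

def is_applicable(effect: Effect, state: Dict[str, str]) -> bool:
    """An effect applies when every precondition holds in the state."""
    return all(state.get(a) == v for a, v in effect.preconditions.items())

def apply_effect(effect: Effect, state: Dict[str, str]) -> Dict[str, str]:
    """Return the environment state after the state changes occur."""
    return {**state, **effect.state_changes}

# The effect defined in this section: the first cutout valve must be
# "open" before the action and is "shut" after it.
toggle_1st = Operator("toggle-1st", "toggle the first cutout valve",
                      [Effect({"gb_covstg1_state": "open"},
                              {"gb_covstg1_state": "shut"})])
```

With the valve open, `is_applicable` holds and `apply_effect` yields the shut state, the same precondition/state-change pairing that the Operator Effect menu displays.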
Figure D.24: Updated Operator Effect Menu
At this point the Operator Effect menu should look like figure D.24. One precondition and one state change are now defined. You should know a couple of things about the Operator Effect menu.
1. Only preconditions with a “Likelihood” of “high” or “medium” are used. By default the preconditions that you add will have a “high” likelihood.
2. By selecting the rectangle containing a precondition’s “Condition” (e.g. “gb_covstg1_state = open”), you can look at information about the precondition. You can also change the precondition’s “Status”, which controls its “Likelihood”.
3. By selecting the rectangle containing a state change (e.g. “gb_covstg1_state = shut”), you can look at information about the state change.
Now add the effect to the operator by selecting “Approve” on the bottom of the Operator Effect menu. This returns us to the Effect Selection menu.
Figure D.25: Updated Effect Selection Menu
D.3.8 Selecting Operator Effects Revisited
The Effect Selection menu for our first step should now have an effect listed (figure D.25).
Associate the operator’s first effect with the step by selecting the checkbox next to effect “1”. Now approve the association of step “toggle-1st-1” with operator “toggle-1st”’s first effect by selecting “Ok”.
D.3.9 Add a Couple More Steps
To elaborate our example, we will add two more steps to the procedure. This will give you a chance to practice.
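Before practicing, it may help to restate the filtering rules around preconditions in one place. The sketch below encodes only what the tutorial states (statuses “required”, “suspect”, and “provisional” are used, and only likelihoods of “high” or “medium” count); the exact status-to-likelihood calculation is defined in Section A.3, not here:

```python
# The tutorial's two precondition-filtering rules, restated in code.
# This is a simplified illustration; the real status-to-likelihood
# calculation is described in Section A.3 of the thesis.
USED_STATUSES = {"required", "suspect", "provisional"}
USED_LIKELIHOODS = {"high", "medium"}

def status_used(status: str) -> bool:
    """A precondition is used only when its status is one of the three
    'used' statuses."""
    return status in USED_STATUSES

def likelihood_used(likelihood: str) -> bool:
    """Equivalently, only 'high' or 'medium' likelihoods are used;
    newly added preconditions default to 'high'."""
    return likelihood in USED_LIKELIHOODS
```

So a freshly added precondition (default likelihood “high”) is used, while one whose likelihood the experiments have driven to “none” is ignored.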
After step “toggle-1st-1”, add the “toggle the second cutout valve” action, name the operator “toggle-2nd,” and have attribute “gb_covstg2_state” change its value from “open” to “shut.” “open” is the precondition value, and “shut” is the state change value.
After step “toggle-2nd-2”, add the “toggle the third cutout valve” action, name the operator “toggle-3rd,” and have attribute “gb_covstg3_state” change its value from “open” to “shut.”
D.4 Editing a Procedure
In this chapter we will explore how to edit the objects associated with a procedure.
D.4.1 Chapter Goals
• For objects associated with a procedure,
- Learn how to examine and modify them.
- Gain familiarity with their menus.
• Learn about ordering relationships (i.e. causal links and ordering constraints). (See section D.4.11 on page 363.)
• Modify our example procedure in preparation for testing.
D.4.2 Review: Reaching the Procedure Modification Menu
Figure D.26: Main Learning Menu
The Main Learning menu’s (figure D.26) “Editing” submenu allows you to access a procedure’s Procedure Modification menu. For an existing procedure, select “Update existing procedure”, and a list of procedures appears. Select the name of a procedure and then select “Ok”. This will open a Procedure Modification menu for the selected procedure.
Do nothing; the Procedure Modification menu is visible for procedure “foo”.
D.4.3 Procedure Graphs
Figure D.27: Procedure Graph from “Ordering relationships”
A Procedure graph presents the steps in a plan as nodes in a graph and allows you to access data for individual steps.
Create a graph of our procedure by selecting the “Graph” button on the Procedure Modification menu and choosing “Ordering relationships”.
The “ordering relationships” Procedure graph of our procedure is shown in figure D.27. The rectangles “begin-foo” and “end-foo” represent the beginning and end of the proce dure. The ovals represent the three steps we specified. T he arrows represent ordering relationships between pairs of steps. The procedure’s initial s ta te is represented as sta te changes caused by the procedure’s s ta rt step (“begin-foo” ), and the procedure’s goals are represented as preconditions of the procedure’s end step (“end-foo” ) . 352 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure D.28: Procedure Graph showing “execution order” 353 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure D.29: S tep Modification Menu Switch to an execution order view o f the procedure by selecting the box containing “ordering relationships” and choosing “execution order”. The “execution order” Procedure graph o f our procedure is shown in figure D.*28. The arrows order the steps in the sequence th a t we specified when we added them to the procedure. D .4 .4 L ooking at a step The Step Modification menu allows you to examine and modify objects associated with a step. Bring up the Step Modification menu for step “toggle-2nd-2” by moving the cursor over the oval containing “toggle-2nd-2” . When the oval's outline changes color (becom es black), press th e left mouse button. The Step M odification menu for step “toggle-2nd-2” is shown in figure D.29. The step's operator (“toggle-2nd” ) associates an action in the environm ent to the ste p ’s effects. This step produces the o p erato r’s first effect ( “ 1” ). Each effect associates a set of preconditions 354 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. th at must be true before th e step to a set of sta te changes th at result from executing the step. 
Edit operator "toggle-2nd"'s first effect by selecting the square that says "1".

D.4.5 Operator Effect Menu

Figure D.30: Operator Effect Menu

The Operator Effect menu maps a set of state changes caused by an action in the environment to a set of preconditions. The Operator Effect menu for the first effect of operator "toggle-2nd" is shown in figure D.30. You should know a couple of things about the menu.

1. The area at the top of the menu describes preconditions, which are attribute values that need to be true before the operator's action is performed.
2. Only preconditions with a "Likelihood" of "high" or "medium" are used.
3. By selecting the rectangle containing a precondition's "Condition" (e.g. "gb_covstg2_state = open"), you can look at information about the precondition. You can also change the precondition's "Status", which controls its "Likelihood".
4. The bottom of the menu lists state changes produced by the effect. State changes are the values of attributes after the operator's action is performed.
5. By selecting the rectangle containing a state change (e.g. "gb_covstg2_state = shut"), you can look at information about a state change.

D.4.6 Precondition Window

Figure D.31: Precondition Window

Using the Operator Effect menu, look at a precondition by selecting the rectangle containing "gb_covstg2_state = open".

The Precondition window describes a precondition for an operator effect. Figure D.31 tells us that the state of the "second cutout valve" needs to be "open" and that the precondition is "provisional", which means that it will be used. Preconditions are used only when their status is "required", "suspect" or "provisional".¹

Close the Precondition window by selecting "Ok".
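The status-to-likelihood calculation is described in Section A.3; the sketch below only illustrates the filtering behavior described above, in which a precondition is used exactly when its status is "required", "suspect" or "provisional". The mapping table and field names here are assumptions made for illustration:

```python
# Hypothetical status -> likelihood mapping; the real calculation is
# described in Section A.3. Shown only to illustrate the filtering rule.
LIKELIHOOD = {
    "required": "high",
    "suspect": "medium",
    "provisional": "medium",
    "rejected": "none",
}

def active_preconditions(preconds):
    """Keep only preconditions whose likelihood is "high" or "medium"."""
    return [p for p in preconds
            if LIKELIHOOD.get(p["status"], "none") in ("high", "medium")]

preconds = [
    {"condition": "gb_covstg2_state = open", "status": "provisional"},
    {"condition": "gb_covstg1_state = shut", "status": "rejected"},
]
# Only the "provisional" precondition survives the filter.
```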
D.4.7 State Change Window

Figure D.32: State Change Window

Using the Operator Effect menu, look at a state change by selecting the rectangle containing "gb_covstg2_state = shut".

¹In the tutorial, this chapter's summary has a table that describes the various status values. In this thesis, the calculation of status values is described in Section A.3.

The State Change window describes a state change caused by an operator's effect. Figure D.32 tells us that the state of the "second cutout valve" will be "shut".

Close the State Change window by selecting "Ok".

D.4.8 Modifying Preconditions

We will now introduce two preconditions for step "toggle-2nd." The preconditions will help us when we test the procedure.

D.4.8.1 Using the Operator Effect menu

The first precondition is erroneous. It will be identified when we test the procedure. The precondition is the last precondition in the Operator Effect menu's list of preconditions. The precondition has a likelihood of "none" because the experiments determined that it is unnecessary. Be aware that the scrollbar next to the preconditions does not indicate the number of preconditions in the list.

Select the precondition with the condition "gb_covstg1_state = shut". In the Precondition window, set the status to "required". Select "Ok" to close the Operator Effect menu.

D.4.8.2 Using the Step Prerequisites menu

The next precondition that we will specify is not required to perform the step. Instead, the precondition is used to control when the step is performed. Operator effects are inappropriate for this purpose because

• Preconditions are automatically eliminated if they are not required by the environment.
• The same effect could be used with several steps.
You can specify preconditions for controlling when a step is performed using the "Step Prerequisites" menu.

On the Step Modification menu for step "toggle-2nd-2", open the "Step Prerequisites" menu (figure D.33) by selecting the "Step Prerequisites" button.

We will specify that the first stage alarm light should be off before performing step "toggle-2nd-2". Select the precondition for the first stage alarm light by selecting the rectangle with the condition "cdm_chnl1_lt_state = off". In the Precondition window, set the status to "required" by selecting the diamond next to "required". Select "Ok" to close the Step Prerequisites window.

D.4.9 Updated Procedure Graph

After updating the preconditions, we need to close some windows and recalculate the ordering relationships between the procedure's steps.

Figure D.33: Step Prerequisites Menu

Close the Step Modification menu and the Procedure graph by selecting the "Ok" button on the bottom of each menu. On the Procedure Modification menu, recalculate our procedure's ordering constraints by selecting the "Complete" button and choosing "Derive ordering relationships". On the Procedure Modification menu, open up a new Procedure graph by selecting the "Graph" button and choosing "Ordering relationships".

Figure D.34: Incorrect Procedure Graph

Figure D.34 shows the Procedure graph when operator "toggle-2nd"'s first effect contains the erroneous precondition. You can see the error because the second step ("toggle-2nd-2") should not depend on the first step ("toggle-1st-1").

Go to the updated Step Modification menu for step "toggle-2nd-2" by selecting its oval.
Remember to look for a change in color of the oval's outline.

D.4.10 Updated Step Modification Menu

Figure D.35: Step Modification Menu with Error

After the error is introduced, the Step Modification menu looks like figure D.35.

To see dependencies with steps later in the procedure, select "this step depends upon". You will see two options, "this step depends upon" and "depend upon this step". Choose the "depend upon this step" option. Only "end-foo" will be listed as depending directly on step "toggle-2nd-2". (The preconditions for step "end-foo" are the procedure's goals.)

Undo the previous action by selecting "depend upon this step" and choosing "this step depends upon".

The menu should now say this step ("toggle-2nd-2") depends on steps "begin-foo" and "toggle-1st-1". ("begin-foo" represents the initial state in which the procedure starts.)

To look at the dependencies between step "toggle-1st-1" and our current step ("toggle-2nd-2"), select the rectangle containing "toggle-1st-1". This brings up the Dependencies menu for steps "toggle-1st-1" and "toggle-2nd-2".

D.4.11 Dependencies Menu

Figure D.36: Dependencies Menu

Before we can discuss the Dependencies menu, we need to define some terms. Ordering relationships are causal links and ordering constraints. A causal link is an attribute value caused by one step that is a precondition for a later step. An ordering constraint indicates the relative order for performing a pair of steps. You want an ordering constraint between the steps when

1. There is a causal link between the steps.
2. The state changes of the latter step interfere with the preconditions of the earlier step.
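The two conditions above can be sketched as simple checks over a pair of steps. This Python sketch is illustrative, not the system's implementation; the dictionary shapes and function names are assumptions, and the example steps mirror the tutorial's scenario, where the erroneous precondition creates a causal link from "toggle-1st-1" to "toggle-2nd-2":

```python
def causal_links(earlier, later):
    """Attribute values caused by `earlier` that `later` requires."""
    return {attr: val
            for attr, val in earlier["state_changes"].items()
            if later["preconditions"].get(attr) == val}

def needs_ordering_constraint(earlier, later):
    """True if there is a causal link between the steps, or if the later
    step's state changes would interfere with the earlier step's
    preconditions."""
    clobbers = any(later["state_changes"].get(attr) not in (None, val)
                   for attr, val in earlier["preconditions"].items())
    return bool(causal_links(earlier, later)) or clobbers

toggle_1st = {"preconditions": {"gb_covstg1_state": "open"},
              "state_changes": {"gb_covstg1_state": "shut"}}
toggle_2nd = {"preconditions": {"gb_covstg1_state": "shut",   # erroneous
                                "gb_covstg2_state": "open"},
              "state_changes": {"gb_covstg2_state": "shut"}}
```

With these steps, `causal_links(toggle_1st, toggle_2nd)` contains the single link "gb_covstg1_state = shut", which is why the Dependencies menu orders "toggle-1st-1" before "toggle-2nd-2".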
Figure D.36 shows dependencies between steps "toggle-1st-1" and "toggle-2nd-2". In figure D.36, notice three things.

1. Near the top of the menu there is a "provisional" ordering constraint between the two steps. If the diamond next to "rejected" is selected, no ordering constraint will be included in the procedure. The ordering constraint says that step "toggle-1st-1" should be performed before "toggle-2nd-2".
2. There is one causal link between the steps with the condition "gb_covstg1_state = shut". This means that step "toggle-1st-1" causes attribute "gb_covstg1_state" to have the value "shut" and that this value is a precondition for step "toggle-2nd-2".
3. The causal link is the only reason for the ordering constraint.

D.4.12 Looking at the Causal Link Menu

Figure D.37: Causal Link Menu

On the Dependencies menu, look at data for the causal link by selecting the rectangle containing "gb_covstg1_state = shut".

Figure D.37 shows the Causal Link menu. The figure says that there is a causal link between steps "toggle-1st-1" and "toggle-2nd-2" where a state change caused by "toggle-1st-1" is a precondition for "toggle-2nd-2". The state change is that the first cutout valve becomes shut. The causal link's status is "provisional". Causal links with a status of "rejected" will not be included in the procedure.

Close the open editing windows by selecting their "Ok" buttons. These windows are the Causal Link menu, the Dependencies menu, the Step Modification menu, and the Procedure Graph window. The Procedure Modification menu should still be open.
Asset Metadata
Creator: Angros, Richard Harrington, Jr. (author)
Core Title: Learning what to instruct: Acquiring knowledge from demonstrations and focussed experimentation
Degree: Doctor of Philosophy
Degree Program: Computer Science
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: Computer Science, OAI-PMH Harvest
Language: English
Contributor: Digitized by ProQuest (provenance)
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c16-169938
Unique identifier: UC11334317
Identifier: 3054846.pdf (filename), usctheses-c16-169938 (legacy record id)
Legacy Identifier: 3054846.pdf
Dmrecord: 169938
Document Type: Dissertation
Rights: Angros, Richard Harrington, Jr.
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA