Beyond AIML: Chatbots 102
 
 
by Bruce Wilcox
 
 
August 14, 2008 (Page 4 of 6)
 

Beyond AIML to CHAT-L

So what would I do? I wouldn't go down the regular expression path, because I, too, want to read my patterns easily, and regular expressions seem like overkill to me. Nor would I require that the pattern span the entire input. I am much more keyword-oriented. And I want a much clearer organization and independence of topic knowledge, so I store data as topics. The description below is an overview, not a specification covering every capability of the system. But it should give you a flavor of something different from, but similar in ability to, AIML.

The underlying CHAT-L engine takes your input and provides both the raw input and remapped canonical forms of words (e.g., plural nouns map back to their singular form and verb forms move to a canonical form), joining idiomatic phrases and proper names into single underscored words.
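
To make the remapping step concrete, here is a minimal Python sketch. It is not the CHAT-L implementation: the lookup tables CANONICAL and PHRASES and the function names are hypothetical stand-ins for the real dictionary, and only two-word phrases are joined.

```python
import re

# Hypothetical lookup tables; a real engine would consult a full dictionary.
CANONICAL = {
    "movies": "movie", "films": "film",
    "starred": "star", "loves": "love", "loved": "love",
    "am": "be", "is": "be", "are": "be", "was": "be",
}
PHRASES = {
    ("sigourney", "weaver"): "sigourney_weaver",
    ("ridley", "scott"): "ridley_scott",
}

def tokenize(text):
    """Lowercase the input and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def join_phrases(words):
    """Merge known two-word phrases into single underscored tokens."""
    out = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in PHRASES:
            out.append(PHRASES[pair])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

def canonicalize(text):
    """Return (raw tokens, canonical tokens) for one input sentence."""
    raw = join_phrases(tokenize(text))
    canon = [CANONICAL.get(w, w) for w in raw]
    return raw, canon
```

With these tables, canonicalize("Sigourney Weaver starred in the movies") keeps "starred" and "movies" in the raw list and yields "star" and "movie" in the canonical list, with the name joined as sigourney_weaver in both.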


The engine also decides whether the input was a statement or a question. AIML doesn't directly address sentence type. It requires that you strip out punctuation, allows none in patterns, and leaves you to tell statements from questions by word order. But subject inversion ("Do I") is an unreliable question marker given things like tag questions ("I do, don't I?").

CHAT-L then hunts for a way to react. It starts in the current topic, trying to find a matching question or statement reactor. If that fails, it will use keywords in the input to find some other topic with a matching question or statement reactor. If that fails (and ignoring a few other complications) it will make some generic grunt from the grunt topic in response to your input, and then proceed with a gambit line from the current topic. E.g.,

synonym: ~movie (film video)

topic: ~ALIENSMOVIE ( Aliens Sigourney)
g: The Aliens movies starred Sigourney Weaver.
?: (~plot) Alien creatures hatch inside humans.
?: ( actress heroine star) Sigourney Weaver starred as Ripley.
?: ( director directed ) Ridley Scott directed.
?: ( you THEN love AND movie) Yes, I love this movie.
s: ( Aliens AND ~movie) I have seen the movie Aliens.

A synonym line defines a name (~movie) and a list of words associated with it, including the name itself sans the ~. So in the above example, ~movie maps to movie and film and video.
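
As a rough illustration of that rule, a synonym line could be read into a set like this. This is only a sketch; parse_synonym is a hypothetical helper, not part of CHAT-L, and it assumes the one-line form shown in the example above.

```python
def parse_synonym(line):
    """Parse 'synonym: ~name (w1 w2 ...)' into (name, set of words)."""
    head, _, rest = line.partition(":")
    if head.strip() != "synonym":
        raise ValueError(f"not a synonym line: {line!r}")
    name, _, body = rest.strip().partition("(")
    name = name.strip()
    if not name.startswith("~"):
        raise ValueError(f"synonym name must start with ~: {line!r}")
    words = set(body.strip().rstrip(")").lower().split())
    words.add(name[1:].lower())  # the set includes the name itself, sans the ~
    return name, words
```

Calling parse_synonym("synonym: ~movie (film video)") returns the name ~movie together with the set containing movie, film and video.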

A topic declaration defines a synonym set (~ALIENSMOVIE) and its associated words (Aliens Sigourney); these words also serve as keywords that give access to the topic. The topic declaration also holds the different kinds of lines for the topic. The "s:" lines react to statements. The "?:" lines react to questions. The "g:" lines are gambits that can be issued spontaneously.
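
The search order described earlier (current topic, then keyword-selected topics, then a grunt plus a gambit) can be sketched over such topics as follows. This is a simplified illustration under several assumptions of mine: the Topic class, react_in and respond are hypothetical names, every pattern is reduced to a plain set of words matched with ANY semantics, the grunt is a fixed string, and the first gambit is always reused.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    name: str
    keywords: set
    gambits: list = field(default_factory=list)
    questions: list = field(default_factory=list)   # (pattern word set, reply)
    statements: list = field(default_factory=list)  # (pattern word set, reply)

def react_in(topic, words, is_question):
    """Return the first reply whose pattern shares a word with the input."""
    rules = topic.questions if is_question else topic.statements
    for pattern, reply in rules:
        if pattern & words:
            return reply
    return None

def respond(topics, current, words, is_question, grunt="Hmm."):
    """Try the current topic, then keyword-selected topics, then grunt + gambit."""
    reply = react_in(topics[current], words, is_question)
    if reply is not None:
        return current, reply
    for name, topic in topics.items():
        if name != current and topic.keywords & words:
            reply = react_in(topic, words, is_question)
            if reply is not None:
                return name, reply  # assumption: the matched topic becomes current
    gambits = topics[current].gambits
    follow = gambits[0] if gambits else ""
    return current, (grunt + " " + follow).strip()
```

The function returns the topic that answered along with the reply, so a caller can decide whether to switch topics; the article does not say how the real engine handles that.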

Patterns are described in the parens of an s or ? line. By default, a collection of words means find ANY of those words in the input (an implicit synonym set). You can use capitalized keywords to force different relationships. AND means both have to occur, in any order. THEN means they have to occur in the order given. NEXT means they have to occur consecutively, in the order given.

E.g., the last question line matches "Do you love this movie" as well as "Do you absolutely love this movie" (notice the THEN instead of a NEXT). It also matches "You love this movie, don't you?" and "This is a movie you love, isn't it?" Of course it also inappropriately matches "Do you love Sigourney Weaver in this movie?" but you can't have everything. It's not worth trying to block that.
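
One way to picture the four relationships is as operations on token positions. The sketch below is my own illustration, not CHAT-L code: patterns are written as nested Python tuples instead of being parsed from the text syntax, words are compared literally with no canonical forms, and the names spans and matches are hypothetical.

```python
from itertools import product

def spans(pattern, tokens):
    """Return the set of (start, end) token spans where the pattern matches."""
    if isinstance(pattern, str):                     # a single word
        return {(i, i + 1) for i, t in enumerate(tokens) if t == pattern}
    op, *parts = pattern
    found = [spans(p, tokens) for p in parts]
    if op == "ANY":                                  # any one alternative
        return set().union(*found)
    if op not in ("AND", "THEN", "NEXT"):
        raise ValueError(f"unknown operator: {op}")
    result = found[0]
    for nxt in found[1:]:
        pairs = list(product(result, nxt))
        if op == "AND":                              # both present, any order
            result = {(min(a[0], b[0]), max(a[1], b[1])) for a, b in pairs}
        elif op == "THEN":                           # in order, gaps allowed
            result = {(a[0], b[1]) for a, b in pairs if a[1] <= b[0]}
        else:                                        # NEXT: in order, adjacent
            result = {(a[0], b[1]) for a, b in pairs if a[1] == b[0]}
    return result

def matches(pattern, tokens):
    """True if the pattern matches anywhere in the token list."""
    return bool(spans(pattern, tokens))
```

Writing the pattern above as ("AND", ("THEN", "you", "love"), "movie"), matches returns True for the tokens of "do you absolutely love this movie" and of "this is a movie you love", while ("NEXT", "you", "love") rejects the first of those because "absolutely" sits between the two words.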

If I trusted the user to enter grammatical input, I could depend on the parser to tell me the real subject, verb, and object of the sentence, and I could put those as test conditions. But user chat is sloppy and often terse.

You can nest parentheticals, to create subpattern levels. The pattern ( (aliens ridley) AND ~movie) would allow a choice of aliens or ridley and some word from the ~movie collection. In this case, the subpattern acts as a local synonym set, but it could also use relationships other than OR.

The following questions:

What color is your hair? Are you a redhead? Are you blond or brunette?

can be handled by this reactor:

?: ( (%type=whatquestion AND I AND hair AND color)
( I AND be AND (blonde brunette redhead )) ) I've been blonde, but my natural color is brunette.

That is, each line gives an alternate form, and both forms share the same final answer.

Notice that the test for I is canonical, and covers "I" and "me" and "myself" and "mine" and "my". Similarly, be tests all of the conjugations of "to be" in any tense. Because I am using a dictionary and supplying alternate forms, if you use a base form in the pattern (or in a synonym set), it will match any form of the corresponding word. If you use a non-base form, it only matches that form.

So directed will only match the word directed, whereas had I said direct it would have matched direct or directed or directing. If you single-quote a word, it means just literally use that word, so if you only want a base form you could write 'direct to get that.
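
These three rules (base form, non-base form, single-quoted word) fit in a few lines of Python. Again this is only a sketch: BASE_FORM is a hypothetical mini-dictionary standing in for the real one, and word_matches is a name I made up for illustration.

```python
# Hypothetical mini-dictionary mapping each inflected form to its base form.
BASE_FORM = {
    "directed": "direct", "directing": "direct", "directs": "direct",
    "am": "be", "is": "be", "are": "be", "was": "be", "were": "be", "been": "be",
    "me": "i", "my": "i", "mine": "i", "myself": "i",
}

def word_matches(pattern_word, input_word):
    """Apply the three matching rules to one pattern word and one input word."""
    p, w = pattern_word.lower(), input_word.lower()
    if p.startswith("'"):            # quoted: literal match only
        return w == p[1:]
    if p in BASE_FORM:               # non-base form: matches only itself
        return w == p
    return BASE_FORM.get(w, w) == p  # base form: matches any form of the word
```

So word_matches("direct", "directing") is True, word_matches("directed", "direct") is False, and word_matches("'direct", "directed") is False because the quote demands the literal word.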

 
 
Comments

Mike Rozak
What you really need to do is include probabilities in AIML. For example, instead of the synonym "George Bush" -> "George W Bush", include a probability of the synonym being correct, such as 90%. Likewise, "George" -> "George W Bush" might have a 1% chance. You might also have "George" -> "George Washington" with a 2% chance.

Also associate a probability with the context. If a player asks, "Does George like flying in Air Force One?", this will be parsed to "Does George W Bush like flying in Air Force One?" (1%) as well as "Does George Washington like flying in Air Force One?" (2%). However, some context logic will know that George W Bush is associated with Air Force One, and have a higher probability for the context (90% context probability for a modern president, with 1% probability for anyone else).

Then, a combination of sentence-parse probabilities and context probabilities (1% x 90% = 0.9% vs. 2% x 1% = 0.02%) can disambiguate the meaning of a statement. This is a common speech recognition trick. (So you might want to learn about speech recognition, Viterbi searches, and Hidden Markov Models.)

I've already implemented this and am using it in my game, http://www.CircumReality.com .

You might find its use of text-to-speech interesting too. You'll find that your AIML tags for responses are completely inadequate, and need to include facial emotions, spoken emotions, and nuanced prosody.

You'll also find that hand-coding millions of responses isn't worth the work. Most of what players want to ask is more procedural, such as "Where is the nearest merchant/guard/toilet?" and "Did you see where Frank went before the murder occurred?"

Kyle laozhao
Well, the dialog is very interesting.
Very interesting!

You will find more at http://sglab.cn/blog

Meng Mao
@Mike Rozak

Yeah, but your game has dialog like this:
http://www.circumreality.com/ScreenPreRelease4b.jpg

Mike Rozak
@Meng Mao

If you send me E-mail, I'll go into detail... but basically, without a mostly menu-driven dialogue system, players don't know what to say and/or get into ye-olde "guess the verb" problems that Zork and other IF often have.

