Gamasutra: The Art & Business of Making Games
Beyond AIML: Chatbots 102

August 14, 2008, page 4 of 6

Beyond AIML to CHAT-L

So what would I do? I wouldn't go down the regular expression path, because I, too, want to read my patterns easily, and regular expressions seem like overkill to me. Nor would I require that the pattern span the entire input; I am much more keyword-oriented. I also want much clearer organization and independence of topic knowledge, so I store data as topics. The description below is an overview, not a specification covering every capability of the system, but it should give you a flavor of something different from -- but similar in ability to -- AIML.

The underlying CHAT-L engine takes your input and provides both the raw input and remapped canonical forms of words (e.g., plural nouns map back to their singular form and verb forms move to a canonical form), joining idiomatic phrases and proper names into single underscored words.
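As a rough illustration of that preprocessing step, here is a minimal Python sketch. The tiny CANONICAL and PHRASES tables and the function names are hypothetical stand-ins for the engine's real dictionary, which the article does not show; the sketch only demonstrates the idea of returning both the raw words and their canonical forms.

```python
import re

# Hypothetical mini-lexicon; a real engine would consult a full dictionary.
CANONICAL = {
    "movies": "movie",     # plural noun -> singular
    "starred": "star",     # verb form -> base form
    "directed": "direct",
    "is": "be",
    "are": "be",
}

# Hypothetical table of multi-word names to join with an underscore.
PHRASES = {("sigourney", "weaver"): "Sigourney_Weaver"}


def join_phrases(words):
    """Merge known two-word phrases into single underscored words."""
    joined, i = [], 0
    while i < len(words):
        pair = tuple(w.lower() for w in words[i:i + 2])
        if pair in PHRASES:
            joined.append(PHRASES[pair])
            i += 2
        else:
            joined.append(words[i])
            i += 1
    return joined


def canonicalize(text):
    """Return (raw words, canonical words) for one input sentence."""
    raw = join_phrases(re.findall(r"[A-Za-z']+", text))
    canonical = [CANONICAL.get(w.lower(), w.lower()) for w in raw]
    return raw, canonical


raw, canonical = canonicalize("Sigourney Weaver starred in the movies")
print(raw)
print(canonical)
```

Keeping both lists lets a pattern choose between matching the word as typed and matching its base form.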

The engine also decides whether the input was a statement or a question. AIML doesn't directly address sentence type. It requires you to strip out punctuation, allows none in patterns, and leaves you to tell statements from questions by word order. But subject inversion (Do I) is an unreliable question marker, given things like tag questions (I do, don't I?).
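To see why word order alone falls short, consider this small Python sketch. It is not how CHAT-L decides (the article doesn't say); the auxiliary list and both function names are assumptions made up for illustration. It simply compares an inversion-only test with one that also keeps the final punctuation.

```python
# Hypothetical, incomplete list of auxiliaries that can open an inverted question.
AUXILIARIES = {"do", "does", "did", "is", "are", "was", "were",
               "can", "could", "will", "would", "have", "has"}


def looks_inverted(text):
    """Word-order test only: does the sentence open with an auxiliary?"""
    words = text.lower().replace(",", " ").split()
    return bool(words) and words[0] in AUXILIARIES


def is_question(text):
    """Assumed heuristic: trust a trailing '?', else fall back to word order."""
    stripped = text.strip()
    return stripped.endswith("?") or looks_inverted(stripped)


for sentence in ["Do I look tired", "I do, don't I?", "I like Aliens."]:
    print(sentence, looks_inverted(sentence), is_question(sentence))
```

The tag question opens with "I", so the inversion test misses it, while the punctuation-aware test still flags it as a question.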

CHAT-L then hunts to find a way to react. It starts in the current topic, trying to find a matching question or statement reactor. If that fails, it will use keywords in the input to find some other topic with a matching question or statement reactor. If that fails (ignoring a few other complications), it will make some generic grunt from the grunt topic in response to your input, and then proceed with a gambit line from the current topic. E.g.,

synonym: ~movie (film video)

topic: ~ALIENSMOVIE ( Aliens Sigourney)
g: The Aliens movies starred Sigourney Weaver.
?: (~plot) Alien creatures hatch inside humans.
?: ( actress heroine star) Sigourney Weaver starred as Ripley.
?: ( director directed ) Ridley Scott directed.
?: ( you THEN love AND movie) Yes, I love this movie.
s: ( Aliens AND ~movie) I have seen the movie Aliens.

A synonym line defines a name (~movie) and a list of words associated with it, including the name itself sans the ~. So in the above example, ~movie maps to movie and film and video.
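A synonym line of this shape is easy to model. The Python sketch below parses one into a name and a word set, adding the name itself without the ~ as described. The parse_synonym function and its regular expression are my own assumptions about the syntax, based only on the single example above.

```python
import re


def parse_synonym(line):
    """Parse 'synonym: ~name (word word ...)' into (name, set of words)."""
    m = re.fullmatch(r"\s*synonym:\s*(~\w+)\s*\(([^)]*)\)\s*", line)
    if m is None:
        raise ValueError(f"not a synonym line: {line!r}")
    name = m.group(1)
    words = set(m.group(2).split())
    words.add(name[1:])  # the set includes the name itself, sans the ~
    return name, words


name, words = parse_synonym("synonym: ~movie (film video)")
print(name, sorted(words))
```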

A topic declaration defines a synonym set (~ALIENSMOVIE) and its associated words (Aliens Sigourney); these words also serve as keywords that enable access into the topic. The topic declaration also holds the different kinds of activity for the topic. The "s:" lines react to statements. The "?:" lines react to questions. The "g:" lines are gambits that can be issued spontaneously.
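The lookup order described earlier (current topic first, then any topic reachable by keyword, then a grunt followed by a gambit) can be sketched in Python as follows. This is a simplified model, not the engine's code: the Topic class, the react function, the ~WEATHER topic, and the grunt text are all hypothetical, patterns are reduced to plain any-of keyword sets, and always taking the first gambit is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    keywords: set                                   # words that give access to the topic
    gambits: list = field(default_factory=list)     # "g:" lines
    questions: list = field(default_factory=list)   # "?:" lines as (word set, reply)
    statements: list = field(default_factory=list)  # "s:" lines as (word set, reply)


def find_reply(topic, words, is_question):
    """Return the first reactor reply whose word set overlaps the input."""
    reactors = topic.questions if is_question else topic.statements
    for keys, reply in reactors:
        if keys & words:
            return reply
    return None


def react(current, topics, grunts, words, is_question):
    # 1. Try the current topic.
    reply = find_reply(current, words, is_question)
    if reply is not None:
        return reply
    # 2. Try other topics whose keywords appear in the input.
    for topic in topics:
        if topic is not current and topic.keywords & words:
            reply = find_reply(topic, words, is_question)
            if reply is not None:
                return reply
    # 3. Grunt, then continue with a gambit from the current topic.
    gambit = current.gambits[0] if current.gambits else ""
    return (grunts[0] + " " + gambit).strip()


aliens = Topic("~ALIENSMOVIE", {"aliens", "sigourney"},
               gambits=["The Aliens movies starred Sigourney Weaver."],
               questions=[({"director", "directed"}, "Ridley Scott directed.")])
weather = Topic("~WEATHER", {"rain", "sunny"},      # hypothetical second topic
                gambits=["It has been raining all week."])

print(react(weather, [weather, aliens], ["I see."], {"who", "directed", "aliens"}, True))
print(react(weather, [weather, aliens], ["I see."], {"hello"}, False))
```

The first call leaves the weather topic because "aliens" is a keyword of the other topic; the second finds nothing and falls through to the grunt plus gambit.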

Patterns are described in the parens of an s or ? line. By default, a collection of words means find ANY of those words in the input (an implicit synonym set). You can use capitalized keywords to force different relationships. AND means both have to occur, in any order. THEN means they have to occur in the order given, possibly with other words in between. NEXT means they have to occur consecutively, in the order given.

E.g., the last question line matches "Do you love this movie" as well as "Do you absolutely love this movie" (notice the THEN instead of a NEXT). It also matches "You love this movie, don't you?" and "This is a movie you love, isn't it?" Of course, it also inappropriately matches "Do you love Sigourney Weaver in this movie?" but you can't have everything. It's not worth trying to block that.
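The relationships can be checked against those example sentences with a short Python sketch. The helper names are hypothetical, words are compared literally (no canonical forms), and the pattern is read as (you THEN love) AND movie; the article does not state operator precedence, but that reading is the one consistent with its examples.

```python
import re


def tokenize(text):
    """Lowercase the input and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def positions(tokens, word):
    return [i for i, t in enumerate(tokens) if t == word]


def match_and(tokens, a, b):
    """AND: both words occur, in any order."""
    return a in tokens and b in tokens


def match_then(tokens, a, b):
    """THEN: some occurrence of a comes before some occurrence of b."""
    pa, pb = positions(tokens, a), positions(tokens, b)
    return bool(pa) and bool(pb) and min(pa) < max(pb)


def match_next(tokens, a, b):
    """NEXT: b immediately follows a."""
    return any(i + 1 < len(tokens) and tokens[i + 1] == b
               for i in positions(tokens, a))


def loves_movie(text):
    """Assumed reading of ( you THEN love AND movie)."""
    tokens = tokenize(text)
    return match_then(tokens, "you", "love") and "movie" in tokens


for sentence in ["Do you love this movie",
                 "Do you absolutely love this movie",
                 "This is a movie you love, isn't it?",
                 "Do you love Sigourney Weaver in this movie?"]:
    print(sentence, loves_movie(sentence))
```

Swapping match_then for match_next would reject "Do you absolutely love this movie", which is why the pattern uses THEN.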

If I trusted the user to enter grammatical input, I could depend on the parser to tell me the real subject, verb, and object of the sentence, and I could put those as test conditions. But user chat is sloppy and often terse.

You can nest parentheticals, to create subpattern levels. The pattern ( (aliens ridley) AND ~movie) would allow a choice of aliens or ridley and some word from the ~movie collection. In this case, the subpattern acts as a local synonym set, but it could also use relationships other than OR.
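One way to picture nested patterns is as a small tree evaluated recursively. In the Python sketch below the pattern is hand-built as nested tuples rather than parsed from text, the SYNONYMS table and the matches function are hypothetical, and only the ANY and AND relationships are covered, since THEN and NEXT would also need word positions.

```python
# Hypothetical synonym table, taken from the ~movie example.
SYNONYMS = {"~movie": {"movie", "film", "video"}}


def matches(pattern, tokens):
    """Evaluate a nested pattern against a list of lowercase input words."""
    if isinstance(pattern, str):
        if pattern.startswith("~"):          # reference to a synonym set
            return any(t in SYNONYMS[pattern] for t in tokens)
        return pattern in tokens             # plain word
    op, *parts = pattern
    if op == "ANY":                          # default: any one part matches
        return any(matches(p, tokens) for p in parts)
    if op == "AND":                          # every part matches, in any order
        return all(matches(p, tokens) for p in parts)
    raise ValueError(f"unsupported operator: {op}")


# ( (aliens ridley) AND ~movie) written as a tree.
pattern = ("AND", ("ANY", "aliens", "ridley"), "~movie")

print(matches(pattern, "ridley made a great film".split()))
print(matches(pattern, "aliens scared me".split()))
```

The inner tuple plays the role of the local synonym set; replacing its "ANY" with another operator changes the relationship at that level only.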

The following questions:

What color is your hair? Are you a redhead? Are you blond or brunette?

can be handled by this reactor:

?: ( (%type=whatquestion AND I AND hair AND color)
( I AND be AND (blonde brunette redhead )) ) I've been blonde, but my natural color is brunette.

That is, alternate forms are shown on each line, using the same final answer.

Notice that the test for I is canonical and covers I, me, myself, mine, and my. Similarly, be tests all of the conjugations of to be in any tense. Because I am using a dictionary that supplies alternate forms, if you use a base form in the pattern (or in a synonym set), it will match any form of the corresponding word. If you use a non-base form, it only matches that form.

So directed will only match the word directed, whereas had I said direct it would have matched direct or directed or directing. If you single-quote a word, it means use just that literal word, so if you only want the base form you could write 'direct to get that.
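These three cases (base form, non-base form, single-quoted word) can be captured in a few lines of Python. The FORMS table is a hypothetical fragment of the dictionary, and word_matches is an illustrative name, not part of CHAT-L.

```python
# Hypothetical fragment of the dictionary: base form -> all of its forms.
FORMS = {
    "direct": {"direct", "directs", "directed", "directing"},
    "be": {"be", "am", "is", "are", "was", "were", "been", "being"},
    "i": {"i", "me", "myself", "mine", "my"},
}


def word_matches(pattern_word, token):
    """Does one pattern word accept one input word?"""
    token = token.lower()
    if pattern_word.startswith("'"):
        # Single-quoted: match that literal word only.
        return token == pattern_word[1:].lower()
    key = pattern_word.lower()
    if key in FORMS:
        # Base form: match any form listed in the dictionary.
        return token in FORMS[key]
    # Non-base form: match only itself.
    return token == key


print(word_matches("direct", "directing"))
print(word_matches("directed", "direct"))
print(word_matches("'direct", "directed"))
```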
