Beyond AIML to CHAT-L
So what would I do? I wouldn't go down the regular
expression path, because I, too, want to easily read my patterns. And regular
expressions are overkill, it seems to me. Nor would I require the pattern span
the entire input. I am much more keyword-oriented. And I want a much clearer
organization and independence of topic knowledge. So I store data as topics.
The description below is an overview, not intended to be a specification,
covering every capability of the system. But it should give you a flavor of
something different from -- but similar in ability to -- AIML.
The underlying
CHAT-L engine takes your input and provides both the raw input and remapped
canonical forms of words (e.g., plural nouns map back to their singular form
and verb forms move to a canonical form), joining idiomatic phrases and proper
names into single underscored words.
The engine also decides if the input was a
statement or a question. AIML doesn't directly address sentence type. AIML
requires you to strip out punctuation, not allowing any in patterns, and makes you
distinguish statements from questions by word order. But subject
inversion (Do I) is an unreliable question marker given things like tag
questions (I do, don't I?).
CHAT-L then hunts to
find a way to react. It starts in the
current topic, trying to find a matching question or statement reactor. If that
fails, it will use keywords in the input to find some other topic with a
matching question or statement reactor. If that fails (and ignoring a few
other complications) it will make some generic grunt from the grunt topic in
response to your input, and then proceed with a gambit line from the current
topic. E.g.,

synonym: ~movie (film video)
topic: ~ALIENSMOVIE ( Aliens Sigourney)
g: The Aliens movies starred Sigourney Weaver.
?: (~plot) Alien creatures hatch inside humans.
?: ( actress heroine star) Sigourney Weaver starred as Ripley.
?: ( director directed ) Ridley Scott directed.
?: ( you THEN love AND movie) Yes, I love this movie.
s: ( Aliens AND ~movie) I have seen the movie Aliens.

A synonym line defines a name (~movie) and a
list of words associated with it, including the name itself sans the ~. So in
the above example, ~movie maps to movie and film and video.
A topic declaration defines a synonym set (~ALIENSMOVIE)
and associated words (Aliens Sigourney),
but these words are also keywords that enable access into this topic. The topic declaration also stores a
collection of kinds of activities for the topic. The "s:" lines react
to statements. The "?:" lines react to questions. The "g:"
lines are gambits that can be issued spontaneously.
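To make the search order described earlier concrete, here is a rough Python sketch. It is not the actual CHAT-L engine: the Topic class, the respond function, and the simplification that a reactor fires when any of its trigger words appears are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    """Hypothetical stand-in for a CHAT-L topic."""
    name: str
    keywords: set
    gambits: list = field(default_factory=list)
    # Each reactor is (kind, trigger_words, reply); kind is "?" or "s".
    reactors: list = field(default_factory=list)

    def react(self, words, kind):
        # Simplified: a reactor fires if ANY trigger word is in the input.
        for r_kind, triggers, reply in self.reactors:
            if r_kind == kind and triggers & words:
                return reply
        return None

def respond(words, kind, current, topics, grunts):
    """words: set of lowercased input words; kind: "?" or "s"."""
    # 1. Try the current topic first.
    reply = current.react(words, kind)
    if reply:
        return reply
    # 2. Use keywords in the input to find some other topic that can react.
    for topic in topics:
        if topic is not current and topic.keywords & words:
            reply = topic.react(words, kind)
            if reply:
                return reply
    # 3. Fall back to a generic grunt, then a gambit from the current topic.
    gambit = current.gambits[0] if current.gambits else ""
    return (grunts[0] + " " + gambit).strip()

aliens = Topic(
    "~ALIENSMOVIE", {"aliens", "sigourney"},
    gambits=["The Aliens movies starred Sigourney Weaver."],
    reactors=[("?", {"director", "directed"}, "Ridley Scott directed."),
              ("s", {"aliens"}, "I have seen the movie Aliens.")],
)
weather = Topic("~WEATHER", {"rain"}, gambits=["It looks like rain."])

print(respond({"who", "directed", "aliens"}, "?", weather, [aliens, weather], ["Hmm."]))
print(respond({"nice", "day"}, "s", aliens, [aliens, weather], ["Hmm."]))
```

The first call finds nothing in the current (weather) topic, so the keyword aliens pulls in the other topic. The second call matches nothing anywhere, so it falls through to a grunt plus a gambit.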
Patterns are described in the parens of an s or ?
line. By default, a collection of words means find ANY of those words in the
input (an implicit synonym set). You can use capitalized keywords to force
different relationships. AND means both have to occur in any order. THEN means
they have to occur in the order given. NEXT means they have to occur in
consecutive order.
E.g., the last question
line matches "Do you love this movie" as well as "Do you absolutely love
this movie" (notice the THEN instead of a NEXT). It also matches "You love
this movie, don't you?" and "This is a movie you love, isn't it?" Of
course it also inappropriately matches "Do you love Sigourney Weaver in this
movie?" but you can't have everything. It's not worth trying to block that.
If I trusted the user to enter grammatical input, I could depend on the parser
to tell me the real subject, verb, and object of the sentence, and I could put
those as test conditions. But user chat is sloppy and often terse.
You can nest parentheticals to create subpattern levels. The
pattern ( (aliens ridley) AND ~movie) would allow a choice of
aliens or ridley and some word from the ~movie collection. In this case, the
subpattern acts as a local synonym set, but it could also use relationships
other than OR.
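The pattern relationships above (the default ANY, plus AND, THEN, NEXT, and nesting) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CHAT-L's implementation: patterns are written as nested tuples, the tokenize, positions, and matches functions are hypothetical, the ~movie set is expanded by hand, and no canonical-form mapping is done. The grouping of "you THEN love AND movie" as (you THEN love) AND movie is also an assumption, chosen because the text says the pattern matches "This is a movie you love".

```python
import re

def tokenize(text):
    # Lowercase and keep only letters and apostrophes; punctuation is dropped.
    return re.findall(r"[a-z']+", text.lower())

def positions(pattern, words):
    """Return the set of word positions where the pattern can end (empty = no match)."""
    if isinstance(pattern, str):
        return {i for i, w in enumerate(words) if w == pattern}
    op, *args = pattern
    sets = [positions(a, words) for a in args]
    if op == "ANY":    # any one operand is enough
        return set().union(*sets)
    if op == "AND":    # every operand must occur, in any order
        return set().union(*sets) if all(sets) else set()
    if op in ("THEN", "NEXT"):
        ends = sets[0]
        for s in sets[1:]:
            if op == "NEXT":   # must come immediately after the previous operand
                ends = {p for p in s if p - 1 in ends}
            else:              # must come somewhere after the previous operand
                ends = {p for p in s if ends and p > min(ends)}
        return ends
    raise ValueError("unknown operator: " + op)

def matches(pattern, text):
    return bool(positions(pattern, tokenize(text)))

LOVE_MOVIE = ("AND", ("THEN", "you", "love"), "movie")
NESTED = ("AND", ("ANY", "aliens", "ridley"), ("ANY", "movie", "film", "video"))

print(matches(LOVE_MOVIE, "Do you absolutely love this movie"))            # True
print(matches(("NEXT", "you", "love"), "Do you absolutely love this movie"))  # False
print(matches(LOVE_MOVIE, "This is a movie you love, isn't it?"))          # True
print(matches(NESTED, "Did Ridley make that film?"))                       # True
```

Pattern words must be lowercase in this sketch because the input is lowercased before matching.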
The following questions:

What color is your hair? Are you a redhead? Are you blond or brunette?

can be handled by this reactor:

?: ( (%type=whatquestion AND I AND hair AND color)
     ( I AND be AND (blonde brunette redhead)) ) I've been blonde, but my natural color is brunette.

That is, alternate forms are shown on each line, using the
same final answer.
Notice that the testing for I is canonical, and covers I
and me and myself and mine and my. Similarly be tests all of the
conjugations of to be in any tense. Because I am using a dictionary and
supplying alternate forms, if you use a base form in the pattern (or in a
synonym set), it will match any form of corresponding word. If you use a
non-base form, it only matches that form.
So "directed" will only match the word "directed", whereas had I said
"direct" it would have matched "direct" or "directed" or "directing". If you
single-quote a word, it means just literally use that word, so if you only want
a base form you could write 'direct to get that.
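The base-form rule can be illustrated with a small Python sketch. The CANONICAL table and the word_matches function are hypothetical; a real engine would consult a full dictionary rather than this handful of hand-entered words.

```python
# Tiny hand-made stand-in for a dictionary: inflected form -> base form.
CANONICAL = {
    "directed": "direct", "directing": "direct", "directs": "direct",
    "me": "I", "my": "I", "mine": "I", "myself": "I",
    "am": "be", "is": "be", "are": "be", "was": "be", "were": "be", "been": "be",
}

def word_matches(pattern_word, raw_word):
    """Does a single pattern word match a single raw input word?"""
    if pattern_word.startswith("'"):
        # Single-quoted: match only the literal word.
        return raw_word == pattern_word[1:]
    is_base_form = CANONICAL.get(pattern_word, pattern_word) == pattern_word
    if is_base_form:
        # Base form in the pattern: match any form of the word.
        return CANONICAL.get(raw_word, raw_word) == pattern_word
    # Non-base form in the pattern: match only that exact form.
    return raw_word == pattern_word

print(word_matches("direct", "directing"))    # True
print(word_matches("directed", "directing"))  # False
print(word_matches("'direct", "directed"))    # False
print(word_matches("I", "my"))                # True
```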
Also associate a probability with the context. If a player asks, "Does George like flying in Air Force One?", this will be parsed to "Does George W Bush like flying in Air Force One?" (1%) as well as "Does George Washington like flying in Air Force One?" (2%). However, some context logic will know that George W Bush is associated with Air Force One, and have a higher probability for the context (90% context probability for a modern president, with 1% probability for anyone else).
Then, a combination of sentence-parse probabilities and context probabilities (1% x 90% = 0.9% vs. 2% x 1% = 0.02%) can disambiguate the meaning of a statement. This is a common speech recognition trick. (So you might want to learn about speech recognition, Viterbi searches, and Hidden Markov Models.)
I've already implemented this and am using it in my game, http://www.CircumReality.com .
You might find its use of text-to-speech interesting too. You'll find that your AIML tags for responses are completely inadequate, and need to include facial emotions, spoken emotions, and nuanced prosody.
You'll also find that hand-coding millions of responses isn't worth the work. Most of what players want to ask is more procedural, such as "Where is the nearest merchant/guard/toilet?" and "Did you see where Frank went before the murder occurred?"
Very interesting!
You will find more at http://sglab.cn/blog
Yeah, but your game has dialog like this:
http://www.circumreality.com/ScreenPreRelease4b.jpg
If you send me E-mail, I'll go into detail... but basically, without a mostly menu-driven dialogue system, players don't know what to say and/or get into ye-olde "guess the verb" problems that Zork and other IF often have.