Beyond AIML: Chatbots 102
[Industry veteran Wilcox is creating NPC text chatbots for Avatar Reality's Blue Mars, and this technical article discusses his adventures in AI markup language to create effective human-text interaction.]
A beginning chatbots course (Chatbots 101) would teach AIML.
AIML is the AI markup language based on XML, used for making chatbots.
It has
its roots in Dr. Wallace's A.L.I.C.E. chatbot, which has led to PandoraBots and
a lot of spin-off bots based on standardized AIML. As a standard, AIML made it
possible for many people to work on chatbots more easily.
Avatar Reality (www.avatar-reality.com),
a virtual world company built from the ashes of the Square USA Honolulu office,
wants to use chatbots to represent a user while that user is absent from the Blue
Mars world, an online environment built on CryEngine 2 and set on a terraformed Mars. My job is to provide them with the appropriate chatbot
technology.
AIML is one such technology, but for my purposes it is
a woefully inadequate tool, and once again I find myself building a new
scripting language (see Reflections on Building Three Scripting Languages, a
prior Gamasutra article). Hence Chatbots 102.
Historical Perspective
The conversations-with-a-computer genre began in the 1960s
with Eliza, the computer parody of a
Rogerian psychiatrist. Using a mere 53 rules, and by substituting part of the
user's input back into an output question, the program gave an illusion of
interacting as a human.
It always had to ask questions, because that's the
best way to hide the fact that it knew absolutely nothing. Asking questions
takes advantage of the fact that humans readily read their own rich
interpretations into simple questions that have even a vague connection
to the topic at hand.
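The substitution trick described above can be sketched in a few lines of Python. This is a minimal illustration, not Eliza's actual script: the three rules, the pronoun table, the fallback question, and the names `RULES`, `REFLECTIONS`, `reflect`, and `respond` are all hypothetical and chosen only to show the idea of echoing part of the user's input back inside a question.

```python
import re

# Pronoun swaps applied to the echoed fragment, so "my job" comes back as "your job".
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

# Hypothetical rules for illustration only; each pairs a pattern with a question template.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Why do you mention your {0}?"),
]

# Used when no rule matches; still a question, so the program reveals nothing.
FALLBACK = "Can you tell me more about that?"


def reflect(fragment):
    """Swap first- and second-person words in the captured fragment."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(user_input):
    """Return a question built from the first rule that matches the input."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK


if __name__ == "__main__":
    print(respond("I feel sad about my job."))
    print(respond("Hello"))
```

Every reply is a question, and the only "knowledge" involved is the text the user just typed, which is why a small rule set can carry a conversation for a while.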
Eliza evolved into
most of the chatbots of today. They take user input and, using tens of
thousands of rules, generate output. They differ from Eliza in that they usually have some built-in world knowledge, so
that if you ask them if they like chocolate you might get a straight "yes"
answer instead of "Why do you think of chocolate right now?"
In a different vein, meaning-based human text interaction
with computers began with the adventure games of the 1970s (Zork is a great example). Humans were
given an extremely limited grammar and vocabulary and interacted by telling the
system what action to take using what object and got back a canned text
description of what happened. It was very popular because the human had a set
of choices that could be made, had interesting material to read, pretended to
be in a magical setting, and got to solve puzzles.
The modern video game has evolved to where the user controls
an avatar by mouse or joystick and gets visual feedback. They have a limited
real world with physics, though the computer does not generally talk or reason
about it.
But chatbots and meaning-based interaction no longer go hand in
hand in most video games. Still, there is plenty of room for developing
better NPCs.
Meanwhile, a chatbot like A.L.I.C.E. is an expert system. An
expert system can perform interesting to significant tasks when it has a
limited domain and 30,000 to 50,000 rules for a brain. Many more recent
chatbots claim magical AI abilities, but you should take those claims with a
ton of salt. Descriptions of chatbot capabilities and development systems are
usually gross exaggerations.
So where are there pitfalls in building a chatbot? Getting
reliable human interaction out of a computer is a tar pit waiting to trap any
programmer. The quantity of human knowledge is vast.
The Cyc project has spent
over 100 man-years working on just organizing knowledge, and they have a long
way to go. Chatbots do not have to master all that. They attempt to simulate
interesting and vaguely intelligent conversation by other means. Current ones
can be entertaining, but you can quickly make them look stupid. Future ones,
using additional technology, can look slightly less stupid.
So where are there opportunities in building a chatbot? The
dominant chatbots were started almost a decade ago, so their design goals and
decisions were made some time ago. Technology has evolved.
First, hardware has
changed. Machines are faster and storage is massive, and for all practical
purposes, free. Second, in the past decade natural language processing has
created parsers that can often parse a sentence correctly (though this is less
valuable in chat, where input is often ungrammatical).
Third, the internet
provides ubiquitous connections among people, information, and software.
Fourth, extensive amounts of human knowledge have been gathered and made
accessible via the internet (including Google and Wikipedia). Fifth, there have
been a lot of projects in ontologies, knowledge representation, etc. These are
all things that current chatbots do not use to best advantage.
I am currently working on incorporating all these into the
AR chatbot, and maybe I can write about that towards the end of this year. But
this article isn't about those things. It is merely about a better scripting
language than AIML.
Comments
Also associate a probability with the context. If a player asks, "Does George like flying in Air Force One?", this will be parsed to "Does George W Bush like flying in Air Force One?" (1%) as well as "Does George Washington like flying in Air Force One?" (2%). However, some context logic will know that George W Bush is associated with Air Force One, and assign a higher probability to that context (90% context probability for a modern president, with 1% probability for anyone else).
Then, a combination of sentence-parse probabilities and context probabilities (1% x 90% = 0.9% vs. 2% x 1% = 0.02%) can disambiguate the meaning of a statement. This is a common speech recognition trick. (So you might want to learn about speech recognition, Viterbi searches, and Hidden Markov Models.)
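The arithmetic above can be sketched as a small Python function. This is only an illustration of multiplying the two probabilities and picking the larger product; it is not a Viterbi search or a Hidden Markov Model, the scores are not normalized, and the function name `rank_interpretations` and the dictionary layout are assumptions made for the example.

```python
def rank_interpretations(candidates):
    """Return (name, score) pairs sorted best-first.

    candidates maps an interpretation name to a pair
    (parse_probability, context_probability); the score is their product.
    """
    scored = [(name, parse * context)
              for name, (parse, context) in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Numbers taken from the example above: parse probability, then context probability.
candidates = {
    "George W Bush": (0.01, 0.90),
    "George Washington": (0.02, 0.01),
}

ranking = rank_interpretations(candidates)
best_name, best_score = ranking[0]

if __name__ == "__main__":
    for name, score in ranking:
        print(f"{name}: {score:.4%}")
```

Even though the parser alone slightly prefers George Washington, the context probability dominates the product, so George W Bush ranks first.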
I've already implemented this and am using it in my game, http://www.CircumReality.com .
You might find its use of text-to-speech interesting too. You'll find that your AIML tags for responses are completely inadequate, and need to include facial emotions, spoken emotions, and nuanced prosody.
You'll also find that hand-coding millions of responses isn't worth the work. Most of what players want to ask is more procedural, such as "Where is the nearest merchant/guard/toilet?" and "Did you see where Frank went before the murder occurred?"
Very interesting!
You will find more at http://sglab.cn/blog
Yeah, but your game has dialog like this:
http://www.circumreality.com/ScreenPreRelease4b.jpg
If you send me e-mail, I'll go into detail... but basically, without a mostly menu-driven dialogue system, players don't know what to say and/or get into ye-olde "guess the verb" problems that Zork and other IF often have.