Beyond Façade: Pattern Matching for Natural Language Applications
March 15, 2011 Page 5 of 5
Something that ChatScript can do and Façade had no need for is reflection -- the system can use its abilities on itself. It can track its decisions and can pattern match on what it says.
Suzette uses reflection to set up information to decide if the user responds directly to a question she poses. If she says "How many people..." then if the user input has a number or a word representing a number like few, she can recognize that the user answered the question and just continue in the topic. But if the user responds with something off the wall, she may use keywords in his sentence to try to switch to a new topic.
One could implement this by putting appropriate continuations after every output involving questions. That would be tedious. Instead it's all managed by the control script. It only requires a small amount of script.
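The idea can be illustrated outside ChatScript. The following is a minimal Python sketch, not Suzette's actual control script: the function names, the outcome labels, and the tiny number-word list are all assumptions made for the example. It inspects the bot's own previous output to see whether a "how many" question is pending, then checks whether the user's input contains a number or a number word.

```python
import re

# Hypothetical, deliberately tiny list of number words; a real system
# would use a much larger concept set.
NUMBER_WORDS = {"none", "one", "two", "three", "few", "several",
                "many", "dozen", "hundred"}

def expects_number(bot_output):
    """Reflect on the bot's own output: did it ask a 'how many' question?"""
    return re.search(r"\bhow many\b", bot_output.lower()) is not None

def answers_number_question(user_input):
    """True if the input contains a digit string or a known number word."""
    tokens = re.findall(r"[a-z]+|\d+", user_input.lower())
    return any(tok.isdigit() or tok in NUMBER_WORDS for tok in tokens)

def next_move(last_bot_output, user_input):
    """Decide what to do, based on what the bot itself said last."""
    if not expects_number(last_bot_output):
        return "no_pending_question"
    if answers_number_question(user_input):
        return "continue_topic"      # the user answered the question
    return "try_topic_switch"        # off-the-wall reply: look for keywords
```

Because the check runs once per volley against whatever the bot said, no individual question rule needs its own continuation.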
The control script for ChatScript is itself just a topic, and topics can invoke other topics. So you can define whatever control structure you want and even switch control structures on the fly. A chatbot can simulate becoming drunk, being drunk, and recovering, by adjusting its control data.
The control script can also do multiple passes of processing with different collections of rules, generating multiple outputs and/or storing away facts about the user and itself.
The fixed control strategy of ChatScript is:
a. Execute an author-specified topic before processing any of the user's input.
b. Execute an author-specified topic for each user sentence in input.
c. Execute an author-specified topic after processing all of the user's input.
The pre-script allows one to reset variables and prepare for new input.
During the main script analysis of each sentence, Suzette runs topics that: rewrite idiomatic sentences, resolve pronouns, map sentences into discourse acts as data for later pattern matching, learn facts about the user, search for the closest matching response, generate quips and verbal stalls, etc.
The post-script allows the system to analyze all of what the user said and what the chatbot responded with. Suzette makes heavy use of post-script. She decides if she changed topics and if so, inserts a transitional sentence. She looks to see if she ended up asking the user a question and prepares data to help her recognize if the user answers it appropriately or not. And she considers the user's input again briefly, to decide if she should inject an emotional reaction in addition to the actual response. So if you say "What is 2 + 2, dummy?" she can reply:
Who are you calling a dummy? It's 4.
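The three-phase control strategy described above (pre-script, per-sentence script, post-script) can be sketched in Python. This is an illustrative toy under stated assumptions, not ChatScript code: the `state` dictionary, the function names, and the single arithmetic rule are all invented for the example.

```python
import re

def pre_script(state):
    # Reset per-volley variables before any input is processed.
    state["replies"] = []
    state["insulted"] = False

def main_script(state, sentence):
    # Toy stand-in for the per-sentence topics: handles only "what is A + B"
    # and notes whether the user used an insult.
    text = sentence.lower()
    m = re.search(r"what is (\d+) \+ (\d+)", text)
    if m:
        total = int(m.group(1)) + int(m.group(2))
        state["replies"].append(f"It's {total}.")
    if "dummy" in text:
        state["insulted"] = True

def post_script(state):
    # Look back over the whole volley and, if warranted, put an
    # emotional reaction in front of the actual response.
    if state["insulted"]:
        state["replies"].insert(0, "Who are you calling a dummy?")

def respond(user_input):
    state = {}
    pre_script(state)
    # Split the input into sentences on terminal punctuation.
    for sentence in re.split(r"(?<=[.?!])\s+", user_input.strip()):
        if sentence:
            main_script(state, sentence)
    post_script(state)
    return " ".join(state["replies"])

print(respond("What is 2 + 2, dummy?"))
```

The emotional reaction is decided only after the factual answer exists, which is why it belongs in the post-script phase rather than in the arithmetic rule.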
In ChatScript you choose how you want pronouns handled. One can duplicate the AIML style of resolution, which means you author pronoun values at the time you author the output. Or, like Suzette, you can automate pronoun resolution, using post-script to analyze the output and compute pronoun values.
The Loebner Competition is the annual Turing Test for chatbots. Human judges chat blindly with a human confederate (who is trying to be human) and a chatbot (who is trying to fool the judge into voting it as the human). In 2010, bot entrants had to take a human knowledge quiz to qualify for the main competition. Here were the 4 finalists:
#1 Suzette - Bruce Wilcox - 11 pts. (first-time entry)
#2 A.L.I.C.E. - Richard Wallace - 7.5 pts. (2000, 2001, 2004 Loebner winner)
#3 Cleverbot - Rollo Carpenter - 7 pts. (2005, 2006 Loebner winner)
#4 Ultra Hal - Robert Medeksza - 6.5 pts. (2007 Loebner winner)
Why did Suzette, a newcomer started two years ago, easily out-qualify multi-year winners?
The nature of the test (not the exact questions) was published in advance.
1. Questions relating to time, e.g., What time is it? Is it morning, noon, or night?
2. General questions of things, e.g., What would I use a hammer for? Of what use is a taxi?
3. Questions relating to relationships, e.g., Which is larger, a grape or a grapefruit?
John is older than Mary, and Mary is older than Sarah. Which of them is the oldest?
4. Questions demonstrating "memory", e.g.,
Given: I have a friend named Harry who likes to play tennis. Then:
What is the name of the friend I just told you about?
Do you know what game Harry likes to play?
The "general questions of things" category alone implies tens of thousands of facts about nouns, plus perhaps dozens of patterns that all really mean "What does one use an xxx for?"
So what? It's just data. A.L.I.C.E. has hand-entered rules with no good ability to store or retrieve data. Cleverbot has a large database mined from human chat, but it is unsuitable for this test. Therein lies a ChatScript advantage: it is easy to enter and retrieve data. I created topics for each of the areas of the test, including a table on objects and their functions:
table: :tool (^what ^class ^used_to_verb ^used_to_object ^used_to_adverb ^use )
-- table processing code not shown ---
[table coffee_table folding_table end_table] furniture rest objects * "eat meals on"
[ladder step_ladder] amplifier reach * higher "reach high areas"
The table handled:
Actual Test: What would I do with a knife?
Answer: A knife is used to cut food.
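To show how a table of this kind can drive an answer, here is a rough Python analogue. It is a sketch under assumptions, not the unshown table-processing code: only the name and use columns are kept, the `knife` row is hypothetical (the excerpt does not show it), and the two question patterns are invented for the example.

```python
import re

# (names, use) pairs modeled on the table rows above; other columns omitted.
ROWS = [
    (["table", "coffee_table", "folding_table", "end_table"], "eat meals on"),
    (["ladder", "step_ladder"], "reach high areas"),
    (["knife"], "cut food"),  # hypothetical row, not shown in the excerpt
]

# Flatten so that every synonym maps to its use phrase.
USES = {name: use for names, use in ROWS for name in names}

# A few of the "dozens of patterns" that all ask what an object is for.
QUESTION_PATTERNS = [
    re.compile(r"what would i (?:do with|use) an? ([a-z_ ]+?)(?: for)?\?"),
    re.compile(r"of what use is an? ([a-z_ ]+?)\?"),
]

def answer(question):
    """Return a sentence describing the object's use, or None if unknown."""
    text = question.lower().strip()
    for pattern in QUESTION_PATTERNS:
        m = pattern.fullmatch(text)
        if m:
            name = m.group(1).replace(" ", "_")
            use = USES.get(name)
            if use:
                noun = name.replace("_", " ")
                article = "An" if noun[0] in "aeiou" else "A"
                return f"{article} {noun} is used to {use}."
    return None

print(answer("What would I do with a knife?"))
```

The point of the design is that adding knowledge means adding a data row, while the small set of question patterns stays unchanged.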
Suzette's topic for memorizing facts the user said fielded:
Actual Test: My friend Bob likes to play tennis. What game does Bob like to play?
Answer: The game is tennis.
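A memory exchange like this one amounts to storing a fact from one sentence and retrieving it for a later one. The Python sketch below illustrates that under stated assumptions: the fact store, the `"plays"` relation name, and both regular expressions are hypothetical, and Suzette's real topic is certainly more general.

```python
import re

# Hypothetical patterns for one kind of statement and one kind of question.
STATEMENT = re.compile(r"my friend (\w+) likes to play (\w+)", re.IGNORECASE)
QUESTION = re.compile(r"what game does (\w+) like to play", re.IGNORECASE)

def hear(sentence, facts):
    """Store a fact from a statement, or answer a question from stored facts.

    `facts` maps (person, relation) to a value. Returns a reply string,
    or None when there is nothing to say.
    """
    m = STATEMENT.search(sentence)
    if m:
        facts[(m.group(1).lower(), "plays")] = m.group(2).lower()
        return None
    m = QUESTION.search(sentence)
    if m:
        game = facts.get((m.group(1).lower(), "plays"))
        if game:
            return f"The game is {game}."
    return None

facts = {}
hear("My friend Bob likes to play tennis.", facts)
print(hear("What game does Bob like to play?", facts))
```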
And Suzette has a broad math topic (55 rules) which allows her to calculate, count up and down, and handle some simple algebra and story math problems. She correctly fielded:
Actual Test: What number comes after twelve?
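Counting questions of this kind reduce to mapping a number word to a value and back. The sketch below is a small Python illustration, not one of Suzette's 55 math rules; the word list (zero through twenty), the function names, and the single question pattern are assumptions made for the example.

```python
import re

WORDS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen", "twenty"]
VALUE = {word: i for i, word in enumerate(WORDS)}

def to_int(token):
    """Convert a digit string or a known number word to an int, else None."""
    if token.isdigit():
        return int(token)
    return VALUE.get(token)

def to_text(n):
    """Spell out small numbers; fall back to digits outside the word list."""
    return WORDS[n] if 0 <= n < len(WORDS) else str(n)

def neighbor(question):
    """Answer 'What number comes after/before X?', or return None."""
    m = re.search(r"what number comes (after|before) (\w+)", question.lower())
    if not m:
        return None
    n = to_int(m.group(2))
    if n is None:
        return None
    return to_text(n + 1 if m.group(1) == "after" else n - 1)

print(neighbor("What number comes after twelve?"))
```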
But a human knowledge test is not chat. Success in the qualifiers might not translate into success against human judges in the actual contest. In 2009 the judges had a mere 10 minutes to tell the human from the computer in each pair. In 2010 they had 25 minutes. Despite that, Suzette fooled one of the four judges into thinking she was the human and won the competition.
A large amount of data is critical to handling natural language. But data isn't everything. Cleverbot came in third in the Loebner Competition despite 45 million lines of automatically acquired chat, because it lacked a good mechanism for appropriate retrieval. Old-style AIML-based A.L.I.C.E. took second place with only 120K rules. ChatScript-based Suzette, the winner, has somewhat more data than A.L.I.C.E. but far more compact and accurate retrieval.
ChatScript supports a simple visual syntax and powerful pattern-matching features. Matching concepts instead of single words and being able to specify words that must not be in the input allow you to approximate patterns of meaning. Using ranged wildcards limits false positives. Subdividing rules into topics accessible via keywords makes large collections of rules efficient to search and makes it easy to author orthogonal content. And that is only a better start.
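The three matching features named above (concepts, excluded words, and ranged wildcards) can be illustrated with a toy matcher in Python. This is a sketch loosely modeled on the ideas, not ChatScript's engine: the list-based pattern notation, the concept sets, and the choice to let a pattern start at any token are assumptions of the example.

```python
# Hypothetical concept sets: one name stands for many words.
CONCEPTS = {
    "~like": {"like", "love", "adore", "enjoy"},
    "~food": {"pizza", "pasta", "sushi"},
}

def matches(pattern, tokens):
    """True if the pattern matches starting at some token position.

    Pattern elements:
      "~name"  any word in the named concept
      "!word"  the word must not appear anywhere in the input
      "*~N"    a gap of zero to N arbitrary words (ranged wildcard)
      other    a literal word
    """
    banned = {p[1:] for p in pattern if p.startswith("!")}
    if banned & set(tokens):
        return False
    elements = [p for p in pattern if not p.startswith("!")]
    return any(_match_from(elements, tokens, start)
               for start in range(len(tokens) + 1))

def _match_from(elements, tokens, pos):
    if not elements:
        return True
    head, rest = elements[0], elements[1:]
    if head.startswith("*~"):
        max_gap = int(head[2:])
        # Try every allowed gap length that stays inside the input.
        return any(_match_from(rest, tokens, pos + skip)
                   for skip in range(max_gap + 1)
                   if pos + skip <= len(tokens))
    if pos >= len(tokens):
        return False
    token = tokens[pos]
    if head.startswith("~"):
        ok = token in CONCEPTS.get(head, set())
    else:
        ok = token == head
    return ok and _match_from(rest, tokens, pos + 1)

PATTERN = ["!not", "i", "*~2", "~like", "*~2", "~food"]
print(matches(PATTERN, "i really love pizza".split()))
```

With this pattern, "i do not love pizza" is rejected by the excluded word, and "i went to the store and love pizza" is rejected because the gap between "i" and the liking verb exceeds two words, which is how a bounded wildcard cuts down false positives.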
www.personalityforge.com (a good alternative to AIML but privately hosted)