Complaint #1: The biggest flaw of AIML is that it is simply too wordy and requires huge numbers of effectively redundant categories.
Since the pattern matching of AIML is so primitive and generic, it takes a lot of category information to perform a single general task. If you want the system to respond to a keyword, either alone or as a prefix, infix, or suffix in a sentence, it takes you four categories to do so, one for each condition (with three of them remapping using SRAI to the fourth ).
<template> Tell me more about your family. </template>
<pattern>* MOTHER *</pattern>
Had AIML used regular expressions for patterns, this could have been reduced to a single category statement. This leads to a critical point. Conciseness is good and having to have multiple flavors of the same rule is bad. The more you write, the harder it is to keep it organized, debug it, etc.
This was a frequent problem for large-scale expert systems (or any software for that matter). Of course regular expressions are devilishly hard to read and being able to easily understand your rules is also important. But I find even using XML like this is hard to read. It lacks conciseness. The intrusion of the xml keyword structure makes it slow to skim read what you do have. XML is not readable; it is barely legible.
AIML's wildcard * matches one or more words, but there is no wildcard matching zero or more. Since the match must swallow the entire set of all words, this forces you to use multiple patterns to cover starting, in the middle, and ending a sentence. Being able to use a zero-or-more wildcard would be much better.
AIML is a self-contained system which manipulates words without any knowledge of them. If you had a dictionary to back you up (e.g., the WordNet downloadable dictionary of around 125,000 words), you could use parts of speech wildcards for better patterns. A.L.I.C.E, for example, dedicates around 1000 patterns to handle various adverbs at the beginning of the sentence, remapping by stripping off the adverb. If one could use a keyword in the pattern like %adverb, they could all have been reduced to a single pattern that would have covered far more cases than are actually covered at present.
Thus the A.L.I.C.E. 40,000 rule basic bot is really less than 5,000 when you remove all the waste in their pattern definitions. And 5,000 is generally not enough to get interesting behavior in an expert system in a limited domain, much less a broad one.
Complaint #2 Similarly, the pattern uses an exact word. It would be nice if you could match a list of synonyms in a single specification. E.g., I (know believe think) you. And, to be able to declare a list of synonyms so you can reuse it. E.g. Synonym ~believe = know believe think; Then you could write the pattern I ~believe you. Of the 40,000+ patterns in the publicly available standard ALICE, 9,000 are in the Reductions file, which remaps input. These include.
<category><pattern>I AM JACK</pattern><template><srai>call me jack</srai></template></category>
<category><pattern>I AM JAKE</pattern><template><srai>call me jake</srai></template></category>
<category><pattern>I AM JAMES</pattern><template><srai>call me james</srai></template></category>
<category><pattern>I AM JANE</pattern><template><srai>call me jane</srai></template></category>
... and so on for lots of other names.
I would much rather write a synonym set for a list of names and write this as a single pattern.
Complaint #3 Continuing the issue of exact word matching... Because you can't use wildcard sets of words, you cannot easily handle generic set-based sentences. Google, for example, has no problem taking in what is two hundred plus twelve thousand and 2 and then spitting out the correct answer (and a bunch of search alternatives).
<category> <pattern>* PLUS *</pattern> <template><think><set name="x"> <star/></set> <set name="y"><person><star index="2"/></person></set></think> The answer is
document.write("<br/>result = " + result, "<br/>");</script> </template> </category>
But if it turns out that the * values do not contain numbers, you have already matched the pattern and are screwed on output. That is, the AIML reference guide says: As soon as the first complete match is found, the process stops, and the template that belongs to the category whose pattern was matched is processed by the AIML interpreter to construct the output.
It seems to say nothing about what happens if the result of the execution of the template is nada. So seems to me, you really want to have a more reliable match BEFORE you commit to the template. (When I tried this with A.L.I.C.E., it did come out with other output, suggesting it is going beyond AIML itself).
Complaint #4 Continuing the issue of matches generating no output... AIML creates a system with the potential for many rules that overlap and mask each other. E.g.,
<pattern>DO YOU WANT MOVIES </pattern> < template> <srai> Want Movies </srai></template>
<pattern> DO YOU WANT * </pattern> < template> No, I do not want * </template>
Both can match the input Do you want to go to the movies, but the AIML definition makes the first category match and the second never try with this input. If "Do you want movies" matches, it will remap the input into want movies and if THAT fails to match, you lose. No output. It would be better that if the system did not generate any output from a match, the match were considered as not having happened and the system moved on to another attempt.
Complaint #5 Categories using <srai> are hard to connect visually to the categories they are remapping to, making it impossible to see or guess what will happen. This is true normally, and some applications sort the patterns alphabetically into separate files by starting letter (a common and useful thing to do), which totally destroys the ability to see interconnections of patterns.
Complaint #6 It is hard to organize a collection of related chat information. There are only the category and topic mechanisms at work (and the file system), so everything must be shoehorned into them. Topics do not do a good job of encapsulating themselves. To launch a topic you have to manually enter a "set-topic" command in the output of some pattern. Which, by definition, means the category doing that is outside the topic (or it couldn't have matched). That is poor encapsulation and means lots of boring set-topics lying around. I would rather have the underlying engine manage topics automatically and have all topic data within the topic.