| |
|
|
||||
![]() |
||||||
| |
|
|||||
|
Name Generation Techniques In order to see how the PseudoRandomizer class may be employed to give characteristics to a particular game element, we shall now turn our attention to naming each of the stars in the universe. There are a myriad of techniques for doing this, but one of the most popular is known as the Markovian List (see also A.K. Dewdney, The Tinkertoy Computer, W.H. Freeman: 1990). In essence, this involves examining the frequency of repeated elements in a given number of sequences, and using this as a basis for generating new sequences. The sequences in the case of names are built up of letters. Before creating actual names, we need to build up a table in which the frequency of adjacent letters in a list of sample names can be stored. This table must be 26 by 26 in order to store all the letters in the alphabet. We can then examine each word in a given list, and determine the frequency of adjacent letters in each word. The cumulative frequencies are stored in the table. For example, if the word is: ‘Nicole’ Then the adjacent letter couples are: ‘ni’ ‘ic’ ‘co’ ‘ol’ ‘le’ So, for this word, we would add one to the value stored in the column referenced by ‘n’ and ‘i’, and one to the value stored in the column (i,c), and so on for each couple. Applying the same technique to every word in the list results in a table which represents the frequencies of all letter couples. Listing 4 shows the core algorithm for generating such a table. The table used in this listing is actually 28 by 28 in size, so that we can identify those letters which may end or begin a name. It should be noted that this technique is not constrained by language; that is, any language will work equally well, and for those languages that are based on the Roman Alphabet (accents excluded), the algorithm can be applied as is, with no further modifications.Once the table has been built, it can then be used to build new words at random. The first step is to determine the length of the desired word. Next, we must ascertain the first letter. This is done by adding up all the frequencies of those letters which can appear at the beginning of a word (that is, the frequencies of the letter couples in the first row of the table). Then, we choose a number at random between 1 and the sum of all the frequencies. The final step in choosing the letter is to start at the first column, and check to see if the chosen value falls between 1 and the stored value. If not, each column is examined in turn, by adding the value stored there and checking to see if this exceeds the random value. Once we find a column in which this is the case, we have found our starting letter. The above algorithm yields a letter that can be preceded by a space character. The exact same technique can be applied on the following letter, but this time we use the row referenced by the letter that was chosen to begin the word. Then, for each letter that we require in the word, the same steps can be taken in turn until we have filled each available position. The code in Listing 5 shows the algorithm for doing this. So, in order to generate a name for each of our stars, all we need to do is seed the PseudoRandomizer based on the unique position of the star (based on the grid reference), and let it produce a stream of random numbers which will provide the basis for choosing each letter in the word. Since we are guaranteed the same sequence of numbers for every seed value, all we need store is the frequency table. In addition, since the frequency table is derived from a list of real world words, each name will also "sound" natural. Using this technique applied to the text of this article, I generated the frequency table in Figure 1. Applying the name generation algorithm produced the sequence of twenty names in Figure 2. At this point it is worth calculating where the storage "break even" point is.
The table will occupy (28x28)x2 bytes of memory, if we assume each value to be a 16-bit floating point number. This means that (code excluded) we are using 1,568 bytes of storage space. This could equally well hold the actual names of the stars, in which case we could store 261 names instead (assuming each is 6 characters long). So, if we need only to store 261 names for our universe, then it is more efficient simply to store them directly. Using our universe model, this would mean that the universe dimensions would be 161 by 161 units (since only 1 percent can be occupied, and we only have space for 261 stars), or about 0.2 percent of the algorithmic capacity, based on a maximum universe dimension of 65,535 x 65,535.
|
|
|