|
|
By
Dave Pomerantz
Gamasutra
June 19, 1997
Originally
published in Game Developer Magazine, December, 1994.
|
|
|
Features

A
Few Good Colors
Most
digitized images use 24 bits per pixel. That's 16 million colors. The
problem is that most color monitors show only 256 colors at a time. I
discovered this several years ago as I struggled to get a video digitizer
ready for a trade show. Almost too close to the deadline, I found out
that it's a long step between capturing a 24-bit image and making it pretty
on an 8-bit screen.
Somehow, I had to pick the 256 colors that best represented all the subtle
shades of the original image. Then I had to map those thousands of shades
into my chosen 256 colors. It's a tough problem, common to most image-handling
software, and I think you'll find the solution a fascinating brain-twister
in solid geometry. I have to confess that I didn't invent the color-selection
algorithm. I got it from Paul Heckbert's excellent theoretical article,
"Color Image Quantization for the Frame Buffer Display" (Computer Graphics.
16[3]:297- 307).
This article will show you how his algorithm works and how you can apply
it to any limited-palette monitor, such as EGA and VGA displays. My program
runs on the Mac II in 68000 assembler.
First, let's take a close look at a color image. Fresh off the digitizer,
a typical image uses 24-bit pixels having 1 byte per color component,
as shown in Figure 1. Each pixel is one of 16 million variations of red,
green, and blue (256-by-256-by-256). The pixels are laid out in a rectangle,
640-by-480. This gives us a grand total of 307,200 pixels per image, with
3 bytes per pixel or just under a megabyte of data.
For simplicity, let's assume the display buffer for the color monitor
is also 640-by-480. That way, the only difference between the image and
display buffers is the size of the pixel.
The display buffer uses 8-bit pixels, which are not actually colors but
indexes into a small palette of colors. On an 8-bit display card, like
a VGA card for the PC or an Apple color card for the Macintosh, we choose
a small palette of 256 colors from some larger range determined by the
hardware. Then we load these colors into the color look-up table on the
display card. (On the Macintosh, you don't directly load the video card.
Instead, you go through the Palette Manager. This little piece of indirection
leaves some of the more subtle problems, such as hardware independence
and multiple palettes, in Apple's hands.) The look-up table translates
an eight-bit palette index into the corresponding video display color.
Translating
Now we've set up the hardware to correctly translate an image composed
of 8-bit palette indexes. Our digitized image, however, is built from
24-bit pixels. We need to translate these pixels into palette indexes
and put them in the video display buffer.
Let's say our image pixel is described by its three color components,
RiGiBi. We want to find the palette index k such that the palette color
RkGkBk is the best match for the image pixel. We want to minimize the
difference equation:
d = abs (Ri - Rk) + abs(Gi - Gk) + abs(Bi-Bk)
Unfortunately, we need to calculate d for each of 256 palette entries
and each of 307,200 pixels. That's about 78 million calculations just
to translate the image into palette indexes. Later, I'll show you how
we can apply a strategy that gets around all these calculations.
So here's the problem. We need to choose a small palette of colors that
best represents the full range of colors in the image. We also need to
translate the image from 24-bit colors to 8-bit palette indexes.
The first step in choosing the palette is to find out which colors are
used in the image. Every pixel could be the same shade of blue or a different
color. To find our palette, we'll use a histogram to count how many times
each color occurs.
Color Histogram
If we collected a histogram for every possible color, we'd need 16 million
one-word entries; that is, a 32MB table. To conserve memory, we'll limit
the range of our histogram to 215 entries by using only the upper five
bits of each color component, as shown in Figure 2.
Naturally, this will cost us in color resolution. By ignoring the low
three bits, we lumped together the nearest eight shades of each color
component. Each histogram entry represents a group of colors consisting
of the 512 neighboring shades of red, green, and blue (8-by- 8-by-8).
Does this hurt the appearance of our displayed image? Sure it does. But
as we move from 16 million to 256 colors, we'll be losing quite a bit
of color resolution anyway.
To calculate the histogram, first we set all entries to zero. Then we
make one pass through the original image, converting every 24-bit pixel
to a 15-bit index and incrementing the count at that index.
Listing the Colors
Most images don't contain every color. In my experience, a typical image
captured with a good camera uses fewer than 5,000 of the 32,000 colors
available in the histogram. These are the colors we're interested in,
and we don't need to waste time on the others. So we'll create an array
called colorList, containing only those 15-bit color indexes that appeared
in the image. Finally, we're ready to choose the colors.
Let's get into that solid geometry I promised you earlier. Imagine that
all possible colors are represented with a three-dimensional matrix, where
the three axes are the color components red, green, and blue. Each component
has five bits of resolution, as in our histogram, or 32 shades.
Our digitalized image will typically use fewer than 5,000 of the 32,000
colors. In fact, the color histogram is this cube, with the nonzero entries
representing the colors in our image. What we'd like to do now is carve
the cube into 256 equal regions. But what do we mean by equal?
If we divide the color cube into regions of equal volume, we'll get a
uniform distribution of all the colors. This is called a uniform palette.
It's a useful first approximation, but it doesn't take into account the
actual distribution of colors in our image. In other words, it treats
a blue seaside vista the same as a crimson summer sunset, as shown in
Figure 3.
Instead, we'd like to divide the cube into regions of equal color count,
where the color count is the sum of the histogram counts enclosed by each
region. For example, the sunset image has heavily shaded mixtures of red
and green. That means the colors occur most frequently toward the far-upper
right corner, as already shown. Since we'll want our palette to have more
colors in that area, we should carve more regions from that area. The
regions should be smaller and more numerous where the histogram color
counts are highest. Now, how do we write an algorithm that will create
those regions?
Data Structure
Let's start by representing a three-dimensional rectangle within the color
cube as a data structure called colorRect, shown in Listing 1. Remember
that colorList contains all the colors that appeared in the image. first
and last define a range within colorList, the image colors that fell within
the region. In Figure 4, first is 271, and last is 276.
histSum is the sum of the color counts for the colors in the region. In
Figure 4, histSum is 820.
mins and maxs are the geometric coordinates of the region. The mins represent
the near upper-left corner. The maxs represent the far lower-right corner.
A typical region within the cube might look something like Figure 4.
The colorList for this region has only six colors, shown with their associated
histogram counts. Notice how the list is sorted by red, then green, then
blue. That's because the widestColor is red.
The widestColor is the color component with the longest leg of the region.
In Figure 4, red is the widestColor, with four shades.
red, green, and blue are the palette colors. When we've created every
region, we'll come back and figure out what color we want in the palette
for each region. Every image color that maps into this region will be
displayed with this palette color.
The Algorithm
To initialize the algorithm, we'll create a single colorRect that encloses
the entire RGB cube. We'll set the mins to 0 and the maxs to 31, enclosing
the cube. Then, we'll set first to 0 and last to the end of colorList
to include all the colors that appeared in the image.
Now we're ready to divide the cube, in this fashion:
* Scan the colorList from first to last, finding the minimums and maximums
of each color component and adding the color counts to histSum. After
doing this, we can determine the widestColor by comparing the mins and
maxs and selecting the component with the biggest difference.
* Sort the colorList colors within the region in order of the widestColor.
This prepares us for the next step.
* Find the median. Scan through the colorList again, from first to last,
adding the color counts to a running sum. When the running sum equals
half the histSum, we've found the median point. We'll adjust this median
point so it always breaks between two shades of the widestColor. In the
preceding example, the best median is between (11, 8,18) and (11, 9,17),
two colors that have the same shade of red. But since we need to break
between shades of red, we'll set the median between (10, 8,17) and (11,8,18).
* Split the cube into two regions. Create two colorRects from the current
colorRect, dividing the colorList range at the median. We've created two
regions from one by splitting the region on its longest axis at the point
where the color counts in the two halves were roughly equal.
To build 256 regions, we just continue this process until we've used 256
colorRects. The time-consuming part is the number of times we sort the
colorList. We sort it each time we create a new region, each time sorting
successively smaller portions of the list. I used the quick-sort method
from Carl Reynolds' "Sorts of Sorts" (COMPUTER LANGUAGE, Mar. 1988, p.
56).
What? No Recursion?
Any eager student of computer science would latch onto this as one of
the rare opportunities to use recursion. That's how I first tried to solve
this problem, using the approach shown in Listing 2.
As I discovered, recursion doesn't work. It creates what (I later remembered)
is called a depth-first spanning tree, subdividing only the lesser regions
in each recursive call. It never gets around to subdividing the greater
regions.
Instead, at each iteration, we want to choose the largest of all the remaining
regions for the next division. The method I wound up using chooses the
region with the highest color count. To find that region, I built an index
list for the colorRect structures and wrote routines to insert and remove
those structures in order of their histSum. Now our algorithm looks like
Listing 3.
Assigning Colors
Once you've created the regions, it's easy to assign them color values.
Just add up all the colors that occurred in the region, multiply each
by its histogram count, and divide by the total count for the region,
as shown in Listing 4.
So far, we've been determining which colors appear in the palette. Now
we need a look-up table that translates an arbitrary 24-bit color to the
index of the closest palette color. We saw earlier that it can take 78
million iterations to translate a 640-by-480 image. And I said we'd find
a way around that.
Somehow, we need to map the colors in our RGB cube to the palette indexes.
But that's exactly what the colorRect structure does with mins and maxs
that define the geometric bounds of each palette color. All we need to
do is fill another RGB cube with palette indexes, and we've got our lookup
table color. Given an arbitrary 24-bit color, we first convert it to a
15-bit color, then use that 15-bit color as an index into the cube and
look up the corresponding palette index. Any color that falls within that
region gets the same palette index.
Listing 5 shows the code to initialize the palette look-up table for one
region. This algorithm is strictly linear, which means that the inner
loop executes only once for each look-up table entry.
That's it. We've chosen 256 good colors from our original image and built
a map to translate that image into the smaller palette.
But Is It Fast?
We have bridged the gap between 24-bit scanners and 8-bit monitors by
modeling the data geometrically and applying classic divide-and-conquer
techniques. More to the point, perhaps, we let people like Paul Heckert
do this for us and concentrated on a good engineering implementation.
I can tell you the results in hard numbers. This algorithm executes in
six seconds on a MacIIx, converting a 640-by-480 image from 24-bit pixels
to 8- bit indexes and yielding a high-quality displayed image. A well-written
but brute-force solution takes about 70 seconds and yields a lesser-quality
image. It took me about three weeks to write and debug the code resulting
in 30 pages (1,500 lines) of assembler and a 9K object file.
Thanks to Dave Pratt and Vivian Russell of Digital Vision Inc. for giving
me permission to write this article. All the software described here was
written under contract to Digital Vision as part of its ComuterEye video
digitizer.
Dave Pomerantz is president of Micro Alliance Inc., a software consulting
firm in Wakefield, Mass.
|