By Dave
Pomerantz
Published in Game Developer Magazine, December,
1994
|
Most digitized images use 24 bits per
pixel. That's 16 million colors. The problem is that most color monitors
show only 256 colors at a time. I discovered this several years ago as I
struggled to get a video digitizer ready for a trade show. Almost too close
to the deadline, I found out that it's a long step between capturing a 24-bit
image and making it pretty on an 8-bit screen.
Somehow, I had to pick the 256 colors that best represented all the subtle
shades of the original image. Then I had to map those thousands of shades
into my chosen 256 colors. It's a tough problem, common to most image-handling
software, and I think you'll find the solution a fascinating brain-twister
in solid geometry. I have to confess that I didn't invent the color-selection
algorithm. I got it from Paul Heckbert's excellent theoretical article, "Color
Image Quantization for the Frame Buffer Display" (Computer Graphics. 16[3]:297-
307).
This article will show you how his algorithm works and how you can apply
it to any limited-palette monitor, such as EGA and VGA displays. My program
runs on the Mac II in 68000 assembler.
First, let's take a close look at a color image. Fresh off the digitizer,
a typical image uses 24-bit pixels having 1 byte per color component, as
shown in Figure 1. Each pixel is one of 16 million variations of red, green,
and blue (256-by-256-by-256). The pixels are laid out in a rectangle, 640-by-480.
This gives us a grand total of 307,200 pixels per image, with 3 bytes per
pixel or just under a megabyte of data.
For simplicity, let's assume the display buffer for the color monitor is
also 640-by-480. That way, the only difference between the image and display
buffers is the size of the pixel.
The display buffer uses 8-bit pixels, which are not actually colors but indexes
into a small palette of colors. On an 8-bit display card, like a VGA card
for the PC or an Apple color card for the Macintosh, we choose a small palette
of 256 colors from some larger range determined by the hardware. Then we
load these colors into the color look-up table on the display card. (On the
Macintosh, you don't directly load the video card. Instead, you go through
the Palette Manager. This little piece of indirection leaves some of the
more subtle problems, such as hardware independence and multiple palettes,
in Apple's hands.) The look-up table translates an eight-bit palette index
into the corresponding video display color.
Translating
Now we've set up the hardware to correctly translate an image composed of
8-bit palette indexes. Our digitized image, however, is built from 24-bit
pixels. We need to translate these pixels into palette indexes and put them
in the video display buffer.
Let's say our image pixel is described by its three color components, RiGiBi.
We want to find the palette index k such that the palette color RkGkBk is
the best match for the image pixel. We want to minimize the difference equation:
d = abs (Ri - Rk) + abs(Gi - Gk) + abs(Bi-Bk)
Unfortunately, we need to calculate d for each of 256 palette entries and
each of 307,200 pixels. That's about 78 million calculations just to translate
the image into palette indexes. Later, I'll show you how we can apply a strategy
that gets around all these calculations.
So here's the problem. We need to choose a small palette of colors that best
represents the full range of colors in the image. We also need to translate
the image from 24-bit colors to 8-bit palette indexes.
The first step in choosing the palette is to find out which colors are used
in the image. Every pixel could be the same shade of blue or a different
color. To find our palette, we'll use a histogram to count how many times
each color occurs.
Color Histogram
If we collected a histogram for every possible color, we'd need 16 million
one-word entries; that is, a 32MB table. To conserve memory, we'll limit
the range of our histogram to 215 entries by using only the upper five bits
of each color component, as shown in Figure 2.
Naturally, this will cost us in color resolution. By ignoring the low three
bits, we lumped together the nearest eight shades of each color component.
Each histogram entry represents a group of colors consisting of the 512
neighboring shades of red, green, and blue (8-by- 8-by-8). Does this hurt
the appearance of our displayed image? Sure it does. But as we move from
16 million to 256 colors, we'll be losing quite a bit of color resolution
anyway.
To calculate the histogram, first we set all entries to zero. Then we make
one pass through the original image, converting every 24-bit pixel to a 15-bit
index and incrementing the count at that index.
Listing the Colors
Most images don't contain every color. In my experience, a typical image
captured with a good camera uses fewer than 5,000 of the 32,000 colors available
in the histogram. These are the colors we're interested in, and we don't
need to waste time on the others. So we'll create an array called colorList,
containing only those 15-bit color indexes that appeared in the image. Finally,
we're ready to choose the colors.
Let's get into that solid geometry I promised you earlier. Imagine that all
possible colors are represented with a three-dimensional matrix, where the
three axes are the color components red, green, and blue. Each component
has five bits of resolution, as in our histogram, or 32 shades.
Our digitalized image will typically use fewer than 5,000 of the 32,000 colors.
In fact, the color histogram is this cube, with the nonzero entries representing
the colors in our image. What we'd like to do now is carve the cube into
256 equal regions. But what do we mean by equal?
If we divide the color cube into regions of equal volume, we'll get a uniform
distribution of all the colors. This is called a uniform palette. It's a
useful first approximation, but it doesn't take into account the actual
distribution of colors in our image. In other words, it treats a blue seaside
vista the same as a crimson summer sunset, as shown in Figure 3.
Instead, we'd like to divide the cube into regions of equal color count,
where the color count is the sum of the histogram counts enclosed by each
region. For example, the sunset image has heavily shaded mixtures of red
and green. That means the colors occur most frequently toward the far-upper
right corner, as already shown. Since we'll want our palette to have more
colors in that area, we should carve more regions from that area. The regions
should be smaller and more numerous where the histogram color counts are
highest. Now, how do we write an algorithm that will create those regions?
Data Structure
Let's start by representing a three-dimensional rectangle within the color
cube as a data structure called colorRect, shown in Listing 1. Remember that
colorList contains all the colors that appeared in the image. first and last
define a range within colorList, the image colors that fell within the region.
In Figure 4, first is 271, and last is 276.
histSum is the sum of the color counts for the colors in the region. In Figure
4, histSum is 820.
mins and maxs are the geometric coordinates of the region. The mins represent
the near upper-left corner. The maxs represent the far lower-right corner.
A typical region within the cube might look something like Figure 4.
The colorList for this region has only six colors, shown with their associated
histogram counts. Notice how the list is sorted by red, then green, then
blue. That's because the widestColor is red.
The widestColor is the color component with the longest leg of the region.
In Figure 4, red is the widestColor, with four shades.
red, green, and blue are the palette colors. When we've created every region,
we'll come back and figure out what color we want in the palette for each
region. Every image color that maps into this region will be displayed with
this palette color.
The Algorithm
To initialize the algorithm, we'll create a single colorRect that encloses
the entire RGB cube. We'll set the mins to 0 and the maxs to 31, enclosing
the cube. Then, we'll set first to 0 and last to the end of colorList to
include all the colors that appeared in the image.
Now we're ready to divide the cube, in this fashion:
* Scan the colorList from first to last, finding the minimums and maximums
of each color component and adding the color counts to histSum. After doing
this, we can determine the widestColor by comparing the mins and maxs and
selecting the component with the biggest difference.
* Sort the colorList colors within the region in order of the widestColor.
This prepares us for the next step.
* Find the median. Scan through the colorList again, from first to last,
adding the color counts to a running sum. When the running sum equals half
the histSum, we've found the median point. We'll adjust this median point
so it always breaks between two shades of the widestColor. In the preceding
example, the best median is between (11, 8,18) and (11, 9,17), two colors
that have the same shade of red. But since we need to break between shades
of red, we'll set the median between (10, 8,17) and (11,8,18).
* Split the cube into two regions. Create two colorRects from the current
colorRect, dividing the colorList range at the median. We've created two
regions from one by splitting the region on its longest axis at the point
where the color counts in the two halves were roughly equal.
To build 256 regions, we just continue this process until we've used 256
colorRects. The time-consuming part is the number of times we sort the colorList.
We sort it each time we create a new region, each time sorting successively
smaller portions of the list. I used the quick-sort method from Carl Reynolds'
"Sorts of Sorts" (COMPUTER LANGUAGE, Mar. 1988, p. 56).
What? No Recursion?
Any eager student of computer science would latch onto this as one of the
rare opportunities to use recursion. That's how I first tried to solve this
problem, using the approach shown in Listing 2.
As I discovered, recursion doesn't work. It creates what (I later remembered)
is called a depth-first spanning tree, subdividing only the lesser regions
in each recursive call. It never gets around to subdividing the greater regions.
Instead, at each iteration, we want to choose the largest of all the remaining
regions for the next division. The method I wound up using chooses the region
with the highest color count. To find that region, I built an index list
for the colorRect structures and wrote routines to insert and remove those
structures in order of their histSum. Now our algorithm looks like Listing
3.
Assigning Colors
Once you've created the regions, it's easy to assign them color values. Just
add up all the colors that occurred in the region, multiply each by its histogram
count, and divide by the total count for the region, as shown in Listing
4.
So far, we've been determining which colors appear in the palette. Now we
need a look-up table that translates an arbitrary 24-bit color to the index
of the closest palette color. We saw earlier that it can take 78 million
iterations to translate a 640-by-480 image. And I said we'd find a way around
that.
Somehow, we need to map the colors in our RGB cube to the palette indexes.
But that's exactly what the colorRect structure does with mins and maxs that
define the geometric bounds of each palette color. All we need to do is fill
another RGB cube with palette indexes, and we've got our lookup table color.
Given an arbitrary 24-bit color, we first convert it to a 15-bit color, then
use that 15-bit color as an index into the cube and look up the corresponding
palette index. Any color that falls within that region gets the same palette
index.
Listing 5 shows the code to initialize the palette look-up table for one
region. This algorithm is strictly linear, which means that the inner loop
executes only once for each look-up table entry.
That's it. We've chosen 256 good colors from our original image and built
a map to translate that image into the smaller palette.
But Is It Fast?
We have bridged the gap between 24-bit scanners and 8-bit monitors by modeling
the data geometrically and applying classic divide-and-conquer techniques.
More to the point, perhaps, we let people like Paul Heckert do this for us
and concentrated on a good engineering implementation.
I can tell you the results in hard numbers. This algorithm executes in six
seconds on a MacIIx, converting a 640-by-480 image from 24-bit pixels to
8- bit indexes and yielding a high-quality displayed image. A well-written
but brute-force solution takes about 70 seconds and yields a lesser-quality
image. It took me about three weeks to write and debug the code resulting
in 30 pages (1,500 lines) of assembler and a 9K object file.
Thanks to Dave Pratt and Vivian Russell of Digital Vision Inc. for giving
me permission to write this article. All the software described here was
written under contract to Digital Vision as part of its ComuterEye video
digitizer.
Dave Pomerantz is president of Micro Alliance Inc., a software consulting
firm in Wakefield, Mass.
|