Making Images Palette-able in Python

A while ago, I came across a really neat tool that lets a user pass in an image and generates a representative color palette derived from the colors of the image, essentially going from this...

In [2]:
display.Image('Images/old_masters.jpg')
Out[2]:

... to this

In [3]:
display.Image('Images/old_masters_colors.png')
Out[3]:

And so after cramming in a slew of .PNG files I had lying around, I got curious how it actually worked. Which, in turn, led to my first bit of rabbit-holing on working with image data in Python.

I never got around to learning how to recreate the site's algorithm one-to-one, but I did pick up a bunch of practical skills and raise some interesting questions I thought merited sharing :)

Thinking of Images as Vectors

To get things kicked off, I'm going to import Pillow (PIL), the batteries-included, bread-and-butter Image Processing library in Python

In [4]:
from PIL import Image, ImageDraw

and use it to load up an image from a cache of movie posters I downloaded for another side-project (that never went anywhere, haha)

In [5]:
img = Image.open('posters/Blade Runner 2.png')
img
Out[5]:

Here, img represents an object of type JpegImageFile, which allows us to do handy things like crop or do some light editing

In [6]:
type(img)
Out[6]:
PIL.JpegImagePlugin.JpegImageFile

But more importantly for the purposes of this post, it also makes the image neatly consumable by Python's darling computation workhorse, numpy. Now, I'm going to stuff the image into an array, taking us from pixels to a bunch of numbers that represent the pixels.

In [7]:
import numpy as np
vec = np.array(img)

If we take a peek at vec, we get a big, incomprehensible printout of a bunch of numbers

In [8]:
vec
Out[8]:
array([[[ 7, 19, 35],
        [ 8, 20, 36],
        [ 8, 20, 36],
        ...,
        [ 7, 14, 30],
        [ 7, 14, 30],
        [ 7, 14, 30]],

       [[ 8, 20, 36],
        [ 8, 20, 36],
        [ 9, 21, 37],
        ...,
        [ 8, 15, 31],
        [ 7, 14, 30],
        [ 7, 14, 30]],

       [[ 8, 20, 36],
        [ 8, 20, 36],
        [ 9, 21, 37],
        ...,
        [ 8, 15, 31],
        [ 7, 14, 30],
        [ 7, 14, 30]],

       ...,

       [[ 0,  0,  4],
        [ 1,  2,  6],
        [ 3,  4,  8],
        ...,
        [ 3,  4,  8],
        [ 0,  0,  4],
        [ 0,  0,  4]],

       [[ 2,  3,  7],
        [ 0,  0,  5],
        [ 5,  6, 11],
        ...,
        [ 0,  1,  6],
        [ 1,  2,  7],
        [ 0,  1,  5]],

       [[ 2,  3,  7],
        [ 0,  0,  5],
        [ 0,  0,  5],
        ...,
        [ 0,  0,  5],
        [ 3,  4,  9],
        [ 0,  1,  5]]], dtype=uint8)

But a closer look at the shape of this object helps us interpret what we're looking at

In [9]:
vec.shape
Out[9]:
(210, 150, 3)

It's no accident that our original image has a width of 150 pixels and a height of 210: numpy lists rows (height) first, while Pillow reports (width, height).

In [10]:
img.size
Out[10]:
(150, 210)
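
To make the flipped ordering concrete, here's a minimal sketch using a tiny made-up image (the 4 x 2 size and the fill color are arbitrary choices for illustration, not anything from the poster):

```python
import numpy as np
from PIL import Image

# Build a small solid-color RGB image: 4 pixels wide, 2 pixels tall.
demo_img = Image.new('RGB', (4, 2), color=(7, 19, 35))
demo_vec = np.array(demo_img)

print(demo_img.size)    # Pillow reports (width, height)
print(demo_vec.shape)   # numpy reports (height, width, channels)
print(demo_vec[0, 0])   # the R, G, B values of the top-left pixel
```

So indexing the array goes row first, then column, then color channel.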

But what of the 3 at the end of (210, 150, 3)?

Well, these represent the distinct Red, Green, and Blue values that define the color of each pixel. If this concept is foreign to you, poke around this site for a minute or two, as it's pretty much the crux of the rest of the post.

All told, each of the height x width pixels having its own RGB values means that our simple, compact image is actually represented by a lot of numbers

In [11]:
vec.size
Out[11]:
94500
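
As a quick sanity check on that count, here's a small sketch using a blank array with the same shape as our poster (the array of zeros is a stand-in, not the actual image):

```python
import numpy as np

height, width, channels = 210, 150, 3
blank = np.zeros((height, width, channels), dtype=np.uint8)

n_pixels = height * width          # one entry per pixel
n_values = n_pixels * channels     # three numbers (R, G, B) per pixel
print(n_pixels, n_values, blank.size)
```

In other words, size is just the product of everything in shape.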

So what should we do with these numbers?

TL;DR: K-Means Clustering

Clustering is one of the core areas of unsupervised learning, and essentially answers "I have a bunch of data, can you segment it into groups for me?"

Perhaps the easiest of these algorithms to understand is K-Means, which can be summarized as

  1. Pick K random spots on the grid (the "targets")
  2. For each point of data you've got, figure out which target is closest
  3. Move each target to the average position of the points that picked it
  4. Did any points switch to a different closest target?

    • No? Done deal
    • Yes? Go back to step 2 and check again
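
For the curious, here's a minimal sketch of the standard K-Means loop (often called Lloyd's algorithm) in plain numpy: label each point with its nearest target, move each target to the average of its points, and repeat until the targets stop moving. The function name kmeans_sketch, the choice to start from randomly picked data points, and the toy data are all my own assumptions for illustration; sklearn's implementation is more careful about initialization.

```python
import numpy as np

def kmeans_sketch(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; a hypothetical helper for illustration."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    start = rng.choice(len(points), size=k, replace=False)
    centers = points[start].astype(float)
    for _ in range(n_iter):
        # Label every point with the index of its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of the points assigned to it
        # (a center that attracted no points stays where it is).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Toy data: two obvious clumps of two points each.
demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers, labels = kmeans_sketch(demo, k=2)
print(centers)
print(labels)
```

The two centers should land in the middle of each clump, with each pair of neighboring points sharing a label.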

For a less hand-wavy explanation (featuring great graphics), I like this video

In [12]:
# Youtube
HTML('<iframe width="560" height="315" '
     'src="https://www.youtube.com/embed/IuRb3y8qKX4" '
     'frameborder="0" gesture="media" allow="encrypted-media" '
     'allowfullscreen></iframe>')
Out[12]:

But not to dwell too long on the topic, let's cobble together a quick example to demonstrate this visually.

I'm going to lean on the most vanilla dataset in all of Data Science, Iris, which is basically petal and sepal measurements of a bunch of flowers.

In [13]:
import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
Out[13]:
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa

For the sake of visualization, we're going to throw away all but two columns to make a cheap scatter plot

In [14]:
trimmedData = iris.loc[: , ['sepal_length', 'sepal_width']]

plt.scatter(x=trimmedData['sepal_length'], y=trimmedData['sepal_width']);

Then, we're going to leverage the K-Means implementation in sklearn to try and separate these points into 3 different groups

In [15]:
from sklearn.cluster import KMeans

model = KMeans(n_clusters=3)
model.fit(trimmedData)
trimmedData['label'] = model.labels_

This runs almost instantly for such a small dataset, and now we can plot the same points, but this time assigning a color based on which group they wound up in.

In [16]:
plt.scatter(x=trimmedData['sepal_length'],
            y=trimmedData['sepal_width'],
            c=trimmedData['label']);

To hammer the point home, running K-Means over this data to get three Targets yields three groups:

  1. The Purple points are centered around, on average, (5.0, 4.0)
  2. The Teal points are centered around, on average, (5.6, 2.7)
  3. The Yellow points are centered around, on average, (7.3, 3.3)
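
Rather than eyeballing the centers off a plot, a fitted sklearn model exposes them directly through its cluster_centers_ attribute. Here's a sketch on made-up blobs (the blob locations, the spread, and the fixed random_state are assumptions chosen so the example is reproducible; this is not the iris data):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Three well-separated synthetic blobs of 50 points each (made-up data).
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
points = np.vstack([c + rng.normal(scale=0.3, size=(50, 2))
                    for c in true_centers])

demo_model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)

print(demo_model.cluster_centers_)        # one (x, y) row per cluster
print(np.bincount(demo_model.labels_))    # how many points fell in each cluster
```

Note that the order of the rows in cluster_centers_ is arbitrary, so cluster 0 isn't necessarily the first blob.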

Manipulating

Our example above clustered points arranged in two-dimensional, X/Y space. However, this extends painlessly into three dimensions, where our (R, G, B) color definitions live. First, though, we need to take our original 210 x 150 image and basically unravel it into one long chain of RGB values.

numpy makes this a breeze with reshape

In [17]:
reshaped = vec.reshape(-1, 3)
reshaped.shape
Out[17]:
(31500, 3)

The -1 in the function call might seem confusing at first glance, but basically we knew we wanted to package everything into chunks of 3, one per RGB triple. The -1 tells numpy to work out the remaining dimension on its own. Thus, it takes all 94,500 points of data (as above), and realizes that you can cleanly group them into threes if you make one long list of 31.5k elements.
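
Here's the same trick on a tiny made-up array, small enough to check by hand (the 2 x 2 "image" filled with consecutive numbers is purely illustrative):

```python
import numpy as np

# A fake 2 x 2 image whose 12 values are just 0..11.
tiny = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)

flat = tiny.reshape(-1, 3)            # numpy infers 4 rows from 12 values / 3
print(flat.shape)

# Rows come out in reading order: left to right, then top to bottom.
print(flat[1], tiny[0, 1])

# Nothing is lost: reshaping back recovers the original layout.
restored = flat.reshape(tiny.shape)
print(np.array_equal(restored, tiny))
```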

At this point, we're wandering into the neighborhood of "our data is getting hard to interpret"

In [18]:
plt.imshow(reshaped);

but trust me, this is one long line of every pixel of our original image.

All Together

For convenience, I've packaged the rest of my spaghetti code into functions any interested reader can check out here.

In [19]:
from imagetools import (path_to_img_array,
                        pick_colors, 
                        show_key_colors)

But basically these:

  • Load an image up to a vector, from a given path
In [20]:
path = 'posters/Blade Runner 2.png'
vec = path_to_img_array(path)
vec.shape
Out[20]:
(210, 150, 3)
  • Unroll our image and run K-Means clustering to find the "Target Points" that all of the pixel values are grouped around (here, we choose 3)
In [21]:
colors = pick_colors(vec, 3)
colors
Out[21]:
array([[ 14.84415584,  48.46459201,  67.83851997],
       [  3.55864673,   7.16823366,  16.27111591],
       [124.19018405,  31.15132924,  55.46216769]])
  • Finally, one last function to take these Targets and plot out some simple boxes to show the colors it found
In [22]:
show_key_colors(colors)
(14, 48, 67)
(3, 7, 16)
(124, 31, 55)
Out[22]:

Looks about right, yeah?

In [23]:
Image.open('posters/Blade Runner 2.png')
Out[23]:
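
I haven't reproduced the imagetools code here, but a rough sketch of what helpers like these might look like is below. To be clear, pick_colors_sketch and draw_swatches_sketch are hypothetical stand-ins I've named for this example, not the actual functions; the swatch size, the fixed seed, and the two-tone test image are likewise assumptions.

```python
import numpy as np
from PIL import Image, ImageDraw
from sklearn.cluster import KMeans

def pick_colors_sketch(vec, n_colors, seed=0):
    """Hypothetical stand-in: cluster the pixels, return the cluster centers."""
    pixels = vec.reshape(-1, vec.shape[-1]).astype(float)   # one row per pixel
    model = KMeans(n_clusters=n_colors, n_init=10, random_state=seed)
    model.fit(pixels)
    return model.cluster_centers_          # shape (n_colors, 3), floats

def draw_swatches_sketch(colors, box=50):
    """Hypothetical stand-in: one square per color, laid out side by side."""
    swatch = Image.new('RGB', (box * len(colors), box))
    draw = ImageDraw.Draw(swatch)
    for i, color in enumerate(colors):
        rgb = tuple(int(c) for c in color)                   # drop the decimals
        draw.rectangle([i * box, 0, (i + 1) * box - 1, box - 1], fill=rgb)
    return swatch

# Made-up two-tone image: left half dark blue, right half red.
demo = np.zeros((20, 20, 3), dtype=np.uint8)
demo[:, :10] = (10, 20, 60)
demo[:, 10:] = (200, 30, 40)

palette = pick_colors_sketch(demo, 2)
swatches = draw_swatches_sketch(palette)
print(palette)
print(swatches.size)
```

On an image with only two flat colors, the two centers should come back as (approximately) those same two colors.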

Considering the Number of Means

As I played around with this, I quickly discovered that generating a meaningful color palette from an image was very sensitive to how many Targets you ask sklearn to sniff out.

Let's take a look at a more interesting poster (from a movie I'd never heard of...)

In [24]:
path = 'posters/3 Idiotas.png'
img = Image.open(path)
img
Out[24]:

Running K-Means for a mere two colors gives the following

In [25]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 2)));
(35, 100, 136)
(203, 158, 132)

Interestingly, we captured the blue of the image (the majority color). On the other hand, we've determined that the "average color" of lime green, fuchsia, honey yellow, hot pink, orange, and lavender is... a gross beige, lol

But take a look at what happens when we allow for more and more means:

At 3, we separate light and dark tones and the blue pops

In [26]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 3)));
(25, 127, 187)
(214, 176, 140)
(81, 61, 59)

At 4, we extract brown tones from light/dark

In [27]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 4)));
(222, 208, 200)
(26, 126, 185)
(49, 47, 42)
(185, 125, 85)

At 5, we split brown to get purple and yellow

In [28]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 5)));
(143, 101, 109)
(19, 127, 189)
(222, 206, 200)
(42, 39, 35)
(230, 175, 47)

At 6, we split purple into a salmon and an olive green

In [29]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 6)));
(42, 32, 32)
(89, 120, 115)
(180, 90, 105)
(16, 128, 192)
(232, 181, 39)
(222, 207, 200)

At 7, we split our blue into two

In [30]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 7)));
(183, 91, 107)
(9, 124, 189)
(39, 29, 28)
(232, 212, 201)
(232, 181, 38)
(84, 106, 93)
(114, 159, 193)

Finally, 8 gives us about as diverse a palette as we'd like from this image

In [31]:
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 8)));
(108, 160, 200)
(233, 180, 29)
(39, 30, 29)
(82, 112, 101)
(236, 231, 226)
(9, 124, 189)
(173, 74, 94)
(213, 166, 145)

So obviously allowing for more Targets could help us pick out more unique colors. Neat.
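
One way to put a number on this, sketched under my own assumptions: sklearn's fitted KMeans exposes inertia_, the sum of squared distances from each point to its nearest center. Fitting a range of cluster counts and watching where that number stops dropping sharply gives a rough hint of how many distinct colors there are. The "pixels" below are made-up noise around four base colors, not the poster.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Made-up pixel data: 4 base colors with a little noise, already shaped (n, 3).
base = np.array([[20, 120, 190], [230, 180, 40],
                 [40, 30, 30], [220, 205, 200]], dtype=float)
pixels = np.vstack([c + rng.normal(scale=5.0, size=(200, 3)) for c in base])

inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    inertias[k] = model.inertia_   # total squared distance to nearest center
    print(k, round(model.inertia_))
```

With four underlying colors, the drop should be steep up to 4 clusters and then level off.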

Considering Image Size

Until now, we've been looking at meager 150 x 210 images. What happens when we examine larger images?

To play with this idea, I took a still from The Grand Budapest Hotel, a delightfully colorful movie that I love.

At a glance, it seems like a no-brainer what 5 colors we'd come up with

In [32]:
wes = Image.open('Images/wes.png')
wes
Out[32]: