A while ago, I came across a really neat tool that lets a user pass in an image and generates a representative color palette, derived from the colors of the image, essentially going from this...
... to this
And so after cramming in a slew of .PNG files I had lying around, I got curious how it actually worked. Which, in turn, led to my first bit of rabbit-holing on working with image data in Python.
I never got around to learning how to recreate the site's algorithm one-to-one, but I did pick up a bunch of practical skills and raise some interesting questions I thought merited sharing :)
To get things kicked off, I'm going to import Pillow (PIL), the batteries-included, bread-and-butter Image Processing library in Python
from PIL import Image, ImageDraw
and use it to load up an image from a cache of movie posters I downloaded for another side-project (that never went anywhere, haha)
img = Image.open('posters/Blade Runner 2.png')
img
img represents an object of type
JpegImageFile, which allows us to do handy things like crop or do some light editing
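For instance, here's a minimal sketch of that kind of light editing. Since the poster file isn't bundled with this post, it uses a blank stand-in image of the same dimensions:

```python
from PIL import Image, ImageOps

# stand-in image with the poster's dimensions (the real file isn't included here)
img = Image.new('RGB', (150, 210), (7, 19, 35))

# crop takes a (left, upper, right, lower) box in pixel coordinates
top_half = img.crop((0, 0, img.width, img.height // 2))

# some light editing: convert to grayscale
gray = ImageOps.grayscale(img)

print(top_half.size, gray.mode)  # (150, 105) L
```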
But more importantly for the purposes of this post, it also makes it neatly consumable by Python's darling computation workhorse,
numpy. Now, we're going to stuff the image into an
array, taking us from pixels to a bunch of numbers that represent the pixels.
import numpy as np

vec = np.array(img)
If we take a peek at
vec, we get a big, incomprehensible printout of a bunch of numbers
array([[[ 7, 19, 35],
        [ 8, 20, 36],
        [ 8, 20, 36],
        ...,
        [ 7, 14, 30],
        [ 7, 14, 30],
        [ 7, 14, 30]],

       [[ 8, 20, 36],
        [ 8, 20, 36],
        [ 9, 21, 37],
        ...,
        [ 8, 15, 31],
        [ 7, 14, 30],
        [ 7, 14, 30]],

       [[ 8, 20, 36],
        [ 8, 20, 36],
        [ 9, 21, 37],
        ...,
        [ 8, 15, 31],
        [ 7, 14, 30],
        [ 7, 14, 30]],

       ...,

       [[ 0,  0,  4],
        [ 1,  2,  6],
        [ 3,  4,  8],
        ...,
        [ 3,  4,  8],
        [ 0,  0,  4],
        [ 0,  0,  4]],

       [[ 2,  3,  7],
        [ 0,  0,  5],
        [ 5,  6, 11],
        ...,
        [ 0,  1,  6],
        [ 1,  2,  7],
        [ 0,  1,  5]],

       [[ 2,  3,  7],
        [ 0,  0,  5],
        [ 0,  0,  5],
        ...,
        [ 0,  0,  5],
        [ 3,  4,  9],
        [ 0,  1,  5]]], dtype=uint8)
But a closer look at the shape of this object helps us interpret what we're looking at
(210, 150, 3)
It's no accident: looking at the size of our original image, it has a width of 150 pixels and a height of 210.
But what of the
3 at the end of
(210, 150, 3)?
Well, these represent the distinct Red, Green, and Blue values that define the color of each pixel. If this concept is foreign to you, poke around this site for a minute or two, as it's pretty much the crux of the rest of the post.
All told, each of the height x width different pixels having its own RGB values means that our simple, compact image is actually represented by a lot of numbers
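To put a number on "a lot," it's just the product of the three dimensions above:

```python
# shape of the poster array from above: (height, width, channels)
height, width, channels = 210, 150, 3

total_values = height * width * channels
print(total_values)  # 94500 individual numbers for one small poster
```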
So what should we do with these numbers?
Clustering is one of the core areas of unsupervised learning, and essentially answers "I have a bunch of data, can you segment it into groups for me?"
Perhaps the easiest of these algorithms to understand is K-Means, which can be summarized as
- Pick N random spots ("targets") on the grid
- For each point of data you've got, figure out which target is closest
- Move each target to the middle of the points closest to it
Did the targets stop moving?
- Yes? Done deal
- No? Reassign the points and check again
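The steps above can be sketched in a few lines of numpy. This is a toy version for illustration (the function name and defaults are my own), not sklearn's far more robust implementation:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Toy K-Means following the steps above."""
    rng = np.random.default_rng(seed)
    # pick k random data points as the initial targets
    targets = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # figure out which target each point is closest to
        dists = np.linalg.norm(points[:, None, :] - targets[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each target to the middle of its assigned points
        moved = np.array([points[labels == i].mean(axis=0)
                          if np.any(labels == i) else targets[i]
                          for i in range(k)])
        if np.allclose(moved, targets):  # targets stopped moving: done deal
            break
        targets = moved
    return targets, labels
```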
For a less hand-wavy explanation (featuring great graphics), I like this video
# Youtube
HTML('<iframe width="560" height="315" '
     'src="https://www.youtube.com/embed/IuRb3y8qKX4" '
     'frameborder="0" gesture="media" allow="encrypted-media" '
     'allowfullscreen></iframe>')
But not to dwell too long on the topic, let's cobble together a quick example to demonstrate this visually.
I'm going to lean on the most vanilla dataset in all of Data Science: iris, which is basically petal and sepal measurements of a bunch of flowers.
import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
For the sake of visualization, we're going to throw away all but two columns to make a cheap scatter plot
import matplotlib.pyplot as plt

trimmedData = iris.loc[:, ['sepal_length', 'sepal_width']]
plt.scatter(x=trimmedData['sepal_length'], y=trimmedData['sepal_width']);
Then, we're going to leverage the K-Means implementation in
sklearn to try and separate these points into 3 different groups
from sklearn.cluster import KMeans

model = KMeans(n_clusters=3)
model.fit(trimmedData)
trimmedData['label'] = model.labels_
This runs almost instantly for such a small dataset, and now we can plot the same points, but this time assigning a color based on which group they wound up in.
plt.scatter(x=trimmedData['sepal_length'], y=trimmedData['sepal_width'], c=trimmedData['label']);
To hammer the point home, running K-Means over this data to get three Targets yields three groups:
- The Purple points are centered around, on average,
- The Teal points are centered around, on average,
- The Yellow points are centered around, on average,
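Those "centered around" values are nothing mysterious: sklearn exposes them as model.cluster_centers_, and each one is just the mean of the points assigned to that cluster. A tiny numpy sketch with made-up points and labels (not the actual iris assignments):

```python
import numpy as np

# hypothetical (sepal_length, sepal_width) points and their K-Means labels
points = np.array([[5.0, 3.5], [5.2, 3.7], [6.8, 3.0], [7.0, 3.2]])
labels = np.array([0, 0, 1, 1])

# each cluster's center is the average of the points assigned to it
centers = np.array([points[labels == i].mean(axis=0) for i in range(2)])
print(centers)  # rows: cluster 0 center, cluster 1 center
```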
Our example above clustered points arranged in two-dimensional, X/Y space. However, this extends painlessly into three dimensions, where our
(R, G, B) color definitions live. First, though, we need to take our original
210 x 150 image and basically unravel it into one long chain of RGB values.
numpy makes this a breeze with reshape:
reshaped = vec.reshape(-1, 3)
reshaped.shape
-1 in the function call might seem confusing at first glance, but basically we knew we wanted to package everything into chunks of
3, per RGB. The
-1 is an indicator that
numpy should just figure out how to make that happen. Thus, it takes all 94,500 points of data (as above), and realizes that you can cleanly group them into 3's if you make one long list of 31.5k elements.
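A tiny example makes the -1 trick concrete:

```python
import numpy as np

# a 2x2 "image" with 3 channel values per pixel: 12 numbers total
tiny = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# -1 tells numpy to infer that dimension: 12 values / 3 per chunk = 4 rows
flat = tiny.reshape(-1, 3)
print(flat.shape)  # (4, 3)
```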
At this point, we're wandering into the neighborhood of "our data is getting hard to interpret"
but trust me, this is one long line of every pixel of our original image.
For convenience, I've packaged the rest of my spaghetti code into functions any interested reader can check out here.
from imagetools import (path_to_img_array, pick_colors, show_key_colors)
But basically these:
- Loads an image into a vector, from a given path
path = 'posters/Blade Runner 2.png'
vec = path_to_img_array(path)
vec.shape
(210, 150, 3)
- Unrolls our image and runs K-Means clustering to find "Target Points" all of the pixel values are grouped around (here, we choose 3)
colors = pick_colors(vec, 3)
colors
array([[ 14.84415584,  48.46459201,  67.83851997],
       [  3.55864673,   7.16823366,  16.27111591],
       [124.19018405,  31.15132924,  55.46216769]])
- Finally, one last function to take these Targets and plot out some simple boxes to show the colors it found
(14, 48, 67) (3, 7, 16) (124, 31, 55)
Looks about right, yeah?
Image.open('posters/Blade Runner 2.png')
As I played around with this, I quickly discovered that generating a meaningful color palette from an image was very sensitive to how many Targets you ask
sklearn to sniff out.
Let's take a look at a more interesting poster (from a movie I'd never heard of...)
path = 'posters/3 Idiotas.png'
img = Image.open(path)
img
Running K-Means for a mere two colors gives the following
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 2)));
(35, 100, 136) (203, 158, 132)
Interestingly, we captured the blue of the image (the majority color). On the other hand, we've determined that the "average color" of lime green, fuchsia, honey yellow, hot pink, orange, and lavender is... a gross beige, lol
But take a look at what happens when we allow for more and more means:
At 3, we separate light and dark tones and the blue pops
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 3)));
(25, 127, 187) (214, 176, 140) (81, 61, 59)
At 4, we extract brown tones from light/dark
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 4)));
(222, 208, 200) (26, 126, 185) (49, 47, 42) (185, 125, 85)
At 5, we split brown to get purple and yellow
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 5)));
(143, 101, 109) (19, 127, 189) (222, 206, 200) (42, 39, 35) (230, 175, 47)
At 6, we split purple into a salmon and an olive green
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 6)));
(42, 32, 32) (89, 120, 115) (180, 90, 105) (16, 128, 192) (232, 181, 39) (222, 207, 200)
At 7, we split our blue into two
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 7)));
(183, 91, 107) (9, 124, 189) (39, 29, 28) (232, 212, 201) (232, 181, 38) (84, 106, 93) (114, 159, 193)
Finally, 8 gives us about as diverse a palette as we'd like from this image
img = path_to_img_array(path)
plt.imshow(show_key_colors(pick_colors(img, 8)));
(108, 160, 200) (233, 180, 29) (39, 30, 29) (82, 112, 101) (236, 231, 226) (9, 124, 189) (173, 74, 94) (213, 166, 145)
So obviously allowing for more Targets could help us pick out more unique colors. Neat.
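Under the hood, the whole pick-the-colors step boils down to something like this sketch, using sklearn directly. The function name and the synthetic two-color "poster" here are stand-ins of mine, not the post's exact code:

```python
import numpy as np
from sklearn.cluster import KMeans

def palette(vec, k):
    """Cluster an (H, W, 3) pixel array into k representative colors."""
    pixels = vec.reshape(-1, 3)
    model = KMeans(n_clusters=k, n_init=10).fit(pixels)
    # each cluster center is an "average color"; round to whole RGB values
    return np.rint(model.cluster_centers_).astype(int)

# synthetic "poster": left half red, right half blue
demo = np.zeros((10, 10, 3), dtype=np.uint8)
demo[:, :5] = (200, 30, 30)
demo[:, 5:] = (30, 30, 200)
print(palette(demo, 2))
```

With only two distinct pixel values and k=2, each center lands exactly on one of the two colors; with a real poster and larger k, the centers split finer and finer shades, as seen above.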
Until now, we've been looking at meager
150 x 210 images. What happens when we examine larger images?
To play with this idea, I took a still from The Grand Budapest Hotel, a delightfully-colorful movie that I love.
At a glance, it seems like a no-brainer what 5 colors we'd come up with
wes = Image.open('Images/wes.png')
wes