Convert images to arrays

Convert any image to data array

Oftentimes, you will have an image that you want to extract data from. It will most likely be something annoyingly minor, but will somehow turn into the most important piece of the puzzle that you NEED RIGHT NOW! Okay… maybe that’s just me. Sometimes it’s nice to be able to extract data from an image you found in a journal article to replicate the results.

Here, I will go through the steps I took to convert an image into RGB format (using Python - naturally) and convert it into an array of data. From there it gets interesting. A lot of images I come across have been interpolated from relatively sparse data. You don’t need all that data! It’s a waste of space and you’re better off trimming the bits you don’t actually need - especially if you would like to do some spatial post-processing.

Here are the main steps:

  1. Import the image using matplotlib.pyplot.imread (scipy.ndimage.imread in older versions of SciPy).
  2. Set up a colour map using matplotlib.colors.
  3. Query the colour map with many numbers along the interval you’ve defined and store that in a KDTree.
  4. Query the KDTree using the $m \times n \times 3$ image RGB array. The result is an $m \times n$ data array.
  5. Reduce the data by taking the first spatial derivative using numpy.diff and removing points where $|\nabla| > 0$.
  6. Use DBSCAN from the scikit-learn module to cluster the data and remove even more unnecessary data.
  7. Have a beer.

At the end I have a Python class that contains all these steps rolled into one nifty code chunk for you to use on your own projects, if you find this useful.

Ready? Here we go…

Load the image

from scipy import ndimage
import numpy as np
import matplotlib.pyplot as plt

im = plt.imread("filename.jpg") # scipy.ndimage.imread was removed in SciPy 1.2; JPEGs load as an m x n x 3 uint8 array
plt.imshow(im)
plt.show()

This image is from supplementary information in Pilia et al. (2015), Evidence of micro-continent entrainment during crustal accretion, Scientific Reports, 5(8218), pp. 1–6

That looks nice. im is an $m \times n \times 3$ array - i.e. $m$ rows and $n$ columns of pixels, with one value per RGB channel. Now let’s create the colour map and chuck everything into a KDTree for nearest-neighbour lookup. Seriously, KDTrees are awesome. srsly.

from scipy.spatial import cKDTree
from matplotlib import colors as mcolors

def scalarmap(cmap, vmin, vmax):
    """ Create a matplotlib scalarmap object from colour stretch """
    cNorm = mcolors.Normalize(vmin=vmin, vmax=vmax)
    return plt.cm.ScalarMappable(norm=cNorm, cmap=cmap)
    
scalarMap = scalarmap(plt.get_cmap("jet_r"), 0.0, 0.7)
value_array = np.linspace(0,0.7,10000)
color_array = np.asarray(scalarMap.to_rgba(value_array))
color_array *= 255 # rescale 0-1 RGBA floats to match 0-255 image values

tree = cKDTree(color_array[:,:-1]) # Last column is alpha
d, index = tree.query(im)
data = value_array[index]

We scaled the colour map between 0.0 and 0.7, which corresponds to the extent in the image. Optionally, we could mask pixels that are some distance, $d$, from the colour map. E.g. data = np.ma.array(value_array[index], mask=(d > 70)). This would mask out the black text and white background in the image.
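As a minimal sketch of that masking step (the three-entry colour table and the 70-unit threshold below are illustrative assumptions, not the values from the image above):

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical 3-entry colour table (RGB, 0-255 scale) and its data values
color_array = np.array([[255, 0, 0],    # red
                        [0, 255, 0],    # green
                        [0, 0, 255]])   # blue
value_array = np.array([0.0, 0.35, 0.7])

tree = cKDTree(color_array)

# Query two pixels: one on the colour map, one pure white (background)
pixels = np.array([[250, 5, 5],        # nearly red
                   [255, 255, 255]])   # white - far from every entry
d, index = tree.query(pixels)

# Mask pixels further than 70 RGB units from the nearest colour table entry
data = np.ma.array(value_array[index], mask=(d > 70))
print(data)  # the white pixel is masked
```

The nearly-red pixel snaps to the red entry (value 0.0), while the white pixel sits hundreds of RGB units from every entry and gets masked - exactly what happens to text and background in the real image.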

I don’t know a better way to map the RGB colour arrays to data than to query entries in a KDTree. If you do, please mention in the comments below.

Data reduction

Now that we have converted an image into an array we can reduce the data so that only the important bits remain. Firstly, we take the first spatial derivative and find where $\nabla = 0$.

dx = np.abs(np.diff(data[:,:-1], axis=0))
dy = np.abs(np.diff(data[:-1,:], axis=1))
dxdy = (dx + dy)/2.0

mask = dxdy < 1e-12
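To see what the derivative mask does, here is a sanity check on a made-up array (not the image above) with a flat plateau next to a ramp:

```python
import numpy as np

# Synthetic "interpolated" field: flat on the left, a linear ramp on the right
data = np.zeros((4, 6))
data[:, 3:] = np.arange(3) * 0.1  # ramp across the last three columns

dx = np.abs(np.diff(data[:, :-1], axis=0))
dy = np.abs(np.diff(data[:-1, :], axis=1))
dxdy = (dx + dy) / 2.0

mask = dxdy < 1e-12
print(mask)  # True on the flat plateau, False where the field slopes
```

Note that np.diff shaves one row and one column off, so the mask is $(m-1) \times (n-1)$ - which is why the meshgrid below is built with shape - 1.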

Here, mask is a boolean array (one row and column smaller than data, thanks to np.diff) that contains True values where we have good data. But we can reduce it even further using DBSCAN. This finds spatial regions of high density and expands clusters from them. I think this is basically a higher-order use of KDTrees, but let’s humour the scikit-learn toolkit anyway.

from sklearn.cluster import DBSCAN

Xcoords = np.arange(data.shape[1]-1)
Ycoords = np.arange(data.shape[0]-1)
xx, yy = np.meshgrid(Xcoords, Ycoords)

db = DBSCAN(eps=0.05, min_samples=1).fit(np.column_stack([xx[mask], yy[mask]]))
labels = db.labels_
num_clusters = len(set(labels)) - (1 if -1 in labels else 0)

def get_centroid(points):
    """ find centroid of a cluster in xy space """
    n = points.shape[0]
    sum_lon = np.sum(points[:, 0])
    sum_lat = np.sum(points[:, 1])
    return (sum_lon/n, sum_lat/n)
    
tree = cKDTree(np.vstack([xx[mask], yy[mask]]).T)
x = np.zeros(num_clusters, dtype=int)
y = np.zeros(num_clusters, dtype=int)

for i in range(num_clusters):
    xc = xx[mask][labels == i]
    yc = yy[mask][labels == i]
    
    centroid = get_centroid(np.vstack([xc,yc]).T)
    
    d, index = tree.query(centroid)
    x[i], y[i] = tree.data[index]
    
mask2 = np.zeros_like(xx, dtype=bool)
mask2[y,x] = True
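To sanity-check the centroid-snapping logic in isolation, here is a tiny synthetic example (the point coordinates and eps=1.5 are made up for illustration; for integer pixel coordinates eps needs to exceed the pixel spacing):

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.cluster import DBSCAN

# Two well-separated blobs of pixel coordinates
pts = np.array([[0, 0], [1, 0], [0, 1],         # cluster A
                [10, 10], [11, 10], [10, 11]])  # cluster B

db = DBSCAN(eps=1.5, min_samples=1).fit(pts)
labels = db.labels_
num_clusters = len(set(labels)) - (1 if -1 in labels else 0)

tree = cKDTree(pts)
centroids = []
for i in range(num_clusters):
    cluster = pts[labels == i]
    centroid = cluster.mean(axis=0)
    # snap the centroid back onto an actual data point
    d, index = tree.query(centroid)
    centroids.append(tree.data[index])

centroids = np.array(centroids)
print(num_clusters, centroids)  # 2 clusters, one representative point each
```

Six points collapse to two representatives, and because each centroid is snapped to its nearest real point, the survivors are guaranteed to index valid pixels - which is what lets mask2[y,x] = True work above.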

Above, we found the centroids for each cluster to substantially reduce the dataset from 16667 points down to 1221. Each colour represents a unique cluster in the image below.

DBScan on remaining points to identify clusters

mask2 has significantly improved on mask by finding clusters in the data. Now that it’s done, let’s see if we can reconstruct the image we first started with. This is a good check to make sure we haven’t removed too much of the original data.
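One way to run that check is to interpolate the retained points back onto a regular grid. This sketch uses scipy.interpolate.griddata on synthetic stand-in data (the interpolation scheme is my assumption; the post doesn’t specify which one produced the figure below):

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic stand-ins for the retained scattered points and their values
rng = np.random.default_rng(42)
x = rng.uniform(0, 1, 200)
y = rng.uniform(0, 1, 200)
values = x + y  # a simple planar field as the "data"

# Interpolate back onto a regular grid (kept inside the convex hull)
gx, gy = np.meshgrid(np.linspace(0.1, 0.9, 50), np.linspace(0.1, 0.9, 50))
recon = griddata((x, y), values, (gx, gy), method="linear")

# Compare the reconstruction against the true field on the grid
err = np.nanmax(np.abs(recon - (gx + gy)))
print(err)
```

For real data the residual won’t vanish like it does for a plane, but a map of recon minus the original array shows exactly where the reduction threw away too much.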

Left is an interpolation from 1221 points, right is the original data array

Overall the two look pretty similar! There are some bits that could be refined by adjusting some of the parameters, e.g. eps in DBSCAN and the $\nabla \approx 0$ threshold. Also, adjusting the distance threshold in the original RGB $\rightarrow$ data conversion would remove artifacts caused by the coastline (i.e. bits that don’t fall on the colour map).

Dr. Ben Mather
Research Fellow in Geophysics

My research interests include Bayesian inversion, Python programming and Geodynamics.