Magnificent Mosaics

Due: Apr 14, 23:59 PM

Learning Objectives

  • Practice using binary search trees as dictionaries for large datasets
  • Visualize timing differences in competing approaches
  • Implement and use a nearest neighbor search (NNS) algorithm
  • Solve open-ended code problems with limited provided structure

Getting set up

While most of the mini-project is autograded (and should be submitted like any regular assignment), Prairielearn may not have the physical memory or speed to build a mosaic of your choosing. While it may be possible to do this entire assignment in the provided workspace, you are encouraged to run your code locally as well for improved speed of calculations.

Assignment Description

A PhotoMosaic is a picture created by taking some source picture, dividing it up into rectangular sections, and replacing each section with a small thumbnail image whose color closely approximates the color of the section it replaces. Viewing the PhotoMosaic at low magnification, the individual pixels appear as the source image, while a closer examination reveals that the image is made up of many smaller tile images.

In this mini-project, you will code up necessary functions to create a mosaic in two different ways (one solution has already been provided) and observe how the accuracy and speed of three different approaches.

Python Image Processing

Before we go into the details about your assignment, lets review the format and implementation details of reading in Python Images! Using PNGs (uncompressed image files), we can represent any image as a three-dimensional matrix. For an input image of (width x height) total pixels, this matrix would have the dimensions (width x height x 3) as every pixel in our matrix has three values associated with it. If it helps conceptually, you can also thing of this as 2D array which stores a tuple of size 3 (as we will often use tuples as inputs in this assignment).

Figure 1: For the following (5 x 5) pixel image, the output numpy array would be: [[[ 45 218 0], [223 147 243],[116 57 223],[187 9 9], [238 208 236]];[[216 190 15], [193 64 80], [184 35 215], [ 95 152 180], [128 36 41]];[[101 128 53], [224 122 191], [237 212 74], [ 35 98 227], [214 66 167]];[[188 3 211], [217 142 33], [210 229 167], [208 57 22], [ 3 213 235]];[[ 11 172 37], [225 191 57], [184 130 34], [136 33 51], [ 26 220 67]]]. Commas and semicolons have been added to emphasize the breakdown of rows and columns.

The following functions have been provided for you to assist with reading in and processing images. There are additional helper functions, but you will likely not need to call them (such as squareAndResizeImage() which is called automatically when you loadImage() with a resize value).

# A string containing the path to a PNG image (fileName)
# An integer containing the square dimension the image should be reduced to (resize)
# A numpy array containing a (row, column, color vector) 3D array
np.array(width x height x color) loadImage(string fileName, int resize = None)

# A string containing the path to a file location
# A list of strings containing the paths to all valid image files at that location
list[string] listTileImagesInPath(string path)

# A string containing the path to a file location
# An optional integer argument that will resize the input image into a square of the appropriate size 
# A numpy array storing a 3D (row, col, color) representation of an image
np.array(width x height x color) getTileImage(string fileName, size=None)

More generally, you may not need to use any of these functions as the bottom part of the Python notebook has an already fully implemented script for building (and saving) your mosaics. Your part of this assignment is to implement some of the key functions called in the provided code.

Naive Mosaic Matching (25 points)

As seen in class, we can produce an 100% accurate mosaic at the expense of a very slow runtime by exhaustively comparing every possible tile image. In order to enable this functionality in the provided scripts, you must implement the following functions.


# Two size-3 tuples containing RGB values being compared (c1, c2)
# A float containing the Euclidian distance between these colors
# HINT: math.sqrt() exists and can be used here.
# Dont delete the import math line if you do!
float exactColorDist(tuple(R, G, B) c1, tuple(R, G, B) c2):

As seen in class, we can treat three-dimensional colors (RGB-values) as a point in 3D space with values ranging from (0, 0, 0) to (255, 255, 255). You should use the standard Euclidean Distance formula in order to implement this function. Most numbers will not result in ‘clean’ numbers but the autograder has an allowable error tolerance of 0.0001 (if you choose to truncate your calculation for whatever reason).

Input: (255, 0, 0), (255, 255, 0), Output: 255.0
Input: (121, 246, 111), (149, 247, 5), Output: 109.64032105024137


# A list of size-3 tuples containing average colors (inlist)
# A size-3 tuple containing the query color of interest (query)
# A size-3 tuple which is the closest match to the input query
# among every item in inlist
tuple(R, G, B) getClosestColor(list[tuple(R, G, B)] inlist, tuple(R, G, B) query):

Write a function that takes in a list of different colors and a single query color (all defined as a size-3 tuple) and returns the closest matching color. This can be done naively (as efficiency is not being tested) but must be done exactly. You are strongly encouraged to use exactColorDist().


# A 3D numpy array indexed by [row][col][RGB-color]) (numArray)
# Four optional integer arguments:
# rstart gives the starting index (in rows) that must be averaged
# cstart gives the starting index (in cols) that must be averaged
# rlen & clen give the length of values that must be averaged (starting from start indices)
# With all four optional arguments, this forms a rectangle.
# Remember that 0,0 is the top left of an image
# A size-3 tuple containing the average color of the image (or subimage defined by four args)
def getAverageColor(numArray, rstart=0, cstart=0, rlen=None, clen=None):

The default behavior of this function is fairly straightforward. When given a (width x height x color) numpy array, compute the average color of the entire array by taking the average R value, G value, and B value.

NOTE: Remember that RGB values must be integers. In this instance, you should convert whatever float output the average gives to an integer, which will result in truncation math. See the second example below, where the average values are all 127.5, but the appropriate color is 127.

Input: The 5x5x3 matrix given in Figure 1, Output: (153, 125, 114)

Input:                                     Output: (127, 127, 127)
[[[0 0 0][0 0 0]]
[[255 255 255][255 255 255]]]

The complexity of this function comes in when the optional arguments are submitted (you can assume all four optional arguments will always be given if any of them are). When you are given these arguments instead of averaging the entire image (read: the entire numpy array), you should instead average only the rectangular values between points (rstart, cstart) and (rstart+rlen-1, cstart+clen-1). Do you understand why the ‘-1’ is included?

Lets look at the examples below. In the first example, we are starting at position (row=0, col=0) and the total length we are comparing is a 1x1 square. So our average value is just whatever pixel is at (0, 0).

In the second example, we start at (row=1, col=0) and my column length is 2. So I get the average of the first two values in row 1, starting at column 0 [Exact indices: (1, 0), (1, 1)]. They are both (255, 255, 255) so my average is also (255, 255, 255).

In my last example, we start at (row=0, col=0) but our row length is 2. So we get the average of the first value in row 0 and the first value in row 1 [Exact indices: (0, 0), (1, 0)]. This yields an average of (127, 127, 127)

Input: The 5x5x3 matrix given in Figure 1, Output: (45, 218, 0)
rstart=0, cstart=0, rlen=1, clen=1

Input:                                     Output: (255, 255, 255)
[[[0 0 0][0 0 0]]
[[255 255 255][255 255 255]]]
rstart=1, cstart=0, rlen=1, clen=2

Input:                                     Output: (127, 127, 127)
[[[0 0 0][0 0 0]]
[[255 255 255][255 255 255]]]
rstart=0, cstart=0, rlen=2, clen=1

Warning: Successfuly implementing this function (with optional arguments) is key for the mosaic. Otherwise you will not be correctly averaging the color of each square you plan to replace with a new tile image (that most closely matches that square’s average color).

Luminance Binary Search Tree (30 points)

We would like to speed up our nearest neighbor search but without coding up the complexity of a k-d tree. As discussed in class, one method of doing this is to convert our multi-dimensional object into a single dimension. Given our objects are RGB color values, there is a meaningful dimension reduction in the form of luminance – a measurement of relative black-white.

The core of this ‘transformation’ is already implemented for you in getLum(), where a simple equation takes in an RGB and returns a single float. Your job is to use this transformation to build a binary search tree capable of finding the nearest neighbor.


# A bstNode containing the root of the tree (or sub-tree) being inserted into. (root)
# A size-3 tuple containing the average color of the image being inserted (key)
# A string containing the path and filename to the image being inserted (val)
# A bstNode containing the root of the tree after the new node has been added
# HINT: Use 'getLum' to compare values as 1D luminance NOT as 3D colors.
bstNode lum_tree_insert(bstNode root, tuple(R, G, B) key, string value):

The luminance tree is a binary search tree where my keys are size-3 tuples containing RGB values (specifically the average value of an image) and my value is a string containing the filename of that image. From an implementation point of view, there is no difference between inserting into a luminance tree and any of the BSTs we have seen in class other than the fact that you must compare 1D luminance values but store size-3 color tuples.

Several small datasets have been provided alongside a build_lum_tree() function for testing the correctness of your insert function. For example:

If I call lum_tree_insert in the following order: [(0, 255, 0), (0, 255, 255), (255, 0, 0), (255, 255, 0), (255, 0, 255), (0, 0, 255)]

The final output is:
        |--(255, 255, 0)
    |--(0, 255, 255)
|--(0, 255, 0)
        |--(255, 0, 255)
    |--(255, 0, 0)
        |--(0, 0, 255)

NOTE: The order of inputs matter and the provided examples in the code may not exactly match with your final output. This is ok – your local machine is not going to store the files in the exact same way and you are likely seeing an issue with the input order moreso than your code. Use the autograder to test for identical trees (as the order will be fixed and run on the same architecture).


# A bstNode containing the root of the tree (or sub-tree) being inserted into. (root)
# A size-3 tuple containing the average color being queried (key)
# A bstNode containing the closest match (in luminance) between query and any node in the tree
# HINT: Use 'getLum' to compare values as 1D luminance NOT as 3D colors.
bstNode lum_tree_find(bstNode root, tuple(R, G, B) key):

The nearest neighbor find is very similar to the binary search tree find() function we covered in class. The only difference is that the actual nearest neighbor may be any node along the path we took to the leaf. For example given the tree below, the node (255, 0, 0) does not exactly match our query (90, 75, 50) but it is a closer match than the leaf (255, 0, 255) and the root (0, 255, 0).

Given the tree w/ query: (90, 75, 50)
        |--(255, 255, 0)
    |--(0, 255, 255)
|--(0, 255, 0)
        |--(255, 0, 255)
    |--(255, 0, 0)
        |--(0, 0, 255)

The bstNode with key=(255, 0, 0) and value='data/rainbow/color_0.png'

Creating your own mosaic (20 points)

If you have made it here, congratulations! You now have a Python script capable of reading in large collections of images and producing a mosaic in three different ways.

Your next task is to be creative and have fun! Find a target image that you want to convert into a mosaic and find a set of small images you want to use as your replacement tiles. If you cannot find the small tile image, you can also use the instragram imageset (igram). You should not use any of the other test image sets given, as they are monochromatic or random images designed to test your code in human-readable ways.

EDIT: Since apparently PL wont allow you to download the full igram set as a directory, you can find the full dataset as a zip file HERE.

Submit your data visualization (and code)

Unlike the previous MPs, the provided code should work out of the box for any input you choose and you will likely not have to write additional code. That said, you will still need to submit your full processing pipeline (including a copy of your own implementation of part 1) as well as your final visualization separately.

Your final visualization should be a PNG of an image of your choosing (and ideally using a tileset of your own design). You should run your image with maximumTilesX set to at least 200 and a tileHeight set to at least 16 but can otherwise change your parameters freely. (This includes which implementation you use to produce your final image)

To ensure that your exploration of this provided code is educational, you are also required to measure the time it takes each of the three implementations (naive, luminance, KD-tree) to produce your mosaic for at least five unique input parameter sets. As the naive approach will take a very long time, your final submission does not need to come from one of these five runs! If you want to make a big final image with a lot of options for tiles, you are strongly encouraged to only use luminance and the KD-tree implementation.