# Overview

Find the best way to partition the given figure in two parts: a monochromatic rectangle and a monochromatic background. The objective is to minimise the sum of squared errors.

# Implementation

We will implement a command line tool pngsegment that you can use to do image segmentation in PNG files.

## Usage

The program can be used as follows:

    ./pngsegment INPUT OUTPUT1 OUTPUT2


where INPUT is the input file and OUTPUT1 and OUTPUT2 will show the segmentation: OUTPUT1 will draw the approximation of the image as a monochromatic rectangle and a monochromatic background, while OUTPUT2 will show the location of the rectangle overlaid with the original figure.

For example, if you have a PNG file called example.png in the current directory, you can do image segmentation as follows:

    ./pngsegment example.png result1.png result2.png


## Template

You can find an almost-working implementation in our shared code repository, in the subdirectories is*. There is one directory per task; the templates are identical.

The only part that is missing is the implementation of the subroutine segment() in file is*/is.cc. This is the function that actually does segmentation. See is*/is.h for the description of the interface.

Once you have implemented the function correctly, you should be able to run make to compile your code, make test to check that it works correctly (at least for some test cases) and make benchmark to see how well it performs.

# Detailed specification

You need to implement the following function:

    Result segment(int ny, int nx, const float* data)


Here data is a colour image with ny*nx pixels, and each pixel consists of three colour components, red, green, and blue. In total, there are ny*nx*3 floating point numbers in the array data.

The colour components are numbered 0 <= c < 3, x coordinates are numbered 0 <= x < nx, y coordinates are numbered 0 <= y < ny, and the value of this colour component is stored in data[c + 3 * x + 3 * nx * y].

## Correct output

The function has to return an instance of the following structure that indicates the optimal segmentation:

    struct Result {
int y0;
int x0;
int y1;
int x1;
float outer[3];
float inner[3];
};


The first four fields indicate the location of the rectangle. The upper left corner of the rectangle is at coordinates (x0, y0), and the lower right corner is at coordinates (x1-1, y1-1). That is, the width of the rectangle is x1-x0 pixels and the height is y1-y0 pixels. The coordinates have to satisfy 0 <= y0 < y1 <= ny and 0 <= x0 < x1 <= nx.

The last two fields indicate the colour of the background and the rectangle. Field outer contains the three colour components of the background and field inner contains the three colour components of the rectangle.

## Objective function

For each pixel (x,y) and colour component c, we define the error error(y,x,c) as follows:

• Let colour(y,x,c) = data[c + 3 * x + 3 * nx * y].
• If (x,y) is located outside the rectangle: error(y,x,c) = outer[c] - colour(y,x,c).
• If (x,y) is located inside the rectangle: error(y,x,c) = inner[c] - colour(y,x,c).

The total cost of the segmentation is the sum of squared errors, that is, the sum of error(y,x,c) * error(y,x,c) over all 0 <= c < 3 and 0 <= x < nx and 0 <= y < ny.

Your task is to find a segmentation that minimises the total cost.

## Numerical stability

The input and output are given as single-precision floating point numbers (type float). However, please use double-precision floating point numbers (type double) for all intermediate results, and only round the final result to single precision (unless you really know what you are doing).

Subdirectory: is1, file to edit: is1/is.cc.

Implement an algorithm for image segmentation. You can use the naive algorithm that tries out all possible locations 0 <= y0 < y1 <= ny and 0 <= x0 < x1 <= nx for the rectangle and finds the best one.

However, your implementation has to be reasonably efficient. In particular, make sure that for each possible location of the rectangle you only need to perform O(1) operations to evaluate how good this position is. That is, the total complexity for an n × n figure has to be O(n4). See the hints for suggestions.

You are supposed to use the following technique to parallelise your code:

• Multiple threads with OpenMP: Different threads try out different candidate locations.
• Vector instructions: Store the colour image so that each pixel is a 4-element vector, with 3 colour components and a zero for padding. This way you can do all calculations for all colour components in parallel.

Examples of a sufficient performance with classroom computers:

• nx = ny = 400: roughly 10 seconds

Subdirectory: is2, file to edit: is2/is.cc.

Design a more efficient algorithm that (at least in typical cases) does not need to try out all possible locations of the rectangle.

# Hints

It is important that you can very efficiently calculate the cost of placing a rectangle at certain position. A little bit of math is needed to do this.

Let $P$ be the set of all $(x,y)$ coordinates. For each pixel $p \in P$, let $v_p$ be its colour in the input. That is, $v_p$ is a vector of size 3, where $v_{p,0}$ is the red component, $v_{p,1}$ is the green component, and $v_{p,2}$ is the blue component.

Let us partition $P$ in two disjoint sets, $X \cup Y = P$ and $X \cap Y = \emptyset$. In our application, $X$ is the “rectangle” and $Y$ is the “background”, but this is not yet important here.

We choose colour $a$ for set $X$ and colour $b$ for set $Y$. Again, these are vectors of size 3.

For each pixel $p \in X$ and colour component $c$ the error is $a_c - v_{p,c}$, and for each pixel $p \in Y$ and colour component $c$ the error is $b_c - v_{p,c}$. The total cost over all pixels and all colour components is therefore $$f(X,Y,a,b) = \sum_c \sum_{p \in X} \left(a_c - v_{p,c}\right)^2 + \sum_c \sum_{p \in Y} \left(b_c - v_{p,c}\right)^2,$$ and our task is to find $(X,Y,a,b)$ that minimises the cost $f(X,Y,a,b)$. Equally well we can maximise the utility $$g(X,Y,a,b) = - f(X,Y,a,b) + \sum_c \sum_{p \in P} v_{p,c}^2$$

Let us introduce the shorthand notation $$v_{X,c} = \sum_{p \in X} v_{p,c}, \quad v_{Y,c} = \sum_{p \in Y} v_{p,c}.$$ We can rewrite the cost as follows: $$\begin{split} f(X,Y,a,b) &= \sum_c \left( \sum_{p \in X} a_c^2 - \sum_{p \in X} 2a_c v_{p,c} + \sum_{p \in X} v_{p,c}^2 + \sum_{p \in Y} b_c^2 - \sum_{p \in Y} 2b_c v_{p,c} + \sum_{p \in Y} v_{p,c}^2 \right) \\ &= \sum_c \left( |X| \cdot a_c^2 + |Y| \cdot b_c^2 - 2a_c v_{X,c} - 2b_c v_{Y,c} + \sum_{p \in P} v_{p,c}^2 \right). \end{split}$$ Therefore the utility that we want to maximise is simply $$g(X,Y,a,b) = \sum_c \left( 2a_c v_{X,c} + 2b_c v_{Y,c} - |X| \cdot a_c^2 - |Y| \cdot b_c^2 \right).$$

To maximise $g(X,Y,a,b)$, we try all possible partitions $(X,Y)$, and for each partition we find the best colours $(a^*,b^*)$ that maximise $g(X,Y,a^*,b^*)$. By differentiation, it is easy to see that the optimal colours are the arithmetic means $$a^*_c = \frac{v_{X,c}}{|X|}, \quad b^*_c = \frac{v_{Y,c}}{|Y|}.$$ We use the shorthand $h(X,Y) = g(X,Y,a^*,b^*)$ for the utility of partition $(X,Y)$ for the best choice of colours. Note that this function takes the following very simple form: $$h(X,Y) = \sum_c \left( \frac{v_{X,c}^2}{|X|} + \frac{v_{Y,c}^2}{|Y|} \right).$$

Therefore to find an optimal segmentation, it is sufficient to check all possible rectangles $X$ and calculate $h(X,Y)$. To do that, we only need to know $v_{X,c}$ and $v_{Y,c}$. Naturally, $v_{Y,c} = v_{P,c} - v_{X,c}$. Hence to calculate $h(X,Y)$ efficiently, we only need to know how to calculate $v_{X,c}$ efficiently, for any rectangle $X$.

Note that $v_{X,c}$ is simply the sum of colour component $c$ for all pixels in rectangle $X$. With a little bit of preprocessing, we can calculate this sum in constant time.

Let us write $s(x_0,y_0,x_1,y_1,c) = v_{R,c}$, where $R = [x_0,x_1-1] \times [y_0,y_1-1]$ is the rectangle with the corners $(x_0,y_0)$ and $(x_1,y_1)$. Then by the inclusion–exclusion principle, $$\begin{split} s(x_0,y_0,x_1,y_1,c) {}={} &s(0,0,x_1,y_1,c) \\ {}-{} &s(0,0,x_0,y_1,c) \\ {}-{} &s(0,0,x_1,y_0,c) \\ {}+{} &s(0,0,x_0,y_0,c). \end{split}$$ So we only need to precompute $s(0,0,x,y,c)$ for all $x$, $y$, and $c$. With this precomputed information, we can then calculate $h(X,Y)$ for any rectangle $X$ in constant time. Vector types will help us to do all work conveniently for all colour components in parallel.

For further performance improvements, it may be helpful to organise the loops so that the outer loop tries out different sizes of the rectangle, and the inner loop tries all possible positions of the rectangle. Then $1/|X|$ and $1/|Y|$ remain constant in the inner loop.

If you are comfortable with studying the assembly code produced by the compiler, these observations related to branch predication may also be helpful.