2D Convolution

Video Demo
Explanation

Convolution is an operation by which two functions/signals can be combined into a third one. The convolution operation for the 1-dimensional case is described here.

The convolution operation can also be defined for higher dimensions. In the case of 2-dimensional signals we can picture the process as overlaying one picture (the kernel) on top of the other (the input image), weighting (i.e. multiplying) each pixel in the lower image with the corresponding pixel value of the image above, and then summing all the weighted values. This is done for each possible position, and the resulting sums are collected in the output image. The size of the output image corresponds to the number of possible positions in which the two images can be overlaid on each other. Because the two images must always fully overlap, a larger kernel can be positioned in fewer ways than a smaller kernel. Therefore the width of the output image must be input_image_width - kernel_width + 1, and likewise for the height. The input image size can be increased by introducing additional padding.
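The overlay-multiply-sum procedure and the output size rule can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the demo's own code: the function names `pad` and `slide` are made up here, padding is assumed to be filled with zeros, and the kernel is overlaid as-is without any flipping.

```python
def pad(image, p, value=0.0):
    """Surround a 2D list with p rows/columns of a constant value (assumed: zeros)."""
    width = len(image[0]) + 2 * p
    top = [[value] * width for _ in range(p)]
    middle = [[value] * p + list(row) + [value] * p for row in image]
    bottom = [[value] * width for _ in range(p)]
    return top + middle + bottom


def slide(image, kernel):
    """Overlay the kernel at every fully overlapping position; multiply and sum."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = ih - kh + 1, iw - kw + 1  # number of possible positions
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            row.append(sum(image[y + i][x + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]          # 3 rows x 4 columns
kernel = [[1, 0],
          [0, 1]]                  # 2 x 2

result = slide(image, kernel)              # 2 rows x 3 columns
padded_result = slide(pad(image, 1), kernel)  # 4 rows x 5 columns
```

With a padding of 1 the image grows from 3 x 4 to 5 x 6, so the same 2 x 2 kernel fits in 4 x 5 positions instead of 2 x 3.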

Use the brush to draw something into the Input Image below. Set the padding to increase the image's size. Then use the brush to draw a pattern into the filter's Kernel image. Click on a pixel in the Padded Image to see which pixels in the Filtered Image are affected. Select a pixel in the Filtered Image to see which pixels from the Padded Image were used to calculate its value.

For the Filter you can choose between a Convolution or Correlation operation. The difference is whether the Kernel gets flipped (rotated by 180 degrees) when overlaid on top of the Padded Image. For Kernels that look the same after this flip it makes no difference. For non-symmetric Kernels, notice the overlaid Kernel's orientation when focusing pixels in the Padded Image and in the Filtered Image. Compare this with the 1-dimensional convolution.
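The difference between the two operations can be illustrated with a single bright pixel as input. The sketch below is an assumption about how one might implement it, not the demo's code; `flip`, `correlate` and `convolve` are names chosen for this example.

```python
def flip(kernel):
    """Rotate the kernel by 180 degrees (reverse the rows and each row)."""
    return [row[::-1] for row in kernel[::-1]]


def correlate(image, kernel):
    """Overlay the kernel unflipped at every fully overlapping position."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(len(image[0]) - kw + 1)]
            for y in range(len(image) - kh + 1)]


def convolve(image, kernel):
    """Convolution is correlation with the flipped kernel."""
    return correlate(image, flip(kernel))


impulse = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
kernel = [[1, 2],
          [3, 4]]                    # non-symmetric

conv_out = convolve(impulse, kernel)   # reproduces the kernel
corr_out = correlate(impulse, kernel)  # reproduces the flipped kernel

symmetric = [[1, 2],
             [2, 1]]                 # unchanged by a 180-degree rotation
```

Convolving the single pixel stamps a copy of the kernel into the output, while correlating it stamps the rotated copy. For the `symmetric` kernel both calls return the same result.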

You can also choose from various pooling operations. The pooling determines how the values overlaid by the kernel at each position are combined into a single output value. For classic convolution and correlation the kernel values and image values are multiplied element-wise and then summed. But instead of summing the values one could also select only the minimum, only the maximum or only the mean. With the minimum or maximum in place of the sum, the filter is no longer linear (the mean is just a scaled sum and stays linear). One additional non-linear pooling strategy is to not multiply the kernel values with the image values but instead check them for pairwise equality. Then the filter's output value is 1 only at positions where the image matches the kernel exactly and 0 everywhere else.
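The pooling strategies can be sketched by swapping the function that reduces the overlaid values. The mode names (`"sum"`, `"min"`, `"max"`, `"mean"`, `"match"`) and the function `apply_filter` are assumptions made for this example and need not match the demo's internals; the kernel is overlaid without flipping.

```python
def apply_filter(image, kernel, pool="sum"):
    """Slide the kernel over the image and reduce each window with `pool`."""
    reducers = {
        "sum": sum,
        "min": min,
        "max": max,
        "mean": lambda values: sum(values) / len(values),
    }
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            # (image value, kernel value) pairs under the kernel at this position
            pairs = [(image[y + i][x + j], kernel[i][j])
                     for i in range(kh) for j in range(kw)]
            if pool == "match":
                # no multiplication: 1 only if every pixel equals its kernel pixel
                row.append(1 if all(p == k for p, k in pairs) else 0)
            else:
                row.append(reducers[pool]([p * k for p, k in pairs]))
        out.append(row)
    return out


image = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
kernel = [[0, 1],
          [1, 1]]

summed = apply_filter(image, kernel, "sum")
largest = apply_filter(image, kernel, "max")
smallest = apply_filter(image, kernel, "min")
average = apply_filter(image, kernel, "mean")
matched = apply_filter(image, kernel, "match")
```

In this example only the top-left position reproduces the kernel pattern exactly, so `matched` contains a single 1.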

The Element-wise Operation of the filter lets you apply an additional function to each output value. Selecting anything but Identity makes the filter non-linear, but functions like ceil, floor or relu allow for more sophisticated effects.
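As a small sketch of this step, the functions named above can be applied to every value of an already filtered image. The lookup table `ELEMENTWISE`, the helper `apply_elementwise` and the sample values are hypothetical and only serve the illustration.

```python
import math

# Hypothetical lookup table of element-wise functions
ELEMENTWISE = {
    "identity": lambda v: v,
    "relu": lambda v: max(0.0, v),   # negative values become 0
    "floor": math.floor,
    "ceil": math.ceil,
}


def apply_elementwise(output, name):
    """Apply the chosen function to every value of a filtered image."""
    fn = ELEMENTWISE[name]
    return [[fn(v) for v in row] for row in output]


filtered = [[-1.5, 0.2],
            [2.7, -0.4]]             # made-up filter output

after_relu = apply_elementwise(filtered, "relu")
after_floor = apply_elementwise(filtered, "floor")
after_ceil = apply_elementwise(filtered, "ceil")

# Non-linearity: applying relu to a sum differs from summing the relu values
relu = ELEMENTWISE["relu"]
relu_of_sum = relu(-1.0 + 2.0)
sum_of_relu = relu(-1.0) + relu(2.0)
```

The last two lines show why the filter stops being linear: `relu_of_sum` is 1.0 while `sum_of_relu` is 2.0.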

If the kernel is set to be normalized, all output values will be divided by the sum of all the Kernel's pixel values. Notice that without normalization the range of values in the output image may differ from that of the Input Image. The values in the output image can either be rescaled so that the largest value is 1.0 or just clipped at 1.0.
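The three ways of bringing the output back into range could look roughly as follows. The function names are invented for this sketch, and the handling of a kernel whose values sum to zero (raising an error) and of an output without positive values (returned unchanged) are assumptions, since the text does not say what the demo does in those cases.

```python
def normalize(output, kernel):
    """Divide every output value by the sum of all kernel values."""
    total = sum(sum(row) for row in kernel)
    if total == 0:
        # Assumption: a zero-sum kernel cannot be normalized this way
        raise ValueError("kernel values sum to zero")
    return [[v / total for v in row] for row in output]


def rescale(output):
    """Scale all values so that the largest value becomes 1.0."""
    largest = max(max(row) for row in output)
    if largest <= 0:
        # Assumption: nothing to rescale, return an unchanged copy
        return [list(row) for row in output]
    return [[v / largest for v in row] for row in output]


def clip(output, upper=1.0):
    """Cut off every value above `upper`, leaving smaller values untouched."""
    return [[min(v, upper) for v in row] for row in output]


kernel = [[1, 1],
          [1, 1]]                    # sums to 4
output = [[4.0, 2.0],
          [1.0, 0.0]]                # made-up unnormalized filter output

normalized = normalize(output, kernel)
rescaled = rescale(output)
clipped = clip(output)
```

Rescaling preserves the relative brightness of the pixels, whereas clipping flattens everything above 1.0 to the same value.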

Use any of the provided examples as inspiration. Conway's Game of Life can be implemented in terms of a convolution combined with a non-linear element-wise operation. The convolution is used to count neighbouring living cells and to combine the resulting count with the state of the center cell. Then the non-linear function is used to apply the game rules. When only one convolution is used, the non-linear function needs to be a bit complex to encode the game rule of allowing a living cell to have either two OR three living neighbours. This might be simplified by increasing the number of dimensions, i.e. the number of filters: one filter for a living center and one for a dead center.
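One possible single-filter encoding of the Game of Life is sketched below. The center weight of 9 is an assumption chosen for this example (the demo's example may use different kernel values): since a cell has at most 8 neighbours, the sum is between 0 and 8 for a dead center and between 9 and 17 for a living one, so the count and the center state can both be read from one number. Cells outside the grid are assumed dead (zero padding of 1).

```python
# Neighbours weigh 1, the center weighs 9 (assumption for this sketch).
# The kernel is symmetric, so convolution and correlation coincide.
KERNEL = [[1, 1, 1],
          [1, 9, 1],
          [1, 1, 1]]


def life_rule(value):
    """Non-linear element-wise step: 3 = dead cell with three neighbours,
    11 or 12 = living cell with two or three neighbours."""
    return 1 if value in (3, 11, 12) else 0


def life_step(grid):
    """One generation: zero-pad by 1, filter with KERNEL, apply the rule."""
    h, w = len(grid), len(grid[0])
    padded = ([[0] * (w + 2)]
              + [[0] + list(row) + [0] for row in grid]
              + [[0] * (w + 2)])
    return [[life_rule(sum(padded[y + i][x + j] * KERNEL[i][j]
                           for i in range(3) for j in range(3)))
             for x in range(w)]
            for y in range(h)]


blinker = [[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 1, 1, 1, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]]

next_generation = life_step(blinker)
```

The horizontal bar turns into a vertical one after one step and back again after two, which is the expected behaviour of the "blinker" pattern.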