2D Convolution

Explanation

Convolution is an operation that combines two functions or signals into a third one. The convolution operation for the 1-dimensional case is described here.

The convolution operation can also be defined for higher dimensions. In the case of 2-dimensional signals we can picture the process as overlaying one picture (the kernel) on top of the other (the input image), weighting (i.e. multiplying) each pixel in the lower image with the corresponding pixel value of the image above, and then summing all the weighted values. This is done for each possible position, and the resulting sums are collected in the output image. The size of the output image corresponds to the number of possible positions in which the two images can be overlaid on each other. Because the two images must always fully overlap, a larger kernel can be positioned in fewer ways than a smaller kernel. Therefore the width of the output image must be input_image_width - kernel_width + 1, and the same relation holds for the height. The input image size can be increased by introducing additional padding.
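The sliding and summing described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code behind the demo: the function name filter2d_valid and the sample arrays are made up for this example, zero padding is assumed, and the kernel is overlaid without flipping (which makes no difference for the symmetric kernel used here).

```python
import numpy as np

def filter2d_valid(image, kernel):
    """Overlay the kernel at every fully overlapping position and sum the weighted pixels."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    # Number of possible positions along each axis: input size - kernel size + 1
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            window = image[y:y + kh, x:x + kw]   # pixels currently under the kernel
            out[y, x] = np.sum(window * kernel)  # weight each pixel, then sum
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # hypothetical 5x5 input image
kernel = np.ones((3, 3))                          # hypothetical 3x3 kernel

result = filter2d_valid(image, kernel)
print(result.shape)         # 5 - 3 + 1 = 3 positions per axis

padded = np.pad(image, 1)   # one pixel of zero padding on every side (assumed padding mode)
result_padded = filter2d_valid(padded, kernel)
print(result_padded.shape)  # 7 - 3 + 1 = 5, the same size as the unpadded input
```

With one pixel of padding and a 3x3 kernel the output has the same size as the original input, which is why padding is often chosen as (kernel_width - 1) / 2 for odd kernel widths.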

Use the brush to draw something into the Input Image below. Set the padding to increase the image's size. Then use the brush to draw a pattern into the filter's Kernel image. Click on a pixel in the Padded Image to see which pixels in the Filtered Image are affected by it. Select a pixel in the Filtered Image to see which pixels from the Padded Image were used to calculate its value.

For the Filter you can choose between a Convolution and a Correlation operation. The difference is whether the Kernel gets flipped when it is overlaid on top of the Padded Image. For symmetric Kernels it makes no difference. For non-symmetric Kernels, notice the orientation of the overlaid Kernel when focusing pixels in the Padded Image and in the Filtered Image. Compare this with the 1-dimensional convolution.
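The effect of the flip can be illustrated with a single bright pixel as input. The following is a minimal sketch under the assumption that "flipped" means reversed along both axes, as in the usual definition of 2D convolution; the function names and sample arrays are hypothetical and not taken from the demo.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Correlation: overlay the kernel as drawn, without flipping."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def convolve2d_valid(image, kernel):
    """Convolution: the same sliding sum, but with the kernel flipped along both axes."""
    return correlate2d_valid(image, kernel[::-1, ::-1])

# A single bright pixel in the centre of an otherwise empty image
image = np.zeros((5, 5))
image[2, 2] = 1.0

asym = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])   # non-symmetric kernel
sym = np.array([[0.0, 1.0, 0.0],
                [1.0, 2.0, 1.0],
                [0.0, 1.0, 0.0]])    # kernel that is unchanged by the flip

conv = convolve2d_valid(image, asym)   # reproduces the kernel as drawn
corr = correlate2d_valid(image, asym)  # reproduces the kernel rotated by 180 degrees

print(np.array_equal(conv, asym))
print(np.array_equal(corr, asym[::-1, ::-1]))
print(np.array_equal(convolve2d_valid(image, sym), correlate2d_valid(image, sym)))
```

Convolving a single bright pixel stamps a copy of the kernel into the output, while correlating it stamps the flipped copy. For the symmetric kernel both results coincide.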

The Element-wise Operation of the filter applies an additional function to each pixel value of the output. When the Identity function is selected (i.e. no function is applied at all), the output is only a weighted sum of the Input Image. Non-linear functions such as ceil, floor or relu allow for more sophisticated effects.
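As a rough illustration, the element-wise step can be thought of as one function applied independently to every value of the weighted sum. The array weighted_sum and the dictionary ops below are made-up examples, and relu is assumed to mean max(x, 0).

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative values become 0, others pass through."""
    return np.maximum(x, 0.0)

# Hypothetical weighted sums, as they might come out of the filter
weighted_sum = np.array([[-1.5, 0.2],
                         [0.7, 2.4]])

ops = {
    "identity": lambda x: x,  # no function applied at all
    "ceil": np.ceil,          # round every value up
    "floor": np.floor,        # round every value down
    "relu": relu,             # cut off negative values
}

results = {name: op(weighted_sum) for name, op in ops.items()}
for name, value in results.items():
    print(name)
    print(value)
```

Only the identity keeps the output a linear function of the input. The other three change the values in a way that no kernel alone could produce.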

If the kernel is set to be normalized, all output values will be divided by the sum of all the Kernel's pixel values. Notice that if the kernel is not normalized, the range of values in the output image may differ from that of the Input Image. The values in the output image can either be rescaled so that the largest value is 1.0 or simply clipped at 1.0.
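The three options can be compared on a small example. This is a sketch under stated assumptions: the kernel and output values are invented, the kernel sum and the largest output value are assumed to be non-zero, and clipping is taken to mean limiting values from above only.

```python
import numpy as np

kernel = np.array([[1.0, 2.0, 1.0],
                   [2.0, 4.0, 2.0],
                   [1.0, 2.0, 1.0]])  # hypothetical kernel, pixel values sum to 16

# Hypothetical unnormalized filter output with values above 1.0
output = np.array([[0.5, 2.0],
                   [4.0, 1.0]])

# Normalized kernel: divide every output value by the sum of the kernel's pixel values
normalized = output / kernel.sum()

# Rescale: divide by the largest value so that it becomes exactly 1.0
rescaled = output / output.max()

# Clip: leave values up to 1.0 untouched and cut everything above down to 1.0
clipped = np.minimum(output, 1.0)

print(normalized)
print(rescaled)
print(clipped)
```

Rescaling preserves the relative brightness of all pixels, whereas clipping keeps darker pixels exact but loses any difference between values above 1.0.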

Use any of the provided examples as inspiration.