by Ziran Zhou
In this project, I followed the spec and its background introduction to reconstruct color photos from three recorded exposures. Each photo was captured as three separate exposures, one each for the Red, Green, and Blue channels, because the channels could not be physically recorded together at the time. With modern imaging algorithms and approaches, I was able to combine the three channels to recreate the photos.
I read in the single image file that contains the three color channels stacked vertically in the order Blue, Green, Red (BGR). Using the provided starter code, I split the image into three equal parts, one per color channel, by computing the image height, dividing it by 3, and assigning each third to its respective color channel.
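As a rough sketch of the splitting step, assuming the plate is already loaded as a 2-D NumPy array: the function name split_channels and the synthetic plate are hypothetical and are not taken from the starter code.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked B, G, R plate into three equal-height channels."""
    height = plate.shape[0] // 3           # integer height of one channel
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]     # leftover rows at the bottom are dropped
    return blue, green, red

# Synthetic plate for illustration: three 2x2 blocks stacked top to bottom.
plate = np.vstack([np.full((2, 2), v) for v in (0.1, 0.5, 0.9)])
blue, green, red = split_channels(plate)
print(blue.shape, green.shape, red.shape)
```

Integer division means a plate whose height is not a multiple of 3 loses at most two rows at the bottom, which keeps the three channels the same shape.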
After splitting the channels, I aligned the Green (G) and Red (R) channels to the Blue (B) channel, which serves as the base. The alignment assumes a simple (x, y) translation model: each channel is shifted horizontally and vertically until its pixels line up with the corresponding pixels of the base, on the assumption that translation alone is sufficient to produce a visually correct RGB image.
I used an exhaustive search to align the channels and create the visually correct RGB image.
I shifted each channel across a window of [-15, 15] pixels along both the x and y axes and,
following the instructions, computed the Euclidean distance (SSD when squared) between the
base Blue channel and the shifted Red or Green channel. The shift with the smallest
distance is taken as the best alignment.
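The exhaustive search can be sketched as follows. This is a minimal illustration and not the project code: the names ssd and align_exhaustive are hypothetical, shifts are written as (dy, dx), and np.roll is assumed as the shifting method, which wraps pixels around the image edges.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return float(np.sum((a - b) ** 2))

def align_exhaustive(channel, base, window=15):
    """Try every (dy, dx) in [-window, window] and return the shift with the lowest SSD."""
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll shifts circularly: rows by dy, columns by dx.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: displace a random image, then recover the displacement.
rng = np.random.default_rng(0)
base = rng.random((40, 40))
moved = np.roll(base, (-3, 2), axis=(0, 1))
print(align_exhaustive(moved, base, window=5))
```

The search cost grows with the square of the window size, which is why this approach alone is only practical for the small images.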
Some issues arose in the implementation, mainly for emir.tif. That image has widely
varying brightness values, so raw pixel values alone are not sufficient for alignment.
Other inputs with similar traits would have the same problem: aligning based only on
pixel values does not always work properly because of the variations in the brightness
and color of the pixels. See the Bells & Whistles section for more.
To fix this, I used edge detection via the Prewitt filter to emphasize the structural
elements of the image, mainly the edges, which are consistent across the different color
channels. The borders of the image contain a lot of noise, which is another issue, so to
align these edges properly I also ignored the border artifacts by cropping a fixed
percentage (around 15%) from each side. This gives better alignment with less noise and
fewer fringe effects at the borders.
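The two ideas in this paragraph, matching on edges instead of raw brightness and ignoring a border margin, can be sketched like this. The sketch writes out the 3x3 Prewitt kernels by hand with NumPy slices rather than calling a library filter, so it is an illustration only; prewitt_magnitude and crop_border are hypothetical names.

```python
import numpy as np

def prewitt_magnitude(img):
    """Gradient magnitude from the 3x3 Prewitt kernels, using an edge-padded copy."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, summed over three rows.
    gx = (p[:-2, 2:] + p[1:-1, 2:] + p[2:, 2:]) - (p[:-2, :-2] + p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, summed over three columns.
    gy = (p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:]) - (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def crop_border(img, fraction=0.15):
    """Drop the given fraction of rows and columns from every side."""
    h, w = img.shape
    dy, dx = int(round(h * fraction)), int(round(w * fraction))
    return img[dy:h - dy, dx:w - dx]

# Synthetic image with one vertical step edge between columns 9 and 10.
img = np.zeros((20, 20))
img[:, 10:] = 1.0
edges = prewitt_magnitude(img)
inner = crop_border(edges, fraction=0.25)
print(edges[5, 9], edges[5, 0], inner.shape)
```

Because the edge map responds to changes in intensity rather than to absolute intensity, two channels that differ in overall brightness can still produce similar edge maps, which is what the alignment metric then compares.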
Following the project spec, I used the image pyramid technique to speed up alignment of the large images, where exhaustive search becomes computationally expensive. The search starts on a coarse version of the image, which is downscaled using the package methods, and alignment is then performed at each level of the pyramid so that the displacement is refined as the resolution increases. The downscaling is performed recursively: the image is progressively downscaled, alignment is performed at each level, and the displacement found at the lower resolution is multiplied by 2 so it can be applied at the next higher resolution and refined there. Simply put, the offset found at the coarser recursive level is kept and used as the starting point for the next pyramid search level.
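A minimal sketch of the coarse-to-fine recursion is shown below. It makes several assumptions that are not in the original: downscaling is done by 2x2 block averaging instead of a package method, shifting uses np.roll, and the parameter names min_size, coarse_radius, and fine_radius are hypothetical.

```python
import numpy as np

def downscale_by_2(img):
    """Halve the resolution by averaging 2x2 blocks (odd trailing row/column is dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_shift, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - base) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, base, min_size=32, coarse_radius=15, fine_radius=2):
    """Align at the coarsest level first, then double the shift and refine at each finer level."""
    if min(base.shape) <= min_size:
        return search(channel, base, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downscale_by_2(channel), downscale_by_2(base),
                           min_size, coarse_radius, fine_radius)
    # A shift of (dy, dx) at half resolution corresponds to (2*dy, 2*dx) here.
    return search(channel, base, (2 * dy, 2 * dx), fine_radius)

# Synthetic check on a 128x128 image displaced by a known amount.
rng = np.random.default_rng(1)
base = rng.random((128, 128))
moved = np.roll(base, (-12, 8), axis=(0, 1))
print(align_pyramid(moved, base))
```

Only the coarsest level pays for the full window; every finer level searches a small neighborhood around the doubled estimate, so the total work stays far below a full-resolution exhaustive search.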
After splitting the image into the 3 color channels (B, G, R), I applied edge detection to each channel using the Sobel filter. I also considered the Prewitt filter, but Sobel is said to be more responsive and sensitive to fine-grained details, to place a higher weight on the central pixels in the gradient calculation, and to be more useful in high-contrast areas than Prewitt. This filtering assists in identifying the significant features of the image. The maximum edge values across the color channels are combined into a unified mask, which determines the region I want to keep based on edge intensity, with a threshold chosen so that 95% is kept as significant. From this mask I extract the bounding box defined by the outermost significant edges, which sets the cropping boundaries, so the crop adjusts dynamically to the image's content rather than relying on predefined margins.
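One possible reading of the mask and bounding-box step is sketched below. It assumes the per-channel edge maps have already been computed, and it interprets the threshold as a percentile of the combined edge map, which may differ from what the project code does. The name bounding_box_from_edges and the percentile parameter are hypothetical.

```python
import numpy as np

def bounding_box_from_edges(edge_maps, percentile=95):
    """Return (top, bottom, left, right) around the strongest combined edges."""
    combined = np.maximum.reduce(edge_maps)          # per-pixel maximum over channels
    threshold = np.percentile(combined, percentile)  # assumed thresholding rule
    mask = combined > threshold
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:                               # no significant edges: keep everything
        return 0, combined.shape[0], 0, combined.shape[1]
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

# Synthetic edge maps: two channels each contribute one strong edge pixel.
e_blue, e_green, e_red = (np.zeros((20, 20)) for _ in range(3))
e_blue[5, 4] = 1.0
e_green[14, 15] = 1.0
top, bottom, left, right = bounding_box_from_edges([e_blue, e_green, e_red])
print(top, bottom, left, right)
```

The returned bounds can be used directly as slices, for example channel[top:bottom, left:right], so the same crop is applied to all three channels.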
However, the dynamic fringe detection and removal alone was not enough. With it, the shift
of the Red channel improved slightly, from [132, -629] to [128, -388], but was still off,
so I applied a Prewitt filter again to the shifted channel before passing it into the
pyramid function, which successfully aligned the image emir.tif.
See the output images below titled emir_output_unmodified.jpg,
emir_output_only_dynamic.jpg, and emir_output_dynamic_prewitt.jpg.
monastery_output.jpg
Green Channel Shift: [-3, 2]
Red Channel Shift: [3, 2]
church_output.jpg
Green Channel Shift: [25, 4]
Red Channel Shift: [58, -4]
three_generations_output.jpg
Green Channel Shift: [54, 12]
Red Channel Shift: [111, 9]
melons_output.jpg
Green Channel Shift: [80, 10]
Red Channel Shift: [177, 13]
onion_church_output.jpg
Green Channel Shift: [52, 25]
Red Channel Shift: [107, 36]
train_output.jpg
Green Channel Shift: [42, 2]
Red Channel Shift: [85, 29]
tobolsk_output.jpg
Green Channel Shift: [3, 3]
Red Channel Shift: [6, 3]
icon_output.jpg
Green Channel Shift: [42, 17]
Red Channel Shift: [90, 23]
cathedral_output.jpg
Green Channel Shift: [5, 2]
Red Channel Shift: [12, 3]
self_portrait_output.jpg
Green Channel Shift: [78, 29]
Red Channel Shift: [176, 37]
harvesters_output.jpg
Green Channel Shift: [60, 17]
Red Channel Shift: [124, 14]
sculpture_output.jpg
Green Channel Shift: [33, -11]
Red Channel Shift: [140, -26]
lady_output.jpg
Green Channel Shift: [56, 9]
Red Channel Shift: [120, 13]
emir_output_unmodified.jpg - Before applying any Bells & Whistles techniques
Green Channel Shift: [49, 24]
Red Channel Shift: [132, -629]
emir_output_only_dynamic.jpg - After applying the Bells & Whistles Dynamic Edge Detection & Removal technique
Green Channel Shift: [49, 24]
Red Channel Shift: [128, -388]
emir_output_dynamic_prewitt.jpg - After applying the Bells & Whistles Dynamic Edge Detection & Removal and Prewitt Filter techniques
Green Channel Shift: [49, 24]
Red Channel Shift: [107, 40]
With the help of the starter code and methods such as the Euclidean distance metric and the image pyramid, the color images were restored from the black-and-white, single-channel exposures, and the pyramid also substantially reduced computation time for the high-resolution images. Edge detection further improved the alignment quality, especially for images with large brightness differences.