Overview

In this project, following the spec and its background introduction, I explored how to use the three recorded exposures of each photo by splitting them into their Red, Green, and Blue channels, which could not be physically recorded together at the time. With modern imaging algorithms and approaches, I was able to combine the three channels to recreate the color photos.

Implementation

Splitting the image into 3 different Channels

I read in the single image file that contains the three color channels stacked vertically, in the order Blue, Green, Red (BGR). With the starter code provided, I was able to split the image into three equal parts, one per color channel, by calculating the height of the image, dividing it into three equal parts, and assigning each part to its respective color channel.
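As a rough illustration, here is a minimal Python sketch of this splitting step, assuming the scan is a single grayscale image with the three plates stacked vertically in B, G, R order (the function and variable names are my own, not the starter code's):

    import numpy as np
    from skimage import io as skio, img_as_float

    # Split a vertically stacked BGR scan into its three channel plates.
    def split_channels(im):
        height = im.shape[0] // 3          # each plate occupies one third of the scan
        b = im[:height]
        g = im[height:2 * height]
        r = im[2 * height:3 * height]
        return b, g, r

    # Example usage (file name is illustrative):
    # im = img_as_float(skio.imread("emir.tif"))
    # b, g, r = split_channels(im)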

Aligning channels using Simple Translation Model

After splitting the channels, I align the Green (G) and Red (R) channels to the Blue (B) channel as the base. The alignment assumes a simple (x, y) translation model, where each channel is shifted horizontally and vertically so that the pixels of the original photo line up, under the assumption that a translation alone is sufficient to produce a visually correct RGB image.
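As a minimal sketch of this translation model (the helper name is illustrative, not from the starter code), a shift can be expressed with numpy's roll:

    import numpy as np

    # Shift a channel by (dy, dx) pixels. np.roll wraps pixels around the edges,
    # which is tolerable here because the noisy borders are cropped before scoring.
    def translate(channel, dy, dx):
        return np.roll(channel, shift=(dy, dx), axis=(0, 1))

    # e.g. translate(g, 5, -3) moves the Green channel down 5 and left 3 pixels.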

Exhaustive Search for Alignment

I used an exhaustive approach to align the channels: I shifted each channel across a search window of [-15, 15] pixels in both the x and y directions and, following the instructions, computed the Euclidean distance (the SSD when squared) between the base Blue channel and the shifted Red or Green channel to determine the best alignment.
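A sketch of this exhaustive search under the assumptions above (the [-15, 15] window and SSD score come from the text; the function name is my own):

    import numpy as np

    # Try every shift in the window and keep the one with the lowest SSD score.
    def align_exhaustive(channel, base, window=15):
        best_shift, best_score = (0, 0), np.inf
        for dy in range(-window, window + 1):
            for dx in range(-window, window + 1):
                shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
                score = np.sum((shifted.astype(float) - base.astype(float)) ** 2)  # SSD
                if score < best_score:
                    best_score, best_shift = score, (dy, dx)
        return best_shift   # (dy, dx) that best aligns channel to base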

Edge Detection and Border Artifact Handling

Some issues arose in the implementation, mainly for emir.tif, because that image has very different brightness values across its channels, so comparing raw pixel values is not sufficient. More generally, for inputs with similar traits, aligning based only on pixel values does not always work because of variations in the brightness and color of the pixels. See the Bells & Whistles section for more. To address this, I used edge detection via the Prewitt filter to emphasize the structural elements of the image, mainly the edges, which are consistent across the color channels. The borders of the image also contain a lot of noise, which presents another issue, so I ignored a fixed percentage of the border (around 15%) to get proper alignment performance without the noise and fringe effects at the borders.
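A sketch of this preprocessing, assuming scikit-image's Prewitt filter and the roughly 15% border crop mentioned above (names are illustrative):

    import numpy as np
    from skimage import filters

    # Emphasize structure with a Prewitt edge map, then crop a fixed border fraction.
    def preprocess(channel, border_frac=0.15):
        edges = filters.prewitt(channel)           # edges are consistent across channels
        h, w = edges.shape
        dh, dw = int(h * border_frac), int(w * border_frac)
        return edges[dh:h - dh, dw:w - dw]         # drop noisy borders before scoring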

Pyramid Technique Optimization for Large Images

Following the project spec, I used the pyramid technique to handle large images, since exhaustive search becomes computationally expensive at full resolution. The search starts with a coarse version of the image, downscaled using library methods, and alignment is performed at each level of the pyramid so the displacement is refined as the resolution increases. The downscaling is performed recursively: as the image is progressively downscaled, alignment is performed at each coarser level, and the displacement found at the lower resolution is multiplied by 2 before being applied at the higher resolution for refinement. Simply put, the offset found at the recursive lower level is kept and used as the starting point for the search at the next pyramid level.
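A sketch of this recursive pyramid search, reusing the hypothetical align_exhaustive() from above and scikit-image's rescale for downscaling (the base-case size and refinement window are my own guesses, not values from the spec):

    import numpy as np
    from skimage.transform import rescale

    # Recursively halve the images, align at the coarse level, then double the
    # offset and refine it with a small search at the current resolution.
    def align_pyramid(channel, base, min_size=400):
        if min(channel.shape) <= min_size:
            return align_exhaustive(channel, base)
        dy, dx = align_pyramid(rescale(channel, 0.5, anti_aliasing=True),
                               rescale(base, 0.5, anti_aliasing=True), min_size)
        dy, dx = 2 * dy, 2 * dx                    # scale the coarse offset back up
        pre_shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
        rdy, rdx = align_exhaustive(pre_shifted, base, window=2)   # local refinement
        return dy + rdy, dx + rdx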

Bells & Whistles

Dynamic and Fixed Border (Edge) Detection and Removal

After splitting the image into the three color channels (B, G, R), I run edge detection on each channel using the Sobel filter. I also considered the Prewitt filter, but the Sobel filter is said to respond more strongly to fine-grained detail, to place higher weight on the central pixels in the gradient calculation, and to be more useful in high-contrast areas than Prewitt. This filtering helps identify the significant features of the image. The maximum edge values across the color channels are combined into a unified mask that determines the region to keep based on edge intensities; a threshold is chosen so that 95% is kept, since those pixels carry the important structure. From this mask I extract the bounding box defined by the outermost significant edges, which determines the cropping boundaries, so the margins are adjusted dynamically based on the image's content rather than taken from predefined values.
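As a rough sketch of this idea (how the 95% figure is turned into a threshold here is my own assumption, and the function name is illustrative):

    import numpy as np
    from skimage import filters

    # Unified edge map: strongest Sobel response across the three channels,
    # thresholded to keep only "significant" edges, whose bounding box defines
    # the dynamic crop region.
    def dynamic_crop_box(b, g, r, quantile=0.95):
        edges = np.maximum.reduce([filters.sobel(b), filters.sobel(g), filters.sobel(r)])
        threshold = np.quantile(edges, quantile)   # mapping of the 95% figure is an assumption
        rows, cols = np.where(edges > threshold)
        return rows.min(), rows.max(), cols.min(), cols.max()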

However, the dynamic fringe detection and removal is not enough on its own. With it, the shift on the Red channel improves, from [132 -629] to [128 -388], but it is still slightly off, so I also reapply a Prewitt filter to the shifted channel before passing it into the pyramid function, which successfully aligns emir.tif.
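Put together, a hypothetical call sequence for emir.tif might look like this (building on the illustrative helpers sketched above):

    from skimage import filters

    # Dynamic crop first, then Prewitt-filter the cropped channels before the
    # pyramid alignment, as described above. Names are illustrative.
    r0, r1, c0, c1 = dynamic_crop_box(b, g, r)
    shift_r = align_pyramid(filters.prewitt(r[r0:r1 + 1, c0:c1 + 1]),
                            filters.prewitt(b[r0:r1 + 1, c0:c1 + 1]))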

[See the output images below, titled emir_output_unmodified.jpg, emir_output_only_dynamic.jpg, and emir_output_dynamic_prewitt.jpg]

Output

Summation

With the help of the starter code and methods such as the Euclidean distance metric and the pyramid technique, the images can be restored to good quality from the black-and-white channel images, and the pyramid technique substantially reduces computation time for high-resolution images. Edge detection further improves the alignment quality, especially for images with large brightness differences between channels.