by Ziran Zhou
In this project, I followed the spec's instructions and background introduction and explored how to use homographies and perspective warping to produce perspective-changed image manipulations.
I shot several photos with a camera mounted on a stand and rotated between shots, then digitized the photos and reduced the file size and image dimensions for faster processing. The image set has 7 photos in total, including im-1.jpg, etc. Each successive pair of images overlaps by around 40%-70%, and I kept certain distinguishable features in frame so it is easier to select matching corner and edge points as correspondences. Admittedly, the photos have different lighting due to camera settings and environmental influences, which may slightly affect the warping and stitching results in later parts.
In this part, I used the sel_pts function and a series of other functions from Project 3 to select, store, and reload correspondence points between the two images in each pair.
After selecting the correspondence points, for instance:
im1-im2-correspondence.jpg
im2-im3-correspondence.jpg
I then worked on image alignment via the point correspondences, and I developed the function computeH(impts1, impts2) to recover the homography matrix \(H\) between the two images we selected correspondences for; it produces a 3x3 transformation matrix mapping correspondence points between the two images.
We can express this with the mathematical relationship:
\[p'=Hp\], where \(p'\) and \(p\) are (correspondence) points we selected for the two images, and
each point is represented in homogeneous coordinates.
We have the transformation as following: \[ \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
Then, for each correspondence pair, we record the points \((x_1, y_1)\) from impts1 and \((x_2, y_2)\) from impts2, and by expanding the matrix multiplication we can write the system as \(Ah=0\), where \(h\) is the flattened \(H\); each pair contributes the following two rows:
\[
\begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1x_2 & -y_1x_2 & -x_2 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1y_2 & -y_1y_2 & -y_2
\end{bmatrix}
\begin{bmatrix}
a & b & c & d & e & f & g & h & 1
\end{bmatrix}^\top
=
\begin{bmatrix}
0 \\
0
\end{bmatrix}
\]
Solving this system tells us how each point in the image will be rearranged to create a change in perspective. We solve for the homography by Singular Value Decomposition (SVD): matrix \(A\) is decomposed, and the homography matrix \(H\) is recovered from the right singular vector corresponding to the smallest singular value of \(A\), which minimizes the least-squares error and yields the desired result.
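Below is a minimal sketch of how computeH can be implemented with NumPy under this formulation; the (N, 2) point-array convention is my assumption, and the actual project code may differ in details.

import numpy as np

def computeH(impts1, impts2):
    # impts1, impts2: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    A = []
    for (x1, y1), (x2, y2) in zip(impts1, impts2):
        # Each correspondence contributes two rows of the A h = 0 system.
        A.append([x1, y1, 1, 0, 0, 0, -x1 * x2, -y1 * x2, -x2])
        A.append([0, 0, 0, x1, y1, 1, -x1 * y2, -y1 * y2, -y2])
    A = np.array(A)
    # The solution of A h = 0 (with ||h|| = 1) is the right singular
    # vector associated with the smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so the bottom-right entry is 1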
In this part, I morphed two images into one another using a sequence of transformations based on specified control points and triangulation.
I utilized the homography matrix H to transform the coordinates of the source image into the target image's perspective, in the appropriate direction: e.g., when warping image 1 to image 4, H is used directly, while when warping image 5 to image 4, H's inverse is used instead.
I avoided loops in the actual warping computation by using np.meshgrid and other vectorized operations, and I predicted the bounding box by transforming the coordinates of the original image and calculating the extents of the resulting image. Note that the resulting image is not placed in that extended background; instead, the information needed for blending is returned and is associated with the extended bounding box in the mosaic function. I avoided aliasing by using scipy.interpolate.griddata for resampling.
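A condensed sketch of this warping scheme follows; the exact warpImage signature, return values, and the handling of the blending information are simplifying assumptions rather than my verbatim code.

import numpy as np
from scipy.interpolate import griddata

def warpImage(im, H):
    # im: (h, w, 3) float image; H: 3x3 homography into the target view.
    h, w = im.shape[:2]
    # Forward-map the four corners to predict the warped bounding box.
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [w - 1, h - 1, 1], [0, h - 1, 1]], dtype=float).T
    warped = H @ corners
    warped = warped[:2] / warped[2]
    xmin, ymin = np.floor(warped.min(axis=1)).astype(int)
    xmax, ymax = np.ceil(warped.max(axis=1)).astype(int)
    # Vectorized inverse warp: map every output pixel back into the source.
    xs, ys = np.meshgrid(np.arange(xmin, xmax), np.arange(ymin, ymax))
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ pts
    src = (src[:2] / src[2]).T  # (x, y) source coordinates per output pixel
    # Resample with griddata (linear interpolation resists aliasing,
    # though it is slow on large images).
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    grid_pts = np.stack([gx.ravel(), gy.ravel()], axis=1)
    out = np.zeros((ymax - ymin, xmax - xmin, im.shape[2]))
    for c in range(im.shape[2]):
        vals = griddata(grid_pts, im[..., c].ravel(), src,
                        method="linear", fill_value=0)
        out[..., c] = vals.reshape(out.shape[:2])
    return out, (xmin, ymin)  # warped image plus its offset for the mosaic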
Note that I excluded the 4th warped image from the display since it is chosen as the center: it is warped not with a homography matrix but with the identity matrix, i.e., its perspective is unchanged.
im-1.jpg
im1-warped-to-im2.jpg
im-2.jpg
im-3.jpg
im3-warped-to-im2.jpg
I also used the homography functionality to rectify images to a desired perspective.
In the previous part, the homography captured the projective transformation between 2 images, with correspondence points determining the perspective adjustment. In this rectification task, however, the homography is not between 2 images but between one image and a set of points forming a square or a rectangle: I choose ONLY 4 correspondence points (forming a square or a rectangle, to ensure a rectified warping effect) on the image whose perspective I wish to rectify, and I ensured the points end up in the same order no matter the order in which they are selected. Afterwards, the image is warped using the warpImage function to the square or rectangle perspective. This works especially well for images with rectangular or square features, which allow for easy correspondence point selection.
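As a usage sketch, rectification reduces to computing a homography against an axis-aligned rectangle; the clicked corner coordinates and the 400x300 target rectangle below are made-up values for illustration, reusing the computeH and warpImage sketches above.

import numpy as np

# Hypothetical example: four corners of a box face, clicked in a consistent
# order (top-left, top-right, bottom-right, bottom-left).
box_pts = np.array([[312, 140], [705, 182], [688, 530], [298, 475]])
rect_pts = np.array([[0, 0], [400, 0], [400, 300], [0, 300]])  # target rectangle
H = computeH(box_pts, rect_pts)           # map the face onto the rectangle
rectified, offset = warpImage(box_im, H)  # box_im: the loaded source image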
All the original images below have edges that are non-parallel in the original perspective; after rectification, most of the edges are warped to be parallel and the views look properly front-on. I have the following example of an image of a box and the resulting rectified image.
box.jpg
box-rect.jpg
And another example of an image of a street in London and the resulting rectified image.
londonst.jpg
londonst-rect.jpg
Here is another example of rectification of the buildings at Times Square in NYC.
nyc.jpg
nyc-rect.jpg
In this part I used the mosaic function, which takes in the results from warpImage (7 in total, with the 4th being the center image that is not warped and to which every other image and its homography is aligned) and from which I can access the coordinates needed to create the dimension-extended final mosaic canvas.
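A minimal sketch of this compositing step is below, assuming each warpImage result comes paired with its bounding-box offset as in the earlier sketch; overlaps are simply averaged here, which is a simplification of the actual blending.

import numpy as np

def mosaic(warps):
    # warps: list of (image, (xmin, ymin)) pairs in shared mosaic coordinates.
    xmin = min(off[0] for _, off in warps)
    ymin = min(off[1] for _, off in warps)
    xmax = max(off[0] + im.shape[1] for im, off in warps)
    ymax = max(off[1] + im.shape[0] for im, off in warps)
    canvas = np.zeros((ymax - ymin, xmax - xmin, 3))
    count = np.zeros(canvas.shape[:2])
    for im, (x0, y0) in warps:
        ys = slice(y0 - ymin, y0 - ymin + im.shape[0])
        xs = slice(x0 - xmin, x0 - xmin + im.shape[1])
        mask = im.sum(axis=2) > 0           # valid (non-empty) pixels only
        canvas[ys, xs][mask] += im[mask]
        count[ys, xs][mask] += 1
    return canvas / np.maximum(count, 1)[..., None]  # average the overlaps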
ims-mosaic-overlaid(123-2-centered).jpg
We can see that the warped images connect almost perfectly, and the stitched result resembles a panoramic image. There is some slight misalignment between the images and some differences in pixel intensities; as described in the photo-shooting part above, these could be caused by camera settings and environmental influences. The alignment may also be affected by the number of correspondence points selected and by how accurately each point is selected, which may not always be precise since the points are selected manually.
I could also use Laplacian blending to soften the sharp edge contrasts in the final stitched image.
Another result is the set of pictures taken at Salt Lake City Intl. Airport, 4 photos in total with the 2nd one being the center. Some edges are not perfectly aligned, which is expected since I was not able to stabilize the phone while rotating its viewing direction and keeping the center of projection identical throughout. But the main contents of the images are maintained and are stitched together well with the homography transformation and the warpImage function implemented.
ims-mosaic-overlaid-slc(1234-2-centered).jpg
Here are the individual photos in their original and warped forms:
slc-1.jpg
slc-im1-warped-to-im2.jpg
im1-im2-slc-correspondence.jpg
slc-2.jpg
slc-3.jpg
slc-im3-warped-to-im2.jpg
im2-im3-slc-correspondence.jpg
slc-4.jpg
slc-im4-warped-to-im2.jpg
im3-im4-slc-correspondence.jpg
One other result is the set of pictures taken in a classroom in Wheeler Hall.
ims-mosaic-overlaid-bb(123-2-centered).jpg
Here are the individual photos in their original and warped forms:
bb-1.jpg
bb-im1-warped-to-im2.jpg
im1-im2-bb-correspondence.jpg
bb-2.jpg
bb-3.jpg
bb-im3-warped-to-im2.jpg
im2-im3-bb-correspondence.jpg
I detected all the Harris corners in an image using the provided harris.py (in the lib.py file), which covers the entire image with points; I later filter most of them out to keep the most suitable corner points, as shown below:
im1-Harris Corner.jpg
im2-Harris Corner.jpg
I then implemented the Adaptive Non-Maximal Suppression (ANMS) on the Harris Corner points to select a subset of the corner points from an image. ANMS is designed to retain strong corner points while ensuring that they are spatially well-distributed across the image, and this is crucial for tasks like image matching and stitching which I implemented in later parts.
The Harris corner detection algorithm provided a set of corner points and their corresponding corner strengths. I then sorted the corner points in descending order of corner strength, ensuring that the strongest corners are considered first during the suppression process.
For each corner point, I initialize the radius to infinity; it represents the minimum distance to a stronger corner point that satisfies the conditions.
Then, for each corner point, I considered all previously processed corner points, since the sorting guarantees they have higher corner strengths. I calculated the Euclidean distance between the current corner point and each stronger corner point using the provided dist2 function, then found the minimum distance to a stronger corner and set the radius to that minimal distance.
\[ r_i = \min_j\{\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}\} \text{ for all } j \text{ where } c_j > c_i \]
I then calculated a minimum allowable suppression radius for the current corner point, proportional to its corner strength relative to the maximum corner strength, using the c_robust parameter, which controls the sensitivity of the suppression radius to corner strength. If the minimum distance (radius) is greater than this minimal radius, I update the suppression radius for corner point i; otherwise it remains at infinity. This ensures that only corner points that are sufficiently distant from stronger corners and beyond the minimum allowable radius are considered for selection.
After that, I selected the top 500 corner points with the largest suppression radii, which are evenly distributed across the image. See results below:
im1-ANMS-Processed Corners.jpg
im2-ANMS-Processed Corners.jpg
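Putting the ANMS steps above together, here is a minimal sketch; the (N, 2) coordinate and (N,) strength array shapes are my assumptions, and the robustified comparison c_robust * c_j > c_i is the standard form of the rule described above.

import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    # coords: (N, 2) corner coordinates; strengths: (N,) Harris strengths.
    order = np.argsort(-strengths)             # strongest corners first
    coords, strengths = coords[order], strengths[order]
    radii = np.full(len(coords), np.inf)       # radius starts at infinity
    for i in range(1, len(coords)):
        # Corners 0..i-1 are stronger; suppress corner i only by those
        # that are robustly stronger, i.e. c_robust * c_j > c_i.
        stronger = c_robust * strengths[:i] > strengths[i]
        if stronger.any():
            d = np.sqrt(((coords[:i][stronger] - coords[i]) ** 2).sum(axis=1))
            radii[i] = d.min()                 # distance to nearest stronger corner
    keep = np.argsort(-radii)[:n_keep]         # largest suppression radii win
    return coords[keep]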
I then implemented feature descriptor extraction for each feature point found in the previous part to facilitate accurate image matching. I first convert the input image to grayscale, if it is not already, to simplify descriptor extraction by reducing data dimensionality, and use the corner points identified by ANMS as the centers for feature descriptor extraction.
For each point, I extracted a large square patch around the feature point, chosen to be 40x40 pixels so that sufficient contextual information around it is retained, which is essential for creating a stable descriptor. I also checked that the entire patch lies within the image boundaries to avoid errors during extraction.
I then apply a Gaussian blur to the patch, with the standard deviation (sigma) of the Gaussian kernel as the parameter controlling the blurring extent; this reduces the impact of noise and minor variations in the image, making the descriptor more robust.
I then downsampled each blurred patch to 8x8 and normalized it to zero mean and unit variance (bias/gain normalization) to make the descriptor invariant to changes in lighting and contrast. Finally, I flattened the normalized 8x8 patch into a 64-dimensional vector that represents the patch and serves as the descriptor for the corresponding feature point.
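A sketch of this extraction pipeline follows; the sigma value and the stride-based downsampling are assumptions about the parameter choices.

import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptors(gray, corners, patch=40, out=8, sigma=2.0):
    # gray: 2D float image; corners: (N, 2) array of (x, y) corner points.
    half, step = patch // 2, patch // out      # 40x40 patch sampled down to 8x8
    blurred = gaussian_filter(gray, sigma)     # blur to suppress noise/aliasing
    descs, kept = [], []
    for x, y in corners:
        x, y = int(x), int(y)
        if x - half < 0 or y - half < 0 or \
           x + half > gray.shape[1] or y + half > gray.shape[0]:
            continue                           # skip patches crossing the border
        p = blurred[y - half:y + half:step, x - half:x + half:step]
        p = (p - p.mean()) / (p.std() + 1e-8)  # bias/gain normalization
        descs.append(p.ravel())                # flatten to a 64-dim vector
        kept.append((x, y))
    return np.array(descs), np.array(kept)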
I then implemented feature descriptor matching, identifying pairs of feature descriptors from the 2 images that likely correspond to the same physical point in the scene.
For each descriptor in the 1st image, I find the 2 nearest-neighbor descriptors in the 2nd image based on the dist2 (Euclidean distance) function, and I accept the match only if the ratio of the distance to the nearest neighbor to the distance to the 2nd-nearest neighbor is below a specified threshold, as proposed by David Lowe. This criterion filters out ambiguous matches where the closest descriptor is not significantly closer than the second-closest. I tested several times, observed the empirical differences between the images matched, and set the threshold value accordingly.
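A sketch of this ratio-test matching is below; the 0.6 threshold is a placeholder for the empirically tuned value.

import numpy as np

def match_descriptors(d1, d2, ratio=0.6):
    # d1, d2: (N1, 64) and (N2, 64) descriptor arrays from the two images.
    # Pairwise Euclidean distances (the role played by dist2).
    dists = np.sqrt(((d1[:, None, :] - d2[None, :, :]) ** 2).sum(axis=2))
    matches = []
    for i, row in enumerate(dists):
        nn1, nn2 = np.argsort(row)[:2]         # two nearest neighbors in image 2
        if row[nn1] < ratio * row[nn2]:        # Lowe's ratio test
            matches.append((i, nn1))
    return matches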
I also plot the matched points between the 2 images on a single image to visualize the correspondences, as shown below:
im1-im2-matches.jpg
I also ran the same pipeline on the other image sets; here are some examples:
im1_slc-Harris Corner.jpg
im2_slc-Harris Corner.jpg
im1_slc-ANMS-Processed Corners.jpg
im2_slc-ANMS-Processed Corners.jpg
im1_slc-im2_slc-matches.jpg
im1_bb-Harris Corner.jpg
im2_bb-Harris Corner.jpg
im1_bb-ANMS-Processed Corners.jpg
im2_bb-ANMS-Processed Corners.jpg
im1_bb-im2_bb-matches.jpg
We can see that the common parts of the 2 images have correspondence points capturing the corner details and characteristics well. There do exist some points that appear in one image and not the other, which may cause some imperfections, but the later results show their effects are minuscule.
I then implemented the Random Sample Consensus (RANSAC) functionality for computation of homographies.
I start by repeatedly selecting a random subset of matched point pairs (4 pairs) and computing a homography H from these samples, using the homography function from Part A (the direct linear transformation). I then apply the computed H to all the matched points and count how many are mapped to within a threshold residual of their correspondences (the inliers). Over many iterations of random sampling, I keep the H that gives the maximum number of inliers; this is the best estimate for the specified number of iterations.
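A sketch of this RANSAC loop follows; the iteration count and pixel threshold are assumed values, and computeH is the Part A function.

import numpy as np

def ransac_homography(pts1, pts2, n_iters=1000, thresh=2.0):
    # pts1, pts2: (N, 2) arrays of matched points from the two images.
    best_H, best_inliers = None, np.zeros(len(pts1), dtype=bool)
    homog1 = np.hstack([pts1, np.ones((len(pts1), 1))]).T
    for _ in range(n_iters):
        idx = np.random.choice(len(pts1), 4, replace=False)
        H = computeH(pts1[idx], pts2[idx])     # DLT on the 4-point sample
        proj = H @ homog1
        proj = (proj[:2] / proj[2]).T          # back to Cartesian coordinates
        inliers = np.sqrt(((proj - pts2) ** 2).sum(axis=1)) < thresh
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    # Refit on all inliers of the best model for the final estimate.
    return computeH(pts1[best_inliers], pts2[best_inliers]), best_inliers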
Below I show the optimal points selected by RANSAC:
RANSAC-4-points-im1.jpg
RANSAC-4-points-im2.jpg
RANSAC-8-points-im1_slc.jpg
RANSAC-8-points-im2_slc.jpg
RANSAC-12-points-im1_bb.jpg
RANSAC-12-points-im2_bb.jpg
After computing the optimal homography matrix, I can warp the images and create the mosaics as in Part A, as shown below. These mosaics look almost the same as the ones built from manually selected correspondence points, indicating that the implemented autostitching algorithms succeed.
im1_RANSAC_warped.jpg
im2_RANSAC_warped.jpg
im3_RANSAC_warped.jpg
ims-mosaic-overlaid-RANSAC-1.jpg
Similarly, I can also run this on the other image sets (selecting 4 points in RANSAC).
im1_slc_RANSAC_warped.jpg
im2_slc_RANSAC_warped.jpg
im3_slc_RANSAC_warped.jpg
im4_slc_RANSAC_warped.jpg
ims-mosaic-overlaid-RANSAC-slc.jpg
im1_bb_RANSAC_warped.jpg
im2_bb_RANSAC_warped.jpg
im3_bb_RANSAC_warped.jpg
ims-mosaic-overlaid-RANSAC-bb.jpg
im1_bart_RANSAC_warped.jpg
im2_bart_RANSAC_warped.jpg
im3_bart_RANSAC_warped.jpg
ims-mosaic-overlaid-RANSAC-bart.jpg
This mosaic looks slightly unrefined because the images I took did not share the same center of projection, due to difficulties in taking the photographs, but they are still almost perfectly aligned in the mosaic.