This project implements methods to algorithmically colorize photos from Sergei Mikhailovich Prokudin-Gorskii's (1863-1944) three-channel collection.
The colorization process involves two steps: Image Alignment and Color Restoration. In the image alignment phase, we attempt to align the three channels as closely as possible by treating this as an optimization problem. The optimization relies on two similarity metrics: Sum of Squared Differences (SSD) and Normalized Cross-Correlation (NCC).
SSD measures the squared error between two images (lower is better), while NCC measures the correlation between the images treated as vectors, i.e. it seeks the smallest angle between them (higher is better).
In the alignment task, I searched for the displacement that best aligns the color channels, either by minimizing the SSD (taking the argmin over candidate displacements) or by maximizing the NCC (taking the argmax). For small images, an exhaustive search over a modest displacement window is feasible, but for larger, high-resolution images the true shift often lies outside that window and each candidate evaluation touches far more pixels. For those, I employed a pyramid search that recursively downscales the image, aligns at the coarsest level, and then refines the estimate over a small window at each finer level.
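A minimal sketch of the two search strategies, assuming NumPy float arrays for the channels and using SSD as the default score; the function names and the min_size cutoff are my own choices for illustration, not necessarily those of the original implementation:

```python
import numpy as np

def exhaustive_align(moving, base, window=15, metric=None):
    """Brute-force search over integer shifts in [-window, window]^2.

    Returns the (dy, dx) shift of `moving` that best aligns it to `base`
    according to `metric` (lower is better; defaults to SSD).
    """
    if metric is None:
        metric = lambda a, b: np.sum((a - b) ** 2)  # SSD by default
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = metric(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(moving, base, window=15, min_size=400):
    """Coarse-to-fine search: recurse on 2x-downsampled images, then refine.

    At the coarsest level an exhaustive search is used; each finer level
    doubles the coarse estimate and refines it within a small window.
    """
    if max(base.shape) <= min_size:
        return exhaustive_align(moving, base, window)
    # Downsample by taking every other pixel (a simple 2x reduction).
    coarse = pyramid_align(moving[::2, ::2], base[::2, ::2], window, min_size)
    dy, dx = 2 * coarse[0], 2 * coarse[1]
    # Refine around the upscaled estimate with a small local search.
    rolled = np.roll(moving, (dy, dx), axis=(0, 1))
    ry, rx = exhaustive_align(rolled, base, window=2)
    return dy + ry, dx + rx
```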
The SSD measures the sum of squared pixel intensity differences between the two images; the alignment seeks the displacement that minimizes it. Mathematically:

SSD(d) = ∑x,y (I1(x + dx, y + dy) - I2(x, y))²

Here, d = (dx, dy) is the displacement vector, and I1 and I2 are the intensities of the two images at pixel position (x, y).
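As a concrete illustration, here is a minimal NumPy sketch of this score for one candidate displacement (np.roll wraps pixels around the border, a common simplification; ssd_score is a hypothetical name):

```python
import numpy as np

def ssd_score(moving, base, dy, dx):
    """Sum of squared differences after shifting `moving` by (dy, dx).

    Lower is better; the best displacement minimizes this value.
    """
    shifted = np.roll(moving, (dy, dx), axis=(0, 1))
    return np.sum((shifted - base) ** 2)
```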
The NCC is used to find the displacement that results in the best correlation between the two images. It is given by:

NCC(d) = ∑x,y (I1(x + dx, y + dy) - μ1)(I2(x, y) - μ2) / √(∑x,y (I1(x + dx, y + dy) - μ1)² · ∑x,y (I2(x, y) - μ2)²)

Here, μ1 and μ2 are the mean intensity values of the two images.
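A matching NumPy sketch of this score (again a hypothetical name; higher values indicate better alignment):

```python
import numpy as np

def ncc_score(moving, base, dy, dx):
    """Normalized cross-correlation after shifting `moving` by (dy, dx).

    Both images are mean-centered and the product is divided by the norms,
    so the score is the cosine of the angle between the flattened, centered
    intensity vectors. Higher is better.
    """
    shifted = np.roll(moving, (dy, dx), axis=(0, 1))
    a = shifted - shifted.mean()
    b = base - base.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return np.sum(a * b) / denom
```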
In my implementation, I used the blue channel as the base image and aligned the red and green channels to it. I experimented with both SSD and NCC and found that NCC provided better alignment results. For smaller images, I used a brute-force search over a displacement window, while for larger images, I implemented a pyramid search to reduce computation time. I also found that applying a Sobel edge detection filter before alignment greatly improved the result for the Emir image specifically, presumably because the brightness and intensity distributions differ too much across its color channels, so raw intensities correlate poorly while the edges still line up. I therefore simply applied the edge-filtered alignment to all photos and got consistently good results.
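A minimal sketch of how this edge-based alignment could be wired together, assuming scikit-image's sobel filter and the pyramid_align sketch from above; displacements are estimated on the edge maps and then applied to the raw channels:

```python
import numpy as np
from skimage.filters import sobel  # edge-magnitude filter

def align_channels(red, green, blue):
    """Align the red and green channels to the blue base channel.

    Displacements are estimated on Sobel edge maps, which are robust to
    brightness differences between channels, then applied to the raw
    channels before stacking them into an RGB image.
    """
    base_edges = sobel(blue)
    aligned = {"blue": blue}
    shifts = {}
    for name, channel in (("green", green), ("red", red)):
        dy, dx = pyramid_align(sobel(channel), base_edges)  # sketch above
        aligned[name] = np.roll(channel, (dy, dx), axis=(0, 1))
        shifts[name] = (dy, dx)
    rgb = np.dstack([aligned["red"], aligned["green"], aligned["blue"]])
    return rgb, shifts
```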
To clean up the images, I implemented an automatic cropping method that removes unnecessary white or black borders by detecting the edge between the border and the image content. The cropping is based on intensity thresholding and calculating the bounding box of the actual image content.
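A minimal sketch of this cropping idea, assuming a float RGB image in [0, 1]; the threshold values here are illustrative, not the exact ones used:

```python
import numpy as np

def auto_crop(image, low=0.05, high=0.95):
    """Remove near-black and near-white borders from a float RGB image.

    A pixel counts as image content if its mean intensity lies strictly
    between the thresholds; the crop is the bounding box of those pixels.
    Assumes at least one content pixel exists.
    """
    gray = image.mean(axis=2)
    content = (gray > low) & (gray < high)
    rows = np.where(content.any(axis=1))[0]
    cols = np.where(content.any(axis=0))[0]
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```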
Additionally, I implemented an automatic contrast adjustment method to improve the visual quality of the image. This rescales the image intensity so that the darkest pixel maps to zero and the brightest pixel maps to one. This simple adjustment makes the images more visually striking without any manual intervention.
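A minimal sketch of this rescaling, assuming a float image array:

```python
import numpy as np

def auto_contrast(image):
    """Linearly rescale intensities so the darkest pixel becomes 0 and the
    brightest becomes 1; a constant image is returned unchanged."""
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo) if hi > lo else image
```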
In addition to simple intensity rescaling, I experimented with further enhancements to the visual quality of the images.
Below are the results from the pyramid search, restricted to outputs whose filenames record the green and red displacement vectors.
Below are the results from the single-scale search, which exhaustively searches a [-15, 15] window in both x and y image coordinates. This is nothing fancy, just the initial "naive" algorithm that works for smaller, low-resolution images. If the images below do not render for some reason, please see the single-naive folder in the GitHub media directory; the displacement vectors are correctly referenced there.
The project successfully automates the process of aligning and colorizing the digitized images from Prokudin-Gorskii's collection. The results demonstrate that NCC generally produces better alignment compared to SSD, and the pyramid search significantly reduces computation time for larger images.