Project 1: Image Alignment and Colorization

Overview

This project implements methods to algorithmically colorize photos from Sergei Mikhailovich Prokudin-Gorskii's (1863-1944) three-channel collection.

Methods

The colorization process involves two steps: image alignment and color restoration. In the alignment phase, we align the three channels as closely as possible by treating alignment as an optimization problem over candidate displacements. Each candidate is scored with one of two similarity metrics: Sum of Squared Differences (SSD) or Normalized Cross-Correlation (NCC).

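SSD measures the total squared intensity error between two images, so a lower score means a better match. NCC treats the two mean-centered images as vectors and measures the cosine of the angle between them, so a higher score (a smaller angle) means a better match.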
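Once the displacements are known, the second step amounts to shifting the channels and stacking them. The following is a minimal NumPy sketch of that step, not the project's actual code. The function name compose_rgb, the (dy, dx) argument order, and the wrap-around shift via np.roll are all assumptions made for illustration.

```python
import numpy as np

def compose_rgb(blue, green, red, green_d, red_d):
    """Shift green and red by their (dy, dx) displacements and stack into an RGB image.

    Hypothetical helper: blue is the fixed base channel, and each displacement
    says where to sample the moving channel, as in I1(x + dx, y + dy).
    """
    def sample(channel, d):
        # result[y, x] = channel[y + dy, x + dx], wrapping around at the borders
        return np.roll(channel, shift=(-d[0], -d[1]), axis=(0, 1))

    return np.dstack([sample(red, red_d), sample(green, green_d), blue])

# Synthetic check: build misaligned channels from one scene, then undo the shifts.
rng = np.random.default_rng(0)
scene = rng.random((16, 16))
green = np.roll(scene, shift=(2, 1), axis=(0, 1))
red = np.roll(scene, shift=(5, -3), axis=(0, 1))
rgb = compose_rgb(scene, green, red, green_d=(2, 1), red_d=(5, -3))
print(rgb.shape)
```

Because np.roll wraps pixels around the image edges, a real pipeline would typically crop the borders afterwards.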

Optimization Functions: argmax and min

In the alignment task, I searched for the displacement that best aligns the color channels, either minimizing the SSD with min or maximizing the NCC with argmax. For small images, an exhaustive search over a fixed window is affordable. For larger images, the true shifts usually fall outside that window, and each candidate costs more to score because high-resolution photos have far more pixels. I therefore used a pyramid search that recursively downscales the image, estimates the displacement at the coarsest level, and refines it over a smaller region at each finer level.
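The exhaustive and pyramid searches can be sketched together as below. This is an illustrative NumPy version under stated assumptions rather than the project's implementation: the names search, downscale, and pyramid_search, the 2x2 block-average downscaling, the min_size stopping rule, and the refinement radius of 2 are all hypothetical choices, and shifts wrap around via np.roll.

```python
import numpy as np

def ncc_score(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def search(channel, base, center=(0, 0), radius=15):
    """Exhaustive search: score every (dy, dx) within radius of center, keep the argmax."""
    best_score, best_d = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # shifted[y, x] = channel[y + dy, x + dx], with wrap-around
            shifted = np.roll(channel, shift=(-dy, -dx), axis=(0, 1))
            score = ncc_score(shifted, base)
            if score > best_score:
                best_score, best_d = score, (dy, dx)
    return best_d

def downscale(image):
    """Halve the resolution by averaging 2x2 blocks (an odd trailing row or column is dropped)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_search(channel, base, min_size=64, radius=15, refine=2):
    """Coarse-to-fine search: solve at low resolution, then refine around twice that estimate."""
    if min(base.shape) <= min_size:
        return search(channel, base, radius=radius)
    coarse = pyramid_search(downscale(channel), downscale(base), min_size, radius, refine)
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, base, center=center, radius=refine)

# Synthetic check: a shift of (20, -12) lies outside a direct [-15, 15] window in y.
rng = np.random.default_rng(0)
base = rng.random((256, 256))
channel = np.roll(base, shift=(20, -12), axis=(0, 1))
found = pyramid_search(channel, base)
print(found)
```

Each finer level only scores (2 * refine + 1)² candidates, which is where the time saving over a single large window comes from.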

Sum of Squared Differences (SSD)

The SSD sums the squared pixel intensity differences between two images, and the alignment minimizes this sum over candidate displacements. This can be represented mathematically as:

SSD(d) = ∑x,y (I1(x + dx, y + dy) - I2(x, y))²

Here, d = (dx, dy) is the displacement vector, and I1 and I2 are the intensities of the two images at pixel positions (x, y).
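For a single candidate displacement, the SSD score can be computed as in this small NumPy sketch. The function names shift and ssd are illustrative, and the wrap-around behavior of np.roll at the image borders is an assumption, since the write-up does not say how borders are handled.

```python
import numpy as np

def shift(image, dy, dx):
    """Return J with J[y, x] = image[y + dy, x + dx], wrapping around at the borders."""
    return np.roll(image, shift=(-dy, -dx), axis=(0, 1))

def ssd(i1, i2, dy, dx):
    """Sum of squared differences between i1 sampled at (x + dx, y + dy) and i2 at (x, y)."""
    diff = shift(i1, dy, dx).astype(np.float64) - i2.astype(np.float64)
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(0)
i2 = rng.random((32, 32))
i1 = np.roll(i2, shift=(3, -2), axis=(0, 1))  # i1[y + 3, x - 2] == i2[y, x]

print(ssd(i1, i2, 3, -2))  # error at the true displacement
print(ssd(i1, i2, 0, 0))   # error with no shift applied
```

The true displacement gives zero error here because the synthetic channel is an exact shifted copy; real channels differ in brightness, so the minimum is only approximately zero.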

Normalized Cross-Correlation (NCC)

The NCC is used to find the displacement that results in the best correlation between two images. It is represented by:

NCC(d) = ∑x,y (I1(x + dx, y + dy) - μ1) (I2(x, y) - μ2) / √(∑x,y (I1(x + dx, y + dy) - μ1)² ∑x,y (I2(x, y) - μ2)²)

Here, μ1 and μ2 are the mean intensity values of the two images.
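The NCC formula translates almost term by term into NumPy. The sketch below is illustrative only: the function name ncc, the wrap-around shift, and the choice to return 0 for a constant image (where the denominator vanishes) are assumptions.

```python
import numpy as np

def ncc(i1, i2, dy, dx):
    """NCC between i1 sampled at (x + dx, y + dy) and i2 at (x, y)."""
    shifted = np.roll(i1, shift=(-dy, -dx), axis=(0, 1)).astype(np.float64)
    a = shifted - shifted.mean()               # I1 - mu1
    b = i2.astype(np.float64) - i2.mean()      # I2 - mu2
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if denom == 0.0:
        return 0.0  # assumption: treat a constant image as uncorrelated
    return float(np.sum(a * b) / denom)

rng = np.random.default_rng(0)
i2 = rng.random((32, 32))
# Same content as i2, shifted, and with a different gain and offset.
i1 = 2.0 * np.roll(i2, shift=(3, -2), axis=(0, 1)) + 0.5

print(ncc(i1, i2, 3, -2))  # close to 1 despite the brightness change
print(ncc(i1, i2, 0, 0))   # much lower without the shift
```

Subtracting the means and dividing by the norms makes the score insensitive to a uniform gain or offset between channels, which SSD is not.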

Details and Techniques

In my implementation, I used the blue channel as the base image and aligned the red and green channels to it. I experimented with both SSD and NCC and found that NCC provided better alignment results. For smaller images, I used a brute-force search over a displacement window, while for larger images, I implemented a pyramid search to reduce computation time. I also found that applying a Sobel edge detector before scoring noticeably improved the result for the Emir image in particular, presumably because the brightness and pixel intensity distributions differ too much across its color channels for raw intensities to match. I therefore applied the Sobel filter to all photos to get consistently good results.
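To illustrate why edge maps help when channels disagree in brightness, here is a NumPy-only sketch of a 3x3 Sobel gradient magnitude. It is not the project's code, which may well use a library filter; the function name sobel_magnitude and the edge-replicating padding are assumptions.

```python
import numpy as np

def sobel_magnitude(image):
    """Gradient magnitude from the 3x3 Sobel kernels, with edge-replicated borders."""
    p = np.pad(image.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: right column minus left column, rows weighted 1, 2, 1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, columns weighted 1, 2, 1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

# A vertical step edge, and the same scene with inverted brightness.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_magnitude(img)
edges_inverted = sobel_magnitude(1.0 - img)
print(np.allclose(edges, edges_inverted))
```

The two edge maps coincide even though the raw intensities are opposite, so a metric computed on edge maps can still match channels whose brightness differs strongly.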

Automatic Cropping

To clean up the images, I implemented an automatic cropping method that removes unnecessary white or black borders by detecting the edge between the border and the image content. The cropping is based on intensity thresholding and calculating the bounding box of the actual image content.
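The thresholding and bounding-box idea can be sketched as follows. This is a simplified illustration, not the project's method: the function name crop_borders and the low, high, and min_fraction parameters are hypothetical, and intensities are assumed to lie in [0, 1].

```python
import numpy as np

def crop_borders(image, low=0.1, high=0.9, min_fraction=0.5):
    """Crop to the bounding box of rows and columns that are mostly non-border pixels.

    A pixel counts as content if it is neither near-black (< low) nor
    near-white (> high); the thresholds are illustrative guesses.
    """
    gray = image.mean(axis=2) if image.ndim == 3 else image
    content = (gray > low) & (gray < high)
    rows = np.flatnonzero(content.mean(axis=1) >= min_fraction)
    cols = np.flatnonzero(content.mean(axis=0) >= min_fraction)
    if rows.size == 0 or cols.size == 0:
        return image  # nothing detected, so leave the image unchanged
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Mid-gray content with black bars on top and bottom, white bars on the sides.
img = np.full((20, 30), 0.5)
img[:3, :] = 0.0
img[-3:, :] = 0.0
img[:, :2] = 1.0
img[:, -2:] = 1.0
cropped = crop_borders(img)
print(cropped.shape)
```

Requiring a minimum fraction of content pixels per row or column keeps a few stray dark or bright pixels inside the photo from affecting the bounding box.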

Automatic Contrast Adjustment

Additionally, I implemented an automatic contrast adjustment method to improve the visual quality of the image. It rescales the image intensities so that the darkest pixel maps to zero and the brightest pixel maps to one. This simple adjustment makes the images more visually striking without any manual intervention.
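The rescaling described above is a linear min-max stretch. A minimal sketch follows, where the function name rescale_contrast and the handling of a constant image are assumptions:

```python
import numpy as np

def rescale_contrast(image):
    """Linearly map the darkest pixel to 0 and the brightest pixel to 1."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)  # assumption: a flat image maps to all zeros
    return (image - lo) / (hi - lo)

stretched = rescale_contrast(np.array([[0.2, 0.4], [0.4, 0.6]]))
print(stretched.min(), stretched.max())
```

Because the mapping depends only on the two extreme values, a single very dark or very bright pixel can limit how much the rest of the image is stretched.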

Advanced Contrast and Cropping Techniques

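In addition to simple intensity rescaling, I incorporated several more advanced techniques to further enhance the visual quality of the images: histogram equalization, CLAHE (contrast-limited adaptive histogram equalization), and dynamic threshold cropping.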
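One of the variants named in the results, histogram equalization, can be sketched with NumPy alone. This is a generic global version for a single channel with intensities in [0, 1], offered as an illustration rather than the project's implementation; the function name equalize_histogram and the bin count are assumptions.

```python
import numpy as np

def equalize_histogram(image, bins=256):
    """Map intensities through the image's own cumulative distribution function."""
    hist, bin_edges = np.histogram(image.ravel(), bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]  # normalize so the CDF ends at 1
    # Interpolate the CDF at each pixel value, using the right bin edges as sample points.
    flat = np.interp(image.ravel(), bin_edges[1:], cdf)
    return flat.reshape(image.shape)

# A low-contrast image whose intensities occupy only [0.4, 0.6].
rng = np.random.default_rng(0)
dull = rng.uniform(0.4, 0.6, size=(64, 64))
equalized = equalize_histogram(dull)
print(dull.max() - dull.min(), equalized.max() - equalized.min())
```

Unlike the linear stretch, this mapping is nonlinear: it spreads out intensity ranges where many pixels cluster. CLAHE applies the same idea per tile with a clip limit on the histogram.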

Results: Pyramid Displacement Images

Below are the pyramid search results, limited to the output images whose filenames record the green and red displacement vectors.

Church

Church Without Cropping
Green Displacement: (25, -1), Red Displacement: (58, -4)
Church Linear Autocontrast
Green Displacement: (25, -1), Red Displacement: (58, -4)
Church Histogram Equalization Autocontrast
Green Displacement: (25, -1), Red Displacement: (58, -4)
Church CLAHE Autocontrast
Green Displacement: (25, 5), Red Displacement: (58, -4)
Church Dynamic Threshold Cropping
Green Displacement: (25, 5), Red Displacement: (58, -4)

Melons

Melons Without Cropping
Green Displacement: (87, 7), Red Displacement: (177, 12)
Melons Linear Autocontrast
Green Displacement: (87, 7), Red Displacement: (177, 12)
Melons Histogram Equalization Autocontrast
Green Displacement: (87, 7), Red Displacement: (177, 12)
Melons CLAHE Autocontrast
Green Displacement: (87, 7), Red Displacement: (177, 12)
Melons Dynamic Threshold Cropping
Green Displacement: (87, 7), Red Displacement: (177, 12)

Emir

Emir Without Cropping
Green Displacement: (50, 23), Red Displacement: (107, 40)
Emir Linear Autocontrast
Green Displacement: (50, 23), Red Displacement: (107, 40)
Emir Histogram Equalization Autocontrast
Green Displacement: (50, 23), Red Displacement: (107, 40)
Emir CLAHE Autocontrast
Green Displacement: (50, 23), Red Displacement: (107, 40)
Emir Dynamic Threshold Cropping
Green Displacement: (50, 23), Red Displacement: (107, 40)

Harvesters

Harvesters Without Cropping
Green Displacement: (60, 14), Red Displacement: (123, 11)
Harvesters Linear Autocontrast
Green Displacement: (60, 17), Red Displacement: (123, 11)
Harvesters Histogram Equalization Autocontrast
Green Displacement: (60, 13), Red Displacement: (123, 10)
Harvesters CLAHE Autocontrast
Green Displacement: (60, 13), Red Displacement: (118, 10)
Harvesters Dynamic Threshold Cropping
Green Displacement: (60, 14), Red Displacement: (123, 11)

Icon

Icon Without Cropping
Green Displacement: (41, 16), Red Displacement: (90, 23)
Icon Linear Autocontrast
Green Displacement: (41, 16), Red Displacement: (90, 23)
Icon Histogram Equalization Autocontrast
Green Displacement: (38, 16), Red Displacement: (90, 23)
Icon CLAHE Autocontrast
Green Displacement: (39, 16), Red Displacement: (90, 23)
Icon Dynamic Threshold Cropping
Green Displacement: (41, 16), Red Displacement: (90, 23)

Lady

Lady Without Cropping
Green Displacement: (59, -9), Red Displacement: (122, -15)
Lady Linear Autocontrast
Green Displacement: (59, -9), Red Displacement: (122, -15)
Lady Histogram Equalization Autocontrast
Green Displacement: (57, -1), Red Displacement: (122, -15)
Lady CLAHE Autocontrast
Green Displacement: (59, -6), Red Displacement: (87, -21)
Lady Dynamic Threshold Cropping
Green Displacement: (59, -9), Red Displacement: (122, -15)

Self Portrait

Self Portrait Without Cropping
Green Displacement: (78, 29), Red Displacement: (175, 37)
Self Portrait Linear Autocontrast
Green Displacement: (78, 29), Red Displacement: (175, 37)
Self Portrait Histogram Equalization Autocontrast
Green Displacement: (78, 28), Red Displacement: (175, 37)
Self Portrait CLAHE Autocontrast
Green Displacement: (77, 27), Red Displacement: (175, 37)
Self Portrait Dynamic Threshold Cropping
Green Displacement: (78, 29), Red Displacement: (175, 37)

Sculpture

Sculpture Without Cropping
Green Displacement: (33, -11), Red Displacement: (140, -27)
Sculpture Linear Autocontrast
Green Displacement: (33, -11), Red Displacement: (140, -27)
Sculpture Histogram Equalization Autocontrast
Green Displacement: (33, -11), Red Displacement: (140, -27)
Sculpture CLAHE Autocontrast
Green Displacement: (33, -11), Red Displacement: (140, -27)
Sculpture Dynamic Threshold Cropping
Green Displacement: (33, -11), Red Displacement: (140, -27)

Three Generations

Three Generations Without Cropping
Green Displacement: (56, 12), Red Displacement: (111, 8)
Three Generations Linear Autocontrast
Green Displacement: (56, 12), Red Displacement: (111, 8)
Three Generations Histogram Equalization Autocontrast
Green Displacement: (56, 12), Red Displacement: (107, 7)
Three Generations CLAHE Autocontrast
Green Displacement: (56, 11), Red Displacement: (115, 11)
Three Generations Dynamic Threshold Cropping
Green Displacement: (56, 12), Red Displacement: (111, 8)

Train

Train Without Cropping
Green Displacement: (41, 1), Red Displacement: (85, 29)
Train Linear Autocontrast
Green Displacement: (41, 1), Red Displacement: (85, 29)
Train Histogram Equalization Autocontrast
Green Displacement: (41, 1), Red Displacement: (85, 29)
Train CLAHE Autocontrast
Green Displacement: (41, 1), Red Displacement: (85, 29)
Train Dynamic Threshold Cropping
Green Displacement: (41, 1), Red Displacement: (85, 29)

Onion Church

Onion Church Without Cropping
Green Displacement: (52, 24), Red Displacement: (107, 35)
Onion Church Linear Autocontrast
Green Displacement: (52, 24), Red Displacement: (107, 35)
Onion Church Histogram Equalization Autocontrast
Green Displacement: (52, 24), Red Displacement: (107, 35)
Onion Church CLAHE Autocontrast
Green Displacement: (52, 24), Red Displacement: (107, 35)
Onion Church Dynamic Threshold Cropping
Green Displacement: (52, 24), Red Displacement: (107, 35)

Results: Single-Scale Images

Below are the results for single-scale images, where I search a [-15, 15] window in both the x and y image coordinates. This is the initial "naive" algorithm, which works for smaller, low-resolution images. If the images below do not render, please see media single-naive on GitHub for the photos. The displacement vectors are correctly referenced.

Monastery

Baseline Image - Green: -3,2 Red: 3,2
Green Displacement: (-3, 2), Red Displacement: (3, 2)

Cathedral

Baseline Image - Green: 5,2 Red: 12,3
Green Displacement: (5, 2), Red Displacement: (12, 3)

Tobolsk

Baseline Image - Green: 3,2 Red: 6,3
Green Displacement: (3, 2), Red Displacement: (6, 3)

Conclusion

The project successfully automates aligning and colorizing the digitized images from Prokudin-Gorskii's collection. The results show that NCC generally produces better alignment than SSD, and that the pyramid search significantly reduces computation time for larger images.
