Image Processing
The first stage of the image processing pipeline aligns the x and y axes of the image frame with the rows and columns of the captured pattern. This requires estimating the rotation offset and then applying the inverse rotation to the image. Derotation is done before any other step because rotating an image requires resampling and interpolation, which blurs it. That blur is useful as initial noise reduction, but it works against the eventual goal of a sharp binary segmentation. As an implementation note, rescaling of the raw input image can be folded into the same transform, which avoids a second interpolation pass along with its extra cost and degradation.
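As a minimal sketch of folding derotation and rescaling into one resampling pass, the following NumPy example maps every output pixel back to a source location with a single combined transform and samples it bilinearly. The function name `derotate_and_rescale`, its parameters, the rotation about the image centre, the angle sign convention, and the choice of bilinear interpolation with clamped borders are all assumptions made for illustration, not details taken from the project.

```python
import numpy as np

def derotate_and_rescale(image, angle, scale):
    """Undo a rotation of `angle` radians and resize by `scale` in one pass.

    `angle` is measured from the +x axis toward the +y axis (row index
    increasing), so it appears clockwise on screen. Hypothetical helper.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    out_h, out_w = int(round(h * scale)), int(round(w * scale))

    # Output pixel grid, expressed relative to the output centre and
    # divided by the scale so it is in source-pixel units.
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    xc = (xs - (out_w - 1) / 2) / scale
    yc = (ys - (out_h - 1) / 2) / scale

    # Inverse mapping: rotate the output grid by `angle` to find where
    # each output pixel comes from in the (rotated) source image.
    c, s = np.cos(angle), np.sin(angle)
    src_x = c * xc - s * yc + (w - 1) / 2
    src_y = s * xc + c * yc + (h - 1) / 2

    # Bilinear interpolation; out-of-range samples clamp to the border.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    fx = src_x - x0
    fy = src_y - y0
    x0c, x1c = np.clip(x0, 0, w - 1), np.clip(x0 + 1, 0, w - 1)
    y0c, y1c = np.clip(y0, 0, h - 1), np.clip(y0 + 1, 0, h - 1)
    top = img[y0c, x0c] * (1 - fx) + img[y0c, x1c] * fx
    bottom = img[y1c, x0c] * (1 - fx) + img[y1c, x1c] * fx
    return top * (1 - fy) + bottom * fy
```

Because rotation and scaling are composed before sampling, each output pixel is interpolated exactly once. A library routine that accepts an affine matrix would serve the same purpose.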
The method for rotation estimation takes advantage of the uniform checkerboard-like grid underlying the binary pattern. Ignoring perspective distortions (see Perspective and Homography), it can be assumed that the sharp gradients throughout the image fall along one of four directions normal to the rows and columns of the binary pattern. Since these four directions are expected to be multiples of 90 degrees apart, the most straightforward approach is to average the gradient angles modulo 90 degrees. Angles are circular quantities, however, so the only well-defined way to average them is as vectors in Cartesian coordinates rather than as scalars in polar form. Instead of taking the angle modulo 90 as with scalar quantities, the equivalent is achieved in Cartesian space by multiplying each angle before averaging, in the manner of the double-angle trigonometric identities. Quadrupling the angle of every gradient cancels all 90 degree offsets, since four quarter turns make one full rotation. In summary, the rotation estimation method is as follows: compute the image gradients, quadruple each angle, take the Cartesian mean of the quadrupled angles, then divide the angle of that mean by four to obtain the rotation estimate. Keep in mind that an estimate based on aligning rows and columns is only determined up to a multiple of 90 degrees, so the absolute cardinal orientation remains ambiguous and is left to be recovered from the bit pattern itself once decoded.
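The estimation steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's implementation: the function name `estimate_rotation` is hypothetical, `np.gradient` (central differences) stands in for whatever gradient operator is actually used, and weighting each gradient by its magnitude is an added assumption so that flat regions contribute nothing to the mean.

```python
import numpy as np

def estimate_rotation(image):
    """Estimate the grid rotation in radians, in the range (-pi/4, pi/4].

    The angle is measured from the +x axis toward the +y axis (row index
    increasing). Hypothetical helper for illustration.
    """
    img = np.asarray(image, dtype=float)

    # np.gradient returns derivatives along axis 0 (rows, y) first,
    # then axis 1 (columns, x).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)

    # Quadruple each angle so directions 90 degrees apart coincide, then
    # average in Cartesian space. Magnitude weighting is an assumption.
    mean_cos = np.sum(magnitude * np.cos(4 * angle))
    mean_sin = np.sum(magnitude * np.sin(4 * angle))

    # The rotation estimate is a quarter of the mean quadrupled angle.
    return np.arctan2(mean_sin, mean_cos) / 4
```

The result is only meaningful modulo 90 degrees, matching the ambiguity described above. Note that this sketch relies on edges being at least slightly smooth: on a hard-edged, aliased binary image, finite-difference gradient directions are quantized to a few values and the estimate degrades.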
Work in progress