Course Home
Administration
Syllabus
References
Lecture Notes
Readings
Assignments
Projects
Resources


Comments? Idea?























































































































































































































































































































































Shape inference for sheet of paper with text via characteristic strips

Final project by

Arie Kozak

ariekmail@gmail.com


Introduction

In this project I will involve myself with attempt to perform shape inference, mainly using characteristic strips algorithm. In general the algorithm is not easily applicable, therefore this will be done with certain set of assumptions (stated below) under controlled environment. The sheet of paper (excluding text) constitutes a Lambertian surface (in approximation at least) with constant albido, therefore it provides good case for application of the algorithm. Given a photograph of a sheet with text, the program will output its plot in 3 dimensional space. The photograph is taken with single light source from above, while the sheet is curved - generally non-planar – as planar shape of the sheet produces trivial solution. The surface is assumed to be constant in one direction.

Approach and Method

The process is the following:

1. Locate the sheet in the image. The sheet is marked with red contour manually. Given this, the image can be segmented: all pixels inside the enclosing are separated from those outside.
Source image:


Output:


2. Locate the text. This is done by basic thresholding technique. Additional thresholding is done after preprocessing the image using high pass filter to reduce effect of changing illumination. Thresholds are found by generating a histogram and searching for local minima points. The final "text segment" is intersection of both "text segments".
Soruce image:


Output:


3. Remove the text from the image. The text interferes with characterestic strips, and hence should be removed. It is done by using interpolation (built in MatLab). The image is also smoothed using Gaussian filters. Before interpolation - to reduce "lateral inhibition" effect of the camera, and after, to smooth the text removal.
Source image:


Output:


4. Find starting points for characteristic strips. According to reflectance map, maximal intensity means p = q = 0. Use parabolic approximation according to B.K.P. Horn's chapter 11 by solving equations:
Exx = a^2 + b^2, Eyy = c^2 + b^2, Exy = (a + c) * b
See full report for more details.
Maximal points are found in areas, aka "clusters". Such clusters define common height for all pixels and each cluster is identified as local maxima or local minima. Note, that there are constraints on the possibilities: two clusters of the same type cannot appear "in a row".
Source image:


Output:


As seen in the example, there are two possible solutions: max/min/max, and min/max/min. Only one solution is calcluated, which one can be specified in the program.
5. Perform characteristic strips. The initial points assumed to have zero height. Each cluster is calculated separately.
Source image:


Output:


6. Merge clusters. Find closest clusters and calculate relative height between them. The closest points are found using Voronoi diagram (built in MatLab). The expected height is calculated for points from first cluster which are close to the second cluster using interpolation. Given their height and expected height, the height shift is calculated by minimizing the error for each pixel. (See full report for more details).
7. Rebuild the surface of the sheet. It was assumed that H is constant in one direction. Therefore all partial derivatives by x are zero, meaning p = 0 in some coordinate system so that all points (p, q) located on axis y of that system. To find the axis y is the same as to find a common line for all points (p, q). Which is done by using "least squares line" calculation learned in class.
After the new coordinate system is known, projection of all points on YZ plane is calculated.
Source image:


Project on YZ:


Next, polyline approximation is used which we learned in class. Given number of desired points = number of clusters + 2, the desired error can be approximated using binary search.
Using spline approximation on edge points of the polyline. Finally, the surface can be found:

Results

I ran the program on various images and got overall satisfactory results. Soruce image:


Result:


Soruce image:


Result:


Soruce image:


Result:


Conclusions

The program has quite a bit of different parameters that might need adjustment for certain type of images: like max% intensity threshold for finding max pixels, and maximal distance to relate to the same cluster. Characteristic strips requires sufficient amount of local maxima areas in the image to produce enough data (with current reflectance map at least). Generally the result is more than one solution, and does not include full 3d information of the surface spatial structure (height are only relative, and multiplication by constant is also solution).
In spite of this, for carefully taken photos the results were pretty good. Some problems / points for improvement:
- Detect sheet of paper automatically.
- Relaxing some assumptions, for example:
- Surface being constant in one direction can be relaxed, while partial derivative being independent of perpendicular direction. Although, calculating YZ projection might be more difficult / require more data.
- Illumination from arbitrary direction. Parabolic approximation need to be changed / replaced in this case.
- Search for clusters can/should be improved (as it doesn't work well for some cases).
- Polyline approximation doesn't work well for some cases, should be replaced with better algorithm.

Additional Information

The program is implemented in MatLab, version 2010a. Copy img1.jpg - the input image, and img1f.jpg - the same image with sheet marked, to the source code folder and run main.m

References

• Introduction to Computational and Biological Vision – Prof. Ohad Ben-Shahar. Including B.K.P. Horn's book chapter 11 from mandatory reading meterial.

  • Voronoi
  • Spline