August 31, Tuesday
14:00 – 15:30
Historical Document Image Analysis of Arabic Manuscripts and Script Recognition
Graduate seminar
Lecturer : Raid Saabni
Affiliation : CS, BGU
Location : 202/37
Host : Graduate Seminar
Document image analysis (DIA) refers to the process of converting a raster image of a document page (a matrix of pixels) to
a symbolic form consisting of textual (characters, digits, punctuation, words) and graphical (lines, geometric shapes, etc.) objects.
Document descriptions in terms of these higher-level objects are significantly more compact than their image counterparts.
More importantly, the rich semantic content of such descriptions makes it possible to manipulate these documents to serve a variety
of uses, such as searching them for specific patterns or classifying and combining them according to given criteria.
Most DIA systems consist of the following stages: binarization, page analysis and segmentation, preprocessing, feature extraction, classification, and post-processing.
This talk will include a summary of some of our results, including page layout segmentation, keyword searching and spotting, and script recognition.
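As a rough illustration of binarization, the first stage mentioned in the abstract, the sketch below thresholds a grayscale page into ink and background pixels. It assumes Otsu's global threshold, a common textbook choice and not necessarily the method used in the talk. The function names otsu_threshold and binarize and the tiny sample image are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with gray level <= t
    sum_dark = 0  # sum of gray levels of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue  # no pixels in the dark class yet
        if w_light == 0:
            break     # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a grayscale image (list of rows) to 1 = ink, 0 = background.

    Assumes dark ink on a lighter page, as in most scanned manuscripts.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]


# Hypothetical 2x3 page fragment: two dark (ink) pixels on a light background.
page = [[200, 210, 30],
        [220, 25, 205]]
binary = binarize(page)
```

A single global threshold is only a starting point. Degraded historical pages with stains or uneven illumination usually need locally adaptive thresholds, which this sketch does not attempt.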