^  Goal  ^  Measure  ^  Due date  ^  Status  ^
| Integrating Tesseract with SahanaOCR | Accurately recognize the letters by the system using Tesseract | 06/20/2010 | Completed |
| Training Tesseract for handwritten characters | Accurately recognize the handwritten letters by the system using Tesseract | 06/30/2010 | Completed (but there was no significant improvement in the accuracy of Tesseract, so another data set has to be tried) |
| Solving the identified issue when processing rotated images | Correctly process the rotated images and correctly recognize the data | 07/10/2010 | Completed |
^  Midterm Evaluation from 12th July to 16th July  ^^^^
| Developing a UI for the system | Fully handle the process through the UI | 07/28/2010 | Completed |
| More training of Tesseract for handwritten characters | Accurately recognize the handwritten letters by the system using Tesseract | 08/05/2010 | Completed (but there was no significant improvement in the accuracy of Tesseract) |
| Integrating the scanner manager with the SahanaOCR UI | Correctly scan the images and feed them to SahanaOCR | 08/16/2010 | Completed |

==== Project implementation ====

Initially I went through sample implementations using wxWidgets to develop platform-independent programs.

{{:foundation:wxwidget_logo.jpg|}}

But after discussing the suggestions with my mentor and prioritizing them, he mentioned that integrating Tesseract and training it for handwritten characters were the most important tasks. So I started integrating Tesseract as the optical character recognition engine.

== Integrating Tesseract with SahanaOCR ==

What is Tesseract?

The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available.

I went through the Tesseract documentation and got help from the forum to build a good understanding of the Tesseract architecture. Then I combined the OpenCV library files with the Tesseract code for image handling and used the built Tesseract library files to communicate with its functions, calling them through the Tesseract API (TessBaseAPI).
I successfully replaced the existing FANN neural network functions with Tesseract and made the major changes in the FormProcessor class to send the segmented image data to Tesseract. The accuracy of the results improved greatly with the Tesseract integration, but not up to 100%, so I had to train Tesseract on handwritten data to improve it further.
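
Below is a minimal sketch of how a segmented letter box can be handed to Tesseract through TessBaseAPI, assuming Tesseract 3.x with OpenCV used for image handling; the file name and the surrounding code are illustrative only and not taken from the SahanaOCR sources.

<code cpp>
// Minimal TessBaseAPI usage: load one segmented letter box and recognize it.
#include <tesseract/baseapi.h>
#include <opencv2/highgui/highgui.hpp>
#include <cstdio>

int main()
{
    tesseract::TessBaseAPI api;
    if (api.Init(NULL, "eng") != 0) {   // load the trained data from tessdata
        std::fprintf(stderr, "Could not initialise Tesseract\n");
        return 1;
    }

    // A segmented letter box produced by the FormProcessor, loaded as greyscale.
    cv::Mat box = cv::imread("letter_box.png", 0);

    // Hand the raw pixel buffer to Tesseract and fetch the recognised text.
    api.SetImage(box.data, box.cols, box.rows, 1, static_cast<int>(box.step));
    char* text = api.GetUTF8Text();
    std::printf("Recognised: %s\n", text);

    delete[] text;
    api.End();
    return 0;
}
</code>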

== Training Tesseract for handwritten letters ==
At the start I used a handwritten data set that was available on the web to build the tessdata folder of Tesseract trained data. Gihan (my mentor) helped me to create the images from the dataset, and I created the necessary tessdata files from those images.

{{:foundation:hand_written_letterq.jpg|}}

A sample image I used to train Tesseract for the handwritten letter “q”


But the accuracy did not improve at that stage, and most of the time Tesseract returned a segmentation fault on those images. So I then tried a data set that I had written myself.

{{:foundation:handwritten2.jpg?792x208|}}

A portion of a sample image I wrote to train Tesseract for handwritten letters
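
Once new tessdata files are in place, the only change needed on the recognition side is the language code passed to Init(); a brief sketch is shown below, where the language name "hwr" is a hypothetical placeholder for the handwriting trained data.

<code cpp>
// Sketch: initialise Tesseract with handwriting trained data instead of "eng".
// "hwr" is a hypothetical language code; the real name depends on how the
// trained data files under tessdata were generated.
#include <tesseract/baseapi.h>

bool initHandwriting(tesseract::TessBaseAPI& api)
{
    return api.Init(NULL, "hwr") == 0;   // looks for the hwr data in tessdata
}
</code>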


== Recognize only letters or digits ==

Then, to improve the accuracy, I added a new feature to the system using Tesseract: reading the data field type from the xform and choosing the recognition mode according to it. So when it reads the data field type as


<dataType> String </dataType>


I set the possible characters that can appear in the data field to letters, as follows:

api.SetVariable("tessedit_char_whitelist", " ABCDEFGHIJKLMNOPQRSTUVWXYZ ");

After that it only matches letters against the images. Then, if the data field is a number,


<dataType>number</dataType>


I set the output to digits only. This eliminated a lot of ambiguous results that mixed up letters and numbers, and improved the accuracy of the system.
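
A small sketch of how this selection could look in code is given below; the function name and the digit whitelist for the "number" case are illustrative assumptions, since only the letter whitelist is quoted above.

<code cpp>
// Sketch: pick a Tesseract character whitelist from the xform field type.
#include <string>
#include <tesseract/baseapi.h>

void applyFieldTypeWhitelist(tesseract::TessBaseAPI& api, const std::string& dataType)
{
    if (dataType == "String") {
        // Restrict matches to upper-case letters (and space), as described above.
        api.SetVariable("tessedit_char_whitelist", " ABCDEFGHIJKLMNOPQRSTUVWXYZ ");
    } else if (dataType == "number") {
        // Restrict matches to digits only (assumed digit whitelist).
        api.SetVariable("tessedit_char_whitelist", "0123456789");
    }
}
</code>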


== Modifying the rotation compensation functions ==


While working with the system I noticed that SahanaOCR was unable to process images rotated by about 5 degrees. It was only able to turn an image upside down and process it; for images rotated between -5 and 5 degrees or between 175 and 185 degrees, the system did not validate the forms against the xforms. So I had to modify the algorithm used to rotate the images, after which it was able to rotate them correctly. The following two images show the correct rotation of the images by the system.
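
The rotation itself can be expressed as a simple affine warp; the sketch below assumes the OpenCV 2.x C++ interface and that the rotation angle has already been estimated from the form's corner marks, which is not shown here.

<code cpp>
// Sketch: rotate a scanned form back to the upright orientation.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

cv::Mat compensateRotation(const cv::Mat& form, double angleDegrees)
{
    // Rotate about the image centre so slightly skewed (or upside-down) scans
    // are brought upright before the data fields are segmented.
    cv::Point2f centre(form.cols / 2.0f, form.rows / 2.0f);
    cv::Mat rot = cv::getRotationMatrix2D(centre, angleDegrees, 1.0);
    cv::Mat upright;
    cv::warpAffine(form, upright, rot, form.size(),
                   cv::INTER_LINEAR, cv::BORDER_CONSTANT, cv::Scalar(255));
    return upright;
}
</code>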


{{:foundation:original_rotated.jpg?496x697|}} {{:foundation:horizontally_proceesed_image.jpg?496x697|}}

Original image rotated to 175 degrees and the corresponding, properly rotated image produced by the system



In this case the data field coordinates deviated slightly in the rotated images, so there were some errors in the segmented letter boxes. We planned to handle this by applying an improved algorithm; it is listed in the to-do section.


== Designing the UI ==


Then I worked on improving the UI features. After discussing with my mentor, I started by combining the main functionalities (a minimal UI sketch follows the list below):

  * loading images from the file system
  * loading xforms into the system
  * processing the form and getting the results
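
For illustration, the following is a minimal wxWidgets frame with those three actions wired to buttons; it is a sketch only, the class and handler names are invented, and the actual SahanaOCR UI may be built with a different toolkit.

<code cpp>
// Illustrative wxWidgets skeleton: load image, load xform, process form.
#include <wx/wx.h>

enum { ID_LOAD_IMAGE = wxID_HIGHEST + 1, ID_LOAD_XFORM, ID_PROCESS };

class OcrFrame : public wxFrame
{
public:
    OcrFrame() : wxFrame(NULL, wxID_ANY, wxT("SahanaOCR"))
    {
        wxPanel* panel = new wxPanel(this);
        wxBoxSizer* sizer = new wxBoxSizer(wxVERTICAL);
        sizer->Add(new wxButton(panel, ID_LOAD_IMAGE, wxT("Load image...")), 0, wxALL, 5);
        sizer->Add(new wxButton(panel, ID_LOAD_XFORM, wxT("Load xform...")), 0, wxALL, 5);
        sizer->Add(new wxButton(panel, ID_PROCESS, wxT("Process form")), 0, wxALL, 5);
        panel->SetSizerAndFit(sizer);
        Connect(ID_LOAD_IMAGE, wxEVT_COMMAND_BUTTON_CLICKED,
                wxCommandEventHandler(OcrFrame::OnLoadImage));
    }

private:
    void OnLoadImage(wxCommandEvent& event)
    {
        wxFileDialog dlg(this, wxT("Choose a scanned form"));
        if (dlg.ShowModal() == wxID_OK) {
            // Hand dlg.GetPath() to the image-processing code here.
        }
    }
};

class OcrApp : public wxApp
{
public:
    virtual bool OnInit()
    {
        (new OcrFrame())->Show(true);
        return true;
    }
};

IMPLEMENT_APP(OcrApp)
</code>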

After that I designed the Log Form to show the processing steps of the forms. The Log Form contains the following features:

  * showing the segmented letter boxes in a picture box
  * showing the recognized letter corresponding to each segmented image
  * showing the full results of the form while processing

The following screenshot shows the design of the Log Form.

{{:foundation:logform_modified.jpg|}}

Screenshot of the Log Form of the UI while running a process


Then I started working on integrating the Scanner Manager option into the UI, which loads images directly from the scanners. Using it, we can automate the process of loading forms into the system.

Now the images are correctly uploaded to the system using the Scanner Manager.


Finally, I identified some more functionality to add to the system so it could be more usable for its users.


== To do ==

These are the features I have identified to improve the system further in the future.


  * To improve the accuracy of the outputs, a proper training dataset for handwritten characters has to be created for Tesseract.

  * To handle rotated images, we have to change the current algorithm. The current algorithm first recognizes the form from the 5 black boxes at the edges of the image. It then extracts the form section bounded by those edges and extracts the data fields, input areas and letter boxes according to coordinates measured from those edges. But if there is a small deviation in the position of any data field, the areas inside it are not segmented correctly. So we have to change it to first extract a slightly larger area than the data field, then detect the edges of the data field using image processing, and then process the fields within it. This should remove these issues with the rotated images (a rough sketch of the idea is shown after this list).

  * Completing the NetMngr and completing the system so that the recognized data is uploaded to its corresponding module.
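
A rough sketch of that padded-crop idea, assuming the OpenCV 2.x C++ interface and a greyscale form image, is shown below; the function name, the margin parameter and the contour-based edge detection are illustrative assumptions, not the planned implementation.

<code cpp>
// Sketch: crop a slightly larger area than the expected data field, then
// locate the field's actual border before segmenting the letter boxes inside.
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

cv::Rect locateField(const cv::Mat& form, const cv::Rect& expected, int margin)
{
    // Take a padded region of interest around the expected field position.
    cv::Rect padded(expected.x - margin, expected.y - margin,
                    expected.width + 2 * margin, expected.height + 2 * margin);
    padded &= cv::Rect(0, 0, form.cols, form.rows);
    cv::Mat roi = form(padded);

    // Binarise and take the largest contour, assumed to be the field border.
    cv::Mat binary;
    cv::threshold(roi, binary, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binary, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    cv::Rect best;
    for (size_t i = 0; i < contours.size(); ++i) {
        cv::Rect r = cv::boundingRect(cv::Mat(contours[i]));
        if (r.area() > best.area())
            best = r;
    }
    // Translate back to full-image coordinates.
    return cv::Rect(best.x + padded.x, best.y + padded.y, best.width, best.height);
}
</code>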

== User Guide ==

This is the link to the user guide for the features provided by the existing SahanaOCR system:

http://wiki.sahanafoundation.org/doku.php/wiki:user:lgtkaushalya

This is the link to the video demo of the current SahanaOCR application:

http://www.youtube.com/watch?v=Zl3KR8QEHyI

Here is the link to the progress report of the SahanaOCR project during GSoC 2010:

http://www.mediafire.com/?am1aerng63ni450
 + 
 +== Conclusion == 
 + 
 +SahanaOCR is a system which came out form innovative ideas and it contains some new concept in a practical scenario. With working with the system during the project period I gained a lot of knowledge and that was a fascination era of my life. For that Gihan , Jo , Chammindra , Michel,  Hayesha and Suryagith helped me a lot and others from the community helped too. I’m willing to work with the project further more finish it as a complete project in the near future.    
  
  
