Digit Recognition
in Mobile Devices
Student: Jakub Kříž
Supervisor: doc. RNDr. Vlastislav Dohnal, Ph.D.

Main points
•Thesis Objectives
•Goggles
•Android application – Environment
•OCR – Preprocessing, Feature Extraction, Classification
•OpenCV – Android, Manager, Examples, OpenCV Demo2, VLFeat
•Tesseract – tess-two, NDK, Cygwin, Character recognition, android-ocr
•Training Tesseract – Tesseract-OCR, Tesseract3, 3rdParty, fonts, images
•Image Preprocessing – manual, automatic, Leptonica

Thesis Objectives
•Android Application – camera, box
•OCR – digit recognition
•High Success Rate
•Electric, Gas, Water Meters
•Extension – parts, Tomáš Lexmaul's diploma thesis

Goggles
•Existing Solution – box, camera
•Internet
•Additional Functions (Search, Translation, Sudoku, QR codes, sightseeing)

Android Application
•developer.android.com/training
•SDK (Eclipse ADT Bundle)
•SDK Manager
•Eclipse – Import… Existing Projects Into Workspace

OCR
•blog.damiles.com/2008/11/basic-ocr-in-opencv
•Preprocessing – input image, filtering, size normalization, colour conversion, bounding boxes, …
•Feature extraction – converting the image into a feature vector to classify
•Classification – feature vector, training the system, classification method such as k-NN
•my topic – 1 row, digits only, gaps, artefacts
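
The feature extraction and classification steps above can be sketched as follows. This is a minimal illustration, not the method used in the thesis: the zoning features, the toy 4×4 glyphs, and the function names `zone_features` and `knn_classify` are all assumptions made for the example.

```python
import math
from collections import Counter


def zone_features(glyph, rows=2, cols=2):
    """Split a binary glyph (list of rows of 0/1) into rows x cols zones and
    return the fraction of ink pixels in each zone as the feature vector."""
    height, width = len(glyph), len(glyph[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            ink = sum(glyph[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = max(1, (y1 - y0) * (x1 - x0))
            features.append(ink / area)
    return features


def knn_classify(sample, training, k=3):
    """Return the majority label among the k training vectors closest
    (Euclidean distance) to `sample`. `training` is a list of
    (feature_vector, label) pairs."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy glyphs: a vertical bar for "1" and a ring for "0".
ONE = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
ZERO = [[1, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
NOISY_ONE = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 1, 1, 0], [0, 0, 1, 0]]

training_set = [(zone_features(ONE), "1"), (zone_features(ZERO), "0")]
predicted = knn_classify(zone_features(NOISY_ONE), training_set, k=1)
```

With only one training example per class, `k=1` is used here; a real training set would hold many samples per digit and a larger `k`.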

OpenCV
•open source computer vision and machine learning software library
•2500+ optimized algorithms
•detect and recognize faces, identify objects, classify human actions in videos, …
•OpenCV for Android v2.4.9 – imported as a library project, added as a referenced project
•OpenCV Manager – must be installed alongside the app; does not require the NDK
•examples - C++, Python, face recognition
•OpenCV Demo2 – Barry Thomas, source codes, Canny Edges, Track Features, …
•VLFeat – in addition to OpenCV: visual features (HOG), statistical methods (SVM)

Tesseract
•Open Source OCR engine, Apache 2.0 license
•Usage – directly (command line) or via the API; multiple languages; no built-in GUI
•tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
•tess-two (tesseract-android-tools + Leptonica )
•Android NDK 32, Cygwin
•Character Recognition
•android-ocr – box, settings
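
The command-line form shown above can be assembled for a single row of digits roughly as follows. This is a sketch under assumptions: the helper name `tesseract_command` and the file names are hypothetical, page segmentation mode 7 (treat the image as a single text line) is chosen to match the one-row use case, and the stock `digits` config file is assumed to be installed with Tesseract. Newer Tesseract releases spell the flag `--psm`.

```python
def tesseract_command(image, output_base, lang="eng", psm=7, configs=("digits",)):
    """Build the argument list for:
    tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
    """
    cmd = ["tesseract", image, output_base, "-l", lang, "-psm", str(psm)]
    cmd.extend(configs)  # config files come last, e.g. the digits whitelist
    return cmd


# Hypothetical file names; the recognized text would be written to reading.txt.
cmd = tesseract_command("meter.png", "reading")
# To actually run it (requires Tesseract on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```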


Training Tesseract
•Tesseract-OCR download
•Tesseract3 – box files (origin [0,0] at bottom-left), train, font_properties, combine
•Tools – jTessBoxEditor, Serak
•Training data – fonts vs. images (real data)
•Real data problem
font_properties format:
<fontname>   <italic> <bold> <fixed> <serif> <fraktur>
timesitalic  1        0      0       1       0
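
Because box files place the origin at the bottom-left while most image tools measure from the top-left, the y coordinates have to be flipped when writing boxes by hand. The sketch below assumes the Tesseract 3 box line layout `symbol left bottom right top page`; the helper names `to_box_line` and `font_properties_line` and the sample numbers are hypothetical.

```python
def to_box_line(symbol, x, y, width, height, image_height, page=0):
    """Convert a top-left-origin bounding box (x, y, width, height) into a
    box-file line whose coordinates use a bottom-left origin."""
    left = x
    right = x + width
    top = image_height - y                # distance from the bottom edge
    bottom = image_height - (y + height)
    return f"{symbol} {left} {bottom} {right} {top} {page}"


def font_properties_line(name, italic=False, bold=False, fixed=False,
                         serif=False, fraktur=False):
    """Format one font_properties entry:
    <fontname> <italic> <bold> <fixed> <serif> <fraktur>"""
    flags = (italic, bold, fixed, serif, fraktur)
    return " ".join([name] + [str(int(flag)) for flag in flags])


# A digit "7" at x=10, y=20, 30 px wide and 40 px tall in a 100 px high image.
box = to_box_line("7", 10, 20, 30, 40, image_height=100)
props = font_properties_line("timesitalic", italic=True, serif=True)
```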


Image Preprocessing
•open problem
•manual – Image Editor - grayscale, negative, manual threshold (remove artefacts)
•automatic – Leptonica - Grayscale, Inversion, Binarization, Thresholding
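
The automatic steps listed above can be illustrated with a small pure-Python sketch. It is a stand-in for the idea only and does not use Leptonica's API: the function names, the luma weights, and the fixed threshold of 128 are assumptions for the example, and images are represented as nested lists of pixel values.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to 0-255 grey levels using the
    common 0.299/0.587/0.114 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def invert(gray_image):
    """Produce the negative, so dark digits on a light background become bright."""
    return [[255 - pixel for pixel in row] for row in gray_image]


def binarize(gray_image, threshold=128):
    """Fixed-threshold binarization: 1 for pixels at or above the threshold, else 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row]
            for row in gray_image]


# Hypothetical 2x2 test image: white, black, pure red, pure blue.
rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(rgb)
binary = binarize(invert(gray))
```

A fixed threshold is the simplest choice; on real meter photos with uneven lighting an adaptive threshold would likely be needed.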

Conclusion
•Thesis Objectives
•Goggles
•Android application
•OCR
•OpenCV
•Tesseract
•Training Tesseract
•Image Preprocessing

Thank you for your attention.