Digit Recognition in Mobile Devices
Student: Jakub Kříž
Supervisor: doc. RNDr. Vlastislav Dohnal, Ph.D.

Main points
•Thesis Objectives
•Goggles
•Android application – Environment
•OCR – Preprocessing, Feature Extraction, Classification
•OpenCV – Android, Manager, Examples, OpenCV Demo2, VLFeat
•Tesseract – tess-two, NDK, Cygwin, Character recognition, android-ocr
•Training Tesseract – Tesseract-OCR, Tesseract3, 3rdParty, fonts, images
•Image Preprocessing – manual, automatic, Leptonica

Thesis Objectives
•Android Application – camera, box
•OCR – digit recognition
•High Success Rate
•Electric, Gas, Water Meters
•Extension – parts, Tomáš Lexmaul's diploma thesis

Goggles
•Existing Solution – box, camera
•Internet
•Additional Functions (Search, Translation, Sudoku, QR, sightseeing)

Android Application
•developer.android.com/training
•SDK (Eclipse ADT Bundle)
•SDK Manager
•Eclipse – Import… Existing Projects Into Workspace

OCR
•blog.damiles.com/2008/11/basic-ocr-in-opencv
•Preprocessing – input image, filtering, size normalization, colour conversion, bounding boxes, …
•Feature extraction – image conversion, vector of features to classify
•Classification – feature vector, train system / classification method such as k-NN
•my topic – 1 row, digits only, gaps, artefacts

OpenCV
•open source computer vision and machine learning software library
•2500+ optimized algorithms
•detect and recognize faces, identify objects, classify human actions in videos, …
•Android v2.4.9 – library as project, added as referenced project
•OpenCV Manager – must be installed alongside the app, does not require NDK
•examples – C++, Python, face recognition
•OpenCV Demo2 – Barry Thomas, source code, Canny Edges, Track Features, …
•VLFeat – in addition to OpenCV, visual features (HOG), statistical methods (SVM)

Tesseract
•Open Source OCR engine, Apache 2.0 license
•Use – directly / API, languages, no built-in GUI
•tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
•tess-two (tesseract-android-tools + Leptonica)
•Android NDK 32, Cygwin
•Character Recognition
•android-ocr – box, settings

Training Tesseract
•Tesseract-OCR download
•Tesseract3 – box files ([0,0] bottom-left), train, font_properties, combine
  - jTessBoxEditor, Serak
  - fonts vs. images (real data)
•Real data problem
•font_properties entry: timesitalic 1 0 0 1 0

Image Preprocessing
•open problem
•manual – Image Editor – grayscale, negative, manual threshold (remove artefacts)
•automatic – Leptonica – Grayscale, Inversion, Binarization, Thresholding

Conclusion
•Thesis Objectives
•Goggles
•Android application
•OCR
•OpenCV
•Tesseract
•Training Tesseract
•Image Preprocessing

Thank you for your attention.
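
Backup sketch for the OCR steps (preprocessing, feature extraction, classification). This is a minimal illustration in plain Python, not the thesis implementation and not the OpenCV or Leptonica API. The function names, the fixed threshold of 128, the 3x3 toy glyphs and the assumption of dark digits on a light background are all hypothetical choices made for the example.

```python
from collections import Counter


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to 0-255 gray values (BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def invert(gray_image):
    """Negative image: dark ink becomes bright (assumes dark digits on light)."""
    return [[255 - value for value in row] for row in gray_image]


def binarize(gray_image, threshold=128):
    """Fixed global threshold (an assumed value): 1 = ink, 0 = background."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in gray_image]


def to_feature_vector(binary_image):
    """Feature extraction in its simplest form: flatten the pixels row by row."""
    return [value for row in binary_image for value in row]


def knn_classify(features, training_set, k=3):
    """Return the majority label among the k nearest training samples."""
    def squared_distance(sample):
        return sum((a - b) ** 2 for a, b in zip(features, sample[0]))

    nearest = sorted(training_set, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 3x3 training glyphs; real training data would be far larger.
TOY_TRAINING_SET = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], "1"),
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], "0"),
    ([1, 1, 1, 0, 0, 1, 0, 0, 1], "7"),
]
```

Usage: pass a cropped digit through `to_grayscale`, `invert`, `binarize` and `to_feature_vector`, then call `knn_classify(features, TOY_TRAINING_SET, k=1)`. With only one sample per digit, `k=1` is the sensible setting; a larger `k` needs several samples per class.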
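
Backup sketch for the box-file note ([0,0] bottom-left). Most image tools report bounding boxes from the top-left corner, so the y coordinates have to be flipped before a box line is written. The sketch below assumes the usual Tesseract 3 box line layout of symbol, left, bottom, right, top, page; the function name and its parameters are hypothetical.

```python
def to_box_line(char, x, y, width, height, image_height, page=0):
    """Build one box-file line from a top-left-origin bounding box.

    (x, y) is the top-left corner of the glyph in image coordinates,
    where y grows downwards. Box files measure y upwards from the
    bottom edge, so both vertical coordinates are flipped.
    """
    left = x
    right = x + width
    top = image_height - y
    bottom = image_height - (y + height)
    return f"{char} {left} {bottom} {right} {top} {page}"
```

For example, a digit 7 at x=10, y=5 with size 20x30 in an image 100 pixels high gives the line `7 10 65 30 95 0`.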