Report available at:
URN: urn:nbn:de:bvb:29-opus-28598
URL: http://www.opus.ub.uni-erlangen.de/opus/volltexte/2011/2859/
Definition and Evaluation of the NEOCR Dataset for Natural-Image Text Recognition
Nagy, Robert ;
Dicker, Anders ;
Meyer-Wegener, Klaus





| Field | Value |
|---|---|
| SWD subject headings | Optical character recognition, dataset, photography, evaluation |
| Free keywords (German) | natürliche Fotoaufnahmen (natural photographs), Szenentext (scene text) |
| Free keywords (English) | ocr, scene text recognition, dataset, evaluation, latin characters |
| CCS classification | I.2.10 Vis, I.7.5 Docu, I.4.8 Scen |
| Faculty | Technische Fakultät (Faculty of Engineering) |
| DDC subject group | Computer science |
| Document type | Report |
| Series | Technical reports / Department Informatik, ISSN 2191-5008 |
| Volume number | CS-2011,7 |
| Language | English |
| Year of creation | 2011 |
| Publication date | 28 September 2011 |
| Abstract (English) | Recently, growing attention has been paid to recognizing text in natural images. OCR on natural-image text is far more complex than OCR on scanned documents. Text in real-world environments appears in arbitrary colors, font sizes and typefaces, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently, no publicly available dataset covers all aspects of natural-image OCR. A comprehensive, well-annotated, configurable dataset for optical character recognition in natural images is defined and created for the evaluation and comparison of approaches to natural-image text OCR. Furthermore, current open-source and commercial OCR tools have been analyzed in various test scenarios using the proposed NEOCR dataset. Based on the results, further steps to be addressed by the OCR community towards comprehensive natural-image text recognition are identified. |