A Robust Script Identification System for Historical Indian Document Images

S. Kavitha; P. Shivakumara; G. Hemantha Kumar; C. L. Tan

Authors

S. Kavitha, Department of Studies in Computer Science, University of Mysore-Karnataka
P. Shivakumara, Faculty of Computer Science and Information Technology, University of Malaya
G. Hemantha Kumar, Department of Studies in Computer Science, University of Mysore-Karnataka
C. L. Tan, School of Computing, National University of Singapore

Keywords:

Indus document, Dominant points, Proximity matrix, Variance, Indian scripts identification

Abstract

Automatic script identification in document archives is essential for locating a specific document and for choosing an appropriate Optical Character Recognizer (OCR) for recognition. In addition, identifying some of the oldest historical documents, such as those written in Indus script, is challenging and interesting because of inter-script similarities. In this work, we propose a new robust script identification system for Indian scripts that covers Indus documents as well as English, Kannada, Tamil, Telugu, Hindi and Gujarati, which helps in selecting an appropriate OCR for recognition. The proposed system explores the spatial relationship between the dominant points (intersection points, end points and junction points) of the connected components in the documents to extract the structure of the components. The degree of similarity between scripts is studied by computing the variances of the proximity matrices of the dominant points of the respective scripts. The method is evaluated on 700 scanned document images. Experimental results show that the proposed system outperforms the existing methods in terms of classification rate.
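
The proximity-matrix variance mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance metric, how the dominant points are detected, or how the variance is aggregated, so everything below is an assumption for illustration only: the function names `proximity_matrix` and `proximity_variance` are hypothetical, Euclidean distance is assumed, and the variance is taken over the distinct pairwise distances of one connected component.

```python
import math
from statistics import pvariance


def proximity_matrix(points):
    """Pairwise Euclidean distances between dominant points.

    Assumption: Euclidean distance; the abstract does not name the metric.
    """
    return [[math.dist(p, q) for q in points] for p in points]


def proximity_variance(points):
    """Population variance of the distinct pairwise distances.

    Assumption: only the upper triangle of the symmetric matrix is used,
    so each pair of points is counted once and the zero diagonal is ignored.
    """
    matrix = proximity_matrix(points)
    n = len(points)
    distances = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    if not distances:
        # Fewer than two points: no pairwise distance to measure.
        return 0.0
    return pvariance(distances)


# Hypothetical dominant points (x, y) of two connected components.
compact_component = [(0, 0), (0, 2), (2, 0), (2, 2)]
elongated_component = [(0, 0), (3, 0), (6, 0), (9, 0)]

print(proximity_variance(compact_component))
print(proximity_variance(elongated_component))
```

Under these assumptions, a component whose dominant points are spread unevenly yields a larger variance than a compact one, which is the kind of structural cue a classifier could compare across scripts.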
