Scene Text Recognition (STR) focuses on extracting textual information embedded in natural scene images, and it plays a vital role in applications such as document analysis and automated translation. Despite its importance, STR remains a challenging task because of the diversity of text styles, orientations, and backgrounds in real-world images.
My work in STR aims to address several key challenges: improving the recognition of irregular and non-horizontal text arrangements, enhancing the handling of noisy, low-resolution images, and developing methods that generalize across languages. Current models often struggle with linguistic variation, characters with diacritics, and diverse font styles, which limits their usefulness for non-English text and under real-world conditions.
To tackle these issues, my research emphasizes the creation of robust datasets and advanced models capable of handling multi-language text, diverse fonts, and varied environmental conditions. This includes exploring novel synthetic data generation techniques and recognition architectures tailored to the unique challenges of scene text.