Imran Abdullah

A multidisciplinary team of American researchers is developing optical character recognition (OCR) technology for Arabic and Persian that converts books and manuscripts into machine-readable text, addressing the lack of practical and effective OCR tools currently available for Arabic.

The Andrew W. Mellon Foundation, a US philanthropic organization that supports charitable and educational work, has awarded a two-year, $800,000 grant to the University of Maryland, College Park to develop technology that opens up digital access to a wide range of pre-modern literature and texts written in Persian and Arabic.

The Open Islamicate Texts Initiative will support optical character recognition for the Arabic script, meaning that scanned images of Arabic and Persian writing can be converted into digital text that can be processed electronically.

The initiative aims to develop easy-to-use, open-source software capable of producing these digital texts directly from Persian and Arabic books.

Matthew Thomas Miller, associate professor at the Roshan Institute for Persian Studies at the University of Maryland, leads a multidisciplinary team of researchers with expertise in computer science, information science, programming, Islamic history and the humanities.


Unread text
"We realized that there were attempts in different areas to create tools to digitize Persian and Arabic documents," Miller said. "But there was not much communication across the different fields, and these new tools did not get into the hands of users."

Until now, the development of digitization software has focused primarily on Latin-script languages, and in many cases the software requires specialized knowledge to operate.

Existing tools for digitizing Arabic texts lack accuracy and are often prohibitively expensive for academics, researchers and the general public.

The project team emphasizes that the volume of Arabic and Persian texts produced in the pre-modern era is so vast that no one could read them all, even over an entire scholarly career.

Miller said texts that are not yet machine-readable are "a potential treasure" and should be digitized so that researchers can more easily discover them, study them and mine their content.