The use of advanced computational methods for the analysis of large corpora of electronic texts is becoming increasingly popular in humanities and social science research. Unfortunately, Tibetan Studies has lacked a comparable repository of electronic, searchable texts. The automated recognition of printed texts, known as Optical Character Recognition (OCR), offers a solution to this problem; however, until recently, robust OCR systems for the Tibetan language have not been available. In this paper, we introduce one such system, Namsel, which uses OCR to support the production, review, and distribution of searchable Tibetan texts at large scale. Namsel tackles a number of challenges unique to the recognitio...
This article presents a pipeline that converts collections of Tibetan documents in plain text or XML...
In recent years, rapid progress has been made in computer processing of oriental languages, and the ...
Tibetan websites are important for studying the Tibetan language and Tibetan culture. The domain nam...
Even if the technological and digital world is expanding more quickly, there are still many things t...
This paper describes a recognition system for online handwritten Tibetan characters using advanced t...
This contribution aims at presenting a condensed version of an online catalogue of 15th and 16th cen...
Optical character recognition, usually abbreviated to OCR, is the mechanical or electroni...
Reading is a complex and difficult skill. The main difficulty beginning readers face is learning whi...
This paper presents a mapping of the 15th- and 16th-century Buddhist printed works from South-Weste...
In Africa around 2,500 languages are spoken. Some of these languages have their own indigenous scrip...
Automatic machine-printed Optical Character or text Recognizers (OCR) are highly desirable for a mul...
India is a multi-lingual country. A significantly large number of scripts are used to represent thes...
The library research award for my project, “Redactions from India in Modern Tibet,” allowed me to ac...
The construction of a character dataset is an important part of the research on document analysis an...