This text is a practical guide for linguists and programmers who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together at the intersection of the Unicode Standard and the International Phonetic Alphabet. Although these standards are often met with frustration by users, they nevertheless provide language researchers and programmers with the consistent computational architecture needed to process, publish and analyze lexical data from the world’s languages. Thus we bring to light common, but not always transparent, pitfalls which researchers face when working with Unicode and IPA. Having identified an...
This thesis describes our improvement of word sense translation for under-resourced languages utiliz...
The article presents and discusses a few African Latin orthographies. The scope of the work is set ...
This chapter first briefly reviews the history of character encoding. Following from this is a discu...
Across the world's languages and cultures, most writing systems predate the use of computers. In the...
We agree with Frost that the variety of orthographies in the world's languages complicates the task ...
The term orthography is derived from the Greek word ‘orthos’ which means ‘correct’, and ‘graphein’,...
Orthography issues are complex. Although literature about writing systems has flourished in recent y...
A universal character encoding is required to produce software that can be localized for any languag...
Writing technology is a central issue for Human Language Technology (HLT) both in terms of theory an...
This OER delves into the fascinating world of orthography and writing systems, exploring the fundame...
For those working with minority languages, one of the first needs is the ability to work with the or...
There are various competing and sometimes incompatible requirements for an orthography: phonological...