This text is a practical guide for linguists, and programmers, who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together at the intersection between the Unicode Standard and the International Phonetic Alphabet. Although these standards are often met with frustration by users, they nevertheless provide language researchers and programmers with a consistent computational architecture needed to process, publish and analyze lexical data from the world's languages. Thus we bring to light common, but not always transparent, pitfalls which researchers face when working with Unicode and IPA. Having identified...
There are a multitude of programming languages in use today; dozens of very popular languages with w...
The article presents and discusses a few African Latin orthographies. The scope of the work is set ...
A linguist uses various kinds of linguistic data – both text corpora or text collections and dictio...
Across the world's languages and cultures, most writing systems predate the use of computers. In the...
We agree with Frost that the variety of orthographies in the world's languages complicates the task ...
The term orthography is derived from the Greek word ‘orthos’ which means ‘correct’, and ‘graphein’,...
Orthography issues are complex. Although literature about writing systems has flourished in recent y...
A universal character encoding is required to produce software that can be localized for any languag...
This OER delves into the fascinating world of orthography and writing systems, exploring the fundame...
There are various competing and sometimes incompatible requirements for an orthography: phonological...
Writing technology is a central issue for Human Language Technology (HLT) both in terms of theory an...
For those working with minority languages, one of the first needs is the ability to work with the or...