This chapter first briefly reviews the history of character encoding. It then discusses standard and non-standard native encoding systems and evaluates efforts to unify these character codes, before turning to Unicode and the various Unicode Transformation Formats (UTFs). We conclude by recommending that Unicode (UTF-8, to be precise) be used in corpus construction.
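As a minimal illustration of the difference between the Unicode Transformation Formats mentioned above (a Python sketch; the sample string is our own, not taken from the chapter), the same two code points occupy different numbers of bytes under UTF-8, UTF-16, and UTF-32:

```python
# The same code points, encoded under three Unicode Transformation Formats.
text = "a\u20ac"  # U+0061 LATIN SMALL LETTER A + U+20AC EURO SIGN

utf8 = text.encode("utf-8")      # 1 byte for 'a', 3 bytes for the euro sign
utf16 = text.encode("utf-16-be") # 2 bytes each (both are in the BMP)
utf32 = text.encode("utf-32-be") # 4 bytes each, always

print(len(utf8), len(utf16), len(utf32))  # 4 4 8
```

UTF-8's variable-width design is what makes it attractive for corpora dominated by ASCII-range characters: those characters cost one byte each, while the full Unicode repertoire remains reachable.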
This essay looks at the history of digital text encoding, from the early and very limited simple alp...
In the preceding entries of this series, we have mostly dealt with encoding issues, that is to say h...
The adoption of Standard ECMA-6 (ISO 646) in 1965 as the agreed international 7-bit code for informa...
In a previous post, we covered various aspects of the Unicode character set. It's now time to get re...
A universal character encoding is required to produce software that can be localized for any languag...
The term "Unicode" was first introduced in 1987 by Joe Becker of Xerox, based on the phrase "unique,...
Plain text data consists of a sequence of encoded characters or “code points” from a given standard ...
The world of character encoding in 2010 has changed significantly since TEI began in 1987, thanks to...
Much electronic text in the languages of South Asia has been published on the Internet. However, whi...
The Unicode Standard is the de facto “universal” standard for character-encoding in nearly all moder...
Across the world's languages and cultures, most writing systems predate the use of computers. In the...
An argument for a new approach to text encoding, depicting ASCII/EBCDIC as pathetic and Unicode as g...
This paper focuses on one of the many aspects to be taken into account when developing a new corpus:...
Prihantoro Universitas Diponegoro prihantoro2001@yahoo.com, prihantoro@undip.ac.id Abstract ...
this paper we often use the term character rather more loosely, and more in keeping with tradition a...
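The looseness of the term "character" noted above can be made concrete (a Python sketch; the example character is our own choice): a single visible letter may be one precomposed code point or a sequence of a base letter plus a combining mark, and the two are distinct until normalized.

```python
import unicodedata

# One visible character, two different code point sequences.
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 + U+0301 COMBINING ACUTE ACCENT

print(len(precomposed), len(decomposed))  # 1 2  (code point counts differ)
print(precomposed == decomposed)          # False (raw sequences differ)

# Unicode normalization (here NFC) reconciles the two representations.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

This is why corpus tools that compare or search text usually normalize to a single form (commonly NFC) before treating strings as equal.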