
Characters, Character Encodings, and Unicode

A character is a symbolic representation of a letter, a number, a punctuation mark, or any other mark used in text; it is the concept of, for example, "lowercase a" or "number 3."

In computer memory, text is stored as character codes, where each code is a numeric value that identifies a particular character. A character encoding is the organization of the set of numeric codes that represent, in memory, all the meaningful characters of a script system. There are two fundamental classes of Mac OS character encodings: 1 byte and 2 byte.
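The difference between the two classes can be seen by encoding one character under each. The following is a minimal sketch in Python, not Mac OS code: it assumes Python's built-in mac_roman and shift_jis codecs are reasonable stand-ins for a 1-byte and a 2-byte encoding, and the variable names are illustrative only. Note that Shift-JIS is a mixed encoding, so only some of its characters occupy 2 bytes.

```python
# Sketch: character codes under a 1-byte and a 2-byte encoding.
# Assumption: Python's "mac_roman" and "shift_jis" codecs stand in
# for the Mac OS encodings discussed in the text.

latin_a = "a"            # the character "lowercase a"
hiragana_a = "\u3042"    # the Japanese hiragana letter A

# In a 1-byte encoding, every character code fits in a single byte.
one_byte = latin_a.encode("mac_roman")

# In a 2-byte encoding, a character such as hiragana A needs two bytes.
two_byte = hiragana_a.encode("shift_jis")

print(len(one_byte), one_byte.hex())
print(len(two_byte), two_byte.hex())
```

The character itself is the abstract concept; the bytes printed here are only the codes that a particular encoding assigns to it.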

A writing system is a set of characters and the basic rules for using them to create a visual depiction of language. Examples of writing systems are Roman, Japanese, Arabic, and Hebrew. Unicode is an international standard that combines the characters of all commonly used writing systems into a single coded character set based on a 16-bit character encoding. With a universal character encoding such as Unicode, the character sets of separate writing systems do not overlap. Furthermore, Unicode resolves the issue of conflicting character encodings within a single writing system.
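The overlap that Unicode removes can be illustrated by decoding the same byte under two different 1-byte encodings. This is a hedged sketch in Python rather than Mac OS code: it assumes Python's mac_roman and mac_cyrillic codecs, and the byte value 0x80 is chosen only for illustration.

```python
# Sketch: one character code, two meanings, depending on the encoding.
# Assumption: Python's "mac_roman" and "mac_cyrillic" codecs are used
# as examples of encodings for separate writing systems.

raw = bytes([0x80])  # a single character code above the ASCII range

as_roman = raw.decode("mac_roman")        # interpreted as a Roman character
as_cyrillic = raw.decode("mac_cyrillic")  # interpreted as a Cyrillic character

# The same byte yields two different characters, but each character
# has its own distinct Unicode value, so the two no longer collide.
print(f"U+{ord(as_roman):04X}", f"U+{ord(as_cyrillic):04X}")
```

Without knowing which encoding was intended, the byte alone is ambiguous; the Unicode values printed above are not.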


Copyright © 2001 Apple Computer, Inc. (Last Updated January 11, 2001)
