The CharacterEncoding class defines the interface that byte and
character encodings provide for predicates and conversions.
deferred String name;
deferred char decode byte b;
Return the decoding of the byte b, i.e. the Unicode character
corresponding to the byte b in the receiving encoding.
deferred byte encode char c;
Return the encoding of the character c. If the byte equivalent of
the character c does not exist in the receiving encoding, an
encoding-condition is signaled, and the byte returned is the
byteValue of the object returned from signaling that condition, or
127 if nil is returned.
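
The fallback rule described for encode can be sketched in Python. This is only an illustration under stated assumptions: the condition handler is modeled as a plain callable (the name Handler, the dictionary-based table, and the function signature are hypothetical and not part of the documented interface), and a nil result is modeled as None.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the encoding-condition handler: it receives the
# unencodable character and returns a replacement byte value, or None (nil).
Handler = Callable[[str], Optional[int]]

FALLBACK_BYTE = 127  # byte used when the handler returns nil, per the text


def encode(c: str, table: Dict[str, int], handler: Handler) -> int:
    """Return the byte for character c, consulting handler if c is unmapped."""
    if c in table:
        return table[c]
    replacement = handler(c)  # models signaling the encoding-condition
    return FALLBACK_BYTE if replacement is None else replacement


# A small US-ASCII table: every character maps to its own code point.
ascii_table = {chr(i): i for i in range(128)}

print(encode("A", ascii_table, lambda c: None))           # 65
print(encode("\u20ac", ascii_table, lambda c: None))      # 127
print(encode("\u20ac", ascii_table, lambda c: ord("?")))  # 63
```

The caller decides what an unencodable character becomes; 127 is only used when the handler declines to supply a byte.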
deferred boolean isAlpha byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a letter.
deferred boolean isDigit byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a digit.
deferred boolean isLower byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a lowercase letter.
deferred boolean isPunct byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a punctuation character.
deferred boolean isSpace byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a space character.
deferred boolean isUpper byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is an uppercase letter.
deferred byte toLower byte b;
Return the lowercase equivalent of the byte b, according to the
receiving encoding. If the character is not uppercase, it is
returned unchanged.
deferred byte toUpper byte b;
Return the uppercase equivalent of the byte b, according to the
receiving encoding. If the character is not lowercase, it is
returned unchanged.
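
The "returned unchanged" behaviour of the case conversions falls out naturally from a table lookup. The following Python sketch assumes a 256-entry conversion table and covers only the US-ASCII letters; the table layout and the function names are illustrative assumptions, not the documented implementation.

```python
# Assumed layout: one output byte per input byte, identity by default.
TO_LOWER = bytes(b + 32 if ord("A") <= b <= ord("Z") else b for b in range(256))
TO_UPPER = bytes(b - 32 if ord("a") <= b <= ord("z") else b for b in range(256))


def to_lower(b: int) -> int:
    """Return the lowercase byte for b; non-uppercase bytes map to themselves."""
    return TO_LOWER[b]


def to_upper(b: int) -> int:
    """Return the uppercase byte for b; non-lowercase bytes map to themselves."""
    return TO_UPPER[b]


print(chr(to_lower(ord("Q"))))  # q
print(chr(to_lower(ord("7"))))  # 7
print(chr(to_upper(ord("q"))))  # Q
```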
deferred int digitValue byte b;
Return the numeric value of the digit denoted by the byte b in the
receiving encoding.
deferred int alphaValue byte b;
Return the value of the letter denoted by the byte b relative to the
start of its letter range. Thus, 'a' returns 0, 'f' returns 5, etc.
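
A minimal Python sketch of digitValue and alphaValue for US-ASCII bytes follows. Two points are assumptions rather than documented behaviour: that uppercase letters are measured from 'A' in the same way lowercase letters are measured from 'a', and that a byte outside the expected range raises an error (the text does not say what happens in that case).

```python
def digit_value(b: int) -> int:
    """Return 0..9 for the ASCII digits '0'..'9'."""
    if not ord("0") <= b <= ord("9"):
        raise ValueError("not a digit")  # assumption: behaviour is unspecified
    return b - ord("0")


def alpha_value(b: int) -> int:
    """Return the offset of a letter from the start of its letter range."""
    if ord("a") <= b <= ord("z"):
        return b - ord("a")
    if ord("A") <= b <= ord("Z"):  # assumption: uppercase is relative to 'A'
        return b - ord("A")
    raise ValueError("not a letter")  # assumption: behaviour is unspecified


print(alpha_value(ord("a")))  # 0
print(alpha_value(ord("f")))  # 5
print(digit_value(ord("7")))  # 7
```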
The CharEncoding class maintains information on a particular mapping
for encoding a subset of Unicode characters to 8-bit bytes. An
example of such a mapping is iso-8859-1, the well-known Western
European byte encoding, of which USASCII is a subset.
static MutableDictionary encodings;
ByteArray
loadBytes int num
from String name
extension String ext;
Return num bytes read from the file with the given name and the
extension ext (without the dot). The full path of the file is
obtained from the main Bundle.
instance (id) named String name;
Return the CharEncoding known by the given name. This always
succeeds, as a CharEncoding reads the resources it needs on demand.
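
The reason a lookup by name can always succeed is that nothing is read until a table is first needed. The Python sketch below illustrates that pattern under several assumptions: the registry is a plain dictionary keyed by name, instances are cached in it, and Python's own codec machinery stands in for the resource file that the real class would read from the main Bundle.

```python
class CharEncoding:
    _encodings = {}  # assumed registry of instances, keyed by encoding name

    def __init__(self, name: str) -> None:
        self.name = name
        self._decoding = None  # not loaded until first use

    @classmethod
    def named(cls, name: str) -> "CharEncoding":
        """Return the instance for name, creating it without loading data."""
        enc = cls._encodings.get(name)
        if enc is None:
            enc = cls(name)
            cls._encodings[name] = enc
        return enc

    @property
    def decoding(self) -> str:
        """Return the 256-character decoding map, loading it on first access."""
        if self._decoding is None:
            # Stand-in for reading a resource file: decode all 256 byte values.
            self._decoding = bytes(range(256)).decode(self.name)
        return self._decoding


latin1 = CharEncoding.named("iso-8859-1")  # no table loaded yet
print(latin1.decoding[0xE9])               # table is loaded here
```

In this sketch a misspelled name only fails when a table is first requested, not when the instance is obtained.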
public String name;
CharArray decoding;
IntDictionary encoding;
ByteArray to_lower;
ByteArray to_upper;
ByteArray to_title;
ByteArray is_digit;
ByteArray is_letter;
ByteArray is_lower;
ByteArray is_punct;
ByteArray is_space;
ByteArray is_upper;
id init String n;
char decode byte b;
Return the decoding of the byte b, i.e. the Unicode character
corresponding to the byte b in the receiving encoding.
CharArray decoding;
Return the decoding map, reading it if necessary.
byte encode char c;
Return the encoding of the character c. If the byte equivalent of
the character c does not exist in the receiving encoding, an
encoding-condition is signaled, and the byte returned is the
byteValue of the object returned from signaling that condition, or
127 if nil is returned.
IntDictionary encoding;
Return the encoding map, creating it from the decoding map if
necessary.
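
Deriving the encoding map from the decoding map amounts to inverting a byte-to-character table. A minimal Python sketch, assuming the decoding map is a 256-character sequence indexed by byte value and the encoding map is keyed by integer code point (the function name is hypothetical); iso-8859-15 is used because it differs visibly from a one-to-one mapping:

```python
from typing import Dict


def build_encoding_map(decoding: str) -> Dict[int, int]:
    """Invert a byte -> character table into a code point -> byte dictionary."""
    return {ord(ch): b for b, ch in enumerate(decoding)}


# Decoding map for iso-8859-15, obtained here from Python's codec tables.
decoding = bytes(range(256)).decode("iso-8859-15")
encoding = build_encoding_map(decoding)

print(hex(encoding[0x20AC]))  # euro sign -> 0xa4
print(0x00A4 in encoding)     # currency sign has no byte in iso-8859-15: False
```

Characters absent from the dictionary are exactly those for which encode has no byte equivalent.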
protected ByteArray loadConversion String conversion;
Return the table for the named conversion of the
receiving encoding.
protected ByteArray loadPredicateSet String predicate;
Return the set for the named predicate of the
receiving encoding.
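
One plausible shape for such a predicate set is a 256-entry byte table that is non-zero wherever the predicate holds. The layout is an assumption (the source only states that the sets are ByteArray objects), and the helper names below are hypothetical. A Python sketch:

```python
import string


def build_predicate_set(chars: str) -> bytes:
    """Return a 256-entry table with 1 at the byte value of each given char."""
    table = bytearray(256)
    for ch in chars:
        table[ord(ch)] = 1
    return bytes(table)


def holds(predicate_set: bytes, b: int) -> bool:
    """Return True if the predicate is set for byte b."""
    return predicate_set[b] != 0


# US-ASCII examples; a real encoding would load these tables from resources.
IS_DIGIT = build_predicate_set(string.digits)
IS_PUNCT = build_predicate_set(string.punctuation)

print(holds(IS_DIGIT, ord("7")))  # True
print(holds(IS_DIGIT, ord("x")))  # False
print(holds(IS_PUNCT, ord("!")))  # True
```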
boolean isAlpha byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a letter.
boolean isDigit byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a digit.
boolean isLower byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a lowercase letter.
boolean isPunct byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a punctuation character.
boolean isSpace byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is a space character.
boolean isUpper byte b;
Return TRUE if the character denoted by the byte b in the receiving
encoding is an uppercase letter.
byte toLower byte b;
Return the lowercase equivalent of the byte b, according to the
receiving encoding. If the character is not uppercase, it is
returned unchanged.
byte toUpper byte b;
Return the uppercase equivalent of the byte b, according to the
receiving encoding. If the character is not lowercase, it is
returned unchanged.
int digitValue byte b;
Return the numeric value of the digit denoted by the byte b in the
receiving encoding.
int alphaValue byte b;
Return the value of the letter denoted by the byte b relative to the
start of its letter range. Thus, 'a' returns 0, 'f' returns 5, etc.
USASCIIEncoding is the CharEncoding used during program
initialization.
static USASCIIEncoding shared;
The shared USASCIIEncoding object.
instance (id) shared;
String name;
char decode byte b;
byte encode char c;
boolean isAlpha byte b;
boolean isDigit byte b;
boolean isLower byte b;
boolean isPunct byte b;
boolean isSpace byte b;
boolean isUpper byte b;
byte toLower byte b;
byte toUpper byte b;
int digitValue byte b;
int alphaValue byte b;