SAP Basis Notes: What does Unicode conformance require?

What does Unicode conformance require?

Chapter 3 of the Unicode Standard discusses this in detail. Here's a very informal version:

Unicode characters don't fit in 8 bits; deal with it.
Byte order is only an issue in I/O.
If you don't know, assume big-endian.
Loose surrogates have no meaning.
Neither do U+FFFE and U+FFFF.
Leave the unassigned codepoints alone.
It's OK to be ignorant about a character, but not plain wrong.
Subsets are strictly up to you.
Canonical equivalence matters.
Don't garble what you don't understand.
Process UTF-* by the book.
Ignore illegal encodings.
Right-to-left scripts have to go by bidi rules.
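
A few of the points above can be made concrete with a short Python sketch. This is only an illustration, not a conformance test. The function names (decode_utf16, has_lone_surrogate, canonically_equal, decode_utf8_lenient) are hypothetical and come from neither the Unicode Standard nor any SAP tool. Reading "ignore illegal encodings" as "replace ill-formed bytes with U+FFFD rather than interpret them" is an assumption made for this example.

```python
import unicodedata

BOM_BE = b"\xfe\xff"
BOM_LE = b"\xff\xfe"


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes read from I/O; with no BOM, assume big-endian."""
    if data.startswith(BOM_LE):
        return data[2:].decode("utf-16-le")
    if data.startswith(BOM_BE):
        return data[2:].decode("utf-16-be")
    return data.decode("utf-16-be")  # no BOM: fall back to big-endian


def has_lone_surrogate(text: str) -> bool:
    """Return True if the string holds a surrogate code point (U+D800..U+DFFF).

    A Python str is a sequence of code points, so any surrogate found
    in it is unpaired and carries no meaning.
    """
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def decode_utf8_lenient(data: bytes) -> str:
    """Decode UTF-8, replacing ill-formed bytes with U+FFFD (assumed policy)."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # The euro sign is one character but does not fit in 8 bits.
    assert ord("\u20ac") > 0xFF
    # Precomposed and decomposed "e with acute" are canonically equivalent.
    assert canonically_equal("\u00e9", "e\u0301")
    # An overlong encoding of "/" must not be read as "/".
    assert "/" not in decode_utf8_lenient(b"\xc0\xaf")
```

The byte-order logic lives only in decode_utf16, which is the I/O boundary. Once the data is a str, the rest of the code never needs to know how the bytes were ordered.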
