What role does Unicode play in internationalization?

Unicode is the foundation of modern internationalization. Older codepages were difficult to work with and defined characters inconsistently. Internationalizing a single code base was complex, because you had to support different character sets, with different architectures, for different markets.

But modern business requirements are even more demanding: programs have to handle characters from a wide variety of languages at the same time. The EU alone requires several different legacy character sets to cover all of its languages. Mixing legacy character sets is a nightmare, since every piece of data has to be tagged with its encoding, and combining data from different sources is nearly impossible to do reliably.
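As a small illustration of why untagged legacy data is ambiguous, consider a sketch in Python (the byte string here is invented for the example): the very same bytes decode to completely different text depending on which character set you assume.

```python
# The same byte sequence means different things under different legacy
# character sets, so data with no encoding tag cannot be interpreted safely.
data = b"caf\xe9"

print(data.decode("iso-8859-1"))   # 'café'  (Latin-1: 0xE9 is é)
print(data.decode("iso-8859-5"))   # 'cafщ'  (Cyrillic: 0xE9 is щ)
```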

With Unicode, a single internationalization process can produce code that handles the requirements of all the world's markets at the same time. Because Unicode has a single definition for each character, you avoid the data corruption problems that plague mixed-codeset programs. Because it handles the characters of all markets in a uniform way, it avoids the complexity of juggling different character code architectures. All modern operating systems, from PCs to mainframes, either support Unicode already or are actively developing support for it. The same is true of databases.
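A minimal sketch of what this looks like in practice (Python shown purely as an example; the byte strings and encodings are invented for illustration): legacy data is decoded into Unicode once, at the boundary, and from then on the program can mix and process text from any source without tracking per-string character sets.

```python
# Decode each legacy source into Unicode at the boundary; afterwards the
# text can be combined, compared, and stored uniformly.
sources = [
    (b"caf\xe9", "iso-8859-1"),               # French text in Latin-1
    (b"\xcc\xee\xf1\xea\xe2\xe0", "cp1251"),  # Russian text in Windows-1251
]

texts = [raw.decode(encoding) for raw, encoding in sources]
combined = " / ".join(texts)

print(combined)                   # café / Москва
print(combined.encode("utf-8"))   # one consistent encoding for output
```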
