Working with Code Pages in Windows
Win32 applications use either Unicode or ANSI strings, depending
on the development language or IDE. ANSI applications,
such as programs built with Delphi and C++ Builder, use code-page-encoded
strings and the non-Unicode version of the Win32 API. Why do
you have to know this information?
There are several different code pages. English and most other
Western languages use code page 1252. Japanese uses 932, Russian
uses 1251, and so forth. To render an ANSI string correctly, the
system code page, which is the default code page of the system, must
match the code page in which the string was encoded. In Windows ME, 98, and 95
you cannot change the system code page.
However, in Windows 2003, XP, 2000, and NT, you can change the
system code page. This document describes how to check if the
system code page is right, and, if it isn't, how to change it.
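
Besides looking at the user interface, the active system code page can also be read programmatically. The following is a minimal sketch in Python, assuming it runs on Windows; the helper name `system_ansi_code_page` is hypothetical, while `GetACP` is the Win32 function that returns the current ANSI code page identifier. On other platforms the sketch simply returns `None`.

```python
import ctypes
import sys


def system_ansi_code_page():
    """Return the active ANSI code page number on Windows, else None.

    Hypothetical helper for illustration only.
    """
    if sys.platform != "win32":
        return None  # GetACP exists only in the Win32 API
    # GetACP takes no arguments and returns the code page identifier,
    # for example 1252 on a Western system or 932 on a Japanese one.
    return ctypes.windll.kernel32.GetACP()


if __name__ == "__main__":
    acp = system_ansi_code_page()
    if acp is None:
        print("Not running on Windows; no system ANSI code page to report.")
    else:
        print(f"System ANSI code page: {acp}")
```

If the reported number does not match the code page of the language your ANSI application targets, the strings are likely to display incorrectly.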
How do I know if the system code page is not right?
Usually, you will see the problem when trying to run your application.
If strings in the menu items and components do not show correctly
and display gibberish, the system code page is not right. This
effect is called Mojibake.
The following screenshot shows a Japanese application running
on a computer where the Japanese system code page is active.
Everything is fine here. However, if the same application
runs on a computer that has the Western code page, the application
appears like this:
As you can see, hopefully without the help of your Japanese colleagues,
the strings are mojibake ("character changing" = gibberish).
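
The effect can be reproduced without a second computer. The sketch below, in Python, assumes the text was written with the Japanese code page 932 and is then read with the Western code page 1252; the function name `simulate_mojibake` and the sample string are illustrative only.

```python
def simulate_mojibake(text, written_as="cp932", read_as="cp1252"):
    """Encode text with one code page and decode it with another."""
    raw = text.encode(written_as)
    # errors="replace" avoids an exception for byte values that the
    # reading code page leaves undefined.
    return raw.decode(read_as, errors="replace")


original = "日本語"
garbled = simulate_mojibake(original)
print(garbled)  # Western characters instead of the Japanese text

# The bytes themselves are unchanged, so reading them again with the
# right code page restores the text (as long as nothing was replaced).
restored = garbled.encode("cp1252").decode("cp932")
print(restored == original)
```

The bytes stored in the application never change; only their interpretation does, which is why switching the system code page fixes the display.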
Background Info
Use virtual machines for different code pages
The following information describes how to change the
system code page. This is simple; however, Windows asks you to reboot your
machine every time. This is time-consuming and might interrupt your work.
A solution is to use a virtual machine, such as Virtual PC or VMware.
Set up multiple virtual machines with different code pages for the languages
you want to support and test without rebooting. Free editions are
available, so all you need to follow this practical hint is some free space
on your hard drive.
How do I change the system code page?
You can change the system code page using the Control Panel.
The following instructions are for English Windows XP. The procedure
is similar for Windows 2000.
Start the Control Panel, and open Regional and Language
Options. On the Regional Options tab, specify information
in the Standards and formats box and in the Location
box to match the target language and country.
On the Advanced tab, set the Language
for non-Unicode programs box to match the target language.
Click OK. You might need to insert the operating system
CD to install the necessary files. If you already have the files
installed, the system prompts you to use the existing files. Click
Yes. Next, the system prompts you to restart the computer.
Click Yes. You must reboot. Otherwise, the new system code
page does not take effect.
Asian languages
If you are using Asian languages, such as Chinese, Korean, Japanese
or Thai, select the options in the Supplemental language support
area on the Languages tab.
There are two written Chinese formats: Simplified and Traditional.
Be careful to choose the right one: Use Chinese (PRC) and China
for Simplified Chinese, and Chinese (Taiwan) and Taiwan for Traditional
Chinese.
Background Info
What is a code page and why is it needed?
Code pages are necessary because ANSI text uses only 8 bits to store each character (char).
This means there are only 256 possible characters--not nearly enough for all languages
of the world.
The American charset (ASCII) needs only 128 different chars, which fit in 7 bits. Because
computers store data in 8-bit bytes, one more bit is available; it provides
another 128 values for additional chars.
On MS-DOS systems, some of these values were used for drawing boxes and lines. With
Windows, these boxes and lines were removed from the charsets and
more foreign chars were added. For most Western languages
like English, French, German, and others, these additional chars are sufficient. For example, the German charset
needs only seven chars beyond the US charset, leaving enough space
for special chars from Spain, Norway, and so forth.
However, for certain charsets, such as Cyrillic charsets, the space was not big enough. Code pages
fill that gap. A code page in Windows is nothing more than
a mapping that assigns a different set of characters to the upper 128 values. For example, instead
of the German umlaut Ü, a Cyrillic Ь (soft sign) appears. Both of these characters have
the ANSI value 220. Thus, if Windows code page 1252 is selected, a Ü
appears, while with the Russian Windows code page 1251 a Ь is displayed.
If code pages are used, the system cannot show Ü and Ь on the same
display. This is only possible if UNICODE is used. For example, this page uses
UNICODE (UTF-8) to display both chars.
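
How a single byte value maps to different characters under different code pages can be checked directly. The following Python sketch decodes the byte value 220 with code pages 1252 and 1251, then stores both resulting characters in one UTF-8 string; the variable names are illustrative.

```python
BYTE_VALUE = 220  # 0xDC, a value from the upper 128

raw = bytes([BYTE_VALUE])

western = raw.decode("cp1252")   # Western European code page
cyrillic = raw.decode("cp1251")  # Cyrillic code page

print(western, cyrillic)

# A Unicode encoding such as UTF-8 can hold both characters at once,
# because each one has its own code point.
both = western + cyrillic
utf8 = both.encode("utf-8")
print(len(utf8), "bytes in UTF-8")
```

The same one-byte input yields two different characters, while the UTF-8 form needs two bytes per character to keep them apart.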
While this solves the problem for most languages, the code page technique
does not help languages with more than 128 special characters, such as
Japanese, Korean and Chinese. For these languages, DBCS (double-byte character sets) is available.
While the lower 128 values are still the same as in US code pages, the upper
128 are specially encoded. In this system, certain bytes from the upper 128 values start
a two-byte sequence. This means that one character is stored in one or two
bytes. For example, in Japanese Shift-JIS, one character uses either one or two bytes.
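
The variable length of DBCS characters is easy to observe. This Python sketch, which assumes Python's `cp932` codec as a stand-in for the Japanese Windows code page, lists how many bytes each character of a sample string occupies; the helper `byte_lengths` is hypothetical.

```python
def byte_lengths(text, encoding="cp932"):
    """Return (character, byte count) pairs for the given encoding."""
    return [(ch, len(ch.encode(encoding))) for ch in text]


# An ASCII letter, a half-width katakana, and a kanji.
lengths = byte_lengths("Aｱ日")
for ch, size in lengths:
    print(ch, size)

# The first byte of a two-byte character comes from the upper 128 values.
lead_byte = "日".encode("cp932")[0]
print(lead_byte >= 0x80)
```

Because characters differ in size, code that counts or splits ANSI strings byte by byte can cut a two-byte character in half.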
Thus, if a person writes a text file on her or his computer and does not
use UNICODE to save it, the current code page is used. If this file
is given to someone with some other current codepage, the file is not displayed
correctly. So, if you are in Western Europe or the USA, and you get a text file
from someone in Greece, Turkey, China, or Japan, the chances are high that
the file is useless to you. Kaboom can fix these problems. Simply convert
the file into UNICODE and print, edit, or use the file in any way--without losing
information. If you edit the file and you want to return it with your changes,
simply convert the file back into the code page that the receiver needs. Kaboom makes the entire process easy and quick.
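
The conversion idea itself can be sketched in a few lines of Python. This is only an illustration of the principle and not a description of how Kaboom works; the function `convert_file`, the file names, and the choice of the Greek code page 1253 are assumptions made for the example.

```python
import os
import tempfile


def convert_file(src, dst, src_encoding, dst_encoding):
    """Read src in one encoding and write it to dst in another."""
    # newline="" keeps line endings exactly as they are in the file.
    with open(src, "r", encoding=src_encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as f:
        f.write(text)


def demo():
    greek = "Καλημέρα"
    with tempfile.TemporaryDirectory() as tmp:
        ansi = os.path.join(tmp, "greek_ansi.txt")
        utf8 = os.path.join(tmp, "greek_utf8.txt")
        back = os.path.join(tmp, "greek_back.txt")

        # Simulate a file received from a Greek system (code page 1253).
        with open(ansi, "wb") as f:
            f.write(greek.encode("cp1253"))

        convert_file(ansi, utf8, "cp1253", "utf-8")  # to Unicode
        convert_file(utf8, back, "utf-8", "cp1253")  # back for the sender

        with open(utf8, encoding="utf-8") as f:
            utf8_text = f.read()
        with open(ansi, "rb") as a, open(back, "rb") as b:
            same_bytes = a.read() == b.read()
    return utf8_text, same_bytes


if __name__ == "__main__":
    print(demo())
```

The conversion works only when the source code page is known; the file itself does not record which code page was used to write it.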