Working with Code Pages in Windows

Depending on the development language or IDE, Win32 applications use either Unicode or ANSI strings. ANSI applications, such as programs built with Delphi and C++Builder, use code-page-encoded strings and the non-Unicode version of the Win32 API. Why do you need to know this?

There are several different code pages. English and most other Western languages use code page 1252, Japanese uses 932, Russian uses 1251, and so forth. To render an ANSI string correctly, the system must be set to the matching system code page, which is the default code page that Windows uses for non-Unicode applications. In Windows ME, 98, and 95 you cannot change the system code page.
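To make the code page numbers concrete, here is a minimal Python sketch. It assumes Python's standard codecs named cp1252, cp932, and cp1251 correspond to the Windows code pages above; the sample strings and the helper name codec_for are illustrative only.

```python
import codecs

# Code page numbers named in the text, keyed by an illustrative label.
CODE_PAGES = {"Western": 1252, "Japanese": 932, "Russian": 1251}

# One sample string per language (illustrative test data).
SAMPLES = {"Western": "Grüße", "Japanese": "日本語", "Russian": "Привет"}


def codec_for(code_page: int) -> str:
    """Return the Python codec name for a Windows code page number."""
    name = f"cp{code_page}"
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


for language, code_page in CODE_PAGES.items():
    codec = codec_for(code_page)
    raw = SAMPLES[language].encode(codec)
    # Decoding with the matching code page restores the original text.
    assert raw.decode(codec) == SAMPLES[language]
    print(language, code_page, raw)
```

Each string survives the round trip only because it is decoded with the same code page that encoded it.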

However, in Windows 2003, XP, 2000, and NT, you can change the system code page. This document describes how to check if the system code page is right, and, if it isn't, how to change it.
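Besides looking at the application, the active system code page can also be queried programmatically. The following Python sketch calls the Win32 function GetACP through ctypes. The helper name get_system_code_page and the expected value 932 are assumptions for illustration, and on non-Windows systems the helper simply returns None.

```python
import ctypes
import sys
from typing import Optional


def get_system_code_page() -> Optional[int]:
    """Return the active ANSI code page on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    # GetACP is the Win32 call that reports the current ANSI code page.
    return ctypes.windll.kernel32.GetACP()


EXPECTED = 932  # assumption: the application under test needs Japanese

active = get_system_code_page()
if active is None:
    print("Not running on Windows; no system code page to check.")
elif active == EXPECTED:
    print(f"System code page {active} matches.")
else:
    print(f"System code page is {active}, expected {EXPECTED}.")
```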

How do I know if the system code page is not right?

Usually, you will see the problem when trying to run your application. If strings in the menu items and components do not show correctly and display gibberish, the system code page is not right. This effect is called Mojibake.

The following screenshot shows a Japanese application that is run on a computer where the Japanese system code page is active.

 

[Screenshot: dialog with Japanese characters displayed correctly]

Everything is fine here. However, if the same application is running on a computer that has the Western code page, the application appears like this:

 

[Screenshot: the same dialog with scrambled characters]

As you can see, hopefully without the help of your Japanese colleagues, the strings are mojibake (Japanese for "character transformation", that is, gibberish).
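The effect can be reproduced without a Japanese application. The Python sketch below assumes an illustrative sample string, encodes it as code page 932 bytes the way an ANSI application would store it, and then interprets those bytes with the Western code page 1252.

```python
original = "ファイル"  # Japanese for "file"; illustrative sample text

# An ANSI application stores the string as code page 932 bytes.
raw = original.encode("cp932")

# A system set to the Western code page interprets the same bytes as 1252.
garbled = raw.decode("cp1252", errors="replace")
print(garbled)

# The bytes themselves are intact: reading them as 932 restores the text.
restored = raw.decode("cp932")
print(restored)
```

The data is not damaged; only the interpretation is wrong, which is why switching the system code page fixes the display.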

Background Info


Use virtual machines for different code pages

The next section describes how to change the system code page. The change itself is simple; however, Windows asks you to reboot your machine every time. This is time-consuming and might interrupt your work.

A solution is to use a virtual machine, such as Virtual PC or VMware. Set up multiple virtual PCs with different code pages for the languages you want to support, and test without rebooting. Free editions of such products are available, so all you need is some free space on your hard drive.


How do I change the system code page?

You can change the system code page using the Control Panel. The following instructions are for English Windows XP. The procedure is similar for Windows 2000.

Start the Control Panel, and open Regional and Language Options. On the Regional Options tab, specify information in the Standards and formats box and in the Location box to match the target language and country.

On the Advanced tab, specify settings in the Language for non-Unicode programs boxes to match the target language.

Click OK. You might need to insert the operating system CD to install the necessary files. If the files are already installed, the system prompts you to use the existing files; click Yes. Next, the system prompts you to restart the computer; click Yes. You must reboot, otherwise the new system code page does not take effect.

Asian languages

If you are using Asian languages, such as Chinese, Korean, Japanese or Thai, select the options in the Supplemental language support area on the Languages tab.

There are two written Chinese formats: Simplified and Traditional. Be careful to choose the right one: Use Chinese (PRC) and China for Simplified Chinese, and Chinese (Taiwan) and Taiwan for Traditional Chinese.
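Why the choice matters can be sketched in Python. The example assumes that Simplified Chinese corresponds to Windows code page 936 and Traditional Chinese to code page 950 (Python codecs cp936 and cp950); the helper can_encode and the sample characters are illustrative.

```python
# Assumption: Simplified Chinese uses code page 936, Traditional uses 950.
SIMPLIFIED_CP = "cp936"
TRADITIONAL_CP = "cp950"

simplified = "汉"   # Simplified form of the character "Han"
traditional = "漢"  # Traditional form of the same character


def can_encode(text: str, codec: str) -> bool:
    """Report whether every character of text exists in the code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(traditional, TRADITIONAL_CP))
print(can_encode(simplified, TRADITIONAL_CP))
print(can_encode(simplified, SIMPLIFIED_CP))

# Even a character present in both code pages is stored as different bytes.
print(traditional.encode(SIMPLIFIED_CP) != traditional.encode(TRADITIONAL_CP))
```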

Background Info


What is a code page and why is it needed?

Code pages are necessary because ANSI strings have only 8 bits to store a character (char). This means there are only 256 possible characters, which is not nearly enough for all languages of the world.

The American character set needs only 128 different characters, which fit in 7 bits. Because computers work with 8-bit bytes, the eighth bit became available and provides another 128 values for additional characters.

On MS-DOS systems, some of these extra values were used for drawing boxes and lines. With Windows, these boxes and lines were removed from the character sets and more foreign characters were added. For most Western languages, such as English, French, and German, the additional characters are sufficient. For example, the German character set needs only seven extra characters (ä, ö, ü, Ä, Ö, Ü, ß) beyond the US character set, leaving enough space for special characters from Spain, Norway, and so forth.

However, for certain character sets, such as Cyrillic, the space was not big enough. Code pages fill that gap. A code page in Windows is nothing more than a mapping that assigns different characters to the upper 128 values. For example, the ANSI value 220 is the German umlaut Ü in Windows code page 1252 but the Cyrillic soft sign Ь in the Russian Windows code page 1251. Likewise, the Cyrillic Ш (sha) has the ANSI value 216 in code page 1251, where code page 1252 has Ø.
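The byte value behind a character in each code page can be looked up directly. A minimal Python sketch, assuming the standard codecs cp1252 and cp1251; the helper names char_at and byte_of are illustrative:

```python
def char_at(byte_value: int, codec: str) -> str:
    """Return the character a single byte value maps to in a code page."""
    return bytes([byte_value]).decode(codec)


def byte_of(char: str, codec: str) -> int:
    """Return the byte value of a single-byte character in a code page."""
    return char.encode(codec)[0]


u_umlaut = byte_of("Ü", "cp1252")
sha = byte_of("Ш", "cp1251")

# The same byte value shows a different character in the other code page.
print(u_umlaut, char_at(u_umlaut, "cp1252"), char_at(u_umlaut, "cp1251"))
print(sha, char_at(sha, "cp1251"), char_at(sha, "cp1252"))

# The lower 128 values are identical in both code pages.
lower = bytes(range(128))
print(lower.decode("cp1252") == lower.decode("cp1251"))
```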

If code pages are used, the system cannot show Ü and Ш on the same display. This is possible only if Unicode is used. For example, this page uses Unicode (UTF-8) to display both characters.
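The limitation is easy to confirm in Python: neither single-byte code page can store a string containing both characters, while UTF-8 can. The helper name first_unencodable is an assumption for illustration.

```python
from typing import Optional


def first_unencodable(text: str, codec: str) -> Optional[str]:
    """Return the first character the codec cannot store, or None."""
    try:
        text.encode(codec)
        return None
    except UnicodeEncodeError as error:
        return text[error.start]


text = "Ü Ш"
for codec in ("cp1252", "cp1251", "utf-8"):
    missing = first_unencodable(text, codec)
    if missing is None:
        print(codec, "stores both characters")
    else:
        print(codec, "cannot store", missing)
```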

While this solves the problem for most languages, the code page technique does not help languages that need far more than 128 special characters, such as Japanese, Korean, and Chinese. For these languages, DBCS (double-byte character sets) are available. The lower 128 characters are still the same as in the US code page, but the upper 128 values are specially encoded: some of them act as lead bytes that start a multi-byte sequence. This means that one character is stored in one or more bytes. For example, in Japanese Shift-JIS, one character uses one or two bytes.
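The variable width can be observed with Python's cp932 codec, which is assumed here to match the Japanese Windows code page. The helper names and the sample string are illustrative; the lead-byte ranges used are those of code page 932 (0x81 to 0x9F and 0xE0 to 0xFC).

```python
from typing import List


def byte_lengths(text: str, codec: str = "cp932") -> List[int]:
    """Return how many bytes each character occupies in the code page."""
    return [len(ch.encode(codec)) for ch in text]


def is_lead_byte(value: int) -> bool:
    """Check whether a byte value starts a two-byte sequence in cp932."""
    return 0x81 <= value <= 0x9F or 0xE0 <= value <= 0xFC


sample = "Aｱ日本"  # Latin letter, half-width katakana, two kanji
print(byte_lengths(sample))

for ch in sample:
    first = ch.encode("cp932")[0]
    print(ch, hex(first), "lead byte" if is_lead_byte(first) else "single byte")
```

Because character count and byte count differ, code that walks such strings byte by byte must check for lead bytes before splitting or truncating.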

Thus, if a person writes a text file on her or his computer and does not use Unicode to save it, the current code page is used. If this file is given to someone whose current code page is different, the file is not displayed correctly. So, if you are in Western Europe or the USA, and you get a text file from someone in Greece, Turkey, China, or Japan, the chances are high that the file is useless to you. Kaboom can fix these problems. Simply convert the file into Unicode and print, edit, or use the file in any way, without losing information. If you edit the file and you want to return it with your changes, simply convert the file back into the code page that the receiver needs. Kaboom makes the entire process easy and quick.
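Independent of any particular tool, the conversion itself can be sketched in a few lines of Python. This is not how Kaboom is implemented; the function names, the Greek sample text, and the choice of code page 1253 are assumptions for illustration.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one encoding to another via Unicode."""
    return data.decode(source).encode(target)


def convert_file(src_path: str, dst_path: str, source: str, target: str) -> None:
    """Read a file in the source encoding and write it in the target one."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(convert(data, source, target))


# A text written on a Greek system arrives as code page 1253 bytes.
received = "Καλημέρα".encode("cp1253")

# Convert to Unicode (UTF-8) to read or edit it on any system.
as_utf8 = convert(received, "cp1253", "utf-8")
print(as_utf8.decode("utf-8"))

# Convert back to the code page the receiver needs before returning it.
returned = convert(as_utf8, "utf-8", "cp1253")
print(returned == received)
```

The sender's code page must be known in advance; the bytes alone do not say which code page produced them.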

Copyright 2006-08 Sisulizer Ltd & Co KG, except Online Help content by Sisulizer Ltd