Importing Text and CSV Files

Importing data from a text file can be a very difficult task because the data can be in any format. Dozens of different formats are used in text files. This is why Sisulizer uses regular expressions when importing data from text files. The expressions define the data format. The same expression rules are used when localizing text files. When you select a file to import, Sisulizer tries to detect the format. It can do that if the format is quite simple. If your format is not simple, you have to set the rules manually in the Import Wizard.
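
To illustrate how a regular expression can describe a data format, here is a minimal Python sketch. It is not Sisulizer's own implementation or rule syntax. The line format (original text, an equals sign, translated text), the pattern, and the names LINE_PATTERN and parse_lines are all assumptions chosen for illustration.

```python
import re

# Assumed (hypothetical) line format: original text, an equals sign, translated text.
LINE_PATTERN = re.compile(
    r"^\s*(?P<original>[^=]+?)\s*=\s*(?P<translated>.+?)\s*$"
)


def parse_lines(text):
    """Return (original, translated) pairs for every line that matches the pattern."""
    pairs = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            pairs.append((match.group("original"), match.group("translated")))
    return pairs


sample = "Open = Avaa\nSave = Tallenna\n# a comment line\n"
print(parse_lines(sample))
```

The named groups play the role of the rules: they mark which part of each line is the original string and which part is the translation. A file with a different layout would need a different pattern.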

CSV files (Comma-Separated Values files) are text files where each line is one record and the fields are separated by a delimiter character, most often a comma (,). Unfortunately, the data format is not the same in every file. Some files use a tab or colon as the separator. Some contain a language pair, others contain several languages. There are at least three different kinds of CSV files. They all use the .csv file extension and all contain text data. The file formats are:

File type | Description | Import scanner to be used
Microsoft Old CSV file | This is the old Microsoft glossary file format, where each line contains the original (English) value and the translated value, and items are most commonly separated by a comma (,). Each line contains many other fields besides the English and target strings. | Comma Separated Value file
Microsoft New CSV file | This is the new Microsoft glossary file format, where each line contains the string in several languages and the columns are separated by tabs. The file does not contain any data other than translations. You can download this file from the Microsoft web pages. | Microsoft Glossary file
Generic CSV file | Any other "CSV" file, most commonly exported from Excel. Each line can contain any number of fields, with both translations and other data. Sometimes very difficult to import, but it can be done using regular expressions. | Regular expression defined text file
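
Because these files share the .csv extension but differ in their separator, it can help to check which delimiter a file actually uses before choosing a scanner. The following Python sketch is a simple heuristic under stated assumptions, not how Sisulizer detects formats. The candidate list (comma, tab, colon) comes from the separators mentioned above, and the name guess_delimiter is hypothetical. The heuristic ignores quoted fields, so a delimiter inside quotes can mislead it.

```python
# Separators mentioned in the text above: comma, tab and colon.
CANDIDATE_DELIMITERS = [",", "\t", ":"]


def guess_delimiter(lines):
    """Pick the candidate that occurs the same non-zero number of times on every line.

    If several candidates qualify, the one with the most occurrences per line wins.
    Returns None when no candidate qualifies.
    """
    best, best_count = None, 0
    for delimiter in CANDIDATE_DELIMITERS:
        # Collect the distinct per-line counts, ignoring blank lines.
        counts = {line.count(delimiter) for line in lines if line.strip()}
        if len(counts) == 1:
            count = counts.pop()
            if count > best_count:
                best, best_count = delimiter, count
    return best


pair_file = ["Open,Avaa", "Save,Tallenna"]
multi_file = ["Open\tAvaa\tÖppna", "Save\tTallenna\tSpara"]
print(repr(guess_delimiter(pair_file)))
print(repr(guess_delimiter(multi_file)))
```

The number of delimiters per line also hints at whether the file holds a language pair (two columns) or several languages (more columns).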

You can use the same regular expression rules for CSV files as for other text files, but the first two of the above file formats are quite complex and it is difficult to set the regular expressions correctly. This is why Sisulizer contains dedicated import scanners for them. For other CSV files and all TXT files, use the Regular expression defined text file scanner.
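
As an illustration of why regular expressions for CSV data get complicated, the following Python sketch matches one line of a generic CSV file whose fields may be quoted. It is only a sketch and does not show Sisulizer's rule syntax. The three-column layout (identifier, original string, translated string) and the names FIELD, ROW_PATTERN and parse_row are assumptions.

```python
import re

# One CSV field: either a quoted string (where "" is an escaped quote) or bare text.
FIELD = r'(?:"((?:[^"]|"")*)"|([^,"]*))'
# Assumed (hypothetical) layout: identifier, original string, translated string.
ROW_PATTERN = re.compile(rf"^{FIELD},{FIELD},{FIELD}$")


def parse_row(line):
    """Return the three field values of a line, or None if the line does not match."""
    match = ROW_PATTERN.match(line)
    if match is None:
        return None
    groups = match.groups()
    fields = []
    # Each field contributes two groups: the quoted variant and the bare variant.
    for quoted, bare in zip(groups[0::2], groups[1::2]):
        if quoted is not None:
            fields.append(quoted.replace('""', '"'))  # undo quote escaping
        else:
            fields.append(bare)
    return tuple(fields)


print(parse_row('101,"Save, then close","Tallenna ja sulje"'))
```

Even this small pattern has to deal with commas inside quoted fields and with escaped quotes, and it only works for a fixed number of columns.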