Flash MX supports Unicode text encoding.
The sections below explain how to use this feature when working with CabronConnector.
For background reading, see this page about Flash MX and Unicode, and this one about the UTF-8 encoding. The unicode.org site is also helpful in many cases.
You are probably familiar with the following names: "ISO-8859-1", "ISO-8859-2", "ISO-8859-5", "ISO-8859-7". You can see these in the View > Encoding menu of your web browser.
These are used when you know for sure that a file (an HTML or a text file) will contain characters from a single character set only (for example, only characters from the Central European, Greek or Cyrillic set).
In this case each character can be represented in a single byte (0x00-0xFF), and each byte value is mapped to the corresponding Unicode character through a conversion table. The characters are rendered using the Unicode value. You can see these character mapping tables here.
Examples:
Char value | Char | Character Map | Unicode value | Unicode font
0x61 | a | ISO-8859-1 | 0x0061 | a
0x00F5 | õ | ISO-8859-2 | 0x0151 | ő
0x00FB | û | ISO-8859-2 | 0x0171 | ű
0x00E1 | á | ISO-8859-7 | 0x03B1 | α
0x00D4 | Ô | ISO-8859-5 | 0x0434 | д
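The mapping in the examples above can be reproduced with a few lines of code. The following is a minimal sketch in Python (chosen only for illustration; it is not part of Flash MX or CabronConnector), and the helper name `decode_byte` is hypothetical. It relies on the conversion tables that ship with Python's standard codecs.

```python
# Each entry pairs a single byte value with the character map used to read it.
samples = [
    (0x61, "iso-8859-1"),
    (0xF5, "iso-8859-2"),
    (0xFB, "iso-8859-2"),
    (0xE1, "iso-8859-7"),
    (0xD4, "iso-8859-5"),
]

def decode_byte(value, charset):
    """Map one byte to its Unicode character using the given character map."""
    return bytes([value]).decode(charset)

for value, charset in samples:
    ch = decode_byte(value, charset)
    # Print only the code point so the output is safe on any console.
    print("0x%02X in %s -> U+%04X" % (value, charset, ord(ch)))
```

The same byte value gives a different Unicode value depending on the character map, which is why the map has to be known before the bytes can be interpreted.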
UTF-8 is a multibyte encoding for Unicode. It is supported by a lot of software, and it is used in Flash MX too.
Why do we need an encoding for Unicode? It's easy to see that the characters 0x00-0xFF can be stored in a single byte, the characters 0x0100-0xFFFF in two bytes, and so on. The problem is that if you simply put the characters together in this way (some characters in one byte, others in two), no one would know where the character boundaries are.
You could decide to use four bytes for every Unicode character, but then you would waste a lot of space on zero bytes.
The solution is an encoding that lets you detect the character boundaries and uses only the space that is really needed. UTF-8 is such an encoding.
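To make the space argument concrete, here is a small Python sketch (an illustration only, not something Flash MX does) that stores the same text once with a fixed four bytes per character and once as UTF-8. UTF-32 stands in here for the "four bytes for each character" scheme.

```python
text = "value1"

# Fixed width: four bytes per character (big-endian, no byte order mark).
fixed = text.encode("utf-32-be")

# Variable width: only as many bytes as each character needs.
compact = text.encode("utf-8")

print(len(fixed), "bytes fixed-width, of which", fixed.count(0), "are zero")
print(len(compact), "bytes as UTF-8")
```

For plain ASCII text, three out of every four bytes of the fixed-width form are zero.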
Here is a table that presents the encoding method (you can find a detailed description here):
Character Range | Bit Encoding
U+0000 - U+007F | 0xxxxxxx
U+0080 - U+07FF | 110xxxxx 10xxxxxx
U+0800 - U+FFFF | 1110xxxx 10xxxxxx 10xxxxxx
U+10000 - U+10FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
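The table can be turned into code almost line by line. The sketch below is written in Python for illustration, and the function name `utf8_encode_code_point` is hypothetical. It fills the `x` bits of each pattern with the bits of the code point. It does not reject the surrogate range U+D800 - U+DFFF, which a strict encoder would.

```python
def utf8_encode_code_point(cp):
    """Encode one Unicode code point into UTF-8 bytes, following the table."""
    if cp < 0x80:
        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp <= 0x10FFFF:
        # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("code point out of Unicode range")

# U+00F5 needs two bytes: 0xC3 0xB5.
print(utf8_encode_code_point(0xF5).hex())
```

Every continuation byte starts with the bits `10`, and no lead byte does, so a decoder can always tell where a character begins.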
What does this mean exactly in the case of Flash MX?
Let's take an example: You have a Flash movie with a TextField. You want to load an external file which contains a single variable/value pair, and you want to display the value in the TextField.
The code will be the following:
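// Load name/value pairs from an external file into a LoadVars object.
var loader = new LoadVars();
loader.onLoad = function(res) {
    if (res) {
        // Loaded variables become properties of the LoadVars object.
        _root.textFieldInstance.text = this.var1;
    } else {
        trace("error loading external file");
    }
};
loader.load("datafile.txt");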
The datafile.txt will look like this:
&var1=value1&
This will probably work. But what if you want to load text that contains some more interesting characters? Let's take the õ (0xF5 - LATIN SMALL LETTER O WITH TILDE) character. The file will look like this:
&var1=õextra&
The text field will probably display a single character (or a square if your system has no matching glyph), followed by the text "ra".
Let's find out why. The hex value of the first character (0xF5) is [11110101] in binary.
In the UTF-8 table above, a byte that starts with "11110" announces a character made of four consecutive bytes. That's why the "õext" sequence is taken as a single character.
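The effect can be simulated outside Flash. The Python sketch below is only a model of a lenient decoder that trusts the lead byte. It is an assumption about the behaviour described above, not Flash MX's actual implementation, and the names `sequence_length` and `split_naively` are hypothetical.

```python
def sequence_length(lead_byte):
    """Return how many bytes the lead byte announces, per the UTF-8 table."""
    if (lead_byte & 0x80) == 0x00:   # 0xxxxxxx
        return 1
    if (lead_byte & 0xE0) == 0xC0:   # 110xxxxx
        return 2
    if (lead_byte & 0xF0) == 0xE0:   # 1110xxxx
        return 3
    if (lead_byte & 0xF8) == 0xF0:   # 11110xxx
        return 4
    return 1                         # stray continuation or invalid byte

def split_naively(data):
    """Cut the bytes into 'characters' without validating continuation bytes."""
    chunks = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks

# The value as stored in a single-byte (ISO-8859-1) file: 0xF5 then "extra".
print(split_naively(b"\xf5extra"))
# The same value stored as UTF-8: 0xC3 0xB5 then "extra".
print(split_naively(b"\xc3\xb5extra"))
```

In the first case the lead byte swallows "ext" and only "r" and "a" remain as separate characters. In the second case the two-byte sequence is cut correctly and the whole word survives.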
All we need to do is transform the õ (0xF5) character into its UTF-8 representation. This can be done either on paper or with a function. In PHP you can use utf8_encode, which converts ISO-8859-1 text to UTF-8. After this conversion the õ is stored as the two bytes 0xC3 0xB5, and the file will look like this:
&var1=õextra&
If you test your movie now, it should work.
Data loaded into Flash should also be URL-encoded. This can be done using the rawurlencode function of PHP. The data file will then contain:
&var1=%C3%B5extra&
To safely transfer data from external sources into Flash, you should UTF-8-encode, then URL-encode, the string values. In PHP you can use:
urlencode(utf8_encode($stringValue));
// or
rawurlencode(utf8_encode($stringValue));
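The same two steps can be sketched outside PHP. The Python version below is an illustration only. The function name `to_flash_value` and its `source_charset` parameter are hypothetical, and `urllib.parse.quote` is used as a rough counterpart of rawurlencode, not an exact equivalent.

```python
from urllib.parse import quote

def to_flash_value(raw_bytes, source_charset="iso-8859-1"):
    """Convert single-byte text to UTF-8, then percent-encode it."""
    text = raw_bytes.decode(source_charset)    # bytes -> Unicode characters
    utf8_bytes = text.encode("utf-8")          # step 1: UTF-8 encode
    return quote(utf8_bytes, safe="")          # step 2: URL-encode

# 0xF5 read as ISO-8859-1 is U+00F5, which is 0xC3 0xB5 in UTF-8.
print("&var1=" + to_flash_value(b"\xf5extra") + "&")

# The same byte read as ISO-8859-2 is U+0151, which is 0xC5 0x91 in UTF-8.
print("&var1=" + to_flash_value(b"\xf5extra", "iso-8859-2") + "&")
```

Unlike utf8_encode, this sketch lets the caller name the source character map, which matters when the input is not ISO-8859-1.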
Of course, this is only half of the communication: from external sources to Flash. Please read the column on the right to find out how the Cabron Connector can help you easily transfer multilanguage content in both directions.