Re: The keyboard in SDL
Kuon - Nicolas Goy - 時期精霊 (Goyman.com SA) <kuon <at> goyman.com>
2007-02-02 09:10:57 GMT
On Feb 1, 2007, at 10:05 PM, Christer Sandberg wrote:
>
You should not confuse Unicode and UTF-8.
Unicode is NOT an encoding, it's a standard CHARACTER SET.
The encodings are UTF-8, UTF-16 (little and big endian) and UTF-32
(same as UTF-16 as far as endianness goes).
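
To make the character set / encoding distinction concrete, here is a small Python sketch (Python and the sample character are my choices for illustration, nothing SDL-specific). It takes one code point and shows the different byte sequences each encoding produces for it:

```python
# One code point, several encodings: the character set assigns the number,
# the encoding decides which bytes represent that number.
ch = "\u00e9"  # 'é', picked only as an example

code_point = ord(ch)                # the Unicode value, 0xE9

utf8 = ch.encode("utf-8")           # units of 8 bits
utf16_le = ch.encode("utf-16-le")   # units of 16 bits, little endian
utf16_be = ch.encode("utf-16-be")   # units of 16 bits, big endian
utf32_le = ch.encode("utf-32-le")   # units of 32 bits, little endian
utf32_be = ch.encode("utf-32-be")   # units of 32 bits, big endian

for name, data in [("UTF-8", utf8), ("UTF-16LE", utf16_le),
                   ("UTF-16BE", utf16_be), ("UTF-32LE", utf32_le),
                   ("UTF-32BE", utf32_be)]:
    print(f"{name:9} {data.hex(' ')}")
```

The code point is the same in every line of output; only the bytes differ.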
In SDL, the unicode value is UCS4. (I think so, I'm not 100% sure about
SDL. It's quite strange because UCS4 is 32 bits and SDL returns a
16-bit value, but I think it's UCS4 and they just drop the value if
it's bigger than 65535. Sam, Ryan?)
Let me explain a bit:
UTF-32 is the UCS4 value encoded on 32 bits: simply a number, stored
in little or big endian.
For example (I took a rare kanji because it has a huge value :) ),
0x2F9F4 is the UCS4 value, and it can be encoded as is on 32 bits.
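
As a quick illustration of "simply a number, stored in little or big endian", this Python sketch (my own example, not taken from SDL) writes 0x2F9F4 as four bytes in both byte orders and checks the result against Python's UTF-32 codecs:

```python
code_point = 0x2F9F4  # the kanji used as the example above

# UTF-32 is the number itself on 4 bytes; only the byte order varies.
little = code_point.to_bytes(4, "little")
big = code_point.to_bytes(4, "big")

print("UTF-32LE:", little.hex(" "))
print("UTF-32BE:", big.hex(" "))

# Python's own UTF-32 codecs produce exactly the same bytes.
same_as_codec = (chr(code_point).encode("utf-32-le") == little
                 and chr(code_point).encode("utf-32-be") == big)

# The value does not fit in a single 16-bit unit.
fits_in_16_bits = code_point <= 0xFFFF
print("matches codec:", same_as_codec, "| fits in 16 bits:", fits_in_16_bits)
```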
Now, how can I encode this on 16 or 8 bits? In a single unit, it's impossible!
The answer is surrogates. This barbaric word means a "prefix" used to
inform the parser that our char is encoded on two units. (A
unit is 16 or 8 bits depending on the encoding; there can be up to 4 units
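
A minimal Python sketch of the multi-unit idea, using the same 0x2F9F4 example. The helper name `utf16_units` is hypothetical, and it assumes a valid code point outside the range 0xD800 to 0xDFFF. Strictly speaking, "surrogate" is the UTF-16 term; UTF-8 gets the same effect with a lead byte followed by continuation bytes.

```python
def utf16_units(code_point):
    """Split a code point into its UTF-16 units (hypothetical helper).

    Assumes a valid code point that is not itself in 0xD800..0xDFFF.
    """
    if code_point <= 0xFFFF:
        return [code_point]           # fits in one 16-bit unit
    v = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (v >> 10)         # "prefix" unit: top 10 bits
    low = 0xDC00 + (v & 0x3FF)        # second unit: bottom 10 bits
    return [high, low]

code_point = 0x2F9F4
units = utf16_units(code_point)
print("UTF-16 units:", [hex(u) for u in units])

# Cross-check against Python's UTF-16 codec (big endian, 2 bytes per unit).
expected = chr(code_point).encode("utf-16-be")
manual = b"".join(u.to_bytes(2, "big") for u in units)
print("matches codec:", manual == expected)

# In UTF-8 the same character takes four 8-bit units.
utf8 = chr(code_point).encode("utf-8")
print("UTF-8 bytes:", utf8.hex(" "))
```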