Platform's default charset on different platforms?

Accepted answer
Score: 32

That's a user-specific setting. On many modern Linux systems, it's UTF-8. On Macs, it's MacRoman. On Windows in the US and Western Europe, it's often CP1252; in Central Europe it's CP1250. In China, you often find a Chinese encoding (a GB* variant for Simplified Chinese, or Big5 for Traditional Chinese).

But that's the system default, which each user can change at any time. That points to the probable solution: set the encoding when you start your app, using the system property file.encoding.

See this answer for how to do that. I suggest putting this into a small script which starts your app, so the user default isn't tainted.
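As a rough sketch of what this looks like in practice: the property is usually passed on the command line, for example `java -Dfile.encoding=UTF-8 -jar yourapp.jar` (the jar name is a placeholder, not something from the question). The class below, whose name and helper methods are made up for illustration, prints what the JVM actually picked up and shows the more robust alternative of naming the charset explicitly at each conversion. Note that JDK 18 and later default to UTF-8 regardless of the OS setting, so the output depends on your Java version.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Encodes with an explicitly named charset, so the result does not
    // depend on the platform default or on -Dfile.encoding.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes with the same explicit charset.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What the JVM was started with; varies by OS, user settings and JDK version.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // Unicode escapes keep this source file pure ASCII, so compiling it
        // does not itself depend on the platform encoding.
        String text = "Gr\u00fc\u00dfe";
        byte[] bytes = encodeUtf8(text);
        System.out.println("UTF-8 length    = " + bytes.length + " bytes");
        System.out.println("round trip ok   = " + text.equals(decodeUtf8(bytes)));
    }
}
```

Where you control the code, passing the charset explicitly like this is generally safer than relying on the startup property, because it keeps working when someone launches the app without the script.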

Score: 8

For Windows and Linux installations in the "western world" I know what that means.

Probably not as well as you think.

But thinking about Russian or Asian platforms I am totally unsure what their platform's default charset is

Usually it's whatever encoding is historically used in their country.

(just UTF-16?).

Most definitely not. Computer usage spread widely before the Unicode standard existed, and each language area developed one or more encodings that could support its language. Those who needed fewer than 128 characters outside ASCII typically developed an "extended ASCII", many of which were eventually standardized as ISO-8859, while others developed two-byte encodings, often several competing ones. For example, in Japan, emails typically use JIS, but webpages use Shift-JIS, and some applications use EUC-JP. Any of these might be encountered as the platform default encoding in Java.

It's all a huge mess, which is exactly why Unicode was developed. But the mess has not yet disappeared, and we still have to deal with it. We should not make any assumptions about which encoding a given bunch of bytes is in when it is to be interpreted as text. There Ain't No Such Thing as Plain Text.
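To make the point concrete, here is a minimal sketch (the class and method names are invented for this example) that decodes the same two bytes under three charsets every Java platform is required to support. The bytes are the UTF-8 form of "é"; only one of the three interpretations gives that character back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class SameBytesDemo {

    // Interprets raw bytes as text under the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Renders a string as code points (e.g. "U+00E9") so the output
    // does not depend on what the console itself can display.
    static String codePoints(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is U+00E9 (e with acute accent) encoded as UTF-8.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        Charset[] candidates = {
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_16BE
        };
        for (Charset charset : candidates) {
            System.out.println(charset.name() + " -> " + codePoints(decode(bytes, charset)));
        }
    }
}
```

Nothing in the bytes themselves says which reading is right, which is why the encoding has to be known from somewhere else (a protocol header, a file format, a convention) rather than guessed from the platform default.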
