Java, UTF-8, and Windows console

Accepted answer
Score: 10

Try chcp 65001 && start.bat

The chcp command changes the code page, and 65001 is the Win32 code page identifier for UTF-8 under Windows 7 and up. A code page, or character encoding, specifies how to convert a Unicode code point to a sequence of bytes or back again.
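Switching the console to code page 65001 only helps if the Java side also emits UTF-8 bytes. The following is a minimal sketch of that idea, not part of the original answer: the class name `Utf8Out` and its helpers are hypothetical, and it assumes the console was already switched with chcp 65001 before the program starts.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Out {
    /** Wraps a stream so that text printed to it is encoded as UTF-8 (hypothetical helper). */
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8"); // true = autoflush
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    /** Returns how many bytes the text occupies once encoded as UTF-8. */
    static int utf8Length(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream stream = utf8(buffer);
        stream.print(text);
        stream.flush();
        return buffer.size();
    }

    public static void main(String[] args) {
        // Assumption: the console code page is already 65001 (UTF-8).
        PrintStream out = utf8(System.out);
        out.println("caf\u00E9 \u4F60\u597D");
        // One code point can map to several bytes, depending on the encoding.
        out.println("UTF-8 bytes for U+00E9: " + utf8Length("\u00E9"));
    }
}
```

Whether the characters actually render also depends on the console font, so treat this as a starting point rather than a guaranteed fix.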

Score: 7

Java on Windows does NOT support Unicode output by default. I have written a workaround method that calls the native API through the JNA library. The method calls WriteConsoleW to write Unicode output to the console.

import com.sun.jna.Native;
import com.sun.jna.Pointer;
import com.sun.jna.ptr.IntByReference;
import com.sun.jna.win32.StdCallLibrary;

/** For unicode output on windows platform
 * @author Sandy_Yin
 * 
 */
public class Console {
    private static Kernel32 INSTANCE = null;

    public interface Kernel32 extends StdCallLibrary {
        public Pointer GetStdHandle(int nStdHandle);

        public boolean WriteConsoleW(Pointer hConsoleOutput, char[] lpBuffer,
                int nNumberOfCharsToWrite,
                IntByReference lpNumberOfCharsWritten, Pointer lpReserved);
    }

    static {
        String os = System.getProperty("os.name").toLowerCase();
        if (os.startsWith("win")) {
            INSTANCE = (Kernel32) Native
                    .loadLibrary("kernel32", Kernel32.class);
        }
    }

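    public static void println(String message) {
        boolean successful = false;
        if (INSTANCE != null) {
            // -11 is STD_OUTPUT_HANDLE, the standard output device.
            Pointer handle = INSTANCE.GetStdHandle(-11);
            char[] buffer = message.toCharArray();
            IntByReference lpNumberOfCharsWritten = new IntByReference();
            // WriteConsoleW takes UTF-16 text, so the console code page is bypassed.
            successful = INSTANCE.WriteConsoleW(handle, buffer, buffer.length,
                    lpNumberOfCharsWritten, null);
            if (successful) {
                // Terminate the line through the regular stream.
                System.out.println();
            }
        }
        // Fall back to System.out when not on Windows or when the native call
        // fails (for example, when output is redirected to a file).
        if (!successful) {
            System.out.println(message);
        }
    }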
}

Score: 0

We had similar problems on Linux. Our code was in ISO-8859-1 (mostly cp-1252 compatible) but the console was UTF-8, so the code did not compile. Simply changing the console to ISO-8859-1 broke the build script, which was in UTF-8. We found a couple of choices:
1- Define a standard encoding and stick to it. That was our choice. We chose to keep everything in ISO-8859-1 and modified the build scripts.
2- Set the encoding before starting any task, even inside the build scripts, along the lines of what erickson suggested. On Linux it looked like this:

LANG=pt_BR.ISO-8859-1 /usr/local/xxxx

My Eclipse is still set up like this. Both work well.
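To illustrate why the mismatch breaks things, here is a small sketch (not from the original answer; the class name `EncodingMismatch` and the sample word are made up for illustration). It encodes text as ISO-8859-1 and then decodes the same bytes as UTF-8, which is roughly what happens when a tool running under a UTF-8 locale reads ISO-8859-1 source files.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    /** Encodes text with one charset, then decodes the resulting bytes with another. */
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        String original = "a\u00E7\u00E3o"; // sample word with two non-ASCII letters

        // Same charset on both sides: the text survives the round trip.
        String sameCharset = reinterpret(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);

        // Written as ISO-8859-1 but read as UTF-8: the accented letters are
        // not valid UTF-8 sequences and get replaced during decoding.
        String misread = reinterpret(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("same charset intact: " + original.equals(sameCharset));
        System.out.println("mismatch intact:     " + original.equals(misread));
    }
}
```

For the compiler specifically, javac also accepts an -encoding option (for example -encoding ISO-8859-1), which states the source encoding explicitly instead of relying on the locale.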
