We know there is an application called AppLocale, which can change the code page of non-Unicode applications, to solve text display problems.
But I have a program whose correct display encoding is UTF-8: its text should be rendered as UTF-8, but Windows instead interprets it in the native code page, which makes the text unreadable. Oddly, AppLocale offers almost every country and region, but not UTF-8. I think this is a bug in the program; the programmers probably worked in English and never tested how non-English text is displayed. I don't expect the vendor to fix it, so I want to fix it myself.
Is it possible to set the non-Unicode output to UTF-8 using software like AppLocale? Is the default non-Unicode output the native code page? How can I set the native code page to UTF-8?
13 Answers
Previously it was not possible because Microsoft claimed a UTF-8 locale might break some functions (a possible example is _mbsrev) that were written to assume multibyte encodings used no more than 2 bytes per character. As a result, code pages with longer sequences, such as GB 18030 (cp54936) and UTF-8, could not be set as the locale.
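To make the "more than 2 bytes per character" point concrete, here is a minimal sketch that reads the length of a UTF-8 sequence from its lead byte. The names utf8_seq_len and utf8_max_char_bytes are hypothetical helpers written for this illustration; they are not part of the C runtime and not how Microsoft's functions are implemented.

```c
#include <stddef.h>

/* Returns the number of bytes in the UTF-8 sequence introduced by `lead`,
   or 0 if `lead` cannot start a sequence (continuation or invalid byte). */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;                   /* 0xxxxxxx: ASCII */
    if (lead >= 0xC2 && lead <= 0xDF) return 2;  /* 110xxxxx */
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  /* 1110xxxx */
    if (lead >= 0xF0 && lead <= 0xF4) return 4;  /* 11110xxx */
    return 0;
}

/* Hypothetical helper: size in bytes of the longest character found in a
   NUL-terminated string that is assumed to hold UTF-8. */
static size_t utf8_max_char_bytes(const char *s)
{
    size_t max = 0;
    while (*s) {
        size_t n = utf8_seq_len((unsigned char)*s);
        if (n == 0) n = 1;                       /* step over a stray byte */
        if (n > max) max = n;
        /* advance by n bytes, but never past the terminator */
        for (size_t i = 0; i < n && *s; ++i) ++s;
    }
    return max;
}
```

Code that reserves only two bytes per character would be too small for any string where utf8_max_char_bytes reports 3 or 4, which covers most CJK text and all emoji.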
However, since Windows 10 Insider build 17035 there is a "Beta: Use Unicode UTF-8 for worldwide language support" checkbox for setting the locale code page to UTF-8.
See also
- Changing ansi and OEM code page in Windows
- Windows 10 Insider Preview Build 17035 Supports UTF-8 as ANSI
That said, the support is still buggy at this point:
- Freeze issue in Windows 10 1803 when use UTF-8 as default code page
- when unicode beta support in windows 10 is turned on, add-ons fail to install
- UTF-8 support for single byte character sets is beta in Windows and likely breaks a lot of applications not expecting this
- Build fail with internal error in MSVC
Update:
Microsoft has also added the ability for programs to use the UTF-8 locale without even setting the UTF-8 beta flag above. You can use the /execution-charset:utf-8 or /utf-8 options when compiling with MSVC, or set the ActiveCodePage property in the appxmanifest.
You can also use a UTF-8 locale in older Windows versions by linking with the appropriate C runtime:
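If you want to check at run time whether the process actually ended up with UTF-8 as its ANSI code page, a small sketch along these lines may help. It relies on the Win32 GetACP function and the CP_UTF8 constant (65001); the wrapper name process_acp_is_utf8 and the -1 return value for non-Windows builds are assumptions made for this example.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Returns 1 if the process ANSI code page is UTF-8 (65001), 0 if it is some
   other code page, and -1 on non-Windows builds where the notion of an ANSI
   code page does not apply. */
static int process_acp_is_utf8(void)
{
#ifdef _WIN32
    return GetACP() == CP_UTF8 ? 1 : 0;
#else
    return -1;
#endif
}
```

A result of 0 on Windows suggests that neither the beta checkbox nor a manifest setting took effect for this process.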
Starting in Windows 10 build 17134 (April 2018 Update), the Universal C Runtime supports using a UTF-8 code page. This means that char strings passed to C runtime functions will expect strings in the UTF-8 encoding. To enable UTF-8 mode, use "UTF-8" as the code page when using setlocale. For example, setlocale(LC_ALL, ".utf8") will use the current default Windows ANSI code page (ACP) for the locale and UTF-8 for the code page....
To use this feature on an OS prior to Windows 10, such as Windows 7, you must use app-local deployment or link statically using version 17134 of the Windows SDK or later. For Windows 10 operating systems prior to 17134, only static linking is supported.
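As a rough sketch of the setlocale call quoted above: the function name try_enable_utf8_locale is made up for this example, and whether the ".utf8" name is accepted depends on the C runtime in use (it is the Universal C Runtime spelling; other runtimes may reject it).

```c
#include <locale.h>
#include <stddef.h>

/* Tries to switch the C runtime to a UTF-8 locale using the ".utf8" name
   from the quoted documentation. Returns 1 on success and 0 if the runtime
   rejected the name; on failure setlocale leaves the current locale
   unchanged, so the program keeps running in whatever locale it had. */
static int try_enable_utf8_locale(void)
{
    return setlocale(LC_ALL, ".utf8") != NULL;
}
```

Calling this once at startup and checking the result is probably safer than assuming the locale switch worked, given the version requirements described above.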
From what I read about the Microsoft AppLocale tool on Wikipedia, the tool can NOT change your code page to UTF-8. It only works with non-Unicode applications, but UTF-8 is part of the Unicode standard.
Under the hood, Unicode processing of non-ASCII characters greatly differs from non-Unicode one, so while it is possible to change between non-Unicode code pages (this is what AppLocale does) it is NOT possible to change between Unicode and non-Unicode without modification of the application made by its producer.
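To illustrate the symptom from the question (UTF-8 bytes being read as a single-byte code page), here is a small sketch. It uses ISO-8859-1 as a stand-in for the native code page, which is an assumption: real Windows ANSI code pages such as 1252 differ from it in the 0x80 to 0x9F range. The function name latin1_to_utf8 is hypothetical.

```c
#include <stddef.h>

/* Interprets each input byte as one ISO-8859-1 character and writes the
   result as NUL-terminated UTF-8. Returns the number of bytes written
   (excluding the NUL), or 0 if `cap` is too small (an empty input that
   fits also returns 0). */
static size_t latin1_to_utf8(const unsigned char *in, size_t n,
                             char *out, size_t cap)
{
    size_t w = 0;
    for (size_t i = 0; i < n; ++i) {
        unsigned char b = in[i];
        size_t need = (b < 0x80) ? 1 : 2;
        if (w + need + 1 > cap) return 0;          /* keep room for the NUL */
        if (b < 0x80) {
            out[w++] = (char)b;                    /* ASCII passes through */
        } else {
            out[w++] = (char)(0xC0 | (b >> 6));    /* two-byte UTF-8 form */
            out[w++] = (char)(0x80 | (b & 0x3F));
        }
    }
    if (w + 1 > cap) return 0;
    out[w] = '\0';
    return w;
}
```

Feeding it the two UTF-8 bytes of "é" (0xC3 0xA9) yields the four bytes 0xC3 0x83 0xC2 0xA9, which display as "Ã©": one intended character turns into two wrong ones, which is the kind of unreadable text described in the question.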
Just to mention it here: in Windows 10 17133 there is now a beta option to use UTF-8 for worldwide support. It does not help with non-Unicode programs for me as of now, but it is placed in the same dialog where I can change the locale for non-Unicode programs.
So, maybe they are working on something to end the necessity of having to change the locale for non-Unicode programs.