Problem using wscanf in C for wide characters
Question
I am trying to enter 3 characters, one of which is å. Here is the code:
#ifdef _MSC_VER
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_U16TEXT
#endif
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#define SIZE 4
void set_locale_mode() {
   #ifdef _MSC_VER
   // Unicode UTF-16, little endian byte order (BMP of ISO 10646)
   const char *CP_UTF_16LE = ".1200";
   setlocale(LC_ALL, CP_UTF_16LE);
   _setmode(_fileno(stdout), _O_U16TEXT);
   #else
      setlocale(LC_ALL, "");
   #endif
}
int main(void) {
   set_locale_mode(); 
   wchar_t myString[SIZE];
   wchar_t testChar=0x00E5;
   wprintf(L"Your test character is %lc\n", testChar);
   printf("Now, enter 3 characters: ");
   wscanf(L"%ls", myString);
   wprintf(L"Your input is %ls\n", myString);
   return 0;
}
However, when I enter, for example, blå, I get this output:
Your test character is å
Now, enter 3 characters: blå 
Your input is bl┼
Why is å not printed correctly after I use wscanf?
I am on Windows.
Answer 1
Score: 3
Since you want the input to use _O_U16TEXT too, set that mode on stdin as well in the set_locale_mode function.
I also suggest not mixing wide and narrow output on the same stream, such as wprintf and printf on stdout.
#if defined(_MSC_VER) || defined(__MINGW32__) || defined(__MINGW64__)
// perhaps there is already a macro for this, but I don't know of one:
#define ON_WINDOWS
#endif
#ifdef ON_WINDOWS
#define _CRT_SECURE_NO_WARNINGS
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_U16TEXT
// just in case mingw doesn't define it after all:
#ifndef _O_U16TEXT
#define _O_U16TEXT (0x20000)
#endif
#endif
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#define SIZE 4
void set_locale_mode() {
#ifdef ON_WINDOWS
    // Unicode UTF-16, little endian byte order (BMP of ISO 10646)
    const char* CP_UTF_16LE = ".1200";
    setlocale(LC_ALL, CP_UTF_16LE);
    _setmode(_fileno(stdin), _O_U16TEXT);                // <- Added
    _setmode(_fileno(stdout), _O_U16TEXT);
#else
    setlocale(LC_ALL, "");
#endif
}
int main(void) {
    set_locale_mode();
    wchar_t myString[SIZE];
    wchar_t testChar = L'\u00E5'; // shows intent more clearly than 0x00E5
    wprintf(L"Your test character is %lc\n", testChar);
    wprintf(L"Now, enter 3 characters: ");              // <- Use wprintf here
    wscanf(L"%3ls", myString); // <- limit to 3 since SIZE is 4
    wprintf(L"Your input is %ls\n", myString);
}
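The width limit in "%3ls" and the return value of the scan call can be checked without a console. The following is a minimal sketch that uses swscanf on an in-memory wide string instead of wscanf on stdin; parse_word is a hypothetical helper name introduced only for this illustration and is not part of the answer's code.

```c
#include <wchar.h>

#define SIZE 4

/* Hypothetical helper (not from the answer): reads at most SIZE - 1
 * non-whitespace wide characters from `line` into `out`.
 * Returns 1 if a word was stored, 0 otherwise. */
static int parse_word(const wchar_t *line, wchar_t out[SIZE]) {
    /* The field width 3 leaves room for the terminating L'\0'. */
    return swscanf(line, L"%3ls", out) == 1;
}
```

The same check applies to wscanf: if it does not return 1, myString was not filled and should not be printed.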
Answer 2
Score: 2
The code runs into undefined behavior (UB) because it writes to stdout with a wide function (wprintf) and then with a byte function (printf), that is, with two different orientations.
> Each stream has an orientation. After a stream is associated with an external file, but before any operations are performed on it, the stream is unoriented. Once a wide character input/output function has been applied to an unoriented stream, the stream becomes a wide-oriented stream. Similarly, once a byte input/output function has been applied to an unoriented stream, the stream becomes a byte-oriented stream. Only a call to the freopen function or the fwide function can otherwise alter the orientation of a stream.  (A successful call to freopen removes any orientation.)
> Byte input/output functions shall not be applied to a wide-oriented stream and wide character input/output functions shall not be applied to a byte-oriented stream. ...
> C23dr § 7.23.2 4-5
Either use the same orientation throughout, as @Ted Lyngmo suggests, or remove the orientation with freopen before switching.
wprintf(L"Your test character is %lc\n", testChar);
freopen(NULL, "w", stdout);
printf("Now, enter 3 characters: ");
freopen(NULL, "w", stdout);
wprintf(L"...");
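To observe the orientation rule quoted above, a stream can be queried with fwide(stream, 0), which reports the current orientation without changing it. The following is a small sketch; orientation_of is a hypothetical helper name used only here.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: normalizes the result of fwide(stream, 0) to
 *  1 for a wide-oriented stream,
 * -1 for a byte-oriented stream,
 *  0 for a stream that has no orientation yet.
 * Passing mode 0 only queries; it never sets the orientation. */
static int orientation_of(FILE *stream) {
    int o = fwide(stream, 0);
    return (o > 0) - (o < 0);
}
```

Calling orientation_of(stdout) before and after the first wprintf shows the stream going from unoriented to wide-oriented, which is why the later printf on the same stream is not allowed.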