Problem using wscanf in C for wide characters

Question

I am trying to enter 3 characters, one of which is å. Here is the code:

    #ifdef _MSC_VER
    #include <io.h>    // _setmode
    #include <fcntl.h> // _O_U16TEXT
    #endif
    #include <stdio.h>
    #include <locale.h>
    #include <wchar.h>

    #define SIZE 4

    void set_locale_mode() {
    #ifdef _MSC_VER
        // Unicode UTF-16, little endian byte order (BMP of ISO 10646)
        const char *CP_UTF_16LE = ".1200";
        setlocale(LC_ALL, CP_UTF_16LE);
        _setmode(_fileno(stdout), _O_U16TEXT);
    #else
        setlocale(LC_ALL, "");
    #endif
    }

    int main(void) {
        set_locale_mode();
        wchar_t myString[SIZE];
        wchar_t testChar=0x00E5;
        wprintf(L"Your test character is %lc\n", testChar);
        printf("Now, enter 3 characters: ");
        wscanf(L"%ls", myString);
        wprintf(L"Your input is %ls\n", myString);
        return 0;
    }

However, when I enter, for example, blå, I get this output:

    Your test character is å
    Now, enter 3 characters: blå
    Your input is bl

Why is å not printed correctly after I use wscanf?

I am on Windows.

Answer 1

Score: 3


Since you want input to use _O_U16TEXT as well, set that mode on stdin in the set_locale_mode function.

I also suggest not mixing wide and narrow output on the same stream, such as wprintf and printf on stdout.

    #if defined(_MSC_VER) || defined(__MINGW32__) || defined(__MINGW64__)
    // perhaps there is already a macro for this, but I don't know of one:
    #define ON_WINDOWS
    #endif

    #ifdef ON_WINDOWS
    #define _CRT_SECURE_NO_WARNINGS
    #include <io.h>    // _setmode
    #include <fcntl.h> // _O_U16TEXT
    // just in case mingw doesn't define it after all:
    #ifndef _O_U16TEXT
    #define _O_U16TEXT (0x20000)
    #endif
    #endif

    #include <stdio.h>
    #include <locale.h>
    #include <wchar.h>

    #define SIZE 4

    void set_locale_mode(void) {
    #ifdef ON_WINDOWS
        // Unicode UTF-16, little endian byte order (BMP of ISO 10646)
        const char* CP_UTF_16LE = ".1200";
        setlocale(LC_ALL, CP_UTF_16LE);
        _setmode(_fileno(stdin), _O_U16TEXT);  // <- Added
        _setmode(_fileno(stdout), _O_U16TEXT);
    #else
        setlocale(LC_ALL, "");
    #endif
    }

    int main(void) {
        set_locale_mode();
        wchar_t myString[SIZE];
        // shows intent more clearly than 0x00E5; L'...' is the wchar_t literal form
        wchar_t testChar = L'\u00E5';
        wprintf(L"Your test character is %lc\n", testChar);
        wprintf(L"Now, enter 3 characters: "); // <- Use wprintf here
        wscanf(L"%3ls", myString);             // <- limit to 3 since SIZE is 4
        wprintf(L"Your input is %ls\n", myString);
    }
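
The effect of the `%3ls` width limit can be checked without a console by parsing from an in-memory wide string with swscanf instead of reading stdin with wscanf. The following is only a minimal sketch: `read_three` is a hypothetical helper name that does not appear in the answer, and it assumes the same SIZE of 4 as above.

```c
#include <stdio.h>
#include <wchar.h>

#define SIZE 4

/* Hypothetical helper (not part of the answer): parse at most SIZE - 1
   wide characters from `input` into `out`, which leaves room for the
   terminating L'\0'. Returns the swscanf result, which is 1 when one
   field was converted. */
static int read_three(const wchar_t *input, wchar_t out[SIZE]) {
    return swscanf(input, L"%3ls", out);
}
```

Calling `read_three(L"bl\u00E5xyz", buf)` stores only the first three characters in `buf`, so a longer input cannot overflow the four-element array. Checking the return value in the same way after wscanf would also reveal a failed read before the buffer is printed.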

Answer 2

Score: 2


The code runs into undefined behavior (UB) because it applies both wide and byte output functions (wprintf and printf) to the same stream, stdout, which can have only one orientation.

> Each stream has an orientation. After a stream is associated with an external file, but before any operations are performed on it, the stream is unoriented. Once a wide character input/output function has been applied to an unoriented stream, the stream becomes a wide-oriented stream. Similarly, once a byte input/output function has been applied to an unoriented stream, the stream becomes a byte-oriented stream. Only a call to the freopen function or the fwide function can otherwise alter the orientation of a stream. (A successful call to freopen removes any orientation.)
> Byte input/output functions shall not be applied to a wide-oriented stream and wide character input/output functions shall not be applied to a byte-oriented stream. ...
> C23dr § 7.23.2 4-5

Either use the same orientation throughout, as @Ted Lyngmo suggests, or remove the orientation before switching.

    wprintf(L"Your test character is %lc\n", testChar);
    freopen(NULL, "w", stdout);
    printf("Now, enter 3 characters: ");
    freopen(NULL, "w", stdout);
    wprintf(L"...");
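
To see which orientation a stream currently has, the standard fwide function can be called with a mode of 0, which queries the orientation without changing it. A minimal sketch follows; `orientation_name` is a hypothetical helper introduced here for illustration and is not from the answer.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: report a stream's current orientation.
   fwide(stream, 0) only queries: it returns a positive value for a
   wide-oriented stream, a negative value for a byte-oriented stream,
   and 0 for a stream that has no orientation yet. */
static const char *orientation_name(FILE *stream) {
    int o = fwide(stream, 0);
    if (o > 0) return "wide";
    if (o < 0) return "byte";
    return "unoriented";
}
```

Called on stdout right after the first wprintf in the question's program, this helper would be expected to report "wide", which is why the following printf on the same stream falls under the restriction quoted above.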

huangapple
  • Posted on June 29, 2023, 00:29:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76575090.html