Save Hex value or UTF-16 [LE] inside a file in Delphi 7 [Error in using these codes]

Question

Continuing from my previous question: https://stackoverflow.com/questions/76150675/

When I use the following code from this answer:

uses
  Classes;
    
const
  BOM: WideChar = $FEFF;
var
  W: WideString;
  FS: TFileStream;
begin
  W := 'ᗿABC';
  FS := TFileStream.Create('text.txt', fmCreate);
  try
    FS.WriteBuffer(BOM, Sizeof(BOM));
    FS.WriteBuffer(PWideChar(W)^, Length(W) * Sizeof(WideChar));
  finally
    FS.Free;
  end;
end;

I have these problems:

  1. When I write W := 'ᗿABC'; my IDE shows W := '?ABC'; instead. Can I fix this? The [?] is not what I want.

  2. When running the code, I get the error shown in the image:

[Image: screenshot of the error message]

Answer 1

Score: 1

  1. The code editor in Delphi 7 does not support Unicode characters very well, if at all. You really should stop using a compiler that is more than 20 years old and upgrade.

    In any case, try this instead:

    W := #$15FF'ABC';
    

    Or, you may have to resort to something more like this:

    W := WideChar($15FF) + WideString('ABC');
    
  2. Try type-casting the integer value to WideChar:

    const
      BOM: WideChar = WideChar($FEFF);
    

    Otherwise, use Word instead of WideChar:

    const
      BOM: Word = $FEFF;
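
As a rough sketch, the two suggestions above could be combined into one complete console program. The program name and the text.txt output path are only illustrative, and the sketch assumes a little-endian target such as x86 Windows, where the Word $FEFF is stored as the bytes FF FE.

```pascal
program SaveUtf16;

{$APPTYPE CONSOLE}

uses
  Classes;

const
  // UTF-16 byte order mark declared as Word, so no WideChar constant is needed.
  BOM: Word = $FEFF;

var
  W: WideString;
  FS: TFileStream;
begin
  // Build the string from a code point so the source file stays pure ASCII
  // and the Delphi 7 editor cannot replace the character with '?'.
  W := WideChar($15FF) + WideString('ABC');

  FS := TFileStream.Create('text.txt', fmCreate);
  try
    // Write the BOM first, then the raw UTF-16 code units of the string.
    FS.WriteBuffer(BOM, SizeOf(BOM));
    FS.WriteBuffer(PWideChar(W)^, Length(W) * SizeOf(WideChar));
  finally
    FS.Free;
  end;
end.
```

Under that little-endian assumption, the file should be 10 bytes long: FF FE for the BOM, FF 15 for U+15FF, then 41 00 42 00 43 00 for ABC.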
    

Answer 2

Score: 1


Delphi 7 kind of works with Unicode in WideStrings, but not consistently:

var
  RegExprWLineSeparators: WideString;
begin
  // In the following line the literal ends up producing ASCII question marks for the 5th and 6th character.
  RegExprWLineSeparators:= #$d#$a#$b#$c+ WideChar($2028)+ WideChar($2029)+ #$85;

  // But assigning characters individually will make both correct - so first do the one above
  // (or provide anything, because you want to reassign it anyway) and later make it per character.
  RegExprWLineSeparators[5]:= WideChar($2028);
  RegExprWLineSeparators[6]:= WideChar($2029);

A few characters aren't even assignable this way (neither via text literal nor via ordinal literal), so you need a different approach to test for them:

var
  sText: WideString;
begin
  sText:= <something>;

  // Checking if the first character is a UTF-16 BE or LE BOM
  case Word(sText[1]) of
    $FEFF,
    $FFFE: Delete( sText, 1, 1 );  // Remove such a character
  end;

Rules of thumb:

  • use Words and cast them to WideChar when using text literals
  • use Word over WideChar when comparing/checking
  • noncharacters (like U+FFFE and U+FFFF) are usually unassignable
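
The first two rules can be sketched as a small helper. WideFromWords is a hypothetical name and not part of the Delphi RTL. It fills a WideString one character at a time from Word values, which is the per-character assignment described above written as a loop.

```pascal
program WordsToWide;

{$APPTYPE CONSOLE}

// Hypothetical helper: builds a WideString from UTF-16 code units given as
// Words, assigning each character individually instead of using a literal.
function WideFromWords(const Codes: array of Word): WideString;
var
  I: Integer;
begin
  SetLength(Result, High(Codes) + 1);
  for I := 0 to High(Codes) do
    Result[I + 1] := WideChar(Codes[I]);  // WideString indexes start at 1
end;

var
  Separators: WideString;
begin
  Separators := WideFromWords([$000D, $000A, $000B, $000C, $2028, $2029, $0085]);

  // Compare as Word rather than WideChar, following the second rule.
  if Word(Separators[5]) = $2028 then
    WriteLn('5th character is U+2028');
end.
```

Keeping the code units as plain numbers means the source file contains only ASCII, so the result does not depend on how the Delphi 7 editor or compiler handles non-ASCII literals.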

Beginners should avoid using Delphi 7 for Unicode; do it only once you are confident with Unicode, UTF-16 and Pascal in general. I started on Delphi 5 under Windows 2000 and later continued with Delphi 7, and ran into issues like these on various occasions (regular expressions, among others).

As an alternative to a dated Delphi version, you could try the free Lazarus IDE for FPC. It uses UTF-8 as its approach to Unicode and should support text literals in code much better. The IDE even looks like the robust Delphi 7 one.

huangapple
  • Published on May 7, 2023, 13:24:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76192316.html