Create Java file with special hidden (non-printable and invisible?) characters of IBM I series (AS400) problem

Question

I'm trying to create an "XLS" file that is really a plain-text file: if you right-click it in Windows, choose "Open with", and pick Notepad, it shows up as a tab-separated file.

This is the sample file that I'm trying to reproduce (I had to remove some entries because it is large). When you view it as UTF-8 in Notepad++, you will see "hidden characters":
https://docs.google.com/spreadsheets/d/1q_AkGaQK8Glc6OzmVl4gRmItO4Ojnq7G/edit?usp=sharing&ouid=113904619378239546124&rtpof=true&sd=true

Download the file, open it in Notepad++, and choose UTF-8 as the encoding. You will be able to see those hidden characters:
[screenshot: sample file in Notepad++ with the encoding set to UTF-8]

This is what it looks like if you open it with Excel:
[screenshot: sample file in Excel]
As you can see, the hidden characters show up as xA0 at the beginning of the content.
I know that the AS400 uses EBCDIC character codes.

This is the code that they put in the COBOL program:

HSPACE PIC X VALUE X'41'

What is the equivalent of those hidden characters in Java?

I have created the test program below:

    import java.io.FileWriter;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    // The class name is illustrative; the original post showed only the method body.
    public class XlsTest {
        public static void main(String[] args) throws IOException {
            // Sample rows: some fields start with a normal space (U+0020), others with a no-break space (U+00A0).
            List<Object[]> data = new ArrayList<>();
            data.add(new Object[]{"\u0020 AS1", "185914", "\u0020 NETHERLANDS", "NL", "A0", "\u00202023714", "\u00A02023714", "27-AUG-2022", "03-FEB-2023", "\u00A0", "\u00A04", "00000000", "\u00A0IF-ADAMAS", "\u00A0", "\u00A0PTF166091NL00", "\u00A0P166091NL00", "\u00A0", "\u00A0", "\u00A0", "\u00A0", "\u00A0IF ADAMAS B V"});
            data.add(new Object[]{"\u0020 AS1", "\u0020200893", "\u0020 GERMANY", "DE", "\u00A0", "\u00A013801864.3", "\u00A02915188", "05-NOV-2022", "22-FEB-2023", "\u00A0R80049", "\u00A010", "00000434", "\u00A0MICRONIT M", "\u00A0", "\u00A0PTF124241DEEP", "\u00A0P118354DEEP", "\u00A0", "\u00A0", "\u00A0", "\u00A0", "\u00A0MICRONIT MICROFLUIDICS B.V."});

            // FileWriter(String, Charset) needs Java 11+; try-with-resources closes the file even on error.
            try (FileWriter writer = new FileWriter("output.XLS", StandardCharsets.UTF_8)) {
                // Header row: quoted column names separated by tabs.
                writer.write("\"Client\"\t\"Case Number\"\t\"Country\"\t\"WIPO\"\t\"Subcase\"\t\"Application Number\"\t\"Patent Number\"\t\"Due Date\"\t\"Paid Date\"\t\"Invoice Number\"\t\"Annuity Number\"\t\"Invoice Amount\"\t\"Client/Division\"\t\"Client Ref(Inv)\"\t\"Client Ref#1(Ctry)\"\t\"Client Ref#2(Ctry)\"\t\"Attorney(Inv)\"\t\"Attorney(Ctry)\"\t\"Remarks\"\t\"Local Title\"\t\"Title Holder\"\n");

                // Data rows: every field quoted, tab between fields, newline after each row.
                for (Object[] row : data) {
                    for (int i = 0; i < row.length; i++) {
                        writer.write("\"" + row[i].toString() + "\"");
                        if (i < row.length - 1) {
                            writer.write("\t");
                        }
                    }
                    writer.write("\n");
                }
            }
            System.out.println("Done");
        }
    }

However, when I open the generated file in Notepad++ with UTF-8 encoding, I don't see any hidden characters:
[screenshot: generated file in Notepad++ with the encoding set to UTF-8]

The whitespace is there, just like in the sample text file. However, if you open my generated file in Excel:
[screenshot: generated file in Excel]
You can see weird characters where I put the \u00A0 escapes in my code!
How can I create a text file in Java that matches the "XLS / TXT" file produced by the IBM i series (AS400) COBOL program? Can someone help me with this, please?

Answer 1

Score: 3

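The file produced by the AS400 is (probably) encoded with the Windows-1252 charset, which Notepad++ calls ANSI. When you display it as UTF-8, you see xA0 because a lone 0xA0 byte is not a valid UTF-8 sequence. Your own file is the opposite case: written as UTF-8, each \u00A0 becomes the two bytes 0xC2 0xA0, which is valid UTF-8 (so Notepad++ shows plain whitespace) but which Excel, reading the file as ANSI, displays as an extra "Â" in front of the space.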
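To make the byte-level difference concrete, here is a small sketch that encodes a no-break space with both charsets and then misreads the UTF-8 bytes as Windows-1252. The class and method names (`NbspBytes`, `hex`, `misreadAsCp1252`) are made up for illustration. The last part assumes the AS400 side uses an EBCDIC code page such as IBM037, which may not match the real system and may not be installed in every Java runtime, so it is guarded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class NbspBytes {
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Formats bytes as space-separated uppercase hex, e.g. "C2 A0". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes the text as UTF-8, then decodes those bytes as windows-1252 (what an ANSI reader does). */
    static String misreadAsCp1252(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    /** Lists the code points of a string, e.g. "U+00C2 U+00A0". */
    static String codePoints(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0";

        System.out.println("windows-1252: " + hex(nbsp.getBytes(CP1252)));                // windows-1252: A0
        System.out.println("UTF-8:        " + hex(nbsp.getBytes(StandardCharsets.UTF_8))); // UTF-8:        C2 A0
        System.out.println("UTF-8 read as windows-1252: " + codePoints(misreadAsCp1252(nbsp)));

        // Assumption: the COBOL program runs under an EBCDIC code page like IBM037.
        // This only prints what Java decodes X'41' to in that code page, if it is available.
        if (Charset.isSupported("IBM037")) {
            String fromEbcdic = new String(new byte[] {0x41}, Charset.forName("IBM037"));
            System.out.println("EBCDIC X'41' decodes to: " + codePoints(fromEbcdic));
        }
    }
}
```

The second line of output is the reason Excel shows an extra character: the UTF-8 encoding adds a 0xC2 byte that an ANSI reader treats as a character of its own.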

So to produce a similar file, you have to write it with the windows-1252 charset too and use \u00A0 in your Java strings, so that when the file is written, the charset encoder turns each \u00A0 into the single byte 0xA0.

FileWriter writer = new FileWriter("output.XLS", Charset.forName("windows-1252"));
writer.write("\u00A0");
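For a fuller picture, the following is a minimal sketch of the same idea applied to a whole tab-separated file. The class name `Cp1252TsvWriter`, the method `write`, and the two sample rows are hypothetical. It uses `Files.newBufferedWriter` instead of `FileWriter`, which is a swap on my part and not what the answer prescribes.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class Cp1252TsvWriter {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Writes each row as double-quoted fields separated by tabs, one row per line, encoded as windows-1252. */
    static void write(Path out, List<String[]> rows) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(out, CP1252)) {
            for (String[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    if (i > 0) {
                        writer.write("\t");
                    }
                    writer.write("\"" + row[i] + "\"");
                }
                writer.write("\n");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data; U+00A0 is the no-break space that ends up as byte 0xA0.
        List<String[]> rows = List.of(
                new String[] {"Client", "Case Number"},
                new String[] {"\u00A0AS1", "\u00A0200893"});

        Path out = Path.of("output.XLS");
        write(out, rows);
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

One reason to consider this variant: as far as I know, `Files.newBufferedWriter` throws an exception when a character cannot be represented in the chosen charset, whereas `FileWriter` quietly writes a replacement character. Which behaviour you want depends on the data.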

huangapple
  • Posted on 2023-03-31 03:04:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75892060.html