^M characters showing in CSV file generated using opencsv.CSVWriter

Question

CSV writing code

private void writeReversalPendingCsv(List<String[]> elements) throws IOException {
    BufferedWriter writer = null;
    CSVWriter csvWriter = null;
    String fileName = null;
    // ...
    writer = new BufferedWriter(new FileWriter(filePath));
    Character sep = new Character('|');
    csvWriter = new CSVWriter(writer,
            sep,
            new Character('\0'),
            CSVWriter.DEFAULT_ESCAPE_CHARACTER,
            CSVWriter.DEFAULT_LINE_END);
    for (String[] row : elements) {
        csvWriter.writeNext(row);
    }
}

The CSV as displayed in vim:

2|value|hello^M
2|value2|hello2

CSV reading

Before the writing step, I also read a CSV of the same format.

Scanner scanner = new Scanner(file.getInputStream());
List<String[]> reversalPending = new ArrayList<>();
scanner.useDelimiter("\\n");
int totalRows = 0;
while (scanner.hasNext()) {
    // ...
    String line = scanner.next();
    String[] arr = line.split("\\|");
    // ... processing
    if (processing fails) {  // pseudocode condition
        reversalPending.add(arr);
        writeReversalPendingCsv(reversalPending);
    }
}

I process each row and, depending on some condition, write the row into a CSV file.

So, my overall logic is:

  • read the CSV file
  • process each row
  • dump the unprocessed rows into a CSV file

If I take the CSV file produced by the writing step and feed it back into the same flow, the processing works, but I get an extra ^M:

2|value|hello^M^M
2|value2|hello2

Should I prevent this, and if so, how?

Answer 1

Score: 0

In vim, ^M is how the carriage return character \r is displayed (vim shows other control characters in the same caret notation).

Windows uses \r\n as its line ending, while Unix uses just \n.

What is happening here is that your input file uses \r\n as its line ending. Because you specify only \n as the delimiter, Scanner does not consume the \r, so it ends up at the end of each string.
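To make the effect visible, here is a minimal sketch that feeds a hard-coded string to Scanner instead of a file. The class name, the readLines helper, and the sample input are all made up for illustration; only the "\n" delimiter and the split on "|" come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class TrailingCarriageReturnDemo {

    // Hypothetical helper: splits text into tokens with Scanner,
    // using the given regex as the delimiter.
    public static List<String> readLines(String text, String delimiterRegex) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                lines.add(scanner.next());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed sample: the first row ends with \r\n, the second with \n only.
        String input = "2|value|hello\r\n2|value2|hello2\n";

        for (String line : readLines(input, "\n")) {
            String[] fields = line.split("\\|");
            String last = fields[fields.length - 1];
            // Make the carriage return visible by printing it as the two characters \r.
            System.out.println(last.replace("\r", "\\r"));
        }
        // prints:
        // hello\r
        // hello2
    }
}
```

The last field of the first row still carries the \r, which is the character vim renders as ^M once the row is written out again.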

Setting \r\n as the delimiter fixes the issue. Better still, use \r?\n: the argument to useDelimiter() is a regex, and the ? after \r makes the \r optional, so this works with files written on Windows as well as files written on Unix-like systems.

scanner.useDelimiter("\r?\n");
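As a rough check of that delimiter, the sketch below reads a string with mixed line endings and confirms that no field keeps a carriage return. The class name, the readRows helper, and the sample input are assumptions for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class OptionalCarriageReturnDelimiterDemo {

    // Hypothetical helper: reads rows using "\r?\n" as the delimiter,
    // so both \r\n and \n terminate a row, then splits each row on '|'.
    public static List<String[]> readRows(String text) {
        List<String[]> rows = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter("\r?\n");
            while (scanner.hasNext()) {
                rows.add(scanner.next().split("\\|"));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample with mixed line endings.
        String input = "2|value|hello\r\n2|value2|hello2\n";

        for (String[] row : readRows(input)) {
            String rejoined = String.join("|", row);
            // Report whether any carriage return survived in this row.
            System.out.println(rejoined + " containsCR=" + rejoined.contains("\r"));
        }
    }
}
```

If Scanner is not a hard requirement, BufferedReader.readLine() is another option, since it treats \n, \r, and \r\n all as line terminators and returns the line without them.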

huangapple
  • Posted on August 13, 2020, 19:11:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63393897.html