如何使用Java从JSON数据生成CSV文件?

huangapple go评论73阅读模式
英文:

How to generated the csv file from json data with Java?

问题

以下是翻译好的内容:

我尝试从 JSON 类型的数据生成 CSV 文件。这是我的 JSON 测试数据。

{
  "realtime_start": "2020-09-25",
  "realtime_end": "2020-09-25",
  "units": "Percent",
  "series": [
    {
      "name": "James",
      "age": 29,
      "house": "CA"
    },
    {
      "name": "Jina",
      "age": 39,
      "house": "MA",
      "notes": "Million tonne punch"
    }
  ]
}

问题是 JSON 数组类型中的 "series" 并不是每个节点都包含 "notes" 键。我编写了以下 Java 代码来将此 JSON 数据转换为带有标题列的 CSV 文件。

JSONObject json = getJsonFileFromURL(...);

JSONArray docsArray = json.getJSONArray("series");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));
docsArray.put(json.get("units"));

JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());

CsvSchema.Builder csvSchemaBuilder = CsvSchema.builder();
for (JsonNode node : jsonTree) {
    node.fieldNames().forEachRemaining(fieldName -> {
        csvSchemaBuilder.addColumn(fieldName);
    });
}

CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);

但生成的结果如下所示:

realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes,realtime_start,...

生成的标题列并不包含不同的值。标题列被重复添加。如何才能生成类似下面的不同标题?

realtime_start,realtime_end,units,names,age,house,notes

有什么想法吗?

更新部分

我尝试从 FRED(FEDERAL RESERVE BANK OF ST. LOUIS)中提取数据。FRED 提供了类似以下的简单便捷的 Python API。

from fredapi import Fred 
import pandas as pd

fred = Fred(api_key='abcdefghijklmnopqrstuvwxyz0123456789')
data_unemploy = fred.search('Unemployment Rate in California')
data_unemploy.to_csv("test_unemploy.csv")

但是 Java 的 API 已经过时,所以我必须开发一个简单的 Java API,将 JSON 值转换为 CSV 文件。我通过谷歌找到了以下 Java 代码。

JSONObject json = getJsonFileFromURL("https://api.stlouisfed.org/fred/series/search?search_text=Unemployment+Rate+in+California&api_key=abcdefghijklmnopqrstuvwxyz0123456789&file_type=json");

JSONArray docsArray = json.getJSONArray("series");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));

JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
JsonNode firstObject = jsonTree.elements().next();  // 我在这一行遇到了困难
firstObject.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();

CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);

为了从 JSON 数据中提取列,JsonNode firstObject = jsonTree.elements().next(); 返回了第一个 JSON 节点。但这行代码并不返回 notes 列,因为第一行并不包含 notes 键的值。

因此,我将这行代码改成以下几行:

for (JsonNode node : jsonTree) {
    node.fieldNames().forEachRemaining(fieldName -> {
        csvSchemaBuilder.addColumn(fieldName);
    });
}

但这些行产生了我不期望的重复列,如下所示:

realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes,realtime_start,...

我在这部分完全被困住了。

英文:

I try to generate csv file from json type data. These are my json test data.

{
  "realtime_start":"2020-09-25",
  "realtime_end":"2020-09-25",,
  "units": "Percent",
  "seriess": [
    {
      "name": "James",
      "age": 29,
      "house": "CA"
    },
    {
      "name": "Jina",
      "age": 39,
      "house": "MA",
      "notes": "Million tonne punch"
    },
}

The problem is json array type "seriess" does not contain "notes" node in all every nodes.
I made the below java codes to change this json data to csv file with header columns

JSONObject json = getJsonFileFromURL(...)

JSONArray docsArray = json.getJSONArray("seriess");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));
docsArray.put(json.get("units"));

JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
		
Builder csvSchemaBuilder = CsvSchema.builder();
for(JsonNode node : jsonTree) {
	node.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
}

CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);

But the incorrect results are shown like below,

realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes, realtime_start,.....

The generated header columns does not contain distinct values. The header columns are added in duplicate. How can I generate the distinct header like below

realtime_start,realtime_end,units,names,age,house, notes

Any idea?

Update Part

I try to extract data from the FRED (FEDERAL RESERVE BANK OF ST. LOUIS). FRED provide simple and convenient Python api like below,

from fredapi import Fred 
import pandas as pd

fred = Fred(api_key='abcdefghijklmnopqrstuvwxyz0123456789')
data_unemploy = fred.search('Unemployment Rate in California')
data_unemploy.to_csv("test_unemploy.csv")

But the java apis are deprecated, so I have to develop simple Java api which convert json values to csv file. I found the below Java codes with googling

JSONObject json = getJsonFileFromURL("https://api.stlouisfed.org/fred/series/search?search_text=Unemployment+Rate+in+California&api_key=abcdefghijklmnopqrstuvwxyz0123456789&file_type=json");
		
JSONArray docsArray = json.getJSONArray("seriess");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));

JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
JsonNode firstObject = jsonTree.elements().next();  // I am struggling with this line 
firstObject.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
		
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);

To extract columns from json data JsonNode firstObject = jsonTree.elements().next(); return the first json node. But this line does not return notes column. because the first line does not contain the notes key value.

So I change this code line to following lines

for(JsonNode node : jsonTree) {
	node.fieldNames().forEachRemaining(fieldName -> {
		csvSchemaBuilder.addColumn(fieldName);
	} );
}

But these lines throws the results which I do not expect. The repeated duplicated columns like below

realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes, realtime_start,.....

I am totally stuck with this part.

答案1

得分: 1

以下是翻译好的内容:

你可以使用 Apache Commons IO 库来实现这个功能。

pom.xml

<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.6</version>
</dependency>

ConvertJsonToCSVTest.java

import java.io.File;
import org.apache.commons.io.FileUtils;
import org.json.*;
public class ConvertJsonToCSVTest {
   public static void main(String[] args) throws JSONException {
      String jsonArrayString = "{\"fileName\": [{\"first name\": \"Adam\",\"last name\": \"Smith\",\"location\": \"London\"}]}";
      JSONObject output;
      try {
         output = new JSONObject(jsonArrayString);
         JSONArray docs = output.getJSONArray("fileName");
         File file = new File("EmpDetails.csv");
         String csv = CDL.toString(docs);
         FileUtils.writeStringToFile(file, csv);
         System.out.println("Data has been Sucessfully Writeen to "+ file);
         System.out.println(csv);
      }
      catch(Exception e) {
         e.printStackTrace();
      }
   }
}

输出

数据已成功写入 EmpDetails.csv
last name,first name,location
Smith,Adam,London
英文:

You can do it with a library Apache Commons IO

pom.xml

&lt;dependency&gt;
    &lt;groupId&gt;commons-io&lt;/groupId&gt;
    &lt;artifactId&gt;commons-io&lt;/artifactId&gt;
    &lt;version&gt;2.6&lt;/version&gt;
&lt;/dependency&gt;

ConvertJsonToCSVTest.java

import java.io.File;
import org.apache.commons.io.FileUtils;
import org.json.*;
public class ConvertJsonToCSVTest {
   public static void main(String[] args) throws JSONException {
      String jsonArrayString = &quot;{\&quot;fileName\&quot;: [{\&quot;first name\&quot;: \&quot;Adam\&quot;,\&quot;last name\&quot;: \&quot;Smith\&quot;,\&quot;location\&quot;: \&quot;London\&quot;}]}&quot;;
      JSONObject output;
      try {
         output = new JSONObject(jsonArrayString);
         JSONArray docs = output.getJSONArray(&quot;fileName&quot;);
         File file = new File(&quot;EmpDetails.csv&quot;);
         String csv = CDL.toString(docs);
         FileUtils.writeStringToFile(file, csv);
         System.out.println(&quot;Data has been Sucessfully Writeen to &quot;+ file);
         System.out.println(csv);
      }
      catch(Exception e) {
         e.printStackTrace();
      }
   }
}

Output

Data has been Sucessfully Writeen to EmpDetails.csv
last name,first name,location
Smith,Adam,London

答案2

得分: 1

最简单的方法可能是编写一个类似下面的 bin 类(二进制类):

public class CsvVo {

    private String realtime_start;
    private String realtime_end;
    private String units;
    private String name;
    private String age;
    private String house;
    private String notes;

    public void setRealtime_start(String realtime_start) {
        this.realtime_start = realtime_start;
    }

    // 其他 getter 和 setter 方法
}

然后你可以编写:

public class ConvertJsonToCSVTest {
    public static void main(String[] args) throws JSONException {
        String jsonArrayString = "..."; // JSON 字符串省略
        JSONObject inJson;
        List<CsvVo> list = new ArrayList<>();
        inJson = new JSONObject(jsonArrayString);
        JSONArray inJsonSeries = inJson.getJSONArray("seriess");
        for (int i = 0, size = inJsonSeries.length(); i < size; i++) {
            CsvVo line = new CsvVo();
            line.setRealtime_start(inJson.get("realtime_start").toString());
            line.setRealtime_end(inJson.get("realtime_end").toString());
            line.setUnits(inJson.get("units").toString());
            JSONObject o = (JSONObject) inJsonSeries.get(i);
            line.setName(o.get("name").toString());
            line.setAge(o.get("age").toString());
            line.setHouse(o.get("house").toString());
            try {
                line.setNotes(o.get("notes").toString());
            } catch (JSONException e) {
                line.setNotes("");
            }
            list.add(line);
        }
        String[] cols = {"realtime_start", "realtime_end", "units", "name", "age", "house", "notes"};
        CsvUtils.csvWriterUtil(CsvVo.class, list, "in/EmpDetails.csv", cols);
    }
}

csvWriterUtil 方法如下:

public static <T> void csvWriterUtil(Class<T> beanClass, List<T> data, String outputFile, String[] columnMapping) {
    try {
        Writer writer = new BufferedWriter(new FileWriter(outputFile));
        ColumnPositionMappingStrategy<T> strategy = new ColumnPositionMappingStrategy<>();
        strategy.setType(beanClass);
        strategy.setColumnMapping(columnMapping);
        StatefulBeanToCsv<T> statefulBeanToCsv = new StatefulBeanToCsvBuilder<T>(writer)
                .withMappingStrategy(strategy)
                .build();
        writer.write(String.join(",", columnMapping) + "\n");
        statefulBeanToCsv.write(data);
        writer.close();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (CsvRequiredFieldEmptyException e) {
        e.printStackTrace();
    } catch (CsvDataTypeMismatchException e) {
        e.printStackTrace();
    }
}

完整示例可在 此处 的 Git 仓库中找到。

英文:

Most probably it is easiest to write a bin type class like below :

public class CsvVo {
private String realtime_start;
private String realtime_end;
private String units;
private String name;
private String age;
private String house;
private String notes;
public void setRealtime_start(String realtime_start) {
this.realtime_start = realtime_start;
}
//Other getters and Setters

Then you can Write :

public class ConvertJsonToCSVTest {
public static void main(String[] args) throws JSONException {
String jsonArrayString = &quot;{\n&quot; +
&quot;\t\&quot;realtime_start\&quot;: \&quot;2020-09-25\&quot;,\n&quot; +
&quot;\t\&quot;realtime_end\&quot;: \&quot;2020-09-25\&quot;,\n&quot; +
&quot;\t\&quot;units\&quot;: \&quot;Percent\&quot;,\n&quot; +
&quot;\t\&quot;seriess\&quot;: [{\n&quot; +
&quot;\t\t\t\&quot;name\&quot;: \&quot;James\&quot;,\n&quot; +
&quot;\t\t\t\&quot;age\&quot;: 29,\n&quot; +
&quot;\t\t\t\&quot;house\&quot;: \&quot;CA\&quot;\n&quot; +
&quot;\t\t},\n&quot; +
&quot;\t\t{\n&quot; +
&quot;\t\t\t\&quot;name\&quot;: \&quot;Jina\&quot;,\n&quot; +
&quot;\t\t\t\&quot;age\&quot;: 39,\n&quot; +
&quot;\t\t\t\&quot;house\&quot;: \&quot;MA\&quot;,\n&quot; +
&quot;\t\t\t\&quot;notes\&quot;: \&quot;Million tonne punch\&quot;\n&quot; +
&quot;\t\t}\n&quot; +
&quot;\t]\n&quot; +
&quot;}&quot;;
JSONObject inJson;
List&lt;CsvVo&gt; list = new ArrayList&lt;&gt;();
inJson = new JSONObject(jsonArrayString);
JSONArray inJsonSeries = inJson.getJSONArray(&quot;seriess&quot;);
for (int i = 0, size = inJsonSeries.length(); i &lt; size; i++){
CsvVo line = new CsvVo();
line.setRealtime_start(inJson.get(&quot;realtime_start&quot;).toString());
line.setRealtime_end(inJson.get(&quot;realtime_end&quot;).toString());
line.setUnits(inJson.get(&quot;units&quot;).toString());
JSONObject o = (JSONObject)inJsonSeries.get(i);
line.setName(o.get(&quot;name&quot;).toString());
line.setAge(o.get(&quot;age&quot;).toString());
line.setHouse(o.get(&quot;house&quot;).toString());
try {
line.setNotes(o.get(&quot;notes&quot;).toString());
}catch (JSONException e){
line.setNotes(&quot;&quot;);
}
list.add(line);
}
String[] cols = {&quot;realtime_start&quot;, &quot;realtime_end&quot;, &quot;units&quot;, &quot;name&quot;, &quot;age&quot;, &quot;house&quot;, &quot;notes&quot;};
CsvUtils.csvWriterUtil(CsvVo.class, list, &quot;in/EmpDetails.csv&quot;, cols);
}
}

csvWriterUtil is like below :

    public static &lt;T&gt; void csvWriterUtil(Class&lt;T&gt; beanClass, List&lt;T&gt; data, String outputFile, String[] columnMapping){
try{
Writer writer = new BufferedWriter(new FileWriter(outputFile));
ColumnPositionMappingStrategy&lt;T&gt; strategy = new ColumnPositionMappingStrategy&lt;&gt;();
strategy.setType(beanClass);
strategy.setColumnMapping(columnMapping);
StatefulBeanToCsv&lt;T&gt; statefulBeanToCsv =new StatefulBeanToCsvBuilder&lt;T&gt;(writer)
.withMappingStrategy(strategy)
.build();
writer.write(String.join(&quot;,&quot;,columnMapping)+&quot;\n&quot;);
statefulBeanToCsv.write(data);
writer.close();
} catch (IOException e) {
e.printStackTrace();
} catch (CsvRequiredFieldEmptyException e) {
e.printStackTrace();
} catch (CsvDataTypeMismatchException e) {
e.printStackTrace();
}
}

Full example is available in <a href="https://github.com/soumya-kole/JavaUtils/blob/master/CsvUtil/src/main/java/com/soumya/github/app/ConvertJsonToCSV.java">GitRepo</a>

huangapple
  • 本文由 发表于 2020年9月26日 14:35:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/64074637.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定