How to convert an unnormalized CSV file to a complex JSON or Java object

Question

I have the following unnormalized CSV file:

user_id,nickname,joinDate,product_id,price
1,kmh,2023-07-24,P131,3000
1,kmh,2023-07-24,P132,4000
1,kmh,2023-07-24,P133,7000
1,kmh,2023-07-24,P134,9000
2,john,2023-07-24,P135,2500
2,john,2023-07-24,P136,6000
3,alice,2023-07-25,P137,4500
3,alice,2023-07-25,P138,8000

I want to convert it to the following JSON format (or an equivalent Java object):

[
    {
        "user_id": 1,
        "nickname": "kmh",
        "joinDate": "2023-07-24",
        "orders": [
            {
                "product_id": "P131",
                "price": 3000
            },
            {
                "product_id": "P132",
                "price": 4000
            },
            {
                "product_id": "P133",
                "price": 7000
            },
            {
                "product_id": "P134",
                "price": 9000
            }
        ]
    },
    {
        "user_id": 2,
        "nickname": "john",
        "joinDate": "2023-07-24",
        "orders": [
            {
                "product_id": "P135",
                "price": 2500
            },
            {
                "product_id": "P136",
                "price": 6000
            }
        ]
    },
    {
        "user_id": 3,
        "nickname": "alice",
        "joinDate": "2023-07-25",
        "orders": [
            {
                "product_id": "P137",
                "price": 4500
            },
            {
                "product_id": "P138",
                "price": 8000
            }
        ]
    }
]

I've been searching for quite a long time and haven't found a library or tool that does this.

I have many different kinds of CSV files, so I need a tool or library that can convert all of them. Are there any libraries or tools that make this possible?

Answer 1

Score: 0

All you need is a way to parse the CSV to a Java object. You can do this manually, or by using an existing library.

For example, you can use Jackson with the CSV data format like this:

import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import java.io.IOException;
import java.time.LocalDate;

class MyRecord {
    @JsonProperty("user_id")
    private int userId;
    private String nickname;
    private LocalDate joinDate;
    @JsonProperty("product_id")
    private String productId;
    private int price; // the CSV has a price column and the printed output includes it

    // getters and setters
    // a meaningful toString method
}

public class Main {
    public static void main(String[] args) throws IOException {
        String csv = "user_id,nickname,joinDate,product_id,price\n" +
                "1,kmh,2023-07-24,P131,3000\n" +
                "1,kmh,2023-07-24,P132,4000\n" +
                "1,kmh,2023-07-24,P133,7000\n" +
                "1,kmh,2023-07-24,P134,9000\n" +
                "2,john,2023-07-24,P135,2500\n" +
                "2,john,2023-07-24,P136,6000\n" +
                "3,alice,2023-07-25,P137,4500\n" +
                "3,alice,2023-07-25,P138,8000";
        CsvSchema schema = CsvSchema.emptySchema().withHeader(); // uses CSV header to read the schema
        ObjectMapper mapper = new CsvMapper().registerModule(new JavaTimeModule()); // to deserialise Java 8 LocalDate
        MappingIterator<MyRecord> resultIterator = mapper.readerFor(MyRecord.class).with(schema).readValues(csv);
        while (resultIterator.hasNext()) {
            System.out.println(resultIterator.next());
        }
        resultIterator.close();
    }
}

Which prints:

MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P131', price=3000]
MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P132', price=4000]
MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P133', price=7000]
MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P134', price=9000]
MyRecord[userId=2, nickname='john', joinDate=2023-07-24, productId='P135', price=2500]
MyRecord[userId=2, nickname='john', joinDate=2023-07-24, productId='P136', price=6000]
MyRecord[userId=3, nickname='alice', joinDate=2023-07-25, productId='P137', price=4500]
MyRecord[userId=3, nickname='alice', joinDate=2023-07-25, productId='P138', price=8000]
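
The records printed above are still flat, one per CSV row, so a grouping step is needed to reach the nested shape from the question. A minimal sketch of that step using only the standard library follows; the class names FlatRow, Order, User and GroupRecords are hypothetical stand-ins (FlatRow mirrors the fields of MyRecord), and it assumes user_id alone identifies a user.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupRecords {

    // Hypothetical flat row with the same fields as MyRecord.
    static final class FlatRow {
        final int userId;
        final String nickname;
        final LocalDate joinDate;
        final String productId;
        final int price;

        FlatRow(int userId, String nickname, LocalDate joinDate, String productId, int price) {
            this.userId = userId;
            this.nickname = nickname;
            this.joinDate = joinDate;
            this.productId = productId;
            this.price = price;
        }
    }

    // Hypothetical child object: one entry of the "orders" array.
    static final class Order {
        final String productId;
        final int price;

        Order(String productId, int price) {
            this.productId = productId;
            this.price = price;
        }
    }

    // Hypothetical parent object: one user with all of its orders.
    static final class User {
        final int userId;
        final String nickname;
        final LocalDate joinDate;
        final List<Order> orders = new ArrayList<>();

        User(int userId, String nickname, LocalDate joinDate) {
            this.userId = userId;
            this.nickname = nickname;
            this.joinDate = joinDate;
        }
    }

    // Groups rows by userId; LinkedHashMap keeps users in first-seen order.
    static List<User> group(List<FlatRow> rows) {
        Map<Integer, User> byId = new LinkedHashMap<>();
        for (FlatRow r : rows) {
            User user = byId.computeIfAbsent(r.userId, id -> new User(id, r.nickname, r.joinDate));
            user.orders.add(new Order(r.productId, r.price));
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<FlatRow> rows = List.of(
                new FlatRow(1, "kmh", LocalDate.parse("2023-07-24"), "P131", 3000),
                new FlatRow(1, "kmh", LocalDate.parse("2023-07-24"), "P132", 4000),
                new FlatRow(2, "john", LocalDate.parse("2023-07-24"), "P135", 2500));
        for (User u : group(rows)) {
            System.out.println(u.userId + " " + u.nickname + " orders=" + u.orders.size());
        }
    }
}
```

A list of User objects like this can then be handed to a regular Jackson ObjectMapper to produce the nested JSON, given getters or suitable annotations on the classes.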

Answer 2

Score: 0

In your case, you can deserialize the CSV directly into a JsonNode without creating a POJO class, then use the JSON library Josson to transform it with the group() function.

String csv = "user_id,nickname,joinDate,product_id,price\n" +
        "1,kmh,2023-07-24,P131,3000\n" +
        "1,kmh,2023-07-24,P132,4000\n" +
        "1,kmh,2023-07-24,P133,7000\n" +
        "1,kmh,2023-07-24,P134,9000\n" +
        "2,john,2023-07-24,P135,2500\n" +
        "2,john,2023-07-24,P136,6000\n" +
        "3,alice,2023-07-25,P137,4500\n" +
        "3,alice,2023-07-25,P138,8000";
ArrayNode arrayNode = Josson.createArrayNode();
CsvSchema schema = CsvSchema.emptySchema().withHeader();
try (MappingIterator<JsonNode> it = new CsvMapper().readerFor(JsonNode.class).with(schema).readValues(csv)) {
    arrayNode.addAll(it.readAll());
}
Josson josson = Josson.create(arrayNode);
JsonNode grouped = josson.getNode(
        "group(map(user_id, nickname, joinDate), map(product_id, price))" +
        ".map(**:key, orders:elements)");
System.out.println(grouped.toPrettyString());

Function group()

  1. Groups rows by a "key" object of {user_id, nickname, joinDate}
  2. Collects {product_id, price} of each row into "elements"

Function map()

  1. Extracts the values inside the "key" object
  2. Adds the field "orders", copied from "elements"
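
For readers who want to see what these two steps amount to without a query library, here is a rough plain-Java sketch of the same idea. It is not how Josson is implemented; NestRows, parse and nest are hypothetical names, and parse is a deliberately naive splitter that assumes no quoted fields or embedded commas. Because the key columns, child columns and child field name are parameters, the same method can be reused for CSV files with other layouts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NestRows {

    // Naive CSV reader: first line is the header; assumes no quoting or escaped commas.
    static List<Map<String, String>> parse(String csv) {
        String[] lines = csv.strip().split("\\R");
        String[] header = lines[0].split(",");
        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int j = 0; j < header.length; j++) {
                row.put(header[j], j < cells.length ? cells[j] : "");
            }
            rows.add(row);
        }
        return rows;
    }

    // Groups rows by keyCols; the childCols of each row are appended to a list stored under childName.
    static List<Map<String, Object>> nest(List<Map<String, String>> rows,
                                          List<String> keyCols,
                                          List<String> childCols,
                                          String childName) {
        Map<List<String>, Map<String, Object>> groups = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            // Step 1 of group(): build the composite key from the key columns.
            List<String> key = new ArrayList<>();
            for (String col : keyCols) {
                key.add(row.get(col));
            }
            Map<String, Object> parent = groups.computeIfAbsent(key, k -> {
                Map<String, Object> p = new LinkedHashMap<>();
                for (String col : keyCols) {
                    p.put(col, row.get(col));
                }
                p.put(childName, new ArrayList<Map<String, String>>());
                return p;
            });
            // Step 2 of group(): collect the remaining columns as one child element.
            Map<String, String> child = new LinkedHashMap<>();
            for (String col : childCols) {
                child.put(col, row.get(col));
            }
            @SuppressWarnings("unchecked")
            List<Map<String, String>> children = (List<Map<String, String>>) parent.get(childName);
            children.add(child);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        String csv = "user_id,nickname,joinDate,product_id,price\n"
                + "1,kmh,2023-07-24,P131,3000\n"
                + "1,kmh,2023-07-24,P132,4000\n"
                + "2,john,2023-07-24,P135,2500";
        List<Map<String, Object>> nested = nest(parse(csv),
                List.of("user_id", "nickname", "joinDate"),
                List.of("product_id", "price"),
                "orders");
        nested.forEach(System.out::println);
    }
}
```

All values stay strings in this sketch, just as in the Josson output; typed columns would need an extra conversion step.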

Output

[ {
  "user_id" : "1",
  "nickname" : "kmh",
  "joinDate" : "2023-07-24",
  "orders" : [ {
    "product_id" : "P131",
    "price" : "3000"
  }, {
    "product_id" : "P132",
    "price" : "4000"
  }, {
    "product_id" : "P133",
    "price" : "7000"
  }, {
    "product_id" : "P134",
    "price" : "9000"
  } ]
}, {
  "user_id" : "2",
  "nickname" : "john",
  "joinDate" : "2023-07-24",
  "orders" : [ {
    "product_id" : "P135",
    "price" : "2500"
  }, {
    "product_id" : "P136",
    "price" : "6000"
  } ]
}, {
  "user_id" : "3",
  "nickname" : "alice",
  "joinDate" : "2023-07-25",
  "orders" : [ {
    "product_id" : "P137",
    "price" : "4500"
  }, {
    "product_id" : "P138",
    "price" : "8000"
  } ]
} ]
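
Note that user_id and price come out as strings here, whereas the JSON in the question has them as numbers. One possible post-processing step, sketched below with only the standard library, is to convert values that look numeric. CoerceValues and coerce are hypothetical names, and the rule used (plain digits become Long, digits with a decimal point become Double) is an assumption that will not suit every file; for instance, an identifier such as "007" would lose its leading zeros.

```java
import java.util.regex.Pattern;

final class CoerceValues {

    // Up to 18 digits always fits in a long, so Long.valueOf cannot overflow here.
    private static final Pattern INTEGER = Pattern.compile("-?\\d{1,18}");
    private static final Pattern DECIMAL = Pattern.compile("-?\\d+\\.\\d+");

    // Returns a Long or Double when the text looks numeric, otherwise the text unchanged.
    static Object coerce(String text) {
        if (text == null) {
            return null;
        }
        if (INTEGER.matcher(text).matches()) {
            return Long.valueOf(text);
        }
        if (DECIMAL.matcher(text).matches()) {
            return Double.valueOf(text);
        }
        return text;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3000", "12.5", "P131", "2023-07-24"}) {
            Object value = coerce(s);
            System.out.println(s + " -> " + value.getClass().getSimpleName());
        }
    }
}
```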

huangapple
  • Posted on July 24, 2023 at 19:45:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76754157.html