How to process object with map for CSV output

Question

I have a set of the objects below, which I need to write to CSV:

public class OutputObject {
    private String userId;
    private Map<String, Object> behaviour;
}

Each object in the set can have a map with two, three, or four entries.

[
OutputObject1 [userId=11, behaviours={color=white, size=S, owner=Mr. A}], 
OutputObject2 [userId=22, behaviours={color=black, isNew=true}],
OutputObject3 [userId=33, behaviours={color=green, size=L}]
]

Required CSV output:

userId, color, size, owner, isNew
11,     white,  S,   Mr. A,
22,     black,   ,        , true
33,     green,  L,        ,

I started with the snippet below:

     // Set<OutputObject> outputObjectSet already received.
     JSONArray jsonArrayObject = new JSONArray(outputObjectSet);
     String csvValue = CDL.toString(jsonArrayObject);
     FileWriter fileWriter = new FileWriter(fileObject, true);
     fileWriter.write(csvValue);
     fileWriter.close();

But this creates a two-column CSV with userId and behaviours, where the behaviours column contains each object's whole map. How can I achieve the output shown above?

Since the set may contain a huge number of such objects, how can this be done efficiently?

Answer 1

Score: 1

This should work quite efficiently, even with a large number of objects, since Java's HashMap and Arrays.sort() implementations are fast. Note that this implementation relies on the Apache Commons Text library to escape the contents.

import java.io.PrintStream;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.StringJoiner;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.commons.text.StringEscapeUtils;

// Assumes OutputObject exposes getUserId() and getBehaviour() getters.
private static void outputCSV(List<OutputObject> objects, PrintStream output) {
    AtomicInteger highestBehaviourIndex = new AtomicInteger();

    HashMap<String, Integer> behaviourIndexMap = new HashMap<>();

    // Give every behaviour a column index, in the order it is first encountered
    for (OutputObject object : objects) {
        object.getBehaviour().forEach((name, value) -> behaviourIndexMap.computeIfAbsent(name, (unused) -> highestBehaviourIndex.getAndIncrement()));
    }

    // Invert the map into an array of column names ordered by index
    String[] behaviours = new String[highestBehaviourIndex.get()];

    behaviourIndexMap.forEach((name, index) -> {
        behaviours[index] = name;
    });

    output.println("userId, " + String.join(", ", behaviours));

    // Sort by ID. userId is a String in the question, so this orders lexicographically;
    // parse it to an int first if numeric order is needed.
    objects.sort(Comparator.comparing(OutputObject::getUserId));

    for (OutputObject object : objects) {
        // Print one line per object, leaving missing behaviours empty
        StringJoiner joiner = new StringJoiner(", ");

        for (String behaviour : behaviours) {
            joiner.add(StringEscapeUtils.escapeCsv(object.getBehaviour().getOrDefault(behaviour, "").toString()));
        }

        output.println(object.getUserId() + ", " + joiner.toString());
    }
}
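
The two-pass idea in this answer (first collect every column name, then print one row per object) can also be sketched with the JDK alone. The block below is only an illustration under a few assumptions: Row is a hypothetical stand-in for the question's OutputObject, mapOf and toCsv are helper names made up for the demo, and escape is a minimal RFC 4180-style quoting routine used in place of the Commons Text call.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class BehaviourCsvSketch {

    // Hypothetical stand-in for the question's OutputObject.
    static final class Row {
        final String userId;
        final Map<String, Object> behaviour;

        Row(String userId, Map<String, Object> behaviour) {
            this.userId = userId;
            this.behaviour = behaviour;
        }
    }

    // Demo helper (not part of the question): builds an insertion-ordered map
    // from alternating key/value arguments.
    static Map<String, Object> mapOf(Object... keyValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    // Minimal quoting: wrap the field in double quotes and double any embedded
    // quotes, but only when the field contains a comma, quote, or line break.
    static String escape(Object value) {
        if (value == null) {
            return "";
        }
        String s = value.toString();
        boolean needsQuotes = s.contains(",") || s.contains("\"")
                || s.contains("\n") || s.contains("\r");
        return needsQuotes ? "\"" + s.replace("\"", "\"\"") + "\"" : s;
    }

    static void write(List<Row> rows, Writer out) throws IOException {
        // Pass 1: collect column names in the order they are first seen.
        Set<String> columns = new LinkedHashSet<>();
        for (Row row : rows) {
            columns.addAll(row.behaviour.keySet());
        }

        // Header line.
        StringBuilder line = new StringBuilder("userId");
        for (String column : columns) {
            line.append(',').append(escape(column));
        }
        out.write(line.append('\n').toString());

        // Pass 2: one line per row; a missing behaviour becomes an empty field.
        for (Row row : rows) {
            line.setLength(0);
            line.append(escape(row.userId));
            for (String column : columns) {
                line.append(',').append(escape(row.behaviour.get(column)));
            }
            out.write(line.append('\n').toString());
        }
    }

    // Convenience wrapper for small inputs and tests.
    static String toCsv(List<Row> rows) {
        StringWriter out = new StringWriter();
        try {
            write(rows, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("11", mapOf("color", "white", "size", "S", "owner", "Mr. A")),
                new Row("22", mapOf("color", "black", "isNew", true)),
                new Row("33", mapOf("color", "green", "size", "L")));
        System.out.print(toCsv(rows));
    }
}
```

For a large set, passing a BufferedWriter around a FileWriter to write would keep only one line in memory at a time. The column order here depends on the iteration order of each behaviour map, so it is stable only if those maps are ordered.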

Answer 2

Score: 1

Using JSONArray seems redundant here. You could implement a helper method that serializes an OutputObject into a CSV line, using a fixed field list so that the column order is maintained:

import java.util.Arrays;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CSVSerializer {
    public static String transform(OutputObject obj) {
        // Fixed column order; a behaviour missing from the map becomes an empty field
        String[] fields = {"color", "size", "owner", "isNew"};
        return Stream.concat(
                Stream.of(obj.getUserId()),
                Arrays.stream(fields)
                      .map(f -> obj.getBehaviour().get(f))
                      .map(v -> v == null ? "" : v.toString())
               )
               .collect(Collectors.joining(","));
    }
}

String csv = outputObjectSet.stream()
                            .map(CSVSerializer::transform)
                            .collect(Collectors.joining("\n"));
// print csv contents
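
Because the question mentions a very large set, joining every line into one String may use more memory than necessary. The following is a rough sketch, not the answerer's code, of the same fixed-column idea with a header row and line-by-line output to a Writer. Row is again a hypothetical stand-in for OutputObject, and header, toLine, and writeAll are illustrative names. Values are written as-is, so fields containing commas or quotes would still need escaping.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FixedColumnCsvSketch {

    // Column order is decided up front, as in the answer above.
    static final List<String> FIELDS = Arrays.asList("color", "size", "owner", "isNew");

    // Hypothetical stand-in for the question's OutputObject.
    static final class Row {
        final String userId;
        final Map<String, Object> behaviour;

        Row(String userId, Map<String, Object> behaviour) {
            this.userId = userId;
            this.behaviour = behaviour;
        }
    }

    static String header() {
        return "userId," + String.join(",", FIELDS);
    }

    // One CSV line: the user ID followed by each field, empty when absent.
    static String toLine(Row row) {
        return Stream.concat(
                        Stream.of(row.userId),
                        FIELDS.stream().map(f -> Objects.toString(row.behaviour.get(f), "")))
                .collect(Collectors.joining(","));
    }

    // Writes the header and then one line per row, without building the
    // whole document in memory first.
    static void writeAll(Iterable<Row> rows, Writer out) throws IOException {
        out.write(header());
        out.write('\n');
        for (Row row : rows) {
            out.write(toLine(row));
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("color", "white");
        first.put("size", "S");
        first.put("owner", "Mr. A");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("color", "black");
        second.put("isNew", true);

        StringWriter out = new StringWriter();
        writeAll(Arrays.asList(new Row("11", first), new Row("22", second)), out);
        System.out.print(out);
    }
}
```

In a real program the StringWriter would be replaced by something like a BufferedWriter over the target file.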

huangapple
  • Posted on October 23, 2020, 05:53:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64491039.html