How to convert WrappedArray to String using Spark / JAVA
Question
I have the following DataFrame:
+--------------------+
| column |
+--------------------+
| [99896, 10, ] |
|[50, 30, 40, ] |
+--------------------+
The schema of the column is:
 |-- column: array (nullable = true)
 |    |-- element: string (containsNull = true)
When I execute the following code:
for (Iterator<Row> iter = dataframe.toLocalIterator(); iter.hasNext();) {
    String item = (iter.next()).get(0).toString();
    System.out.println(item);
}
I get the following output:
WrappedArray(99896, 10, )
WrappedArray(50, 30, 40, )
How can I convert this output into a single string like this:
[99896, 10, 50, 30, 40]
I need your help.
Thank you.
Answer 1
Score: 2
Basically, you are looping through each row, getting the WrappedArray for that row, and calling WrappedArray's toString() method. Instead of calling toString(), loop over that WrappedArray and print each value in it.
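The element-by-element loop described here can be sketched in plain Java. This is a minimal illustration, not the answerer's code: the class name FlattenRows and the method flatten are hypothetical, and nested java.util.List values stand in for the DataFrame rows so the example runs without Spark. In a real job each inner list would presumably come from Row.getList(0), which exposes the array column as a java.util.List. Skipping blank strings as well as nulls is an assumption about what the trailing empty element in the question contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FlattenRows {
    // Walks every row, then every element of that row, keeping non-blank values.
    static String flatten(List<List<String>> rows) {
        List<String> values = new ArrayList<>();
        for (List<String> row : rows) {
            for (String value : row) {
                // Assumption: the trailing element may be null or an empty string.
                if (value != null && !value.trim().isEmpty()) {
                    values.add(value);
                }
            }
        }
        // List.toString() renders the values as "[a, b, c]".
        return values.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the two DataFrame rows shown in the question.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("99896", "10", null),
                Arrays.asList("50", "30", "40", ""));
        System.out.println(flatten(rows)); // [99896, 10, 50, 30, 40]
    }
}
```

Collecting into a List first and relying on its toString() gives the bracketed, comma-and-space format the question asks for without any manual separator handling.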
Answer 2
Score: 2
Try this.
Load the test data provided:
Dataset<Row> df = spark.sql("select column from values array(99896, 10, null), array(50, 30, 40, null) T(column)");
df.show(false);
df.printSchema();
/**
* +-------------+
* |column |
* +-------------+
* |[99896, 10,] |
* |[50, 30, 40,]|
* +-------------+
*
* root
* |-- column: array (nullable = false)
* | |-- element: integer (containsNull = true)
*/
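Option 1: iterate over the rows on the driver and join the non-null elements
// Requires java.util.Objects, java.util.stream.Collectors and org.apache.spark.sql.Row
StringBuilder sb = new StringBuilder();
sb.append("[");
for (java.util.Iterator<Row> iter = df.toLocalIterator(); iter.hasNext();) {
    // getList(0) exposes the array column as a java.util.List
    String item = (iter.next()).getList(0).stream()
            .filter(Objects::nonNull)   // drop the null elements
            .map(String::valueOf)
            .collect(Collectors.joining(","));
    sb.append(item).append(",");
}
// Swap the trailing comma for the closing bracket
int i = sb.lastIndexOf(",");
sb.replace(i, i + 1, "]");
System.out.println(sb);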
/**
* [99896,10,50,30,40]
*/
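One caveat with swapping the trailing comma: if the dataset has no rows, no comma is ever appended, lastIndexOf returns -1, and the replace call throws. A possible variant that avoids this, sketched under the assumption that the rows are already available as Java lists (in Spark they would come from Row.getList(0)), lets Collectors.joining add the brackets. The names JoinRows and join are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class JoinRows {
    // Flattens all rows into one stream, drops nulls, and joins with brackets.
    static String join(List<List<Integer>> rows) {
        return rows.stream()
                .flatMap(List::stream)
                .filter(Objects::nonNull)
                .map(String::valueOf)
                // joining(delimiter, prefix, suffix) adds "[" and "]" itself,
                // so empty input simply yields "[]".
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        // Stand-in for the test data loaded above.
        List<List<Integer>> rows = Arrays.asList(
                Arrays.asList(99896, 10, null),
                Arrays.asList(50, 30, 40, null));
        System.out.println(join(rows));                                  // [99896,10,50,30,40]
        System.out.println(join(Collections.<List<Integer>>emptyList())); // []
    }
}
```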
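Option 2: build the string inside Spark
// expr comes from org.apache.spark.sql.functions
// collect_list is an aggregate, so the result is a single row holding the joined string
Dataset<Row> p = df.withColumn("column",
        expr("concat('[', concat_ws(',', collect_list(concat_ws(',', column))), ']')"));
for (java.util.Iterator<Row> iter = p.toLocalIterator(); iter.hasNext();) {
    String item = (iter.next()).get(0).toString();
    System.out.println(item);
}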
/**
* [99896,10,50,30,40]
*/