Extract Column value into integer
Question
I have a function that looks like this:
static Column getFormattedData(Column name, Column surname) {
    return concat(
        lit("NAME_"),
        lpad(name, greatest(length(name), lit(8)), "0"),
        lpad(surname, greatest(length(surname), lit(8)), "0"));
}
The problem is at the lpad() step: greatest() returns a Column, while lpad() expects its second parameter to be an Integer.
Is there a way to extract the value of a column as an integer?
Alternatively, is there a better way to format the two columns/their values?
Edit:
As requested, sample input:
e.g., name: Joe and surname: Thomas
Output: NAME_00000JOE_00THOMAS
(both first and last names are padded to 8 characters)
e.g., name: Leonardo and surname: DaCaprio
Output: NAME_LEONARDO_DaCaprio
(both name and surname are >= 8 letters, so no padding is needed).
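The rule described by these samples (pad with "0" up to 8 characters, leave longer values alone) can be pinned down outside Spark as a small plain-Java reference. This is only a sketch of the intended output, not Spark code: the class name PadSpec and the methods padToAtLeast and format are hypothetical, and the upper-casing and the "_" separator are assumptions taken from the first sample output (the second sample leaves DaCaprio in mixed case).

```java
import java.util.Locale;

// Plain-Java reference for the expected formatting (hypothetical names, no Spark involved).
final class PadSpec {
    // Left-pads s with '0' until it is at least minWidth characters long.
    // Values that are already minWidth or longer are returned unchanged.
    static String padToAtLeast(String s, int minWidth) {
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < minWidth; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    // Assumption: both parts are upper-cased and joined with "_".
    static String format(String name, String surname) {
        return "NAME_" + padToAtLeast(name.toUpperCase(Locale.ROOT), 8)
                + "_" + padToAtLeast(surname.toUpperCase(Locale.ROOT), 8);
    }

    public static void main(String[] args) {
        System.out.println(format("Joe", "Thomas"));        // NAME_00000JOE_00THOMAS
        System.out.println(format("Leonardo", "DaCaprio")); // NAME_LEONARDO_DACAPRIO
    }
}
```

A reference like this can be handy as an oracle when unit-testing whatever Column expression ends up being used.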
Answer 1
Score: 1
Please check the code below. It passes the literal 8 as the length, so no Column-to-integer conversion is needed.
scala> df.show(false)
+--------+--------+
|name    |surname |
+--------+--------+
|Joe     |Thomas  |
|Leonardo|DaCaprio|
+--------+--------+
scala> :paste
// Entering paste mode (ctrl-D to finish)
df
  .select(
    concat(
      lit("NAME_"),
      lpad(upper($"name"), 8, "0"),
      lit("_"),
      lpad(upper($"surname"), 8, "0")
    ).as("output")
  )
  .show(false)
// Exiting paste mode, now interpreting.
+----------------------+
|output                |
+----------------------+
|NAME_00000JOE_00THOMAS|
|NAME_LEONARDO_DACAPRIO|
+----------------------+
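One caveat with a fixed length of 8: Spark documents that lpad shortens values longer than len to len characters, so the sample rows work only because both long values are exactly 8 letters. The plain-Java model below mimics that documented behaviour next to the "at least 8" behaviour the question asks for. It is a sketch, not Spark's implementation; LpadModel, lpadFixed and lpadAtLeast are hypothetical names, and a single-character pad and a non-negative length are assumed.

```java
// Plain-Java model of the two padding behaviours (hypothetical names, no Spark involved).
final class LpadModel {
    // Models the documented lpad(str, len, pad) behaviour for a one-character pad:
    // shorter values are left-padded, longer values are cut down to len characters.
    // Assumes len >= 0.
    static String lpadFixed(String s, int len, char pad) {
        if (s.length() >= len) {
            return s.substring(0, len);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < len; i++) {
            sb.append(pad);
        }
        return sb.append(s).toString();
    }

    // Models the question's intent, greatest(length(s), minLen): never truncates.
    static String lpadAtLeast(String s, int minLen, char pad) {
        return lpadFixed(s, Math.max(s.length(), minLen), pad);
    }

    public static void main(String[] args) {
        System.out.println(lpadFixed("CHRISTOPHER", 8, '0'));   // CHRISTOP
        System.out.println(lpadAtLeast("CHRISTOPHER", 8, '0')); // CHRISTOPHER
        System.out.println(lpadAtLeast("JOE", 8, '0'));         // 00000JOE
    }
}
```

Within Spark itself, one commonly used way to get the second behaviour is to write the call as a SQL expression, for example expr("lpad(name, greatest(length(name), 8), '0')"), where the length argument may be any expression. That goes beyond the original answer and is worth verifying against your Spark version.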