Extract a Column's value as an integer

Question

I have a function that looks like this:

static Column getFormattedData(Column name, Column surname) {
    return concat(
            lit("NAME_"),
            lpad(name, greatest(length(name), lit(8)), "0"),
            lpad(surname, greatest(length(surname), lit(8)), "0"));
}

I run into a problem at the lpad() step: greatest() returns a Column, while lpad() expects its second parameter to be an int.

Is there a way to extract the value of a column as an integer?

Alternatively, is there a better way to format the two columns and their values?

Edit:
Sample input, as requested:

e.g.,
name: Joe and surname: Thomas

Output: NAME_00000JOE_00THOMAS

(both the name and the surname are padded to 8 characters)

e.g.,
name: Leonardo and surname: DaCaprio

Output: NAME_LEONARDO_DaCaprio

(both the name and the surname are at least 8 letters long, so no padding is needed).

Answer 1

Score: 1

Please check the code below:

scala> df.show(false)
+--------+--------+
|name    |surname |
+--------+--------+
|Joe     |Thomas  |
|Leonardo|DaCaprio|
+--------+--------+


scala> :paste
// Entering paste mode (ctrl-D to finish)

df
.select(
    concat(
        lit("NAME_"),
        lpad(upper($"name"), 8, "0"),
        lit("_"),
        lpad(upper($"surname"), 8, "0")
    ).as("output")
)
.show(false)

// Exiting paste mode, now interpreting.

+----------------------+
|output                |
+----------------------+
|NAME_00000JOE_00THOMAS|
|NAME_LEONARDO_DACAPRIO|
+----------------------+
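
One caveat worth checking before relying on a fixed length of 8: Spark's lpad shortens a value that is longer than the target length, so a 9-letter name would be cut to 8 characters. The question's greatest(length(name), lit(8)) was presumably meant to avoid exactly that. The plain-Java sketch below uses no Spark at all; the class and method names (NamePadding, padToAtLeast, formatName) are hypothetical and only spell out the intended rule: pad to at least 8 characters and never truncate. Uppercasing both parts follows the answer's output and is an assumption, since the question's second example keeps "DaCaprio" in mixed case.

```java
import java.util.Locale;

public class NamePadding {

    // Left-pads value with '0' until it is at least minWidth characters long.
    // Values that are already long enough are returned unchanged (no truncation).
    public static String padToAtLeast(String value, int minWidth) {
        StringBuilder sb = new StringBuilder();
        for (int i = value.length(); i < minWidth; i++) {
            sb.append('0');
        }
        return sb.append(value).toString();
    }

    // Builds NAME_<name>_<surname>, with both parts uppercased (assumption)
    // and padded to at least 8 characters.
    public static String formatName(String name, String surname) {
        return "NAME_"
                + padToAtLeast(name.toUpperCase(Locale.ROOT), 8)
                + "_"
                + padToAtLeast(surname.toUpperCase(Locale.ROOT), 8);
    }

    public static void main(String[] args) {
        System.out.println(formatName("Joe", "Thomas"));       // NAME_00000JOE_00THOMAS
        System.out.println(formatName("Leonardo", "DaCaprio")); // NAME_LEONARDO_DACAPRIO
        System.out.println(formatName("Christopher", "Li"));    // NAME_CHRISTOPHER_000000LI
    }
}
```

A helper like this could serve as a reference when checking results, or as the body of a UDF. Within Spark itself, it may be simpler to write the rule as a SQL expression string, because the SQL form of lpad accepts an expression for the length, for example expr("lpad(upper(name), greatest(length(name), 8), '0')"). Treat that as a suggestion to verify against your Spark version rather than a tested solution.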

huangapple
  • Posted on July 17, 2023 at 20:10:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76704334.html