How to get the index/position of a column in a DataFrame (Spark SQL, Java)

Question

I am using Spark with Java (not Scala or Python).

I have to change my code so that my Spark query selects all columns rather than a specific set of columns (like using select *). Previously, when I selected a specific set of columns, it was easy to know the exact position/index of each column because they came back in the order of my select. Now that I am selecting all columns, I no longer know the exact order.

I need the position/index of particular columns so that I can call .isNullAt(), which requires a position/index rather than the string column name.

Does dataframe.columns() give me an array whose indices match the positions expected by the methods that require an index/position? If so, can I search that array by column name to get the correct index?

Answer 1

Score: 0

From your question, I'm guessing you want the index of a field in a row so that you can check whether it is null.

You can indeed use ds.columns(): it returns the column names in order, so the position of a name in that array is the index you need.
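
As a rough illustration of that lookup, here is a minimal sketch in plain Java. It assumes you already hold the String[] returned by ds.columns(); the class name, the helper indexOfColumn, and the column names are hypothetical and only serve the example. The match is an exact, case-sensitive string comparison.

```java
import java.util.Arrays;

final class ColumnIndexLookup {

    /**
     * Returns the zero-based position of columnName in columns,
     * or -1 if no element equals columnName exactly.
     */
    static int indexOfColumn(String[] columns, String columnName) {
        return Arrays.asList(columns).indexOf(columnName);
    }

    public static void main(String[] args) {
        // Stand-in for the array returned by ds.columns(); names are made up.
        String[] columns = {"id", "name", "email"};

        int emailIndex = indexOfColumn(columns, "email");
        System.out.println("email is at position " + emailIndex);

        // A missing column yields -1, so check the result before
        // passing it on to a method such as row.isNullAt(...).
        int phoneIndex = indexOfColumn(columns, "phone");
        System.out.println("phone is at position " + phoneIndex);
    }
}
```

Because a name that is not found gives -1, it is worth validating the result once, up front, rather than discovering the problem while processing rows.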

Nevertheless, I would advise using another method, because it keeps the logic inside the row processing and is more robust. Row provides .fieldIndex(String fieldName), which returns the index of the named field:

row.isNullAt(row.fieldIndex("my_column_name"))

See the Row documentation for more:
https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/Row.html#fieldIndex(java.lang.String)

huangapple
  • Published on July 11, 2023, 04:54:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76657279.html