Index element in column of arrays

Question

I have a dataframe, where one of the columns contains 3d arrays. I want to get a vector containing a single value from each array. For example:

using DataFrames

# create some data: a 2×2×2 array stored in every row of var3
ar = cat([1 2; 3 4], [5 6; 7 8], dims=3)
df = DataFrame(var1 = Int64[1,2,3], var2 = Int64[2,3,4], var3 = Array{Int64, 3}[ar,ar,ar])

Now I want to get the element at index [2,1,1] from each array. I can do this manually, e.g.:

df[:,3][1][2,1,1]
df[:,3][2][2,1,1]
df[:,3][3][2,1,1]

But I want to do it in a single step. I thought the following would work:

df[:,3][1:3][2,1,1]

But that returns an array. The desired output would be a vector containing 3 3 3. Any advice would be much appreciated.

Answer 1

Score: 2

You can use the getindex function along with a comprehension:

vector = [getindex(df[:,3][i], 2, 1, 1) for i in 1:size(df, 1)]
print(vector)

This code iterates over the rows of the specified column (df[:,3]), retrieves the desired element from each array using getindex, and collects the results in a vector. The result contains the element at index [2, 1, 1] of each array in the column, not the first element.

So, the output vector should be [3, 3, 3].
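As a minimal sketch of why the original attempt returns an array, and of a broadcast alternative to the comprehension: the example below uses a plain vector named cols (a hypothetical stand-in for the var3 column, so no package is needed). Indexing the vector with [2, 1, 1] selects its second element, because trailing indices equal to 1 are allowed on a vector, whereas broadcasting getindex applies the index to each array in turn.

```julia
# Same 2×2×2 array as in the question.
ar = cat([1 2; 3 4], [5 6; 7 8], dims=3)

# Hypothetical stand-in for the var3 column: a vector of three 3-d arrays.
cols = [ar, ar, ar]

# Indexes the *vector*: picks element 2 (the trailing 1s are ignored),
# so the result is a whole 2×2×2 array rather than a number.
whole_array = cols[1:3][2, 1, 1]

# Broadcasts getindex over the vector: one scalar per array.
values = getindex.(cols, 2, 1, 1)

println(size(whole_array))
println(values)
```

With the DataFrame from the question, the same broadcast should apply to the column directly, e.g. getindex.(df.var3, 2, 1, 1).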

Hope it helps.

Answer 2

Score: 2

You can do:

julia> [a[2, 1, 1] for a in df[!, :var3]]
3-element Vector{Int64}:
 3
 3
 3

though my first instinct would be:

julia> map(a -> a[2, 1, 1], df.var3)
3-element Vector{Int64}:
 3
 3
 3

Note that it's usually preferable to use df[!, :column] (or df.column, which is equivalent) rather than df[:, :column] or df[:, columnnumber]. This is because df[:, ...] makes a copy of the column first, which is unnecessary here and can hurt the performance of your code. df[!, ...] instead refers to the original column stored in the DataFrame, with no extra copy.
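The difference between the two access styles can be checked with object identity (===). This is a small sketch that assumes DataFrames.jl is installed and reuses the question's data with fewer columns; the names same_object and is_copy are illustrative only.

```julia
using DataFrames

ar = cat([1 2; 3 4], [5 6; 7 8], dims=3)
df = DataFrame(var1 = [1, 2, 3], var3 = [ar, ar, ar])

# df[!, :var3] and df.var3 both hand back the column vector stored in df.
same_object = df[!, :var3] === df.var3

# df[:, :var3] allocates a new vector, so it is a different object.
is_copy = df[:, :var3] !== df.var3

println(same_object)
println(is_copy)
```

For read-only access such as the comprehension or map above, the non-copying forms avoid allocating a new column vector on every call.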

huangapple
  • Posted on May 17, 2023 at 12:26:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76268560.html