英文:
In Google Sheets, how do I flatten a hierarchical table with dynamic rows?
问题
我有一个这样的数据集:
Ad_id | attribute | value |
---|---|---|
2341 | cost | 2 |
lead | 7 | |
view | 5 | |
reaction | 9 | |
5465 | cost | 12 |
click | 34 |
广告数量是动态的。属性的数量也是动态的,对于每个广告都不同。属性可能在不同广告中重复。我想将其转化为类似这样的形式:
ad_id | cost | lead | view | reaction | click | etc. |
---|---|---|---|---|---|---|
2341 | 2 | 7 | 5 | 9 | NA | |
5465 | 12 | NA | NA | NA | 34 |
我使用了 "=UNIQUE(FILTER(A:A, A:A<>"" ))" 来获取 ad_id 列,以及 "=ARRAYFORMULA(TRANSPOSE(UNIQUE(FILTER(B:B, B:B<>""))))" 来将属性列转换为表头,但是我卡在了如何获取正确的值上。如果 id 之间的行不是空白的话,这会很容易。
我尝试了 "=ArrayFormula(INDEX(C:C, MATCH(1, (A:A=ad_id_to_search)*(B:B=attribute_to_search), 0))" 但这只对每个 id 分支的第一行有效。也就是说,它只会找到成本 (cost) 的值,而且只会在成本是该分支的第一个属性时有效。
英文:
I have a dataset like this:
Ad_id | attribute | value |
---|---|---|
2341 | cost | 2 |
lead | 7 | |
view | 5 | |
reaction | 9 | |
5465 | cost | 12 |
click | 34 |
The number of ads is dynamic. The number of attributes is also dynamic and different for each ad. The attributes may repeat for different ads. I want to transform it to something like this:
ad_id | cost | lead | view | reaction | click | etc. |
---|---|---|---|---|---|---|
2341 | 2 | 7 | 5 | 9 | NA | |
5465 | 12 | NA | NA | NA | 34 |
I used =UNIQUE(FILTER(A:A,A:A<>"")) for the ad_id column and =ARRAYFORMULA(TRANSPOSE(UNIQUE(FILTER(B:B,B:B<>"")))) for turning the attributes column into headers but am stumped for how to pull the correct values. If the rows between the ids weren't blank it would be easy.
I tried =ArrayFormula(INDEX(C:C, MATCH(1, (A:A=ad_id_to_search)*(B:B=attribute_to_search), 0))) but this only works for the first row of each id branch. ie, it will only find the cost value and only if cost is the first attribute in the branch.
答案1
得分: 0
如果您不介意目标表中的属性按字母顺序排列,您可以尝试以下代码:
=map(
query(
{A1:C1;{scan(,A2:A,lambda(acc,arr,if(len(arr),arr,acc))),B2:C}},
"select Col1,max(Col3) where Col2 is not null group by Col1 pivot Col2",1),
lambda(x,if(len(x),x,"NA")))
假设您的源表位于工作表的A1:Cx区域。SCAN函数为那些缺少ad_id的行分配适当的ad_id,然后QUERY函数按属性行旋转数据,最后MAP函数为那些未出现在特定ad_id下的属性分配“NA”。
英文:
If you don't mind the attributes being in alphabetical order in the destination table, you can try:
=map(
query(
{A1:C1;{scan(,A2:A,lambda(acc,arr,if(len(arr),arr,acc))),B2:C}},
"select Col1,max(Col3) where Col2 is not null group by Col1 pivot Col2",1),
lambda(x,if(len(x),x,"NA")))
This is assuming that your source table is in A1:Cx of the sheet. The SCAN assigns the appropriate ad_id for those rows where it is missing, the QUERY then pivots the data by the attribute rows, and finally the MAP assigns an 'NA' for those attributes which were not present for a particular ad_id.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论