Dynamic Filtering using variable in filter function in Flux
Question
Using the quantile function, I was able to get the 95th percentile value as a stream.
Now I want to keep only the records that lie below the 95th percentile, so I filter my records against that value.
However, at this step I get an error.
Please find the code below:
percentile = totalTimeByDoc
|> filter(fn: (r) => r["documentType"] == "PurchaseOrder")
|> group(columns:["documentType"])
// |> yield()
|> quantile(column: "processTime", q: 0.95, method: "estimate_tdigest", compression: 9999.0)
|> limit(n: 1)
|> rename(columns: {processTime: "pt"})
This gives me the following data:
0 PurchaseOrder 999
Now I try to filter my records against that value:
percentile_filtered = totalTimeByDoc
|> filter(fn: (r) => r["documentType"] == "PurchaseOrder")
|> filter(fn: (r) => r.processTime < percentile[0]["pt"])
|> yield()
where totalTimeByDoc looks like this:
|0|PurchaseOrder|testpass22PID230207222747-1|1200|
|1|PurchaseOrder|testpass22PID230207222747-2|807|
|2|PurchaseOrder|testpass22PID230207222934-1|671|
|3|PurchaseOrder|testpass22PID230207222934-2|670|
The query above fails with the following error:
error @116:41-116:51: expected [{A with pt: B}] (array) but found stream[{A with pt: B}]
Answer 1
Score: 0
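The only thing missing is extracting the column from the percentile stream: percentile is a stream of tables, and a stream cannot be indexed like an array, which is what the error message says. Have a look at "Extract scalar values" in the Flux documentation. In this case, you could do: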
percentile = totalTimeByDoc
|> ...
|> rename(columns: {processTime: "pt"})
|> findColumn(fn: (key) => true, column: "pt")
percentile_filtered = totalTimeByDoc
|> filter(fn: (r) => r["documentType"] == "PurchaseOrder")
|> filter(fn: (r) => r.processTime < percentile[0])
|> yield()
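For reference, here is a self-contained sketch of the same idea that can be run without the original bucket. The sample rows, the docId column name, and the variable names p95Values and p95 are assumptions made for illustration; they are not part of the original query. The processTime values are written as floats on the assumption that the real column is a float as well.

```flux
import "array"

// Hypothetical stand-in for totalTimeByDoc, built from the rows in the question.
totalTimeByDoc =
    array.from(
        rows: [
            {documentType: "PurchaseOrder", docId: "testpass22PID230207222747-1", processTime: 1200.0},
            {documentType: "PurchaseOrder", docId: "testpass22PID230207222747-2", processTime: 807.0},
            {documentType: "PurchaseOrder", docId: "testpass22PID230207222934-1", processTime: 671.0},
            {documentType: "PurchaseOrder", docId: "testpass22PID230207222934-2", processTime: 670.0},
        ],
    )

// findColumn turns the stream into a plain array of column values.
p95Values =
    totalTimeByDoc
        |> filter(fn: (r) => r.documentType == "PurchaseOrder")
        |> group(columns: ["documentType"])
        |> quantile(column: "processTime", q: 0.95, method: "estimate_tdigest", compression: 9999.0)
        |> findColumn(fn: (key) => key.documentType == "PurchaseOrder", column: "processTime")

// Indexing is valid here because p95Values is an array, not a stream.
p95 = p95Values[0]

totalTimeByDoc
    |> filter(fn: (r) => r.documentType == "PurchaseOrder")
    |> filter(fn: (r) => r.processTime < p95)
    |> yield(name: "below_p95")
```

Pulling the scalar into its own variable keeps the filter predicate simple. Note that indexing fails at runtime if no table matches the findColumn predicate, since the array is then empty. If you need the whole row rather than one column, findRecord(fn:, idx:) returns it as a record instead.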