How to custom sort rows in Polars
# Question
How to sort rows in a specific order
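```python
import polars as pl

df = pl.DataFrame({
    "currency": ["EUR", "EUR", "EUR", "USD", "USD", "USD"],
    "alphabet": ["A", "B", "C", "A", "B", "C"],
})
```
I need to sort `currency` in descending order and `alphabet` in a custom order (C, A, B). The expected result is:

| currency | alphabet |
| -------- | -------- |
| USD | C |
| USD | A |
| USD | B |
| EUR | C |
| EUR | A |
| EUR | B |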
# Answer 1
**Score**: 2
For example, you can define your own ordering for `pl.Categorical` data using `pl.StringCache`.
```python
import polars as pl

df = pl.DataFrame({
    "currency": ["EUR", "EUR", "EUR", "USD", "USD", "USD", "USD"],
    "alphabet": ["A", "B", "C", "A", "B", "C", "A"],
})

with pl.StringCache():
    # Register the categories in the desired order: the custom alphabet
    # order first, then the currencies in descending order.
    currency = sorted(["EUR", "USD"], reverse=True)
    pl.Series(["C", "A", "B", *currency]).cast(pl.Categorical)

    # Cast the string columns to Categorical and sort by their physical
    # (integer) representation, which follows the registration order.
    df = df.with_columns(
        pl.col(pl.Utf8).cast(pl.Categorical),
    ).sort(
        pl.col(pl.Categorical).to_physical()
    )

print(df)
```
```
┌──────────┬──────────┐
│ currency ┆ alphabet │
│ ---      ┆ ---      │
│ cat      ┆ cat      │
╞══════════╪══════════╡
│ USD      ┆ C        │
│ USD      ┆ A        │
│ USD      ┆ A        │
│ USD      ┆ B        │
│ EUR      ┆ C        │
│ EUR      ┆ A        │
│ EUR      ┆ B        │
└──────────┴──────────┘
```
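To see why registering the values first matters, here is a plain-Python model of the idea. It is only a sketch and not how Polars is implemented: it assumes, as I understand the string cache to behave, that each new string gets the next integer code in order of first appearance, and that sorting by the physical representation means sorting by those codes. The names `codes` and `encode` are made up for this illustration.

```python
# Plain-Python model (an assumption about the mechanism, not Polars internals):
# each new string receives the next integer code in order of first appearance.
codes = {}


def encode(value):
    """Return the code for value, assigning the next free code if it is new."""
    return codes.setdefault(value, len(codes))


# Register values in the desired order, like the throwaway Series above:
# custom alphabet order first, then currencies in descending order.
for value in ["C", "A", "B", "USD", "EUR"]:
    encode(value)

rows = [
    ("EUR", "A"), ("EUR", "B"), ("EUR", "C"),
    ("USD", "A"), ("USD", "B"), ("USD", "C"), ("USD", "A"),
]

# Sort by the integer codes of both columns, currency first.
sorted_rows = sorted(rows, key=lambda row: (encode(row[0]), encode(row[1])))

for currency, letter in sorted_rows:
    print(currency, letter)
```

In this model, any value that was not registered up front would simply get the next code when first seen, so it would sort after all registered values.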
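# Answer 2
**Score**: 1
Create a Polars expression that maps the "alphabet" values to numbers reflecting the desired order, using `Expr.map_dict`. Then use `DataFrame.sort` to sort the rows first by "currency" in descending order, and then by the mapped values in ascending order.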
```python
import polars as pl

df = pl.DataFrame({
    "currency": ["EUR", "EUR", "EUR", "USD", "USD", "USD"],
    "alphabet": ["A", "B", "C", "A", "B", "C"],
})
# Map each letter to its rank in the desired order: C -> 0, A -> 1, B -> 2.
abc_order = {val: idx for idx, val in enumerate(["C", "A", "B"])}

res = df.sort(
    pl.col("currency"),
    pl.col("alphabet").map_dict(abc_order),
    descending=[True, False],
)
```
Output:
```
>>> res
shape: (6, 2)
┌──────────┬──────────┐
│ currency ┆ alphabet │
│ ---      ┆ ---      │
│ str      ┆ str      │
╞══════════╪══════════╡
│ USD      ┆ C        │
│ USD      ┆ A        │
│ USD      ┆ B        │
│ EUR      ┆ C        │
│ EUR      ┆ A        │
│ EUR      ┆ B        │
└──────────┴──────────┘
```