How to custom-sort rows in Polars?

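# Question

How can I sort rows in a specific order?

```python
import polars as pl

df = pl.DataFrame({"currency": ["EUR", "EUR", "EUR", "USD", "USD", "USD"], "alphabet": ["A", "B", "C", "A", "B", "C"]})
```

I need to sort `currency` in descending order and `alphabet` in a custom order (C, A, B). The expected result is:

| currency | alphabet |
| -------- | -------- |
| USD      | C        |
| USD      | A        |
| USD      | B        |
| EUR      | C        |
| EUR      | A        |
| EUR      | B        |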



# Answer 1
**Score**: 2

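For example, you can define your own ordering for `pl.Categorical` data using `pl.StringCache`.

```python
import polars as pl

df = pl.DataFrame({
    "currency": ["EUR", "EUR", "EUR", "USD", "USD", "USD", "USD"],
    "alphabet": ["A", "B", "C", "A", "B", "C", "A"],
})

with pl.StringCache():
    # Register the categories in the desired sort order. This relies on the
    # string cache assigning physical codes in first-seen order, so that
    # "C" < "A" < "B", followed by the currencies in descending order.
    currency = sorted(["EUR", "USD"], reverse=True)
    pl.Series(["C", "A", "B", *currency]).cast(pl.Categorical)

    # Cast the string columns to Categorical, then sort by the physical codes.
    df = df.with_columns(
        pl.col(pl.Utf8).cast(pl.Categorical),
    ).sort(
        pl.col(pl.Categorical).to_physical()
    )

    print(df)
```

Output:

```
┌──────────┬──────────┐
│ currency ┆ alphabet │
│ ---      ┆ ---      │
│ cat      ┆ cat      │
╞══════════╪══════════╡
│ USD      ┆ C        │
│ USD      ┆ A        │
│ USD      ┆ A        │
│ USD      ┆ B        │
│ EUR      ┆ C        │
│ EUR      ┆ A        │
│ EUR      ┆ B        │
└──────────┴──────────┘
```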


# Answer 2
**Score**: 1

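Create a Polars expression that maps the `alphabet` values to numbers reflecting the desired order, using `Expr.map_dict`. Then use `DataFrame.sort` to sort the rows first by `currency` in descending order and second by that expression in ascending order.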

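```python
import polars as pl

df = pl.DataFrame({
    "currency": ["EUR", "EUR", "EUR", "USD", "USD", "USD"],
    "alphabet": ["A", "B", "C", "A", "B", "C"]
})

# Map each letter to its rank in the desired order: C -> 0, A -> 1, B -> 2.
abc_order = {val: idx for idx, val in enumerate(["C", "A", "B"])}

# Sort by currency (descending), then by the mapped rank (ascending).
res = df.sort(pl.col("currency"),
              pl.col("alphabet").map_dict(abc_order),
              descending=[True, False])
```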

Output:

```
>>> res
shape: (6, 2)
┌──────────┬──────────┐
│ currency ┆ alphabet │
│ ---      ┆ ---      │
│ str      ┆ str      │
╞══════════╪══════════╡
│ USD      ┆ C        │
│ USD      ┆ A        │
│ USD      ┆ B        │
│ EUR      ┆ C        │
│ EUR      ┆ A        │
│ EUR      ┆ B        │
└──────────┴──────────┘
```

huangapple
  • Published on March 7, 2023 at 12:02:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75657940.html