Python Polars find the length of a string in a dataframe
Question
I am trying to count the number of letters in a string in Polars.
I could probably just use an apply method and take len(Name).
However, I was wondering whether there is a Polars-specific method?
import polars as pl

mydf = pl.DataFrame(
    {"start_date": ["2020-01-02", "2020-01-03", "2020-01-04"],
     "Name": ["John", "Joe", "James"]})
print(mydf)

┌────────────┬───────┐
│ start_date ┆ Name  │
│ ---        ┆ ---   │
│ str        ┆ str   │
╞════════════╪═══════╡
│ 2020-01-02 ┆ John  │
│ 2020-01-03 ┆ Joe   │
│ 2020-01-04 ┆ James │
└────────────┴───────┘
In the end, John would have 4, Joe would have 3, and James would have 5.
I thought something like the following might work, based on the Pandas equivalent:
# Assuming mydf is a Pandas DataFrame
mydf['count'] = mydf['Name'].str.len()

# Attempted Polars equivalent - raises an error
mydf = mydf.with_columns(
    pl.col('Name').str.len().alias('count')
)
Answer 1
Score: 2
You can use either of the following:
- .str.lengths(), which counts the number of bytes in the UTF-8 string (faster)
- .str.n_chars(), which counts the number of characters
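The two counts agree for plain ASCII names such as the ones in the question, and differ only when a string contains multi-byte characters. The sketch below illustrates that distinction in plain Python rather than with Polars itself; the helper names and the accented sample names are made up for illustration and are not part of the original data.

```python
# Plain-Python illustration of "bytes in the UTF-8 string" vs. "characters".
# The helpers and the accented names are hypothetical, for demonstration only.

def utf8_byte_length(text: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def char_count(text: str) -> int:
    """Number of Unicode code points in the string."""
    return len(text)


# "\u00e9" is e with an acute accent, "\u00eb" is e with a diaeresis;
# each takes two bytes in UTF-8 but counts as one character.
names = ["John", "Joe", "James", "Jos\u00e9", "Zo\u00eb"]

for name in names:
    print(name, utf8_byte_length(name), char_count(name))
```

For the ASCII-only names in the question, either method gives the expected 4, 3 and 5. As far as I know, more recent Polars releases expose the same two operations under the names .str.len_bytes() and .str.len_chars(), so it is worth checking the documentation for the installed version.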
mydf.with_columns([
    pl.col("Name").str.lengths().alias("len")
])

┌────────────┬───────┬─────┐
│ start_date ┆ Name  ┆ len │
│ ---        ┆ ---   ┆ --- │
│ str        ┆ str   ┆ u32 │
╞════════════╪═══════╪═════╡
│ 2020-01-02 ┆ John  ┆ 4   │
│ 2020-01-03 ┆ Joe   ┆ 3   │
│ 2020-01-04 ┆ James ┆ 5   │
└────────────┴───────┴─────┘