Determining what language a string contains in a pandas DataFrame

# Question

I am new to Pandas and Python.

My dataframe:

**df**

```
Text
Best tv in 2020
utilizar un servicio sms gratuito
utiliser un tv pour netflix
```

**My desired output**

```
Text                                    Language
Best tv in 2020                         en
utilizar un servicio sms gratuito       es
utiliser un tv pour netflix             fr
```

**What I am using:**

```python
from textblob import TextBlob

b = TextBlob("utilizar un servicio sms gratuito")
print(b.detect_language())
# Output: es
```

I am not sure how to integrate this method to fill my pandas DataFrame.

**I have tried:**

```python
df['Language'] = TextBlob(df['Text']).detect_language()
```

But I am getting an error:

```
TypeError: The `text` argument passed to `__init__(text)` must be a string, not <class 'pandas.core.series.Series'>
```

I understand what it means: I need to pass a string rather than a pandas Series. So my question is, how do I loop over the entire Series to detect the language of each row in the `Text` column?

Thank you for your suggestions.
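
The type mismatch behind the error can be checked without TextBlob. The following is a minimal sketch, assuming only that pandas is installed and that the column is named `Text` as in the question. It shows that the column as a whole is a `Series`, while `Series.apply` hands each cell to the function as a plain `str`.

```python
import pandas as pd

df = pd.DataFrame({"Text": [
    "Best tv in 2020",
    "utilizar un servicio sms gratuito",
    "utiliser un tv pour netflix",
]})

# The whole column is a Series object, not a string,
# which is why TextBlob(df['Text']) is rejected.
print(type(df["Text"]).__name__)

# Series.apply calls the function once per cell, so each call
# receives an ordinary Python string.
element_types = df["Text"].apply(lambda x: type(x).__name__)
print(element_types.tolist())
```

So any function that expects a single string has to be applied per element rather than called on the column itself.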




# Answer 1
**Score**: 3

Use [`Series.apply`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.apply.html) with a lambda function. It calls the function once per element, so `TextBlob` receives a plain string each time:

```python
df['Language'] = df['Text'].apply(lambda x: TextBlob(x).detect_language())
```

Or `Series.map`:

```python
df['Language'] = df['Text'].map(lambda x: TextBlob(x).detect_language())

print(df)
```

```
                                Text Language
0                    Best tv in 2020       en
1  utilizar un servicio sms gratuito       es
2        utiliser un tv pour netflix       fr
```
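
If the column may contain missing or non-string cells, the lambda above would raise the same kind of `TypeError` for those rows. Below is a minimal sketch of a guarded version. Because `detect_language()` calls an online service (and, as far as I know, was deprecated in later TextBlob releases), the sketch swaps in a hypothetical stand-in: `detect_language`, `MARKERS`, and `safe_detect` are names invented for illustration and are not part of TextBlob or pandas. In real use, the body of `detect_language` would be replaced by whichever detector is available.

```python
import pandas as pd

# Hypothetical stand-in for TextBlob(text).detect_language(); it only
# recognises a few marker words and exists so the sketch runs offline.
MARKERS = {
    "en": {"best", "in", "the"},
    "es": {"utilizar", "servicio", "gratuito"},
    "fr": {"utiliser", "pour", "le"},
}

def detect_language(text):
    words = set(text.lower().split())
    # Pick the language whose marker set overlaps the text the most.
    scores = {lang: len(words & marks) for lang, marks in MARKERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def safe_detect(value):
    # Skip missing or non-string cells instead of raising a TypeError.
    if not isinstance(value, str) or not value.strip():
        return None
    return detect_language(value)

df = pd.DataFrame({"Text": [
    "Best tv in 2020",
    "utilizar un servicio sms gratuito",
    "utiliser un tv pour netflix",
    None,
]})

df["Language"] = df["Text"].apply(safe_detect)
print(df)
```

Passing a named function to `apply` instead of a lambda keeps the guard in one place and makes it easy to swap the detector later.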

huangapple
  • Posted on January 6, 2020, 18:04:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59610076.html