SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation:

Question

I'm trying to send a request to a website and then scrape the text out of the page. However, I get this warning:

> SettingWithCopyWarning:
> A value is trying to be set on a copy of a slice from a DataFrame
>
> See the caveats in the documentation:
> https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
> _df['text']=remove_one_words_from_list(website_text,_df['language']).copy()

I already tried .copy(), but the issue remains, and with _df.loc I get a "Too many indexers" error. Note that I call get_the_text2 inside a for loop and pass it one row of the DataFrame each time.

    import re

    import requests
    from bs4 import BeautifulSoup

    # df (the full DataFrame) and remove_one_words_from_list are defined elsewhere.

    def get_the_text2(_df):
        '''
        Send a second request, with a different method, to receive the text of the articles.

        Parameters
        ----------
        _df : DataFrame

        Returns
        -------
        only the text contained in the url
        '''
        df['text']=''
        # for k,i in enumerate(_df['url']):
        if str(_df):
            website_text=list()
            print(_df)
            #time.sleep(2)
            try:
                response=requests.get(_df['url'],headers={"User-Agent" : "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"})
                status_code=response.status_code
                soup = BeautifulSoup(response.content, 'html.parser')
                if len(website_text)<=10:
                    website_text=list()
                    if soup.article:
                        # collect paragraphs and headings (h1, h2, ...) inside <article>
                        if soup.article.find_all(['p',re.compile(r"^h\d{1}")]):
                            for data in soup.article.find_all(['p',re.compile(r"^h\d{1}")]):
                                website_text.append(data.get_text(strip=True))
                            #df.at[k,'text']=remove_one_words_from_list(website_text,df.at[k,'language'])
                            # this assignment is the line the warning points at
                            _df['text']=remove_one_words_from_list(website_text,_df['language']).copy()
                            print('****ARTICLE P & H{1}****',remove_one_words_from_list(website_text,_df['language']))
            except requests.RequestException as err:
                # the except clause is not shown in the post; minimal handler assumed
                print(err)

    for _index,item in enumerate(df['status_code']):
        if item !=200:
            get_the_text2(df.loc[_index])

EDIT:

Just to show the error message I get with .loc.

My code:

    _df['text']=remove_one_words_from_list(website_text,_df.loc[:,'language']).copy()

Error message:

    IndexingError                             Traceback (most recent call last)
    Cell In[14], line 102
        100 for _index,item in enumerate(df['status_code']):
        101     if item !=200:
    --> 102         get_the_text2(df.loc[_index])

    File c:\Users\\anaconda3\envs\GDELT\Lib\site-packages\pandas\core\indexing.py:939, in _LocationIndexer._validate_key_length(self, key)
        937         raise IndexingError(_one_ellipsis_message)
        938     return self._validate_key_length(key)
    --> 939 raise IndexingError("Too many indexers")
        940 return key

    IndexingError: Too many indexers

EDIT 2

I found out that if I use .loc['language'] instead, it doesn't throw an error, although the SettingWithCopyWarning is still there.

    _df['text']=remove_one_words_from_list(website_text,_df.loc['language']).copy()

According to this post I know why it happens, but I don't know how to fix it.
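
For reference, a minimal sketch of what the two edits above run into, using a toy frame whose values are invented for illustration: selecting a single row label with `df.loc[label]` returns a one-dimensional Series, so it accepts one indexer but not a `(rows, columns)` pair.

```python
import pandas as pd

# Toy stand-in for the real data; the values are invented for illustration.
df = pd.DataFrame(
    {
        "url": ["https://example.com/a", "https://example.com/b"],
        "language": ["en", "de"],
        "status_code": [200, 404],
    }
)

row = df.loc[1]  # a single row label returns a Series, not a DataFrame
row_type = type(row).__name__

# A Series has one axis, so a single indexer works.
language = row.loc["language"]

# Two indexers (rows, columns) only make sense for a DataFrame.
try:
    row.loc[:, "language"]
    two_indexers_error = None
except pd.errors.IndexingError as err:
    two_indexers_error = str(err)

print(row_type, language, two_indexers_error)
```

This matches the traceback above: the `IndexingError` comes from applying `.loc[:, 'language']` to a Series, not from the assignment itself.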

Answer 1

Score: 0

I assigned the new value to a new DataFrame instead of to the row that was passed in, and that did the job.

    _df2=pd.DataFrame(columns=list(df.columns)) # to get the columns from the original DataFrame
    _df2['text']=remove_one_words_from_list(website_text,_df.loc['language']).copy()
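
Another option, along the lines of the commented-out `df.at[k,'text']` line in the question, is to write the result straight back into the original frame by index label. The sketch below is only an illustration under assumptions: the data is invented, and `fetch_text` is a hypothetical stand-in for the request plus `remove_one_words_from_list` step.

```python
import pandas as pd

# Toy data; the values are invented for illustration.
df = pd.DataFrame(
    {
        "url": ["https://example.com/a", "https://example.com/b"],
        "language": ["en", "de"],
        "status_code": [200, 404],
    }
)
df["text"] = ""  # create the column once, outside the loop


def fetch_text(url, language):
    """Hypothetical stand-in for the request + remove_one_words_from_list step."""
    return f"text of {url} ({language})"


# Iterate over the index labels of the failed rows and assign on df itself,
# so no extracted row is ever written to.
for label in df.index[df["status_code"] != 200]:
    df.at[label, "text"] = fetch_text(df.at[label, "url"], df.at[label, "language"])

print(df[["status_code", "text"]])
```

Because the assignment targets `df` directly, the result ends up in the original frame rather than in a separate row object. Iterating over `df.index` also keeps working when the index is not the default 0..n-1 range, which `enumerate` positions would not. If the real helper returns a list, it may need joining into a single string before it is stored in the cell.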

huangapple
  • Posted on July 6, 2023, 18:55:39
  • When reposting, please keep this article's link: https://go.coder-hub.com/76628083.html