Drop rows that contain a specific string in a column

Question

I have a large pandas dataset in the format below:

col1
11111112322
15211114821
25482136522
45225625656
11125648121

I would like to drop all rows that contain 1111 (four consecutive ones) to get the results below:

25482136522
45225625656
11125648121

I tried this, but it did not work:

data = df[df["col1"].str.contains("1111")==False]
Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    data1_1 = section1[section1["col1"].str.contains("111111")==False]
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\accessor.py", line 182, in __get__
    accessor_obj = self._accessor(obj)
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 177, in __init__
    self._inferred_dtype = self._validate(data)
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 231, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!. Did you mean: 'std'?

Answer 1

Score: 1

The issue, as the error message states, is that the column does not hold string values:
> AttributeError: Can only use .str accessor with string values!. Did you mean: 'std'?
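
A quick way to confirm this is to inspect the column's dtype. The sketch below rebuilds the sample data from the question as integers (an assumption, since the question does not show how the data was loaded) and checks that the .str accessor is rejected on a numeric column.

```python
import pandas as pd

# Sample data from the question, assumed here to have been parsed as integers.
df = pd.DataFrame({"col1": [11111112322, 15211114821, 25482136522,
                            45225625656, 11125648121]})

# A numeric dtype here means the .str accessor is not available.
print(df["col1"].dtype)

try:
    df["col1"].str.contains("1111")
    accessor_error = None
except AttributeError as exc:
    # pandas refuses to create the .str accessor for non-string data.
    accessor_error = str(exc)

print(accessor_error)
```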

To perform string operations on it, first convert the column to strings with astype(str); then your code will work:

df[df["col1"].astype(str).str.contains("1111")==False]

Output:

          col1
2  25482136522
3  45225625656
4  11125648121
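
The same filter can be written a little more idiomatically by inverting the boolean mask with ~ instead of comparing it against False. The sketch below is self-contained and reuses the sample data from the question, assuming col1 was read as integers. Passing regex=False is an optional choice that is not part of the original answer; it makes "1111" match as a literal substring rather than as a regular expression. The name has_four_ones is only illustrative.

```python
import pandas as pd

# Sample data from the question, assumed here to have been parsed as integers.
df = pd.DataFrame({"col1": [11111112322, 15211114821, 25482136522,
                            45225625656, 11125648121]})

# Convert to strings, then mark the rows that contain four consecutive ones.
has_four_ones = df["col1"].astype(str).str.contains("1111", regex=False)

# ~ inverts the mask, so only rows without "1111" are kept.
data = df[~has_four_ones]
print(data)
```

Note that astype(str) is applied only while building the mask, so col1 in the filtered result keeps its original numeric dtype.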

huangapple
  • Posted on August 4, 2023 at 07:47:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76832205.html