2023-03-20 22:49:50

Checking if elements in my dataframe columns have the same type

Question

I'm working in Python with a DataFrame df. To check whether, in every column, each row holds a value of the same type, I wrote the following lines:

a=0
first_object = df.loc[df.index[0]]
for column in df: 
    for i in range(0,len(df)):
        if type(df[column][i]) != type(first_object[column]):
            a+=1
print(a)

The error I got is:


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/opt/anaconda3/envs/adsml/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
~/opt/anaconda3/envs/adsml/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
~/opt/anaconda3/envs/adsml/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 155
The above exception was the direct cause of the following exception:
KeyError                                  Traceback (most recent call last)
/var/folders/xb/74q_24bx0rxgqc6gtd6ksn7c0000gn/T/ipykernel_25626/3160699232.py in <module>
      3 for column in df:
      4     for i in range(0,len(df)):
----> 5         if type(df[column][i]) != type(first_object[column]):
      6             a+=1
~/opt/anaconda3/envs/adsml/lib/python3.9/site-packages/pandas/core/series.py in __getitem__(self, key)
    940 
    941         elif key_is_scalar:
--> 942             return self._get_value(key)
    943 
    944         if is_hashable(key):
~/opt/anaconda3/envs/adsml/lib/python3.9/site-packages/pandas/core/series.py in _get_value(self, label, takeable)
   1049 
   1050         # Similar to Index.get_value, but we do not fall back to positional
-> 1051         loc = self.index.get_loc(label)
   1052         return self.index._get_values_for_loc(self, loc, label)
   1053 
~/opt/anaconda3/envs/adsml/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
-> 3363                 raise KeyError(key) from err
   3364 
   3365         if is_scalar(key) and isna(key) and not self.hasnans:
KeyError: 155

I am confused because type(df[column][i]) and type(first_object[column]) both work on their own. I tried them with matching and non-matching types, and the comparison returned True and False as expected, so I don't understand why my code fails.

Answer 1

Score: 1

If I understand correctly, you want to count the number of columns whose values all share a single type.

You can use:

df.applymap(type).nunique().eq(1).sum()
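
As a rough illustration of that one-liner, here is a minimal sketch on made-up data (the frame, its column names "a" and "b", and the helper name elementwise are assumptions, not part of the question). Since DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map, the sketch picks whichever one the installed version provides:

```python
import pandas as pd

# Hypothetical sample data: every value in "a" has the same type,
# while "b" mixes an int, a str and a float.
df = pd.DataFrame({"a": [1, 2, 3], "b": [1, "x", 2.5]}, index=[10, 20, 30])

# Use DataFrame.map where it exists (pandas >= 2.1), else fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# Replace every cell by its Python type, then count distinct types per column.
types_per_column = elementwise(type).nunique()

# Columns with exactly one distinct type are the uniform ones.
n_uniform = int(types_per_column.eq(1).sum())
print(n_uniform)
```

Here only column "a" is uniform, so the count is 1.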

Fixing your code:

I wouldn't use a loop in real life!

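a = 0
first_object = df.iloc[0]  # first row, selected by position
for column in df:
    for i in df.index:  # iterate over the actual index labels, not 0..len(df)-1
        # df.loc is label-based, so every i taken from df.index is a valid key
        if type(df.loc[i, column]) != type(first_object[column]):
            a += 1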
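
The traceback suggests why the original loop failed: with an integer index, df[column][i] looks i up as a label rather than as a position, so any i in range(len(df)) that is not an index label raises KeyError. The sketch below reproduces this on made-up data, assuming an index with a gap such as one left by filtering or dropping rows (the question does not show the real index):

```python
import pandas as pd

# Hypothetical data: the index labels are 0, 1, 3, so label 2 is missing.
df = pd.DataFrame({"a": [1, "x", 3.0]}, index=[0, 1, 3])
col = df["a"]

# Label-based lookup: 2 is inside range(len(df)) but is not an index label.
try:
    col[2]
    raised = False
except KeyError:
    raised = True

# Position-based lookup works for any 0 <= i < len(df).
third_by_position = col.iloc[2]

# Iterating over the labels themselves, as in the fixed loop, is also safe.
values_by_label = [df.loc[i, "a"] for i in df.index]

print(raised, third_by_position, values_by_label)
```

If the index were the default 0..len(df)-1, the original loop would have run without error, which may be why the expressions seemed to work when tried separately.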

The vectorized equivalent (counting the values whose type differs from that of the first row) would be:

df2 = df.applymap(type)
out = df2.ne(df2.iloc[0]).sum().sum()
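
A small sketch of this count on made-up data, with a per-column breakdown added for illustration (the frame, the column names, and the variables elementwise, per_column and total are assumptions). It uses DataFrame.map where available, since applymap was deprecated in pandas 2.1:

```python
import pandas as pd

# Hypothetical sample data with a non-default index.
df = pd.DataFrame({"a": [1, 2, 3], "b": [1, "x", 2.5]}, index=[10, 20, 30])

# Use DataFrame.map where it exists (pandas >= 2.1), else fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
df2 = elementwise(type)

# Compare each cell's type with the type found in the first row of its column.
differs = df2.ne(df2.iloc[0])

per_column = differs.sum()     # mismatches per column
total = int(per_column.sum())  # same number the loop accumulates in `a`
print(per_column.to_dict(), total)
```

In this sample, column "a" has no mismatches and column "b" has two (the str and the float), so the total is 2.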
