List and Compare data types between two dataframes
Question
I have two data frames whose column counts and headers are supposed to match, but the column data types may differ.
I want to list the columns by data type for each data frame and then compare the two, producing a list of the column headers whose data types do not match.
For example, I have two data frames, df1 and df2. df1 looks like the table below, where
- Geometry has the geometry data type
- A is an integer
- B is a string

| Geometry | A | B |
|---|---|---|
| 123456 | 1 | x |
| 78.900 | 2 | b |
In df2, A is a string, whereas it should be an integer (as in df1).
I tried counting the columns by data type, which works with df1.dtypes.value_counts()
I also tried groupby to list all the column names by data type:
g = df1.columns.to_series().groupby(df1.dtypes).groups
To use groupby, however, I have to drop the geometry column from the data frame, because it raises a TypeError. I got the list after creating a new data frame without the geometry column.
I now want to compare the two data frames' column data types and list the columns that do not match.
I also tried df1.equals(df2), which returned False.
Answer 1
Score: 0
Assuming the two data frames have the same number of columns and same column names, this will give you the list of column names for which the dtypes are different:
dt = df1.dtypes.sort_index() == df2.dtypes.sort_index()
dt[~dt].index.to_list()
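To also see which dtype each frame has for the mismatched columns, the comparison can be laid out side by side. The following is only a sketch under some assumptions: the helper name `dtype_mismatches` and the sample data are hypothetical, a plain float column stands in for the geometry column, and dtypes are compared by their string names rather than as dtype objects. The last line reuses the question's groupby with string dtype names as keys, which may sidestep the TypeError seen with the geometry column, though that has not been checked against a real geometry dtype here.

```python
import pandas as pd


def dtype_mismatches(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return one row per shared column whose dtype name differs between the frames."""
    # Compare dtype names as strings so every dtype is handled the same way.
    dtypes = pd.DataFrame({
        "left": left.dtypes.astype(str),
        "right": right.dtypes.astype(str),
    })
    # A column present in only one frame becomes NaN after alignment; skip those.
    dtypes = dtypes.dropna()
    return dtypes[dtypes["left"] != dtypes["right"]]


# Hypothetical sample data; the float column is a stand-in for real geometry.
df1 = pd.DataFrame({"Geometry": [123456.0, 78.9], "A": [1, 2], "B": ["x", "b"]})
df2 = pd.DataFrame({"Geometry": [123456.0, 78.9], "A": ["1", "2"], "B": ["x", "b"]})

mismatches = dtype_mismatches(df1, df2)
print(mismatches)                  # dtype names per frame for each mismatched column
print(mismatches.index.to_list())  # names of the mismatched columns

# Group column names by dtype name instead of by dtype object.
groups = df1.columns.to_series().groupby(df1.dtypes.astype(str)).groups
```

Unlike a direct comparison of the two dtypes Series, this version does not raise when the column sets differ; columns found in only one frame are simply left out of the result.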