Pandas DataFrame Created from Dictionary vs Created from List

Question

Is there a line or two of code that would make the DataFrame created from lists behave like the one created from a dictionary?

# DataFrame created from a dictionary; this works:
import pandas as pd

data = {'Salary': [30000, 40000, 50000, 85000, 75000],
        'Exp': [1, 3, 5, 10, 25],
        'Gender': ['M', 'F', 'M', 'F', 'M']}
df = pd.DataFrame(data)
print(df), print()

new_df1 = df[df['Salary'] >= 50000]
print(new_df1), print()

new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])
print(new_df2)


# DataFrame created from a list of lists; filtering and sorting by column name fail here:
data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

df = pd.DataFrame(data)
print(df), print()

new_df1 = df[df['Salary'] >= 50000]  # doesn't work: KeyError 'Salary'
print(new_df1), print()

new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # doesn't work either: KeyError 'Exp'
print(new_df2)

Answer 1

Score: 1

In your second snippet, the first sublist is treated as a row of data rather than as the column names.

Instead, pass the first sublist as the columns parameter of the DataFrame constructor, and the remaining sublists as the data:

df = pd.DataFrame(data[1:], columns=data[0])

Output:

   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M

Why your code failed

Your code treated the first sublist as a data row, so pandas assigned the default integer column labels 0, 1, 2:

pd.DataFrame(data)

        0    1       2   # incorrect header
0  Salary  Exp  Gender   # this shouldn't be a data row
1   30000    1       M
2   40000    3       F
3   50000    5       M
4   85000   10       F
5   75000   25       M
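
A quick way to confirm this diagnosis is to inspect the column labels of the frame built from the raw list. The sketch below reuses the data list from the question; the names raw and lookup_failed are only illustrative and do not appear in the original post.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

raw = pd.DataFrame(data)  # the header row is swallowed as data

# Column labels are the default integers, not the intended names
print(list(raw.columns))        # [0, 1, 2]
print('Salary' in raw.columns)  # False

# Column 0 mixes the string 'Salary' with integers, so it is stored as object
print(raw[0].dtype)             # object

# Label-based access therefore fails with a KeyError
try:
    raw[raw['Salary'] >= 50000]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)            # True
```

The object dtype is a second symptom worth noticing: even if the columns were renamed afterwards, the header strings would still sit in the first row of the data.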

Full code:

df = pd.DataFrame(data[1:], columns=data[0])
print(df), print()

new_df1 = df[df['Salary'] >= 50000]  # works now
print(new_df1), print()

new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # works too
print(new_df2)

Output:

   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M

   Salary  Exp Gender
2   50000    5      M
3   85000   10      F
4   75000   25      M

   Salary  Exp Gender
4   75000   25      M
3   85000   10      F
2   50000    5      M
1   40000    3      F
0   30000    1      M
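
To check that the list-based frame now really behaves like the dictionary-based one, the two can be compared directly. This is a minimal sketch using the data from the question; the variable names from_dict, from_list, same and high_earners_exp are assumptions made for the example.

```python
import pandas as pd

from_dict = pd.DataFrame({'Salary': [30000, 40000, 50000, 85000, 75000],
                          'Exp': [1, 3, 5, 10, 25],
                          'Gender': ['M', 'F', 'M', 'F', 'M']})

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]
from_list = pd.DataFrame(data[1:], columns=data[0])

# equals() requires the same shape and elements, with matching column dtypes
same = from_list.equals(from_dict)
print(same)

# The filter from the question now selects the same rows on the list-based frame
high_earners_exp = from_list[from_list['Salary'] >= 50000]['Exp'].tolist()
print(high_earners_exp)  # [5, 10, 25]
```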

Answer 2

Score: 1

You need to build the DataFrame from all rows except the first, and pass the first row as the columns parameter:

# Same list of lists as in the question
data = [['Salary', 'Exp', 'Gender'],[30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

df = pd.DataFrame(data[1:], columns=data[0])
print(df), print()
   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M

new_df1 = df[df['Salary'] >= 50000]  # works
print(new_df1), print()
   Salary  Exp Gender
2   50000    5      M
3   85000   10      F
4   75000   25      M

new_df2 = df.sort_values(['Exp'], axis = 0, ascending=[False])  #ditto
print(new_df2)

   Salary  Exp Gender
4   75000   25      M
3   85000   10      F
2   50000    5      M
1   40000    3      F
0   30000    1      M
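
If the DataFrame has already been built from the full list, so the header row is sitting in row 0, it can also be repaired after the fact. The following is a sketch of one possible approach and is not taken from either answer; the names raw and fixed are hypothetical.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]
raw = pd.DataFrame(data)  # built without column names

fixed = raw.iloc[1:].reset_index(drop=True)  # drop the header row, renumber from 0
fixed.columns = raw.iloc[0].tolist()         # promote the first row to column labels
fixed = fixed.infer_objects()                # object columns holding ints become integer dtype

print(fixed.dtypes)
print(fixed[fixed['Salary'] >= 50000])
print(fixed.sort_values('Exp', ascending=False))
```

The infer_objects() call matters because columns that once contained both a header string and numbers are stored with object dtype. Passing columns=data[0] to the constructor, as in the answers above, avoids that step entirely and is the simpler choice when the raw list is still available.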

huangapple
  • Posted on March 15, 2023, 20:07:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75744477.html