Pandas DataFrame Created from Dictionary vs Created from List
Question
Is there a line or two of code that would make a DataFrame created from a list of lists behave like one created from a dictionary?
# DataFrame created from a dictionary; this works:
import pandas as pd
data = {'Salary': [30000, 40000, 50000, 85000, 75000],
        'Exp': [1, 3, 5, 10, 25],
        'Gender': ['M', 'F', 'M', 'F', 'M']}
df = pd.DataFrame(data)
print(df, end='\n\n')
new_df1 = df[df['Salary'] >= 50000]
print(new_df1, end='\n\n')
new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])
print(new_df2)
# DataFrame created from a list of lists; filtering and sorting by column name fail:
data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]
df = pd.DataFrame(data)
print(df, end='\n\n')
new_df1 = df[df['Salary'] >= 50000]  # doesn't work
print(new_df1, end='\n\n')
new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # doesn't work either
print(new_df2)
Answer 1
Score: 1
In your second snippet, the first sublist is being used as data rather than as column names.
Instead, pass the first sublist as the columns parameter of the DataFrame constructor and the remaining sublists as the data:
df = pd.DataFrame(data[1:], columns=data[0])
Output:
Salary Exp Gender
0 30000 1 M
1 40000 3 F
2 50000 5 M
3 85000 10 F
4 75000 25 M
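Why your code failed
Your code treated the first sublist as a data row, so the columns received the default integer labels 0, 1, 2 and there was no column named 'Salary' or 'Exp' to select: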
pd.DataFrame(data)
0 1 2 # incorrect header
0 Salary Exp Gender # this shouldn't be a data row
1 30000 1 M
2 40000 3 F
3 50000 5 M
4 85000 10 F
5 75000 25 M
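To make the failure concrete, here is a minimal sketch (the variable names raw and lookup_failed are illustrative, not from the original post) that inspects the column labels pandas assigns by default and shows that looking up 'Salary' raises a KeyError:

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# Without a columns argument, the header sublist becomes row 0
# and the columns get default integer labels.
raw = pd.DataFrame(data)
print(list(raw.columns))      # default integer labels
print(raw.iloc[0].tolist())   # the intended header, stored as data

# Selecting by name fails because no column is labelled 'Salary'.
try:
    raw['Salary']
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```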
Full code:
df = pd.DataFrame(data[1:], columns=data[0])
print(df, end='\n\n')
new_df1 = df[df['Salary'] >= 50000]  # now works
print(new_df1, end='\n\n')
new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # now works too
print(new_df2)
Output:
Salary Exp Gender
0 30000 1 M
1 40000 3 F
2 50000 5 M
3 85000 10 F
4 75000 25 M
Salary Exp Gender
2 50000 5 M
3 85000 10 F
4 75000 25 M
Salary Exp Gender
4 75000 25 M
3 85000 10 F
2 50000 5 M
1 40000 3 F
0 30000 1 M
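If the DataFrame has already been built without column names, one possible alternative (an assumption for illustration, not part of the answer above) is to promote the first row to the header afterwards. Note that the columns start out with object dtype because they mix strings and numbers, so this sketch calls infer_objects() to recover numeric dtypes; the names raw and fixed are hypothetical.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

raw = pd.DataFrame(data)  # header ended up as row 0

# Drop the header row, renumber the index, and reuse row 0 as column names.
fixed = raw.iloc[1:].reset_index(drop=True)
fixed.columns = raw.iloc[0].tolist()

# The columns are object dtype at this point; let pandas infer better dtypes.
fixed = fixed.infer_objects()

print(fixed[fixed['Salary'] >= 50000])
print(fixed.sort_values('Exp', ascending=False))
```

Passing columns=data[0] at construction time, as shown above, is shorter and avoids the dtype repair step, so this variant is mainly useful when the construction code cannot be changed.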
Answer 2
Score: 1
You need to create the DataFrame from all of the sublists except the first one, and pass the first sublist as the columns parameter:
# Same data as in the question: the first sublist holds the column names
data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]
df = pd.DataFrame(data[1:], columns=data[0])
print(df, end='\n\n')
Salary Exp Gender
0 30000 1 M
1 40000 3 F
2 50000 5 M
3 85000 10 F
4 75000 25 M
new_df1 = df[df['Salary'] >= 50000]  # works
print(new_df1, end='\n\n')
Salary Exp Gender
2 50000 5 M
3 85000 10 F
4 75000 25 M
new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # works too
print(new_df2)
Salary Exp Gender
4 75000 25 M
3 85000 10 F
2 50000 5 M
1 40000 3 F
0 30000 1 M
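Since the question asks for the list-based DataFrame to behave like the dictionary-based one, it may also help to see that the list of rows can be transposed into the same dictionary of columns. This is a sketch under the assumption that every row has the same length as the header; the names header, rows, and as_dict are illustrative.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# Split the header from the data rows.
header, *rows = data

# zip(*rows) transposes rows into columns; pair each column with its name.
as_dict = {name: list(col) for name, col in zip(header, zip(*rows))}
print(as_dict)

df_from_dict = pd.DataFrame(as_dict)
df_from_rows = pd.DataFrame(rows, columns=header)

# Both construction routes should describe the same table.
same = df_from_dict.equals(df_from_rows)
print(same)
```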