Pandas pd.DataFrame from loop-data
Question
I am new to Python. I have some data that I get from a loop. cat can have between two and n elements, and TorF will always have (cat*5) or (cat*4) elements, i.e. five or four values per category. My goal is to create a pd.DataFrame from two lists, like this:
cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]
I think my current solution is kind of clunky with the int((len(TorF)/len(cat))):
import pandas as pd
data = [[c, *TorF[i:i+int((len(TorF )/len(cat)))]] for i, c in enumerate(cat)]
df = pd.DataFrame(data)
Is there a simpler way to do it?
My desired output is:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True
Answer 1
Score: 2
Getting the ratio of the two shapes is a good strategy. I would, however, use sliding_window_view:
import numpy as np  # needed for the np.hstack / np.reshape variants further down
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view as swv
cat = ['a', 'b']
man_corr_n = [True, True, True, False, False, True, False, False, True, True]
df = pd.DataFrame(swv(man_corr_n, len(man_corr_n)//len(cat))[:len(cat)],
index=cat).reset_index()
Or:
view = swv(man_corr_n, len(man_corr_n)//len(cat))[:len(cat)]
df = pd.DataFrame(np.hstack([np.array(cat)[:,None], view]))
Output:
   0     1     2      3      4      5
0  a  True  True   True  False  False
1  b  True  True  False  False   True
Edit: ambiguous output
Your provided code and the shown expected output clearly conflict. Using an unambiguous input (TorF = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), your code gives:
   0  1  2  3  4  5
0  a  0  1  2  3  4
1  b  1  2  3  4  5
While your expected output seems to be:
   0  1  2  3  4  5
0  a  0  2  4  6  8
1  b  1  3  5  7  9
In that case, you just need to reshape:
df = pd.DataFrame(np.reshape(np.r_[cat, TorF], (len(cat), -1), order='F'))
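# or (list(map(list, cat)) splits each label into characters,
# so this variant assumes single-character labels)
df = pd.DataFrame(np.hstack([list(map(list, cat)), np.reshape(TorF, (len(cat), -1), order='F')]))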
Output:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True
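As a hedged sketch of the same column-wise reshape, the labels can also be attached through the index instead of being stacked into the same array, which keeps the value columns as booleans. The helper name frame_from_interleaved is hypothetical (not from the question or any library), and it assumes the values are interleaved by category, as in the desired output:

```python
import numpy as np
import pandas as pd


def frame_from_interleaved(cat, values):
    """Hypothetical helper: one row per category, values read column-wise."""
    n = len(cat)
    if len(values) % n:
        raise ValueError("len(values) must be a multiple of len(cat)")
    # order='F' fills column by column, so values[0::n] end up in row 0,
    # values[1::n] in row 1, and so on.
    block = np.reshape(np.asarray(values), (n, -1), order="F")
    # Use the labels as the index, then move them into the first column.
    df = pd.DataFrame(block, index=cat).reset_index()
    # Renumber the columns 0..k to match the layout shown in the question.
    df.columns = range(df.shape[1])
    return df


cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]
df = frame_from_interleaved(cat, TorF)
```

Because only the values go through NumPy here, column 0 holds the labels and the remaining columns keep the bool dtype.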
Answer 2
Score: 2
You could form a NumPy array from the two lists, reshape, transpose and form the DataFrame:
import numpy as np
import pandas as pd
cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]
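# Prepend the labels, then reshape so that each column holds one category
TorF = np.array(cat + TorF)
# len(cat) instead of a hard-coded 2, so this also works for more categories
TorF2 = TorF.reshape(len(TorF)//len(cat), len(cat))
df = pd.DataFrame(TorF2.T)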
giving:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True
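Reshaping and transposing this way amounts to taking every len(cat)-th value, starting at each category's position. A minimal sketch of that idea with plain list slicing, close to the comprehension in the question, assuming the values are interleaved by category as in the desired output (the name nums is only for illustration):

```python
import pandas as pd

cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]

# For category i, take the values at positions i, i + len(cat), i + 2*len(cat), ...
data = [[c, *TorF[i::len(cat)]] for i, c in enumerate(cat)]
df = pd.DataFrame(data)

# Same slicing on an unambiguous input, to make the element order visible
nums = list(range(10))
rows = [[c, *nums[i::len(cat)]] for i, c in enumerate(cat)]
```

With the numeric input, rows is [['a', 0, 2, 4, 6, 8], ['b', 1, 3, 5, 7, 9]], which is the interleaved layout that the desired output implies.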
Answer 3
Score: 1
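# len(cat) replaces the hard-coded 2 so the chain also handles more categories
pd.DataFrame({'cat': cat * (len(TorF) // len(cat)), 'TorF': TorF})\
.assign(col1=lambda dd: dd.index // len(cat))\
.set_index(['col1', 'cat'])\
.unstack().T\
.reset_index(level=1).to_numpy()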
Output:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True