Pandas pd.DataFrame from loop-data
Question
I am new to Python. I have some data that I get from a loop. cat can contain between two and n items, and TorF will always have len(cat)*5 or len(cat)*4 items. My goal is to create a pd.DataFrame from two lists like these:
cat = ['a', 'b'] 
TorF = [True, True, True, False, False, True, False, False, True, True]
I think my current solution is kind of clunky because of the int((len(TorF)/len(cat))):
import pandas as pd 
data = [[c, *TorF[i:i+int((len(TorF )/len(cat)))]] for i, c in enumerate(cat)]
df = pd.DataFrame(data)
Is there a simpler way to do it?
My desired output is
   0     1     2      3      4      5
0  a  True  True   False  False   True
1  b  True  False  True   False   True
Answer 1
Score: 2
Getting the ratio of the two shapes is a good strategy.
I would however use sliding_window_view:
import numpy as np  # used by the np.hstack / np.reshape variants further down
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view as swv
cat = ['a', 'b']
man_corr_n = [True, True, True, False, False, True, False, False, True, True]
df = pd.DataFrame(swv(man_corr_n, len(man_corr_n)//len(cat))[:len(cat)],
                  index=cat).reset_index()
Or:
view = swv(man_corr_n, len(man_corr_n)//len(cat))[:len(cat)]
df = pd.DataFrame(np.hstack([np.array(cat)[:,None], view]))
Output:
   0     1     2      3      4      5
0  a  True  True   True  False  False
1  b  True  True  False  False   True
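To make the overlap between consecutive windows visible, here is a minimal sketch that uses the integers 0 to 9 instead of booleans. It assumes NumPy 1.20 or later, where sliding_window_view is available; the names values, step, windows, first_two and chunks are illustrative only and do not come from the answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

values = np.arange(10)  # stand-in for the boolean list
step = 5                # len(values) // number of categories

# One window per start position: 10 - 5 + 1 = 6 windows of length 5
windows = sliding_window_view(values, step)

# Taking the first two windows yields overlapping rows
first_two = windows[:2]

# Taking every step-th window yields non-overlapping chunks instead
chunks = windows[::step]

print(first_two)
print(chunks)
```

Slicing with windows[:2] reproduces the overlapping rows of the approach above, while windows[::step] is one possible way to get contiguous, non-overlapping chunks from the same view.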
Edit: ambiguous output
Your provided code and the shown expected output clearly conflict. Using an unambiguous input (TorF = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), your code gives:
   0  1  2  3  4  5
0  a  0  1  2  3  4
1  b  1  2  3  4  5
While your expected output seems to be:
   0  1  2  3  4  5
0  a  0  2  4  6  8
1  b  1  3  5  7  9
In that case, you just need to reshape:
df = pd.DataFrame(np.reshape(np.r_[cat, TorF], (len(cat), -1), order='F'))
# or (note: list(map(list, cat)) assumes single-character labels)
df = pd.DataFrame(np.hstack([list(map(list, cat)), np.reshape(TorF, (len(cat), -1), order='F')]))
Output:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True
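Because a NumPy array holds a single dtype, combining the labels and the booleans in one array typically turns the booleans into strings. If the boolean dtype matters, one possible variation (a sketch, not part of the answer above) is to reshape only the booleans and add the labels as a separate column afterwards. The column name 'cat' is a temporary placeholder chosen here for illustration.

```python
import numpy as np
import pandas as pd

cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]

# Column-major (Fortran) order: consecutive values fill one column at a time,
# so row i receives every len(cat)-th value starting at position i
values = np.reshape(TorF, (len(cat), -1), order='F')

df = pd.DataFrame(values)        # columns 0..4, dtype bool
df.insert(0, 'cat', cat)         # prepend the labels as their own column
df.columns = range(df.shape[1])  # relabel columns 0..5 like the desired output

print(df)
print(df.dtypes)
```

With this layout, column 0 holds the labels as objects and the remaining columns stay boolean, so they can be used directly for filtering or aggregation.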
Answer 2
Score: 2
You could build a NumPy array from the two lists, reshape it, transpose it, and construct the DataFrame from the result:
import numpy as np
import pandas as pd
cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]
TorF = np.array(cat + TorF)
TorF2 = TorF.reshape(len(TorF)//2, 2)
df = pd.DataFrame(TorF2.T)
giving:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True
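The reshape above hardcodes two categories. A possible generalization to any number of categories is sketched below; to_frame is a hypothetical helper name introduced here, and the length check is an added assumption about how invalid input should be handled.

```python
import numpy as np
import pandas as pd


def to_frame(cat, flags):
    """Hypothetical helper: one row per label, values taken in interleaved order."""
    n = len(cat)
    if len(flags) % n:
        raise ValueError("len(flags) must be a multiple of len(cat)")
    # reshape(-1, n) puts n consecutive values in each row;
    # transposing turns each of the n columns into one row per label
    rows = np.asarray(flags).reshape(-1, n).T
    # ignore_index=True relabels the combined columns as 0, 1, 2, ...
    return pd.concat([pd.Series(cat), pd.DataFrame(rows)],
                     axis=1, ignore_index=True)


# Integers make it easy to see which input position lands in which cell
df = to_frame(['a', 'b', 'c'], list(range(12)))
print(df)
```

Keeping the labels out of the NumPy array means the values are not converted to strings, which is the main difference from the array-concatenation approach.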
Answer 3
Score: 1
pd.DataFrame({'cat':cat*(len(TorF)//len(cat)),'TorF':TorF})\
    .assign(col1=lambda dd:dd.index//len(cat))\
    .set_index(['col1','cat'])\
    .unstack().T\
    .reset_index(level=1).to_numpy()
Output:
   0     1      2      3      4     5
0  a  True   True  False  False  True
1  b  True  False   True  False  True
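For comparison with the chained approach above, the same interleaved layout can also be sketched with plain list slicing, which stays close to the list comprehension in the question. This is only an illustration under the assumption that the values alternate between categories (a, b, a, b, ...); the variable n is introduced here for readability.

```python
import pandas as pd

cat = ['a', 'b']
TorF = [True, True, True, False, False, True, False, False, True, True]

n = len(cat)

# TorF[i::n] takes every n-th value starting at position i,
# so each label collects the values that belong to it
data = [[c, *TorF[i::n]] for i, c in enumerate(cat)]
df = pd.DataFrame(data)

print(df)
```

The extended slice removes the need to compute the chunk length, which was the part of the original solution described as clunky.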