pandas read_table with stopping strings to delimit different dataframes to assign
Question
I have a space-delimited data file of the form:
LINE 1 to SKIP
LINE 2 to SKIP
2.13999987 0.139999986 -0.398405492 1
2.61999989 6.0000062E-2 0.450082362 1
2.74000001 5.99999428E-2 1.04403841 1
2.84000015 4.00000811E-2 6.17375337E-2 1
IGN IGN IGN IGN
21.4200001 0.420000076 1.53572667 1
22.3199997 0.479999542 -0.595370948 1
23.3199997 0.520000458 0.136062101 1
24.3600006 0.519999504 -0.520044923 1
25.3999996 0.520000458 2.45230961 1
26.4399986 0.519999504 -2.08248448 1
27.4799995 0.520000458 -0.263438225 1
IGN IGN IGN IGN
58.6800003 0.520000458 -0.789233088 1
59.7200012 0.520000458 -1.02961564 1
60.7600021 0.51999855 -0.889572859 1
61.7999992 0.520000458 -1.03346229 1
62.8400002 0.520000458 4.94940579E-2 1
I would like to read it with pandas, something like:
df_first = pd.read_table('file.txt', names=names, delimiter=' ', skiprows=3, nrows=4)
(where names holds the name of each column in file.txt).
I want to assign each run of rows to its own df with a specified name (perhaps taken from an array of names). Rows should be collected until the IGN IGN IGN IGN line is reached, then the following rows should go into the next df until the next IGN IGN IGN IGN line, and so on until the end of the file.
What is a good way to do that?
Answer 1
Score: 1
I ran into this problem a couple of years ago. My solution:
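import pandas as pd

names = ['1', '2', '3', '4']
# The sample file shows two header lines, hence skiprows=2; adjust if your file differs
df = pd.read_table('file.txt', names=names, delimiter=' ', skiprows=2)  # Read the data
index = list(df.loc[df['1'] == 'IGN'].index)  # Row labels where IGN occurs
df_list = []  # List that stores the individual dataframes
start = df.index.min()  # Label of the first row of the current block
for end in index:  # Loop over all separator rows
    # .loc slices include both ends, so stop one row before the IGN line.
    # The IGN rows make every column string-typed, so convert each block back to float.
    df_list.append(df.loc[start:end - 1].astype(float))
    start = end + 1  # The next block starts right after the separator
df_list.append(df.loc[start:].astype(float))  # Last block, after the final IGN line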
You can then access the individual dataframes like this:
df_list[0]
df_list[1]
...
df_list[n]
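Since the question also asks for a name per dataframe, here is a minimal sketch of one way to get that, using a cumulative count of the IGN rows as a block id and a dict keyed by name. The column names, the entries of block_names, and the shortened inline sample that stands in for file.txt are all assumptions made for illustration, not something taken from the original file.

```python
import io
import pandas as pd

# Shortened inline stand-in for 'file.txt' so the sketch runs on its own.
sample = """LINE 1 to SKIP
LINE 2 to SKIP
2.13999987 0.139999986 -0.398405492 1
2.61999989 6.0000062E-2 0.450082362 1
IGN IGN IGN IGN
21.4200001 0.420000076 1.53572667 1
22.3199997 0.479999542 -0.595370948 1
23.3199997 0.520000458 0.136062101 1
IGN IGN IGN IGN
58.6800003 0.520000458 -0.789233088 1
"""

names = ["a", "b", "c", "d"]                # hypothetical column names
block_names = ["first", "second", "third"]  # hypothetical dataframe names

# Read everything as strings, because the IGN rows are not numeric.
raw = pd.read_csv(io.StringIO(sample), sep=r"\s+", names=names,
                  skiprows=2, dtype=str)

is_sep = raw[names[0]].eq("IGN")  # True on separator rows
block_id = is_sep.cumsum()        # 0 before the first IGN, 1 after it, ...

# Drop the separator rows, split by block id, convert each block to float.
blocks = [
    chunk.astype(float).reset_index(drop=True)
    for _, chunk in raw[~is_sep].groupby(block_id[~is_sep])
]

# Pair each block with its name; zip stops at the shorter of the two lists.
frames = dict(zip(block_names, blocks))
```

With this, frames["first"] holds the rows before the first IGN line, frames["second"] the rows between the two IGN lines, and so on. For the real file, replace io.StringIO(sample) with the path to file.txt.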
Greetings