March 3, 2023, 19:43:47

pandas read_table with stopping strings to delimit different dataframes to assign

Question

I have a space-delimited text file of the form:

LINE 1 to SKIP
LINE 2 to SKIP
2.13999987 0.139999986 -0.398405492 1
2.61999989 6.0000062E-2 0.450082362 1
2.74000001 5.99999428E-2 1.04403841 1
2.84000015 4.00000811E-2 6.17375337E-2 1
IGN IGN IGN IGN 
21.4200001 0.420000076 1.53572667 1
22.3199997 0.479999542 -0.595370948 1
23.3199997 0.520000458 0.136062101 1
24.3600006 0.519999504 -0.520044923 1
25.3999996 0.520000458 2.45230961 1
26.4399986 0.519999504 -2.08248448 1
27.4799995 0.520000458 -0.263438225 1
IGN IGN IGN IGN 
58.6800003 0.520000458 -0.789233088 1
59.7200012 0.520000458 -1.02961564 1
60.7600021 0.51999855 -0.889572859 1
61.7999992 0.520000458 -1.03346229 1
62.8400002 0.520000458 4.94940579E-2 1

I would like to read it with pandas, along these lines:

df_first = pd.read_table('file.txt', names=names, delimiter=' ', skiprows=3, nrows=4)

(where names holds the column names for file.txt).
I want to assign each run of rows to its own df, each with a given name (perhaps taken from an array of names). Rows go into one df until the IGN IGN IGN IGN string is met, then the following rows go into the next df until the next IGN IGN IGN IGN string, and so on until the end of the file.

What is a good way to do that?

Answer 1

Score: 1

I ran into this problem a couple of years ago. My solution:

import pandas as pd

names = ['1', '2', '3', '4']
# Read the data. The sample shows two header lines, so skiprows=2 (adjust if your file has more).
# sep=r'\s+' also tolerates the trailing space on the IGN lines.
df = pd.read_table('file.txt', names=names, sep=r'\s+', skiprows=2)
index = list(df.loc[df['1'] == 'IGN'].index)  # Row labels where IGN occurs
df_list = []  # List that stores the resulting dataframes
start = df.index.min()  # Start label of the current block
for end in index:  # Loop through all separator rows
    # .loc slicing is inclusive, so stop one row before the IGN row.
    # The IGN rows make pandas read the columns as text, hence astype(float).
    df_list.append(df.loc[start:end - 1].astype(float))
    start = end + 1
df_list.append(df.loc[start:].astype(float))  # Last slice of the main dataframe

You can access the individual dataframes by position:

df_list[0]
df_list[1]
...
df_list[n]

Greetings
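
The question also asks for each block to get a name. As a hedged sketch of one way to do that (not part of the answer above), the separator rows can be turned into block ids with a cumulative sum and the blocks collected with groupby. The helper name split_on_marker, the column names a to d, and the frame names first/second/third are all made up for illustration; the sample text is copied from the question and read from memory so the snippet runs on its own.

```python
import io
import pandas as pd

# Sample text copied from the question: two header lines, IGN separator rows.
SAMPLE = """LINE 1 to SKIP
LINE 2 to SKIP
2.13999987 0.139999986 -0.398405492 1
2.61999989 6.0000062E-2 0.450082362 1
2.74000001 5.99999428E-2 1.04403841 1
2.84000015 4.00000811E-2 6.17375337E-2 1
IGN IGN IGN IGN 
21.4200001 0.420000076 1.53572667 1
22.3199997 0.479999542 -0.595370948 1
23.3199997 0.520000458 0.136062101 1
24.3600006 0.519999504 -0.520044923 1
25.3999996 0.520000458 2.45230961 1
26.4399986 0.519999504 -2.08248448 1
27.4799995 0.520000458 -0.263438225 1
IGN IGN IGN IGN 
58.6800003 0.520000458 -0.789233088 1
59.7200012 0.520000458 -1.02961564 1
60.7600021 0.51999855 -0.889572859 1
61.7999992 0.520000458 -1.03346229 1
62.8400002 0.520000458 4.94940579E-2 1
"""


def split_on_marker(source, names, frame_names=None, marker="IGN", skiprows=2):
    """Hypothetical helper: split a whitespace-delimited table at marker rows."""
    # Read everything as text so the marker rows do not interfere with parsing.
    raw = pd.read_table(source, names=names, sep=r"\s+", skiprows=skiprows, dtype=str)
    is_marker = raw[names[0]] == marker
    # The running count of marker rows seen so far serves as a block id.
    block_id = is_marker.cumsum()
    data = raw[~is_marker]
    frames = [
        block.apply(pd.to_numeric).reset_index(drop=True)  # back to numbers
        for _, block in data.groupby(block_id[~is_marker])
    ]
    if frame_names is None:
        return frames
    # Pair each block with its name, in file order.
    return dict(zip(frame_names, frames))


frames = split_on_marker(
    io.StringIO(SAMPLE),               # pass 'file.txt' here for a real file
    names=["a", "b", "c", "d"],        # assumed column names
    frame_names=["first", "second", "third"],
)
print(frames["second"])
```

With the sample above, frames["first"], frames["second"] and frames["third"] hold 4, 7 and 5 rows respectively. A dict keyed by name is usually easier to work with than creating a separately named variable for every block.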
