January 3, 2020, 15:43:10

How to create a sequence of rows in a DataFrame based on start and end values defined by columns

Question

I have the following DataFrame:

import numpy as np
import pandas as pd

example_df = pd.DataFrame({'id': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4},
 'seq_start': {0: 0.0, 1: 2800.0, 2: 6400.0, 3: 8400.0, 4: 9800.0},
 'seq_end': {0: 1400.0, 1: 4700.0, 2: 8400.0, 3: 9800.0, 4: 11400.0}})

I'd like to obtain a DataFrame that contains the sequence of values from example_df['seq_start'] to example_df['seq_end'] for each row, so that I can later use the newly created column in a join.

The expected output would look like this:

out_df = pd.DataFrame({'id': np.concatenate([[0] * 15, [1] * 20, [2] * 21]),
                       'expected_output': np.concatenate([np.arange(0, 1500, 100),
                                                          np.arange(2800, 4800, 100),
                                                          np.arange(6400, 8500, 100)])})
    id  expected_output
0    0                0
1    0              100
2    0              200
3    0              300
4    0              400
5    0              500
          ...
12   0             1200
13   0             1300
14   0             1400
15   1             2800
16   1             2900
17   1             3000
          ...
31   1             4400
32   1             4500
33   1             4600
34   1             4700
35   2             6400
36   2             6500
37   2             6600
          ...
54   2             8300
55   2             8400

How can I approach this?

Answer 1

Score: 2

Using pandas.DataFrame.explode:

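def listify(x, step=100, right_closed=True):
    # Sort so the order of the two columns does not matter.
    lower, upper = sorted(x)
    # Extend the upper bound by one step when the end value should be included.
    stop = upper + step if right_closed else upper
    return list(range(lower, stop, step))

# Build one list of values per row, then give every value its own row.
example_df['expected'] = example_df[['seq_end', 'seq_start']].astype(int).apply(listify, axis=1)
new_df = example_df[['id', 'expected']].explode('expected')
print(new_df)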

Output:

    id expected
0    0        0
0    0      100
0    0      200
0    0      300
0    0      400
..  ..      ...
4    4    11000
4    4    11100
4    4    11200
4    4    11300
4    4    11400
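
Note that explode keeps the original row index (0, 0, 0, ...) and leaves the new column with object dtype, which can get in the way of a later join on integer keys. As an alternative, here is a minimal sketch that builds the same rows directly with NumPy. It is not part of the answer above: the function name expand_sequences and the STEP constant are made up for illustration, and it assumes a step of 100 and that every seq_end lies on the step grid, as in the question's data.

```python
import numpy as np
import pandas as pd

example_df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "seq_start": [0.0, 2800.0, 6400.0, 8400.0, 9800.0],
    "seq_end": [1400.0, 4700.0, 8400.0, 9800.0, 11400.0],
})

STEP = 100  # assumed step size, inferred from the expected output in the question


def expand_sequences(df, step=STEP):
    """Hypothetical helper: one output row per value in [seq_start, seq_end]."""
    starts = df["seq_start"].astype(int).to_numpy()
    ends = df["seq_end"].astype(int).to_numpy()
    # Number of values generated for each input row (end value included).
    counts = (ends - starts) // step + 1
    # Repeat each id once per generated value.
    ids = np.repeat(df["id"].to_numpy(), counts)
    # Position of each output row within its own group: 0, 1, 2, ...
    group_offsets = np.repeat(np.cumsum(counts) - counts, counts)
    positions = np.arange(counts.sum()) - group_offsets
    values = np.repeat(starts, counts) + positions * step
    return pd.DataFrame({"id": ids, "expected_output": values})


out_df = expand_sequences(example_df)
print(out_df)
```

The result has a fresh 0..n-1 index and integer columns, so it should be usable as-is in something like out_df.merge(other_df, on="expected_output"), where other_df stands for whatever table you want to join against.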
