January 6, 2020 18:29:14

Split values from columns and generate sequence numbers

Question

I have two columns in a DataFrame. Each column holds multiple comma-separated values in a single row. I want to split each value into its own row in another table and generate a sequence number that restarts for every original row. The given data is:

x                    y
76.25, 345.65        87.12,96.45
78.12,35.1,98.27     85.23,65.2,56.63

The new DataFrame should look like this:

x                  76.25
y                  87.12
sequence number    1
x                  345.65
y                  96.45
sequence number    2
x                  78.12
y                  85.23
sequence number    1
x                  35.1
y                  65.2
sequence number    2
x                  98.27
y                  56.63
sequence number    3

All values are strings. I have no idea how to do this. Should I write a function, or is there a built-in DataFrame command for it? Any help is appreciated.

Answer 1

Score: 0

You can do this with iterrows() and concat():

import pandas as pd

df = pd.DataFrame({
    'x': ('76.25,345.65', '78.12,35.1,98.27'),
    'y': ('87.12,96.45', '85.23,65.2,56.63')
})

def get_parts():
    # walk the rows and split each comma-separated cell into its values
    for _, row in df.iterrows():
        x = row['x'].split(',')
        y = row['y'].split(',')
        for i, _ in enumerate(x):
            # len(x) must be equal to len(y)
            yield 'x', x[i]
            yield 'y', y[i]
            # generate the sequence number after each split pair
            yield 'sequence number', i + 1

# build a one-element Series from each (label, value) part and concatenate them;
# the result is a single Series whose index holds the labels
new_df = pd.concat([
    pd.Series([p[1]], index=[p[0]])
    for p in get_parts()
])
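
The code above relies on both cells in a row splitting into the same number of values, and the question's sample data contains a space after one comma. A minimal sketch of a helper that handles both points is shown here; the name split_pairs and its sep parameter are hypothetical and not part of the answer's code:

```python
def split_pairs(x_cell, y_cell, sep=","):
    """Split two delimited strings and pair their values with a 1-based sequence number."""
    # Strip whitespace so "76.25, 345.65" yields "345.65" rather than " 345.65"
    xs = [v.strip() for v in x_cell.split(sep)]
    ys = [v.strip() for v in y_cell.split(sep)]
    # Fail loudly instead of raising an IndexError (or silently dropping values) later
    if len(xs) != len(ys):
        raise ValueError(f"length mismatch: {len(xs)} x values vs {len(ys)} y values")
    return [(x, y, seq) for seq, (x, y) in enumerate(zip(xs, ys), start=1)]


print(split_pairs("76.25, 345.65", "87.12,96.45"))
# [('76.25', '87.12', 1), ('345.65', '96.45', 2)]
```

Calling this once per row inside the iterrows() loop would replace the manual indexing into x and y.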

Hope this helps.
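
On the question of whether a built-in command exists: the following is a sketch of an alternative that avoids the explicit row loop, not the approach from the answer above. It assumes pandas 1.3 or later (where DataFrame.explode accepts a list of columns) and produces a table with one row per x/y pair instead of the stacked label/value layout. The names lists and out are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "x": ["76.25, 345.65", "78.12,35.1,98.27"],
    "y": ["87.12,96.45", "85.23,65.2,56.63"],
})

# Remove stray spaces, then turn each comma-separated string into a list
lists = df.apply(lambda col: col.str.replace(" ", "", regex=False).str.split(","))

# Explode both columns together; each list element becomes its own row
# and the original row label is repeated in the index
out = lists.explode(["x", "y"])

# Number the values within each original row, starting at 1
out["sequence number"] = out.groupby(level=0).cumcount() + 1
out = out.reset_index(drop=True)

print(out)
```

The values stay strings, as in the question. Exploding both columns in one call requires the lists in a row to have equal lengths; otherwise pandas raises a ValueError.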
