February 6, 2023, 09:53:20

Manipulating Values in Pandas DataFrames

Question

I am trying to create and apply a function, change(x), that modifies a single column, grocery, of the grocery DataFrame shown in the image below.
[image: grocery data]

I want to achieve the result in the image below.
[image: output]

I am a beginner in Python, but I know I can use the map() or apply() functions to solve this. My main problem is using the split() method to get there, because the values in the grocery column are of varying lengths. Are there other string manipulation methods that could be used instead?

import pandas as pd
groceries = {
    'grocery': ["Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea", "Sainsbury's croissant", "Morrison's doughnut", "Amazon fresh's peppermint tea", "Bar becan's pizza", "Pound savers' shower gel"],
    'category': ['biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery', 'hygiene'],
    'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
}
df = pd.DataFrame(groceries)
df
# function to modify a single column of values - grocery
def change(x):
    return df['grocery'].str.split(' ').str[1]
df = pd.DataFrame(groceries)
df['grocery'] = df['grocery'].map(change)
df
# Expected DataFrame
groceries = pd.DataFrame({
    'grocery': ['Wafers', 'Shortbread', 'Lemon Tea', 'Croissant', 'Doughnut', 'Peppermint Tea', 'Pizza', 'Shower Gel'],
    'category': ['biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery', 'hygiene'],
    'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
})

Answer 1

Score: 0

Assuming you have a DataFrame df with your original data:

# strip() removes the space left in front of the product name after the split
df['grocery'] = df['grocery'].apply(lambda x: (x.split("'s", 1) if "'s" in x else x.split("'", 1))[1].strip().title())
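
The question also asks whether other string manipulation methods could work. One possibility, offered as a sketch rather than as part of the original answer, is to stay with pandas' vectorized .str methods and strip the shop name with a regular expression. The pattern below is my own assumption about the data: each value holds one possessive apostrophe, optionally followed by "s". The three-row frame is only a shortened sample of the question's data.

```python
import pandas as pd

# Shortened sample of the question's data (illustrative only).
df = pd.DataFrame({
    "grocery": ["Tesco's wafers", "Aldi's lemon tea", "Pound savers' shower gel"],
})

# Assumed pattern: everything up to the first apostrophe, an optional
# possessive "s" right after it, and any whitespace that follows.
possessive_prefix = r"^.*?'s?\s*"

# Remove the matched prefix, then capitalise each remaining word.
cleaned = df["grocery"].str.replace(possessive_prefix, "", regex=True).str.title()
print(cleaned.tolist())  # ['Wafers', 'Lemon Tea', 'Shower Gel']
```

Values without an apostrophe do not match the pattern, so they would only be title-cased.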

Answer 2

Score: 0

I hope this works for you. I split each string on the apostrophe (') and then take the remainder from index 1 onward. Whether that is enough depends on what your data looks like.

import pandas as pd
groceries = {
    'grocery': [
        "Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea",
        "Sainsbury's croissant", "Morrison's doughnut",
        "Amazon fresh's peppermint tea", "Bar becan's pizza",
        "Pound savers' shower gel"
    ],
    'category': [
        'biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery',
        'hygiene'
    ],
    'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
}
df = pd.DataFrame(groceries)
# Split on the apostrophe, take the part after it, and drop its first
# character (the possessive "s", or the space after a bare apostrophe).
# strip() then removes any leading space that is still left.
# If the grocery strings need several conditions, use a named function instead:
# def grocery_chng(x):
#     # specify multiple conditions to replace a string
#     return x
# df['grocery'] = df['grocery'].apply(grocery_chng)
df['grocery'] = df['grocery'].apply(lambda x: x.split("'")[1][1:].strip().title())
df
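
The commented-out stub above hints at a named function for cases with several conditions, which is also what the question asked for (a change(x) function used with map()). A minimal sketch of that idea follows. It uses str.partition instead of split, and the rules it encodes (one apostrophe per value, an optional "s" after it, untouched text when no apostrophe is present) are assumptions drawn from the sample data, not something the original answer spells out.

```python
import pandas as pd


def change(name):
    """Return the product part of "<shop>'s <product>", title-cased."""
    # partition() always returns three parts, so there is no IndexError
    # when the apostrophe is missing.
    shop, apostrophe, product = name.partition("'")
    if not apostrophe:
        # Assumed fallback: no possessive found, only fix the casing.
        return name.strip().title()
    # Drop the "s" of "'s"; a bare apostrophe ("savers' ...") has none.
    if product.startswith("s "):
        product = product[1:]
    return product.strip().title()


df = pd.DataFrame({
    "grocery": [
        "Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea",
        "Sainsbury's croissant", "Morrison's doughnut",
        "Amazon fresh's peppermint tea", "Bar becan's pizza",
        "Pound savers' shower gel",
    ],
})

# map() calls change() once per value, passing the string itself.
df["grocery"] = df["grocery"].map(change)
print(df["grocery"].tolist())
```

The function receives one string at a time, so it works on its argument and never needs to refer to df['grocery'] inside its body.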
