Manipulating Values in Pandas DataFrames
Question
I am trying to create and apply a function, `change(x)`, that modifies a single column, `grocery`, in the grocery data frame shown in the image below.
[image: grocery data]
I want to achieve the result shown in the image below.
[image: output]
I am a beginner in Python, but I know I can use the `map()` or `apply()` functions to solve this. My main problem is using the `split()` method to get the result, because the values in the `grocery` column are of varying lengths. Are there other string manipulation methods that could be used instead?
import pandas as pd

groceries = {
    'grocery': ["Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea", "Sainsbury's croissant", "Morrison's doughnut", "Amazon fresh's peppermint tea", "Bar becan's pizza", "Pound savers' shower gel"],
    'category': ['biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery', 'hygiene'],
    'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
}
df = pd.DataFrame(groceries)
df

# function to modify a single column of values - grocery
# (my attempt: it ignores x and splits the whole column on spaces)
def change(x):
    return df['grocery'].str.split(' ').str[1]

df = pd.DataFrame(groceries)
df['grocery'] = df['grocery'].map(change)
df

# Expected DataFrame
expected_df = pd.DataFrame({
    'grocery': ['Wafers', 'Shortbread', 'Lemon Tea', 'Croissant', 'Doughnut', 'Peppermint Tea', 'Pizza', 'Shower Gel'],
    'category': ['biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery', 'hygiene'],
    'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
})
Answer 1
Score: 0
Assuming you have a dataframe df with your original data:
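# Split once after the possessive ("'s", or a bare "'" for plural owners),
# then strip the leftover leading space before title-casing.
df['grocery'] = df['grocery'].apply(lambda x: (x.split("'s", 1) if "'s" in x else x.split("'", 1))[1].strip().title())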
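Since the question asks for a named function `change(x)` and for alternatives to `split()`, here is a minimal sketch of the same idea using Python's built-in `str.partition`. It assumes the owner name always ends at the first apostrophe. Returning the value unchanged when there is no apostrophe is my own assumption about the desired behaviour, not something stated in the question.

```python
import pandas as pd

def change(x):
    # Split once at the first apostrophe: (owner, "'", remainder)
    _, sep, rest = x.partition("'")
    if not sep:
        # Assumption: values without an apostrophe are left untouched
        return x
    # "Tesco's wafers" leaves "s wafers"; drop the possessive "s" if present
    if rest.startswith("s "):
        rest = rest[1:]
    # Remove the leading space and capitalise each word
    return rest.strip().title()

df = pd.DataFrame({
    'grocery': ["Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea",
                "Sainsbury's croissant", "Morrison's doughnut",
                "Amazon fresh's peppermint tea", "Bar becan's pizza",
                "Pound savers' shower gel"],
})

# map() calls change() once per value in the column
df['grocery'] = df['grocery'].map(change)
```

Because `change` works on one string at a time, it can be passed to either `map()` or `apply()` on the column.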
Answer 2
Score: 0
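I hope this works for you. I split each value on the apostrophe (') and then take the second piece from index 1 onward, which skips the character right after the apostrophe. Whether this is enough depends on the conditions in your data.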
import pandas as pd
groceries = {
'grocery': [
"Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea",
"Sainsbury's croissant", "Morrison's doughnut",
"Amazon fresh's peppermint tea", "Bar becan's pizza",
"Pound savers' shower gel"
],
'category': [
'biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery',
'hygiene'
],
'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
}
df = pd.DataFrame(groceries)
# Split on the apostrophe, drop the first character of the second piece
# (the possessive "s" or a space), strip whitespace, then title-case.
# If the grocery strings need several conditions, use a named function instead:
# def grocery_chng(x):
#     # specify multiple conditions to replace a string
#     return x
# df['grocery'] = df['grocery'].apply(grocery_chng)
df['grocery'] = df['grocery'].apply(lambda x: x.split("'")[1][1:].strip().title())
df
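If a per-row function is not required, the same cleanup can probably be done with pandas' vectorized string methods. The sketch below assumes every value follows the "Owner's product" or "Owners' product" pattern. The regular expression is my own illustration and does not come from the answers above.

```python
import pandas as pd

df = pd.DataFrame({
    'grocery': ["Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea",
                "Sainsbury's croissant", "Morrison's doughnut",
                "Amazon fresh's peppermint tea", "Bar becan's pizza",
                "Pound savers' shower gel"],
})

# Remove everything up to and including the first apostrophe, an optional
# possessive "s", and the whitespace that follows; then title-case the rest.
df['grocery'] = (
    df['grocery']
    .str.replace(r"^.*?'s?\s+", "", regex=True)
    .str.title()
)
```

Values that do not match the pattern are left as they are by `str.replace` and are only title-cased.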