How to create a `pd.Series` with a `pd.CategoricalIndex` in one line
Question
I have a working one-liner that creates a pd.Series with a pd.CategoricalIndex from a pd.DataFrame.
Since pd.DataFrame is an API built on pd.Series, I would like to generate the same series using pd.Series only. This is a question of optimization and of pandas API skills.
The pd.DataFrame code is listed below:
import pandas as pd

pd_series_1 = pd.DataFrame(
    data=[
        ("2018-01", 0.0),
        ("2019-02", 1200.0),
        ("2019-03", 600.0),
    ],
    columns=['TIME_PERIOD', "OBS_VALUE"],
).astype(
    {"TIME_PERIOD": "category"}
).set_index(
    "TIME_PERIOD"
)["OBS_VALUE"]

assert pd_series_1.index.name == "TIME_PERIOD"
assert repr(pd_series_1.index) == "CategoricalIndex(['2018-01', '2019-02', '2019-03'], " \
                                  "categories=['2018-01', '2019-02', '2019-03'], " \
                                  "ordered=False, " \
                                  "dtype='category', " \
                                  "name='TIME_PERIOD')", repr(pd_series_1.index)
assert repr(pd_series_1) == "TIME_PERIOD\n" \
                            "2018-01       0.0\n" \
                            "2019-02    1200.0\n" \
                            "2019-03     600.0\n" \
                            "Name: OBS_VALUE, dtype: float64", repr(pd_series_1)
As you can see, the final series pd_series_1 has a CategoricalIndex named 'TIME_PERIOD', and the series itself is named 'OBS_VALUE'.
I want the same result using only the pd.Series API, either within the constructor or with additional chained methods similar to .set_index, as in pd_series_1.
The pd.Series code I used is listed below:
pd_series_2 = pd.Series(dict(
    [
        ("2018-01", 0.0),
        ("2019-02", 1200.0),
        ("2019-03", 600.0),
    ]),
    name='OBS_VALUE',
)
print(pd_series_2)
# 2018-01       0.0
# 2019-02    1200.0
# 2019-03     600.0
# Name: OBS_VALUE, dtype: float64

pd_series_2.index = pd.CategoricalIndex(pd_series_2.index, name='TIME_PERIOD')
print(pd_series_2)
# TIME_PERIOD
# 2018-01       0.0
# 2019-02    1200.0
# 2019-03     600.0
# Name: OBS_VALUE, dtype: float64
As you can see, I managed to get the desired result, but the code is not a one-liner.
Please suggest a one-liner syntax. Thank you in advance.
Answer 1
Score: 2
Use Series.pipe with Series.set_axis:
pd_series_2 = pd.Series(dict(
    [
        ("2018-01", 0.0),
        ("2019-02", 1200.0),
        ("2019-03", 600.0),
    ]),
    name='OBS_VALUE',
).pipe(lambda x: x.set_axis(pd.CategoricalIndex(x.index, name='TIME_PERIOD')))

print(pd_series_2.index)
CategoricalIndex(['2018-01', '2019-02', '2019-03'],
                 categories=['2018-01', '2019-02', '2019-03'],
                 ordered=False,
                 dtype='category',
                 name='TIME_PERIOD')
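As a rough check, the sketch below compares the pipe/set_axis chain against the DataFrame-based one-liner from the question using pd.testing.assert_series_equal. It also tries a second variant that is not part of this answer and is only an assumption about what may suit the asker: passing a ready-made pd.CategoricalIndex through the constructor's index argument. The names records, expected, via_pipe and via_constructor are illustrative only.

```python
import pandas as pd

# Sample data from the question.
records = [
    ("2018-01", 0.0),
    ("2019-02", 1200.0),
    ("2019-03", 600.0),
]

# Reference result: the DataFrame-based one-liner from the question.
expected = (
    pd.DataFrame(records, columns=["TIME_PERIOD", "OBS_VALUE"])
    .astype({"TIME_PERIOD": "category"})
    .set_index("TIME_PERIOD")["OBS_VALUE"]
)

# Variant 1: the pipe + set_axis chain shown in this answer.
via_pipe = pd.Series(dict(records), name="OBS_VALUE").pipe(
    lambda s: s.set_axis(pd.CategoricalIndex(s.index, name="TIME_PERIOD"))
)

# Variant 2 (assumption, not from the answer): build the index up front
# and hand it to the Series constructor.
labels, values = zip(*records)
via_constructor = pd.Series(
    list(values),
    index=pd.CategoricalIndex(list(labels), name="TIME_PERIOD"),
    name="OBS_VALUE",
)

# Both variants should match the reference, including the index and series names.
pd.testing.assert_series_equal(via_pipe, expected)
pd.testing.assert_series_equal(via_constructor, expected)
```

The constructor variant avoids the lambda, but the labels and values have to be separated before the call, so whether it still counts as a one-liner depends on how the input data arrives.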