How to find the number of males within an age range
Question
My DataFrame is shown below. I want to:
* Find the number of males who are older than 40 and younger than 60
* Find the number of females who are older than 40 and younger than 60
   customer_Id         DOB Gender
0       268408  02-01-1920      M
1       268408  02-01-1950      M
2       268408  02-01-1990      F
3       268408  02-01-1970      M
4       268408  02-01-1950      F
My approach: first derive an age column from DOB, then filter with (df.age > 40) & (df.age < 60).
Pseudo code:
now = pd.Timestamp('now')
only_date, only_time = now.date(), now.time()
df['age'] = (pd.to_datetime(only_date) - df['DOB']).astype('<m8[Y]')
df.info() reports: DOB 207518 non-null datetime64[ns]
but the subtraction does not work.
Expected output
M 1
F 0
Answer 1
Score: 2
You'll need to respect calendar years if you want ages to be exactly right. This can be done with pd.offsets.DateOffset. First convert DOB to datetime, then check whether each DOB falls strictly between today - 60 years and today - 40 years.
import pandas as pd
df['DOB'] = pd.to_datetime(df.DOB)
today = pd.to_datetime('today').normalize()
# inclusive='neither' needs pandas >= 1.3; older versions use inclusive=False
m = df.DOB.between(today - pd.offsets.DateOffset(years=60),
                   today - pd.offsets.DateOffset(years=40),
                   inclusive='neither')
# Subset and count
df.loc[m].Gender.value_counts()
#M    1
#Name: Gender, dtype: int64
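For a self-contained check, the sketch below rebuilds the sample frame from the question and applies the same kind of date-window mask. The fixed reference date, the "%d-%m-%Y" date format, and the helper name count_by_gender are assumptions for illustration, not part of the original answer. The reindex step reports a zero for genders with no match, which mirrors the expected output in the question.

```python
import pandas as pd

# Sample data from the question; DOB strings are assumed to be day-month-year.
df = pd.DataFrame({
    "customer_Id": [268408] * 5,
    "DOB": ["02-01-1920", "02-01-1950", "02-01-1990", "02-01-1970", "02-01-1950"],
    "Gender": ["M", "M", "F", "M", "F"],
})
df["DOB"] = pd.to_datetime(df["DOB"], format="%d-%m-%Y")


def count_by_gender(frame, today, low=40, high=60):
    """Count people per gender whose exact age lies strictly between low and high years."""
    earliest = today - pd.DateOffset(years=high)  # born after this: younger than `high`
    latest = today - pd.DateOffset(years=low)     # born before this: older than `low`
    in_range = (frame["DOB"] > earliest) & (frame["DOB"] < latest)
    counts = frame.loc[in_range, "Gender"].value_counts()
    # Report a zero for genders that have no match.
    return counts.reindex(["M", "F"], fill_value=0)


# A fixed reference date keeps the example reproducible (hypothetical value).
counts = count_by_gender(df, pd.Timestamp("2020-01-15"))
print(counts)
```

With that reference date only the male born in 1970 falls inside the window, so the counts are 1 for M and 0 for F. Swapping in pd.Timestamp("today").normalize() gives the live result.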
Answer 2
Score: 1
import datetime as dt

def cal_age(dob: str) -> int:
    # Parse a day-month-year string such as "02-01-1970"
    x = dt.datetime.strptime(dob, "%d-%m-%Y")
    y = dt.date.today()
    # Subtract one year if this year's birthday has not happened yet
    age = y.year - x.year - ((y.month, y.day) < (x.month, x.day))
    return age

df['Age'] = df.DOB.apply(cal_age)
in_range = (df.Age > 40) & (df.Age < 60)
(in_range & (df.Gender == 'M')).sum()  # male
(in_range & (df.Gender == 'F')).sum()  # female
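The birthday adjustment can be checked in isolation with plain datetime objects. In this sketch the function name age_on and the reference date are hypothetical. The tuple comparison subtracts one year when the birthday has not yet occurred in the reference year.

```python
import datetime as dt


def age_on(dob: dt.date, today: dt.date) -> int:
    """Return the number of completed years between dob and today."""
    # (month, day) tuples compare lexicographically, so the comparison is True
    # (and counts as 1) when today's date falls before the birthday this year.
    return today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))


reference = dt.date(2020, 6, 15)  # fixed, hypothetical reference date
print(age_on(dt.date(1970, 1, 2), reference))    # birthday already passed: 50
print(age_on(dt.date(1970, 12, 31), reference))  # birthday still to come: 49
```

Passing the reference date as an argument, rather than calling dt.date.today() inside the function, keeps the result reproducible.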
Answer 3
Score: 0
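Try:
import numpy as np
import pandas as pd

df.groupby('Gender').DOB.agg(lambda grp: np.count_nonzero(
    ((pd.Timestamp.today() - grp).dt.days / 365.25).between(40, 60, inclusive='neither')))
pd.Timestamp.today() - grp is the time elapsed since each birth date in the group.
.dt.days / 365.25 converts it to approximate years; astype('timedelta64[Y]') is avoided because pandas 2.x no longer accepts year units.
between(40, 60, inclusive='neither') returns a boolean Series marking who is strictly inside the required age range (pandas >= 1.3; the default would also include 40 and 60).
Finally, np.count_nonzero(...) counts the True values.
The whole computation is performed separately for each gender.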
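An equivalent formulation that avoids the lambda is to build one boolean mask and sum it per gender, which also reports a zero for genders with no match. The sketch below assumes the question's sample data, a fixed reference date, and a 365.25-day approximation of a year. None of these come from the original answer.

```python
import pandas as pd

# Sample data from the question; DOB strings are assumed to be day-month-year.
df = pd.DataFrame({
    "DOB": pd.to_datetime(
        ["02-01-1920", "02-01-1950", "02-01-1990", "02-01-1970", "02-01-1950"],
        format="%d-%m-%Y",
    ),
    "Gender": ["M", "M", "F", "M", "F"],
})

today = pd.Timestamp("2020-01-15")  # fixed, hypothetical reference date

# Approximate age in years; 365.25 days per year is an assumption.
age_years = (today - df["DOB"]).dt.days / 365.25
in_range = (age_years > 40) & (age_years < 60)

# Summing booleans counts the True values within each gender.
result = in_range.groupby(df["Gender"]).sum()
print(result)
```

Because the age is approximated from a day count, people within a day or so of a 40th or 60th birthday can land on the wrong side of the boundary. The calendar-aware offsets from Answer 1 avoid that.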