How to find the number of males and females within an age range

Question


My data frame is below

* Find the number of males whose age is greater than 40 and less than 60

* Find the number of females whose age is greater than 40 and less than 60

       customer_Id         DOB Gender
    0       268408  02-01-1920      M
    1       268408  02-01-1950      M
    2       268408  02-01-1990      F
    3       268408  02-01-1970      M
    4       268408  02-01-1950      F

My plan is to first convert the DOB column to an age column, then filter with (df.age > 40) & (df.age < 60).

Pseudo code:

    now = pd.Timestamp('now')
    only_date, only_time = now.date(), now.time()

    df['age'] = (pd.to_datetime(only_date) - df['DOB']).astype('<m8[Y]')

info() output: DOB 207518 non-null datetime64[ns]

It's not subtracting.

Expected output:

    M 1
    F 0

Answer 1

Score: 2


To get ages exactly right, you need to respect calendar years, which pd.offsets.DateOffset does. First convert DOB to a datetime, then check whether the DOB occurred between today - 60 years and today - 40 years.

    import pandas as pd

    df['DOB'] = pd.to_datetime(df.DOB)
    today = pd.to_datetime('today').normalize()

    # inclusive='neither' excludes both endpoints (pandas 1.3+; older versions used inclusive=False)
    m = df.DOB.between(today - pd.offsets.DateOffset(years=60),
                       today - pd.offsets.DateOffset(years=40),
                       inclusive='neither')

    # Subset and count
    df.loc[m].Gender.value_counts()
    # M    1
    # Name: Gender, dtype: int64
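
The date-window check above can be tried end to end on the sample rows from the question. This is only a sketch under a few assumptions: the DOB strings are taken to be day-month-year (as Answer 2's format string suggests), the reference date is pinned to a fixed value so the result is reproducible, and plain comparison operators stand in for `between` so the snippet does not depend on the pandas version.

```python
import pandas as pd

# Sample rows from the question; DOB is assumed to be day-month-year
df = pd.DataFrame({
    "customer_Id": [268408] * 5,
    "DOB": ["02-01-1920", "02-01-1950", "02-01-1990", "02-01-1970", "02-01-1950"],
    "Gender": ["M", "M", "F", "M", "F"],
})
df["DOB"] = pd.to_datetime(df["DOB"], format="%d-%m-%Y")

# Fixed reference date (an assumption, used instead of "today" for reproducibility)
today = pd.Timestamp("2020-01-07")

# Calendar-aware bounds: exactly 60 and exactly 40 years before the reference date
lower = today - pd.DateOffset(years=60)
upper = today - pd.DateOffset(years=40)

# Keep people born strictly inside the window (both endpoints excluded)
in_range = (df["DOB"] > lower) & (df["DOB"] < upper)

counts = df.loc[in_range, "Gender"].value_counts()
print(counts)
```

With these rows only the male born in 1970 falls inside the window. Genders with no matching rows do not appear in the `value_counts()` result at all.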

Answer 2

Score: 1

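    import datetime as dt

    def cal_age(dob: str) -> int:
        # Parse a day-month-year string such as "02-01-1950"
        x = dt.datetime.strptime(dob, "%d-%m-%Y")
        y = dt.date.today()
        # Subtract one year if this year's birthday has not happened yet
        age = y.year - x.year - ((y.month, y.day) < (x.month, x.day))
        return age

    df['Age'] = df.DOB.apply(cal_age)

    # Combine the conditions into one mask instead of chaining boolean indexing
    in_range = (df.Age > 40) & (df.Age < 60)
    (in_range & (df.Gender == 'M')).sum()  # male
    (in_range & (df.Gender == 'F')).sum()  # female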
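
The birthday adjustment in the age formula is easy to get wrong, so here is a small sketch that checks it against a fixed reference date. The helper name `age_on` and the reference date are hypothetical, introduced only for illustration. The comparison must put the reference date's (month, day) on the left and the birth date's (month, day) on the right.

```python
import datetime as dt

def age_on(dob: dt.date, ref: dt.date) -> int:
    """Completed years between dob and ref (hypothetical helper)."""
    # Tuples compare lexicographically; the boolean result counts as 0 or 1,
    # so one year is subtracted when the birthday has not yet occurred in ref's year
    return ref.year - dob.year - ((ref.month, ref.day) < (dob.month, dob.day))

ref = dt.date(2020, 1, 7)  # fixed reference date, an assumption for reproducibility

already_had_birthday = age_on(dt.date(1970, 1, 2), ref)
birthday_still_ahead = age_on(dt.date(1970, 1, 8), ref)
print(already_had_birthday, birthday_still_ahead)
```

Someone born on 2 January 1970 has already turned 50 by the reference date, while someone born on 8 January 1970 is still 49.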


Answer 3

Score: 0


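Try:

    import numpy as np
    import pandas as pd

    df.groupby('Gender').DOB.agg(lambda grp: np.count_nonzero(
        ((pd.Timestamp.today() - grp) / pd.Timedelta(days=365.25))
        .between(40, 60, inclusive='neither')))

pd.Timestamp.today() - grp is the age of each person in the group, as a timedelta.

Dividing by pd.Timedelta(days=365.25) converts it to approximate years. On pandas versions before 2.0, astype('timedelta64[Y]') does the same job, but later versions reject that conversion.

between(40, 60, inclusive='neither') returns a boolean for each person: whether their age is strictly inside the required range.

Finally, np.count_nonzero(...) counts the True values.

The whole computation is performed once per gender.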
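
The question's expected output lists a zero for females, which a filter followed by a count drops. One way to keep the zero, sketched here on the question's sample rows, is to build the boolean mask first and then sum it per gender. The fixed reference date, the day-month-year reading of DOB, and the 365.25-day year are all assumptions made for this illustration.

```python
import pandas as pd

# Sample rows from the question; DOB is assumed to be day-month-year
df = pd.DataFrame({
    "customer_Id": [268408] * 5,
    "DOB": ["02-01-1920", "02-01-1950", "02-01-1990", "02-01-1970", "02-01-1950"],
    "Gender": ["M", "M", "F", "M", "F"],
})
df["DOB"] = pd.to_datetime(df["DOB"], format="%d-%m-%Y")

# Fixed reference date (an assumption, used instead of "today" for reproducibility)
today = pd.Timestamp("2020-01-07")

# Approximate age in years: timedelta divided by an average year length
age_years = (today - df["DOB"]) / pd.Timedelta(days=365.25)

# True where the age is strictly between 40 and 60
in_range = (age_years > 40) & (age_years < 60)

# Summing the mask per gender counts the True values and keeps groups with zero matches
counts = in_range.groupby(df["Gender"]).sum()
print(counts)
```

Because the grouping happens on the full mask, not on the filtered rows, every gender present in the data gets an entry, including those with a count of 0.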

huangapple
  • Posted on January 7, 2020 at 01:28:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59616446.html