Count number of unique activities

Question

I have a dataframe (simplified version) as follows:

    df <- data.frame(ID = rep('A', 7),
                     Activity = c('Login','Login','cat1','Login','cat2','Login','Login'))

      ID Activity
    1  A    Login
    2  A    Login
    3  A     cat1
    4  A    Login
    5  A     cat2
    6  A    Login
    7  A    Login

Here is what I wish to do:

  • start from the first row

  • initialize session=0

  • create a dataframe to hold path and count

  • If the Activity is equal to Login, then session=1, check the next row's Activity and record it. This will be a path until the next Login

  • continue until you hit the next Login, then set session=2.

  • The final outcome for this example would be:

    Path          count
    Login         3
    Login, cat1   1
    Login, cat2   1
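
The steps listed above can be sketched literally as a loop. This is only one possible reading of the description, not the asker's actual code: the names `paths` and `path_count` are made up for illustration, and skipping any activity that appears before the first Login is an assumption.

```r
# Sample data from the question
df <- data.frame(ID = rep('A', 7),
                 Activity = c('Login','Login','cat1','Login','cat2','Login','Login'))

session <- 0              # session counter, starts at 0
paths <- character(0)     # hypothetical holder: one path string per session

for (act in as.character(df$Activity)) {
  if (act == "Login") {
    session <- session + 1          # each Login opens a new session
    paths[session] <- act
  } else if (session > 0) {
    # any other activity extends the current session's path
    paths[session] <- paste(paths[session], act, sep = ", ")
  }
  # assumption: activities seen before the first Login are ignored
}

# Tabulate identical paths into a Path/count data frame
path_count <- as.data.frame(table(Path = paths),
                            responseName = "count",
                            stringsAsFactors = FALSE)
print(path_count)
```

For the sample data this finds five sessions and three distinct paths. The loop is easy to follow but grows `paths` one element at a time, so a vectorized approach is likely preferable on large data.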

Answer 1

Score: 0

Create groups based on "Login", split, then paste per group, finally, aggregate using table:

    data.frame(
      table(
        sapply(split(df$Activity, cumsum(df$Activity == "Login")), function(i){
          paste(i, collapse = ",")
        })))
    #         Var1 Freq
    # 1      Login    3
    # 2 Login,cat1    1
    # 3 Login,cat2    1
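
To see why the `cumsum` trick works, it may help to look at the intermediate values. The sketch below unpacks the same one-liner step by step; the variable names (`is_login`, `grp`, `sessions`, `paths`) are illustrative and not part of the original answer.

```r
df <- data.frame(ID = rep('A', 7),
                 Activity = c('Login','Login','cat1','Login','cat2','Login','Login'))

# TRUE wherever a new session starts
is_login <- df$Activity == "Login"

# The running count of Logins gives every row its session number
grp <- cumsum(is_login)
print(grp)
# 1 2 2 3 3 4 5

# One character vector of activities per session
sessions <- split(df$Activity, grp)

# Collapse each session into a single path string
paths <- sapply(sessions, paste, collapse = ",")
print(unname(paths))

# Count how often each distinct path occurs
print(table(paths))
```

Because each Login increments the running sum, every row between two Logins inherits the session number of the Login that precedes it.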

Answer 2

Score: 0

Using dplyr

    library(dplyr)
    df %>%
      group_by(grp = cumsum(Activity == "Login")) %>%
      summarise(Path = toString(Activity)) %>%
      count(Path, name = "count")

Output:

    # A tibble: 3 × 2
      Path        count
      <chr>       <int>
    1 Login           3
    2 Login, cat1     1
    3 Login, cat2     1

huangapple
  • Posted on April 19, 2023, 18:48:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76053621.html