How do I count icons using rvest?

Question

I want to count the number of overall stars for each player on this page: https://cbgm.news/stats/CONN_Ratings.html

Here's my rvest code:

library(tidyverse)
library(rvest)

url <- "https://cbgm.news/stats/CONN_Ratings.html"

scrape <- url %>% 
  read_html() %>% 
  html_nodes("td:nth-child(19)")

scrape

This returns:

{xml_nodeset (14)}
 [1] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [2] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [3] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [4] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [5] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [6] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [7] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [8] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
 [9] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[10] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[11] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[12] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[13] <td>\n<i class="star yellow icon"></i><i class="star yellow ic ...
[14] <td><i class="star half yellow icon"></i></td>\n

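How do I convert the xml_nodeset to a data frame or tibble that allows for mutating and counting the number of star icons?

I appreciate any help with this puzzle!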


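Answer 1

Score: 0

You could make a small function that looks for stars (full and half) and returns the count, with each half star counting as 0.5. Then use `mutate()` to store the result of applying that function to each element of `scrape` in a column (`OVERALL` in the code below).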

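```R
# Count star icons in one <td>: a full star is 1, a half star 0.5.
# "star yellow" does not match the half-star class "star half yellow icon".
f <- function(s) {
  s <- as.character(s)
  str_count(s, "star yellow") + str_count(s, "star half") / 2
}
```

Now, use `rvest::html_table()` along with `mutate()`:

```R
# Parse the first table on the page and fill OVERALL with the star counts.
rvest::html_table(read_html(url))[[1]] %>% 
  mutate(OVERALL = sapply(scrape, f))
```

Output: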

     NUM POS   PLAYER    FGI   FGJ    FT   SCR   PAS   HDL   ORB   DRB   DEF   BLK   STL  DRFL    DI    IQ   ATH OVERALL
   <int> <chr> <chr>   <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>   <dbl>
 1     1 PG    Marek …    36    76    54    73    81    50    66    72    65    72    75    21    53    42    58     4  
 2    14 PG    Brian …    10    90    72    50    74    32    71    71    69    53    82    32    65    57    91     3.5
 3    15 PG    Morris…    25    85    56    71    53    60    10    53    76    10    53    28    72    47    76     2  
 4    12 SG    Ryan M…    31    78    96    74    46    38    50    43    71    46    40    35    61    45    75     3.5
 5    21 SG    Lenny …    10    90    67    50    60    49    56    71    58    60    66    39    56    38    69     3  
 6     5 SG    Fred M…    10    83    61    71    30    23    10    78    63    10    16    39    61    38    87     2  
 7    23 SF    Will B…    35    73    58    74    66    38    70    72    52    60    74    21    42    46    30     4  
 8    51 SF    Lyly L…    51    76    83    84    75    32    66    81    61    70    85    24    52    47    60     5  
 9    42 SF    Joe Ch…    58    50    80    70    56    39    53    53    78    10    53    21    54    52    78     2  
10    40 PF    Richar…    63    50    41    72    71    32    79    78    65    71    71    54    39    43    72     4  
11    30 PF    Ammer …    56    54    81    63    60    23    72    72    78    66    56    35    50    54    58     3.5
12    54 C     Xavier…   100    33    36   100    76    16    96    91    76   100    87    61    28    41    73     5  
13    45 C     Brad L…    91    38    56    60    63    19    75    76    78    82    70    58    30    28    68     4  
14    10 C     Ed Str…    68    40    45    10    10    10    13    10    10    10    10    24    17    16    10     0.5
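
As an alternative to matching substrings in the serialized HTML, the icons can be counted as nodes. The following is only a minimal sketch under stated assumptions: the inline HTML is a hypothetical two-row stand-in for the ratings page (so it runs without network access), `count_stars` is an illustrative helper name that does not appear in the original answer, and the column position `td:nth-child(2)` applies to the stand-in table only.

```R
library(rvest)
library(tibble)

# Hypothetical stand-in for the ratings page: name in column 1, icons in column 2.
page <- read_html('
  <table>
    <tr><td>Player A</td><td><i class="star yellow icon"></i><i class="star yellow icon"></i><i class="star half yellow icon"></i></td></tr>
    <tr><td>Player B</td><td><i class="star half yellow icon"></i></td></tr>
  </table>
')

# One <td> node per player.
cells <- html_elements(page, "td:nth-child(2)")

# Illustrative helper: count the <i class="star ..."> children of one cell,
# weighting icons whose class contains "half" as 0.5.
count_stars <- function(cell) {
  classes <- html_attr(html_elements(cell, "i.star"), "class")
  is_half <- grepl("half", classes, fixed = TRUE)
  sum(!is_half) + 0.5 * sum(is_half)
}

stars <- tibble(
  player = html_text2(html_elements(page, "td:nth-child(1)")),
  stars  = vapply(cells, count_stars, numeric(1))
)
stars
```

For the real page, the same helper could presumably be applied to the `scrape` nodeset from the question (selected with `td:nth-child(19)`), and the resulting vector assigned inside `mutate()` in place of `sapply(scrape, f)`. Counting nodes rather than substrings avoids depending on the exact order of the class names, although it still assumes that half stars carry the `half` class.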


Posted by huangapple on 2023-02-27 02:17:30