2023-05-28 07:31:28

How can I extract only USD values from a column in R data table including salaries in crore?

Question

I have a table with data for the ongoing IPL 2023, which includes a list of the players' salaries. The salaries are written in forms such as "₹6.75 crore (US$850,000)", "₹50 lakh (US$63,000)", or "₹16 crore (US$2.0 million)". I want to extract only the USD value (if it's listed in millions, I want the full number, i.e. 2.0 million should be 2,000,000).

    ipl_2023_salaries <- read_excel(file_path)
    usd_values <- str_extract_all(ipl_2023_salaries$Salary, "\\(US\\$([0-9.,]+)\\)")
    usd_values <- sapply(usd_values, function(x) ifelse(length(x) > 0, gsub("[^0-9.]", "", x), NA))
    ipl_2023_salaries$Salary <- as.numeric(usd_values)
    print(ipl_2023_salaries)

I expected this code to give me the output I desired, but instead every salary value is now NA.

Answer 1

Score: 1

Split each string at `$`, pass the second piece to `readr::parse_number()`, and multiply by `1e6` where the text after `$` contains "million":

``` r
library(stringr)  # str_split_i() needs stringr >= 1.5.0
library(readr)

salary <- c("₹6.75 crore (US$850,000)", "₹50 lakh (US$63,000)", "₹16 crore (US$2.0 million)")

# Keep the text after "$", parse its leading number, and scale "million" entries
parse_number(str_split_i(salary, "\\$", 2)) * ifelse(str_detect(salary, "\\$.*million"), 1e6, 1)
#> [1]  850000   63000 2000000
```

<sup>Created on 2023-05-28 with reprex v2.0.2</sup>
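To apply the same idea to the data frame from the question, the logic could be wrapped in a small helper. This is only a sketch: `usd_from_salary()` and the `Salary_USD` column are hypothetical names, the sample data frame stands in for the Excel file, and it assumes every USD amount appears as `US$...` inside parentheses. It uses `str_match()` instead of `str_split_i()` so that rows without a USD part come back as `NA`.

``` r
library(stringr)
library(readr)

# Hypothetical helper (not part of the original answer): pull the USD amount
# out of strings such as "₹16 crore (US$2.0 million)".
usd_from_salary <- function(x) {
  # Text between "US$" and ")", e.g. "850,000" or "2.0 million"; NA if absent
  usd_part <- str_match(x, "US\\$([^)]*)")[, 2]
  # parse_number() ignores the grouping commas and any trailing text
  amount <- parse_number(usd_part)
  # Scale values that are written in millions
  multiplier <- ifelse(str_detect(usd_part, "million"), 1e6, 1)
  amount * multiplier
}

# Stand-in for the data read from Excel; the Player values are made up
ipl_2023_salaries <- data.frame(
  Player = c("A", "B", "C", "D"),
  Salary = c("₹6.75 crore (US$850,000)", "₹50 lakh (US$63,000)",
             "₹16 crore (US$2.0 million)", NA)
)

# Write to a new column so the original text stays available for checking
ipl_2023_salaries$Salary_USD <- usd_from_salary(ipl_2023_salaries$Salary)
print(ipl_2023_salaries)
```

With this sample input, `Salary_USD` should hold 850000, 63000, 2000000 and `NA`. Keeping the raw `Salary` column makes it easier to spot rows whose format the pattern did not anticipate.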
