Convert data frame from character to numeric

Question

I have the following data frame with one column, which is currently stored as a character column:

[Image: the data frame]

I am trying to separate the text, but it seems like the separate() function doesn't work on character columns.

I tried to convert the column using the following code. Neither attempt works for me.

First try:

Overview_10_K_filings_df$Overview_10_K_filings <- as.numeric(as.character(Overview_10_K_filings_df$Overview_10_K_filings))

This produces the warning: "NAs introduced by coercion"

Second try:

Overview_10_K_filings_df[1] <- apply(Overview_10_K_filings_df[1], 2,
                                     function(x) as.numeric(as.character(x)))

Can you help me to transform the column? Or is there any other way that I can separate the content?
Thanks!
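
The warning from the first attempt is what as.numeric() reports whenever a string is not entirely a number. The sketch below reproduces it in base R. It assumes the column holds file paths like the one used in the answer; the vector `x` and its second element are made up for illustration.

```r
# Hypothetical values: a path-like string and a purely numeric string
x <- c("QTR4/20151229_10-K_edgar_data_1230058_0000892626-15-000373.txt",
       "1230058")

# The data are already character, so as.character() changes nothing
is.character(x)

# as.numeric() can only parse strings that are numbers from start to end;
# anything else becomes NA and triggers "NAs introduced by coercion"
nums <- suppressWarnings(as.numeric(x))
nums  # first element is NA, second is 1230058
```

Under that assumption the column has to be split first, and only the purely numeric pieces can then be converted.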

Answer 1

Score: 1

One way is to turn the string into a data frame and apply str_replace in three steps.
This is probably not the most concise way to reach the goal; the three intermediate columns are kept in the data frame to show how the replacement proceeds.

library(tidyverse)

t <- "QTR4/20151229_10-K_edgar_data_1230058_0000892626-15-000373.txt"
t |> as.data.frame() |>
  # step 1: the slash becomes a separator
  mutate(new1 = stringr::str_replace(t, '/', ' | ')) |>
  # step 2: every underscore becomes a separator
  mutate(new2 = stringr::str_replace_all(new1, '_', ' | ')) |>
  # step 3: the dot is escaped so that it matches a literal "."
  mutate(new3 = stringr::str_replace(new2, '\\.txt', ' | txt')) |>
  pull(new3)
#> [1] "QTR4 | 20151229 | 10-K | edgar | data | 1230058 | 0000892626-15-000373 | txt"
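
A note on the pattern for the file extension: stringr treats patterns as regular expressions by default, so an unescaped dot matches any character. The short sketch below shows the difference; the string `my_txt.txt` is made up for illustration.

```r
library(stringr)

x <- "my_txt.txt"  # hypothetical file name

# Unescaped dot: "." matches the underscore, so "_txt" is removed
loose <- str_replace(x, ".txt", "")

# Escaped dot: only the literal ".txt" is removed
escaped <- str_replace(x, "\\.txt", "")

# fixed() switches off regular-expression matching altogether
literal <- str_replace(x, fixed(".txt"), "")

c(loose = loose, escaped = escaped, literal = literal)
```

For the path in this answer both forms happen to give the same result, but the escaped or fixed() form is the safer habit.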

Better: replace all three separators in a single call, using one regular expression:

b <- "_|/|\\."
stringr::str_replace_all(t, b, ' | ')
# [1] "QTR4 | 20151229 | 10-K | edgar | data | 1230058 | 0000892626-15-000373 | txt"
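
If the goal is separate columns rather than one delimited string, the same pattern can be used to split instead of replace. The following is a minimal base R sketch, not the answerer's method. It assumes every entry has the same shape as the example string; the data frame `df`, the column `path`, the second row, and the new column names are all hypothetical.

```r
# Hypothetical one-column data frame; the second row is invented for illustration
df <- data.frame(
  path = c(
    "QTR4/20151229_10-K_edgar_data_1230058_0000892626-15-000373.txt",
    "QTR1/20160215_10-K_edgar_data_1230058_0000892626-16-000012.txt"
  ),
  stringsAsFactors = FALSE
)

# Split every path on "_", "/" or a literal "."
pieces <- strsplit(df$path, "_|/|\\.")

# Stack the pieces into one row per path
# (assumes every path splits into the same number of pieces)
parts <- as.data.frame(do.call(rbind, pieces), stringsAsFactors = FALSE)

# Column names are guesses at what each piece means
names(parts) <- c("quarter", "date", "form", "source", "kind",
                  "cik", "accession", "ext")

# Only pieces that are purely numeric convert without producing NA
parts$cik <- as.numeric(parts$cik)
parts$date <- as.Date(parts$date, format = "%Y%m%d")

str(parts)
```

tidyr's separate() takes the same regular expression through its sep argument and works directly on a character column, so converting the column to numeric beforehand is not required.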

huangapple
  • Posted on January 9, 2023 at 18:43:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75056114.html