April 19, 2023, 18:38:53


Sapply Function in R: NA Introduced by coercion, but I have only numeric values

Question

For my project, I need to run a differential expression analysis on this dataset, where the rows are patients and the columns are genes. The first column, os event, denotes survival (either 0 or 1).

For the analysis, I need to convert the values to numeric with `data = sapply(data, as.numeric)`, but I get the warning "NAs introduced by coercion". The real problem is that all values become NA, and I can't understand why, since the only unusual value I have is 0. I tried the same thing with another dataset (which seems to have much the same structure) and it works fine.

UPDATE: as suggested in the comments, I ran `dput(data)`, where `data` is my data frame. The output is:

    structure(list(os.event.RTK_RAS.NRF2.PI3K.WNT.HIPPO.CELL_CYCLE.MYC.NOTCH.TGF_Beta.TP53 = c("0;2;0;2.642857143;12.875;0;0;0;0;0;0",
    "0;1.72;0;5.071428571;7.5;0;4.25;0;0;9.333333333;5.666666667",
    "0;1.52;0;0;6.5;3.166666667;6.625;0;0;10.66666667;8.833333333",
    "0;0.88;0;1.928571429;6;2.25;0;0;0;0;0", "1;0.8;0;2.285714286;3.375;2.666666667;15;0;9.714285714;0;8.666666667",
    "0;1.4;0;0;13.375;0;7.625;0;0;0;10.16666667", "1;2.48;0;0;8.625;0;8.75;0;0;16.66666667;11.66666667",
    "0;1.08;0;0;7.875;2.25;6.125;0;5.428571429;0;8.166666667", "1;1.16;0;2;14.25;2.333333333;10.75;0;10.42857143;0;9.5",
    "0;1.56;0;0;14;0.8333333333;14;12;0;0;18.66666667", "0;0;0;0;3.875;0;4.75;0;0;8.666666667;6.333333333",
    "0;3.88;0;0;1.875;2.833333333;10.25;0;0;0;13.66666667", "0;7.76;17.5;2.785714286;0;7.583333333;7.875;0;0;0;10.5",
    "0;3.16;17.5;2.5;9;2.583333333;13.75;8.5;24.85714286;12.33333333;5.833333333",
    "1;2.36;0;4.857142857;5.75;3;0;0;0;0;0", "0;4.44;0;1.714285714;2.125;0;9;0;2.714285714;0;12",
    "1;1;0;2.214285714;4.25;2.583333333;5;0;0;0;6.666666667", "0;2.52;0;0;4.25;5.25;12;0;4.428571429;0;10.83333333",
    "0;1.08;0;2.214285714;7.125;2.583333333;0;0;11.28571429;0;0",
    "0;1.56;0;4.285714286;19.375;5;7.5;0;1.571428571;0;10", "0;1.08;0;1.571428571;8;1.833333333;0;0;0;0;0",
    "1;0.56;0;2.142857143;3.125;0;0;0;6.714285714;0;0", "0;0;0;0;2.125;0;5.5;0;0;0;7.333333333",
    "1;2.32;0;3.642857143;6.125;4.25;5.875;0;0;0;7.833333333", "1;0;0;7.714285714;14.875;9;6.5;0;0;0;8.666666667",
    "0;3.08;0;2.571428571;0;3;2.375;0;2.714285714;0;0", "1;1.48;0;0;7.5;0;5.375;0;0;0;7.166666667",
    "0;4;0;0;2.875;0;5.25;11.75;0;0;7", "0;2.88;0;1.428571429;3;1.666666667;0;0;0;9;0",
    "0;0;0;0;4.875;0;9.5;0;0;0;20.83333333", "1;4.12;0;2.642857143;4.25;3.083333333;6;0;0;0;8",
    "0;0;0;1.5;6.125;1.75;8.125;0;0;0;10.83333333", "0;1.56;0;0;7.25;0;4.25;0;0;0;5.666666667",
    "0;1.28;0;0.7142857143;8.75;0.8333333333;3.75;0;0;0;5", "1;5.56;0;0;5.375;0;8.75;0;7.285714286;0;0",
    "1;0.96;0;0;2.125;0;7.875;0;4.285714286;0;13", "0;2.76;0;12.21428571;6;14;6.25;0;6.142857143;0;8.333333333",
    "0;2.04;0;2.285714286;7.75;2.666666667;13.25;0;0;0;17.66666667",
    "0;1.28;0;1.285714286;0;0;8.125;0;0;15.33333333;10.83333333",
    "0;2.76;0;0;8.25;0;7;0;9.857142857;17.66666667;9.333333333",
    "0;0;0;0;6.875;0;8;0;8.857142857;0;10.66666667", "1;1.4;0;0;4.125;0;0;0;0;0;0",
    "1;1.24;0;0;2.125;0;2.625;0;6.571428571;0;3.5", "0;3.2;0;0.7142857143;3.875;4.333333333;0;0;0;0;0",
    "0;1.4;0;0;3.375;0;5.875;0;0;0;7.833333333", "1;1.52;0;3.357142857;5.5;3.916666667;4;0;2.857142857;0;5.333333333",
    "0;2.84;0;0;8.375;0;11.125;0;5.428571429;13;8.5")), class = "data.frame", row.names = c(NA,
    -47L))

Answer 1

Score: 1

The data was probably imported incorrectly upstream. The `dput()` output shows a data frame with a single character column, where each element holds one whole row with the values separated by semicolons (`;`). Since a string like "0;2;0;2.642857143;..." is not a single number, `as.numeric` returns `NA` for every element, which is why all the values become NA.

The better fix is most likely upstream, where the data is read in. I suspect the original data file was not actually a comma-separated .csv file, but you imported it with `read.csv`. Try to find the correct import function for your type of data, or use a format-agnostic function like `rio::import(file_name)`.

To repair the data frame as it stands, I used a few sequential data wrangling steps to get what I think is the desired output:

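    library(dplyr)
    library(tidyr)
    library(stringr)

    # `dat` is the data frame from the dput() output in the question
    dat %>%
      # split the single column into one column per ";"-separated value
      separate_wider_delim(cols = everything(), delim = ";", names_sep = ".") %>%
      # convert the resulting character columns to integer/double
      type.convert(as.is = TRUE) %>%
      # rebuild the column names from the original combined name
      rename_with(~ c("os.event", unlist(str_split(names(dat), "\\."))[-c(1:2)]))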

    # A tibble: 47 × 11
       os.event RTK_RAS  NRF2  PI3K   WNT HIPPO CELL_CYCLE   MYC NOTCH TGF_Beta  TP53
          <int>   <dbl> <dbl> <dbl> <dbl> <dbl>      <dbl> <dbl> <dbl>    <dbl> <dbl>
     1        0       2     0  2.64  12.9     0          0     0     0        0     0
     2        0    1.72     0  5.07   7.5     0       4.25     0     0     9.33  5.67
     3        0    1.52     0     0   6.5  3.17       6.62     0     0     10.7  8.83
     4        0    0.88     0  1.93     6  2.25          0     0     0        0     0
     5        1     0.8     0  2.29  3.38  2.67         15     0  9.71        0  8.67
     6        0     1.4     0     0  13.4     0       7.62     0     0        0  10.2
     7        1    2.48     0     0  8.62     0       8.75     0     0     16.7  11.7
     8        0    1.08     0     0  7.88  2.25       6.12     0  5.43        0  8.17
     9        1    1.16     0     2  14.2  2.33       10.8     0  10.4        0   9.5
    10        0    1.56     0     0    14 0.833         14    12     0        0  18.7
    # ℹ 37 more rows
    # ℹ Use `print(n = ...)` to see more rows
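The upstream fix mentioned above can be sketched in base R. This is only an illustration under stated assumptions: the original file is assumed to be semicolon-delimited with `.` as the decimal mark, only two rows from the `dput()` output are used, and the column name `raw` is a made-up placeholder. It first reproduces the NA coercion, then re-parses the same strings with `sep = ";"`.

```r
# Two rows copied from the dput() output in the question.
# The column name `raw` is a placeholder, not the original name.
dat <- data.frame(
  raw = c("0;2;0;2.642857143;12.875;0;0;0;0;0;0",
          "1;0.8;0;2.285714286;3.375;2.666666667;15;0;9.714285714;0;8.666666667"),
  stringsAsFactors = FALSE
)

# Each element is one long string rather than a number,
# so as.numeric() cannot parse it and returns NA.
coerced <- suppressWarnings(as.numeric(dat$raw))
print(coerced)  # NA NA

# Column names taken from the combined name in the dput() output.
col_names <- c("os.event", "RTK_RAS", "NRF2", "PI3K", "WNT", "HIPPO",
               "CELL_CYCLE", "MYC", "NOTCH", "TGF_Beta", "TP53")

# Re-read the strings as semicolon-delimited text, one row per element.
fixed <- read.csv(text = dat$raw, sep = ";", header = FALSE,
                  col.names = col_names)
str(fixed)
```

On the original file, the same idea would be something like `read.csv("your_file.csv", sep = ";")`, where the file name is hypothetical. `read.csv2` also splits on `;`, but it expects `,` as the decimal mark by default, so it would probably not read these values as numbers.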
