August 11, 2023, 01:36:51

Replace last non-NA value(s) with NA [R]

Question

I have a dataframe with 7 character columns and 1 column with integers (x).
Column x denotes the amount of non-NA values that I want to replace with NA (in this example x goes as high as 2, but could also be higher than that).

library(data.table)
d1 <- data.table(K0 = c("A", "B", "C", "D", "E", "E", "E", "F", "F", "F", "G", "G"),
                 K1 = c("H", "I", "J", "J", "J", "K", "K", "K", "1111", "1111", "L", "M"),
                 K2 = c("N", "O", "P", "P", "P", "Q", "Q", "Q", "1112", "1112", NA, "R"),
                 K3 = c("S", "T", "1113", "1113", "1113", "U", "U", "U", NA, NA, NA, "1114"),
                 K4 = c("P", NA, "1115", "1115", "1115", "1116", "1116", "1116", NA, NA, NA, NA),
                 K5 = c(NA, NA, "1117", "1117", "1117", "1118", "1118", "1118", NA, NA, NA, NA),
                 K6 = c(NA, NA, NA, NA, NA, "1119", "1119", "1119", NA, NA, NA, NA),
                 x = c(0,0,0,1,2,0,1,2,0,1,0,0))

I have tried a lot of things (none worth mentioning here), but I can't figure out how to solve this problem.

The desired output should look like this.

d2 <- data.table(K0 = c("A", "B", "C", "D", "E", "E", "E", "F", "F", "F", "G", "G"),
                 K1 = c("H", "I", "J", "J", "J", "K", "K", "K", "1111", "1111", "L", "M"),
                 K2 = c("N", "O", "P", "P", "P", "Q", "Q", "Q", "1112", NA, NA, "R"),
                 K3 = c("S", "T", "1113", "1113", "1113", "U", "U", "U", NA, NA, NA, "1114"),
                 K4 = c("P", NA, "1115", "1115", NA, "1116", "1116", "1116", NA, NA, NA, NA),
                 K5 = c(NA, NA, "1117", NA, NA, "1118", "1118", NA, NA, NA, NA, NA),
                 K6 = c(NA, NA, NA, NA, NA, "1119", NA, NA, NA, NA, NA, NA),
                 x = c(0,0,0,1,2,0,1,2,0,1,0,0))

An example for clarification: rows 3, 4 and 5 have identical values in K1 to K6, and they should become

Row 3: C, J, P, 1113, 1115, 1117, NA, 0 (0 = no replacement needed)

Row 4: D, J, P, 1113, 1115, NA, NA, 1 (1 = last non-NA replaced with NA)

Row 5: E, J, P, 1113, NA, NA, NA, 2 (2 = last 2 non-NA replaced with NA)

If some R mastermind could help me out, that would be great.
(I prefer dplyr, but at this point anything will do).
Using R version 4.1.0.

Answer 1

Score: 1

One way, not using dplyr but using apply(), would be to create a function that fills the last n non-missing values with NA, then use apply() to do this for every row:

replace_last_n <- function(x){
  # The last element holds the number of values to replace with NA
  n <- as.numeric(x[length(x)])
  
  # Remove that last element
  x <- x[-length(x)]
  
  # If nothing to do
  if(n == 0) return(c(x, n))

  # If n is greater than the number of non-missing values, blank the whole row
  if(n > length(x[!is.na(x)])) return(c(rep(NA, length(x)), n))
  
  # Determine the number of non-missing values
  tmp <- length(x[!is.na(x)])
  
  # Fill the last n non-missing values with NA
  x[!is.na(x)][(tmp - n + 1):tmp] <- NA
  
  # Return the entire row
  return(c(x, n))
}

Here the last value in x is the number of non-missing values to replace with NA. For example:

replace_last_n(c(1,2,3,4,5, 1))
# [1]  1  2  3  4 NA  1

replace_last_n(c(1,2,3,4,NA, 1))
# [1]  1  2  3 NA NA  1

replace_last_n(c(1,2,3,NA,5, 2))
# [1]  1  2 NA NA NA  2
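
For comparison, the same rule can be sketched more compactly if the count is passed separately instead of as the last element. This is only an illustration, not part of the answer above: blank_tail and its arguments are hypothetical names, and it assumes that a count larger than the number of non-missing values should simply blank all of them.

```r
# Hypothetical helper: `values` is one row's values, `n` is the number of
# trailing non-NA entries to replace with NA.
blank_tail <- function(values, n) {
  # Positions of the non-missing entries, keeping only the last n of them.
  # tail() returns nothing for n = 0 and everything when n exceeds the count.
  idx <- tail(which(!is.na(values)), n)
  values[idx] <- NA
  values
}

blank_tail(c(1, 2, 3, NA, 5), 2)   # blanks the 3 and the 5
blank_tail(c("a", NA, "b"), 5)     # n too large: every value becomes NA
blank_tail(c("a", "b"), 0)         # n = 0: nothing changes
```

Because which() skips the NA positions, an NA in the middle of a row does not count towards n.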

Then use apply() to run this function on every row. Note that apply() first converts the table to a character matrix, so x comes back as a character column (named V8 below):

library(dplyr)  # provides %>% and as_tibble()
apply(d1, 1, \(x) replace_last_n(x)) %>% t() %>% as_tibble
#   K0    K1    K2    K3    K4    K5    K6    V8   
#    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#  1 A     H     N     S     P     NA    NA    0    
#  2 B     I     O     T     NA    NA    NA    0    
#  3 C     J     P     1113  1115  1117  NA    0    
#  4 D     J     P     1113  1115  NA    NA    1    
#  5 E     J     P     1113  NA    NA    NA    2    
#  6 E     K     Q     U     1116  1118  1119  0    
#  7 E     K     Q     U     1116  1118  NA    1    
#  8 F     K     Q     U     1116  NA    NA    2    
#  9 F     1111  1112  NA    NA    NA    NA    0    
# 10 F     1111  NA    NA    NA    NA    NA    1    
# 11 G     L     NA    NA    NA    NA    NA    0    
# 12 G     M     R     1114  NA    NA    NA    0
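
If you want to keep x numeric and keep its column name, one possible variation is to work on a logical matrix of non-missing cells instead of calling a function per row. The following is a minimal base R sketch, not the code from this answer: blank_last_n and count_col are hypothetical names, the small data frame is a cut-down version of the question's data, and it assumes all value columns are character.

```r
# Hypothetical helper: replace the last n non-NA values in each row with NA,
# where n is read from the column named by `count_col`.
blank_last_n <- function(df, count_col = "x") {
  value_cols <- setdiff(names(df), count_col)
  m <- as.matrix(as.data.frame(df)[, value_cols, drop = FALSE])
  not_na <- !is.na(m)
  # For every cell, count the non-NA cells from that column to the end of
  # the row: the last non-NA cell gets 1, the one before it 2, and so on.
  from_right <- t(apply(not_na, 1, function(r) rev(cumsum(rev(r)))))
  # The count vector is recycled down the columns, so cell [i, j] is
  # compared against the count of row i.
  m[not_na & from_right <= df[[count_col]]] <- NA
  out <- as.data.frame(m, stringsAsFactors = FALSE)
  out[[count_col]] <- df[[count_col]]
  out
}

d <- data.frame(
  K0 = c("C", "D", "E", "F"),
  K1 = c("J", "J", "J", "1111"),
  K2 = c("P", "P", "P", "1112"),
  K3 = c("1113", "1113", "1113", NA),
  K4 = c("1115", "1115", "1115", NA),
  K5 = c("1117", "1117", "1117", NA),
  K6 = rep(NA_character_, 4),
  x  = c(0, 1, 2, 1)
)

res <- blank_last_n(d)
res
```

Here row 2 loses its K5 value, row 3 loses K4 and K5, and row 4 loses K2, while x stays a numeric column at the end.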

Answer 2

Score: 0

This should work for your sample data, but I'm not sure whether your actual data has the same structure. The big assumption here is that there are no NAs in the middle columns, i.e. NAs occur only at the end of a row.

library(tidyverse)
d1 %>%
  rowwise() %>%
  mutate(string_collection = list(c(K0, K1, K2, K3, K4, K5, K6))) %>%
  mutate(column_to_keep = length(na.omit(string_collection)) - x) %>% # assumes NAs are only found at the end
  mutate(string_collection = list(string_collection[seq_len(column_to_keep)])) %>%
  mutate(string_collection = paste0(string_collection, collapse = ",")) %>% # assumes the character columns do not contain ","
  dplyr::select(x, string_collection) %>%
  separate(string_collection, into = c("K0", "K1", "K2", "K3", "K4", "K5", "K6"), sep = ",", fill = "right") %>% # fill = "right" pads short rows with NA
  relocate(x, .after = last_col()) %>% # the line below is optional
  ungroup()
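
If NAs can also occur in the middle of a row, or the values may contain commas, a long-format approach avoids the paste/separate round trip. The following is a rough sketch only, not the code above: it assumes the value columns all start with "K" and are all character, it uses a cut-down version of the sample data, and row_id, col, value and from_right are helper names introduced for illustration.

```r
library(dplyr)
library(tidyr)

d <- tibble(
  K0 = c("C", "D", "E", "G"),
  K1 = c("J", "J", "J", "L"),
  K2 = c("P", "P", "P", NA),
  K3 = c("1113", "1113", "1113", "1114"),
  K4 = c("1115", "1115", "1115", NA),
  x  = c(0, 1, 2, 1)
)

res <- d %>%
  mutate(row_id = row_number()) %>%
  # One line per cell, in the original column order within each row
  pivot_longer(starts_with("K"), names_to = "col", values_to = "value") %>%
  group_by(row_id) %>%
  mutate(
    # Number of non-NA values from this cell to the end of the row
    from_right = rev(cumsum(rev(!is.na(value)))),
    # Blank the cell if it is among the last x non-NA values of its row
    value = if_else(!is.na(value) & from_right <= x, NA_character_, value)
  ) %>%
  ungroup() %>%
  select(-from_right) %>%
  pivot_wider(names_from = col, values_from = value) %>%
  select(-row_id) %>%
  relocate(x, .after = last_col())

res
```

In the last row the NA in K2 is skipped, so the single value removed is the 1114 in K3 rather than the L in K1.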
