Selecting columns in R dataframe based on values of column in other dataframe

Question

I have two data frames, shown below.

    # Dataframe 1
    colname value
    col1     0.45
    col2    -0.2
    col3    -0.4
    col4     0.1

    # Dataframe 2
    col1 col2 col3 col4
       1    5    9    5
      45   29   43    9
      34   33   56    3
       2   67   76    1

First, I want to select all rows of dataframe 1 whose value is greater than 0.3 or less than -0.3. Second, I want to select the columns of dataframe 2 whose names appear in the colname column of those rows. With the data above, columns col1 and col3 of dataframe 2 should end up in a new data frame like this:

    col1 col3
       1    9
      45   43
      34   56
       2   76

My idea is to first select the relevant column names, as in the code below.

    library(sqldf)
    features = sqldf('select colname from dataframe1 where value > 0.3 or value < -0.3')

After that, I would build a string like the one below in a for loop and paste it into a sqldf query to select the right columns from dataframe2. However, I don't know how to build this string. Does anyone know how, or have another solution?

    stringValue = "col1, col3, col4"
    sprintf("SELECT %s FROM dataframe2", stringValue)

Answer 1

Score: 2

With your current `dataframe1`, only `col1` and `col3` will get selected.

    library(sqldf)
    features = sqldf('select colname from dataframe1 where value > 0.3 or value < -0.3')
    sqldf(sprintf("SELECT %s FROM dataframe2", paste0(features$colname, collapse = ", ")))
    #   col1 col3
    # 1    1    9
    # 2   45   43
    # 3   34   56
    # 4    2   76

**Data**

    # Dataframe 1
    dataframe1 <- read.table(text = 'colname value
    col1 0.45
    col2 -0.2
    col3 -0.4
    col4 0.1', header = TRUE, sep = "")

    # Dataframe 2
    dataframe2 <- read.table(text = 'col1 col2 col3 col4
    1 5 9 5
    45 29 43 9
    34 33 56 3
    2 67 76 1', header = TRUE, sep = "")
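To see what this answer hands to sqldf, it can help to build the query string on its own. The following is a minimal sketch, assuming the same `dataframe1` as in the data section. Base R logical indexing stands in for the first `sqldf()` call (a swap made only so the snippet runs without sqldf), and the names `selected` and `query` are hypothetical.

```r
# Same contents as dataframe1 in the data section
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)

# Column names whose value is above 0.3 or below -0.3
# (assumption: base R indexing replaces the sqldf filter here)
selected <- dataframe1$colname[dataframe1$value > 0.3 | dataframe1$value < -0.3]

# Collapse the names into one comma-separated string; no for loop is needed
query <- sprintf("SELECT %s FROM dataframe2", paste(selected, collapse = ", "))
print(query)
# [1] "SELECT col1, col3 FROM dataframe2"
```

If no row passes the filter, `selected` is empty and the string becomes `SELECT  FROM dataframe2`, which is not valid SQL, so checking `length(selected) > 0` before calling `sqldf()` seems prudent.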

Answer 2

Score: 1

A base R way of doing this:

    mask <- dataframe1$value > 0.3 | dataframe1$value < -0.3
    dataframe2[, mask]
    #   col1 col3
    # 1    1    9
    # 2   45   43
    # 3   34   56
    # 4    2   76
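The logical mask relies on the rows of `dataframe1` being in the same order as the columns of `dataframe2`. A possible variant selects by name instead, assuming only that the `colname` values are valid column names of `dataframe2`. Here `keep` and `result` are hypothetical names, and `drop = FALSE` is an addition so that a single match still comes back as a data frame rather than a vector.

```r
# Same contents as the question's two data frames
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE  # keep colname as character, also on R < 4.0
)
dataframe2 <- data.frame(
  col1 = c(1, 45, 34, 2),
  col2 = c(5, 29, 33, 67),
  col3 = c(9, 43, 56, 76),
  col4 = c(5, 9, 3, 1)
)

# abs(value) > 0.3 is equivalent to value > 0.3 | value < -0.3
keep <- dataframe1$colname[abs(dataframe1$value) > 0.3]

# Index columns by name, so the order of rows in dataframe1 does not matter
result <- dataframe2[, keep, drop = FALSE]
print(result)
```

Selecting by name also gives the expected columns if `dataframe1` lists only a subset of the columns in `dataframe2`, a case where the logical mask would have the wrong length.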

Answer 3

Score: 0

Using dplyr (not sure if it is relevant), you can do:

    df2 %>%
      select(one_of(df1 %>%
                      filter(value > 0.3 | value < -0.3) %>%
                      pull(colname) %>%
                      as.character()))

This works by filtering df1 down to the rows that meet the condition, pulling their colname values out as a character vector, and selecting the columns of df2 whose names match one of those strings. Here df1 and df2 stand for the question's dataframe1 and dataframe2.
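In recent dplyr releases, `one_of()` is documented as superseded by `all_of()` and `any_of()`. Below is a sketch of the same idea with `all_of()`, assuming dplyr 1.0.0 or later is installed and using the question's data frame names. `wanted` and `result` are hypothetical names.

```r
library(dplyr)

# Same contents as the question's two data frames
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  col1 = c(1, 45, 34, 2),
  col2 = c(5, 29, 33, 67),
  col3 = c(9, 43, 56, 76),
  col4 = c(5, 9, 3, 1)
)

# Step 1: names of the columns whose value is above 0.3 or below -0.3
wanted <- dataframe1 %>%
  filter(value > 0.3 | value < -0.3) %>%
  pull(colname)

# Step 2: select exactly those columns from dataframe2
result <- dataframe2 %>% select(all_of(wanted))
print(result)
```

`all_of()` raises an error if a name in `wanted` is missing from `dataframe2`, whereas `any_of()` silently skips it, so the choice depends on whether a missing column should count as a bug.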

huangapple
  • Posted on 2020-01-04 00:16:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59581824.html