Selecting columns in R dataframe based on values of column in other dataframe

Question

I have two data frames, as you can see below.

    # Dataframe 1
    colname value
    col1    0.45
    col2    -0.2
    col3    -0.4
    col4    0.1

    # Dataframe 2
    col1 col2 col3 col4
    1    5    9    5
    45   29   43   9
    34   33   56   3
    2    67   76   1

First, I want to select all rows of dataframe1 that have a value > 0.3 or a value < -0.3. Second, I want to select the columns of dataframe2 whose names appear in the colname column of those rows. So the columns col1 and col3 of dataframe2 should be selected into a new data frame like the one below.

    col1  col3
    1     9
    45    43
    34    56
    2     76

The solution I thought of is to first select the relevant column names, as in the code below.

    library(sqldf)
    features = sqldf('select colname from dataframe1 where value > 0.3 or value < -0.3')

After that, I would build a string in a for loop that looks like the one below, and paste it into a sqldf query to select the right columns from dataframe2. However, I don't know how to build this string. Does anyone know how, or have another solution?

    stringValue = "col1, col3, col4"
    sprintf("SELECT %s FROM dataframe2", stringValue)

Answer 1

Score: 2

With your current `dataframe1`, only `col1` and `col3` will be selected.

    library(sqldf)
    features = sqldf('select colname from dataframe1 where value > 0.3 or value < -0.3')
    sqldf(sprintf("SELECT %s FROM dataframe2", paste0(features$colname, collapse = ", ")))

    #   col1 col3
    # 1    1    9
    # 2   45   43
    # 3   34   56
    # 4    2   76

**data**

    # Dataframe 1
    dataframe1 <- read.table(text = 'colname value
    col1 0.45
    col2 -0.2
    col3 -0.4
    col4 0.1', header = TRUE, sep = "")

    # Dataframe 2
    dataframe2 <- read.table(text = 'col1 col2 col3 col4
    1 5 9 5
    45 29 43 9
    34 33 56 3
    2 67 76 1', header = TRUE, sep = "")
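
The query string in the answer above is assembled in two steps: the selected names are joined into one comma-separated string, then `sprintf()` substitutes that string into the SQL template. The sketch below isolates that step in base R, without `sqldf`. The data is rebuilt with `data.frame()` for self-containment, and the names `selected` and `query` are illustrative, not from the original post.

```r
# Sample data from the question
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)

# Names whose value lies outside [-0.3, 0.3]
selected <- dataframe1$colname[dataframe1$value > 0.3 | dataframe1$value < -0.3]

# Join the names with ", " and substitute them into the SQL template
query <- sprintf("SELECT %s FROM dataframe2", paste(selected, collapse = ", "))
print(query)
# [1] "SELECT col1, col3 FROM dataframe2"
```

No `for` loop is needed here, because `paste()` with `collapse` joins the whole character vector in one call. The resulting string can then be passed to `sqldf()` as in the answer.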



Answer 2

Score: 1

A base R way of doing this:

```R
mask <- dataframe1$value > 0.3 | dataframe1$value < -0.3
dataframe2[, mask]

#   col1 col3
# 1    1    9
# 2   45   43
# 3   34   56
# 4    2   76
```
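
The logical mask gives the right result because the rows of `dataframe1` are listed in the same order as the columns of `dataframe2`. If that order is not guaranteed, indexing by name may be safer. The following is a sketch of that variant, not part of the original answer: `keep` and `result` are illustrative names, and the shuffled row order is made up to demonstrate the point.

```r
# dataframe1 with its rows deliberately shuffled (hypothetical ordering)
dataframe1 <- data.frame(
  colname = c("col3", "col1", "col4", "col2"),
  value   = c(-0.4, 0.45, 0.1, -0.2),
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  col1 = c(1, 45, 34, 2),
  col2 = c(5, 29, 33, 67),
  col3 = c(9, 43, 56, 76),
  col4 = c(5, 9, 3, 1)
)

# Column names whose absolute value exceeds 0.3
keep <- dataframe1$colname[abs(dataframe1$value) > 0.3]

# Index by name; intersect() preserves dataframe2's column order,
# and drop = FALSE returns a data frame even if only one column matches
result <- dataframe2[, intersect(names(dataframe2), keep), drop = FALSE]
print(result)
```

Here `abs(value) > 0.3` is equivalent to the original condition `value > 0.3 | value < -0.3`.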

Answer 3

Score: 0

Using dplyr (not sure if it is relevant), you can do:

    df2 %>%
      select(one_of(df1 %>%
        filter(value > 0.3 | value < -0.3) %>%
        pull(colname) %>%
        as.character()))

This selects the columns of `df2` whose names match one of the `colname` strings in `df1` that pass the filter. Here `df1` and `df2` stand for the question's `dataframe1` and `dataframe2`.
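
In dplyr 1.0.0 and later, `one_of()` is superseded by `all_of()` and `any_of()`. The sketch below expresses the same idea with `all_of()`, assuming a recent dplyr is installed. It uses the question's `dataframe1` and `dataframe2` names; `keep` and `result` are illustrative.

```r
library(dplyr)

# Sample data from the question
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  col1 = c(1, 45, 34, 2),
  col2 = c(5, 29, 33, 67),
  col3 = c(9, 43, 56, 76),
  col4 = c(5, 9, 3, 1)
)

# Character vector of column names that pass the filter
keep <- dataframe1 %>%
  filter(value > 0.3 | value < -0.3) %>%
  pull(colname)

# all_of() raises an error if any name in keep is missing from dataframe2
result <- dataframe2 %>% select(all_of(keep))
print(result)
```

If some names in `keep` might not exist in `dataframe2`, `any_of()` can be used instead; it ignores missing names rather than raising an error.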

huangapple
  • Posted on January 4, 2020, 00:16:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59581824.html