Transform a dataframe so that column values become column names in R

Question


I have a dataframe like this:

         date     price num_floors    house
1  2023-01-01  94.30076          3        A
2  2023-01-01  95.58771          2        B
3  2023-01-02 102.78559          1        C
4  2023-01-03  93.29053          3        D

I want to transform it so that each column contains the prices and num_floors of all houses for a given date. Within a column, the first two rows refer to the first house for that date and the next two rows to the second house. Entries without data are filled with the missing value NaN. The result should have a structure similar to the following:

  2023-01-01    2023-01-02  2023-01-03
1   94.30076     102.78559    93.29053
2          3             1           3         
3   95.58771            NA          NA
4          2            NA          NA


I tried pivot_wider, but it does not produce this layout:

# Pivot the dataframe
df_transformed <- df %>%
  pivot_wider(names_from = date,
              values_from = c(price, num_floors),
              values_fill = NaN)

Answer 1

Score: 1

You can pivot to long format first, and then go wide after adding a row number.

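library(dplyr)  # mutate(.by = ) requires dplyr >= 1.1.0
library(tidyr)

pivot_wider(
  # pivot only the numeric columns; the character column house cannot share a value column with them
  pivot_longer(df, c(price, num_floors)) %>% mutate(n = row_number(), .by = date),
  id_cols = n, names_from = date, values_from = value
) %>% select(-n)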

Output:

  `2023-01-01` `2023-01-02` `2023-01-03`
         <dbl>        <dbl>        <dbl>
1         94.3         103.         93.3
2          3             1           3  
3         95.6          NA          NA  
4          2            NA          NA 
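
The two steps can also be run separately to inspect the intermediate long table. This is only a minimal sketch, not necessarily the exact code used above: it assumes df is rebuilt by hand from the values in the question (with date stored as character, which is a guess), and it pivots only price and num_floors, since the character column house cannot share a value column with numbers.

```r
library(dplyr)   # assumes dplyr >= 1.1.0 for mutate(.by = )
library(tidyr)

# Data frame rebuilt from the question (column types are assumed)
df <- data.frame(
  date       = c("2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03"),
  price      = c(94.30076, 95.58771, 102.78559, 93.29053),
  num_floors = c(3L, 2L, 1L, 3L),
  house      = c("A", "B", "C", "D")
)

# Step 1: long format, one row per (house, variable), numbered within each date
long <- df %>%
  pivot_longer(c(price, num_floors)) %>%
  mutate(n = row_number(), .by = date)

# Step 2: one column per date; n aligns the values row by row
wide <- long %>%
  pivot_wider(id_cols = n, names_from = date, values_from = value) %>%
  select(-n)

wide
```

Without the row number n, pivot_wider would have no key that tells it which value belongs in which output row.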

(I will note that this transformation seems unlikely to be the optimal way to represent your data.)
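
The question asks for NaN in the empty cells, while the output above shows NA. If literal NaN is wanted, pivot_wider() takes a values_fill argument. The following is a hedged sketch under stated assumptions: df is rebuilt by hand from the question, its column types are guessed, and only the numeric columns are pivoted.

```r
library(dplyr)   # assumes dplyr >= 1.1.0 for mutate(.by = )
library(tidyr)

# Data frame rebuilt from the question (column types are assumed)
df <- data.frame(
  date       = c("2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03"),
  price      = c(94.30076, 95.58771, 102.78559, 93.29053),
  num_floors = c(3L, 2L, 1L, 3L),
  house      = c("A", "B", "C", "D")
)

wide_nan <- df %>%
  pivot_longer(c(price, num_floors)) %>%
  mutate(n = row_number(), .by = date) %>%
  # values_fill replaces cells created by missing (n, date) combinations
  pivot_wider(id_cols = n, names_from = date,
              values_from = value, values_fill = NaN) %>%
  select(-n)

wide_nan
```

Note that is.na() is TRUE for both NA and NaN, whereas is.nan() is TRUE only for NaN, so downstream checks written with is.na() behave the same either way.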

huangapple
  • Posted on May 24, 2023, 21:33:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76324118.html