Transform a dataframe with column values as column names in R

Question

I have a dataframe like this:

          date     price num_floors house
  1 2023-01-01  94.30076          3     A
  2 2023-01-01  95.58771          2     B
  3 2023-01-02 102.78559          1     C
  4 2023-01-03  93.29053          3     D

I want to transform it so that it has a structure similar to the following:

    2023-01-01 2023-01-02 2023-01-03
  1   94.30076  102.78559   93.29053
  2          3          1          3
  3   95.58771         NA         NA
  4          2         NA         NA

Each column contains the price and num_floors values for all houses on a given date. Within a column, the first two rows refer to the first house on that date, the next two to the second house, and so on. Entries without data are filled with the missing value NaN.

I tried with pivot_wider, but it does not work:

  library(dplyr)
  library(tidyr)

  # Pivot the dataframe
  df_transformed <- df %>%
    pivot_wider(names_from = date,
                values_from = c(price, num_floors),
                values_fill = NaN)
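For reference, here is a rough sketch of what this attempt appears to produce on the example data. The data frame is rebuilt by hand, so the column types (character dates, numeric num_floors) are assumptions, and `attempt` is just an illustrative name. Because `house` appears in neither `names_from` nor `values_from`, `pivot_wider` treats it as the row identifier: the result has one row per house and a separate `price_<date>` and `num_floors_<date>` column for every date, rather than one column per date.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question (column types are assumed)
df <- data.frame(
  date = c("2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03"),
  price = c(94.30076, 95.58771, 102.78559, 93.29053),
  num_floors = c(3, 2, 1, 3),
  house = c("A", "B", "C", "D")
)

# Same call as in the question; `house` is left over and becomes the id column
attempt <- df %>%
  pivot_wider(names_from = date,
              values_from = c(price, num_floors),
              values_fill = NaN)

dim(attempt)    # one row per house, one column per value/date pair plus `house`
names(attempt)
```

So the call runs, but the dates end up embedded in the column names next to the value names, which is not the requested layout.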

Answer 1

Score: 1

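You can pivot to long format first, and then go wide after adding a row number within each date.

  library(dplyr)  # `.by` in mutate() needs dplyr >= 1.1.0
  library(tidyr)

  # Pivot only the numeric columns: `house` is character and cannot share a value column
  pivot_wider(
    pivot_longer(df, c(price, num_floors)) %>% mutate(n = row_number(), .by = date),
    id_cols = n, names_from = date, values_from = value
  ) %>% select(-n)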

Output:

    `2023-01-01` `2023-01-02` `2023-01-03`
           <dbl>        <dbl>        <dbl>
  1         94.3         103.         93.3
  2          3             1           3
  3         95.6          NA          NA
  4          2            NA          NA

(I will note that this transformation seems unlikely to be the best way to represent your data.)
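To see why the long-then-wide approach gives this layout, it may help to run the two steps separately and look at the intermediate table. The following is only a sketch under assumptions: the data frame is rebuilt by hand with assumed column types, `long` and `wide` are illustrative names, and only `price` and `num_floors` are pivoted so that the character column `house` stays out of the value column.

```r
library(dplyr)  # `.by` in mutate() needs dplyr >= 1.1.0
library(tidyr)

# Example data rebuilt from the question (column types are assumed)
df <- data.frame(
  date = c("2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03"),
  price = c(94.30076, 95.58771, 102.78559, 93.29053),
  num_floors = c(3, 2, 1, 3),
  house = c("A", "B", "C", "D")
)

# Step 1: stack price and num_floors into a single `value` column, then number
# the stacked rows within each date (1 = price and 2 = num_floors of the first
# house on that date, 3 and 4 for the second house, and so on).
long <- df %>%
  pivot_longer(c(price, num_floors)) %>%
  mutate(n = row_number(), .by = date)

# Step 2: spread the dates into columns, using `n` as the row key, then drop it.
wide <- long %>%
  pivot_wider(id_cols = n, names_from = date, values_from = value) %>%
  select(-n)

wide
```

The row number `n` is what lines the values up: dates with fewer houses simply have no entry for the higher values of `n`, and `pivot_wider` leaves those cells as NA.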

huangapple
  • Posted on May 24, 2023, 21:33:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76324118.html