2023年7月18日 10:12:23go评论83阅读模式

英文:

Top row in excel is a main category and row under that is sub category in excel how convert to column and make it 2 related factor In "R'

问题

技能流顶部行
列（雇主支持，州领地提名，区域，商业创新，全球人才，技能独立，杰出人才，熟练区域，11月在岸（应删除）
家庭流顶部行
列（伴侣，父母，子女，其他家庭）

如何使技能流和家庭流成为列，并在它们前面加上子类别？

期望在R语言中将其转换为列
这里是下载我的数据的链接。

英文:

image of excel excel image- Skill stream top row
Columns (Employer sponsored, State Territory Nominated, Regional, Business innovation, Global talent , skill independent , distinguished talent , skilled regional , November onshore ( should be deleted)

Family stream top row
columns (Partner. Parent, Child ,Other family )

how to make Skill stream and Family stream as a column with subcategory infront of them ?

Expecting in R language to convert it to column
Here is a link to download my data.

答案1

得分: 0

# 载入库
library(readxl)
library(tidyverse)
library(zoo)

file_path <- "~/Downloads/migration_trends_statistical_package_2021_22.xlsx"

# 提取顶部列名。出于某种原因，这需要跳过前5行。不要问我为什么，我不知道。这可能因表而异，因此您需要调整以获得相同的结果。
categories <- read_excel(path = file_path, sheet = "1.1", col_names = FALSE, skip = 5, n_max = 2) %>%
                mutate(across(everything(), ~ str_trim(str_remove_all(., "\\d+"))))

# 然后将其转换为干净的长数据框，而不是杂乱的宽数据框
categories <- data.frame(
    category = na.locf(as.character(categories[1,])), # 用最后一个非NA值填充空白
    name = as.character(categories[2,])
)

read_excel(path = file_path, sheet = "1.1", skip = 7, col_names = categories[,2]) %>%
  pivot_longer(-Year, values_transform = list(value = as.numeric)) %>%
  left_join(categories)

# A tibble: 468 × 4
   Year    name                                  value category    
   <chr>   <chr>                                 <dbl> <chr>       
 1 2012–13 "Employer Sponsored"                  47740 Skill stream
 2 2012–13 "State/Territory Nominated"           21637 Skill stream
 3 2012–13 "Regional"                               NA Skill stream
 4 2012–13 "Business Innovation and Investment"   7010 Skill stream
 5 2012–13 "Global Talent (Independent)"            NA Skill stream
 6 2012–13 "Skilled Independent"                 44251 Skill stream
 7 2012–13 "Distinguished\r\n Talent"              200 Skill stream
 8 2012–13 "Skilled Regional"                     8132 Skill stream
 9 2012–13 "November\r\nOnshore"                    NA Skill stream
10 2012–13 "Skill stream total"                 128973 Skill stream
# ℹ 458 more rows


<details>
<summary>英文:</summary>

load libraries

library(readxl)
library(tidyverse)
library(zoo)

file_path <-"~/Downloads/migration_trends_statistical_package_2021_22.xlsx"

pull out the top column names. For some reason, this requires skipping the first 5 rows. Don't ask me why, I don't know. This is likely to change sheet to sheet, so you want to adjust it so that you get the same result.

categories <- read_excel(path = file_path, sheet = "1.1", col_names = FALSE, skip = 5, n_max = 2) %>%
mutate(across(everything(), ~ str_trim(str_remove_all(., "\d+"))))

we then turn it into a cleaned long dataframe, from a messy wide one

categories <- data.frame(
category = na.locf(as.character(categories[1,])), # fill in the blanks with the last non-NA value
name = as.character(categories[2,])
)

read_excel(path = file_path, sheet = "1.1", skip = 7, col_names = categories[,2]) %>%
pivot_longer(-Year, values_transform = list(value = as.numeric)) %>%
left_join(categories)

A tibble: 468 × 4

Year name value category
<chr> <chr> <dbl> <chr>
1 2012–13 "Employer Sponsored" 47740 Skill stream
2 2012–13 "State/Territory Nominated" 21637 Skill stream
3 2012–13 "Regional" NA Skill stream
4 2012–13 "Business Innovation and Investment" 7010 Skill stream
5 2012–13 "Global Talent (Independent)" NA Skill stream
6 2012–13 "Skilled Independent" 44251 Skill stream
7 2012–13 "Distinguished\r\n Talent" 200 Skill stream
8 2012–13 "Skilled Regional" 8132 Skill stream
9 2012–13 "November\r\nOnshore" NA Skill stream
10 2012–13 "Skill stream total" 128973 Skill stream

ℹ 458 more rows


</details>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Top row in excel is a main category and row under that is sub category in excel how convert to column and make it 2 related factor In "R'

问题

答案1

load libraries

pull out the top column names. For some reason, this requires skipping the first 5 rows. Don't ask me why, I don't know. This is likely to change sheet to sheet, so you want to adjust it so that you get the same result.

we then turn it into a cleaned long dataframe, from a messy wide one

A tibble: 468 × 4

ℹ 458 more rows

执行一个包含字符字符串的变异函数。

使用`rows_append()`时添加因子水平。

根据另一列的行值有条件地返回一向量的某些行值。

将数据框按同一列的多个范围分组行。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论