How can I change the name of a particular column in various files in R?

Question

I have various .txt files stored in multiple folders. Each file has several columns, one of which is temperature. In the first few files the temperature column is named T2 [℃], while in the others it is named T2 [°C]. I want the temperature column to be named T2 [°C] in all the files, without changing the names of the other columns.

The number of columns also differs between files. For example, some files have Pressure, Temperature, Radiation, Wind velocity and Wind direction, while others have only Pressure, Temperature and Radiation. This can be treated as a case of missing data. The logical condition I have in mind is that wherever a column is named T2 [℃], it should be renamed to T2 [°C].

I tried to use colnames(my_dataframe)[colnames(my_dataframe) == "id"] = "c1" but it didn't work.

I am using the following code in R

setwd("D:/Data/RawData/Task")
dir <- "D:/Data/RawData/Task/"
fnames <- list.files(dir, full.names = T, recursive = TRUE)
colnames(fnames)[colnames(fnames) == "T2 [℃]"] = "T2 [°C]"
xy <- do.call(rbind, lapply(fnames, read.table, header=TRUE, sep = "\t", check.names = FALSE,
skip = 27))

Could anyone please help me change the column name and make the number of columns consistent across all the files?


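Answer 1

Score: 1

We can tidy the column names inside an lapply call and then merge with rbindlist(..., fill = TRUE), which fills the missing columns with NA. I chose to replace [] with () in the column names, since square brackets in names may lead to issues.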

#libraries
library(data.table)

#list of files (the dot is escaped so only names ending in ".txt" match)
filelist <- list.files("D:/Data/RawData/Task/", 
                       full.names = TRUE,
                       recursive = TRUE,
                       pattern = "\\.txt$")

#read (fread usually detects the separator and header by itself;
#if the preamble is not skipped correctly, try skip = 27 as in the question)
dt <- lapply(filelist, fread)

#adjust colnames
dt.tidied <- lapply(dt, FUN = function(x){
  #adjust ? to ° (use the exact header string found in your files, e.g. "T2 [℃]")
  setnames(x, old = "T2 [?C]", new = "T2 [°C]", skip_absent = TRUE)
  
  #replace [] with ()
  colnames(x) <- gsub("\\[", "(", colnames(x))
  colnames(x) <- gsub("\\]", ")", colnames(x))
  
  #return
  return(x)
})


#bind, filling missing columns with NA
merged <- rbindlist(dt.tidied, fill = TRUE)
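
To try the rename-then-bind idea without touching any files, here is a minimal sketch on two small in-memory tables. The tables `a` and `b`, their values, and the plain column names "Pressure" and "Wind velocity" are made up for illustration and stand in for what fread() might return. The temperature headers are written with Unicode escapes ("\u2103" for ℃, "\u00b0" for °) so the script does not depend on the encoding it is saved in. Unlike the code above, this sketch leaves the square brackets in place.

```r
library(data.table)

# Hypothetical stand-ins for two files; values are invented.
a <- data.table(p = c(1010, 1008), t = c(21.5, 22.0))
setnames(a, c("Pressure", "T2 [\u2103]"))                    # header with the ℃ sign

b <- data.table(p = 1005, t = 19.8, w = 3.2)
setnames(b, c("Pressure", "T2 [\u00b0C]", "Wind velocity"))  # header with ° and C

tables <- list(a, b)

# Rename only the variant temperature header; skip_absent = TRUE leaves
# tables that already use the target name untouched.
tidied <- lapply(tables, function(x) {
  setnames(x, old = "T2 [\u2103]", new = "T2 [\u00b0C]", skip_absent = TRUE)
  x
})

# Stack by column name; columns missing from a table are filled with NA.
merged <- rbindlist(tidied, use.names = TRUE, fill = TRUE)
print(names(merged))
```

Because setnames() works by reference, `a` itself is renamed as well; wrap the table in copy() inside the function if the originals must stay unchanged.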



huangapple
  • Published on 2023-05-17 18:19:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76271028.html