Use regular expression to change column names

Question

The column names of df are, for instance:

    split_the_tree1.zip.name, split_the_tree2.zip.region, split_the_tree3.zip.type

I would like to remove everything that comes before `.zip.`, including the `.zip.` itself, to end up with the column names `name, region, type`.

I tried something like this:

    gsub(".*\zip\/", "", colnames(df))

    gsub('*.zip', '', colnames(df))

Answer 1

Score: 4

    colnames(df) <- gsub('.*zip.', '', colnames(df))

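Note that the pattern uses `.*` and not `*.`: the quantifier `*` has to come after the thing it repeats, so `.*` means "any character, zero or more times".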
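As a stricter variant, the dots can be escaped so that they match literal periods only, and the pattern can be anchored at the start of the string. This is a minimal sketch rather than part of the original answer; the vector `cols` is a made-up stand-in for `colnames(df)`.

```r
# Hypothetical column names standing in for colnames(df)
cols <- c("split_the_tree1.zip.name",
          "split_the_tree2.zip.region",
          "split_the_tree3.zip.type")

# "\\." matches a literal dot; an unescaped "." matches any character.
# "^.*" is greedy, so everything up to the last ".zip." is removed.
# sub() is enough here because at most one match per name is expected.
new_cols <- sub("^.*\\.zip\\.", "", cols)

print(new_cols)
# [1] "name"   "region" "type"
```

With a real data frame, the result would be assigned back in the same way as above, for example `colnames(df) <- sub("^.*\\.zip\\.", "", colnames(df))`.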

A quicker way, though, is to notice that the names you want are simply stored as the file extensions, so you can use the `tools::file_ext` function:

    strings <- c("split_the_tree1.zip.name", "split_the_tree2.zip.region",
                 "split_the_tree3.zip.type")

    tools::file_ext(strings)
    [1] "name"   "region" "type"
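One caveat, offered as a sketch under assumptions: `tools::file_ext` only recognises extensions made up of letters and digits. A name whose last part contains an underscore (the hypothetical `first_name` below, which does not appear in the question) would therefore come back as an empty string, whereas the regex approach still returns it.

```r
# The second name is hypothetical, added only to illustrate the caveat
cols <- c("split_the_tree3.zip.type",
          "split_the_tree4.zip.first_name")

# file_ext() looks for a final dot followed by alphanumerics only
ext <- tools::file_ext(cols)

# The escaped regex strips everything up to and including ".zip."
rx <- sub("^.*\\.zip\\.", "", cols)

print(ext)
# [1] "type" ""
print(rx)
# [1] "type"       "first_name"
```

For the column names shown in the question both approaches give the same result, so the choice mainly matters if other names may contain characters such as `_` or `.` after `.zip.`.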

huangapple
  • Posted on February 16, 2023, 06:23:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75465987.html