Read a text file with one line and split it into multiple rows based on a delimiter

Question

I want to import a .txt file into an R data frame using "readr" library functions or base functions such as read.delim, read.table, read.csv. The .txt file has only a single line, which contains all the data.

This one line should be split into separate rows during import, and the delimiter between rows is a single space. Within a row, the values are separated by tabs. Whatever I tried, I could not split this one line into multiple rows using the space as the row delimiter. The file is always imported as one row. Is there a way to import it in this specific way in R?

My attempts only produced data frames with a single row holding all the data in columns.

A, B, and C are column names. 3, 2, and 1 should be the first row and 4, 5, and 6 should be the second.

Example Data: "A"   "B"   "C" "3"   "2"   "1" "4"   "5"   "6"
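
The single-row result can be reproduced with a minimal sketch, assuming the file separates values with tabs and rows with one space (the temporary file below is a stand-in for the real file). With its default `sep = ""`, read.table treats any run of spaces or tabs as a field separator and only a line break as the end of a row, so all nine fields land in one row:

```r
# Hypothetical stand-in for the real .txt file
path <- tempfile(fileext = ".txt")
cat('"A"\t"B"\t"C" "3"\t"2"\t"1" "4"\t"5"\t"6"\n', file = path)

# Default sep = "": spaces and tabs both split fields, only "\n" ends a row
one_row <- read.table(path)
dim(one_row)
# [1] 1 9
```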

Answer 1

Score: 1

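# I use `text = ...` for illustration purposes;
# with a real file, use `read.table(file = "your/file/path.txt")` instead.
data <- read.table(text = '"A"   "B"   "C" "3"   "2"   "1" "4"   "5"   "6"')
data <-
  data[-(1:3)] |>                     # drop the three header fields
  unlist() |>
  as.numeric() |>
  matrix(ncol = 3, byrow = TRUE) |>   # refill row by row, three values per row
  as.data.frame() |>
  setNames(unlist(data[1:3]))         # the first three fields are the column names
data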
#   A B C
# 1 3 2 1
# 2 4 5 6
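
A slightly more general variant of the same idea, as a sketch: it assumes only that the number of columns is known in advance (`n_cols` and `raw` are names introduced here for illustration). `scan()` reads every whitespace-separated token, the first `n_cols` tokens become the column names, and the remaining tokens are refilled row by row.

```r
# Example data: tabs within a row, a single space between rows
raw <- '"A"\t"B"\t"C" "3"\t"2"\t"1" "4"\t"5"\t"6"'
n_cols <- 3   # assumed to be known in advance

# Read all tokens as character; scan() strips the double quotes by default
tokens <- scan(text = raw, what = "", quiet = TRUE)

# Everything after the header tokens is data, filled in row by row
values <- matrix(as.numeric(tokens[-seq_len(n_cols)]), ncol = n_cols, byrow = TRUE)
data <- setNames(as.data.frame(values), tokens[seq_len(n_cols)])
data
```

For a real file, `scan(file = "your/file/path.txt", what = "", quiet = TRUE)` would take the place of the `text = raw` call.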

Answer 2

Score: 1

It's not tidyverse, but this should work. I read the whole file as a single text string with readChar, use gsub to turn each space that sits directly between two values (rather than next to a tab) into a newline, and then pass that string to read.table.

fileName <- 'testinput.txt'
readChar(fileName, file.info(fileName)$size) |>
  # replace the space between two values with a newline, keeping both neighbours
  gsub(pattern = "([A-Z0-9\"]) ([A-Z0-9\"])", replacement = "\\1\n\\2") |>
  read.table(text = _, header = TRUE)   # the `_` placeholder needs R >= 4.2

  A B C
1 3 2 1
2 4 5 6

There is probably a tidyverse way to perform these same steps.

Answer 3

Score: 1

Here's a solution in base R:

fileName <- 'C:\\test.txt'

# Split the single line on spaces, rejoin the pieces with newlines,
# then let read.table parse the tab-separated rows.
read.table(text = paste0(strsplit(
                         readChar(fileName, file.info(fileName)$size), ' ')[[1]],
                   collapse = "\n"),
           sep = "\t", header = TRUE)
#   A B C
# 1 3 2 1
# 2 4 5 6
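
The same steps can be wrapped in a small helper, shown here as a sketch rather than a definitive implementation. The function name `read_one_line` and its arguments `row_sep` and `col_sep` are made up for this example, and the temporary file stands in for the real one. It also strips a trailing line break, which would otherwise end up in the last row.

```r
# read_one_line() and its arguments are illustrative names, not an existing API
read_one_line <- function(path, row_sep = " ", col_sep = "\t") {
  raw <- readChar(path, file.info(path)$size)        # whole file as one string
  raw <- sub("[\r\n]+$", "", raw)                    # drop a trailing line break
  rows <- strsplit(raw, row_sep, fixed = TRUE)[[1]]  # one element per row
  read.table(text = rows, sep = col_sep, header = TRUE)
}

# Demonstration on a temporary file holding the example data
path <- tempfile(fileext = ".txt")
cat('"A"\t"B"\t"C" "3"\t"2"\t"1" "4"\t"5"\t"6"\n', file = path)
result <- read_one_line(path)
result
```

Passing `fixed = TRUE` to strsplit treats the row separator as a literal string instead of a regular expression, which matters if the separator is a character such as `|` or `.`.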

huangapple
  • Posted on March 7, 2023 at 05:42:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75656115.html