March 7, 2023 05:42:53

Read a text file with one line and split it to multiple rows based on a delimiter

Question

I want to import a .txt file into an R data frame using "readr" library functions such as read.delim, read.table, or read.csv. The .txt file has only a single row, which contains all the data.

This single row should be split into multiple rows during import; the row delimiter is a single space. The values within each row are separated by tabs. Whatever I tried, I could not split this one row into multiple rows using the space delimiter. The file is always imported as one row. Is there a way to import a file this way in R?

My attempts only produced data frames with a single row showing all the data in columns.

A, B, and C are the column names. 3, 2, and 1 should be the first row, and 4, 5, and 6 should be the second.

Example Data: "A"   "B"   "C" "3"   "2"   "1" "4"   "5"   "6"
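To reproduce the problem, a file like the one described can be built in a temporary location. This is only a sketch: it assumes tabs separate the values within a row and single spaces separate the rows, as the question states, and the `rows` and `path` names are made up for illustration.

```r
# Hypothetical reproduction of the one-line file described in the question.
# Assumption: tabs separate values within a row, single spaces separate rows.
rows <- c('"A"\t"B"\t"C"', '"3"\t"2"\t"1"', '"4"\t"5"\t"6"')
path <- tempfile(fileext = ".txt")
writeLines(paste(rows, collapse = " "), path)

# With the default sep = "", read.table() treats tabs and spaces alike,
# so all nine values end up as columns of a single row.
one_row <- read.table(path)
dim(one_row)
```

The default separator in `read.table()` is any run of whitespace, which is why the space between rows is not distinguished from the tabs between values.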

Answer 1

Score: 1

# I use `text = ...` for illustration purposes;
# you can use `read.table(file = "your/file/path.txt")` instead.
data = read.table(text = '"A"   "B"   "C" "3"   "2"   "1" "4"   "5"   "6"')
data =
  data[-(1:3)] |>
  unlist() |>
  as.numeric() |>
  matrix(ncol = 3, byrow = TRUE) |>
  as.data.frame() |>
  setNames(data[1:3])
data
#   A B C
# 1 3 2 1
# 2 4 5 6
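The same reshape idea can be wrapped in a small function so the number of columns is not hard-coded. This is a sketch, not part of the answer: `read_one_line` and its `n_cols` argument are hypothetical names, and it assumes no value contains whitespace.

```r
# Hypothetical helper: read a one-line file and reshape it into a data frame,
# using the first `n_cols` tokens as column names.
read_one_line <- function(path, n_cols) {
  # scan() splits on any whitespace (tabs or spaces) and strips the quotes
  tokens <- scan(path, what = character(), quiet = TRUE)
  values <- matrix(tokens[-seq_len(n_cols)], ncol = n_cols, byrow = TRUE)
  out <- as.data.frame(values)
  names(out) <- tokens[seq_len(n_cols)]
  # Convert the character columns to numeric types where possible
  utils::type.convert(out, as.is = TRUE)
}

# Usage with a temporary file holding the example data
path <- tempfile(fileext = ".txt")
writeLines('"A"\t"B"\t"C" "3"\t"2"\t"1" "4"\t"5"\t"6"', path)
result <- read_one_line(path, n_cols = 3)
```

Because this approach ignores which whitespace character separates which values, the caller has to supply the column count.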

Answer 2

Score: 1

It's not tidyverse, but this should work. I read the whole file as a single text string with readChar, use gsub to change the spaces that have no adjacent tabs into newlines, and then pass that string to read.table.

fileName <- 'testinput.txt'
# The `_` placeholder in the last step requires R >= 4.2
readChar(fileName, file.info(fileName)$size) |>
  # Put the two captured characters back around a newline instead of a space
  gsub(pattern = "([A-Z0-9\"]) ([A-Z0-9\"])", replacement = "\\1\n\\2") |>
  read.table(text = _, header = TRUE)
#   A B C
# 1 3 2 1
# 2 4 5 6

There is probably a tidyverse way to perform these same steps.
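One possible tidyverse-flavoured version of those steps is sketched below. It is an assumption about what such a version could look like, not something from the answer: it needs readr 2.0 or later (for `I()` literal input and `show_col_types`), the temporary file stands in for the real one, and it assumes no value contains a space.

```r
library(readr)

# Stand-in for the real file: tabs within a row, single spaces between rows
path <- tempfile(fileext = ".txt")
write_file('"A"\t"B"\t"C" "3"\t"2"\t"1" "4"\t"5"\t"6"', path)

# Read the whole file as one string and turn every space into a newline
txt <- read_file(path)
txt <- gsub(" ", "\n", txt, fixed = TRUE)

# I() tells readr to treat the string as literal data rather than a file path
result <- read_tsv(I(txt), show_col_types = FALSE)
result
```

The string replacement here uses base `gsub()`; only the reading and parsing steps come from readr.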

Answer 3

Score: 1

Here's a solution in base:

fileName <- 'C:\\test.txt'
read.table(text = paste0(strsplit(
                         readChar(fileName, file.info(fileName)$size), ' ')[[1]],
                   collapse = "\n"),
           sep = "\t", header = TRUE)

#   A B C
# 1 3 2 1
# 2 4 5 6
