May 17, 2023 17:51:42

Reading a data file containing multiple IDs into different CSVs

Question

Given a file with the following data structure:

FIXED=0
LINES=1
POINTS=5
390 397
390 396
389 395
389 394
388 393
IMAGE=Name1.jpg
ID=1 
FIXED=0
LINES=1
POINTS=4
255 503
256 502
256 501
256 500
IMAGE=Name2.jpg
ID=2 
FIXED=0
LINES=1
POINTS=6
262 431
262 430
262 429
262 428
262 427
262 426
IMAGE=Name3.jpg
ID=3

Where:

The lines between FIXED and ID belong to an individual
The numbers represent two columns of variables

How would we read in the data and then transform it into individual .csv files, where:

The name of each .csv is the text after IMAGE= without the extension (Name1, Name2, Name3, ...)
The first column of Name1.csv is the first column of numbers (390 390 389 389 388)
The second column of Name1.csv is the second column of numbers (397 396 395 394 393)
The same applies to Name2.csv, Name3.csv, and so on
The FIXED=0, LINES=1, POINTS=5, and ID=1 lines can be discarded

Please note that the number of rows between POINTS and IMAGE is not constant.

Answer 1

Score: 1

You could try this method:

library(stringr)

# Read the data from file
data <- readLines("your_file.txt")

# Rows collected so far for the individual currently being read
current_data <- NULL

# Process each line of the data
for (line in data) {
  if (str_starts(line, "POINTS=")) {
    # A POINTS= line opens a new block of coordinates, so start empty
    current_data <- NULL

  } else if (str_detect(line, "^\\s*\\d+\\s+\\d+\\s*$")) {
    # Extract the two numbers from the line and append them as a new row
    numbers <- as.numeric(str_split(str_trim(line), "\\s+")[[1]])
    current_data <- rbind(current_data, numbers)

  } else if (str_starts(line, "IMAGE=")) {
    # IMAGE= closes the block: strip the prefix and the file extension
    individual_name <- str_remove(str_trim(line), "^IMAGE=")
    individual_name <- str_remove(individual_name, "\\.[^.]+$")

    # Save the rows collected for this individual to its own CSV file
    if (!is.null(current_data)) {
      csv_file <- paste0(individual_name, ".csv")
      write.csv(current_data, file = csv_file, row.names = FALSE)
    }
    current_data <- NULL
  }
  # FIXED=, LINES= and ID= lines match none of the branches and are skipped
}
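
For comparison, here is a minimal base R sketch of the same idea without stringr. It assumes, as in the sample, that every record ends with its IMAGE= line and that coordinate rows are two whitespace-separated integers. The function name split_points_file, the out_dir argument, and the x/y column names are hypothetical choices for illustration; none of them come from the question.

```r
# Hypothetical helper (not from the original answer): write one CSV per
# IMAGE= entry, using base R only.
split_points_file <- function(path, out_dir = ".") {
  lines <- trimws(readLines(path))

  # Each record ends at its IMAGE= line, so the number of IMAGE= lines seen
  # strictly before a line is the index of the record that line belongs to.
  is_image <- startsWith(lines, "IMAGE=")
  record <- cumsum(c(0, head(is_image, -1)))

  # Coordinate rows: exactly two whitespace-separated integers.
  is_point <- grepl("^[0-9]+\\s+[0-9]+$", lines)

  written <- character(0)
  for (i in which(is_image)) {
    rows <- lines[is_point & record == record[i]]
    if (length(rows) == 0) next  # no coordinates for this image

    # Parse the "x y" strings into a two-column data frame.
    points <- read.table(text = rows, col.names = c("x", "y"))

    # "IMAGE=Name1.jpg" -> "Name1"
    name <- tools::file_path_sans_ext(sub("^IMAGE=", "", lines[i]))
    out <- file.path(out_dir, paste0(name, ".csv"))
    write.csv(points, out, row.names = FALSE)
    written <- c(written, out)
  }
  invisible(written)
}

# Usage: write a small sample to a temporary file and split it.
sample_path <- tempfile(fileext = ".txt")
writeLines(c("FIXED=0", "LINES=1", "POINTS=2", "390 397", "390 396",
             "IMAGE=Name1.jpg", "ID=1 "), sample_path)
out_files <- split_points_file(sample_path, out_dir = tempdir())
read.csv(out_files[1])
```

Because rows are grouped by record index instead of being counted, the POINTS= value is never needed, which fits the note that the number of rows per record varies.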
