2020-01-04 00:46:19

Find first occurrence of a character in column of a data frame in R

Question

Struggling with string handling in R...

I've got a column of strings in an R data frame. Each one contains the "=" character exactly once. I'd like to know the position of the "=" character in each element of the column, as a step towards splitting the column into two separate columns (one for the part before the "=" and one for the part after it). Can anyone help, please? I'm sure it's simple, but I'm struggling to find the answer.

For example, if I have:

x <- data.frame(string = c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5"))

I'd like a bit of code to return:

(3, 3, 3, 2, 2, 4)

Thank you.

Answer 1

Score: 1

Here's one way to do it:

library(stringr)
# str_locate() returns a start/end matrix; column 1 holds the start positions
str_locate(x$string, "=")[, 1]

Answer 2

Score: 1

In base R you can do:

# split each string into single characters, then find the index of "="
as.numeric(lapply(strsplit(as.character(x$string), ""), function(x) which(x == "=")))
#[1] 3 3 3 2 2 4
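The one-liner above relies on every string containing exactly one "="; with none, or with several, `which()` returns zero or multiple positions and the `as.numeric()` call fails. A minimal sketch of a more defensive variant, assuming you want `NA` when there is no "=" and only the first position otherwise (the sample vector `s` and the name `first_eq` are made up for illustration):

```r
# Hypothetical sample data: one string has no "=", another has two.
s <- c("aa=1", "b=2", "abc", "k=v=w")

# Split every string into single characters.
chars <- strsplit(s, "")

# which(...)[1] keeps the first match; indexing an empty result gives NA.
first_eq <- vapply(chars, function(ch) which(ch == "=")[1], integer(1))

first_eq
#[1]  3  2 NA  2
```

`vapply()` is used instead of `lapply()` so the result is guaranteed to be an integer vector with one entry per string.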

Answer 3

Score: 1

You can use gregexpr:

# gregexpr() returns all match positions per string; min() keeps the first
unlist(lapply(gregexpr(pattern = '=', x$string), min))
#[1] 3 3 3 2 2 4
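To see what `gregexpr()` hands back before `min()` is applied, here is a small sketch on made-up strings (the vector `s` and the names `all_pos`, `n_matches` and `first_pos` are assumptions for illustration). It relies on the documented behaviour that a string with no match is reported as `-1`:

```r
# Hypothetical sample data: one, two and zero "=" characters.
s <- c("aa=1", "k=v=w", "none")

# One list element per string, holding every match position (-1 if none).
all_pos <- gregexpr("=", s, fixed = TRUE)

# Number of "=" characters found in each string.
n_matches <- vapply(all_pos, function(p) sum(p > 0), integer(1))

# First position per string; -1 marks "no match".
first_pos <- vapply(all_pos, function(p) as.integer(p[1]), integer(1))

n_matches
first_pos
```

`fixed = TRUE` treats the pattern as a literal string, which makes no difference for "=" but avoids surprises with characters that are special in regular expressions.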

Answer 4

Score: 1

To get the position of "=" you can use the regexpr function:

regexpr("=", x$string)
#[1] 3 3 3 2 2 4
#attr(,"match.length")
#[1] 1 1 1 1 1 1
#attr(,"useBytes")
#[1] TRUE

However, as @Michael stated, if your goal is to split the string you can use strsplit:

# as.character() guards against factor columns (the data.frame default before R 4.0)
strsplit(as.character(x$string), "=")
#[[1]]
#[1] "aa" "1" 
#
#[[2]]
#[1] "aa" "2" 
#
#[[3]]
#[1] "aa" "3" 
#
#[[4]]
#[1] "b" "1"
#
#[[5]]
#[1] "b" "2"
#
#[[6]]
#[1] "abc" "5"

Or combine it with do.call and rbind to bind the pieces into a two-column character matrix, which can then be converted to a data frame:

do.call(rbind, strsplit(as.character(x$string), "="))
#     [,1]  [,2]
#[1,] "aa"  "1" 
#[2,] "aa"  "2" 
#[3,] "aa"  "3" 
#[4,] "b"   "1" 
#[5,] "b"   "2" 
#[6,] "abc" "5"
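If you do want to go through the positions, as the question describes, they can be fed straight into `substr()`. A rough sketch, assuming every string contains exactly one "=" (the column names `key` and `value` are invented for this example):

```r
x <- data.frame(string = c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5"),
                stringsAsFactors = FALSE)

# Position of "=" in each string; as.integer() drops the match attributes.
pos <- as.integer(regexpr("=", x$string, fixed = TRUE))

# substr() is vectorised over start and stop, so no loop is needed.
x$key   <- substr(x$string, 1, pos - 1)
x$value <- substr(x$string, pos + 1, nchar(x$string))

x
```

A string without any "=" would give a position of -1 here, so it is worth checking `any(pos < 0)` before trusting the two new columns.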

Answer 5

Score: 1

Here is another solution that gives a two-column result, the first column containing the characters before = and the second the characters after it. You can do this without obtaining the position of the = character.

library(stringr)  # not strictly needed here: strsplit() is base R
# as.character() guards against factor columns (the data.frame default before R 4.0)
t(as.data.frame(strsplit(as.character(x$string), "=")))
#              [,1]  [,2]
#c..aa....1..  "aa"  "1" 
#c..aa....2..  "aa"  "2" 
#c..aa....3..  "aa"  "3" 
#c..b....1..   "b"   "1" 
#c..b....2..   "b"   "2" 
#c..abc....5.. "abc" "5"
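The result above is a character matrix with auto-generated row names. If a proper data frame with named columns is the goal, one possible sketch is below; the names `key` and `value` are made up, and converting the second part with `as.integer()` assumes everything after the "=" is a whole number, as in the sample data:

```r
s <- c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5")

# Bind the split pieces row by row into a two-column character matrix.
parts <- do.call(rbind, strsplit(s, "=", fixed = TRUE))

# Build a data frame with explicit column names and a numeric second column.
out <- data.frame(key   = parts[, 1],
                  value = as.integer(parts[, 2]),
                  stringsAsFactors = FALSE)

out
```

This only works cleanly when every string splits into exactly two pieces; otherwise `rbind()` recycles values to fill the rows.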

Answer 6

Score: 0

Some may find this more readable:

library(tidyverse)
x %>%
  mutate(
    number = string %>% str_extract('[:digit:]+'),  # first run of digits
    text = string %>% str_extract('[:alpha:]+')     # first run of letters
  ) %>%
  as_tibble()
# A tibble: 6 x 3
#  string number text 
#  <fct>  <chr>  <chr>
#1 aa=1   1      aa   
#2 aa=2   2      aa   
#3 aa=3   3      aa   
#4 b=1    1      b    
#5 b=2    2      b    
#6 abc=5  5      abc
