2023-01-09 09:52:51

Conditional REGEX in R to select text in front of or behind a specific character --- "%" in this case

Question

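I have a character vector of addresses, formed by merging the contents of two different vectors. A "%" separates the data in each observation, left (1) from right (2). The data looks like this:

ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

I want to keep the data on the left side of the "%" even if there is something on the right, and keep the data on the right side if there is nothing on the left.

So the output should look like this:

123 Main St
555 Folsom Street
59 Hyde Street

I wrote the following conditional regex and used it in gsub, but it does not do what I thought it should do.

# Intended: look for %, then select what is behind the % (to drop) if there is
# something in front of the %, or what is after the % if there is nothing in front ...
pattrn_pct <- "/(?(?=%)..(%.*$)|(^.*%))/gm"
# Replace the selection with an empty string
gsub(pattrn_pct, "", ShippingStreet, perl = TRUE)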
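
Stated without regex, the rule above amounts to taking the first non-empty piece of each string when it is split on "%". The sketch below only illustrates that reading of the requirement; the helper name first_nonempty is hypothetical and does not appear in the original post.

```r
ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

# Hypothetical helper (not from the original post): split each string on a
# literal "%" and return the first piece that is not empty.
first_nonempty <- function(x, sep = "%") {
  pieces <- strsplit(x, sep, fixed = TRUE)
  vapply(pieces, function(p) {
    p <- p[nzchar(p)]  # drop empty pieces, such as the one before a leading "%"
    if (length(p) == 0) NA_character_ else p[1]
  }, character(1))
}

first_nonempty(ShippingStreet)
```

For the sample data this should return the three addresses listed as the desired output, which gives a baseline to compare any regex-based attempt against.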

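Note that the regular expression syntax you used may not work in R. You can try the following code to get the result you want:

ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")
# Remove a leading "%", or remove the first remaining "%" and everything after it
result <- gsub("^%|%.*$", "", ShippingStreet, perl = TRUE)
# Print the result
cat(result, sep = "\n")

For the sample data, this prints the expected output.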
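
One likely reason the pattern in the question never matches is that R patterns are plain strings: the surrounding slashes and the trailing gm in "/.../gm" are treated as literal characters, not as delimiters and flags. A minimal sketch of that behaviour, using a deliberately simple pattern rather than the one from the question:

```r
x <- "59 Hyde Street%"

# Written with JavaScript-style delimiters, the pattern looks for the literal
# text "/%/g", which does not occur in the string.
with_delims <- grepl("/%/g", x, perl = TRUE)

# Without the delimiters, the same pattern finds the "%".
without_delims <- grepl("%", x, perl = TRUE)

print(c(with_delims = with_delims, without_delims = without_delims))
```

In R, replacing every match is requested by calling gsub rather than sub, and options such as case-insensitivity are passed as arguments (for example ignore.case = TRUE) or as inline modifiers such as (?i) when perl = TRUE.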

Answer 1

Score: 1

We can use str_extract() from the stringr package here with the regex pattern [^%]+, which matches the first run of characters that are not "%":

library(stringr)

str_extract(ShippingStreet, "[^%]+")
[1] "123 Main St"       "555 Folsom Street" "59 Hyde Street"

Data:

ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")
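
If adding a package dependency is not desirable, roughly the same extraction can be sketched in base R with regexpr() and regmatches(). The wrapper name extract_first is hypothetical, and the extra "%" element is added here only to show the no-match case.

```r
ShippingStreet <- c("123 Main St%234 Center Street", "%555 Folsom Street",
                    "59 Hyde Street%")

# Hypothetical base-R counterpart to str_extract(x, "[^%]+").
# regmatches() silently drops elements without a match, so the result is
# written into an NA-filled vector to keep it aligned with the input.
extract_first <- function(x, pattern = "[^%]+") {
  m <- regexpr(pattern, x)
  out <- rep(NA_character_, length(x))
  hit <- !is.na(m) & m != -1
  out[hit] <- regmatches(x, m)
  out
}

extract_first(c(ShippingStreet, "%"))
```

The last element has no character other than "%", so it comes back as NA, which mirrors how str_extract() reports a missing match.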

Answer 2

Score: 0

Using sub in base R:

sub("^%?([^%]+).*", "\\1", ShippingStreet)
[1] "123 Main St"       "555 Folsom Street" "59 Hyde Street"
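
To make the pattern easier to read, the sketch below wraps the same sub() call in a hypothetical helper (first_part is not a name from the answer), annotates each piece, and tries a few inputs that are not in the sample data.

```r
# Hypothetical wrapper around the sub() call shown in the answer.
first_part <- function(x) {
  # ^%?      an optional "%" at the very start of the string
  # ([^%]+)  group 1: one or more characters that are not "%"
  # .*       whatever follows, which the replacement discards
  sub("^%?([^%]+).*", "\\1", x)
}

# Inputs chosen for illustration only: no "%" at all, only a "%",
# and a leading "%" followed by a second one.
edge_cases <- c("10 Oak Ave", "%", "%a%b")
first_part(edge_cases)
```

A string without "%" passes through unchanged, and a string consisting only of "%" is also returned as-is, because the pattern needs at least one non-"%" character and sub() leaves non-matching input untouched.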
