2023-07-20 13:30:33

How to extract the names from a "name1 11/22 name2 33 / 44 name3 last3 55/66" in bash

Question

My input data is formatted as repetitions of the pattern '<name> <a>/<b>'. The pattern may occur zero or more times on the same line, and there may be extra spaces around the '/'.

I expect to extract the names as

name1
name2
name3 last3

Here is my incorrect attempt:

echo "name1 11/22 name2 33 / 44 name3 last3 55/66" \
 | grep -o -E '\b[a-zA-Z][a-zA-Z0-9. ]+\b'

which extracts

name1 11
name2 33
name3 last3 55

The script should also produce no output for an empty line.

Answer 1

Score: 1

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
    awk -F' *[0-9]* */ *[0-9]* *' '{for(i=1;i<NF;i++) print $i}'
name1
name2
name3 last3
$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
    awk -F' *[0-9]* */ *[0-9]* *' -v OFS='\n' '{NF-=1}1'
name1
name2
name3 last3
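
The question also asks that an empty line produce no output. As a rough check of that, the first command above can be wrapped in a small function and fed several lines at once. The name `extract_names` and the extra input lines are made up for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# extract_names is a hypothetical helper name, not from the original answer.
# It splits each line on the "<a>/<b>" groups and prints every field except
# the last one, which is the empty remainder after the final group.
extract_names() {
    awk -F' *[0-9]* */ *[0-9]* *' '{ for (i = 1; i < NF; i++) print $i }'
}

# Three input lines: the sample from the question, an empty line,
# and an invented single-name line.
printf '%s\n' 'name1 11/22 name2 33 / 44 name3 last3 55/66' '' 'solo 7/8' |
    extract_names
```

On the empty line `NF` is 0, so the loop body never runs and nothing is printed for it.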

Answer 2

Score: 0

With grep you could easily extract the <a>/<b> parts:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
  grep -oE '[^[:space:]]+[[:space:]]*/[[:space:]]*[^[:space:]]+'
11/22
33 / 44
55/66

But since you do not want to print these parts but replace them with newlines, sed or awk are probably better choices. Example with sed:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
  sed 's![[:space:]]*[^[:space:]][^[:space:]]*[[:space:]]*/[[:space:]]*[^[:space:]][^[:space:]]*[[:space:]]*!\n!g'
name1
name2
name3 last3

Or, with GNU sed:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
  sed -E 's!\s*\S+\s*/\s*\S+\s*!\n!g'
name1
name2
name3 last3

Note that a newline is also added after the last name of a line, leading to empty lines in the output. If this is not acceptable, we can handle the last <a>/<b> part of a line separately, deleting it instead of replacing it. Example with GNU sed:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
  sed -E 's!\s*\S+\s*/\s*\S+\s*$!!;s!\s*\S+\s*/\s*\S+\s*!\n!g'
name1
name2
name3 last3
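
If the local sed does not accept `\n` in the replacement (that escape is not portable across sed implementations), the same replace-with-newline idea can be sketched with awk's `gsub`, where `"\n"` inside a string is well defined. This is only a sketch: it assumes the parts to drop are purely numeric, as in the sample input, and `names_gsub` is a made-up helper name.

```shell
#!/bin/sh
# names_gsub is a hypothetical helper, not part of the original answer.
# Assumption: the parts to drop look like "<digits> / <digits>" with
# optional plain spaces around them, as in the question's sample line.
names_gsub() {
    awk '{
        # Turn every number/number group (and the spaces around it) into a newline.
        gsub(/ *[0-9]+ *\/ *[0-9]+ */, "\n")
        # Drop the newline left behind by the group that ended the line.
        sub(/\n$/, "")
        # Skip lines that are empty after the replacement.
        if ($0 != "") print
    }'
}

printf '%s\n' 'name1 11/22 name2 33 / 44 name3 last3 55/66' '' | names_gsub
```

Unlike the field-separator approach, this variant still prints a line that contains a name but no number group, so it fits only if that behavior is acceptable.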

With awk we can define the field separator as the <a>/<b> parts you want to remove and print each field (except the last, empty one) on a separate line:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
  awk -F '[[:space:]]*[^[:space:]]+[[:space:]]*/[[:space:]]*[^[:space:]]+[[:space:]]*' '
    {for(i=1;i<NF;i++) print $i}'
name1
name2
name3 last3

Or, with GNU awk:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
  awk -F '\\s*\\S+\\s*/\\s*\\S+\\s*' '{for(i=1;i<NF;i++) print $i}'
name1
name2
name3 last3

Answer 3

Score: 0

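If your `grep` supports the `-P` (PCRE) option, try:
```shell
echo "name1 11/22 name2 33 / 44 name3 last3 55/66" \
 | grep -oP '\b[a-zA-Z][a-zA-Z0-9. ]+\b(?=\s+\d+\s*/\s*\d+)'
```

Output:

    name1
    name2
    name3 last3

`(?=\s+\d+\s*/\s*\d+)` is a lookahead assertion that matches the following sequence:

- one or more whitespace characters
- one or more digits
- zero or more whitespace characters
- a slash character
- zero or more whitespace characters
- one or more digits

The substring matched by the lookahead is not included in the output.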
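
To see what the lookahead contributes, the sketch below runs the same pattern with and without it. Since `-P` is a GNU grep extension that some builds lack, the script first probes for it; the probe and the variable name `line` are illustrative choices, not part of the original answer.

```shell
#!/bin/sh
line='name1 11/22 name2 33 / 44 name3 last3 55/66'

# grep -P is a GNU grep extension; probe for it before relying on it.
if printf 'a1\n' | grep -qP 'a(?=\d)' 2>/dev/null; then
    echo 'without lookahead:'
    # The character class allows digits and spaces, so the first number is swallowed.
    printf '%s\n' "$line" | grep -oP '\b[a-zA-Z][a-zA-Z0-9. ]+\b'
    echo 'with lookahead:'
    # The match must now be followed by "<digits>/<digits>", which is checked
    # but not consumed, so only the name is printed.
    printf '%s\n' "$line" | grep -oP '\b[a-zA-Z][a-zA-Z0-9. ]+\b(?=\s+\d+\s*/\s*\d+)'
else
    echo 'this grep has no PCRE (-P) support' >&2
fi
```

Without the lookahead the pattern reproduces the unwanted result from the question; with it, the regex engine backtracks until the match ends right before a number pair.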


Answer 4

Score: 0

Using GNU awk for multi-char `RS`, `RT`, and `\s`/`\S`, this might be what you want:

$ echo "name1 11/22 name2 33 / 44 name3 last3 55/66" |
    awk -v RS='\\s+\\S+\\s*/\\s*\\S+\\s*' 'RT'
name1
name2
name3 last3


