June 26, 2023, 15:49:07

Find lines that exist in A but not in B

Question

This is A:

SomeFollowingText 12-45-78-54 1.1.1.1
SomeFollowingText 78-56-14-54 2.2.2.2
SomeFollowingText 52-12-98-ab 3.3.3.3
SomeFollowingText 89-za-gg-99 4.4.4.4

This is B:

78-56-14-54
ab-g2-ax-47
52-12-98-ab
g2-47-n7-o9
56-13-z8-ab

The expected output:

SomeFollowingText 12-45-78-54 1.1.1.1
SomeFollowingText 89-za-gg-99 4.4.4.4

This is my failed attempt:

awk '{print $2}' /home/path/A | while read line
do
    if [ "grep $line /home/path/B" != "$line" ]
    then
        echo $line
    fi
done >> /home/path/non-match

But the non-match contains the whole A.

I also tried this:

cat /home/path/A | while read line
do
    if [ "$(grep $line /home/path/B)" != "$line | awk '{print $2}'" ]
    then
        echo $line
    fi
done >> /home/path/non-match


Answer 1

Score: 3

$ awk 'FNR==NR{a[$1];next} !($2 in a)' B A
SomeFollowingText 12-45-78-54 1.1.1.1
SomeFollowingText 89-za-gg-99 4.4.4.4
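
The one-liner above comes without explanation, so here is a commented sketch of the same idea run against the question's sample data. The scratch directory and the array name `seen` are illustrative choices, not part of the original answer, and the sketch assumes B is non-empty (with an empty B, `FNR == NR` would also hold while reading A).

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/A" <<'EOF'
SomeFollowingText 12-45-78-54 1.1.1.1
SomeFollowingText 78-56-14-54 2.2.2.2
SomeFollowingText 52-12-98-ab 3.3.3.3
SomeFollowingText 89-za-gg-99 4.4.4.4
EOF

cat > "$tmp/B" <<'EOF'
78-56-14-54
ab-g2-ax-47
52-12-98-ab
g2-47-n7-o9
56-13-z8-ab
EOF

result=$(awk '
    # While reading the first file (B), FNR equals NR.
    # Referencing seen[$1] creates the key; no value is needed.
    FNR == NR { seen[$1]; next }
    # For the second file (A): print the line if field 2 is not a key.
    # The "in" test does not create new array entries.
    !($2 in seen)
' "$tmp/B" "$tmp/A")

printf '%s\n' "$result"
```

Note that B is passed first, so the lookup table is complete before any line of A is examined.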


Answer 2

Score: 1

if [ "grep $line /home/path/B" != "$line" ] compares two strings: one is, for example, grep 12-45-78-54 /home/path/B and the other is 12-45-78-54. Those will always differ.

What you apparently want is to compare the output of the grep command to $line, so you should use if [ "$(grep $line /home/path/B)" != "$line" ] instead.

However, this still will not give you the expected output: awk has already dropped the 1st and 3rd fields from each input line, and you can't recover them inside the loop.

A much simpler approach to your problem would be grep -vf B A >non-match, as @tshiono suggested.
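
As a rough sketch of both points, the loop can keep the whole line by reading it into separate fields, and the grep approach can be tightened with `-F` (fixed strings rather than regular expressions) and `-w` (whole-word matches). The variable names and scratch directory are hypothetical, the loop assumes single-space separators as in the sample, and `-w` is a common extension (GNU and BSD grep) rather than a POSIX option. Unlike the awk solutions, the grep variant matches a key anywhere in the line, not only in the second field.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/A" <<'EOF'
SomeFollowingText 12-45-78-54 1.1.1.1
SomeFollowingText 78-56-14-54 2.2.2.2
SomeFollowingText 52-12-98-ab 3.3.3.3
SomeFollowingText 89-za-gg-99 4.4.4.4
EOF

cat > "$tmp/B" <<'EOF'
78-56-14-54
ab-g2-ax-47
52-12-98-ab
g2-47-n7-o9
56-13-z8-ab
EOF

# Corrected loop: split each line of A into three fields so the
# full line can be rebuilt when the key is missing from B.
loop_result=$(
    while read -r label key addr; do
        # -q: quiet, -x: match the whole line of B, -F: fixed string
        if ! grep -qxF -- "$key" "$tmp/B"; then
            printf '%s %s %s\n' "$label" "$key" "$addr"
        fi
    done < "$tmp/A"
)

# grep-only variant: drop every line of A containing a key from B.
grep_result=$(grep -vwFf "$tmp/B" "$tmp/A")

printf '%s\n' "$loop_result"
printf '%s\n' "$grep_result"
```

The loop starts one grep process per line of A, so for large inputs the single grep or awk invocation is likely the better fit.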

Answer 3

Score: 1

You can do this pretty easily in awk:

awk '(NR==FNR) {b[$1]=$1}; (NR>FNR && ! b[$2]) {print}' /home/path/B /home/path/A >/home/path/non-match

Explanation: awk reads through all lines in all supplied input files (/home/path/B and /home/path/A, in that order). FNR is the line ("record") number within the current file, and NR is the overall line number, so comparing them is an easy way to tell whether awk is processing the first file (they'll be equal) or not (unequal).

(NR==FNR) {b[$1]=$1} means: for each line in the first file (/home/path/B), add the first (only) field to an array named b, indexed by itself.

(NR>FNR && ! b[$2]) {print} means: for each line in the second file (/home/path/A), if the second field ($2) isn't in the b array, print the line.
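
To see how NR and FNR diverge once awk moves to the second file, a small sketch with two throwaway files (the names `first` and `second` and their contents are made up for illustration) prints both counters next to each line:

```shell
# Two small illustrative input files in a scratch directory.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'b1\nb2\n' > "$tmp/first"
printf 'a1\na2\na3\n' > "$tmp/second"

# Print each line followed by NR (overall count) and FNR (per-file count).
counters=$(awk '{ print $0, NR, FNR }' "$tmp/first" "$tmp/second")

printf '%s\n' "$counters"
# b1 1 1
# b2 2 2
# a1 3 1
# a2 4 2
# a3 5 3
```

NR equals FNR only on the lines of the first file; from the first line of the second file onward, NR is larger.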
