Best way to filter using awk or grep
Question
I have a file with the following content format:
1234567890 ->
2345678901 -> /some/directory/some_file.txt
I want to run a grep or awk command that returns only the lines that contain a file path, not the lines that have just the number and -> combination. My current attempt is:
awk '/^[[:digit:]]+\/data\/.*/gm' file_to_test.txt
However, it returns all the lines. I feel like the solution is really simple and I am just not seeing it. Any help would be appreciated.
Answer 1
Score: 3
awk 'NF==3 {print $0}' file_to_test.txt
For any line that has three fields (NF==3), print the whole line (print $0).
As markp-fuso pointed out, awk's default action for a pattern without an action block is to print the whole line, so the above can be shortened to:
awk 'NF==3' file_to_test.txt
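A quick way to check this against the sample from the question is sketched below. The temporary file and the variable names are illustrative, not part of the answer. Note that NF==3 assumes the path contains no whitespace; a path with a space would produce more than three fields, so NF>=3 is shown as a more lenient variant under that assumption.

```shell
# Recreate the two sample lines from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' '1234567890 ->' '2345678901 -> /some/directory/some_file.txt' > "$sample"

# Keep only lines with exactly three whitespace-separated fields.
result=$(awk 'NF==3' "$sample")
printf '%s\n' "$result"

# Hypothetical line whose path contains a space: it has four fields,
# so NF==3 would skip it, while NF>=3 keeps it.
lenient=$(printf '%s\n' '3456789012 -> /some/dir/my file.txt' | awk 'NF>=3')
printf '%s\n' "$lenient"

rm -f "$sample"
```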
Answer 2
Score: 1
If the file paths always contain a / (you did not say so explicitly, but your example suggests it), a simple
grep -F / file_to_test.txt | cut -w -f 3-
should do. The grep selects the lines, and the cut extracts the file path from each line. -w specifies that the fields in the line are separated by whitespace; note that -w is available in BSD cut (for example on macOS) but not in GNU cut. I use 3-, i.e. everything from field 3 to the end, to allow file paths to contain spaces.
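Where cut has no -w option, a sketch along the following lines may serve instead. It assumes the fields are separated by exactly one space, as in the sample from the question; the temporary file and the paths variable are illustrative names.

```shell
# Recreate the two sample lines from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' '1234567890 ->' '2345678901 -> /some/directory/some_file.txt' > "$sample"

# Select lines containing a slash, then keep field 3 onward,
# splitting on single spaces (-d ' ') rather than on runs of whitespace.
paths=$(grep -F / "$sample" | cut -d ' ' -f 3-)
printf '%s\n' "$paths"

rm -f "$sample"
```

If the separators can be tabs or repeated spaces, this variant would not split the fields the way -w does.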
Answer 3
Score: 0
You could use sed to delete the lines that contain only the digits and the arrow.
sed -e '/[0-9]* -> *$/d' ./some_file.txt
Output:
2345678901 -> /some/directory/some_file.txt
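Since the question asked for grep or awk, the same idea can be sketched with grep -v, which drops the matching lines instead of deleting them with sed. The pattern here is anchored at the start of the line and is an assumption about the format (digits, one space, the arrow, optional trailing whitespace); the temporary file and the kept variable are illustrative names.

```shell
# Recreate the two sample lines from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' '1234567890 ->' '2345678901 -> /some/directory/some_file.txt' > "$sample"

# -E enables extended regular expressions, -v inverts the match:
# lines that are only "digits ->" plus optional whitespace are dropped.
kept=$(grep -Ev '^[0-9]+ ->[[:space:]]*$' "$sample")
printf '%s\n' "$kept"

rm -f "$sample"
```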
Answer 4
Score: 0
echo '1234567890 ->
2345678901 -> /some/directory/some_file.txt' |
mawk 'NF *= _ < $NF' FS='^[^/]+' OFS=
Output:
/some/directory/some_file.txt
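As far as I can tell, the one-liner sets the field separator to the leading run of non-slash characters, so a line with a path splits into an empty first field plus the path, while a line without a slash ends up with only empty fields and is dropped. A more explicit sketch that gives the same output on the sample uses the standard awk functions match() and substr(); the only_paths variable is an illustrative name.

```shell
# Print each line from its first slash onward; lines without a slash are skipped.
# match() sets RSTART to the position where the pattern begins.
only_paths=$(printf '%s\n' '1234567890 ->' '2345678901 -> /some/directory/some_file.txt' |
  awk 'match($0, /\/.*/) { print substr($0, RSTART) }')
printf '%s\n' "$only_paths"
```

Like the original, this prints only the path rather than the whole line.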
Answer 5
Score: 0
> solution is really simple
If that is what you want, I would use GNU AWK as follows. Let file.txt contain
1234567890 ->
2345678901 -> /some/directory/some_file.txt
then
awk '$3' file.txt
gives the output
2345678901 -> /some/directory/some_file.txt
Explanation: lines are treated as columns separated by one or more whitespace characters (the GNU AWK default), and only the lines whose third column is truthy are printed. Note that GNU AWK lets you reference a field beyond the last one on a line; such a field is simply empty. Disclaimer: this solution assumes the path never consists solely of 0 digits, which awk would evaluate as false; if that can happen, do not use it.
(tested in GNU Awk 5.1.0)
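The caveat in the disclaimer can be illustrated with a small sketch. The input line with a third field of 0 is hypothetical, and the comparison against the empty string is one possible way around the problem, not something the answer proposes.

```shell
# Hypothetical line whose third field is just 0.
line='1234567890 -> 0'

# A bare $3 pattern treats the numeric-looking 0 as false, so nothing is printed.
truthy=$(printf '%s\n' "$line" | awk '$3')

# Comparing against the empty string tests for presence instead of truthiness.
explicit=$(printf '%s\n' "$line" | awk '$3 != ""')

# On the sample from the question, the explicit test still keeps only the path line.
filtered=$(printf '%s\n' '1234567890 ->' '2345678901 -> /some/directory/some_file.txt' |
  awk '$3 != ""')

printf 'bare: [%s]\nexplicit: [%s]\nfiltered: [%s]\n' "$truthy" "$explicit" "$filtered"
```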