Match substring with regex inside file with bash

Question


I have a text file with the following content:

┌─ This is folder description
│
├─ ##### ─────── Magenta 
│
├─ Size ──────── 245.7 GiB
│
├─ URL ───────── https://www.google.com
│
│
├─ Video ─────── 23.976fps
│
├─ Audio ─────┬─ French 
│             │         
│             │
│             └─ German
│
├─ Subtitles ─── French, German, Italian, C#

I want to verify whether French appears between Audio and Subtitles; I don't care about the rest.

I tried:

grep -E 'Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))' test.txt

But there is no match. I'm pretty sure I'm using grep incorrectly. Can you point me to the correct way to do this?

Answer 1

Score: 3


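By default grep reads one line at a time, so a pattern cannot match across several lines. With GNU grep you can use -z (which makes grep treat the whole file as a single record, provided the file contains no NUL bytes) and -o (to print only the matched part, which is my guess at the output you want):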

$ grep -Ezo 'Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))' test.txt
Audio ─────┬─ French
│             │
│             │
│             └─ German
│
├─ Subtitles
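
If only a yes/no answer is needed, the same whole-file match can be reduced to an exit status with -q. This is a minimal sketch, assuming GNU grep (for -z) and a slightly simplified pattern; the helper name `has_french_audio` and the temporary sample file are made up for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds (exit 0) if French/Fran(z) appears after
# "Audio" and before a later "Sub..." label anywhere in the file.
# -z: NUL-separated records, so a normal text file is one record and
#     .* can span newlines.
# -q: print nothing; report the result through the exit status only.
has_french_audio() {
    grep -Ezq 'Audio.*(French|Franz?).*Sub(s|titles?|titulos)' "$1"
}

# Recreate a shortened version of the sample from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
├─ Video ─────── 23.976fps
│
├─ Audio ─────┬─ French
│             │
│             └─ German
│
├─ Subtitles ─── French, German, Italian, C#
EOF

if has_french_audio "$sample"; then
    echo "French found between Audio and Subtitles"
else
    echo "French not found between Audio and Subtitles"
fi
rm -f "$sample"
```

One caveat of this sketch: `.*` is greedy over the whole file, so if another "Sub..." label appeared after the Subtitles line, a French entry on the Subtitles line itself could also satisfy the pattern.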

Alternatively, using any awk:

$ awk '
    { rec = rec $0 ORS }
    END {
        if ( match(rec,/Audio(.*)(French|Franz?)(.*)Sub(s|(titles?)|(titulos))/) ) {
            print substr(rec,RSTART,RLENGTH)
        }
    }
' test.txt
Audio ─────┬─ French
│             │
│             │
│             └─ German
│
├─ Subtitles
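
If the goal is only to check the lines from Audio up to (but not including) Subtitles, a line-by-line state flag is an alternative to building the whole file into one string. Note that this is a different technique from the match() call above, sketched under the assumption that the labels Audio and Subtitles each appear once, as in the sample; the function name `french_in_audio` is hypothetical:

```shell
#!/usr/bin/env bash

# Hypothetical helper: exit 0 if French/Fran(z) occurs on a line inside the
# Audio block, i.e. from the Audio line up to the line before Subtitles.
french_in_audio() {
    awk '
        /Subtitles/ { inside = 0 }              # the Audio block has ended
        /Audio/     { inside = 1 }              # the Audio block starts here
        inside && /French|Franz?/ { found = 1 } # only counted while inside
        END { exit !found }                     # exit 0 only if French was seen
    ' "$1"
}

# Recreate a shortened version of the sample from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
├─ Audio ─────┬─ French
│             │
│             └─ German
│
├─ Subtitles ─── French, German, Italian, C#
EOF

if french_in_audio "$sample"; then
    echo "French audio present"
else
    echo "French audio missing"
fi
rm -f "$sample"
```

Because the flag is cleared before the French test runs, a French entry on the Subtitles line is not counted.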

Answer 2

Score: 0


If there are always exactly 2 audio tracks, then it can be as simple as:

grep -A3 Audio file.txt | grep French

Or, using sed to select everything from Audio to Subtitles:

sed -n '/Audio/,/Subtitles/p' file.txt | grep -v Subtitles | grep French
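
The sed pipeline prints the matching line; in a script it may be more convenient to get only an exit status. The following sketch folds the `grep -v Subtitles` step into sed and uses `grep -q`, assuming the same Audio and Subtitles labels as in the sample (the function name `audio_has_french` is hypothetical):

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the Audio..Subtitles range without the
# Subtitles line itself, then test quietly for French.
# The function's exit status is that of grep -q.
audio_has_french() {
    sed -n '/Audio/,/Subtitles/{/Subtitles/!p;}' "$1" | grep -q 'French'
}

# Recreate a shortened version of the sample from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
├─ Audio ─────┬─ French
│             │
│             └─ German
│
├─ Subtitles ─── French, German, Italian, C#
EOF

if audio_has_french "$sample"; then
    echo "French audio present"
else
    echo "French audio missing"
fi
rm -f "$sample"
```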

huangapple
  • Published on May 29, 2023, 21:14:34
  • Source: https://go.coder-hub.com/76357710.html