bash script - if statement with grep
Question
I should preface this by saying that I am new to bash scripting. I am attempting to parse a log file and write the information I am looking for to a text file.
I have made a start on my script, but I am now stuck on a conditional statement. I'm not sure grep is still the best tool for this; perhaps awk or sed would be better.
Example file contents:
2022-1-3 14:00:00 ERROR THREAD234 - error info here
2022-1-4 02:00:00 WARNI THREAD235 - warning info here
additional warning info here, sometimes includes word error, but i do not want to capture this additional line as it is a warning
2022-2-3 01:00:00 ERROR THREAD333 - error info here2
error info continued, sometimes there are multiple lines to an error and they do not all include the word error. however, these additional lines to do not include date/times. these are typically stack traces.
2023-3-4 11:00:00 INFO0 THREAD333 - info here
2022-2-5 01:00:00 ERROR THREAD333 - error info here3
2022-2-6 06:00:00 ERROR THREAD333 - error info here3
Desired output:
1 ERROR - error info here
1 ERROR - error info here2
error info continued, sometimes includes a tab at the beginning of this line and sometimes does not
2 ERROR - error info here3
Current output:
1 ERROR - error info here
1 ERROR - error info here2
2 ERROR - error info here3
My end goal:
I am trying to grab only the errors, plus the lines that follow an error when they continue that error's information. My thought is to use a conditional: if the line after an ERROR line does not start with a date, print it as well; if it does start with a date, print only the error.
I do not want to include the date, time, or thread in my output, and I do not want error info to repeat in the output.
My bash script currently works, but I need to fine-tune it with a condition that includes the next line when the error continues.
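#!/bin/bash
# Prompt for the log path; -r keeps backslashes in the input literal.
read -rp "File path to log, no spaces: " file
outputFile=Desktop/errorOutput.txt
# Keep ERROR lines, cut out the level and message columns, then count duplicates.
error=$(grep ERROR "$file" | cut -b 25-32,47-1000 | sort | uniq -c)
# The redirection creates the file, so no separate touch is needed.
echo "$error" > "$outputFile"
cat "$outputFile"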
I attempted an if statement around the grep, but the logic was flawed. I'm currently trying to figure it out with awk instead.
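For reference, the conditional described above (a line that starts with a date begins a new entry, anything else continues the current one) can be sketched in plain bash. This is only an illustration, not the asker's script: the function name summarize_errors, the helper flush, and the entry pattern are all assumptions based on the sample log, and it needs bash 4 or later for associative arrays.

```shell
#!/bin/bash
# Sketch only: summarize_errors, flush, and entry_re are hypothetical names,
# and the pattern assumes the "<date> <time> <LEVEL> <THREAD> - message" layout
# shown in the sample log. Requires bash 4+ (associative arrays).
summarize_errors() (
    declare -A count=()   # record text -> number of occurrences
    order=()              # distinct records in first-seen order
    record=""             # ERROR record currently being collected
    in_error=0            # 1 while the current entry is an ERROR
    # Group 1 is the level, group 2 is "- message" (date, time, thread dropped).
    entry_re='^[0-9]+-[0-9]+-[0-9]+ [0-9:]+ ([^ ]+) [^ ]+ (- .*)$'

    # Store the collected record, if any, and count repeats.
    flush() {
        [[ -n $record ]] || return 0
        [[ -n ${count[$record]+set} ]] || order+=("$record")
        count[$record]=$(( ${count[$record]:-0} + 1 ))
        record=""
    }

    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $entry_re ]]; then
            # The line starts with a date, so the previous entry is complete.
            flush
            in_error=0
            if [[ ${BASH_REMATCH[1]} == ERROR ]]; then
                in_error=1
                record="ERROR ${BASH_REMATCH[2]}"
            fi
        elif (( in_error )); then
            # No date: this line continues the current ERROR entry.
            record+=$'\n'"$line"
        fi
    done < "$1"
    flush   # the last entry has no following date line

    for key in "${order[@]}"; do
        printf '%s %s\n' "${count[$key]}" "$key"
    done
)
```

Calling `summarize_errors logfile > Desktop/errorOutput.txt` would write the counts in first-seen order rather than sorted order. Continuation lines after a WARNI or INFO0 entry are skipped even when they contain the word error, because only the level on the dated line decides whether an entry is kept.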
Answer 1
Score: 1
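One potential solution using GNU AWK (PROCINFO["sorted_in"] is gawk-specific):
awk 'BEGIN { RS = "[[:digit:]-]+ [[:digit:]:]+ "; ORS = ""  # one record per dated entry
            PROCINFO["sorted_in"] = "@ind_str_asc" }        # iterate keys in sorted order
     # Drop the thread field from each ERROR record, then count identical records.
     /^ERROR/ { sub(/THREAD[^ ]* -/, "-"); count[$0]++ }
     END { for (rec in count) print count[rec], rec }' logfile
Output: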
1 ERROR - error info here
1 ERROR - error info here2
error info continued, sometimes there are multiple lines to an error and they do not all include the word error. however, these additional lines to do not include date/times. these are typically stack traces.
2 ERROR - error info here3
Answer 2
Score: 0
Here is a Ruby one-liner that does that:
ruby -e '
$<.read.scan(/(ERROR[\s\S]*?)(?=^\d{4}|\z)/).   # capture each ERROR up to the next date
  flatten(1).                                    # remove one match level
  map(&:strip).                                  # remove trailing whitespace and "\n"
  map{|e| e.sub(/^(ERROR\s+)[^-]+/,"\\1")}.      # remove THREAD part, keep "ERROR "
  group_by{|e| e[/ERROR\s+-\s+.*$/]}.            # group by error tag
  map{|k,v| puts "#{v.length} #{v[0]}"}          # print them
' file
Prints:
1 ERROR - error info here
1 ERROR - error info here2
error info continued, sometimes there are multiple lines to an error and they do not all include the word error. however, these additional lines to do not include date/times. these are typically stack traces.
2 ERROR - error info here3