I need a file converter script
Question
I need to convert a file that looks like this:
Fri Apr 14 15:42:02 UTC 2023
MemTotal: 65039504 kB
MemFree: 41010436 kB
MemAvailable: 45100588 kB
Fri Apr 14 16:35:01 UTC 2023
MemTotal: 65039504 kB
MemFree: 40409508 kB
MemAvailable: 44902852 kB
Fri Apr 14 16:36:01 UTC 2023
MemTotal: 65039504 kB
MemFree: 40411232 kB
MemAvailable: 44905376 kB
To something that looks like this:
15:42:02,65039504,41010436,45100588
16:35:01,65039504,40409508,44902852
16:36:01,65039504,40411232,44905376
Here's the script that I came up with:
#!/bin/bash
set -x
export TIME
export MEMTOTAL
export MEMFREE
export MEMAVAIL
export FILE="./SMALL-SAMPLE.txt"
while read -a LINE from $FILE
do
    WORD1=${LINE[0]}
    WORD2=${LINE[1]}
    WORD3=${LINE[2]}
    WORD4=${LINE[3]}
    WORD5=${LINE[4]}
    WORD6=${LINE[5]}
    WORD7=${LINE[6]}
    case $WORD1 in
        "Fri")
            TIME=$WORD4
            ;;
        "MemTotal")
            MEMTOTAL=$WORD2
            ;;
        "MemFree")
            MEMFREE=$WORD2
            ;;
        "MemAvailable")
            MEMAVAIL=$WORD2
            ;;
        *)continue;;
    esac
    LINEOUT="$TIME,$MEMTOTAL,$MEMFREE,$MEMAVAIL"
    echo $LINEOUT
done < $FILE
Here's the output:
15:42:02,,,
16:35:01,,,
16:36:01,,,
I've got a rookie mistake hidden in that script somewhere...any ideas about why I cannot get my data in?
Answer 1
Score: 2
Because TMTOWTDI (There's More Than One Way To Do It), a shorter Perl version:
<file perl -anE '
  push @a, $F[3]||$F[1]||();
  say join ",", splice @a if @a > 3;
'
- `-n` makes Perl run the script for each record/line.
- `-a` turns on autosplit of records (by whitespace) into the array `@F`.
- Append to `@a` the first true value of `$F[3]` (timestamp) or `$F[1]` (value). If neither is set, for example on an empty line, appending `()` is a no-op.
- Once `@a` has 4 elements, print them joined by commas and empty the array (`splice @a` returns all elements and removes them).
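The same accumulate-and-flush idea can be sketched in plain bash. This is only an illustration under the assumption that timestamp lines have at least four words and memory lines have fewer; `collect_fours` is a hypothetical name, not something from the answer.

```bash
#!/bin/bash
# Sketch only: collect_fours is a hypothetical helper name.
collect_fours() {
  local -a acc=() fields
  local value
  while read -r -a fields; do
    # Timestamp lines have a 4th word; memory lines only have a 2nd one.
    value=${fields[3]:-${fields[1]:-}}
    # Empty lines yield no value and are skipped.
    [[ -n $value ]] && acc+=("$value")
    if (( ${#acc[@]} == 4 )); then
      (IFS=,; printf '%s\n' "${acc[*]}")   # join the four values with commas
      acc=()                               # start collecting the next record
    fi
  done
}

collect_fours <<'EOF'
Fri Apr 14 15:42:02 UTC 2023

MemTotal: 65039504 kB
MemFree: 41010436 kB
MemAvailable: 45100588 kB
EOF
# prints: 15:42:02,65039504,41010436,45100588
```

Like the Perl version, this relies on the order of the lines rather than on their labels, so a missing line would shift every later record.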
Your code has a few errors and stylistic issues.
- As @barmar stated, you should only print a line of output when you see the MemAvailable line (so you also don't need the `*`/`continue` clause).
- `read ... from $FILE` doesn't mean what you may think it does: `from` and the value of `$FILE` are taken as additional name arguments to `read`, not as an input source. The input actually comes from the `done < $FILE` redirection.
- You are not matching the trailing colon: the first word of a memory line is `MemTotal:`, not `MemTotal`.
- You should normally quote variable expansions to avoid unintended word-splitting, globbing, etc. (although that shouldn't happen here).
- You shouldn't use all-caps variable names; by convention they are used for environment variables and the shell's own variables.
- There is no great advantage to defining new variables that only get used once.
- No need to export any variables.
file="./SMALL-SAMPLE.txt"
while read -a line
do
    case "${line[0]}" in
        Fri)
            time=${line[3]}
            ;;
        MemTotal:)
            memtotal=${line[1]}
            ;;
        MemFree:)
            memfree=${line[1]}
            ;;
        MemAvailable:)
            memavail=${line[1]}
            echo "$time,$memtotal,$memfree,$memavail"
            ;;
    esac
done < "$file"
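To see two of the points above in isolation, here is a small sketch. The function name `classify` and the extra names `from` and `src` are made up for the demonstration. It relies on the documented bash behaviour that, when `-a` is given, other name arguments to `read` are ignored.

```bash
#!/bin/bash
# Sketch only: classify is a hypothetical helper name.
classify() {
  local -a words
  # "from" and "src" are extra name arguments; with -a they are ignored.
  # The input still comes from stdin (here, a here-string).
  read -r -a words from src <<< "$1"
  case "${words[0]}" in
    MemTotal)  echo "matched without colon" ;;
    MemTotal:) echo "matched with colon" ;;
    *)         echo "no match: ${words[0]}" ;;
  esac
}

classify 'MemTotal: 65039504 kB'   # prints: matched with colon
```

The first word keeps its colon after word splitting, which is why the patterns in the original script never matched the memory lines.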
Answer 2
Score: 1
One approach would be:
- Remove all empty lines (with `sed '/^$/d' $FILE`)
- Read four lines at once each iteration (repeating `read` for each one)
- Extract the desired field using the `cut` command

$ cat script.sh
#!/bin/bash
FILE="./SMALL-SAMPLE.txt"
while read ltime;
      read lmemt;
      read lmemf;
      read lmema;
do
    TIME=$(echo $ltime | cut -d ' ' -f 4)
    MEMT=$(echo $lmemt | cut -d ' ' -f 2)
    MEMF=$(echo $lmemf | cut -d ' ' -f 2)
    MEMA=$(echo $lmema | cut -d ' ' -f 2)
    echo "$TIME,$MEMT,$MEMF,$MEMA"
done < <(sed '/^$/d' $FILE)
Testing:
$ ./script.sh
15:42:02,65039504,41010436,45100588
16:35:01,65039504,40409508,44902852
16:36:01,65039504,40411232,44905376
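As a variation on the same four-lines-per-iteration idea, `read` can split each line into words itself, which avoids the `echo | cut` subshells. This is a sketch under the assumption that every record is exactly one date line followed by three memory lines; `four_reads` and the throwaway variable names are hypothetical.

```bash
#!/bin/bash
# Sketch only: four_reads is a hypothetical helper name.
four_reads() {
  local dow mon day time rest label total free avail unit
  # Drop blank lines first so that every record is exactly four lines.
  sed '/^$/d' |
  while read -r dow mon day time rest &&
        read -r label total unit &&
        read -r label free unit &&
        read -r label avail unit; do
    printf '%s,%s,%s,%s\n' "$time" "$total" "$free" "$avail"
  done
}

# Usage: four_reads < ./SMALL-SAMPLE.txt
```

Because `read` splits on runs of whitespace, this form also tolerates several spaces between a label and its value.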
Answer 3
Score: 0
perl -ne '
if (/:[^ ]/) { print /\d+:\d+:\d+/g, "," }
elsif (/: /) { print /(\d+)/, $. % 6 == 5 ? "\n" : "," }
' -- file
- `-n` reads the input line by line and runs the code for each line;
- if the line contains a colon followed by a non-space character, print the part that consists of three groups of digits separated by colons, then a comma;
- otherwise, if the line contains a colon followed by a space, print the number it contains, followed by a newline on the last line of each group (every line whose number `$.` leaves remainder 5 when divided by 6) and by a comma otherwise.
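A comparable regex-driven pass can be sketched in bash with `[[ ... =~ ... ]]`. Instead of counting line numbers, this version ends a record when it sees the `MemAvailable:` label, which is a swapped-in design choice and not what the Perl one-liner does. The function name `regex_convert` and the `re_*` variables are hypothetical.

```bash
#!/bin/bash
# Sketch only: regex_convert is a hypothetical helper name.
regex_convert() {
  local line out=''
  local re_time='[0-9]+:[0-9]+:[0-9]+'                 # hh:mm:ss on the date line
  local re_last='^MemAvailable:[[:space:]]+([0-9]+)'   # last line of a record
  local re_value=':[[:space:]]+([0-9]+)'               # any other "Label: number" line
  while IFS= read -r line; do
    if [[ $line =~ $re_time ]]; then
      out=${BASH_REMATCH[0]}                # start a new record with the time
    elif [[ $line =~ $re_last ]]; then
      printf '%s,%s\n' "$out" "${BASH_REMATCH[1]}"
    elif [[ $line =~ $re_value ]]; then
      out+=",${BASH_REMATCH[1]}"            # append the number to the record
    fi
  done
}

# Usage: regex_convert < file
```

Lines that match none of the three patterns, such as empty lines, are simply ignored.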
Answer 4
Score: 0
You forgot the trailing colons in your `case` choices (`"MemTotal:"` instead of `"MemTotal"`). But there are much simpler and much faster solutions (bash loops are slow).
Example with `awk` (tested with GNU `awk` and the `awk` that comes with macOS Ventura):
$ awk -v RS= -v OFS=, '/^Mem/ {print t,$2,$5,$8;next} {t=$4}' file
15:42:02,65039504,41010436,45100588
16:35:01,65039504,40409508,44902852
16:36:01,65039504,40411232,44905376
Explanations:
- `-v RS=` puts awk in paragraph mode: records are separated by one or more empty lines, and newlines also act as field separators.
- `-v OFS=,` sets the output field separator to a comma.
- `/^Mem/ {print t,$2,$5,$8;next}` applies to records starting with `Mem`, prints the value of variable `t` and fields 2, 5 and 8 (the 3 sizes in the record), and goes to the next record.
- `{t=$4}` applies to all other records (the date lines) and stores the fourth field (time) in variable `t`.
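If the input cannot be relied on to have empty lines between the date and the memory block, a line-oriented awk program keyed on the labels is one alternative. This is a sketch and not the answer's method: it assumes the date line is the only line with exactly six fields, and `awk_by_label` is a hypothetical wrapper name.

```bash
#!/bin/bash
# Sketch only: awk_by_label is a hypothetical helper name.
awk_by_label() {
  awk -v OFS=, '
    NF == 6               { t = $4 }       # date line: the 4th field is the time
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemFree:"      { free = $2 }
    $1 == "MemAvailable:" { print t, total, free, $2 }   # last line: emit the record
  '
}

# Usage: awk_by_label < file
```

This reads one line per record in the default mode, so it behaves the same with or without empty lines in the input.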
Answer 5
Score: 0
echo 'Fri Apr 14 15:42:02 UTC 2023
MemTotal: 65039504 kB
MemFree: 41010436 kB
MemAvailable: 45100588 kB
Fri Apr 14 16:35:01 UTC 2023
MemTotal: 65039504 kB
MemFree: 40409508 kB
MemAvailable: 44902852 kB
Fri Apr 14 16:36:01 UTC 2023
MemTotal: 65039504 kB
MemFree: 40411232 kB
MemAvailable: 44905376 kB' |
nawk '(NF = NF)^(ORS = NR % 4 ? "," : "\n")' RS='\n+' \
     OFS= FS='^(([^: ]+ )+|[^ :]+: )| [[:alpha:]]+.+$'

- more succinct ::
  gawk '(ORS=/v/?RS:",")^!(NF+=OFS=_)' FS='^([^:]+|[^ ]+) | [?-|].+$'
- extreme compression ::
  mawk 'NF&&/v/*($__==/: /?","$2:$4)'

15:42:02,65039504,41010436,45100588
16:35:01,65039504,40409508,44902852
16:36:01,65039504,40411232,44905376