Add entries from a CSV into different arrays in awk and print them
Question
I am writing an awk script that reads a .csv file of product prices. The input file is:
Product,Price
EG412,25
EG411,15
EG516,55
EG517,60
LG210,10
LG180,5
HG915,95
I have already computed the average by summing the second column and dividing by NR - 1, but now I need to add products to arrays depending on whether they are above or below the average price. The problem is that my arrays are not printing, and the CSV header row ("Product,Price") is being added to them as well. The code I have is:
BEGIN{
    FS=","
    sum=0
    avg=0
    high=0
    low=0
}
{
    sum+=$2
    total=NR-1
}
{
    avg=sum/total
}
{
    if ($2 > avg && NR > 1) {
        expensive[high] = $1
        high++
    } else if ($2 < avg && NR > 1) {
        cheap[low] = $1
        low++
    }
}
{
    for (i in expensive) {
        print i
        i++
    }
}
END{
    printf "Average Price: %.2f\n", avg
}
I feel like I am doing this in an incredibly convoluted way, but I can't figure out how to get it to work. When I run it this way to test only the array of high-value products, it returns:
0
1
0
1
0
1
0
Average Price: 37.86
I would appreciate any help resolving this issue.
Answer 1
Score: 1
Use a 2-pass approach: read the file twice (note that the file name is given twice on the command line), first to calculate the average and then to determine which values are above or below it:
$ cat tst.awk
BEGIN { FS="," }
FNR == 1 {
    next
}
NR == FNR {
    tot += $2
    ave = tot / (NR-1)
    next
}
{
    if ( $2 < ave ) {
        cheap[$1]
    }
    else if ( $2 > ave ) {
        expensive[$1]
    }
}
END {
    print "Average:", ave+0
    print "\nCheap:"
    for ( product in cheap ) {
        print product
    }
    print "\nExpensive:"
    for ( product in expensive ) {
        print product
    }
}
$ awk -f tst.awk file file
Average: 37.8571

Cheap:
LG180
LG210
EG411
EG412

Expensive:
EG516
EG517
HG915
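If reading the file twice is not an option (for example, when the data arrives on a pipe), a single-pass variant is also possible: buffer every row in arrays, then classify in the END block once the average is known. The following is only a minimal sketch under that assumption, not part of the script above. The file names `prices.csv` and `split.awk` and the array names `name` and `price` are made up for illustration.

```shell
# Recreate the sample input from the question (file name is arbitrary).
cat > prices.csv <<'EOF'
Product,Price
EG412,25
EG411,15
EG516,55
EG517,60
LG210,10
LG180,5
HG915,95
EOF

# Hypothetical single-pass script: buffer rows, classify in END.
cat > split.awk <<'EOF'
BEGIN { FS = "," }
NR == 1 { next }            # skip the header row
{
    name[++n] = $1          # remember every row in input order
    price[n]  = $2
    sum      += $2
}
END {
    if (n == 0) exit        # nothing to average on an empty file
    avg = sum / n
    printf "Average Price: %.2f\n", avg
    print "Cheap:"
    for (i = 1; i <= n; i++) if (price[i] < avg) print name[i]
    print "Expensive:"
    for (i = 1; i <= n; i++) if (price[i] > avg) print name[i]
}
EOF

awk -f split.awk prices.csv
```

Because the loops count from 1 to n instead of using `for (i in array)`, the products come out in input order. The trade-off is that the whole file is held in memory, which is fine for small inputs like this one.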
Answer 2
Score: 0
You should calculate the average beforehand and then pass it into your script. Something like this should work:
calc.awk
# Start indices at 1, the usual awk convention (awk arrays are associative)
BEGIN { low = high = 1 }
NR == 1 { next }
$2 < avg { cheap[low++] = $2 }
$2 >= avg { expensive[high++] = $2 }
END {
    printf "Average: %.2f\n", avg
    printf "Cheap:"
    # Counting loops keep input order; "for (i in array)" order is unspecified
    for (i = 1; i < low; i++)
        printf " %d", cheap[i]
    printf "\nHigh:"
    for (i = 1; i < high; i++)
        printf " %d", expensive[i]
    printf "\n"
}
Run it like this:
awk -v avg=37.8571 -F, -f calc.awk infile.csv
Output:
Average: 37.86
Cheap: 25 15 10 5
High: 55 60 95
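One way to get the average beforehand, rather than typing it in by hand, is a first awk call whose output is captured in a shell variable. This is only a sketch: the one-line classifier at the end is a hypothetical stand-in for calc.awk, and `infile.csv` is recreated here from the question's sample data.

```shell
# Recreate the sample input from the question.
cat > infile.csv <<'EOF'
Product,Price
EG412,25
EG411,15
EG516,55
EG517,60
LG210,10
LG180,5
HG915,95
EOF

# First pass: average of column 2, skipping the header row.
avg=$(awk -F, 'NR > 1 { sum += $2; n++ } END { if (n) print sum / n }' infile.csv)
echo "avg=$avg"

# Second pass: -v hands the value to awk. The inline program below is a
# stand-in for calc.awk and labels each product instead of collecting arrays.
awk -v avg="$avg" -F, 'NR > 1 { print $1, ($2 < avg ? "cheap" : "expensive") }' infile.csv
```

With a real script file, the last command would simply become `awk -v avg="$avg" -F, -f calc.awk infile.csv`.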