June 1, 2023 15:01:40

Add entries from a CSV into different arrays on awk and print them

Question

I am making a .awk script and am taking in a .csv file containing product prices. The input file is:

Product,Price
EG412,25
EG411,15
EG516,55
EG517,60
LG210,10
LG180,5
HG915,95

I have already gotten an average through adding the second column and dividing by NR - 1, but now I am supposed to add products into arrays based upon if they are above or below the average price point. The issue I am running into is that my arrays are not printing and are also adding the top column of the csv containing "Product,Price". The code I have is:

BEGIN{
    FS=","
    sum=0
    avg=0
    high=0
    low=0
}
{
    sum+=$2
    total=NR-1
}
{
    avg=sum/total
}
{
    if ($2 > avg && NR > 1) {
        expensive[high] = $1
        high++
    } else if ($2 < avg && NR > 1) {
        cheap[low] = $1
        low++
    }
}
{
    for (i in expensive) {
        print i
        i++
    }
}
END{
    printf "Average Price: %.2f\n", avg
}

I feel like I am doing this in an incredibly convoluted way, but I can't figure out how to get this to work. When I run it this way to test the array for only high value products, the result it returns is:

0
1
0
1
0
1
0
Average Price: 37.86

I would appreciate any help resolving this issue.

Answer 1

Score: 1

Use a 2-pass approach, first to calculate the average and then to determine which values are above/below the average:

$ cat tst.awk
BEGIN { FS="," }
FNR == 1 {
    next
}
NR == FNR {
    tot += $2
    ave = tot / (NR-1)
    next
}
{
    if ( $2 < ave ) {
        cheap[$1]
    }
    else if ( $2 > ave ) {
        expensive[$1]
    }
}
END {
    print "Average:", ave+0
    print "\nCheap:"
    for ( product in cheap ) {
        print product
    }
    print "\nExpensive:"
    for ( product in expensive ) {
        print product
    }
}

$ awk -f tst.awk file file
Average: 37.8571

Cheap:
LG180
LG210
EG411
EG412

Expensive:
EG516
EG517
HG915
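
If reading the file twice is not an option, a single-pass variant is also possible. The following is only a sketch, not part of the answer above: it stores every row while reading and does the comparison in END, once the average is known. The file names prices.csv and classify.awk and the array names price and order are made up for illustration.

```shell
# Hypothetical sample file with the same rows as the question's input.
cat > prices.csv <<'EOF'
Product,Price
EG412,25
EG411,15
EG516,55
EG517,60
LG210,10
LG180,5
HG915,95
EOF

# Hypothetical script name; the quoted 'EOF' keeps $1 and $2 literal.
cat > classify.awk <<'EOF'
BEGIN { FS = "," }
NR > 1 {                    # skip the header row
    order[++n] = $1         # remember products in input order
    price[$1]  = $2 + 0     # "+ 0" forces a numeric value
    sum += $2
}
END {
    if (n == 0) exit        # no data rows, nothing to average
    avg = sum / n
    printf "Average Price: %.2f\n", avg
    print "Expensive:"
    for (i = 1; i <= n; i++)
        if (price[order[i]] > avg) print order[i]
    print "Cheap:"
    for (i = 1; i <= n; i++)
        if (price[order[i]] < avg) print order[i]
}
EOF

awk -f classify.awk prices.csv
```

With the sample rows this prints the average as 37.86, then EG516, EG517 and HG915 under Expensive and EG412, EG411, LG210 and LG180 under Cheap. Two details differ from the script in the question: the loops sit in END, so they run once and not for every input record, and they print the stored product names, whereas `for (i in expensive) print i` prints the array indices.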

Answer 2

Score: 0

You should calculate the average beforehand and then pass it into your script. Something like this should work:

calc.awk

# Index the arrays from 1, the usual awk convention
BEGIN { low = high = 1 }
NR == 1 { next }
$2  < avg { cheap[low++]      = $2 }
$2 >= avg { expensive[high++] = $2 }
END {
  printf "Average: %.2f\n", avg
  printf "Cheap:"
  for (i in cheap)
    printf " %d", cheap[i]
  printf "\nHigh:"
  for (i in expensive)
    printf " %d", expensive[i]
  printf "\n"
}
}

Run it like this:

awk -v avg=37.8571 -F, -f calc.awk infile.csv

Output:

Average: 37.86
Cheap: 25 15 10 5
High: 55 60 95
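
The command above hard-codes the average. One way to compute it first and hand it over with -v is sketched below; this is an assumption about how the "beforehand" step could look, not something the answer spells out. The second awk program is a shortened stand-in for calc.awk that only lists the rows at or above the average, and infile.csv is filled with the question's sample rows.

```shell
# Hypothetical sample file with the same rows as the question's input.
cat > infile.csv <<'EOF'
Product,Price
EG412,25
EG411,15
EG516,55
EG517,60
LG210,10
LG180,5
HG915,95
EOF

# Step 1: average of column 2, skipping the header row.
avg=$(awk -F, 'NR > 1 { sum += $2; n++ } END { if (n) print sum / n }' infile.csv)
echo "avg=$avg"

# Step 2: pass the value in with -v and compare each price against it.
awk -v avg="$avg" -F, 'NR > 1 && $2 >= avg { print $1, $2 }' infile.csv
```

Note that `print sum / n` formats the number with OFMT (`%.6g` by default), so the shell variable holds 37.8571 and not the full-precision value. If a price could land that close to the average, printing with something like `printf "%.10g\n", sum / n` in step 1 keeps more digits.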
