为什么在移除了where子句后,我得到了完全意料之外的结果?

huangapple go评论47阅读模式
英文:

Why do I get completely unexpected results after removing the where clause?

问题

sql-1: 选择 cpu, 计算(*) 在 cpu 分组后的数量;

+-----------+-----------------+
| cpu       | COUNT(UInt8(1)) |
+-----------+-----------------+
| cpu0      | 1               |
| cpu1      | 1               |
| cpu3      | 1               |
| cpu-total | 1               |
| cpu2      | 1               |
+-----------+-----------------+

sql-2: 选择 cpu, 计算(*) 在 cpu 中,其中 usage_user > 0.1,然后按 cpu 分组;

+-----------+-----------------+
| cpu       | COUNT(UInt8(1)) |
+-----------+-----------------+
| cpu0      | 40225           |
| cpu-total | 30712           |
| cpu1      | 64182           |
| cpu3      | 51382           |
| cpu2      | 60444           |
+-----------+-----------------+

sql-3: 选择计算(*) 在 cpu 中的数量;

+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 426880          |
+-----------------+

这是一个 bug 吗?还是有原因?
cpu 表是由 telegraf 抓取 Linux 环境并写入生成的。表结构如下:

+------------------+-----------------------+-------------+-------------------+
| COLUMN_NAME      | DATA_TYPE             | COLUMN_TYPE | COMPRESSION_CODEC |
+------------------+-----------------------+-------------+-------------------+
| time             | TIMESTAMP(NANOSECOND) | TIME        | DEFAULT           |
| cpu              | STRING                | TAG         | DEFAULT           |
| host             | STRING                | TAG         | DEFAULT           |
| usage_guest      | DOUBLE                | FIELD       | DEFAULT           |
| usage_guest_nice | DOUBLE                | FIELD       | DEFAULT           |
| usage_idle       | DOUBLE                | FIELD       | DEFAULT           |
| usage_iowait     | DOUBLE                | FIELD       | DEFAULT           |
| usage_irq        | DOUBLE                | FIELD       | DEFAULT           |
| usage_nice       | DOUBLE                | FIELD       | DEFAULT           |
| usage_softirq    | DOUBLE                | FIELD       | DEFAULT           |
| usage_steal      | DOUBLE                | FIELD       | DEFAULT           |
| usage_system     | DOUBLE                | FIELD       | DEFAULT           |
| usage_user       | DOUBLE                | FIELD       | DEFAULT           |
+------------------+-----------------------+-------------+-------------------+
英文:

sql-1: select cpu, count(*) from cpu group by cpu;

+-----------+-----------------+
| cpu       | COUNT(UInt8(1)) |
+-----------+-----------------+
| cpu0      | 1               |
| cpu1      | 1               |
| cpu3      | 1               |
| cpu-total | 1               |
| cpu2      | 1               |
+-----------+-----------------+

sql-2: select cpu, count(*) from cpu where usage_user > 0.1 group by cpu;

+-----------+-----------------+
| cpu       | COUNT(UInt8(1)) |
+-----------+-----------------+
| cpu0      | 40225           |
| cpu-total | 30712           |
| cpu1      | 64182           |
| cpu3      | 51382           |
| cpu2      | 60444           |
+-----------+-----------------+

sql-3: select count(*) from cpu;

+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 426880          |
+-----------------+

Is this a bug? Or is there a reason?
The cpu table is generated by telegraf grabbing the linux environment and writing. The table structure is as follows:

    +------------------+-----------------------+-------------+-------------------+
| COLUMN_NAME      | DATA_TYPE             | COLUMN_TYPE | COMPRESSION_CODEC |
+------------------+-----------------------+-------------+-------------------+
| time             | TIMESTAMP(NANOSECOND) | TIME        | DEFAULT           |
| cpu              | STRING                | TAG         | DEFAULT           |
| host             | STRING                | TAG         | DEFAULT           |
| usage_guest      | DOUBLE                | FIELD       | DEFAULT           |
| usage_guest_nice | DOUBLE                | FIELD       | DEFAULT           |
| usage_idle       | DOUBLE                | FIELD       | DEFAULT           |
| usage_iowait     | DOUBLE                | FIELD       | DEFAULT           |
| usage_irq        | DOUBLE                | FIELD       | DEFAULT           |
| usage_nice       | DOUBLE                | FIELD       | DEFAULT           |
| usage_softirq    | DOUBLE                | FIELD       | DEFAULT           |
| usage_steal      | DOUBLE                | FIELD       | DEFAULT           |
| usage_system     | DOUBLE                | FIELD       | DEFAULT           |
| usage_user       | DOUBLE                | FIELD       | DEFAULT           |
+------------------+-----------------------+-------------+-------------------+

答案1

得分: 0

语句 SELECT COUNT(*) 可能存在问题。请使用 count(time) 来获取正确的结果:

d ❯ 从 cpu 中选择 count(*);
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 270             |
+-----------------+

d ❯ 从 cpu 中选择 cpu, count(time) 按 cpu 分组;
+-----------+-----------------+
| cpu       | COUNT(cpu.time) |
+-----------+-----------------+
| cpu2      | 45              |
| cpu3      | 45              |
| cpu1      | 45              |
| cpu4      | 45              |
| cpu0      | 45              |
| cpu-total | 45              |
+-----------+-----------------+
英文:

Statement SELECT COUNT(*) may be sick. Use count(time) to get the correct:

d ❯ select count(*) from cpu;
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 270             |
+-----------------+

d ❯ select cpu, count(time) from cpu group by cpu;
+-----------+-----------------+
| cpu       | COUNT(cpu.time) |
+-----------+-----------------+
| cpu2      | 45              |
| cpu3      | 45              |
| cpu1      | 45              |
| cpu4      | 45              |
| cpu0      | 45              |
| cpu-total | 45              |
+-----------+-----------------+

huangapple
  • 本文由 发表于 2023年6月15日 14:05:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/76479548.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定