如何处理SQL查询,计算各种布尔标志的组合?

huangapple go评论58阅读模式
英文:

how to approach SQL query that counts various combination of boolean flags?

问题

我有一个表格,结构如下,我想要能够计算我正在搜索的所有预定义情景的数量 -

如何处理SQL查询,计算各种布尔标志的组合?

实际数据显然要复杂得多,但出于简单起见,我在这里使用了清晰的数据。我正在努力找到一种聪明的方法来做到这一点,而不是为每个“stat”创建聚合表格。

以下是生成示例的脚本:

declare @persons table(
person_id INT,
country_id INT,
is_adult BIT,
is_fat BIT,
is_tall BIT,
is_bald BIT)

declare @stats table(
    stat_name varchar(max),
    persons_count INT
)

insert @persons
values
    (1000, 12, 1, 0, 1, 0),
    (1001, 15, 1, 0, 0, 1),
    (1002, 12, 0, 1, 1, 0),
    (1003, 20, 1, 0, 0, 0),
    (1004, 15, 0, 1, 0, 1)

insert @stats
values
    ('adults, fat, tall', 0),
    ('adults, fat, tall, bald', 0),
    ('fat, bald', 0),
    ('tall', 0)

谢谢

英文:

I have this table structured like the below, and I want to be able to count all the predefined scenarios I'm searching for -

如何处理SQL查询,计算各种布尔标志的组合?

The real data is obviously much more complicated but if for simplicity sake I use a clear data here. I'm struggling to find a smart way of doing that instead of having aggregated table per each "stat".

here's the script for generating the example:

declare @persons table(
person_id INT,
country_id INT,
is_adult BIT,
is_fat BIT,
is_tall BIT,
is_bald BIT)

declare @stats table(
    stat_name varchar(max),
    persons_count INT
)

insert @persons
values
    (1000, 12, 1, 0, 1, 0),
    (1001, 15, 1, 0, 0, 1),
    (1002, 12, 0, 1, 1, 0),
    (1003, 20, 1, 0, 0, 0),
    (1004, 15, 0, 1, 0, 1)

insert @stats
values
    ('adults, fat, tall', 0),
    ('adults, fat, tall, bald', 0),
    ('fat, bald', 0),
    ('tall', 0)

Thx

答案1

得分: 1

DECLARE @persons TABLE(
    person_id INT,
    country_id INT,
    is_adult BIT,
    is_fat BIT,
    is_tall BIT,
    is_bald BIT
    , agg AS is_adult + 2 * is_fat + 4 * is_tall + 8 * is_bald
)

DECLARE @stats TABLE(
    stat_name varchar(max),
    persons_count INT, agg int
)

INSERT @persons
VALUES
    (1000, 12, 1, 0, 1, 0),
    (1001, 15, 1, 0, 0, 1),
    (1002, 12, 0, 1, 1, 0),
    (1003, 20, 1, 0, 0, 0),
    (1004, 15, 0, 1, 0, 1)

INSERT @stats
VALUES
    ('adults, fat, tall', 0, 1 + 2 + 4),
    ('adults, fat, tall, bald', 0, 1 + 2 + 4 + 8),
    ('fat, bald', 0, 2 + 8),
    ('tall', 0, 4),
    ('adult,tall', 0, 1 + 4)

SELECT	s.stat_name, count(*)
FROM	@persons p
INNER JOIN @stats s
    ON	s.agg & p.agg = s.agg
GROUP BY s.stat_name
英文:

You can create a bit mask for each value and then match on that mask from your stat table:

DECLARE @persons TABLE(
person_id INT,
country_id INT,
is_adult BIT,
is_fat BIT,
is_tall BIT,
is_bald BIT
, agg AS is_adult + 2 * is_fat + 4 * is_tall + 8 * is_bald
)

DECLARE @stats TABLE(
    stat_name varchar(max),
    persons_count INT, agg int
)

INSERT @persons
VALUES
    (1000, 12, 1, 0, 1, 0),
    (1001, 15, 1, 0, 0, 1),
    (1002, 12, 0, 1, 1, 0),
    (1003, 20, 1, 0, 0, 0),
    (1004, 15, 0, 1, 0, 1)

INSERT @stats
VALUES
    ('adults, fat, tall', 0, 1 + 2 + 4),
    ('adults, fat, tall, bald', 0, 1 + 2 + 4 + 8),
    ('fat, bald', 0, 2 + 8),
    ('tall', 0, 4),
    ('adult,tall', 0, 1 + 4)

SELECT	s.stat_name, count(*)
FROM	@persons p
INNER JOIN @stats s
	ON	s.agg & p.agg = s.agg
GROUP BY s.stat_name

i added agg column which is a 4 bit mask of 1 or 0es depending on your is_trait field.
It follows the 2 ** 0, 2 ** 1, 2 ** 2, ... 2 ** N pattern for each new field.
Binary-wise it looks like: 1101 for adult, slim, tall and bald (mask goes from the right).

For the stats, you create same mask of the fields you wanna match.

Finally for the join, you do a bit & (and) operator and check so your result matched. For example: 0111 & 1001 will return bits where both bits in same position is 1, ie: 0001

EDIT: Great username 如何处理SQL查询,计算各种布尔标志的组合?

huangapple
  • 本文由 发表于 2023年2月23日 19:33:26
  • 转载请务必保留本文链接:https://go.coder-hub.com/75544259.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定