英文:
How to do a MinMax Scale in Snowflake column and still maintain overall sum of column?
问题
I currently have a challenge with Snowflake, where I have a PRICE column like the bellow, the goal is to "scale" this values but keep the original sum intact, like, I do not need to respect the proportions on the scale, but lowest value should continue to be the lowest and so on.
Also I guess this can be a SQL problem.
I tried running a script using WIDTH_BUCKET but there is no option to maintain the total sum.
SELECT
sale_date,
price,
WIDTH_BUCKET(price, 200000, 600000, 5) AS "SALES GROUP"
FROM home_sales
ORDER BY sale_date;
The output(SALES GROUP column) I am really looking for is like this:
By this example the total SUM of column price is 180.
+------------+-----------+-------------+
| SALE_DATE | PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10 | 12 |
| 2014-02-01 | 20 | 24 |
| 2015-04-01 | 30 | 28 |
| 2016-04-01 | 10 | 12 |
| 2017-04-01 | 50 | 47 |
| 2018-04-01 | 60 | 57 |
+------------+-----------+-------------+
If we sum the SALES GROUP column, it still has a total of 180.
*OBS: I know this is not exactly scaling, but I am new to snowflake and did not find the most correct term.
英文:
I currently have a challenge with Snowflake, where I have a PRICE column like the bellow, the goal is to "scale" this values but keep the original sum intact, like, I do not need to respect the proportions on the scale, but lowest value should continue to be the lowest and so on.
Also I guess this can be a SQL problem.
I tried running a script using WIDTH_BUCKET but there is no option to maintain the total sum.
`SELECT
sale_date,
price,
WIDTH_BUCKET(price, 200000, 600000, 5) AS "SALES GROUP"
FROM home_sales
ORDER BY sale_date;`
`+------------+-----------+-------------+
| SALE_DATE | PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10 | 1 |
| 2014-02-01 | 20 | 2 |
| 2015-04-01 | 30 | 3 |
| 2016-04-01 | 10 | 1 |
| 2017-04-01 | 50 | 4 |
| 2018-04-01 | 60 | 5 |
+------------+-----------+-------------+`
The output(SALES GROUP column) I am really looking for is like this:
By this example the total SUM of column price is 180.
`+------------+-----------+-------------+
| SALE_DATE | PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10 | 12 |
| 2014-02-01 | 20 | 24 |
| 2015-04-01 | 30 | 28 |
| 2016-04-01 | 10 | 12 |
| 2017-04-01 | 50 | 47 |
| 2018-04-01 | 60 | 57 |
+------------+-----------+-------------+`
If we sum the SALES GROUP column, it still has a total of 180.
*OBS: I know this is not exactly scaling, but I am new to snowflake and did not find the most correct term.
答案1
得分: 0
如何将自然对数放大以匹配总和?
select *, (ln(price) * sum(price) over() / sum(ln(price)) over())::int as scaled_price
from t;
英文:
How about scaling up the natural log to match the sum?
select *, (ln(price) * sum(price) over() / sum(ln(price)) over())::int as scaled_price
from t;
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论