2023-07-17 18:41:23


How to do a MinMax scale on a Snowflake column while still maintaining the overall sum of the column?

Question

I currently have a challenge with Snowflake. I have a PRICE column like the one below, and the goal is to "scale" these values while keeping the original sum intact. I do not need to preserve the proportions between values, but the lowest value should remain the lowest, and so on.
I guess this could also be treated as a general SQL problem.


I tried running a script using WIDTH_BUCKET but there is no option to maintain the total sum.

`SELECT 
    sale_date, 
    price,
    WIDTH_BUCKET(price, 200000, 600000, 5) AS "SALES GROUP"
  FROM home_sales
  ORDER BY sale_date;`

`+------------+-----------+-------------+
| SALE_DATE  |     PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10        |           1 |
| 2014-02-01 | 20        |           2 |
| 2015-04-01 | 30        |           3 |
| 2016-04-01 | 10        |           1 |
| 2017-04-01 | 50        |           4 |
| 2018-04-01 | 60        |           5 |
+------------+-----------+-------------+`
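
To make clear why bucketing cannot preserve the total, here is a minimal Python sketch of equal-width bucketing as WIDTH_BUCKET is documented to behave (0 below the range, `num_buckets + 1` at or above the upper bound). The helper name `width_bucket` and the bounds `0` and `100` are assumptions chosen for illustration; they are not taken from the question.

```python
import math

def width_bucket(value, min_value, max_value, num_buckets):
    """Equal-width bucketing: 0 below the range, num_buckets + 1 at or above it."""
    if value < min_value:
        return 0
    if value >= max_value:
        return num_buckets + 1
    width = (max_value - min_value) / num_buckets
    return math.floor((value - min_value) / width) + 1

prices = [10, 20, 30, 10, 50, 60]

# Hypothetical bounds that cover the sample prices.
buckets = [width_bucket(p, 0, 100, 5) for p in prices]

# The bounds from the query above, applied to the same sample prices.
out_of_range = [width_bucket(p, 200000, 600000, 5) for p in prices]

print(buckets, sum(buckets))   # [1, 2, 2, 1, 3, 4] 13
print(out_of_range)            # [0, 0, 0, 0, 0, 0]
```

The result is a bucket label rather than a rescaled value, so its sum has no relation to the original 180, and distinct prices (20 and 30 here) can collapse into the same bucket.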

The output (the SALES GROUP column) I am actually looking for is shown below.
In this example, the total SUM of the PRICE column is 180.

`+------------+-----------+-------------+
| SALE_DATE  |     PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10        |          12 |
| 2014-02-01 | 20        |          24 |
| 2015-04-01 | 30        |          28 |
| 2016-04-01 | 10        |          12 |
| 2017-04-01 | 50        |          47 |
| 2018-04-01 | 60        |          57 |
+------------+-----------+-------------+`

If we sum the SALES GROUP column, the total is still 180.
Note: I know this is not exactly scaling, but I am new to Snowflake and could not find a more accurate term.
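
The two requirements stated in the question (same total, same ranking) can be written down as checks. The sketch below is only an illustration in Python; the function names `preserves_sum` and `preserves_order` are hypothetical, and the data is the sample from the tables above.

```python
prices = [10, 20, 30, 10, 50, 60]
target = [12, 24, 28, 12, 47, 57]   # desired SALES GROUP values from the question

def preserves_sum(original, scaled):
    """True when both columns add up to the same total."""
    return sum(original) == sum(scaled)

def preserves_order(original, scaled):
    """True when every pair of rows keeps its relative order (ties stay ties)."""
    pairs = list(zip(original, scaled))
    for a_orig, a_new in pairs:
        for b_orig, b_new in pairs:
            if (a_orig < b_orig) != (a_new < b_new):
                return False
    return True

print(preserves_sum(prices, target), preserves_order(prices, target))  # True True
```

Any transformation that passes both checks satisfies the question as stated, which is why many different answers are possible.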

Answer 1

Score: 0


How about scaling up the natural log to match the sum?

select *, (ln(price) * sum(price) over() / sum(ln(price)) over())::int as scaled_price
from t;
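
As a worked check of the arithmetic in this answer, here is a Python sketch applied to the sample prices from the question. It assumes positive integer prices whose logarithms do not sum to zero. The function name `log_scale_keep_sum` is hypothetical, and the largest-remainder rounding step is an addition that is not part of the SQL above: integer rounding on its own does not guarantee that the rounded values add up exactly to the original total.

```python
import math

def log_scale_keep_sum(prices):
    """Scale ln(price) so the integer results add up to sum(prices)."""
    total = sum(prices)
    logs = [math.log(p) for p in prices]   # requires every price > 0
    factor = total / sum(logs)             # same ratio as the SQL window sums
    exact = [v * factor for v in logs]
    result = [math.floor(v) for v in exact]

    # Largest-remainder step (an assumption, not in the SQL answer):
    # give the leftover units to the rows with the biggest fractional parts.
    leftover = total - sum(result)
    by_fraction = sorted(
        range(len(prices)),
        key=lambda i: exact[i] - result[i],
        reverse=True,
    )
    for i in by_fraction[:leftover]:
        result[i] += 1
    return result

scaled = log_scale_keep_sum([10, 20, 30, 10, 50, 60])
print(scaled, sum(scaled))   # [22, 28, 32, 22, 37, 39] 180
```

Because the logarithm is monotonic, the ranking of the prices is kept while the spread between them is compressed. One caveat of the rounding step: when the leftover units run out partway through a group of equal prices, those rows can end up one unit apart.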
