Presto/Trino/Athena - Wrong subtraction of casted varchar to double
Question
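The calculated weight difference shows so many decimal places because the query casts weight from a string (varchar) to double. double is a binary floating-point type, so most decimal fractions (80.3, for example) are stored as the nearest representable value, and subtracting two such values exposes the tiny representation error. This is expected behavior, not a bug in the query. If you want fewer decimal places in the result, you can round it with the ROUND function or cast to a DECIMAL type instead, for example:
SELECT t1.user_id, t1.sample_id, t1.weight,
ROUND(cast(t1.weight as double) - cast(t2.weight as double), 2) AS weight_loss
FROM my_table t1
JOIN my_table t2 ON t1.user_id = t2.user_id AND t1.sample_id - 1 = t2.sample_id
ORDER BY t1.user_id, t1.sample_id
The example above uses ROUND to limit the result to two decimal places. Adjust the number of decimal places as needed.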
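To see the effect in isolation, here is a minimal sketch that needs no table. The literals '0.3' and '0.1' are illustrative values, not data from the question; they stand in for two varchar weights.

```sql
-- Compare the same subtraction done three ways on hypothetical inputs.
SELECT
    -- binary floating point: the result is not exactly 0.2
    CAST('0.3' AS double) - CAST('0.1' AS double) AS double_diff,
    -- same arithmetic, rounded to two decimal places afterwards
    round(CAST('0.3' AS double) - CAST('0.1' AS double), 2) AS rounded_diff,
    -- exact decimal arithmetic: the result is exactly 0.2
    CAST('0.3' AS decimal(10,1)) - CAST('0.1' AS decimal(10,1)) AS decimal_diff;
```

Rounding only tidies the result after the fact, whereas the decimal version avoids the representation error in the first place.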
I am using AWS Athena and trying to calculate the weight loss of each user between two samples.
My weight column is varchar, so I cast it to double before subtracting.
I am using the following query:
SELECT t1.user_id, t1.sample_id, t1.weight,
cast(t1.weight as double) - cast(t2.weight as double) AS weight_loss
FROM my_table t1
JOIN my_table t2 ON t1.user_id = t2.user_id AND t1.sample_id - 1 = t2.sample_id
ORDER BY t1.user_id, t1.sample_id
and I get the following result:
Why does the calculated weight loss look like this, with so many decimal places?
Answer 1
Score: 1
The DECIMAL data type in Presto is the tool that can solve your problem.
See the following code as an example:
SELECT t1.user_id, t1.sample_id, t1.weight,
cast(t1.weight AS DECIMAL(10,1)) - cast(t2.weight AS DECIMAL(10,1)) AS weight_loss
FROM my_table t1
JOIN my_table t2 ON t1.user_id = t2.user_id AND t1.sample_id - 1 = t2.sample_id
ORDER BY t1.user_id, t1.sample_id
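Because the column is varchar, a plain cast fails the whole query if any row holds a non-numeric string. A hedged sketch using TRY_CAST, which returns NULL instead of raising an error, is shown below. The inline rows, including the 'n/a' value, are hypothetical sample data, and the inline my_table stands in for the real table.

```sql
-- Hypothetical sample rows; replace the CTE with the real table.
WITH my_table AS (
    SELECT *
    FROM (VALUES
        (1, 1, '80.3'),
        (1, 2, 'n/a'),   -- a non-numeric value, assumed for illustration
        (1, 3, '78.6')
    ) AS t (user_id, sample_id, weight)
)
SELECT user_id,
       sample_id,
       weight,
       -- NULL where the string cannot be converted, instead of an error
       TRY_CAST(weight AS decimal(10,1)) AS decimal_weight
FROM my_table
ORDER BY user_id, sample_id;
```

Any subtraction involving a NULL weight yields NULL, so unparseable rows show up as missing differences rather than aborting the query.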
Answer 2
Score: 1
First of all, as mentioned previously, you can use a more precise data type, i.e. decimal. I would also recommend looking into window functions, especially lag, because there is no need to actually perform a join (which can be quite costly if there is a lot of data, and I'm not sure that Presto/Trino will be able to optimize it). Something along these lines:
select user_id,
sample_id,
weight,
decimal_weight - lag(decimal_weight) over (partition by user_id order by sample_id) AS weight_loss
from (
SELECT user_id,
sample_id,
weight,
cast(weight as decimal(10,1)) decimal_weight
FROM my_table)
ORDER BY user_id, sample_id;
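For a version that can be run without the real table, here is a minimal sketch of the same lag approach over inline rows. The values are made up for illustration, and the inline my_table stands in for the real table.

```sql
-- Hypothetical sample rows; replace the CTE with the real table.
WITH my_table AS (
    SELECT *
    FROM (VALUES
        (1, 1, '80.3'),
        (1, 2, '79.1'),
        (1, 3, '78.6'),
        (2, 1, '65.0'),
        (2, 2, '64.2')
    ) AS t (user_id, sample_id, weight)
)
SELECT user_id,
       sample_id,
       weight,
       -- current sample minus the previous sample of the same user
       CAST(weight AS decimal(10,1))
           - lag(CAST(weight AS decimal(10,1)))
               OVER (PARTITION BY user_id ORDER BY sample_id) AS weight_loss
FROM my_table
ORDER BY user_id, sample_id;
```

Two differences from the join are worth keeping in mind. The first sample of each user has no previous row, so lag returns NULL and weight_loss is NULL there, whereas the inner join drops that row entirely. Also, lag compares each row with the previous row in sample_id order, so the two queries agree only when sample_id values are consecutive within a user.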