How to write NaN to BigQuery in Java?
Question
In a BigQuery table, we can use IS_NAN() to check for NaN values.
But how do we write NaN to a BigQuery table from Java (or another JVM language)?
Should we write the string "NaN"?
Any comments are welcome. Thanks.
UPDATE:
We tried sending Double.NaN, but BigQuery does not accept it and throws an exception. We have no idea why.
Error sending insert request, table: xxxx
com.google.cloud.bigquery.BigQueryException: java.lang.IllegalArgumentException
at com.google.cloud.bigquery.BigQueryException.translateAndThrow(BigQueryException.java:100)
at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:979)
...
Caused by: java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
at com.google.api.client.util.Preconditions.checkArgument(Preconditions.java:35)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:134)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:146)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:105)
Answer 1
Score: 1
Float.NaN or Double.NaN will not work.
Under the hood, the Java BigQuery API uses the Google HTTP Client Library for Java.
Neither NaN nor infinity is a supported JSON symbol. Probably for that reason, JsonGenerator explicitly forbids them when serializing the provided data:
double doubleValue = ((Number) value).doubleValue();
Preconditions.checkArgument(!Double.isInfinite(doubleValue) && !Double.isNaN(doubleValue));
writeNumber(doubleValue);
As a side note, the library does, in contrast, allow parsing these values. Please see the relevant commit.
Apart from the two cases above (Float.NaN and Double.NaN), there is no other way to provide a NaN value in Java.
In my opinion, NaN is not meant to be passed as a function argument. Rather, it should be considered a computational error or the result of some kind of overflow, like the one IEEE_DIVIDE can produce, and that is the reason the value exists in BigQuery:
> Function calls and operators return an overflow error if the input is
> finite but the output would be non-finite. If the input contains non-finite
> values, the output can be non-finite. In general functions do not introduce
> NaNs or +/-inf. However, specific functions like IEEE_DIVIDE can
> return non-finite values on finite input. All such cases are noted
> explicitly in Mathematical functions.
Instead of using NaN, please treat such values as null: that is the approach followed in other languages.
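As a rough illustration of the "treat it as null" suggestion, here is a minimal sketch that replaces NaN and infinite values with null before a row is handed to the client. The class name NonFiniteToNull and its sanitize method are hypothetical helpers, not part of the BigQuery library, and the sketch assumes the target columns are NULLABLE and that the row is a flat map (nested records are not handled).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the BigQuery client library.
// Assumption: the destination columns are NULLABLE, so null is accepted.
final class NonFiniteToNull {
    private NonFiniteToNull() {}

    // Returns a copy of the row in which every NaN or infinite
    // Double/Float value is replaced by null; other values are kept as-is.
    static Map<String, Object> sanitize(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            Object value = entry.getValue();
            boolean nonFinite =
                (value instanceof Double && !Double.isFinite((Double) value))
                || (value instanceof Float && !Float.isFinite((Float) value));
            out.put(entry.getKey(), nonFinite ? null : value);
        }
        return out;
    }
}
```

The sanitized map could then be passed to whatever row-building call is already in use (for streaming inserts, presumably something like InsertAllRequest.RowToInsert.of(map)), so the JsonGenerator check never sees a non-finite number.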
Answer 2
Score: 0
The query below is for BigQuery Standard SQL and shows a few ways of generating NaN. I wrapped each one in IS_NAN to check that the result is actually NaN and not just the string 'NaN'.
SELECT
  test1,
  IS_NAN(test1),
  test2,
  IS_NAN(test2)
FROM (
  SELECT
    IEEE_DIVIDE(0, 0) test1,
    CAST('NaN' AS FLOAT64) test2
)
with result:
Row test1 f0_ test2 f1_
1 NaN true NaN true
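If the NaN really has to reach the table from Java, one possible route is to go through SQL rather than the JSON insert path, using the CAST shown above. The following is only a sketch: Float64Literal is a hypothetical helper that renders a Java double as a BigQuery FLOAT64 expression, and it assumes that the plain Double.toString output is acceptable as a numeric literal for finite values.

```java
// Hypothetical helper, not part of the BigQuery client library.
// Renders a Java double as a BigQuery Standard SQL FLOAT64 expression.
final class Float64Literal {
    private Float64Literal() {}

    static String of(double value) {
        if (Double.isNaN(value)) {
            return "CAST('NaN' AS FLOAT64)";
        }
        if (value == Double.POSITIVE_INFINITY) {
            return "CAST('inf' AS FLOAT64)";
        }
        if (value == Double.NEGATIVE_INFINITY) {
            return "CAST('-inf' AS FLOAT64)";
        }
        // Assumption: Double.toString output is a valid FLOAT64 literal.
        return Double.toString(value);
    }
}
```

The returned text can be spliced into a DML statement, for example "INSERT INTO my_dataset.my_table (score) VALUES (" + Float64Literal.of(Double.NaN) + ")", where the dataset, table, and column names are placeholders. This is safe here only because the helper emits a fixed set of strings; arbitrary user input should not be concatenated into SQL this way.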