How to write NaN to BigQuery in Java?

Question

In a BigQuery table, we can use IS_NAN() to check for NaN values.

But how do we write NaN to a BigQuery table from Java (or another JVM language)?

Do we write the string "NaN"?

Any comments are welcome. Thanks.

UPDATE:

We tried sending Double.NaN, but BigQuery does not accept it and throws an exception. We have no idea why.

Error sending insert request, table: xxxx
com.google.cloud.bigquery.BigQueryException: java.lang.IllegalArgumentException
    at com.google.cloud.bigquery.BigQueryException.translateAndThrow(BigQueryException.java:100)
    at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:979)
    ...
Caused by: java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
    at com.google.api.client.util.Preconditions.checkArgument(Preconditions.java:35)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:134)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:146)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:173)
    at com.google.api.client.json.JsonGenerator.serialize(JsonGenerator.java:105)

Answer 1

Score: 1

Float.NaN or Double.NaN will not work.

Under the hood the Java BigQuery API uses the Google HTTP Client Library for Java.

Neither NaN nor infinity is a valid JSON value. Probably for that reason, JsonGenerator explicitly rejects them when it serializes the provided data:

double doubleValue = ((Number) value).doubleValue();
Preconditions.checkArgument(!Double.isInfinite(doubleValue) && !Double.isNaN(doubleValue));
writeNumber(doubleValue);

As a side note, the library does, in contrast, allow these values to be parsed. Please see the relevant commit.

Beyond the two constants mentioned above, Java gives you no other means of providing a NaN value.

In my opinion, NaN is not meant to be passed as a function argument. Rather, it should be regarded as a computational error or the result of some kind of overflow, such as the one IEEE_DIVIDE can produce, and that is the reason the value exists in BigQuery's data type:

> Function calls and operators return an overflow error if the input is
> finite but the output would be non-finite. If the input contains non-finite
> values, the output can be non-finite. In general functions do not introduce
> NaNs or +/-inf. However, specific functions like IEEE_DIVIDE can
> return non-finite values on finite input. All such cases are noted
> explicitly in Mathematical functions.

Instead of using NaN, please treat such values as null: that is the approach followed in other languages.
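
The null-based approach suggested above could look roughly like the following. This is only a minimal sketch, not the library's own mechanism: the class name NonFiniteSanitizer, the method sanitize, and the column names are all hypothetical, and it assumes the target columns are NULLABLE. It copies a row map and replaces every NaN or infinite Float/Double with null, so the serializer's finite-number check is never triggered.

```java
import java.util.HashMap;
import java.util.Map;

public class NonFiniteSanitizer {

    // Returns a copy of the row in which every NaN or infinite Float/Double
    // value is replaced by null. All other values are copied unchanged.
    public static Map<String, Object> sanitize(Map<String, Object> row) {
        Map<String, Object> clean = new HashMap<>();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Double && !Double.isFinite((Double) value)) {
                value = null;
            } else if (value instanceof Float && !Float.isFinite((Float) value)) {
                value = null;
            }
            clean.put(entry.getKey(), value);
        }
        return clean;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "sensor-1");     // hypothetical column names
        row.put("reading", Double.NaN);  // would be rejected by the serializer
        row.put("baseline", 1.5);
        System.out.println(sanitize(row));
    }
}
```

The sanitized map can then be used wherever the original row map was passed to the insert call. The obvious trade-off is that NaN and a genuinely missing value become indistinguishable in the table.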

Answer 2

Score: 0

The following is for BigQuery Standard SQL and shows a couple of ways to generate NaN. I wrapped each one in IS_NAN to check that the result is actually NaN and not just the string 'NaN'.

SELECT
  test1,
  IS_NAN(test1),
  test2,
  IS_NAN(test2)
FROM (
  SELECT
    IEEE_DIVIDE(0, 0) AS test1,
    CAST('NaN' AS FLOAT64) AS test2
)

Result:

Row  test1  f0_   test2  f1_
1    NaN    true  NaN    true
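
From the Java side, one way to use the CAST('NaN' AS FLOAT64) form is to render doubles as SQL expressions when building a DML statement, rather than sending them through the JSON-based insert path. The sketch below is an assumption-laden illustration: the class Float64Literal, the method toSql, and the table and column names are hypothetical, and the 'inf' / '-inf' casts rely on BigQuery accepting those strings in the same way it accepts 'NaN'. It only builds the statement text; executing it through a query job is left to the reader.

```java
public class Float64Literal {

    // Renders a Java double as a BigQuery Standard SQL FLOAT64 expression.
    // Non-finite values are written as string casts because they have no
    // numeric literal form.
    public static String toSql(double value) {
        if (Double.isNaN(value)) {
            return "CAST('NaN' AS FLOAT64)";
        }
        if (Double.isInfinite(value)) {
            return value > 0 ? "CAST('inf' AS FLOAT64)" : "CAST('-inf' AS FLOAT64)";
        }
        return Double.toString(value);
    }

    public static void main(String[] args) {
        // my_dataset.my_table and the column name "reading" are hypothetical.
        String sql = "INSERT INTO my_dataset.my_table (reading) VALUES ("
                + toSql(Double.NaN) + ")";
        System.out.println(sql);
    }
}
```

Because toSql only ever emits a number or one of three fixed casts, it does not open the door to injection, but any other values placed in the statement would still need proper parameterization.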

huangapple
  • Posted on September 4, 2020, 23:10:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63743781.html