java.sql.Date from millis off by 1 day for dates before 1870

Question

If I create an SQL Date object from millis, the date is off by 1 day for dates before 1870-01-01.

package org.example;

import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;
import org.joda.time.format.DateTimeFormatter;
import org.joda.time.format.ISODateTimeFormat;

import java.sql.Date;

/**
 * Hello world!
 *
 */
public class App 
{

    static final DateTimeFormatter DATE_FORMATTER = ISODateTimeFormat.date();
    public static void main( String[] args )
    {
        DateTimeZone localTimeZone = DateTimeZone.forID("Asia/Kolkata");
        long millis = DATE_FORMATTER.withZone(localTimeZone).parseMillis(String.valueOf("1854-01-01"));
        Date date = new Date(millis);
        DateTime jodaDate = new DateTime(millis,localTimeZone);
        //Date date = new Date(DATE_FORMATTER.withZone(localTimeZone).parseMillis()
        System.out.println("millis " + millis + ",date " + date + ",jodadate " + jodaDate);
    }
}

The above code gives this output:

millis -3660616408000,date 1853-12-31,jodadate 1854-01-01T00:00:00.000+05:53:28

In this code I'm getting the milliseconds since the epoch for a date string using the Joda libraries, then creating an SQL Date object from them. But this SQL date is one day off from the string I initially parsed.

The strange thing is that the SQL date is not off if I start with any date on or after 1870-01-01. Is there something I'm missing here?

---EDIT---

Below is an example where I don't see the date being one day off.

package org.example;

import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;
import org.joda.time.format.DateTimeFormatter;
import org.joda.time.format.ISODateTimeFormat;

import java.sql.Date;

/**
 * Hello world!
 *
 */
public class App 
{

    static final DateTimeFormatter DATE_FORMATTER = ISODateTimeFormat.date();
    public static void main( String[] args )
    {
        DateTimeZone localTimeZone = DateTimeZone.forID("Asia/Kolkata");
        long millis = DATE_FORMATTER.withZone(localTimeZone).parseMillis(String.valueOf("1870-01-01"));
        Date date = new Date(millis);
        DateTime jodaDate = new DateTime(millis,localTimeZone);
        //Date date = new Date(DATE_FORMATTER.withZone(localTimeZone).parseMillis()
        System.out.println("millis " + millis + ",date " + date + ",jodadate " + jodaDate);
    }
}

The above code gives this output:

millis -3155692870000,date 1870-01-01,jodadate 1870-01-01T00:00:00.000+05:21:10

Answer 1

Score: 4

Preamble

Your code as written would actually be off by one day for every date if you ran it on a JVM whose default timezone is 'west' of Asia/Kolkata, because your system's local timezone is an intrinsic part of it: new Date(millis).toString() prints the date that those millis fall on as per your local timezone. Hence, a previous answerer (who has since deleted their answer) put in the effort to fully reproduce your issue but misidentified the problem: part of the 'input' is also that the platform default tz is set to Asia/Kolkata, which, evidently, is what your computer is already set to.
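To see that dependence on the default zone in isolation, here is a minimal sketch that uses only the JDK. The class name DefaultZoneDemo and the helper render are illustrative, not from the question, and the date is a modern one (2020) chosen so that historical tzdata differences play no role.

```java
import java.sql.Date;
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.TimeZone;

class DefaultZoneDemo {
    /** Renders epoch millis via java.sql.Date#toString under the given default zone. */
    static String render(long millis, String defaultZoneId) {
        TimeZone previous = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(defaultZoneId));
            return new Date(millis).toString();
        } finally {
            // Restore the original default so the demo has no lasting side effect.
            TimeZone.setDefault(previous);
        }
    }

    public static void main(String[] args) {
        // Midnight on 2020-01-01 in Asia/Kolkata, i.e. 2019-12-31T18:30:00Z.
        long millis = LocalDate.of(2020, 1, 1)
                .atStartOfDay(ZoneId.of("Asia/Kolkata"))
                .toInstant()
                .toEpochMilli();
        System.out.println(render(millis, "Asia/Kolkata")); // 2020-01-01
        System.out.println(render(millis, "UTC"));          // 2019-12-31
    }
}
```

The same millis value prints as two different dates, purely because the default zone changed.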

The explanation of this problem is very complicated and requires a bunch of background.

A deep dive on tzdata

Here is the current tzdata section for Asia/Kolkata:

# Zone	NAME		STDOFF	RULES	FORMAT	[UNTIL]
Zone	Asia/Kolkata	5:53:28 -	LMT	1854 Jun 28 # Kolkata
			            5:53:20	-	HMT	1870	    # Howrah Mean Time?
			            5:21:10	-	MMT	1906 Jan  1 # Madras local time
			            5:30	-	IST	1941 Oct
			            5:30	1:00	+0630	1942 May 15
			            5:30	-	IST	1942 Sep
			            5:30	1:00	+0630	1945 Oct 15
			            5:30	-	IST

As you can see, 1870 is an inflection point for it: a change was introduced that adjusted the offset by 32 minutes and 10 seconds (from 5:53:20 to 5:21:10).

That is just a small snippet of a huge boatload of text that lists every timezone and every inflection point for all of them. This is called the tzdata and is a separate project. The thing is, jodatime as a library ships with its own copy of this tzdata and uses it for all its calculations, whereas new Date().toString() also needs to convert its millis in order to determine what date to show, and it uses your OpenJDK's definitions. (Remember, java.util.Date is a lie: it does not represent dates at all, it represents epoch millis, and its toString is merely confused for backwards compatibility reasons. There's a reason almost every method in that class is deprecated; you should not be using this class for anything.)
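If you want to look at what your own JDK ships for this zone, a sketch along these lines may help. It reads the java.time view of the JDK's bundled tzdata; the class name JdkTzdataDump and its helpers are illustrative. What it prints depends on the tzdata version inside your JDK, so no expected output is shown.

```java
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRulesProvider;
import java.util.List;
import java.util.Set;

class JdkTzdataDump {
    /** Returns the tzdata version labels the JDK's rules provider reports for a zone. */
    static Set<String> versions(String zoneId) {
        return ZoneRulesProvider.getVersions(zoneId).keySet();
    }

    /** Returns the zone's explicit offset transitions as java.time sees them. */
    static List<ZoneOffsetTransition> transitions(String zoneId) {
        return ZoneId.of(zoneId).getRules().getTransitions();
    }

    public static void main(String[] args) {
        System.out.println("tzdata version(s): " + versions("Asia/Kolkata"));
        // One line per inflection point: local time before the change, old offset, new offset.
        for (ZoneOffsetTransition t : transitions("Asia/Kolkata")) {
            System.out.println(t.getDateTimeBefore() + "  "
                    + t.getOffsetBefore() + " -> " + t.getOffsetAfter());
        }
    }
}
```

Comparing this listing with the Zone lines above shows whether your JDK agrees with the current tzdata for the years you care about.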

The explanation (in overview)

If the tzdata that is baked into your joda distro is different from the tzdata baked into your OpenJDK, specifically in regard to Asia/Kolkata for the relevant date, we have our explanation. Jodatime needs to determine the correct millis for 1854-01-01 00:00:00 in the Asia/Kolkata zone, which it does using its own tzdata. These millis are then sent to j.u.Date, and then you ask j.u.Date: convert that back to human reckoning, in my system's default zone (presumably also Asia/Kolkata). That SHOULD be a lossless conversion - it should always get you right back to 1854-01-01. Unless, that is, the tzdata definitions for Asia/Kolkata in your jodatime and your OpenJDK disagree.
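One way to check for such a disagreement on your own machine is to ask the JDK which offset it applies at the exact millis value from the question, and compare it with the +05:53:28 that Joda-Time printed. This is a diagnostic sketch only. The class and method names are illustrative, and the result depends on your JDK, so no expected output is given.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.TimeZone;

class OffsetComparison {
    /** Offset in seconds applied by the legacy java.util.TimeZone API, which java.util.Date relies on. */
    static int legacyOffsetSeconds(String zoneId, long epochMillis) {
        return TimeZone.getTimeZone(zoneId).getOffset(epochMillis) / 1000;
    }

    /** Offset in seconds applied by java.time for the same instant. */
    static int javaTimeOffsetSeconds(String zoneId, long epochMillis) {
        return ZoneId.of(zoneId).getRules()
                .getOffset(Instant.ofEpochMilli(epochMillis))
                .getTotalSeconds();
    }

    public static void main(String[] args) {
        long millis = -3660616408000L; // the value Joda-Time produced in the question
        System.out.println("legacy TimeZone: "
                + ZoneOffset.ofTotalSeconds(legacyOffsetSeconds("Asia/Kolkata", millis)));
        System.out.println("java.time:       "
                + ZoneOffset.ofTotalSeconds(javaTimeOffsetSeconds("Asia/Kolkata", millis)));
        System.out.println("Joda-Time (from the question's output): +05:53:28");
    }
}
```

If the legacy offset printed here is smaller than Joda-Time's, the millis land just before local midnight and Date.toString() shows the previous day.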

Looking back in history, the last version of jodatime whose tzdata disagrees with the current definition dates all the way back to 2018:

Zone	Asia/Kolkata	5:53:28 -	LMT	1880        # Kolkata
			            5:53:20	-	HMT	1941 Oct    # Howrah Mean Time?
			            6:30	-	+0630	1942 May 15
			            5:30	-	IST	1942 Sep
			            5:30	1:00	+0630	1945 Oct 15
			            5:30	-	IST

It's subtle. Reading the two definitions side by side, they differ in the following date ranges:

Until 1854-06-28 they are in agreement: offset of 5:53:28.

From 1854-06-28 to 1870, the current def says the offset is 5:53:20. The 2018 version, however, says the offset is 5:53:28.

From 1870 to 1880, the current def says the offset is 5:21:10, whereas the 2018 version still says 5:53:28.

From 1880 to 1906, the current def says 5:21:10, whereas the 2018 version says 5:53:20.

From 1906 to October 1941, the current version says the offset is 5:30:00, whereas the 2018 version says the offset is 5:53:20.

So the date of the switchover to the 5:30:00 offset is also not the same; from October 1941 they are in agreement again until the present day.

Depending on the direction of the error, you won't notice it. In one 'direction' (where old thinks the offset is less than new), the error is added to the time: 1910-01-01 00:00:00 is translated to something like 1910-01-01 01:00:00 (not exactly that, but midnight plus the error). toString just prints the date, so you can't tell.

It's when old thinks the offset is more than new that you end up translating 1870-01-01 00:00:00 to 1869-12-31 23:... - the error is subtracted, resulting in a date that is one day too early.
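The arithmetic can be sketched with fixed offsets, leaving tzdata out entirely. The two offsets below are the 5:53:28 and 5:53:20 values from the Zone lines above, and the class name OffsetMismatchDemo is illustrative. Encoding midnight with the larger offset and decoding with the smaller one lands 8 seconds before midnight, on the previous day. The opposite mismatch lands 8 seconds after midnight, and the date looks fine.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

class OffsetMismatchDemo {
    /** Encodes local midnight with one offset, then reads the calendar date back with another. */
    static LocalDate roundTrip(LocalDate date, ZoneOffset encodeOffset, ZoneOffset decodeOffset) {
        Instant instant = date.atStartOfDay().toInstant(encodeOffset);
        return instant.atOffset(decodeOffset).toLocalDate();
    }

    public static void main(String[] args) {
        ZoneOffset larger = ZoneOffset.ofHoursMinutesSeconds(5, 53, 28);
        ZoneOffset smaller = ZoneOffset.ofHoursMinutesSeconds(5, 53, 20);
        LocalDate date = LocalDate.of(1854, 1, 1);

        // Encoder assumes the larger offset, decoder the smaller: the error is subtracted.
        System.out.println(roundTrip(date, larger, smaller)); // 1853-12-31
        // Encoder assumes the smaller offset, decoder the larger: the error is added.
        System.out.println(roundTrip(date, smaller, larger)); // 1854-01-01
        // Midnight at +05:53:28 is exactly the millis value from the question.
        System.out.println(date.atStartOfDay().toInstant(larger).toEpochMilli()); // -3660616408000
    }
}
```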

I'm 99.9% certain your particular version of jodatime is quite old (2018 or earlier). If you update it to the latest, this problem goes away, but your code will fail again if these definitions ever change again. I suggest you read on so that you understand how this works and what you can do to ensure this problem won't show up in the future.

A quick note on tzdata principles

More generally, THE JVM IS BROKEN - anytime you have ALL these things happening:

  • You translate any date from human reckoning to millis and back again, one step being done on JDK version X, and another on JDK version Y, where X and Y are on different sides of 'the time when OpenJDK messed up and picked the simplified version of the post-tzdata refactor' line (which is, more or less, a point update of OpenJDK17).
  • That time is before 1970.

Then you should assume that an off-by-one error is going to occur in that date. This hits a few time zones particularly hard (for example, Europe/Amsterdam). For example, if you store dates in the H2 database system using JDBC's .setDate and .getDate, or even .setObject(x, LocalDate), and that date is prior to 1945, it will be off by one when you upgrade your JDK.

This is a bug. I think OpenJDK is the primary actor that messed this up; I reported it, but they did not understand. I rate the odds that they ever fix it as low.

NB: The reason 1970 is magic is that tzdata was refactored. The maintainers realized that exact knowledge about timezone status prior to 1970 is incredibly hard to verify, so they recently changed policy: any defs that affect 1970 or later are aggressively verified, fact-checked, and distributed, and an extremely high bar must be met if you want to tell the tzdata maintainer that there is an error affecting 1970 or later. In contrast, a few newspaper articles are enough for pre-1970 updates, and tzdata makes far fewer guarantees about accuracy there, because being accurate is difficult as a general principle when lots of details have been lost to the mists of time. As part of this policy update, there are now 'simple' and 'complex' versions of tzdata: the simple version accepts more errors for pre-1970 dates. For example, it says that The Netherlands was like Belgium pre-1970, which is known to be wrong for the Second World War years. Nevertheless, tzdata will not fix it because of policy ('we do not guarantee correctness for pre-1970 if you take the simple set'), and the maintainers don't want to hear that this was the wrong choice (NL is larger than Belgium, so NL should be 'leading'; or just include both: you don't guarantee correctness, but you don't guarantee wrongness either). However, the worst offender here is team OpenJDK, which messed up by picking the simple set.

How to avoid this problem from hurting you

joda-time is obsolete and has been for a decade. Do not use it - java itself has a variant of joda-time now (it was added to the JDK by the author of JodaTime!), in the java.time package. In your scenario the root cause is that you are operating with different tzdata defs and all sorts of heck is going to break loose when you do that. At least by using java.time, all is in agreement.

But that's not enough - I do that and the Europe/Amsterdam thing caused a lot of pain.

You also want to avoid millis-to-human-to-millis conversion as much as you can. To do that properly, do not ever use java.util.Date and, as a corollary, do not use java.sql.Date or java.sql.Timestamp either. These things are just broken. Use .getObject(colIdx, OffsetDateTime.class), for example. It's wordy, but the JDBC 4.2 spec requires that databases support it.
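As a rough sketch of what that looks like in code: the date from the question can be parsed and kept as a java.time.LocalDate without ever passing through epoch millis, and JDBC columns can be read straight into java.time types. The class name NoMillisDemo and its helper names are illustrative. The two JDBC helpers assume a driver that supports java.time through getObject, and whether a given database handles that correctly is a separate matter.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.LocalDate;
import java.time.OffsetDateTime;

class NoMillisDemo {
    /** Parses an ISO date without converting it to epoch millis at any point. */
    static LocalDate parseDate(String iso) {
        return LocalDate.parse(iso);
    }

    /** Reads a DATE column as a LocalDate (assumes the driver supports this mapping). */
    static LocalDate readDate(ResultSet rs, int colIdx) throws SQLException {
        return rs.getObject(colIdx, LocalDate.class);
    }

    /** Reads a TIMESTAMP WITH TIME ZONE column as an OffsetDateTime (same assumption). */
    static OffsetDateTime readTimestamp(ResultSet rs, int colIdx) throws SQLException {
        return rs.getObject(colIdx, OffsetDateTime.class);
    }

    public static void main(String[] args) {
        // No zone and no millis are involved, so the default timezone cannot shift the result.
        System.out.println(parseDate("1854-01-01")); // 1854-01-01
    }
}
```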

Then, use a database that doesn't implement it badly. H2, unfortunately, messed this up. Understandable, perhaps - this stuff is hard to find.

OMG I didn't do that and now all my data is corrupt.

You can load the proper version of a tzdata def straight into the JDK, using the tzupdater.jar tool. If this has caused heck on your systems, find out which tzdata version was present when you did step 1 of all your conversions, load that into your JDK using tzupdater, and now the reverse of step 1 no longer causes off-by-1 errors.
