Why does a new SimpleDateFormat object contain a calendar with the wrong year?

2022-01-11 00:00:00 date calendar simpledateformat java

I came upon a strange behavior that has left me curious and without a satisfactory explanation as yet.

For simplicity, I've reduced the symptoms I've noticed to the following code:

import java.text.SimpleDateFormat;
import java.util.GregorianCalendar;

public class CalendarTest {
    public static void main(String[] args) {
        System.out.println(new SimpleDateFormat().getCalendar());
        System.out.println(new GregorianCalendar());
    }
}

When I run this code, I get something very similar to the following output:

java.util.GregorianCalendar[time=-1274641455755,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=1929,MONTH=7,WEEK_OF_YEAR=32,WEEK_OF_MONTH=2,DAY_OF_MONTH=10,DAY_OF_YEAR=222,DAY_OF_WEEK=7,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=55,SECOND=44,MILLISECOND=245,ZONE_OFFSET=-28800000,DST_OFFSET=0]
java.util.GregorianCalendar[time=1249962944248,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2009,MONTH=7,WEEK_OF_YEAR=33,WEEK_OF_MONTH=3,DAY_OF_MONTH=10,DAY_OF_YEAR=222,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=55,SECOND=44,MILLISECOND=248,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]

(The same thing happens if I provide a valid format string like "yyyy-MM-dd" to SimpleDateFormat.)

Forgive the horrendous non-wrapping lines, but it's the easiest way to compare the two. If you scroll to about 2/3rds of the way over, you'll see that the calendars have YEAR values of 1929 and 2009, respectively. (There are a few other differences, such as week of year, day of week, and DST offset.) Both are obviously instances of GregorianCalendar, but the reason why they differ is puzzling.

From what I can tell, the formatter produces accurate results when formatting Date objects passed to it. Obviously, correct functionality is more important than the correct reference year, but the discrepancy is disconcerting nonetheless. I wouldn't think that I'd have to set the calendar on a brand-new date formatter just to get the current year...

I've tested this on Macs with Java 5 (OS X 10.4, PowerPC) and Java 6 (OS X 10.6, Intel) with the same results. Since this is a Java library API, I assume it behaves the same on all platforms. Any insight on what's afoot here?

(Note: This SO question is somewhat related, but not the same.)


Edit:

The answers below all helped explain this behavior. It turns out that the Javadocs for SimpleDateFormat actually document this to some degree:

"For parsing with the abbreviated year pattern ("y" or "yy"), SimpleDateFormat must interpret the abbreviated year relative to some century. It does this by adjusting dates to be within 80 years before and 20 years after the time the SimpleDateFormat instance is created."

So, instead of getting fancy with the year of the date being parsed, they just set the internal calendar back 80 years by default. That part isn't documented per se, but when you know about it, the pieces all fit together.
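
For anyone who wants to see the documented window in action, here is a minimal sketch. The class name DefaultWindowDemo and the helper resolveYear() are made up for illustration; the only library behavior it leans on is the Javadoc passage quoted above. It parses a two-digit year with a fresh formatter and reports which four-digit year comes back.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

public class DefaultWindowDemo {
    // Hypothetical helper: parses "07/01/yy" with a brand-new formatter and
    // returns the four-digit year the two-digit year was resolved to.
    static int resolveYear(int twoDigitYear) {
        SimpleDateFormat fmt = new SimpleDateFormat("MM/dd/yy", Locale.US);
        try {
            Calendar cal = Calendar.getInstance(Locale.US);
            cal.setTime(fmt.parse(String.format(Locale.US, "07/01/%02d", twoDigitYear)));
            return cal.get(Calendar.YEAR);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        int thisYear = Calendar.getInstance(Locale.US).get(Calendar.YEAR);
        // The current year lies inside the window, so it resolves to itself.
        System.out.println(resolveYear(thisYear % 100));
        // Thirty years ahead is outside the "20 years after" limit,
        // so it is pushed back a century (to 70 years ago).
        System.out.println(resolveYear((thisYear + 30) % 100));
    }
}
```

The output depends on the date you run it, which is exactly the point: the window moves with the creation time of the formatter.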

Solution

I'm not sure why Tom says "it's something to do with serialization", but he has the right line:

private void initializeDefaultCentury() {
    calendar.setTime( new Date() );
    calendar.add( Calendar.YEAR, -80 );
    parseAmbiguousDatesAsAfter(calendar.getTime());
}

It's line 813 in SimpleDateFormat.java, which is very late in the process. Up to that point, the year is correct (as is the rest of the date part), then it's decremented by 80.
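
If you'd rather confirm this without stepping through the JDK source, a rough check like the following should do. CenturyStartDemo and its two helpers are names I made up. The check relies on the implementation detail in the method quoted above (the internal calendar is left at the start of the default century), so treat it as a sketch that may not hold on a JDK with a different implementation.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

public class CenturyStartDemo {
    // True if a fresh formatter's internal calendar holds the same instant
    // that get2DigitYearStart() reports.
    static boolean calendarHoldsCenturyStart() {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        return fmt.getCalendar().getTime().equals(fmt.get2DigitYearStart());
    }

    // Current year minus the year stored in a fresh formatter's calendar.
    static int yearGap() {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        int now = Calendar.getInstance(Locale.US).get(Calendar.YEAR);
        return now - fmt.getCalendar().get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        System.out.println(calendarHoldsCenturyStart());
        System.out.println(yearGap()); // 80 if the calendar was set back as in the quoted source
    }
}
```

Note that a four-digit pattern is used on purpose: the calendar is set back regardless of whether the pattern contains a two-digit year.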

Aha!

parseAmbiguousDatesAsAfter() is the same private function that set2DigitYearStart() calls:

/* Define one-century window into which to disambiguate dates using
 * two-digit years.
 */
private void parseAmbiguousDatesAsAfter(Date startDate) {
    defaultCenturyStart = startDate;
    calendar.setTime(startDate);
    defaultCenturyStartYear = calendar.get(Calendar.YEAR);
}

/**
 * Sets the 100-year period 2-digit years will be interpreted as being in
 * to begin on the date the user specifies.
 *
 * @param startDate During parsing, two digit years will be placed in the range
 * <code>startDate</code> to <code>startDate + 100 years</code>.
 * @see #get2DigitYearStart
 * @since 1.2
 */
public void set2DigitYearStart(Date startDate) {
    parseAmbiguousDatesAsAfter(startDate);
}

Now I see what's going on. Peter, in his comment about "apples and oranges", was right! The year in SimpleDateFormat is the first year of the "default century", the range into which a two-digit year string (e.g., "1/12/14") is interpreted to fall. See the Javadoc for get2DigitYearStart(): http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html#get2DigitYearStart%28%29

So in a triumph of "efficiency" over clarity, the year in the SimpleDateFormat is used to store "the start of the 100-year period into which two digit years are parsed", not the current year!
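
To make the "100-year period" concrete, here is a small sketch that pins the window with set2DigitYearStart() instead of relying on the creation time. TwoDigitYearStartDemo and resolveYear() are illustrative names of my own; the window start of January 1 is an arbitrary choice.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

public class TwoDigitYearStartDemo {
    // Hypothetical helper: parses text as M/d/yy with the 100-year window
    // starting on January 1 of windowStartYear, and returns the resolved year.
    static int resolveYear(String text, int windowStartYear) {
        SimpleDateFormat fmt = new SimpleDateFormat("M/d/yy", Locale.US);
        fmt.set2DigitYearStart(
                new GregorianCalendar(windowStartYear, Calendar.JANUARY, 1).getTime());
        try {
            Calendar cal = Calendar.getInstance(Locale.US);
            cal.setTime(fmt.parse(text));
            return cal.get(Calendar.YEAR);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveYear("1/12/14", 1950)); // 2014: window is 1950 to 2050
        System.out.println(resolveYear("1/12/14", 1900)); // 1914: window is 1900 to 2000
        System.out.println(resolveYear("1/12/64", 1950)); // 1964
    }
}
```

The same string lands in a different century depending only on the window start, which is the value the formatter's calendar is holding right after construction.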

Thanks, this was fun -- and finally got me to install the JDK source (I only have 4GB total space on my / partition).
