- Why can't I do day arithmetic on a year_month_day?
- Why can't I do day arithmetic on a year_month_day, part 2?
- Why is %A failing?
- Why is local_t not a proper clock?
- Why can't I compare instances of zoned_time?
- On the encoding of weekday
Why can't I do day arithmetic on a year_month_day?
This library is meant to be a foundational library upon which you can efficiently build higher-level date libraries (like tz.h). A core component of this library is that it makes expensive computations explicit, so that you can see where they are in your code. Higher-level code can hide these expensive/explicit operations as desired.
A good way to estimate the cost of any given date computation is to count the number of conversions from a field type (e.g. year_month_day or year_month_weekday) to a serial type (e.g. sys_days), and vice-versa. Here is a real-world example (found in the issues list):
We need to compute the day after the 3rd Tuesday of the month. If day-oriented arithmetic were allowed on year_month_weekday, that would be in the form of a function like this:
constexpr
year_month_weekday
operator+(const year_month_weekday& ymwd, const days& dd) noexcept
{
    return year_month_weekday{sys_days{ymwd} + dd};
}
The programmer would probably use it like this:
year_month_day
get_meeting_date(year y, month m)
{
    return year_month_day{Tuesday[3]/m/y + days{1}};
}
That is super-compact syntax! Here is what it costs:
1. Convert Tuesday[3]/m/y (year_month_weekday) to sys_days in order to add days.
2. Convert the sys_days computed back to year_month_weekday.
3. Convert the temporary year_month_weekday computed in 2 back to sys_days.
4. Convert the sys_days to a year_month_day.
4 conversions.
Here is the way you have to write this function today (because Tuesday[3]/m/y + days{1} is a compile-time error):
year_month_day
get_meeting_date(year y, month m)
{
    return year_month_day{sys_days{Tuesday[3]/m/y} + days{1}};
}
The syntax is slightly more verbose in that you have to explicitly convert the year_month_weekday into a sys_days in order to perform the day-oriented arithmetic. Here is what it costs:
1. Convert Tuesday[3]/m/y (year_month_weekday) to sys_days in order to add days.
2. Convert the sys_days to a year_month_day.
2 conversions. Roughly twice as fast! And the code generation (using clang++ -O3) is roughly half the size: 152 assembly statements vs 335 assembly statements.
Finally, one can take advantage of the fact that the conversion from sys_days to year_month_day can be made implicitly, and so one can further simplify the syntax (this change does not impact code generation):
year_month_day
get_meeting_date(year y, month m)
{
    return sys_days{Tuesday[3]/m/y} + days{1};
}
This philosophy is similar to that which we have for containers: It would be super easy to create vector<T>::push_front(const T&). But that would make it too easy for programmers to write inefficient code. The compiler helps remind the programmer that perhaps deque<T> or list<T> would be a better choice when they attempt to code with vector<T>::push_front(const T&).
It would be very easy to add T& list<T>::operator[](size_t index). But that would encourage the programmer to use list<T> when a random-access container would probably be more appropriate for the task.
This library continues in that tradition: The expensive operations are not hidden.
Why can't I do day arithmetic on a year_month_day, part 2?
Lately there has been a bit of mass hysteria over the impression that this library makes it somewhere between difficult and impossible to do day-oriented arithmetic on dates. Please let me assure you, it is very easy to add/subtract any number of days from a date. For example:
#include "date/date.h"
#include <iostream>

int
main()
{
    using namespace date;
    sys_days date = 2019_y/December/16;
    date += days{100};
    std::cout << date << '\n'; // Prints out: 2020-03-25
    date = date - days{100};
    std::cout << date << '\n'; // Prints out: 2019-12-16
}
So what's behind all of the excitement?
Bottom line: There's more than one calendrical ("date") class in this library and they are good at different things. And none of them can do everything. But the one thing that they can all do is easily convert to one another.
Here is an overview of 3 types in this library that each represent a date. All of these types are a calendar.
sys_days
sys_days is the canonical calendrical type in this library. Every calendar can implicitly convert to and from it without loss of information. It is the "hub calendar". Under the hood it is nothing but a count of days since the 1970 epoch. And this data structure is what is used by most other date libraries, because it is very efficient at some important things.
sys_days is very good at adding and subtracting days. It can do this in one assembly instruction and at sub-nanosecond speeds, because all that happens is an integral addition/subtraction under the hood. It is also very good at interfacing with the existing chrono family of system_clock time_points because sys_days is a system_clock time_point (of day-precision).
auto tp = date + 7h + 45min + 15s + 321ms; // tp is a system_clock time_point with millisecond precision
sys_days is not good at extracting the year, month and day fields of a date. Most other libraries deal with this by ignoring the issue. They perform a computation that extracts all three fields (or nearly so) when you ask for the year. Then they perform the same computation when you ask for the month. And then again when you ask for the day. This library says: you can't ask sys_days for just one of these fields.
year y = date.year(); // compile-time error, date has type sys_days
Instead you can ask for all three fields at once:
year_month_day ymd = date; // ok, date has type sys_days
year_month_day
year_month_day is a simple {year, month, day} data structure. It is very good at returning the year, month and day fields. It is not good at day-oriented arithmetic. To do that, the best thing to do is convert to sys_days, perform the arithmetic, and convert back. Or better yet, never convert to year_month_day in the first place if that makes sense for your application.
year_month_weekday
There's also a 3rd calendar that is a simple {year, month, weekday, N}, which represents the Nth weekday of a month and year pair (e.g. 1st Sunday of May 2020). It also does not do day-oriented arithmetic. But it is very good at getting the year, month, weekday and N fields. And given a sys_days date, it is very easy to obtain a year_month_weekday:
year_month_weekday ymw = date; // ok, date has type sys_days
Or go the other way:
date = ymw;
One can even explicitly convert directly between year_month_day and year_month_weekday:
year_month_day ymd{ymw}; // ok
The compiler simply implicitly bounces off of sys_days under the hood.
Aside: You can even write your own calendar as long as it implicitly converts to and from sys_days. Then it interoperates with year_month_day and year_month_weekday exactly as above.
So does this mean I can't add to the day field even if I know it won't overflow into the next month?
Nope. You can easily add to the "day field" of a year_month_day or year_month_weekday. But in this case, it is up to you to do any necessary error checking if applicable.
year_month_day ymd = 2019_y/December/16; // 2019-12-16
ymd = ymd.year()/ymd.month()/(ymd.day() + days{1}); // 2019-12-17
Or:
year_month_weekday ymw = 2019_y/December/Monday[3]; // 3rd Monday of December 2019
ymw = ymw.year()/ymw.month()/ymw.weekday()[ymw.index()+1]; // 4th Monday of December 2019
Virtually no computation and little code generation takes place for either of the examples above. They just stuff new values into one of the fields.
So choose whichever calendar is most useful for you. And when you need to, convert among them to get done what you need to get done. Your code will be efficient, readable, and type-safe.
Why is %A failing?
This program is supposed to work:
#include "date/date.h"
#include <iostream>
#include <sstream>

int
main()
{
    using namespace date;
    std::istringstream in{"Sun 2016-12-11"};
    sys_days tp;
    in >> parse("%A %F", tp);
    std::cout << tp << '\n';
}
But for me it outputs 1970-01-01 and in.fail() is true.
Answer: This is a bug in your std::lib std::time_get facet. It should treat %a and %A identically, and both should parse either the abbreviated or full weekday name in a case-insensitive manner. Unfortunately some implementations won't do that, resulting in this failure. These same implementations have the same bug with %b and %B (month names).
You can work around this bug by compiling with -DONLY_C_LOCALE. This flag restricts you to the "C" locale, but it also avoids use of your std::time_get and std::time_put facets, instead implementing that logic within this library.
Reference bug report for gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78714
Why is local_t not a proper clock?
local_time is a time_point<local_t, Duration>, where local_t is just an empty struct. Every other time_point has a proper clock (with now()) for that template parameter. Why is local_time special this way, and won't this make generic programming with time_points harder?
This was done because local_time is different from all other time_points. It has no associated timezone. Thus it has no epoch. A local_time could be paired with any timezone. For example:
auto tp = local_days{nov/6/2016} + 7h;
This represents Nov. 6, 2016 07:00:00 in any timezone. To make this time_point concrete, you have to pair it with a specific timezone:
auto zt = make_zoned("America/New_York", tp);
Now zt represents Nov. 6, 2016 07:00:00 in New York. We could have just as easily paired it with any other timezone, including current_zone().
Now imagine if local_time had an associated clock with which you could call now(): What would this code mean?
std::this_thread::sleep_until(tp);
This would be a logic error because it is ambiguous what time to sleep until. As specified, with local_t being an empty struct, the above line of code (a logic error) does not compile:
error: no member named 'now' in 'date::local_t'
while (_Clock::now() < __t)
~~~~~~~~^
note: in instantiation of function template specialization
'std::this_thread::sleep_until<date::local_t, std::chrono::hours>>' requested here
std::this_thread::sleep_until(tp);
So this is yet another example of errors being detected at compile time, rather than letting them happen at run time.
The proper way to do this is to pair tp with some timezone, creating a zoned_time (zt above), and then say:
std::this_thread::sleep_until(zt.get_sys_time());
This compiles and will do exactly what it looks like it does: Sleeps until Nov. 6, 2016 07:00:00 in New York. This works correctly even if there is a daylight saving transition between the time the sleep_until is called, and the time to wake up (currently such a transition is scheduled 6 hours earlier than the wakeup time). It works because all sleep_until has to worry about is the sys_time (time_point<system_clock, Duration>) it was handed, and that sys_time has been correctly mapped from "America/New_York".
Why can't I compare instances of zoned_time?
When working with zoned_time objects, it seems natural that one should be able to make comparisons between two instances in relation to the instant in time each represents. Though each zoned_time references a sys_time, it also has an associated time zone, which complicates the question of how operators like < should be defined. Such operations could be defined relative to sys_time alone, or to sys_time and time_zone* together, though there are issues with each. For two instances of zoned_time, x and y, operator< could be defined as x.get_sys_time() < y.get_sys_time(); however, there would then be cases where !(x < y) && !(y < x) && x != y, namely when x and y have the same sys_time but different time_zone*. That breaks not only with tradition but probably several algorithms as well.
In the case that comparison is required by an application, named functors or lambdas defining the comparison operations should be used.
On the encoding of weekday
As this library developed, a decision about weekday encoding had to be made. There were two existing competing standards:
- C and C++ in the encoding of tm.tm_wday which maps [0, 6] to [Sun, Sat].
- ISO which maps [1, 7] to [Mon, Sun].
Implicit in each of these encodings is the answer to the question: What is the first day of the week? And the answer depends upon your locality or culture. Monday is the first day of the week according to the international standard ISO 8601, but in the US, Canada, and Japan it's counted as the second day of the week.
This library seeks to not answer this question, and thus leave the answer up to the client. Indeed this library doesn't even force a decision to Sunday or Monday. It could be Thursday for all this library cares. How does it do this?
First, weekday has no operator<(). Sunday is neither less than nor greater than Monday. Sunday is simply not equal to Monday. If you have to store weekday in an associative container you will have to come up with your own ordering function.
Second, subtraction of one weekday from another is unsigned modulo 7 arithmetic. That is, no matter what weekday you subtract from another, you will always get an answer with the type days with a value in the range [days{0}, days{6}]. This definition of subtraction for weekday has a profound impact on calendrical algorithms that compute with weekdays. It makes every algorithm encoding-independent.
For example, let's say you want to write an algorithm that takes a sys_days, and computes the next Tuesday, with the proviso that if the current day is a Tuesday, then the result is the same day (not important to the point, just to nail down a specific algorithm). That might look like:
date::sys_days
next_Tue(date::sys_days x)
{
    using namespace date;
    return x + (Tuesday - weekday{x});
}
This computes the current weekday, and then subtracts it from Tuesday and adds that many days to the current date. The important thing to notice about this algorithm is that it is independent of the underlying encoding of weekday as long as weekday subtraction is unsigned modulo 7. Tuesday is always ahead of weekday{x} by some number of days in the range of [0, 6].
These two characteristics make the encoding of weekday much less important, except perhaps for those wishing to format or parse weekday as an integer. C and POSIX provide two ways to format a weekday as an integer with strftime: "%w" and "%u", which cover the [0, 6] and [1, 7] mappings respectively. This library supports these same mappings with format. Additionally POSIX supports "%w" (but not "%u") for parsing (with strptime). This library supports both "%w" and "%u", using parse. So this library provides the same support for the [0, 6] mapping as POSIX (and better than C), and better support for the [1, 7] mapping than both C and POSIX.
This better support takes two forms: Reducing the importance of the encoding on calendrical algorithms, and better parsing support for the ISO [1, 7] encoding.
But ultimately what encoding did this library choose?
Answer: Neither.
The weekday{unsigned wd} constructor accepts both the C [0, 6] encoding and the ISO [1, 7] encoding. How does it tell the difference? It is really quite simple. The two encodings have the same mapping for all of the days except Sunday. The [0, 6] encoding maps 0 to Sunday and the [1, 7] encoding maps 7 to Sunday. The weekday{unsigned wd} constructor accepts both wd == 0 and wd == 7 to mean Sunday. And aside from this constructor, weekday does not expose what encoding it uses internally. It can store [0, 6] and map 7 to 0 at this constructor, or it can store [1, 7] and map 0 to 7 at this constructor.
This library was written to enable clients to abandon the ancient C/POSIX <time.h> API. However the reality is that clients still often have to interoperate with legacy code that uses the old <time.h> API. This old API uses the [0, 6] encoding for the tm_wday member of the struct tm. The weekday{unsigned wd} constructor can accept a tm_wday directly, and work perfectly fine, even though it is also specified to accept the ISO encoding.
Filling out the tm_wday member of struct tm with a weekday is also easy:
tm.tm_wday = (wd - Sunday).count();
Indeed, tm_wday's specification says: "days since Sunday — [0, 6]." This formulation works, even if the implementation is encoding weekday with [1, 7], because weekday difference is specified to be circular (modulo 7).
As a demonstration of the lowered importance of weekday encoding in calendrical algorithms, see how to print a calendar which is configurable on the first day of the week. For example here is how to use this code to make Thursday the first day of the week:
print_calendar_year(std::cout, 3, 2018_y, Thursday);
which outputs:
    January 2018          February 2018            March 2018
Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We
             1  2  3    1  2  3  4  5  6  7    1  2  3  4  5  6  7
 4  5  6  7  8  9 10    8  9 10 11 12 13 14    8  9 10 11 12 13 14
11 12 13 14 15 16 17   15 16 17 18 19 20 21   15 16 17 18 19 20 21
18 19 20 21 22 23 24   22 23 24 25 26 27 28   22 23 24 25 26 27 28
25 26 27 28 29 30 31                          29 30 31

     April 2018              May 2018              June 2018
Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We
          1  2  3  4                   1  2       1  2  3  4  5  6
 5  6  7  8  9 10 11    3  4  5  6  7  8  9    7  8  9 10 11 12 13
12 13 14 15 16 17 18   10 11 12 13 14 15 16   14 15 16 17 18 19 20
19 20 21 22 23 24 25   17 18 19 20 21 22 23   21 22 23 24 25 26 27
26 27 28 29 30         24 25 26 27 28 29 30   28 29 30
                       31

     July 2018             August 2018           September 2018
Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We
          1  2  3  4                      1          1  2  3  4  5
 5  6  7  8  9 10 11    2  3  4  5  6  7  8    6  7  8  9 10 11 12
12 13 14 15 16 17 18    9 10 11 12 13 14 15   13 14 15 16 17 18 19
19 20 21 22 23 24 25   16 17 18 19 20 21 22   20 21 22 23 24 25 26
26 27 28 29 30 31      23 24 25 26 27 28 29   27 28 29 30
                       30 31

    October 2018          November 2018          December 2018
Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We   Th Fr Sa Su Mo Tu We
             1  2  3    1  2  3  4  5  6  7          1  2  3  4  5
 4  5  6  7  8  9 10    8  9 10 11 12 13 14    6  7  8  9 10 11 12
11 12 13 14 15 16 17   15 16 17 18 19 20 21   13 14 15 16 17 18 19
18 19 20 21 22 23 24   22 23 24 25 26 27 28   20 21 22 23 24 25 26
25 26 27 28 29 30 31   29 30                  27 28 29 30 31