[12.x] prefer "datetime" types over "timestamp" types #54256
base: master
"datetimes" are the better default choice for date related columns, and should be the recommended way from Laravel going forward
- address 2038 issue
- only 1 extra byte
- internal binary storage for equal performance
- ignorant of server/SQL timezone
Here's my two cents. I'm not sure about the performance implications, although I believe any difference should be minimal on both sides. The only drawback was storing as
|
I personally feel it is better to wait till closer to 2038 and see what the consensus is on this topic for timestamps. Maybe by then there will be a better solution, or it will be a non-issue. |
@Rizky92 storing as an
I don't understand this point. can you elaborate? @ziming we have a data type literally called "datetime" that was built to handle dates and times. if we start enforcing this good standard now, the 2038 problem literally goes away. regardless of the 2038 aspect, using |
I understand that. That's why it's one alternative among many that had the potential to avoid a breaking change. :)
My bad. I shouldn't have added that. I was wondering whether it's relevant to the scope of this PR. Basically, using datetime loses the timezone information that timestamp has, because timestamp internally uses the database server timezone to offset the datetime information. Changing the columns to datetime means new apps must explicitly define which timezone they live in. |
Hey @browner12 - thanks for this PR. Are there any breaking changes for existing applications? |
In our MySQL application we stopped using the TIMESTAMP data type because its behavior may differ depending on server settings. This may lead to big date issues. |
@taylorotwell shoot, I knew there was something in my original post I forgot that I wanted to add. As far as I can tell, there would be no breaking changes because this PR only affects stubs, which would only affect migrations going forward.
I've also set up a small application with 2 models, with one using
As stated, the one questionable change we could make is to have
Would love some others' thoughts on this.
I've also done some testing locally about updating the column definition of a
ALTER TABLE `test` CHANGE `updated_at` `updated_at` datetime NULL;
Basically what seems to happen is the value you see remains unchanged. It just loses its awareness of the server/SQL timezone setting. Again, if you were doing everything in UTC anyway, there is no impact. |
There's also a "timezone" option you can specify on the connection in config (at least for MySQL; it is not present by default) to switch the timezone in which timestamps are retrieved, independent of the server. Overall I'd say this change is necessary not only because of the timestamp issues, but also to make more developers aware that the majority of date pickers return values in the user's local timezone unless set to ISO format (and even then, the JS ISO format is not fully compatible with PHP's ISO validation), not in the server's or the project's timezone. This should be handled even if the project targets a single country, because users do not have to live in that country to use it. There's also not a lot of content around this to raise awareness, and I often encounter dangerous changes to date handling made by juniors who don't know why it was built the way it was. |
maybe this |
@kminek it's not as simple as just adding it. To set a time zone, the mysql.time_zone_name table must contain it or it will throw an exception, and on Windows it's empty by default so you have to go download and import it. It also adds an additional SQL call on init. And in all cases you must ensure that it contains your specified time zone, or your site will be unavailable because that one added SQL call will fail. |
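For reference, a minimal sketch of where the connection option discussed above would live, assuming a standard config/database.php layout. The host, database, and credential values are placeholders invented for illustration; only the 'timezone' key is the point. A numeric offset is used here on the assumption that MySQL accepts offsets without the named-zone tables being loaded, which sidesteps the mysql.time_zone_name concern for UTC.

```php
<?php

// Sketch of the 'mysql' entry under 'connections' in config/database.php.
// All values except 'timezone' are hypothetical placeholders.
$mysqlConnection = [
    'driver'   => 'mysql',
    'host'     => '127.0.0.1',
    'database' => 'app',
    'username' => 'app',
    'password' => 'secret',
    // Not present in the default config. When set, it is applied as the
    // session time zone when the connection is opened.
    'timezone' => '+00:00',
];
```

This only affects how timestamp columns are converted for that session; datetime columns are returned as stored either way.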
yah, while related, the database config |
Let the user/app decide/configure what he wants as default instead of forcing framework defaults. In my personal experience I discourage using
The tricky part lies in clients being able to connect in different timezones (see connection time_zone session variable) for different reasons. The
We had and still have this legacy issue that we drag with us, because back in the day developers chose to use
As a rule of thumb, use |
This is starting to feel more and more like the comment everyone just regurgitates about
The difference here is in which assumption is more dangerous. IMO, for the vast majority of users, the MUCH safer option is to completely ignore any server, SQL GLOBAL, or SQL SESSION timezone settings, and to assume that everything is handled in UTC. Then it is the responsibility of the code to apply any timezone related adjustments to user facing output.
We can solve it now, AND we can solve it with a more space efficient result (5 bytes) than what will probably end up being the solution which will store 8 bytes.
I'm guessing the VAST majority of users are not connecting to their database with custom timezone variables. While Laravel supports it, most users don't even know it exists because it is not part of the default config options. You also have the downside of it requiring an extra query on EVERY connection. Most users are just relying on whatever SQL GLOBAL value they have set (which is hopefully UTC). As I re-read your comment, what it seems to really come down to is WHERE the timezone conversion is happening. You are doing it via SQL. I'm advocating for it in the code because:
We already are enforcing a "default", currently a |
How did you conclude that for 99% the default should be datetime? The majority never needs date ranges outside of the timestamp range; most of the time it's created_at and updated_at anyway. timestamp is more efficient in storage and index size, and it handles multiple clients using the same database well, especially when it is not easy to enforce the same datetime handling on each client. The majority is not on a microservice architecture where each service has its own database. So I would keep the status quo; there's no need to change something for everyone for no urgent reason. |
I mean, this is just a stub. Any argument for or against datetime can be countered with the same argument flipped for timestamp, with the only difference being that datetime can store beyond the limit; it just depends on the use. I'm all for datetime as the default when it comes to multi-timezone apps, but the downside is that any Laravel package you use that adds dates uses timestamps, which means you either use timestamps to avoid the footgun, or you accept your fate and monitor all those migrations. Overall, neither datetime nor timestamp is a silver bullet when it comes to handling dates from various users, as you still need to do the conversion to a specific timezone; it just depends on which timezone. If you don't think about this, you don't have the problems you're describing. |
I know, it's kind of pointless to argue about stubs, but then why do we need to change it? Did it cause trouble or issues? It feels like we are changing this because of personal opinion. If the average user understands the differences or has the need to use one or the other, he will change it in his project. |
Thinking about this more, I'm all for datetime, but mixing datetime and timestamp columns in the same codebase for no reason sounds really bad. That is what this change will lead to, since nobody really pays attention to it, and it can have unexpected side effects. So it's not a great idea as a default unless all packages switch too. |
I would argue the average user doesn't understand, and is exactly the reason I'm trying to get rid of the footgun that is
I disagree. I laid out some very clear points about why As for the point about mix-and-matching both types, whether that happens internally or from a 3rd party package, it won't matter as long as your SQL timezone variable never changes. Whether you are using Here is the crux of the whole argument:
this can be summarized in the following table:
We want the "Input" column to always equal the "Retrieved Value" column. As you can see in row 3, this doesn't happen when we change the SQL Timezone, and is the source of the problems. There are some people who want the behavior in line 3, as @TheLevti might, but I would argue those are the small minority of people. For the vast majority of people we should default to |
Don't get me wrong, I'm fully on board with datetime, UTC and such, and have used it for many years without any problems, accepting my fate, you could say, by also ensuring all third party migrations use them so as not to mix and match. My only concern is what the overall effect will be, because all libraries will still use timestamps while your application uses datetimes, and I really doubt many will take note. I personally won't be affected by the stub change, as I always publish the stubs and 3rd party migrations and adjust them to use datetime, which also prevents them from magically changing as they would in this case, so maybe my concern is overblown. |
@browner12 You can't make such baseless assumptions about who uses what, which part is a majority, their knowledge and what is best for them. You provided 0 evidence or statistics about how users use this framework. Please be aware that this is the most popular PHP framework, not a hobby product you are developing with a bunch of people. You cannot know why or if users use different time zones in applications or for their database connection, or if they are required to use something other than UTC. And then saying that they should all switch to UTC is just ridiculous. You cannot decide for everyone how to do something right just because you have a strong opinion about it; maybe in your own product, but not in a framework that millions depend on. As such, any breaking change that is not absolutely necessary (e.g. because it otherwise affects everyone) must be avoided. And yes, if you are not aware, this would be a breaking change. Imagine someone already using a non-UTC connection who is used to creating models/classes via stubs. Suddenly new tables or columns are not stored in UTC anymore, but in his timezone. As a result, reads from e.g. a BI tool that connects with UTC would then show incorrect results between those columns, requiring datetime juggling to fix that. People will suddenly have a mix of datetime/timestamp columns which differ by a potentially variable number of hours from each other. Also, even if it's right for the majority, that does not mean you can just dump the minority and let them suffer. If you have a need to use datetime in your stubs, feel free to overwrite the stubs and change the default, but changing this for everyone because of opinion is totally unnecessary. |
Fair point. I meant my numbers as hyperbole rather than statistics. I thought that was obvious, but I'll try to be more clear next time. Also fair, my usage assumptions are a priori. They are based on my years as a programmer and in the Laravel ecosystem, rather than hard data. My assumptions are based on anecdotal things I've seen in many projects over the years, and on the discussions I've had over the previous couple of weeks, which suggest that most programmers don't seem to fully understand the timezone implications of their data type.
I am not suggesting that everyone switch to UTC, I'm stating that UTC is the best storage option generally speaking. This is not just my opinion, but the opinion of the framework as well. https://laravel.com/docs/11.x/eloquent-mutators#date-casting-and-timezones As I've stated previously, having both For users like yourself in the minority (my opinion) who have different timezones depending on the DB connection, you would have some annoyances, but still no breaking changes. You would have to be cognizant of the field type when writing your code, to know if you needed to perform the timezone adjustment in the code, or if it was already handled for you in SQL.
IMO default stubs ARE for the majority. We have the opportunity here to give the majority a much better data type and I think we should take it. The intent is not to make the minority suffer, and the framework has a very simple one time fix so the minority doesn't have to suffer. Any users who wish to stick with |
Honestly at this point I don't mind anymore. I would anyway review each column to have the ideal type and not just blindly accept defaults. If this gets accepted, we should for sure have it clearly highlighted in the 11 -> 12 migration guide. |
@@ -16,8 +16,8 @@ return new class extends Migration
             $table->string('type');
             $table->morphs('notifiable');
             $table->text('data');
-            $table->timestamp('read_at')->nullable();
-            $table->timestamps();
+            $table->dateTime('read_at')->nullable();
Is the method really named dateTime and not datetime? Looks strange because we have the datetimes method below and not dateTimes.
there is a casing inconsistency that I was going to address after this PR.
I know this has been brought up before, but I'm going to make the case again why datetimes are the superior data type compared to timestamps, and why Laravel should make these their recommended and default types for v12 and beyond.
Premises
What this PR does NOT do
Parity between datetime and timestamp
Storage requirements
timestamp requires 4 bytes for storage. datetime requires 5 bytes for storage. Both allow an additional 3 bytes for precision.
One of the proposals for solving the 2038 problem for timestamp is to increase it to a 64-bit integer, which would increase its storage requirements to 8 bytes.
https://dev.mysql.com/doc/refman/8.4/en/storage-requirements.html#data-types-storage-reqs-date-time
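To make the 32-bit limit behind that proposal concrete, here is a small sketch in plain PHP. It is not part of the PR and does not touch MySQL; it only shows the arithmetic, assuming the value is a count of seconds since the Unix epoch. The variable names are made up for illustration.

```php
<?php

// Largest value a signed 32-bit integer can hold.
$maxInt32 = 2 ** 31 - 1; // 2147483647

// Interpreted as seconds since the Unix epoch, formatted in UTC.
$lastSecond = gmdate('Y-m-d H:i:s', $maxInt32);

echo $lastSecond, PHP_EOL; // 2038-01-19 03:14:07

// One second later no longer fits in 32 bits, which is why a wider
// (64-bit, 8-byte) integer is one of the proposed fixes.
$overflows = ($maxInt32 + 1) > (2 ** 31 - 1);
var_dump($overflows); // bool(true)
```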
Performance
There has been some confusion in other related PRs, Issues, and Discussions about how the performance of datetime would be worse than timestamp because it stores the date as a string, and string comparison is slower than integer comparison. datetime is actually stored internally in a fixed-length binary format which allows comparisons to be just as efficient as integer comparison.
For testing, I created a table with the following migration:
I filled the table with 100,000 rows with a random date stored in both the "timestamp" and "datetime" fields. I ran the following queries and had results consistently within 1ms of each other.
Allow using "CURRENT_TIMESTAMP"
Both data types allow using the "CURRENT_TIMESTAMP" for both an initial value and an "on update" value.
datetime benefits
Solves the 2038 issue
timestamp fields store their value internally as a signed 32-bit integer, which means any dates after 2038/01/19 are not valid for timestamps. This is not as big of an issue right now, since most stored dates are in the past, but could potentially be a huge problem when we reach that date. It does affect current use, too, when you may be storing a future date, like an expiration.
datetime fields have a minimum value of 1000-01-01 and a maximum value of 9999-12-31, giving us a much wider valid date range and eliminating the 2038 problem.
Ignorant of Server/SQL timezone
Lastly, what may be the most important of all the benefits of datetime: it is completely ignorant of the timezone set on either the server or SQL, while timestamp is not.
When a date is entered into a timestamp it will first attempt to convert it to UTC for internal storage. This is dependent on a couple of factors. SQL could have its own explicitly set timezone. More likely, it will be set to "SYSTEM" which means it defers to the timezone set on the OS. Either way, issues arise when SQL deems its timezone to be something other than UTC. Let's say for example, SQL's timezone is set to CST(-6). When it receives a value for a timestamp field, it will interpret the value it receives as a CST value, and convert it to UTC for internal storage, and then also convert it back to CST when the value is retrieved. Now, whether you actually intended to give it a CST value is irrelevant, because all you really care about is that the value you gave it is EXACTLY what you got back.
As long as that SQL timezone value stays the same, you're actually kind of ok, even if things don't technically match up. However, things can go very poorly if the SQL timezone changes.
Imagine again we have our server with the timezone set to CST. We insert a row with a CST value, and SQL converts the timestamp field to UTC internally. Now someone comes along and sees that the server is set to CST, but should probably be UTC because that's pretty standard for servers. Unfortunately that simple change would mess up all of our data. Now when that row is retrieved from the database, SQL sees the server is in UTC, so it just gives the internal value it stored back to us, even though that's not correct and should have been converted.
This means the value we put into the database is NOT the value we got out! Some might argue that's intentional, but I would say for the large majority of people any timezone other than UTC on the server is pure happenstance or oversight, and not actually what they intended.
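The scenario above can be sketched in plain PHP. This is only a simulation of the conversion behaviour described, not a query against MySQL: the helper names timestampWrite and timestampRead and the sample date are made up for illustration, and CST is modelled as a fixed -06:00 offset.

```php
<?php

// Simulates writing to a TIMESTAMP column: the wall-clock string is
// interpreted in the session time zone and stored as a UTC epoch value.
function timestampWrite(string $value, string $sessionTz): int
{
    return (new DateTimeImmutable($value, new DateTimeZone($sessionTz)))->getTimestamp();
}

// Simulates reading a TIMESTAMP column: the stored UTC value is converted
// back into whatever the session time zone is at read time.
function timestampRead(int $stored, string $sessionTz): string
{
    return (new DateTimeImmutable('@' . $stored))
        ->setTimezone(new DateTimeZone($sessionTz))
        ->format('Y-m-d H:i:s');
}

$input = '2025-01-15 12:00:00';

$stored      = timestampWrite($input, '-06:00');  // held internally as 18:00:00 UTC
$sameZone    = timestampRead($stored, '-06:00');  // zone unchanged: input comes back
$afterChange = timestampRead($stored, '+00:00');  // zone switched to UTC: value shifts

echo $sameZone, PHP_EOL;    // 2025-01-15 12:00:00
echo $afterChange, PHP_EOL; // 2025-01-15 18:00:00

// A DATETIME column keeps the literal value, so no zone setting is involved.
$datetimeStored = $input;
$datetimeRead   = $datetimeStored;

echo $datetimeRead, PHP_EOL; // 2025-01-15 12:00:00
```

The input only matches the retrieved value for the timestamp-style column while the session time zone stays the same between write and read.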
If we switch to datetime fields, SQL ignores any server or SQL timezone settings and simply stores the value you give it, and returns exactly the same value when you request it. By making ourselves ignorant of any server settings, we actually protect ourselves from any unintentional errors like the one mentioned above.
For some real numbers, assume we started with a server in CST, the table will show how timestamp and datetime differ.
Questionable Changes
One thing I did not change was the softDeletes() method. I think ideally it would change to using datetimes internally, and then a new softDeletesTimestamp() method would be created for that specific use. However, I'm not sure how that would affect existing usage of softDeletes() that were executed when it used timestamps.