"The Largest Database Migration I Have Ever Written"

Started by GenericArchangel
Posted
3 replies
Login to subscribe to responses
GenericArchangel's avatar
Posts: 3467
GenericArchangel
Wallet After Summer Sale
Magical Inkwell - Wrote MLP fanfiction consisting of at least around 1.5k words, and has a verified link to the platform of their choice

Free the Tantabus
So I was looking through the changelog to see if I could figure out the exact reason the site was updated, and noticed this gem with no additional context. I assume this is something fairly important to how the site works, even though there have been no significant end-user changes or issues (congrats to the dev team on that, btw). So, out of curiosity: what was that update, and what is the whole database migration thing about?

(yes I see that was two days ago, but I presume the change "deployed" is related)
byte[]'s avatar
Posts: 1719
byte[]
Site Administrator
Site Developer
This migration didn’t actually change anything fundamental about the way the application code works. This migration is huge because six things were accomplished at once, and it required a fair amount of testing beforehand to ensure it would apply correctly.


1. Delete 9 unnecessary / unused indices on 8 tables
2. Create 32 necessary indices for cascaded updates and deletes in step 6
3. Delete data with inappropriate SQL NULL values* as a prerequisite for step 5
4. Delete data with broken references** as a prerequisite for step 6
5. Add NOT NULL constraints to 218 columns across 30 tables
6. Add FOREIGN KEY referential constraints to 71 columns across 37 tables


In total, the completed run of this migration took exactly 548 seconds to apply from start to finish. There was about a 20 minute delay at the beginning while I waited for pg_dump to complete. I set the parallelism as high as I could get it, but I do not think I can get it to dump any faster, unfortunately.


Some hitches that I ran into along the way that required me to restart the migration:

– One post had a topic_position that was NULL, and my code to set a meaningful value there was broken. I fixed this with 2bf1b1dc57 and restarted the migration.
– A few Rails-generated foreign key names were different in development than in production, which threw a wrench into the works several times while I had to drop those constraints. I knew this was going to happen before I started, and tried to account for it in 95d509ed4b, but it did not break the way I expected it to.


Overall, that’s pretty much it.


* A row in source_changes with a NULL image_id is inappropriate because it is meaningless.
** A row in source_changes with an image_id that exists, but does not point to any known ID number, is inappropriate because it is meaningless.
byte[]'s avatar
Posts: 1719
byte[]
Site Administrator
Site Developer
To answer another question:

The dates that show up in the changelog have no real connection to the date that code actually makes it live (though they may be well-correlated). They are just commit dates.
Posted Report
Interested in advertising on Derpibooru? Click here for information!
Travelling Pony Museum

Derpibooru costs over $25 a day to operate - help support us financially!

Syntax quick reference: *bold* _italic_ [spoiler]hide text[/spoiler] @code@ +underline+ -strike- ^sup^ ~sub~