Comment by ndr 5 days ago
I found it very lacking in how to do CD with no downtime.
Adding or deleting a field requires a particular dance to make sure both new code and old code work with both the new schema and the old schema.
The workaround I found was to run tests with new-schema+old-code in CI when I have schema changes, and then `makemigrations` before deploying new-code.
Are there better patterns beyond "oh you can just be careful"?
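For the delete side of that dance, one pattern (a sketch, not something from this thread; the app name `shop` and field `legacy_note` are hypothetical) is to split the removal across two deploys using Django's `SeparateDatabaseAndState`: first remove the field from Django's model state only, so new code stops touching it while the column survives for old code still running; drop the actual column in a later migration once no old code remains.

```python
# Deploy 1 sketch (hypothetical app "shop", field "legacy_note"):
# remove the field from the model *state* only, keeping the column
# in the database for old code that is still serving traffic.
from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [("shop", "0007_previous")]  # placeholder dependency

    operations = [
        migrations.SeparateDatabaseAndState(
            state_operations=[
                migrations.RemoveField(model_name="order", name="legacy_note"),
            ],
            database_operations=[],  # column intentionally kept for now
        ),
    ]

# Deploy 2, in a later migration, actually drops the column, e.g. with
# migrations.RunSQL("ALTER TABLE shop_order DROP COLUMN legacy_note;",
#                   reverse_sql=migrations.RunSQL.noop)
```

The reverse order works for adds: create the column first (nullable or with a default), deploy code that uses it afterwards.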
This is not specific to Django, but applies to any project using a database. Here's a list of some quite useful resources I used when we had to address this:
* https://github.com/tbicr/django-pg-zero-downtime-migrations
* https://docs.gitlab.com/development/migration_style_guide/
* https://pankrat.github.io/2015/django-migrations-without-dow...
* https://www.caktusgroup.com/blog/2021/05/25/django-migration...
* https://openedx.atlassian.net/wiki/spaces/AC/pages/23003228/...
Generally it's also advisable to set a statement timeout for migrations, otherwise you can end up with unintended downtime: ALTER TABLE operations very often require an ACCESS EXCLUSIVE lock, and if the table you're migrating already has, say, a very long-running SELECT from a background task on it, all other SELECTs will queue up behind the migration and cause request timeouts.
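One way to wire that up in Django (a sketch; the module name `settings_migrate` and the timeout values are assumptions, not anything prescribed here) is a settings override used only for the migrate step, passing `lock_timeout` and `statement_timeout` as libpq startup options so the migration aborts instead of making everything queue behind its lock:

```python
# settings_migrate.py -- settings used only when running "manage.py migrate".
from .settings import *  # noqa: F401,F403  -- base project settings

# If the migration cannot take its lock within 5s, or any statement runs
# longer than 60s, Postgres cancels it rather than letting every other
# query pile up behind the ACCESS EXCLUSIVE lock. Values are illustrative.
DATABASES["default"]["OPTIONS"] = {
    "options": "-c lock_timeout=5s -c statement_timeout=60s",
}
```

Then deploy scripts run something like `python manage.py migrate --settings=myproject.settings_migrate`, and a cancelled migration becomes a retryable failure instead of an outage.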
In some cases you can work around this limitation by manually composing operations that require less strict locks, but in our case it was much simpler to just make sure all Celery workers were stopped during migrations.
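A concrete instance of composing a weaker-lock operation (a sketch assuming Postgres, Django ≥ 3.0, and a hypothetical `shop.Order` model): a plain `AddIndex` runs `CREATE INDEX`, which blocks writes, while `AddIndexConcurrently` uses `CREATE INDEX CONCURRENTLY`, which takes a lock that still allows reads and writes. The trade-off is that `CONCURRENTLY` cannot run inside a transaction, so the migration must set `atomic = False`.

```python
# Sketch: building an index without blocking writes (hypothetical model).
from django.contrib.postgres.operations import AddIndexConcurrently
from django.db import migrations, models


class Migration(migrations.Migration):
    atomic = False  # CREATE INDEX CONCURRENTLY is not allowed in a transaction

    dependencies = [("shop", "0008_previous")]  # placeholder dependency

    operations = [
        AddIndexConcurrently(
            model_name="order",
            index=models.Index(fields=["created_at"], name="order_created_idx"),
        ),
    ]
```

Because the migration is non-atomic, a failure can leave an invalid index behind that has to be dropped manually before retrying.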