EionRobb 2 days ago

One of the biggest contributors I've had in the past for SQLite blocking was disk fragmentation.

We had some old Android tablets using our app 8 hours a day for 3-4 years. They'd complain of locking errors and slowness, but every time they'd copy their data to send to us, we couldn't replicate, even on the same hardware. It wasn't until we bought one user a new device and got them to send us the old one that we could check it out. We thought maybe the SSD had worn out over the few years of continual use, but installing a dev copy of our app was super fast. In the end what did work was to "defrag" the db file by copying it to a new location, deleting the original, then moving it back to the same name. Boom, no more "unable to open database" errors, no more slowdowns.
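
Roughly what that looks like in code, as a sketch (Kotlin; the function name is illustrative, and every SQLite connection has to be closed first, with any -wal/-shm sidecar files checkpointed and removed, otherwise the copy can capture a mid-write state):

```kotlin
import java.io.File

// Rewrite the db file so the filesystem allocates fresh,
// hopefully contiguous, blocks for it.
fun defragByCopy(dbPath: String) {
    val original = File(dbPath)
    val temp = File("$dbPath.defrag")

    original.copyTo(temp, overwrite = true)   // new file, new blocks
    check(temp.length() == original.length()) { "copy size mismatch" }

    check(original.delete()) { "could not delete original" }
    check(temp.renameTo(original)) { "could not move copy back" }
}
```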

I tried this on Jellyfin dbs a few months ago, after running it for years and then suddenly running into performance issues, and it made a big difference there too.

simscitizen a day ago

Copying the file likely forces the creation of a new one with no or lower filesystem fragmentation (e.g. a 1MB file probably gets assigned to 1MB of consecutive FS blocks). Then those FS blocks likely get assigned to flash dies in a way that makes sense (i.e. the FS blocks are evenly distributed across flash dies). This can improve I/O perf by some constant factor. See https://www.usenix.org/system/files/fast24-jun.pdf for instance for more explanation.

I would say that the much more common degradation is caused by write amplification due to a nearly full flash drive (or a flash drive that appears nearly full to the FTL because the system doesn't implement some TRIM-like mechanism to tell the FTL about free blocks). This generally leads to systemwide slowdown though rather than slowdown accessing just one particular file.

This was especially prevalent on some older Android devices which didn't bother to implement TRIM or an equivalent feature (which affected even Google's own devices, like the Nexus 7).

izacus 2 days ago

That's much more likely flash degradation than actual fragmentation. Did you use cheap tablets with eMMC storage?

  • EionRobb a day ago
    • izacus a day ago

      Yeah, then copying the files most likely moved them to a different part of storage and it improved performance again.

      Sadly that's a common plague for cheaper Android hardware - after enough writes the flash performance drops off a cliff making those devices essentially unusable :/

      (More expensive hardware - including Apple's - tends to have UFS-type storage, which lasts longer.)

  • EionRobb a day ago

    We had that thought too. I'll have to try to dig out what the tablets were to find out exactly what type of storage they had - this would have been 3 or 4 years ago now. We sort of ruled that out because:

    The other workaround to get a speed boost was for the user to uninstall and reinstall the app (and then wait for all the data to download again), but that didn't fly because users would delete before they'd synced off all their data, and then data would go missing.

    This was all despite having VACUUM running whenever the app started.

    Whether it was bad flash or not, we still had to try to resolve it, as the client wouldn't buy new hardware until we could prove that we had the knowledge to make the problem go away first :/

    • izacus 5 hours ago

      Yeah, I don't think there's much you can do from the software side there - those kinds of Android devices just end up unusable due to I/O performance degradation, and it's hard to keep them running.

  • georgemcbay a day ago

    > That's much more likely flash degradation than actual fragmentation. Did you use cheap tablets with eMMC storage?

    My understanding of the parent reply's situation is that this was happening on the tablets of their users, so it kinda doesn't matter that it can be avoided by not using cheap tablets.

    Most apps aren't in a position to tell their users that they are on their own when they run into what feels like an unreasonable app slowdown because they didn't buy a good enough device to run it on, especially when they've previously experienced it running just fine.

    If all their apps feel like crap on that tablet, sure, that might fly... but if it's only your app (or only a small set of apps that use SQLite the same way the OP's company did) that feels like crap after a while, that's effectively a you problem (to solve) even if it's not really a you problem.

    In any case, it's an interesting data point and could be very useful information to others who run into similar issues.

    • izacus a day ago

      I don't quite understand what you're arguing here.

      I'm merely saying that the root cause was misidentified - the performance degradation didn't happen due to fragmentation, but because the flash storage was degraded to the point where the write performance dropped significantly. This happens faster for eMMC vs. SSD-style storage.

      Copying the DB file moved the data to different storage blocks which is why it (temporarily again) improved performance.

ErroneousBosh a day ago

> One of the biggest contributors I've had in the past for SQLite blocking was disk fragmentation.

Is that even still a thing? I thought modern filesystems like ext4 were supposed to be largely immune to that.

  • tredre3 20 hours ago

    Ext4 isn't used on Android, and it isn't immune to fragmentation.

    The way ext4 reduces fragmentation is with some basic heuristics: mainly, it spreads files across the full disk instead of packing them into the next free spot, so files have room to grow without fragmenting. When space gets low, it fragments just as badly as older filesystems, unfortunately.

Multicomp 2 days ago

Would the SQLite vacuum function help with that?

  • mceachen 2 days ago

    You can VACUUM INTO, ~~but standard vacuum won’t rewrite the whole db~~ (vacuum rewrites the whole db)

    https://sqlite.org/lang_vacuum.html

    (Edit: if multiple processes are concurrently reading and writing, and one process vacuums, verify that the right things happen: specifically, that writes from other processes during the vacuum don't get erased when the vacuum completes. You may need an external advisory lock to avoid data loss.)
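
    For illustration, a minimal VACUUM INTO sketch (Kotlin, using the xerial sqlite-jdbc driver here as an assumption - any SQLite binding's plain execute call works the same way; the target file must not already exist):

    ```kotlin
    import java.sql.DriverManager

    // VACUUM INTO writes a compacted copy of the database to a new
    // file, leaving the original untouched; plain VACUUM rewrites
    // the database in place.
    fun vacuumInto(dbPath: String, outPath: String) {
        DriverManager.getConnection("jdbc:sqlite:$dbPath").use { conn ->
            conn.createStatement().use { stmt ->
                // outPath is spliced into the SQL, so it must be trusted.
                stmt.execute("VACUUM INTO '$outPath'")
            }
        }
    }
    ```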

    • return_to_monke 2 days ago

      > You can VACUUM INTO, but standard vacuum won’t rewrite the whole db.

      This is not true. From the link you posted:

      > The VACUUM command works by copying the contents of the database into a temporary database file and then overwriting the original with the contents of the temporary file.

    • teddyh a day ago

      > You can VACUUM INTO, ~~but standard vacuum won’t rewrite the whole db~~ (vacuum rewrites the whole db)

      HN does not support whatever markup you are trying to use. You have to use Unicode:

      “You can VACUUM INTO, b̶u̶t̶ ̶s̶t̶a̶n̶d̶a̶r̶d̶ ̶v̶a̶c̶u̶u̶m̶ ̶w̶o̶n̶’̶t̶ ̶r̶e̶w̶r̶i̶t̶e̶ ̶t̶h̶e̶ ̶w̶h̶o̶l̶e̶ ̶d̶b̶ (vacuum rewrites the whole db)”

didip 2 days ago

This is fascinating. What would be the solution for this? You can’t ask users to defrag.

  • DANmode 2 days ago

    Perform the file operation, after zipping the existing db as a backup, and leaving the original where it sits.

    Success, performance increase.

    Failure, no change.
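
    Something like this, as a sketch (names are illustrative; assumes the app has exclusive access to the db file while it runs):

    ```kotlin
    import java.io.File
    import java.util.zip.ZipEntry
    import java.util.zip.ZipOutputStream

    // Zip the db as a safety net, then rewrite it into fresh blocks.
    // If anything throws before the rename, the original (or the zip)
    // is still there.
    fun backupThenRewrite(dbPath: String) {
        val db = File(dbPath)

        // 1. The backup: zip the current db next to itself.
        ZipOutputStream(File("$dbPath.bak.zip").outputStream()).use { zip ->
            zip.putNextEntry(ZipEntry(db.name))
            db.inputStream().use { it.copyTo(zip) }
            zip.closeEntry()
        }

        // 2. The file operation itself: copy, delete, rename back.
        val temp = File("$dbPath.tmp")
        db.copyTo(temp, overwrite = true)
        db.delete()
        temp.renameTo(db)
    }
    ```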

    • axitanull a day ago

      Forgive my lack of knowledge, but how would simply zipping the original file "defrag" it?

      Shouldn't the file be moved to a different part of the disk first for that to happen?

      • DANmode a day ago

        That's "the file operation" :]

        Again: the zip is just a backup - see the sketch above.