Fast SQLite Backups

Despite having “lite” in its name, SQLite can handle very large databases quite efficiently. The largest I am currently dealing with just passed 100 GB, and backing it up with the standard SQLite command-line utility would occasionally take many hours.
After some asking on the sqlite-users list (thanks Clemens Ladisch & Simon Slavin) and some experimenting, I have gotten the backup time down to 4-5 minutes on a live 100 GB database, which is just fine in my book. Below are the findings, and a utility, sql3bak, which wraps them up.

Backup Steps
The primary reason for the very long backup time is that when using the sqlite3_backup_step API function with a fixed number of pages, if the database is updated before the backup completes, the backup will… restart from the beginning. And the command-line utility backs up in steps of 100 pages.
This behavior was useful for the “legacy” journal modes like DELETE or TRUNCATE, where read transactions block write transactions and vice versa. A smallish number of pages per step would prevent the backup from locking the database for an extended period of time.
But if you use Write-Ahead Logging mode, it is much less useful. For a WAL database, you can back up an arbitrarily sized database in just one step, without preventing other write (or read) transactions from happening, and without risk of the backup restarting.
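The single-step idea can be sketched with Python's stdlib sqlite3 module, whose Connection.backup() wraps the same sqlite3_backup API (the function name backup_in_one_step is mine, not from sql3bak; passing pages=-1 copies everything in one sqlite3_backup_step call):

```python
import sqlite3

def backup_in_one_step(src_path, dst_path):
    # pages=-1 asks for the whole database in a single backup step,
    # so a concurrent writer on a WAL-mode source cannot force the
    # backup to restart from the beginning.
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst, pages=-1)
    finally:
        dst.close()
        src.close()
```

With pages=100 instead, a write to the source between steps would restart the copy, which is exactly what makes the command-line utility so slow on a live database.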
Synchronous Mode
Another cause of performance loss is that the destination database in the backup uses the synchronous=NORMAL mode by default. This mode is a Good Thing in general, but in the particular case of a backup, it is only useful if you are replacing an existing backup and do not want the previous backup to be lost in case of failure during the backup.
If on the other hand you are backing up to a new file each time, it is possible to use synchronous=OFF for the backup, and protect the backup from being accessed while the backup is taking place by using DELETE or TRUNCATE journal mode instead.
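A minimal sketch of that trade-off, assuming the destination is always a fresh file (helper name is mine): synchronous=OFF skips the fsync calls, and a rollback-journal mode keeps other connections from reading the half-written copy while the backup runs.

```python
import sqlite3

def backup_to_new_file(src_path, dst_path):
    # dst_path is assumed to be a brand-new file; if the backup fails,
    # nothing of value is lost, so durability guarantees can be dropped.
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        dst.execute("PRAGMA synchronous=OFF")     # skip fsync on the copy
        dst.execute("PRAGMA journal_mode=DELETE") # lock out readers during backup
        src.backup(dst, pages=-1)
    finally:
        dst.close()
        src.close()
```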
Cache Size
The last speedup tweak comes from the cache_size option.
For the backup destination, using a cache_size of 1 provides the best performance: SQLite offloads pages to the disk directly and does not waste time maintaining a cache for pages that will never be accessed again.
For the source database, the best cache_size is a little more subtle: if the source WAL file is empty, using a cache_size of 1 also appears optimal, as every page will be read only once, so any greater cache would just be wasted.
However, when the source database's WAL file is large, higher backup performance was achieved with a “large enough” cache. The exact value seems to depend on exactly what is in the WAL, but a cache too small had a more negative impact than a cache too large.
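Putting the cache findings together, a sketch might look like this (the 4096-byte page size, the 1 GiB cap mentioned below for sql3bak, and the WAL-size-to-cache mapping are my assumptions; the article only says the sweet spot is workload-dependent):

```python
import sqlite3

PAGE_SIZE = 4096   # assumed page size of the source database
GIB = 1024 ** 3    # cap the source cache at ~1 GiB of RAM

def tuned_backup(src_path, dst_path, wal_bytes=0):
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # Destination: each page is written once and never revisited,
        # so any cache beyond a single page is wasted effort.
        dst.execute("PRAGMA cache_size=1")
        if wal_bytes > 0:
            # Large WAL: size the source cache to roughly cover it,
            # erring on the large side (too small hurts more).
            pages = max(min(wal_bytes, GIB) // PAGE_SIZE, 1)
            src.execute(f"PRAGMA cache_size={pages}")
        else:
            # Empty WAL: every page is read exactly once.
            src.execute("PRAGMA cache_size=1")
        src.backup(dst, pages=-1)
    finally:
        dst.close()
        src.close()
```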
sql3bak
I wrapped up all the above logic in a small utility, creatively named “sql3bak”.
Note that it is really intended for backing up largish SQLite databases (at least dozens of megabytes, up to hundreds of gigabytes). It also assumes that for large databases with large WAL files, it is safe to allocate up to 1 GB of RAM for caches.
The source code can be found in the bitbucket sql3bak repository, and a pre-compiled (signed) binary is available in the downloads.
(note: to compile, you may have to adapt the SQLite3 calls to your favorite bindings, as I am using a custom binding based on a bundled dll, similar to what I used for dwsMPIR.Bundle.pas)

https://www.delphitools.info/2016/10/06/faster-backups-for-large-sqlite-databases/
https://bitbucket.org/egrange/sql3bak/downloads
