11 Sep 2025
Rails on SQLite: exciting new ways to cause outages
This post was originally given as a talk for Friendly.rb. The slides are also available.
Between Litestack and the Rails 8 trifecta of Solid Cable, Solid Cache, and Solid Queue, it’s easier than ever to spin up a Rails app that doesn’t need a database service, or a Redis service, or a file storage service. It’s great to simplify things, but even after 20 years of deploying Rails apps I was still caught out by some of the ways things are different.
Based on what happened when I built a new side project in Rails on SQLite, we’ll cover what’s different, what’s new, and several ways that you can knock your site offline or even destroy your entire production database. As we go, we’ll also talk about the advantages of using SQLite, and how those differences can help you.
So who am I, how did I learn these things, and why should you listen to me? I’m André Arko, better known on the internet as @indirect. A long time ago, I helped create Bundler, and I’ve been the OSS team lead for RubyGems and Bundler for more than a decade at this point.
I work at Spinel Cooperative, a collective of Ruby open source maintainers building rv, the Ruby language manager that can install Ruby in one second flat. We offer retainers for unlimited access to core team experts from Bundler, Rails, Hotwire, and more, who can answer your questions and solve your problems.
I’ve been deploying Rails applications to production since 2004, and most relevantly for this particular talk, I built a service called feedyour.email. Feed Your Email turns email subscriptions into RSS feeds that you can subscribe to in your feed reader. There is no signup, there are no accounts, you just go to the website and push a button to get an email address and a feed URL. Emails sent to that address will show up in that feed.
Feed Your Email is only possible as a service because of SQLite—if I had to maintain three Postgres instances and a couple of web instances and a couple of worker instances, I would have felt like it was too much hassle and cost too much money, and given up. SQLite reduced the complexity of building and deploying this service to the point where I was able to do it casually, for fun, and keep it running for everyone while feeling like it was worth it just for my own personal use.
This app serves about a million requests per month. That number sounds kind of big, but let’s do the math: roughly 23 requests per minute, or one every 2.6 seconds. We can definitely serve at least one request every 2.6 seconds off of our Rails app, even on a small DigitalOcean droplet or a small cloud VM. I run my service on Fly.io, and hosting costs about USD$14 per month.
It has bonus features like a web view so you can share individual emails as a link without giving away your secret email address or letting anyone see the other emails in that feed, and it also has full-text search if you’re trying to find a particular email. That full-text search is a feature of SQLite, which brings us back to the topic of this talk. How did SQLite help? Let’s take a look.
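As a small taste before we dig in, here’s roughly what that search feature can look like using SQLite’s FTS5 extension. The table and column names below are illustrative, not the ones Feed Your Email actually uses.

```ruby
# A minimal sketch of SQLite full-text search from a Rails migration, using
# the FTS5 extension built into SQLite. Names are illustrative.
class CreateEmailSearchIndex < ActiveRecord::Migration[8.0]
  def up
    execute <<~SQL
      CREATE VIRTUAL TABLE email_search USING fts5(subject, body);
    SQL
  end

  def down
    execute "DROP TABLE email_search"
  end
end

# Querying with SQLite's MATCH operator (assuming rows were inserted with
# rowid set to the email's id):
#   Email.where("id IN (SELECT rowid FROM email_search WHERE email_search MATCH ?)", query)
```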
The biggest fundamental difference, where almost every other difference comes from, is that SQLite is built into your web server process. The reason for the “lite” in the name is that it doesn’t run a separate process, it doesn’t listen on a port or a socket, and you can’t connect to it over the network. All the data is just in a single file, and your web process reads and writes that file when needed. This is awesome: you can’t have database connection errors anymore.
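To make that concrete, here’s roughly what “connecting” looks like with the sqlite3 gem on its own: opening the database is just opening a file. The file name is illustrative.

```ruby
# A tiny illustration of "in-process": the sqlite3 gem opens a file on disk.
# There is no server to start, and no host, port, socket, or credentials.
require "sqlite3"

db = SQLite3::Database.new("demo.sqlite3")  # creates the file if it doesn't exist
db.execute("CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO posts (title) VALUES (?)", ["hello"])
puts db.execute("SELECT title FROM posts").inspect  # => [["hello"]]
db.close
```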
There’s a small issue with giving the web process its own database in a local file, though. If you deploy your app as usual, your production database can disappear at any time! Heroku destroys dynos every 24 hours, including all files. Fly.io loses any files in a container whenever it restarts. In a world of containers, it’s incredibly easy to miss that your new SQLite database is on an ephemeral filesystem and will disappear along with the Puma process in your web container.
That leads us to the first and most important rule of using SQLite: put your database files in persistent storage. On AWS that means EBS, on Fly.io that means Volumes, but wherever you’re hosting, make sure that you can keep your database files across restarts (and ideally have automatic snapshots).
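If you want a belt-and-suspenders check on top of that, a tiny initializer can refuse to boot when the database file isn’t where you expect it. This is a hedged sketch: the /data mount point is an assumption, so substitute wherever your volume is actually mounted.

```ruby
# config/initializers/persistent_storage_check.rb
# A boot-time guard (sketch): abort in production if the SQLite file is not
# on the persistent mount. The /data path is an assumption for this example.
if Rails.env.production?
  db_path = ActiveRecord::Base.connection_db_config.database.to_s

  unless db_path.start_with?("/data/")
    abort "Refusing to boot: #{db_path} is not on the persistent volume"
  end
end
```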
Now that your database won’t disappear at any moment, let’s talk about what it means to put all the data in a single file. You probably use Rails.cache to store temporary data—that’s in a single SQLite file now, too. You also probably use ActiveJob to send emails or do slower work in the background. All of those jobs are also in a single SQLite file now. By default, it’s the same file!
Putting everything in the same SQLite file makes everything very easy. You just need to keep track of that one file, and you’ll know that your model data, your caches, and your jobs will all be safe.
SQLite being in-process raises a new problem, though… what if your background job needs to update a model? You might be used to deploying your background workers in a separate container, so you can scale them as needed. That’s not going to fly anymore, because your background workers need to be able to read and write that same single file that your web server is reading and writing.
Since your database is now in just one file, you have two options. You can run your workers in a second process inside the same VM or container as the web process, or you can run your workers in threads inside the web process.
If this is a smallish application, doing a million requests per month or less, you’ll be absolutely fine putting your background jobs in threads. As a bonus, putting background jobs in threads can almost halve the amount of RAM you need because a single Rails process is handling both web and job requests.
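If you’re using Solid Queue, the gem ships a Puma plugin that keeps everything inside the one web container: Puma starts and supervises Solid Queue alongside the web workers, so a single deployable unit reads and writes the same SQLite files. Here’s a sketch of the relevant bits of config/puma.rb, matching what Rails 8 generates (the thread and port values are illustrative).

```ruby
# config/puma.rb
threads 3, 3
port ENV.fetch("PORT", 3000)

# Start Solid Queue under Puma's supervision, in the same container as the
# web server, so jobs and web requests share the same SQLite database files.
plugin :solid_queue if ENV["SOLID_QUEUE_IN_PUMA"]
```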
If you really want to scale up your application, though, you’ll need to do what gets called “vertical” scaling rather than the traditional “horizontal” scaling. You can’t add more VMs, because other VMs won’t be able to see your database file. Instead, you need bigger and bigger single servers, with more and more CPU cores and RAM. That’s definitely possible, though. There are servers nowadays with 128 cores, or even more, and terabytes of RAM. Unfortunately, once you have scaled your wildly popular application vertically to the moon, you’ll discover the biggest limitation of SQLite: it’s just one file on disk.
If you have dozens of processes and hundreds of threads in web servers and background job workers, all trying to write into this one database file at the same time, there’s probably going to be a lot of contention. By default, SQLite uses the filesystem to take out a lock on the entire database for each transaction. Holding the lock, it executes any read and write queries, commits, and then releases the lock. Then the next process can take the filesystem lock and do the same thing.
This can create quite the queue if even read-only queries have to wait in line and happen one at a time (because if they didn’t have the lock, some writer might sneak in and change the data mid-read!). To (partially) address this problem, SQLite offers a middle ground in the form of a write-ahead log. The WAL is an append-only file where database writes are recorded, one at a time, and SQLite later checkpoints those writes back into the actual database file. In the meantime, there can be as many readers as you want, because writes no longer have to block reads, and many simultaneous reads from the same file are quite safe.
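For completeness, here’s what turning on WAL looks like if you’re opening the file outside of Rails. Recent versions of the Rails SQLite adapter already set this (along with a busy timeout and a few other pragmas) for you, so you shouldn’t normally need it.

```ruby
# Enabling write-ahead logging by hand with the sqlite3 gem (sketch).
require "sqlite3"

db = SQLite3::Database.new("storage/production.sqlite3")
db.execute("PRAGMA journal_mode = WAL;")   # writes append to the -wal file
db.execute("PRAGMA synchronous = NORMAL;") # the usual pairing with WAL
```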
WAL mode solves the problem of only allowing a single read or write at a time, but it definitely has a cost. The database isn’t just one file anymore, it’s now a series of files, and you need to back them up and restore them together if you want to make sure you haven’t lost any data. Hopefully that’s not too much trouble, but it is definitely something to be aware of while planning your backup and disaster recovery strategy.
There’s one other approach worth calling out at this point, and that strategy is deliberately using multiple SQLite database files. If you are putting not just your model data, but also your Rails cache, your background jobs, and maybe even your uploaded files all together into a single SQLite database file, your different use-cases can start to step on one another’s toes. For example, if you go to queue a few thousand jobs, any writes from your web requests will end up waiting in the writer queue behind that whole pile of jobs.
Creating a separate SQLite file per system, or per access pattern, can help a lot with this. In Rails, the most common splits are separate SQLite databases for ActiveRecord, for the Rails cache, for background jobs, and for ActionCable. Depending on your application, it might also make sense to put your ActiveStorage blobs into a SQLite database or into the same filesystem that you are already backing up. There’s a lot of complexity and overhead involved in setting up S3 buckets with the correct permissions and getting files into and out of them, and you might just want to skip all of that in your new, simplified Rails config.
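In a Rails 8 app, most of that split is wiring the generators already produce: config/database.yml defines primary, cache, queue, and cable entries as separate SQLite files, and the production environment points each subsystem at its own database. The relevant defaults look roughly like this:

```ruby
# config/environments/production.rb (Rails 8 defaults, lightly abridged).
# The :queue name refers to an entry in config/database.yml; Solid Cache and
# Solid Cable are pointed at their own databases via config/cache.yml and
# config/cable.yml in the same way.
config.cache_store = :solid_cache_store
config.active_job.queue_adapter = :solid_queue
config.solid_queue.connects_to = { database: { writing: :queue } }
```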
Taking this approach to an extreme might even involve sharding your model data across many database files. The most extreme example of this that I’ve heard of was an application that chose to shard their model data across one SQLite database file per customer. That meant every new signup created a new SQLite database file on disk, which is in some ways absurd, but it also meant that every individual user had the full power and speed of SQLite available to them. It’s hard to have read or write contention when every user gets their own separate database!
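I haven’t run that setup myself, but the mechanics are simple enough to sketch: each customer’s models connect to that customer’s own file. All the names and paths below are made up, and a real implementation would also need per-tenant connection pooling, migrations, and teardown, which this skips.

```ruby
# An illustrative sketch of one-SQLite-file-per-customer.
class TenantRecord < ActiveRecord::Base
  self.abstract_class = true
end

module TenantSwitcher
  # Point the tenant-scoped models at this customer's own database file.
  def self.connect(slug)
    TenantRecord.establish_connection(
      adapter: "sqlite3",
      database: "/data/tenants/#{slug}.sqlite3"
    )
  end
end

# Usage: TenantSwitcher.connect("acme") makes every model that inherits from
# TenantRecord read and write /data/tenants/acme.sqlite3.
```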
So now that we’ve covered vertically scaling the Rails server itself, let’s talk about the other implications of your application running on exactly one server. The downside to there being just one server is that if that server goes down, your entire app is down. No degraded service, no slower than usual application, just… no application at all.
If you’re running in a container, it’s impossible to deploy without downtime because only one container can ever have the volume with the database mounted. The old container has to stop before the new container can start. If you’re running in a VM, you might be able to deploy without downtime by running a local reverse proxy and more than one web server process, and restarting those web server processes one at a time rather than all at once. Welcome to how we used to do things in the 2000s, and my apologies.
That said, some of the implications of only one server are good: if there’s only one, it’s pretty easy to run status checks, and it’s pretty easy to troubleshoot. You don’t need to debug connections between load balancers and dozens of web servers and database servers and redis servers and file storage servers, you just need to debug the one server. That can definitely make your job easier!
Another implication of having just one single server: there is only one place for network requests to go. As I alluded to a moment ago, the only kind of load balancing that you can do is by running a local proxy and adding multiple separate processes as backends. The server itself is only going to have one IP address and one port where it can be reached, and there’s a certain amount of scale where that one IP and one port is going to become limiting. The good news is that you probably won’t hit that scale, and if you do, you’ll probably want to stop using SQLite anyway.
If you ever want to try switching towards or away from SQLite, the sequel gem has the amazing ability to read from one database and write into another, doing a full database copy while respecting all the quirks and limitations of each database. If you want to move from MySQL or Postgres over to SQLite, or if you ever want to load a SQLite database into MySQL or Postgres, I highly recommend it. The duckdb command line tool also has excellent cross-database capabilities, and is the next thing I would try if sequel wasn’t working for me for some reason.
There’s one more limitation that we need to consider that falls out of there only being one server: your app can only run in one geographic location. Some applications can benefit from adding additional web processes (or even database read replicas) spread out closer to end users, and that’s not possible if you are limited to a maximum of one server total for your entire application.
That said, there’s nothing stopping you from using the more usual kind of CDN-based global distribution. If your application has a decent amount of static or cacheable content, you can at least still set the Cache-Control headers and run the app behind Fastly or Cloudflare.
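In Rails terms that’s mostly a couple of controller helpers. Here’s a small sketch, with the controller, model, and timings made up for illustration:

```ruby
# Mark cacheable responses as publicly cacheable so a CDN can serve them
# from the edge. Names and durations here are illustrative.
class FeedsController < ApplicationController
  def show
    @feed = Feed.find_by!(token: params[:token])
    expires_in 5.minutes, public: true  # Cache-Control: max-age=300, public
    fresh_when @feed                    # ETag/Last-Modified for revalidation
  end
end
```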
Before we wrap up, I want to make sure to cover the various backup and replication options available to you while using SQLite for your application. The absolute all-star of SQLite backup and replication is called Litestream. It’s available as a gem, and can be used as easily as setting a few environment variables and using the litestream command provided by the gem to wrap your puma or other web server.
What litestream does is fairly simple: it forwards a copy of each entry added to the write-ahead log over to any S3-compatible file store — you might even say that it streams your data in a light way. If you ever have a catastrophe, and your database file gets deleted or corrupted, the bucket will have a full copy of the WAL that you can replay to restore your database back to where it was when the server stopped working.
On AWS, this still means setting up an S3 bucket and setting the right env vars, but at least you don’t need to deal with the bucket having public access, or setting up signed uploads, or any of the other things that make S3 a huge pain. You just need a private bucket and read/write credentials, and you’re good to go. If you’re using Fly.io, you don’t even have to set the env vars yourself! They are set automatically by the command that creates the S3-compatible bucket on Tigris.
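In Ruby terms, the gem’s setup looks roughly like this. The configuration attribute names below are how I remember the gem’s generated initializer, so treat them as assumptions and double-check them against the current README.

```ruby
# config/initializers/litestream.rb (sketch; attribute names are assumptions)
Litestream.configure do |config|
  config.replica_bucket     = ENV["LITESTREAM_REPLICA_BUCKET"]
  config.replica_key_id     = ENV["LITESTREAM_ACCESS_KEY_ID"]
  config.replica_access_key = ENV["LITESTREAM_SECRET_ACCESS_KEY"]
end

# The gem also ships a Puma plugin, so replication starts and stops with
# your web server; in config/puma.rb:
#   plugin :litestream
```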
There’s one last thing that you can try using if you’re feeling especially adventurous: LiteFS. LiteFS is a fascinating software achievement, offering full MySQL- or Postgres-style replication for multiple SQLite databases running in many locations. The completely deranged trick that they use to do this is creating an entire software filesystem using FUSE, and then putting the SQLite database inside that filesystem. This gives them access to every filesystem read and write call made by your application, and allows them to create their own operations that are then sent to every other member of the cluster to be applied.
This kind of setup comes with a lot of caveats. The biggest one is the usual distributed systems kind of caveat. You’ll have stale reads where some users will see old data, and if the primary crashes you might lose some data. If you’re okay with the tradeoffs of a distributed system (and you’re okay with the idea of all of your database reads and writes going through a FUSE filesystem that might be adding extra bugs), LiteFS offers a version of the ultimate web application dream.
In the dream SQLite plus LiteFS world, you have all the advantages of SQLite and all the advantages of a fully replicated multi-writer database setup. Any individual server can go down without causing any downtime for the application as a whole, and every user has a full copy of the application and all its data, running extremely close to them.
I haven’t built that perfect system yet, but it feels more attainable than it ever has before thanks to SQLite.