Sunday 21 December 2008

MySQL running out of disc space

Running out of disc space is not a good situation. However, if it does happen, it would be nice to have some control over what happens.

We use MyISAM. When you run out of disc space, MyISAM just sits there and waits. And waits, and waits, apparently forever, for some space to become available.

This is not good, because an auditing/logging application (which ours is) may have lots of available servers which it could send its data to - getting an error from one would simply mean that the data could be audited elsewhere.

But if the server just hangs, and waits, the application isn't (currently) smart enough to give up and try another server, so it hangs the audit process too. Which means that audit data starts to back up, and customers wonder why they can't see recent data in their reports etc.

There has to be a better way. I propose:
  • A background thread monitors the disc space level every few seconds
  • When it falls below a critical level (still more than can reasonably be filled up in a few seconds), force the server to become read-only
  • When in this mode, modifications to the data fail quickly, with an error code which tells the process (or developer) exactly what the problem is (Out of disc space)
  • When the free disc space rises back above some threshold, the read-only mode is turned back off.
That way, clients get what they expect - either quick service for inserts, or a fast error telling them what's wrong (Go away, I'm full, audit your data somewhere else)
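
The proposal above can be sketched roughly as follows. This is only an illustration in Python, not how MySQL or any storage engine does it: the names next_read_only and watch, the two thresholds and the apply_state callback are all made up for the example. Using two thresholds (a low one to enter read-only mode, a higher one to leave it) stops the server flapping in and out of read-only mode when free space hovers around a single limit.

```python
import shutil
import time

# Hypothetical thresholds, chosen only for illustration.
LOW_WATER_BYTES = 1 * 1024**3   # go read-only below 1 GiB free
HIGH_WATER_BYTES = 2 * 1024**3  # allow writes again above 2 GiB free


def next_read_only(read_only, free_bytes,
                   low=LOW_WATER_BYTES, high=HIGH_WATER_BYTES):
    """Return the new read-only state for the given amount of free space."""
    if not read_only and free_bytes < low:
        return True    # critically low: start refusing writes
    if read_only and free_bytes > high:
        return False   # comfortably recovered: accept writes again
    return read_only   # in between: keep the current state


def watch(path, apply_state, interval=5.0, cycles=None,
          low=LOW_WATER_BYTES, high=HIGH_WATER_BYTES):
    """Poll free space on `path` and call apply_state(bool) on each change.

    apply_state is a hypothetical hook that would switch the server
    into or out of read-only mode. cycles=None means poll forever.
    """
    read_only = False
    done = 0
    while cycles is None or done < cycles:
        free = shutil.disk_usage(path).free
        new_state = next_read_only(read_only, free, low, high)
        if new_state != read_only:
            apply_state(new_state)
            read_only = new_state
        done += 1
        time.sleep(interval)
    return read_only
```

In a real server the check would run in a background thread and apply_state would flip whatever flag makes writes fail fast with an "out of disc space" error.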

Drizzle etc. should do this.

Or perhaps, it's a job for the storage engine?

Happy Christmas.

7 comments:

Anonymous said...

How often are you actually running out of disk space? This should happen, well, never :) But if it does, it should be once in a blue moon. Any more often and you have a systemic problem that needs to be addressed.

MySQL does not handle running out of disk space well and you risk corrupting both your MyISAM and InnoDB tablespaces (if you use InnoDB). That turns a bad day into a worse one.

As a general rule, it's wise to keep at least 10% of disk space free all the time. Any less and you can run into performance issues, fragmentation, etc. You also lose the safety margin that lets you make careful, calculated decisions about how to fix the issue. If you run totally out of space, your site is down and you have to make decisions fast, including some that should be given careful thought.

Point is - the most important thing to do is fix the root problem. If you have tables which can be read only, or can be split out (say by date), compress the ones you don't need to write to using 'myisampack' or consider converting them to the ARCHIVE engine (if you can handle lack of indexes anyway). 5.1 supports partitioning to help with this as well and you could even look at the InnoDB plugin or Percona's InnoDB fork called XtraDB. Both support on-the-fly compression.

You could also look at moving your data to another disk. Though I find it gacky, you can create symlinks to move your .MYD and .MYI files to another drive. You can also do this with the database directories.

Moving to a large RAID (you are using RAID, right? :) would help too and, in an extreme case, you could split your reporting data onto a dedicated reporting server.

Now having said all that, I agree that you should be monitoring disk usage. Nagios, Nimbus, home brew scripts, etc. can help with that. Drizzle *could* do this but one of its goals is to make the database simpler, not more complex, and this is really more of a system administration issue than a DB one.

That said, you can also have the same scripts that monitor disk usage react to it, simply by logging in to MySQL/Drizzle and running "SET GLOBAL read_only=1;" when space runs low, then turning it back off once the disk usage has been solved.
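
As a rough sketch of that idea for MySQL, a monitoring script could shell out to the standard mysql command-line client. The function names here are hypothetical, and the sketch assumes the client is on the PATH and picks up credentials from an option file such as ~/.my.cnf. One caveat: in MySQL, read_only does not stop accounts with the SUPER privilege from writing, so the application should connect as an ordinary user.

```python
import subprocess


def read_only_command(enabled, mysql_bin="mysql"):
    """Build the mysql client command line that toggles read_only."""
    value = 1 if enabled else 0
    return [mysql_bin, "--execute", f"SET GLOBAL read_only = {value};"]


def set_read_only(enabled, run=subprocess.run):
    """Run the command; raises CalledProcessError if the client fails.

    `run` is injectable so the function can be exercised without a server.
    """
    return run(read_only_command(enabled), check=True)
```

The monitoring script would call set_read_only(True) when free space drops below its alert level and set_read_only(False) once space has been freed.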

MySQL should also throw errors, I agree with you here as well, or even shut down if the issue is bad enough to cause table-corruption issues. That, too, is scriptable, but it would be nice if MySQL didn't try to trudge ahead and write to the disk when it's totally full :)

Mark Robson said...

How often do we run out of space? In production, never, as space is managed properly and monitored vigilantly.

In non-production environments: Sometimes, but then it doesn't always matter too much.

Thank you for your tips on saving space, although that wasn't quite what I had in mind :)

Craig Gagne said...
This comment has been removed by the author.
Craig Gagne said...

Are you the same Mark Robson that developed Gencontrol?

Mark Robson said...

Yes I am.

Craig Gagne said...

Great! I so need help (but that's a different conversation). I am in need of the C++ source for the app but the gensortium site seems down. I am trying to make Gencontrol be view only as we just need to monitor what our users are doing for quality control issues. Any help you can offer would be so greatly appreciated!

Craig

Ryan Carver said...

I am also interested in the source code for Gencontrol. Do you have a live link for this tool? I use it and would like to post a link to your download page.