• Suicide
    • having no backups
    • depending on slaves for backup
    • keeping backups on same SAN
    • having a single DBA – Frank didn't like this one at all
    • not keeping binlogs
  • Restoring from backup
    • how much time?
    • uncompressed backup ready to mount?
    • separate network for recovery?
  • At Fotolog, 1TB of data was severely hit by a failure.
    • first problem: backup was highly compressed (tar.gz)
    • uncompressing it took hours
    • so keep uncompressed backups (at least for the last N days)
    • a backup should be mountable in place, rather than something that has to be transferred first
  • Frank goes over the InnoDB forced recovery modes (innodb_force_recovery) documented at http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
  • Row by row recovery
    • get the range of ids, then fetch the rows one at a time
    • custom scripts
    • may not be able to use the primary key
    • retrieval based on a foreign key is faster
    • each record that crashes mysqld costs 4 seconds (at Fotolog, for some reason some values were crashing mysqld)
  • Lessons
    • SANs make sense (in some environments)
    • try to replicate the whole SAN (at Fotolog, a SAN actually failed because of a bug in its maintenance program)
    • everything will fail at some point
    • back up everything (cron jobs, my.cnf, custom scripts)
    • keep backups in a form that is ready to restore
    • don't count replication as a backup
    • be worried about 'routine' operations
  • Peter Zaitsev of Percona takes the stage to talk about his homegrown tools for InnoDB recovery
    • innodb-tools – recovers data even if mysqld doesn't start, for example if half of a RAID0 array fails, and also when somebody has deleted some data. It recovers from the InnoDB tablespaces.
  • We're out of time
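
To make the row by row recovery idea from the notes above a bit more concrete, here is a minimal sketch of what such a custom script might look like. It is not Frank's actual script. The names recover_rows, fetch_row and on_failure are hypothetical, and the database access is deliberately left behind a callable so that the sketch does not assume any particular MySQL driver. The demo at the bottom uses a fake in-memory table where two ids stand in for the values that were crashing mysqld.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Row = Tuple  # one recovered row, in whatever shape fetch_row returns


def recover_rows(
    ids: Iterable[int],
    fetch_row: Callable[[int], Optional[Row]],
    on_failure: Callable[[int, Exception], None] = lambda row_id, exc: None,
) -> Iterator[Row]:
    """Yield every row that can be read; skip ids whose fetch fails."""
    for row_id in ids:
        try:
            row = fetch_row(row_id)
        except Exception as exc:
            # Assumption: a failed fetch means this row crashed the server.
            # Record the id and move on to the next one.
            on_failure(row_id, exc)
            continue
        if row is not None:  # None means there is no row with this id
            yield row


if __name__ == "__main__":
    # Stand-in for a damaged table: ids 3 and 7 "crash the server".
    table = {i: (i, f"photo-{i}") for i in range(1, 11)}
    bad_ids = {3, 7}

    def fake_fetch(row_id: int) -> Optional[Row]:
        if row_id in bad_ids:
            raise ConnectionError("server went away")
        return table.get(row_id)

    skipped = []
    recovered = list(
        recover_rows(range(1, 13), fake_fetch,
                     on_failure=lambda i, e: skipped.append(i))
    )
    print(len(recovered), "rows recovered; skipped ids:", skipped)
```

In a real recovery, fetch_row would presumably run a single-row SELECT through whatever MySQL client library is in use, and it would have to wait for mysqld to come back after a crash before continuing, which is where the 4 seconds per crashed record would go. Keeping the list of skipped ids makes it possible to retry them later or to report exactly which rows were lost.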
● ● ●
Artem Russakovskii is a San Francisco programmer and blogger. Follow Artem on Twitter (@ArtemR) or subscribe to the RSS feed.
