					MySQL Conference Liveblogging: Optimizing MySQL For High Volume Data Logging Applications (Thursday 2:50PM)
Posted by Artem Russakovskii on April 17th, 2008 in Databases 
- http://en.oreilly.com/mysql2008/public/schedule/detail/874
- presented by Charles Lee of Hyperic
- out of MySQL, Oracle, and Postgres, Hyperic sees the best performance with MySQL in their application
- I suddenly remember Hyperic was highly recommended over Nagios in MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)
- performance bottleneck
- the database
- CPU
- memory
- IO
- disk latency
- network latency
- slow queries
- medium size deployment example
- 300 platforms (300 remote agents collecting data)
- 2,100 servers
- 21,000 services (10 services per server), sounds feasible
- 468,000 metrics (20 metrics per service)
- 28,800,000 metric data rows per day
- larger deployments have a lot more of these (sounds crazy)
- data
- measurement_id
- timestamp
- value
- primary key (timestamp, measurement_id)
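As an aside, here's a minimal sketch of what such a table might look like in MySQL; the table and column names are my guesses, not Hyperic's actual schema. Note that with InnoDB the primary key also determines the physical row order (the clustered index, which comes up again below):

    CREATE TABLE metric_data (
        measurement_id INT UNSIGNED NOT NULL,    -- which metric this point belongs to
        timestamp      BIGINT NOT NULL,          -- collection time (epoch millis)
        value          DOUBLE NOT NULL,          -- the measured value
        PRIMARY KEY (timestamp, measurement_id)  -- timestamp first: rows arrive in time order
    ) ENGINE=InnoDB;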
- data flow
- agent collects data and sends reports to server with multiple data points
- server batch inserts metric data points
- if the network connection fails, the agent continues to collect, but the server "backfills" the missing data points as unavailable
- when agent reconnects, spooled data overwrite backfilled data points (why not use REPLACE for all inserts?)
- things are very basic so far
- batch insert
- INSERT INTO table_name (a,b,c) VALUES (0,0,0), (1,1,1), …
- using MySQL batch insert statements, vs. prepared statements with multiple queries (as in other databases), seems to improve overall performance by 30%
- batch inserts are limited by 'max_allowed_packet'
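To make the contrast concrete, here's a sketch against the hypothetical metric_data table from above: many single-row statements vs. one multi-row statement (whose total size is capped by max_allowed_packet):

    -- one round trip and one statement per data point:
    INSERT INTO metric_data (measurement_id, timestamp, value) VALUES (17, 1208465400000, 0.42);
    INSERT INTO metric_data (measurement_id, timestamp, value) VALUES (18, 1208465400000, 0.97);

    -- a single batch insert for the whole agent report:
    INSERT INTO metric_data (measurement_id, timestamp, value)
    VALUES (17, 1208465400000, 0.42),
           (18, 1208465400000, 0.97),
           (19, 1208465400000, 12.5);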
- other options for increasing insert speed
- set unique_checks=0, insert, set unique_checks=1 (definitely need to make sure data is valid first)
- set foreign_key_checks=0, insert, set foreign_key_checks=1 (same concerns as above)
- Hyperic doesn't use the 2 above
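For reference, the toggles are plain session variables; a minimal sketch (again: only safe if the data is known to be valid beforehand):

    SET unique_checks = 0;
    SET foreign_key_checks = 0;
    -- ... bulk INSERTs go here ...
    SET unique_checks = 1;
    SET foreign_key_checks = 1;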
- INSERT … ON DUPLICATE KEY UPDATE
- when regular INSERT fails, retry batch with INSERT ON DUPLICATE KEY syntax
- it's much slower, but it allows the batch to succeed when some rows already exist (e.g. backfilled data points that need overwriting)
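Against the hypothetical metric_data table, the retry would look something like this; rows that already exist (e.g. backfilled points) get their value overwritten instead of aborting the whole batch:

    INSERT INTO metric_data (measurement_id, timestamp, value)
    VALUES (17, 1208465400000, 0.42),
           (18, 1208465400000, 0.97)
    ON DUPLICATE KEY UPDATE value = VALUES(value);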
- this is all basic, where are the performance tweaks?!
- batch aggregate inserter
- queue metric data from separate agent reports
- minimize number of inserts, connections, CPU load
- maximize workload efficiency
- optimal configuration for 700 agents
- 3 workers
- 2000 batch size seems to work best
- queue size of 4,000,000
- this seems to peak at 2.2 million metric data inserts per minute
- data consolidation
- inspired by rrdtool
- lower resolution tables track min, avg, and max
- data compression runs hourly
- size limit 2 days
- every hour, data is rolled up into another table that holds hourly aggregated values (size limit 14 days); that one in turn gets rolled up into a monthly table, etc. (see the sketch after this list)
- this is a good approach if you don't care about keeping every individual data point
- I'm overwhelmed by the number of "you know"s from the speaker. Filler words, ahh! Sorry Charles 🙂
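Conceptually, the hourly rollup could be a single INSERT … SELECT; this is my guess at its shape (metric_data_1h and its columns are hypothetical, not Hyperic's actual code):

    -- roll raw points up into hourly min/avg/max buckets
    INSERT INTO metric_data_1h (measurement_id, timestamp, min_value, avg_value, max_value)
    SELECT measurement_id,
           timestamp - (timestamp % 3600000),  -- truncate millis down to the hour
           MIN(value), AVG(value), MAX(value)
    FROM metric_data
    WHERE timestamp >= ? AND timestamp < ?     -- the hour being compressed
    GROUP BY measurement_id, timestamp - (timestamp % 3600000);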
- software partitioning
- measurement data split into 18 tables, representing 9 days (2 per day)
- they didn't want to do more than 2 SELECTs per day's worth of data, hence this sharding scheme
- oddly, Charles didn't actually use the word 'shard' once
- tables are truncated rather than having rows deleted => huge performance boost
- truncation vs deletion
- deletion causes contention on rows
- truncation doesn't produce fragmentation
- truncation just drops and recreates the table – single DDL operation
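In other words, expiring a slot of old data becomes one cheap DDL statement instead of a massive row-by-row purge (the slot table name is my invention):

    -- slow: scans the table, delete-marks rows one by one, fragments pages
    DELETE FROM metric_data_slot_00 WHERE timestamp < 1208000000000;

    -- fast: drops and recreates the table in a single DDL operation
    TRUNCATE TABLE metric_data_slot_00;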
- indexes
- every InnoDB table has a special index called the clustered index (based on primary key) where the physical data for the rows is stored
- advantages
- selects faster – row data is on the same page where the index search leads
- inserts in (timestamp) order – avoid page splits and fragmentation
- shows comparison between non-clustered index and clustered index (see slides)
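A typical query that benefits, assuming the hypothetical metric_data schema above: because rows are physically stored in (timestamp, measurement_id) order, a time-range scan reads contiguous pages:

    SELECT measurement_id, value
    FROM metric_data
    WHERE timestamp BETWEEN 1208466000000 AND 1208469600000;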
- still no mention of configuration tweaks
- UNION ALL works better than inner SELECTs (subqueries) because the optimizer didn't optimize them well enough (at least in the MySQL version these guys are using, not sure which)
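Presumably this means stitching the per-slot tables together when reading a time range; a sketch under that assumption (slot table names are, again, my invention):

    SELECT timestamp, value FROM metric_data_slot_08
     WHERE measurement_id = 17 AND timestamp BETWEEN ? AND ?
    UNION ALL
    SELECT timestamp, value FROM metric_data_slot_09
     WHERE measurement_id = 17 AND timestamp BETWEEN ? AND ?
    ORDER BY timestamp;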
- recommended server options are on the very last slide, I was waiting for those the most! I guess I'll look up the slides after