<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>beer planet &#187; logging</title> <atom:link href="http://beerpla.net/tag/logging/feed/" rel="self" type="application/rss+xml" /><link>http://beerpla.net</link> <description>where things have nothing to do with beer - tutorials, tips, how-tos, thoughts, hacks, and other techy nonsense</description> <lastBuildDate>Fri, 06 Jan 2012 08:50:59 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <atom:link rel='hub' href='http://beerpla.net/?pushpress=hub'/> <item><title>MySQL Conference Liveblogging: Optimizing MySQL For High Volume Data Logging Applications (Thursday 2:50PM)</title><link>http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/</link> <comments>http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/#comments</comments> <pubDate>Thu, 17 Apr 2008 21:56:06 +0000</pubDate> <dc:creator>Artem Russakovskii</dc:creator> <category><![CDATA[Databases]]></category> <category><![CDATA[application]]></category> <category><![CDATA[conference]]></category> <category><![CDATA[high volume]]></category> <category><![CDATA[logging]]></category> <category><![CDATA[MySQL]]></category> <category><![CDATA[optimize]]></category> <category><![CDATA[scale]]></category> <guid
isPermaLink="false">http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/</guid> <description><![CDATA[<ul><li><a
title="http://en.oreilly.com/mysql2008/public/schedule/detail/874" href="http://en.oreilly.com/mysql2008/public/schedule/detail/874">http://en.oreilly.com/mysql2008/public/schedule/detail/874</a></li><li>presented by <a
href="http://en.oreilly.com/mysql2008/public/schedule/speaker/1287">Charles Lee</a> of <a
href="http://hyperic.com/">Hyperic</a></li><li>Hyperic&#039;s application performs best on MySQL, compared with Oracle and Postgres</li><li><em>I suddenly remember Hyperic was highly recommended over Nagios in </em><a
href="http://beerpla.net/2008/04/16/mysql-conference-liveblogging-monitoring-tools-wednesday-515pm/"><em>MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)</em></a></li><li>performance bottleneck</li><ul><li>the database</li><ul><li>CPU</li><li>memory</li></ul><li>IO</li><ul><li>disk latency</li><li>network latency</li></ul><li>slow queries</li></ul><li>medium-size deployment example</li><ul><li>300 platforms (300 remote agents collecting data)</li><li>2,100 servers</li><li>21,000 services (10 services per server), <em>sounds feasible</em></li><li>468,000 metrics (20 metrics per service)</li><li>28,800,000 metric data rows per day</li><li>larger deployments have a lot more of these (<em>sounds crazy</em>)</li></ul><li>data</li><ul><li>measurement_id</li><li>timestamp</li><li>value</li><li>primary key (timestamp, measurement_id)</li></ul><li>data flow</li></ul><p>...<div
class=clear></div> <a
href="http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/" class="read_more"><div
class=excerpt-end>Read the rest of this article &#187;</div></a></p>]]></description> <content:encoded><![CDATA[<ul><li><a
title="http://en.oreilly.com/mysql2008/public/schedule/detail/874" href="http://en.oreilly.com/mysql2008/public/schedule/detail/874">http://en.oreilly.com/mysql2008/public/schedule/detail/874</a></li><li>presented by <a
href="http://en.oreilly.com/mysql2008/public/schedule/speaker/1287">Charles Lee</a> of <a
href="http://hyperic.com/">Hyperic</a></li><li>Hyperic&#039;s application performs best on MySQL, compared with Oracle and Postgres</li><li><em>I suddenly remember Hyperic was highly recommended over Nagios in </em><a
href="http://beerpla.net/2008/04/16/mysql-conference-liveblogging-monitoring-tools-wednesday-515pm/"><em>MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)</em></a></li><li>performance bottleneck</li><ul><li>the database</li><ul><li>CPU</li><li>memory</li></ul><li>IO</li><ul><li>disk latency</li><li>network latency</li></ul><li>slow queries</li></ul><li>medium-size deployment example</li><ul><li>300 platforms (300 remote agents collecting data)</li><li>2,100 servers</li><li>21,000 services (10 services per server), <em>sounds feasible</em></li><li>468,000 metrics (20 metrics per service)</li><li>28,800,000 metric data rows per day</li><li>larger deployments have a lot more of these (<em>sounds crazy</em>)</li></ul><li>data</li><ul><li>measurement_id</li><li>timestamp</li><li>value</li><li>primary key (timestamp, measurement_id)</li></ul><li>data flow</li><ul><li>the agent collects data and sends reports containing multiple data points to the server</li><li>the server batch inserts the metric data points</li><li>if the network connection fails, the agent keeps collecting while the server &#034;backfills&#034; the missing data points as unavailable</li><li>when the agent reconnects, its spooled data overwrites the backfilled data points (<em>why not use REPLACE for all inserts?</em>)</li></ul><li><em>things are very basic so far</em></li><li>batch insert</li><ul><li>INSERT INTO tbl (a,b,c) VALUES (0,0,0), (1,1,1), &#8230;</li><li>MySQL batch insert statements, compared with the prepared statements with multiple queries used for other databases, seem to improve overall performance by 30%</li><li>batch inserts are limited by &#039;max_allowed_packet&#039;</li></ul><li>other options for increasing insert speed</li><ul><li>set unique_checks=0, insert, set unique_checks=1 (<em>definitely need to make sure data is valid first</em>)</li><li>set foreign_key_checks=0, insert, set foreign_key_checks=1 (<em>same concerns as above</em>)</li><li>Hyperic doesn&#039;t use either of the 2 options above</li></ul><li>INSERT &#8230; ON DUPLICATE KEY UPDATE</li><ul><li>when a
regular INSERT fails, retry the batch with INSERT &#8230; ON DUPLICATE KEY UPDATE syntax</li><li>it&#039;s much slower, but it allows existing rows to be overwritten</li></ul><li><em>this is all basic, where are the performance tweaks?!</em></li><li>batch aggregate inserter</li><ul><li>queue metric data from separate agent reports</li><ul><li>minimize number of inserts, connections, CPU load</li><li>maximize workload efficiency</li></ul><li>optimal configuration for 700 agents</li><ul><li>3 workers</li><li>batch size of 2,000 seems to work best</li><li>queue size of 4,000,000</li></ul><li>this seems to peak at 2.2 million metric data inserts per minute</li></ul><li>data consolidation</li><ul><li>inspired by rrdtool</li><li>lower resolution tables track min, avg, and max</li><li>data compression runs hourly</li><li>size limit 2 days</li><li>every hour, data is rolled up into another table that holds hourly aggregated values with a size limit of 14 days, then that one gets rolled up into a monthly table, etc.</li><li><em>this is a good approach if you don&#039;t care about each data point</em></li></ul><li><em>I&#039;m overwhelmed by the number of &#034;you know&#034;s from the speaker. Parasite words, ahh! Sorry Charles <img
src='http://beerpla.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </em></li><li>software partitioning</li><ul><li>measurement data is split into 18 tables covering 9 days (2 tables per day)</li><li>they didn&#039;t want to do more than 2 SELECTs to get a day&#039;s data, hence such sharding</li><li><em>oddly, Charles didn&#039;t actually use the word &#039;shard&#039; once</em></li><li>tables are truncated instead of having their rows deleted =&gt; huge performance boost</li><li>truncation vs deletion</li><ul><li>deletion causes contention on rows</li><li>truncation doesn&#039;t produce fragmentation</li><li>truncation just drops and recreates the table &#8211; a single DDL operation</li></ul></ul><li>indexes</li><ul><li>every <strong>InnoDB</strong> table has a special index called the <strong>clustered index</strong> (based on the primary key) where the physical data for the rows is stored</li><li>advantages</li><ul><li>selects are faster &#8211; row data is on the same page where the index search leads</li><li>inserts arrive in (timestamp) order &#8211; avoids page splits and fragmentation</li></ul><li>shows comparison between non-clustered index and clustered index (see slides)</li></ul><li><em>still no mention of configuration tweaks</em></li><li>UNION ALL works better than inner SELECTs because the optimizer didn&#039;t optimize them enough (at least in the version these guys are using, not sure which)</li><li><em>recommended server options are on the very last slide, I was waiting for those the most! I guess I&#039;ll look up the slides afterwards</em></li></ul><div
class="shr-bookmarks shr-bookmarks-expand"></div> Similar Posts:<ul><li><a
href="http://beerpla.net/2009/05/11/mysql-deletingupdating-rows-common-to-2-tables-speed-and-slave-lag-considerations/" rel="bookmark" title="May 11, 2009">[MySQL] Deleting/Updating Rows Common To 2 Tables &#8211; Speed And Slave Lag Considerations</a></li><li><a
href="http://beerpla.net/2008/04/15/mysql-conference-liveblogging-explain-demystified-tuesday-200p/" rel="bookmark" title="April 15, 2008">MySQL Conference Liveblogging: EXPLAIN Demystified (Tuesday 2:00PM)</a></li><li><a
href="http://beerpla.net/2008/04/15/mysql-conference-liveblogging-performance-guide-for-mysql-cluster-tuesday-1050am/" rel="bookmark" title="April 15, 2008">MySQL Conference Liveblogging: Performance Guide For MySQL Cluster (Tuesday 10:50AM)</a></li><li><a
href="http://beerpla.net/2009/02/17/swapping-column-values-in-mysql/" rel="bookmark" title="February 17, 2009">Swapping Column Values in MySQL</a></li><li><a
href="http://beerpla.net/2009/03/18/mysql-indexing-considerations-of-implementing-a-priority-field-in-your-application/" rel="bookmark" title="March 18, 2009">MySQL Indexing Considerations Of Implementing A Priority Field In Your Application</a></li></ul>]]></content:encoded> <wfw:commentRss>http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> </channel> </rss>
