<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>beer planet &#187; optimize</title> <atom:link href="http://beerpla.net/tag/optimize/feed/" rel="self" type="application/rss+xml" /><link>http://beerpla.net</link> <description>where things have nothing to do with beer - tutorials, tips, how-tos, thoughts, hacks, and other techy nonsense</description> <lastBuildDate>Thu, 17 May 2012 22:50:53 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.2</generator> <atom:link rel='hub' href='http://beerpla.net/?pushpress=hub'/> <item><title>[Solr] How To Fix java.io.IOException: directory FOO exists and is a directory, but cannot be listed: list() returned null</title><link>http://beerpla.net/2009/09/21/solr-how-to-fix-java-io-ioexception-directory-foo-exists-and-is-a-directory-but-cannot-be-listed-list-returned-null/</link> <comments>http://beerpla.net/2009/09/21/solr-how-to-fix-java-io-ioexception-directory-foo-exists-and-is-a-directory-but-cannot-be-listed-list-returned-null/#comments</comments> <pubDate>Mon, 21 Sep 2009 19:47:53 +0000</pubDate> <dc:creator>Artem Russakovskii</dc:creator> <category><![CDATA[Linux]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[Solr]]></category> <category><![CDATA[commit]]></category> <category><![CDATA[directory]]></category> <category><![CDATA[exception]]></category> <category><![CDATA[exists]]></category> <category><![CDATA[Java]]></category> <category><![CDATA[limit]]></category> <category><![CDATA[list]]></category> <category><![CDATA[null]]></category> <category><![CDATA[open file]]></category> <category><![CDATA[optimize]]></category> <category><![CDATA[ulimit]]></category> <guid
isPermaLink="false">http://beerpla.net/2009/09/21/solr-how-to-fix-java-io-ioexception-directory-foo-exists-and-is-a-directory-but-cannot-be-listed-list-returned-null/</guid> <description><![CDATA[<h2>The Problem</h2><p>I am throwing up a quick post about a relatively cryptic error that Solr started throwing the other day here at Plaxo. After happily running for a few days, I suddenly started getting pages about failed Solr indexing.</p><p>Upon closer examination, I saw the following repeatedly in the log file:</p><div
class="wp_syntax"><div
class="code"><pre>catalina.2009-09-18.log:SEVERE: java.io.IOException: directory 'DATADIR/index'
exists and is a directory, but cannot be listed: list() returned null</pre></div></div><p>I tried to see if sending an <a
href="http://www.google.com/search?q=site:wiki.apache.org+solr+optimize" rel="nofollow">OPTIMIZE</a> command would help, but the server returned the same response.</p><h2>Digging Deeper</h2><p>The reason for these errors was quite simple &#8211; Solr was running into the system-level limit on the allowed number of open files (ulimit). This limit can be seen by running</p><div
class="wp_syntax"><div
class="code"><pre>ulimit </pre></div>...<div
class=clear></div> <a
href="http://beerpla.net/2009/09/21/solr-how-to-fix-java-io-ioexception-directory-foo-exists-and-is-a-directory-but-cannot-be-listed-list-returned-null/" class="read_more"><div
class=excerpt-end>Read the rest of this article &#187;</div></a></div>]]></description> <content:encoded><![CDATA[<h2>The Problem</h2><p>I am throwing up a quick post about a relatively cryptic error that Solr started throwing the other day here at Plaxo. After happily running for a few days, I suddenly started getting pages about failed Solr indexing.</p><p>Upon closer examination, I saw the following repeatedly in the log file:</p><div
class="wp_syntax"><div
class="code"><pre>catalina.2009-09-18.log:SEVERE: java.io.IOException: directory 'DATADIR/index'
exists and is a directory, but cannot be listed: list() returned null</pre></div></div><p>I tried to see if sending an <a
href="http://www.google.com/search?q=site:wiki.apache.org+solr+optimize" rel="nofollow">OPTIMIZE</a> command would help, but the server returned the same response.</p><h2>Digging Deeper</h2><p>The reason for these errors was quite simple &#8211; Solr was running into the system-level limit on the allowed number of open files (ulimit). This limit can be seen by running</p><div
class="wp_syntax"><div
class="code"><pre>ulimit -n
1024</pre></div></div><p>or simply</p><div
class="wp_syntax"><div
class="code"><pre>ulimit -a | grep 'open files'
open files                      (-n) 1024</pre></div></div><p>This means that once a process has that many files open at the same time, the kernel refuses to let it open any more, which in my case caused the Java IOException.</p><p>I hadn&#039;t been using the Solr OPTIMIZE command for a while, so after a lot of <a
href="http://www.google.com/search?q=site:wiki.apache.org+solr+commit" rel="nofollow">COMMIT</a>s, the Solr index got pretty fragmented into many small files, thus hitting the open files limit.</p><h2>The Solution</h2><p>There are 2 things to be done here:</p><ol><li>OPTIMIZE Solr more often. When an OPTIMIZE occurs, multiple index files are merged into one, thus reducing the number of files that need to be opened. However, before you can OPTIMIZE, you have to raise the allowed number of open files (see the next item).</li><li>Set a higher open files limit for the user that runs Solr (in my case, the <strong><em>solr</em></strong> user) &#8211; for example, to 4096 instead of 1024. One way to do it is by adding a file /etc/security/limits.d/solr.conf with the following contents:<p></p><div
class="wp_syntax"><div
class="code"><pre>solr hard nofile 4096
solr soft nofile 4096</pre></div></div><p>and then logging out and back in as that user and restarting Solr, since a running process keeps the limits it started with. The file should be loaded automatically at login, which you can verify by running the ulimit commands from the section above.</p></li></ol><p>Happy Solring!</p><p>By the way, here&#039;s a really good resource for Solr 1.4 that just came out: <a
href="http://www.amazon.com/dp/1847195881/?tag=beepla-20">Solr 1.4 Enterprise Search</a>. I have this book and it&#039;s quite helpful in explaining such topics as multicore setup, search methods, replication, etc.</p> Similar Posts:<ul><li><a
href="http://beerpla.net/2010/03/06/how-to-show-hiddeninvisible-files-in-total-commander-both-locally-and-on-an-ftp-server/" rel="bookmark" title="March 6, 2010">How To Show Hidden/Invisible Files In Total Commander, Both Locally And On An FTP Server</a></li><li><a
href="http://beerpla.net/2009/09/03/comparison-between-solr-and-sphinx-search-servers-solr-vs-sphinx-fight/" rel="bookmark" title="September 3, 2009">Comparison Between Solr And Sphinx Search Servers (Solr Vs Sphinx &#8211; Fight!)</a></li><li><a
href="http://beerpla.net/2007/08/04/watch-a-useful-linux-command-you-may-have-never-heard-of/" rel="bookmark" title="August 4, 2007">Watch &#8211; A Useful Linux Command You May Have Never Heard Of</a></li><li><a
href="http://beerpla.net/2009/04/08/perl-finding-files-the-fun-and-elegant-way/" rel="bookmark" title="April 8, 2009">[Perl] Finding Files, The Fun And Elegant Way</a></li><li><a
href="http://beerpla.net/2008/10/11/how-to-sort-folders-the-same-way-as-files-in-total-commander/" rel="bookmark" title="October 11, 2008">How To Sort Folders The Same Way As Files In Total Commander</a></li></ul>]]></content:encoded> <wfw:commentRss>http://beerpla.net/2009/09/21/solr-how-to-fix-java-io-ioexception-directory-foo-exists-and-is-a-directory-but-cannot-be-listed-list-returned-null/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>MySQL Conference Liveblogging: Optimizing MySQL For High Volume Data Logging Applications (Thursday 2:50PM)</title><link>http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/</link> <comments>http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/#comments</comments> <pubDate>Thu, 17 Apr 2008 21:56:06 +0000</pubDate> <dc:creator>Artem Russakovskii</dc:creator> <category><![CDATA[Databases]]></category> <category><![CDATA[application]]></category> <category><![CDATA[conference]]></category> <category><![CDATA[high volume]]></category> <category><![CDATA[logging]]></category> <category><![CDATA[MySQL]]></category> <category><![CDATA[optimize]]></category> <category><![CDATA[scale]]></category> <guid
isPermaLink="false">http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/</guid> <description><![CDATA[<ul><li><a
title="http://en.oreilly.com/mysql2008/public/schedule/detail/874" href="http://en.oreilly.com/mysql2008/public/schedule/detail/874">http://en.oreilly.com/mysql2008/public/schedule/detail/874</a></li><li>presented by <a
href="http://en.oreilly.com/mysql2008/public/schedule/speaker/1287">Charles Lee</a> of <a
href="http://hyperic.com/">Hyperic</a></li><li>Hyperic has the best performance with MySQL out of MySQL, Oracle, and Postgres in their application</li><li><em>I suddenly remember hyperic was highly recommended above nagios in </em><a
href="http://beerpla.net/2008/04/16/mysql-conference-liveblogging-monitoring-tools-wednesday-515pm/"><em>MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)</em></a></li><li>performance bottleneck</li><ul><li>the database</li><ul><li>CPU</li><li>memory</li></ul><li>IO</li><ul><li>disk latency</li><li>network latency</li></ul><li>slow queries</li></ul><li>medium size deployment example</li><ul><li>300 platforms (300 remote agents collecting data)</li><li>2,100 servers</li><li>21,000 services (10 services per server), <em>sounds feasible</em></li><li>468,000 metrics (20 metrics per service)</li><li>28,800,000 metric data rows per day</li><li>larger deployments have a lot more of these (<em>sounds crazy</em>)</li></ul><li>data</li><ul><li>measurement_id</li><li>timestamp</li><li>value</li><li>primary key (timestamp, measurement_id)</li></ul><li>data flow</li><ul><li>agent collects data and sends reports to server with multiple data points</li></ul>...<div
class=clear></div> <a
href="http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/" class="read_more"><div
class=excerpt-end>Read the rest of this article &#187;</div></a></ul>]]></description> <content:encoded><![CDATA[<ul><li><a
title="http://en.oreilly.com/mysql2008/public/schedule/detail/874" href="http://en.oreilly.com/mysql2008/public/schedule/detail/874">http://en.oreilly.com/mysql2008/public/schedule/detail/874</a></li><li>presented by <a
href="http://en.oreilly.com/mysql2008/public/schedule/speaker/1287">Charles Lee</a> of <a
href="http://hyperic.com/">Hyperic</a></li><li>Hyperic has the best performance with MySQL out of MySQL, Oracle, and Postgres in their application</li><li><em>I suddenly remember hyperic was highly recommended above nagios in </em><a
href="http://beerpla.net/2008/04/16/mysql-conference-liveblogging-monitoring-tools-wednesday-515pm/"><em>MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)</em></a></li><li>performance bottleneck</li><ul><li>the database</li><ul><li>CPU</li><li>memory</li></ul><li>IO</li><ul><li>disk latency</li><li>network latency</li></ul><li>slow queries</li></ul><li>medium size deployment example</li><ul><li>300 platforms (300 remote agents collecting data)</li><li>2,100 servers</li><li>21,000 services (10 services per server), <em>sounds feasible</em></li><li>468,000 metrics (20 metrics per service)</li><li>28,800,000 metric data rows per day</li><li>larger deployments have a lot more of these (<em>sounds crazy</em>)</li></ul><li>data</li><ul><li>measurement_id</li><li>timestamp</li><li>value</li><li>primary key (timestamp, measurement_id)</li></ul><li>data flow</li><ul><li>agent collects data and sends reports to server with multiple data points</li><li>server batch inserts metric data points</li><li>if the network connection fails, the agent continues to collect, but the server &#034;backfills&#034; the missing data points as unavailable</li><li>when the agent reconnects, spooled data overwrites the backfilled data points (<em>why not use REPLACE for all inserts?</em>)</li></ul><li><em>things are very basic so far</em></li><li>batch insert</li><ul><li>INSERT INTO TABLE (a,b,c) VALUES (0,0,0), (1,1,1),&#8230;</li><li>using MySQL batch insert statements vs prepared statements with multiple queries in other databases seems to improve overall performance by 30%</li><li>batch inserts are limited by &#039;max_allowed_packet&#039;</li></ul><li>other options for increasing insert speed</li><ul><li>set unique_checks=0, insert, set unique_checks=1 (<em>definitely need to make sure data is valid first</em>)</li><li>set foreign_key_checks=0, insert, set foreign_key_checks=1 (<em>same concerns as above</em>)</li><li>Hyperic doesn&#039;t use the 2 above</li></ul><li>INSERT &#8230; ON DUPLICATE KEY UPDATE</li><ul><li>when
regular INSERT fails, retry batch with INSERT &#8230; ON DUPLICATE KEY UPDATE syntax</li><li>it&#039;s much slower but it allows</li></ul><li><em>this is all basic, where are the performance tweaks?!</em></li><li>batch aggregate inserter</li><ul><li>queue metric data from separate agent reports</li><ul><li>minimize number of inserts, connections, CPU load</li><li>maximize workload efficiency</li></ul><li>optimal configuration for 700 agents</li><ul><li>3 workers</li><li>2000 batch size seems to work best</li><li>queue size of 4,000,000</li></ul><li>this seems to peak at 2.2mil metric data inserts per minute</li></ul><li>data consolidation</li><ul><li>inspired by rrdtool</li><li>lower resolution tables track min, avg, and max</li><li>data compression runs hourly</li><li>size limit 2 days</li><li>every hour, data is rolled up into another table that holds hourly aggregated values with size limit 14 days, then that one gets rolled up into a monthly table, etc</li><li><em>this is a good approach if you don&#039;t care about each data point</em></li></ul><li><em>I&#039;m overwhelmed by the number of &#034;you know&#034;s from the speaker. Filler words, ahh! Sorry Charles <img
src='http://beerpla.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </em></li><li>software partitioning</li><ul><li>measurement data split into 18 tables, representing 9 days (2 per day)</li><li>they didn&#039;t want to do more than 2 SELECTs to get data per day, hence such sharding</li><li><em>oddly, Charles didn&#039;t actually use the word &#039;shard&#039; once</em></li><li>tables truncated, rather than deleting rows =&gt; huge performance boost</li><li>truncation vs deletion</li><ul><li>deletion causes contention on rows</li><li>truncation doesn&#039;t produce fragmentation</li><li>truncation just drops and recreates the table &#8211; single DDL operation</li></ul></ul><li>indexes</li><ul><li>every <strong>InnoDB</strong> table has a special index called the <strong>clustered index</strong> (based on primary key) where the physical data for the rows is stored</li><li>advantages</li><ul><li>selects faster &#8211; row data is on the same page where the index search leads</li><li>inserts in (timestamp) order &#8211; avoid page splits and fragmentation</li></ul><li>shows comparison between non-clustered index and clustered index (see slides)</li></ul><li><em>still no mention of configuration tweaks</em></li><li>UNION ALL works better than inner SELECTS because the optimizer didn&#039;t optimize them enough (at least in the version these guys are using, not sure which)</li><li><em>recommended server options are on the very last slide, I was waiting for those the most! I guess I&#039;ll look up the slides after</em></li></ul> Similar Posts:<ul><li><a
href="http://beerpla.net/2009/05/11/mysql-deletingupdating-rows-common-to-2-tables-speed-and-slave-lag-considerations/" rel="bookmark" title="May 11, 2009">[MySQL] Deleting/Updating Rows Common To 2 Tables &#8211; Speed And Slave Lag Considerations</a></li><li><a
href="http://beerpla.net/2008/04/15/mysql-conference-liveblogging-explain-demystified-tuesday-200p/" rel="bookmark" title="April 15, 2008">MySQL Conference Liveblogging: EXPLAIN Demystified (Tuesday 2:00PM)</a></li><li><a
href="http://beerpla.net/2008/04/15/mysql-conference-liveblogging-performance-guide-for-mysql-cluster-tuesday-1050am/" rel="bookmark" title="April 15, 2008">MySQL Conference Liveblogging: Performance Guide For MySQL Cluster (Tuesday 10:50AM)</a></li><li><a
href="http://beerpla.net/2009/02/17/swapping-column-values-in-mysql/" rel="bookmark" title="February 17, 2009">Swapping Column Values in MySQL</a></li><li><a
href="http://beerpla.net/2009/03/18/mysql-indexing-considerations-of-implementing-a-priority-field-in-your-application/" rel="bookmark" title="March 18, 2009">MySQL Indexing Considerations Of Implementing A Priority Field In Your Application</a></li></ul>]]></content:encoded> <wfw:commentRss>http://beerpla.net/2008/04/17/mysql-conference-liveblogging-optimizing-mysql-for-high-volume-data-logging-applications-thursday-250pm/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> </channel> </rss>
