Updated: September 28th, 2009
The Problem
I am throwing up a quick post about a relatively cryptic error that Solr started throwing the other day here at Plaxo. After Solr had been running happily for a few days, I suddenly started getting pages about failed Solr indexing.
Upon closer examination, I saw the following repeatedly in the log file:
catalina.2009-09-18.log:SEVERE: java.io.IOException: directory 'DATADIR/index' exists and is a directory, but cannot be listed: list() returned null
I tried sending an OPTIMIZE command to see if it would help, but the server returned the same response.
Digging Deeper
The reason for these errors was quite simple: Solr was running into the system-level limit on the allowed number of open files (ulimit). This limit can be seen by running
MySQL Conference Liveblogging: Optimizing MySQL For High Volume Data Logging Applications (Thursday 2:50PM)
- http://en.oreilly.com/mysql2008/public/schedule/detail/874
- presented by Charles Lee of Hyperic
- Hyperic has the best performance with MySQL out of MySQL, Oracle, and Postgres in their application
- I suddenly remember that Hyperic was highly recommended over Nagios in MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)
- performance bottleneck
- the database
- CPU
- memory
- disk latency
- network latency
- 300 platforms (300 remote agents collecting data)
- 2,100 servers
- 21,000 services (10 services per server), sounds feasible
- 468,000 metrics (20 metrics per service)
- 28,800,000 metric data rows per day
- larger deployments have a lot more of these (sounds crazy)
- measurement_id
- timestamp
- value
- primary key (timestamp, measurement_id)
- agent collects data and sends reports to server with multiple data points
…
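The metric data columns and the batched agent reports from the notes above can be sketched roughly as follows. This is only an illustration, not Hyperic's actual schema: the talk is about MySQL, but SQLite is swapped in here so the snippet runs on its own. The table name `metric_data`, the column types, the `store_report` helper, and the sample values are all assumptions. Only the column names and the primary key come from the notes.

```python
import sqlite3

# Column names and primary key follow the notes; the table name and
# column types are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE metric_data (
    timestamp      INTEGER NOT NULL,
    measurement_id INTEGER NOT NULL,
    value          REAL,
    PRIMARY KEY (timestamp, measurement_id)
)
"""


def store_report(conn, data_points):
    """Insert one agent report (several data points) as a single batch.

    Hypothetical helper: each data point is a
    (timestamp, measurement_id, value) tuple.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO metric_data (timestamp, measurement_id, value) "
            "VALUES (?, ?, ?)",
            data_points,
        )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Made-up sample report with three data points for two measurements.
report = [
    (1208467800, 1, 0.42),
    (1208467800, 2, 17.0),
    (1208467860, 1, 0.40),
]
store_report(conn, report)

row_count = conn.execute("SELECT COUNT(*) FROM metric_data").fetchone()[0]
print(row_count)
```

Sending a whole report through one `executemany` call inside one transaction keeps the per-row overhead down. The composite primary key rejects a second row for the same timestamp and measurement.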
beer planet is a blog about technology, programming, computers, and geek life. It is run by Artem Russakovskii - a local San Francisco geek who is currently pursuing his own projects and regularly enjoys hacking Android, PHP, CSS, JavaScript, AJAX, Perl, and regular expressions, working on WordPress plugins and tools, tweaking MySQL queries and server settings, administering Linux machines, blogging, learning new things, and other geeky stuff.