MySQL Conference Liveblogging: Performance Guide For MySQL Cluster (Tuesday 10:50AM)
Posted by Artem Russakovskii on April 15th, 2008 in Databases
- Speaker: Mikael Ronstrom, PhD, the creator of the Cluster engine
- Explains the cluster structure
- Aspects of performance
- Response times
- Throughput
- Low variation of response times
- Improving performance
- use low level API (NDB API), expensive, hard
- use new features in MySQL Cluster Carrier Grade Edition 6.3 (currently 6.3.13), more on this later
- proper partitioning of tables, minimize communication
- use of hardware
- NDB API is a C++ record access API
- supports sending parallel record operations within the same transaction or in different transactions
- asynchronous and synchronous
- NDB kernel is programmed entirely asynchronously
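As a rough illustration of what the record-level API looks like, here is a minimal synchronous insert sketch (my own, not from the talk; it assumes a table `t` with an integer primary key `pk` and an integer column `a`, and it needs libndbclient plus a running cluster to actually compile and run):

```cpp
#include <NdbApi.hpp>

int main() {
  ndb_init();
  Ndb_cluster_connection conn("localhost:1186");   // connectstring assumed
  if (conn.connect(/*retries*/ 4, /*delay*/ 5, /*verbose*/ 1) != 0) return 1;
  conn.wait_until_ready(30, 0);

  Ndb ndb(&conn, "test");                          // database name assumed
  ndb.init();
  const NdbDictionary::Table *tab = ndb.getDictionary()->getTable("t");

  // One transaction carrying five record operations: all five are sent to
  // the data nodes in a single round trip when execute() is called.
  NdbTransaction *tx = ndb.startTransaction();
  for (int i = 0; i < 5; i++) {
    NdbOperation *op = tx->getNdbOperation(tab);
    op->insertTuple();
    op->equal("pk", i);        // primary key column
    op->setValue("a", i * 10); // non-key column
  }
  tx->execute(NdbTransaction::Commit);             // blocks until committed
  ndb.closeTransaction(tx);

  // An asynchronous variant would instead prepare several transactions with
  // tx->executeAsynchPrepare(...) and flush them all with ndb.sendPollNdb().
  ndb_end(0);
  return 0;
}
```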
- Looking at performance
- Five synchronous insert transactions, fired one at a time – 10x TCP/IP time cost
- Five inserts in one synchronous transaction – 2x TCP/IP time cost
- Five asynchronous insert transactions – 2x TCP/IP time cost
- Case study
- develop prototype using MySQL C API – performance X, response time Y
- develop same functionality using synchronous NDB API – performance 3X, response time ~0.5Y
- develop same functionality using asynchronous NDB API – performance 6X, response time ~0.25Y
- Conclusion on when to use NDB API
- performance is critical, need speed, response time, etc
- queries are not very complex
- Conclusion on when not to use NDB API
- when design time is critical
- when complex queries are executed, the MySQL optimizer may handle them better
- New features of MySQL Cluster Carrier Grade Edition 6.3.13
- polling based communication
- CPU used heavily even at lower throughput
- avoids interrupt and wake-up delays for new messages
- some good results in benchmarks
- decreases performance when CPU is the limiting factor
- 10% performance improvement on 2, 4, and 8 data node clusters
- 20% improvement if using Dolphin Express
- epoll replacing select system calls (Linux)
- improved performance 20% on a 32-node cluster
- send buffer gathering
- real-time scheduler for threads
- lock threads to CPU
- distribution awareness
- 100-200% improvement when application is distribution aware
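Distribution awareness can be exploited from the NDB API by passing the partition key as a hint when starting the transaction, so the transaction coordinator runs on the data node that owns the row. A fragment sketch (my own; it assumes the `ndb` and `tab` objects from an already-initialized connection and an integer partition key):

```cpp
// Sketch only – requires an initialized Ndb object (ndb) and table (tab).
// The key hint places the transaction coordinator on the data node that
// owns this primary key, avoiding an extra network hop per operation.
int pk = 42;  // example key value, assumed to be the table's partition key
NdbTransaction *tx =
    ndb.startTransaction(tab, (const char *)&pk, sizeof(pk));
// ... operations on rows with pk = 42 now stay within that node group ...
ndb.closeTransaction(tx);
```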
- avoid read before Update/Delete with PK
- UPDATE t SET a=const1 WHERE pk=x;
- no need to do a read before UPDATE, all data is already known
- ~10% improvement
- old 'truths' revisited
- previous recommendation was to run 1 data node per computer
- this was due to bugs, which are now fixed
- partitioning tricks
- if a table gets a lot of index scans (not primary key lookups), partitioning it to live in only one node group can be a good idea
- partition syntax for this: PARTITION BY KEY (id) (PARTITION p0 NODEGROUP 0);
- new performance features in MySQL Cluster 5.0
- lock the NDB kernel's memory in main memory – ensures no swapping occurs in the NDB kernel
- batching IN (…) with primary keys
- 100 × SELECT * FROM t WHERE pk=x; (one statement per key)
- vs. SELECT * FROM t WHERE pk IN (x1, …, x100);
- the IN statement is around 10x faster
- use of multi-INSERT: INSERT INTO t VALUES (x1, …), (x2, …), …;
- similar 10x speedup
- new features in MySQL Cluster CGE version 6.4 (beta, currently only available via BitKeeper)
- multi-threaded data nodes – currently no benefit using DBT2 but 40% increase in throughput for some NDB API benchmarks
- DBT2 improvements to follow later
- use of hardware, CPU choice
- Pentium D @ 2.8GHz -> Core 2 Duo @ 2.8GHz => 75% improvement
- doubling L2 cache doubles thread scalability
- choice of Dolphin Express interconnect increases throughput 10-400%
- scalability of DBT2 threads
- 1-2-4 threads – linear
- 4-8 threads – 40-70%
- 8-16 threads – 10-30%
- decreasing scalability over 16 threads
- current recommendation by Mikael himself: use twice as many SQL nodes as data nodes
- future software performance improvements
- batched key access – 0-400% performance improvement
- improved scan protocol – ~15% improvement
- incremental backups
- optimized backup code
- parallel I/O on index scans using disk data
- Niagara-II benchmark from 2002
- simple read, simple update, both transactional
- 72-CPU Sun Fire 15K, 256GB RAM
- CPUs: UltraSPARC III @ 900MHz
- 32-node NDB Cluster, 1 data node locked to 1 CPU
- db size 88GB, 900 million records
- simple reads: 1.5 million reads per second
- simple updates: 340,000 per second
- Everyone is overwhelmed, so no questions are asked
In the meantime, if you found this article useful, feel free to buy me a cup of coffee below.