MySQL Conference Liveblogging: Performance Guide For MySQL Cluster (Tuesday 10:50AM)
Posted by Artem Russakovskii on April 15th, 2008 in Databases

- Speaker: Mikael Ronstrom, PhD, the creator of the Cluster engine
- Explains the cluster structure
- Aspects of performance
  - Response times
  - Throughput
  - Low variation of response times
 
- Improving performance
  - use low-level API (NDB API), expensive, hard
  - use new features in MySQL Cluster Carrier Grade Edition 6.3 (currently 6.3.13), more on this later
  - proper partitioning of tables, minimize communication
  - use of hardware
 
- NDB API is a C++ record access API
  - supports sending parallel record operations within the same transaction or in different transactions
  - asynchronous and synchronous
  - NDB kernel is programmed entirely asynchronously
 
- Looking at performance
  - Five synchronous insert transactions – 10x TCP/IP time cost
  - Five inserts in one synchronous transaction – 2x TCP/IP time cost
  - Five asynchronous insert transactions – 2x TCP/IP time cost
  - (each synchronous transaction pays for its own network round trip, so five of them cost roughly five round trips; batching them into one transaction, or sending them asynchronously, costs only one)
 
- Case study
  - develop prototype using MySQL C API – performance X, response time Y
  - develop same functionality using synchronous NDB API – performance 3X, response time ~0.5Y
  - develop same functionality using asynchronous NDB API – performance 6X, response time ~0.25Y
 
- Conclusion on when to use the NDB API
  - performance is critical: throughput, response times, etc.
  - queries are not very complex
 
- Conclusion on when not to use the NDB API
  - when design time is critical
  - when complex queries are executed – the MySQL optimizer may handle them better
 
- New features of MySQL Cluster Carrier Grade Edition 6.3.13
  - polling-based communication
    - CPU is used heavily even at lower throughput
    - avoids interrupt and wake-up delays for new messages
    - some good results in benchmarks
    - decreases performance when CPU is the limiting factor
    - 10% performance improvement on 2, 4, and 8 data node clusters
    - 20% improvement if using Dolphin Express
  - epoll replacing select system calls (Linux)
    - improved performance 20% on a 32-node cluster
  - send buffer gathering
  - real-time scheduler for threads
  - lock threads to CPU
  - distribution awareness (see the first sketch after this list)
    - 100-200% improvement when the application is distribution aware
  - avoid read before UPDATE/DELETE with PK
    - UPDATE t SET a=const1 WHERE pk=x;
    - no need to do a read before the UPDATE, all data is already known
    - ~10% improvement
  - old 'truths' revisited
    - previous recommendation was to run 1 data node per computer
    - this was due to bugs, which are now fixed
  - partitioning tricks (see the second sketch after this list)
    - if a table gets a lot of index scans (not on the primary key), partitioning it to live in only one node group can be a good idea
    - partition syntax for this: PARTITION BY KEY (id) (PARTITION p0 NODEGROUP 0);

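To make the distribution awareness point concrete, here is a rough sketch of my own (not from the talk) of what a distribution-aware schema could look like; the table and column names are made up. The idea is that queries always supply the partition key, so the transaction can be started on the data node that actually holds the row:

CREATE TABLE subscriber_service (
  subscriber_id INT UNSIGNED NOT NULL,
  service_id    INT UNSIGNED NOT NULL,
  status        TINYINT NOT NULL,
  PRIMARY KEY (subscriber_id, service_id)
) ENGINE=NDBCLUSTER
PARTITION BY KEY (subscriber_id);

-- The WHERE clause contains the full partition key (subscriber_id), so the
-- transaction can be routed to the node group that owns this subscriber's
-- partition instead of picking a transaction coordinator at random.
SELECT status
FROM subscriber_service
WHERE subscriber_id = 12345 AND service_id = 7;
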
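And a sketch of the node group trick, again with a made-up table: all partitions of a heavily index-scanned table are pinned to a single node group, so those index scans only involve that node group instead of every data node in the cluster:

CREATE TABLE scan_heavy (
  id       INT UNSIGNED NOT NULL PRIMARY KEY,
  category INT UNSIGNED NOT NULL,
  payload  VARCHAR(255),
  KEY idx_category (category)  -- non-PK index that is scanned a lot
) ENGINE=NDBCLUSTER
PARTITION BY KEY (id) (PARTITION p0 NODEGROUP 0);
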
- new performance features in MySQL Cluster 5.0
  - lock memory in main memory – ensures no swapping occurs in the NDB kernel
  - batching IN (…) with primary keys (see the sketch after this list)
    - 100 separate SELECT * FROM t WHERE pk=x; statements
    - vs. a single SELECT * FROM t WHERE pk IN (x1, …, x100)
    - the IN statement is around 10x faster
  - use of multi-INSERT
    - similar 10x speedup

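A quick sketch of the two batching patterns (my example, using the same table t as above): the point is to replace many single-row round trips with one batched statement.

-- Instead of 100 separate primary-key lookups like this one...
SELECT * FROM t WHERE pk = 1;

-- ...batch them into a single IN list on the primary key (around 10x faster):
SELECT * FROM t WHERE pk IN (1, 2, 3 /* , ... up to 100 values */);

-- The same idea applies to inserts: one multi-row INSERT instead of many
-- single-row INSERTs gives a similar ~10x speedup.
INSERT INTO t (pk, a) VALUES (1, 1), (2, 1), (3, 2);
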
- new features in MySQL Cluster CGE version 6.4 (beta, only available in BitKeeper for now)
  - multi-threaded data nodes – currently no benefit with DBT2, but a 40% increase in throughput for some NDB API benchmarks
  - DBT2 improvements to follow later

- use of hardware, CPU choice
  - Pentium D @ 2.8GHz -> Core 2 Duo @ 2.8GHz => 75% improvement
  - doubling L2 cache doubles thread scalability
  - choice of Dolphin Express interconnect increases throughput 10-400%

- scalability of DBT2 threads
  - 1-2-4 threads – linear
  - 4-8 threads – 40-70%
  - 8-16 threads – 10-30%
  - decreasing scalability over 16 threads
  - current recommendation by Mikael himself: use twice as many SQL nodes as data nodes

- future software performance improvements
  - batched key access – 0-400% performance improvement
  - improved scan protocol – ~15% improvement
  - incremental backups
  - optimized backup code
  - parallel I/O on index scans using disk data

- Niagara-II benchmark from 2002
  - simple read, simple update, both transactional
  - 72-CPU Sun Fire 15K, 256GB RAM
  - CPUs: UltraSPARC III @ 900MHz
  - 32-node NDB Cluster, 1 data node locked to 1 CPU
  - db size 88GB, 900 million records
  - simple reads: 1.5 million reads per second
  - simple updates: 340,000 per second

- Everyone is overwhelmed, so no questions are asked
 
