					MySQL Conference Liveblogging: Portable Scale-out Benchmarks For MySQL (Wednesday 10:50AM)
Posted by Artem Russakovskii on April 16th, 2008 in Databases 
											
- Robert Hodges from Continuent presents
 - About Continuent
 - leading provider of open source database availability and scaling solutions
 - solutions
 - uni/cluster – multi-master database clustering that replicates data across multiple databases and load balances reads
 - uses "database virtualization"
 - scale-out design motivation
 - protection from db and site failures
 - continuous operation during upgrades
 - how come not everyone has it already?
 - creating identical replicas across different hosts is hard
 - Brewer's conjecture
 - trade-offs
 - DDL support
 - inconsistent reads between replicas
 - deadlocks
 - sequences
 - non-deterministic SQL
 - therefore many scale-out approaches are non-transparent
 - 3 basic scale-out technologies
 - data replication
 - where are updates processed? master/master vs master/slave
 - when are updates replicated? sync vs async
 - group communication – coordinates messages between distributed processes
 - views – who is active, who is crashed, do we have quorum, etc
 - message delivery – ordering and delivery guarantees
 - proxying – virtualizes databases and hides database locations from applications
 - latency, performance?
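
The "do we have quorum" question from the group communication notes above can be sketched in a few lines. This is only an illustration, assuming a simple strict-majority rule; the talk did not say which quorum rule Continuent uses, and the function and member names are hypothetical.

```python
def has_quorum(active_members: set[str], all_members: set[str]) -> bool:
    """Return True when a strict majority of the configured members is active.

    Assumption: quorum means "more than half"; real group communication
    systems may use weighted or otherwise different rules.
    """
    # Only count active members that belong to the configured group.
    active = active_members & all_members
    return len(active) > len(all_members) // 2


group = {"db1", "db2", "db3"}
two_up = has_quorum({"db1", "db2"}, group)   # two of three members are active
one_up = has_quorum({"db1"}, group)          # only one of three is active
```

With three configured members, two active members form a quorum and one does not, so a partitioned minority can tell that it should stop accepting updates.
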
 - 3 replication algorithms
 - master/slave – accept updates at a single master and replicate changes to one or more slaves
 - multi-master state machine – deliver a stream of updates in the same order simultaneously to a set of databases
 - certification – optimistically execute transactions on one of a number of nodes and then apply to all nodes after confirming serialization. Currently not in MySQL but developed by Continuent (presenter's company)
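
The multi-master state machine idea above (every database receives the same updates in the same order) can be illustrated with a toy model. This is a minimal sketch, not Continuent's implementation: the "databases" are plain dictionaries and all names are hypothetical.

```python
from typing import Callable

Database = dict[str, int]
Update = Callable[[Database], None]


def set_balance(amount: int) -> Update:
    """Build an update that overwrites the balance."""
    def op(db: Database) -> None:
        db["balance"] = amount
    return op


def double_balance() -> Update:
    """Build an update whose result depends on the current state."""
    def op(db: Database) -> None:
        db["balance"] *= 2
    return op


def deliver(replicas: list[Database], updates: list[Update]) -> None:
    """Apply every update to every replica, in the same order everywhere."""
    for update in updates:
        for replica in replicas:
            update(replica)


replicas = [{"balance": 0}, {"balance": 0}]
deliver(replicas, [set_balance(100), double_balance()])

# A replica that sees the same updates in the opposite order diverges.
out_of_order = [{"balance": 0}]
deliver(out_of_order, [double_balance(), set_balance(100)])
```

Both replicas end with a balance of 200, while the out-of-order replica ends with 100. That difference is why the ordering guarantees of the group communication layer matter for this algorithm.
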
 - performance testing strategy
 - run appropriate tests
 - mixed load tests to check overall throughput and scaling
 - micro-benchmarks to focus on specific issues
 - use appropriate workloads
 - scale-out use profiles are often read or write intensive
 - cover key issues
 - read latency through proxies
 - read and write scaling
 - slave latency for master/slave configurations
 - group communication and replication bottlenecks
 - aborts and deadlocks
 - generate sufficient load in the right places
 - many transactions/queries
 - large data sets
 - data types
 - Bristlecone
 - http://bristlecone.continuent.org
 - open source
 - svn checkout svn://forge.continuent.org/bristlecone/trunk/bristlecone bristlecone
 - load test
 - batch transaction loading
 - micro-benchmarks
 - Bristlecone Load Testing: Evaluator
 - Java tool to generate mixed load on databases
 - similar to pgbench but works cross-DBMS (how about sysbench?)
 - can easily vary mix of select, insert, update, delete statements
 - default select statement designed to "exercise" the db
 - can choose lightweight queries as well
 - parameters are defined in a simple config file
 - can generate reports
 - shows sample config file (xml) that generates 500 clients, lasts 600 seconds. Looks quite simple but very proprietary. Examples are included in the download.
 - Evaluator Graphical Output
 - shows a graph of requests/s and response time, very standard looking, updates live while the test is running, last 10 minutes are visible.
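
To make the Evaluator's "vary the mix of select, insert, update, delete" idea concrete, here is a rough sketch of weighted statement selection. It does not reproduce Bristlecone's code or config format; the function name and the weights are assumptions made for illustration.

```python
import random
from collections import Counter


def pick_statements(mix: dict[str, int], count: int, seed: int = 0) -> list[str]:
    """Choose `count` statement types at random, weighted by `mix`.

    `mix` maps a statement type to its relative weight; a weight of 0
    means that type is never chosen.
    """
    rng = random.Random(seed)  # seeded so a test run is repeatable
    kinds = list(mix)
    weights = [mix[kind] for kind in kinds]
    return rng.choices(kinds, weights=weights, k=count)


# Hypothetical read-heavy profile: mostly selects, no deletes.
read_heavy = {"select": 80, "insert": 10, "update": 10, "delete": 0}
workload = pick_statements(read_heavy, count=1000)
totals = Counter(workload)
```

Changing the weights turns the same generator into a write-intensive profile, which matches the note above that scale-out use profiles are often read or write intensive.
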
 - Bristlecone Micro-Benchmarks: Benchmark
 - Java tool to test specific operations while systematically varying parameters
 - benchmarks run "scenarios" – specialized Java classes with interfaces similar to JUnit
 - shows the config file, a Java properties file this time instead of XML; you can give several values for a few parameters, and the tool runs every combination of them (a cross join between all variations)
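
The "cross join between all variations" can be sketched as a Cartesian product over the parameter lists. This is only an illustration of the idea; the parameter names below are made up and are not Bristlecone's actual property keys.

```python
from itertools import product


def expand_variations(params: dict[str, list]) -> list[dict]:
    """Return one run configuration per combination of parameter values."""
    names = list(params)
    value_lists = [params[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical parameters: 3 thread counts x 2 table sizes = 6 runs.
runs = expand_variations({"threads": [1, 4, 16], "rows": [10, 1000]})
```

Because the number of runs is the product of the list lengths, adding a third parameter with four values would grow this example from 6 runs to 24.
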
 - current micro benchmarks
 - basic read latency – low db stress
 - ReadSimpleScenario
 - ReadSimpleLargeScenario
 - read scaling – high db stress
 - ReadScalingAggregatesScenario
 - ReadScalingInvertedKeysScenario
 - write latency and scaling – low/high stress
 - …
 - deadlocks – variable transaction lengths
 - DeadLockScenario
 - TPC-B scenario will be added shortly
 - shows html output, simple table layout, easy to look at or load into a pivot table in Excel
 - Bristlecone Testing Examples
 - shows a mixed load query throughput test output graph comparing a standalone server with a 2-node cluster. The cluster is approximately twice as productive as the standalone server
 - shows a mixed load query response test output between the same standalone server and a 2-node cluster. The standalone server is visibly choking while the cluster is smooth
 - shows a proxy query throughput test comparing MySQL 5.1.23, MySQL Proxy 0.6.1, the Myosotis Connector proxy, and the uni/cluster proxy. MySQL 5.1.23 is significantly faster than any proxy. MySQL Proxy is the worst performing one, even though it's written in C and the others are in Java. Robert thinks it's due to Java handling multi-threading better than C
 - shows a read scaling test output for a query that does SELECT COUNT(*) with 200 rows. MySQL 5.1.23 beats the uni/cluster proxy up to 4 threads; beyond that, the proxy beats it.
 - All tests used InnoDB
 - shows MySQL replication master overhead test results comparing inserts per second on a single master vs a master with a slave. The master with a slave is about 30% slower. Peter Zaitsev raises an interesting question about the difference between just having the binlog turned on vs having it turned on AND a slave replicating. The presenter didn't test this difference and is unsure what the result would be
 - shows MySQL vs Postgres replication latency test results, in which Postgres actually kicks MySQL's ass. A replica with default InnoDB settings performs very badly compared to tweaked settings (about 70% slower)
 
