1
MySQL Conference Liveblogging: Portable Scale-out Benchmarks For MySQL (Wednesday 10:50AM)
Posted by Artem Russakovskii on April 16th, 2008 in Databases
- Robert Hodges from Continuent presents
- About Continuent
- leading provider of open source database availability and scaling solutions
- solutions
- uni/cluster – multi-master database clustering that replicates data across multiple databases and load balances reads
- uses "database virtualization"
- scale-out design motivation
- protection from db and site failures
- continuous operation during upgrades
- how come not everyone has it already?
- creating identical replicas across different hosts is hard
- Brewer's conjecture
- trade-offs
- DDL support
- inconsistent reads between replicas
- deadlocks
- sequences
- non-deterministic SQL
- therefore many scale-out approaches are non-transparent
- 3 basic scale-out technologies
- data replication
- where are updates processed? master/master vs master/slave
- when are updates replicated? sync vs async
- group communication – coordinates messages between distributed processes
- views – who is active, who is crashed, do we have quorum, etc
- message delivery – ordering and delivery guarantees
- proxying – virtualizes databases and hides database locations from applications
- latency, performance?
- 3 replication algorithms
- master/slave – accept updates at a single master and replicate changes to one or more slaves
- multi-master state machine – deliver a stream of updates in the same order simultaneously to a set of databases
- certification – optimistically execute transactions on one of a number of nodes and then apply to all nodes after confirming serialization. Currently not in MySQL but developed by Continuent (presenter's company)
- performance testing strategy
- run appropriate tests
- mixed load tests to check overall throughput and scaling
- micro-benchmarks to focus on specific issues
- use appropriate workloads
- scale-out use profiles are often read or write intensive
- cover key issues
- read latency through proxies
- read and write scaling
- slave latency for master/slave configurations
- group communication and replication bottlenecks
- aborts and deadlocks
- generate sufficient load in the right places
- many transactions/queries
- large data sets
- data types
- Bristlecone
- http://bristlecone.continuent.org
- open source
- svn checkout svn://forge.continuent.org/bristlecone/trunk/bristlecone bristlecone
- load test
- batch transaction loading
- micro-benchmarks
- Bristlecone Load Testing: Evaluator
- Java tool to generate mixed load on databases
- similar to pgbench but works cross-DBMS (how about sysbench?)
- can easily vary mix of select, insert, update, delete statements
- default select statement designed to "exercise" the db
- can choose lightweight queries as well
- parameters are defined in a simple config file
- can generate reports
- shows sample config file (xml) that generates 500 clients, lasts 600 seconds. Looks quite simple but very proprietary. Examples are included in the download.
- Evaluator Graphical Output
- shows a graph of requests/s and response time, very standard looking, updates live while the test is running, last 10 minutes are visible.
- Bristlecone Micro-Benchmarks: Benchmark
- Java tool to test specific operations while systematically varying parameters
- benchmarks run "scenarios" – specialized Java classes with interfaces similar to JUnit
- shows config file, java properties file this time instead of xml, you can vary a few parameters that will spawn multple variations of the test (cross join between all variations)
- current micro benchmarks
- basic read latency – low db stress
- ReadSimpleScenario
- ReadSimpleLargeScenario
- read scaling – high db stress
- ReadScalingAggregatesScenario
- ReadScalingInvertedKeysScenario
- write latency and scaling – low/high stress
- …
- deadlocks – variable transaction lenghts
- DeadLockScenario
- TPC-B scenario will be added shortly
- shows html output, simple table layout, easy to look at or load into a pivot table in Excel
- Bristlecone Testing Examples
- shows a mixed load query throughput test output graph between a standalone server and a 2-node cluster. cluster is approximately twice as productive as the standalone server
- shows a mixed load query response test output between the same standalone server and a 2-node cluster. The standalone server is visibly choking while the cluster is smooth
- shows a proxy query throughput against MySQL 5.1.23, MySQL Proxy 0.6.1, Myosotis Connector proxy, and uni/cluster proxy. MySQL 5.1.23 is significantly faster than any proxy. MySQL Proxy is the worst performing one, even though it's written in C and the others are in Java. Robert thinks it's due to Java handling multi-threading better than C
- shows a read scaling test output for a query that does SELECT COUNT(*) with 200 rows. MySQL 5.1.23 beats uni/cluster proxy until it passes 4 threads, where the proxy beats it.
- All tests used InnoDB
- shows a MySQL replication master overhead test results comparing a inserts per second on a single master vs a master with a slave. The master with a slave is about 30% slower. Peter Zaitsev raises an interesting question of the differences between just having the binlog turned on vs having it turned on AND a slave replicating. These differences weren't tested by the presenter and he's unsure on the result
- shows a replication latency MySQL vs Postgres test results, in which Postgres actually kicks MySQL's ass. A replica with default InnoDB settings performs very badly compared to tweaked settings (about 70% slower)
In the meantime, if you found this article useful, feel free to buy me a cup of coffee below.