
  • Robert Hodges from Continuent presents
  • About Continuent
    • leading provider of open source database availability and scaling solutions
  • solutions
    • uni/cluster – multi-master database clustering that replicates data across multiple databases and load balances reads
    • uses "database virtualization"
  • scale-out design motivation
    • protection from db and site failures
    • continuous operation during upgrades
  • how come not everyone has it already?
  • creating identical replicas across different hosts is hard
    • Brewer's conjecture
  • trade-offs
    • DDL support
    • inconsistent reads between replicas
    • deadlocks
    • sequences
    • non-deterministic SQL
  • therefore many scale-out approaches are non-transparent
  • 3 basic scale-out technologies
    • data replication
      • where are updates processed? master/master vs master/slave
      • when are updates replicated? sync vs async
    • group communication – coordinates messages between distributed processes
      • views – who is active, who is crashed, do we have quorum, etc
      • message delivery – ordering and delivery guarantees
    • proxying – virtualizes databases and hides database locations from applications
      • latency, performance?
  • 3 replication algorithms
    • master/slave – accept updates at a single master and replicate changes to one or more slaves
    • multi-master state machine – deliver a stream of updates in the same order simultaneously to a set of databases
    • certification – optimistically execute transactions on one of a number of nodes and then apply to all nodes after confirming serialization. Not currently available in MySQL, but developed by Continuent (the presenter's company)
  • performance testing strategy
    • run appropriate tests
      • mixed load tests to check overall throughput and scaling
      • micro-benchmarks to focus on specific issues
    • use appropriate workloads
      • scale-out use profiles are often read or write intensive
    • cover key issues
      • read latency through proxies
      • read and write scaling
      • slave latency for master/slave configurations
      • group communication and replication bottlenecks
      • aborts and deadlocks
    • generate sufficient load in the right places
      • many transactions/queries
      • large data sets
      • data types
  • Bristlecone
  • Bristlecone Load Testing: Evaluator
    • Java tool to generate mixed load on databases
    • similar to pgbench but works cross-DBMS (how about sysbench?)
    • can easily vary mix of select, insert, update, delete statements
    • default select statement designed to "exercise" the db
    • can choose lightweight queries as well
    • parameters are defined in a simple config file
    • can generate reports
    • shows a sample config file (XML) that generates 500 clients and runs for 600 seconds. Looks quite simple but very proprietary. Examples are included in the download.
    • Evaluator Graphical Output
      • shows a graph of requests/s and response time, very standard looking, updates live while the test is running, last 10 minutes are visible.
  • Bristlecone Micro-Benchmarks: Benchmark
    • Java tool to test specific operations while systematically varying parameters
    • benchmarks run "scenarios" – specialized Java classes with interfaces similar to JUnit
    • shows config file, a Java properties file this time instead of XML; you can vary a few parameters, which spawns multiple variations of the test (cross join between all variations)
    • current micro benchmarks
      • basic read latency – low db stress
        • ReadSimpleScenario
        • ReadSimpleLargeScenario
      • read scaling – high db stress
        • ReadScalingAggregatesScenario
        • ReadScalingInvertedKeysScenario
      • write latency and scaling – low/high stress
      • deadlocks – variable transaction lengths
        • DeadLockScenario
      • TPC-B scenario will be added shortly
    • shows html output, simple table layout, easy to look at or load into a pivot table in Excel
  • Bristlecone Testing Examples
    • shows a mixed load query throughput test output graph between a standalone server and a 2-node cluster. cluster is approximately twice as productive as the standalone server
    • shows a mixed load query response test output between the same standalone server and a 2-node cluster. The standalone server is visibly choking while the cluster is smooth
    • shows a proxy query throughput against MySQL 5.1.23, MySQL Proxy 0.6.1, Myosotis Connector proxy, and uni/cluster proxy. MySQL 5.1.23 is significantly faster than any proxy. MySQL Proxy is the worst performing one, even though it's written in C and the others are in Java. Robert thinks it's due to Java handling multi-threading better than C
    • shows a read scaling test output for a query that does SELECT COUNT(*) with 200 rows. MySQL 5.1.23 beats the uni/cluster proxy up to 4 threads; beyond that, the proxy beats it.
    • All tests used InnoDB
    • shows MySQL replication master overhead test results comparing inserts per second on a single master vs a master with a slave. The master with a slave is about 30% slower. Peter Zaitsev raises an interesting question about the difference between just having the binlog turned on vs having it turned on AND a slave replicating. The presenter didn't test this difference and is unsure of the result
    • shows a replication latency MySQL vs Postgres test results, in which Postgres actually kicks MySQL's ass. A replica with default InnoDB settings performs very badly compared to tweaked settings (about 70% slower)
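
The "cross join between all variations" in the Benchmark config notes above can be sketched in a few lines of Python. This is only an illustration of the idea, not Bristlecone's code or its properties format; the parameter names `threads` and `rows` and their values are made up for the example.

```python
from itertools import product


def expand_variations(params):
    """Return one dict per combination of the given parameter values."""
    names = list(params)
    value_lists = [params[name] for name in names]
    # product() yields every combination, i.e. a cross join of the value lists.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical parameters, chosen only for illustration.
params = {
    "threads": [1, 4, 16],
    "rows": [200, 20000],
}

variations = expand_variations(params)
for variation in variations:
    print(variation)
```

Three thread counts times two row counts gives six test runs, so each extra parameter multiplies the number of runs and the total run time along with it.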

● ● ●

Artem Russakovskii is a San Francisco programmer, blogger, and future millionaire (that last part is in the works). Follow Artem on Twitter (@ArtemR) or subscribe to the RSS feed.

  • http://www.php-trivandrum.org/ php trivandrum

    I happened to drop in because I was interested in getting the total number of queries, the total query time, and the time distribution in percent for any web page instance, along with details of where bottlenecks could occur, all with the advantage of working on shared hosting. The result was a PHP class: http://www.php-trivandrum.org/open-php-myprofiler