MySQL Conference Liveblogging: Applied Partitioning And Scaling your (OLTP) Database System (Wednesday 11:55AM)
Wednesday, April 16th, 2008
- Phil Hilderbrand of thePlatform for Media, Inc presents
- classic partitioning
- old school - union in the archive tables
- auto partitioning and partition pruning
- great for data warehousing
- query performance improved
- maintenance is clearly improved
- design issues in applying partitioning to OLTP (On-Line Transaction Processing)
- often id driven access vs date driven access
- 1 big clients could be 80% of the whole database, so there's a difficulty selecting partitioning schemes
- partitioning is only supported starting from MySQL 5.1
- understanding the benefits
- reducing seek and scan set sizes
- improving inserts/updates durations
- making maintenance easier
- shows an EXPLAIN output for SELECTS on non-partitioned and partitioned tables. The results are significantly better for partitions
- OPTIMIZE TABLE on an unpartitioned table takes 1.14s
- ALTER TABLE REBUILD PARTITION p1; on a partitioned table takes 0.03s
- ALTER TABLE REBUILD PARTITION p1, p2, p3, p4, …, p10; takes 0.27s
- design consideration
- table sizes and predicted growth patterns - partition big tables and also partition in advance, if you predict quick growth
- access patterns - select what you want to partition by in a smart way, date, id, etc
- keys and indexes - there are a few restrictions, foreign keys are currently not supported
- availability requirements
- manageability considerations - choosing to partition by hash if there is a TON of data
- reuse / scope considerations - think ahead, think of the usage
- partitioning methods
- range partitioning
- data usually accessed by date
- limited number of primary partitions needed
- ordered intelligent keys
- support sub partitions
- list partitioning
- grouping data in partitions out of order (1,5,7 in partition x)
- limited number of primary partitions needed
- intelligent keys
- supports sub partitions
- hash partitioning
- low maintenance
- works with limited or large number of partitions
- non-intelligent keys (can work in some cases with intelligent keys)
- key partitioning
- non-integer based partitioned keys (MySQL converts to int for you)
- low maintenance
- hash partitioning example
- hash(mod%num_partitions)
- in this example, Phil has stores, employees, and inventory. He decided to partition by store.
- http://dev.mysql.com/doc/refman/5.1/en/partitioning-management-hash-key.html
- 50 stores
- ALTER TABLE my_store PARTITION BY HASH(id) PARTITIONS 50;
- ALTER TABLE my_employee PARTITION BY HASH(store_id) PARTITIONS 50;
- ALTER TABLE my_inventory PARTITION BY HASH(store_id) PARTITIONS 50;
- ALTER obviously takes a long time and blocks (grr)
- adding partitions
- ALTER TABLE my_store ADD PARTITION PARTITIONS 2;
- ALTER TABLE my_employee ADD PARTITION PARTITIONS 2;
- ALTER TABLE my_inventory ADD PARTITION PARTITIONS 2;
- ALTER takes some time again, though less (how come if the partitions are empty?)
- SELECT table_name, partition_name, table_rows FROM information_schema.partitions … shows info on partitions
- remove 4 partitions
- ALTER TABLE my_store COALESCE PARTITION 4;
- ALTER TABLE my_employee COALESCE PARTITION 4;
- ALTER TABLE my_inventory COALESCE PARTITION 4;
MySQL Conference Liveblogging: Portable Scale-out Benchmarks For MySQL (Wednesday 10:50AM)
Wednesday, April 16th, 2008
- Robert Hodges from Continuent presents
- About Continuent
- leading provider of open source database availability and scaling solutions
- solutions
- uni/cluster - multi-master database clustering that replicates data across multiple databases and load balances reads
- uses "database virtualization"
- scale-out design motivation
- protection from db and site failures
- continuous operation during upgrades
- how come not everyone has it already?
- creating identical replicas across different hosts is hard
- Brewer's conjecture
- trade-offs
- DDL support
- inconsistent reads between replicas
- deadlocks
- sequences
- non-deterministic SQL
- therefore many scale-out approaches are non-transparent
- 3 basic scale-out technologies
- data replication
- where are updates processed? master/master vs master/slave
- when are updates replicated? sync vs async
- group communication - coordinates messages between distributed processes
- views - who is active, who is crashed, do we have quorum, etc
- message delivery - ordering and delivery guarantees
- proxying - virtualizes databases and hides database locations from applications
- latency, performance?
- 3 replication algorithms
- master/slave - accept updates at a single master and replicate changes to one or more slaves
- multi-master state machine - deliver a stream of updates in the same order simultaneously to a set of databases
- certification - optimistically execute transactions on one of a number of nodes and then apply to all nodes after confirming serialization. Currently not in MySQL but developed by Continuent (presenter's company)
- performance testing strategy
- run appropriate tests
- mixed load tests to check overall throughput and scaling
- micro-benchmarks to focus on specific issues
- use appropriate workloads
- scale-out use profiles are often read or write intensive
- cover key issues
- read latency through proxies
- read and write scaling
- slave latency for master/slave configurations
- group communication and replication bottlenecks
- aborts and deadlocks
- generate sufficient load in the right places
- many transactions/queries
- large data sets
- data types
- Bristlecone
- http://bristlecone.continuent.org
- open source
- svn checkout svn://forge.continuent.org/bristlecone/trunk/bristlecone bristlecone
- load test
- batch transaction loading
- micro-benchmarks
- Bristlecone Load Testing: Evaluator
- Java tool to generate mixed load on databases
- similar to pgbench but works cross-DBMS (how about sysbench?)
- can easily vary mix of select, insert, update, delete statements
- default select statement designed to "exercise" the db
- can choose lightweight queries as well
- parameters are defined in a simple config file
- can generate reports
- shows sample config file (xml) that generates 500 clients, lasts 600 seconds. Looks quite simple but very proprietary. Examples are included in the download.
- Evaluator Graphical Output
- shows a graph of requests/s and response time, very standard looking, updates live while the test is running, last 10 minutes are visible.
- Bristlecone Micro-Benchmarks: Benchmark
- Java tool to test specific operations while systematically varying parameters
- benchmarks run "scenarios" - specialized Java classes with interfaces similar to JUnit
- shows config file, java properties file this time instead of xml, you can vary a few parameters that will spawn multple variations of the test (cross join between all variations)
- current micro benchmarks
- basic read latency - low db stress
- ReadSimpleScenario
- ReadSimpleLargeScenario
- read scaling - high db stress
- ReadScalingAggregatesScenario
- ReadScalingInvertedKeysScenario
- write latency and scaling - low/high stress
- …
- deadlocks - variable transaction lenghts
- DeadLockScenario
- TPC-B scenario will be added shortly
- shows html output, simple table layout, easy to look at or load into a pivot table in Excel
- Bristlecone Testing Examples
- shows a mixed load query throughput test output graph between a standalone server and a 2-node cluster. cluster is approximately twice as productive as the standalone server
- shows a mixed load query response test output between the same standalone server and a 2-node cluster. The standalone server is visibly choking while the cluster is smooth
- shows a proxy query throughput against MySQL 5.1.23, MySQL Proxy 0.6.1, Myosotis Connector proxy, and uni/cluster proxy. MySQL 5.1.23 is significantly faster than any proxy. MySQL Proxy is the worst performing one, even though it's written in C and the others are in Java. Robert thinks it's due to Java handling multi-threading better than C
- shows a read scaling test output for a query that does SELECT COUNT(*) with 200 rows. MySQL 5.1.23 beats uni/cluster proxy until it passes 4 threads, where the proxy beats it.
- All tests used InnoDB
- shows a MySQL replication master overhead test results comparing a inserts per second on a single master vs a master with a slave. The master with a slave is about 30% slower. Peter Zaitsev raises an interesting question of the differences between just having the binlog turned on vs having it turned on AND a slave replicating. These differences weren't tested by the presenter and he's unsure on the result
- shows a replication latency MySQL vs Postgres test results, in which Postgres actually kicks MySQL's ass. A replica with default InnoDB settings performs very badly compared to tweaked settings (about 70% slower)

(2 rating, 2 votes)
beer planet is Artem Russakovskii's blog. Artem is a software engineer at