MySQL Conference: Presentation At The Kickfire Booth
Tuesday, April 15th, 2008
Updated: April 17th, 2008
I had a chance to visit the Kickfire booth after the keynotes and before the first presentation. They gave me a kicking t-shirt, followed by a presentation on the newly announced Kickfire appliance (now in beta, shipping in Fall 2008). Here are some notes I jotted down:
- von Neumann bottleneck
- SQL chip (SQC), packs the power of 10s of conventional CPUs
- Query parallelization on the chip
- On-chip memory - 64GB. No registers - no von Neumann bottleneck
- Beats the performance of a 3-server, 32-CPU, 130TB setup (only 1TB of that is actual data; the rest of the space is used to distribute I/O)
- SQC uses column-store, compression, intelligent indexing
- SQL Chip, PCI connection, plugs into a Linux server
- SQL execution
- Memory management
- Loader acceleration
- KDB (Kickfire storage engine), plugs into MySQL
- Optimizer
- Transactional engine
- Column store & cache
- Kickfire appliance size is 2U or 3U
- Highest performing MySQL related database offering
- Starts at $20k (10x performance of similar priced offerings)
- Point and go, point the appliance at the existing db and it sucks the data in
- Up to 3TB database
- Percona ran a test of some Dell box with MySQL vs Kickfire Appliance and Kickfire is 1000x faster
So my questions are:
- does it support foreign keys? The presenter answered yes.
- how does it handle replication? The presenter said it should be addressed in the future. Still unclear on this one.
Update 1: In the latest TPC-H results, Kickfire placed at #1, outperforming all competition by a wide margin. The cost per QphH (TPC-H queries per hour) is only 70 cents! The nearest competition is $3+.
Update 2: Kickfire got an incredible amount of attention at this conference; I think it's everything they'd hoped for and a lot more. Once independent, respected benchmarkers like Peter get their hands on a sample appliance and post some real-life tests, we will truly be able to judge the performance, but if PR is an indicator of anything, Kickfire will have an insanely successful future.
MySQL Conference Liveblogging: EXPLAIN Demystified (Tuesday 2:00PM)
Tuesday, April 15th, 2008
- Baron Schwartz presents
- only works for SELECTs
- nobody dares admit if they've never seen EXPLAIN
- MySQL actually executes the query
- at each JOIN, instead of executing the query, it fills the EXPLAIN result set
- everything is a JOIN (even SELECT 1)
- Columns in EXPLAIN
- id: which SELECT the row belongs to
- select_type
- simple
- subquery
- derived
- union
- union result
- table: the table accessed or its alias
- type:
- join
- range
- …
- possible_keys: which indexes looked useful to the optimizer
- key: which index(es) the optimizer chose
- key_len: the number of bytes of the index MySQL will use
- ref: which columns/constants from preceding tables are used for lookups in the index named in the key column
- rows: estimated number of rows to read
- extra
- using index: covering index
- using where: server post-filters rows from storage engine
- using temporary: an implicit temp table (for sorting or grouping rows, DISTINCT). No indication of whether the temp table is in memory or on disk
- using filesort: external sort to order result. No indication of which algorithm MySQL will use
- shows an insane EXPLAIN output with 8 EXPLAIN rows
- maatkit includes a tool called mk-visual-explain, which can construct a formatted tree
- Baron shows a demo and answers questions
- EXPLAIN EXTENDED followed by SHOW WARNINGS will give more output about how a query is executed
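To make the "using index: covering index" note above a bit more concrete, here is a minimal sketch in Python. It uses SQLite's EXPLAIN QUERY PLAN purely as a stand-in because it runs without a server; this is an assumption for illustration, not MySQL's EXPLAIN, and the output format is different. The table `orders` and the index `idx_customer_total` are hypothetical names.

```python
import sqlite3

def plan_for(conn, sql):
    """Return the human-readable detail strings of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer INTEGER, total REAL, note TEXT)"
)
conn.execute("CREATE INDEX idx_customer_total ON orders (customer, total)")

# Every column the query needs is in the index, so the table itself is not read.
covered = plan_for(conn, "SELECT total FROM orders WHERE customer = 42")

# 'note' is not in the index, so the index only locates rows in the table.
not_covered = plan_for(conn, "SELECT note FROM orders WHERE customer = 42")

print(covered)
print(not_covered)
```

The first plan should mention a covering index and the second should not, which is the same distinction MySQL reports as "using index" in the extra column.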
MySQL Conference Liveblogging: The Future Of MySQL (Tuesday 11:55AM)
Tuesday, April 15th, 2008
- Robin Schumacher
- gives overview of MySQL products
- MySQL Enterprise
- MySQL 5.1 announced
- table/index partitioning -> great for data warehouses, range, hash, key, list, composite, subpartitioning. Partition pruning. Response time greatly improved with proper partitioning.
- row-based/hybrid replication -> safer and smarter
- disk-based cluster -> supports bigger DBs
- built-in job scheduler -> simplified task management
- problem SQL identification -> easier troubleshooting. Dynamic query tracing is now available, no need to trace things in slow query logs.
- faster full-text search -> 500% increase in some cases
- 5.1.24RC available for the conference
- MySQL 6.0
- Falcon engine - transactional engine
- new backup (version 1.0) -> cross engine, non-blocking, to replace mysqldump
- Falcon
- planned default transactional storage engine. Q4 GA (general availability).
- not InnoDB replacement
- most InnoDB apps are OK on Falcon
- crash recovery
- ACID transactions
- more features
- best on multi-CPU, large RAM servers
- planned to beat InnoDB
- shows latest internal Falcon vs InnoDB benchmarks, all benchmarks have Falcon winning now (dual and quad quadcore CPUs), compared to before
- new backup in 6.0
- all general engines supported (except for Cluster)
- SQL-command driven
- online, non-blocking DML (insert,update,delete) for transactional engines. MyISAM is still blocking (at least for now)
- point-in-time recovery
- better recovery times in benchmarks
- restore is blocking
- plugins for the backup tool
- first one is a non-blocking MyISAM plugin
- compression plugin
- encryption plugin
- new optimizer enhancements in 6.0
- example shows 99.75% improvement, seems like a very edgy edge case
- High Availability
- MySQL 5.1 with disk-based cluster and replication for cluster
- Data Warehousing
- MySQL 5.1 with data partitioning
- data management becomes easy if one needs to delete many rows and they sit on one (smartly created) partition. Then a quick DROP DDL statement takes care of the job in a split second.
- better subquery optimizations (6.0)
- New Nitro engine available in 5.1 for real-time data warehousing
- Infobright engine for TB-sized data warehousing
- Kickfire
- MySQL 5.1 with data partitioning
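The partition pruning and "quick DROP" points above can be sketched with a toy model in Python. This is not how MySQL implements partitioning; the class `RangePartitionedTable` and its methods are hypothetical, and only show why reading or discarding one partition is cheap compared with touching every row. In MySQL itself the corresponding statement is ALTER TABLE ... DROP PARTITION.

```python
from collections import defaultdict

class RangePartitionedTable:
    """Toy table partitioned by year; each partition is a separate list of rows."""

    def __init__(self):
        self.partitions = defaultdict(list)  # year -> list of rows

    def insert(self, year, payload):
        self.partitions[year].append((year, payload))

    def select_year(self, year):
        # Partition pruning: only the matching partition is scanned.
        return list(self.partitions.get(year, []))

    def drop_partition(self, year):
        # The whole partition is discarded at once, with no row-by-row delete.
        return len(self.partitions.pop(year, []))

    def row_count(self):
        return sum(len(rows) for rows in self.partitions.values())

table = RangePartitionedTable()
for year in (2006, 2007, 2008):
    for n in range(1000):
        table.insert(year, f"event-{n}")

dropped = table.drop_partition(2006)
print(dropped, table.row_count(), len(table.select_year(2008)))
```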
- memcached
- MySQL Enterprise is going to start offering support
- MySQL Workbench
- use it
- reverse engineer a schema
- find differences
- sync
- free and paid version (nicer functionality in paid only?)
- 2008 plans are shown
- MySQL 6.0, Falcon GA in Q4
- Maria in Q4
- MySQL 6.x
- foreign keys in all storage engines
- better prepared statements
- better server-side cursors -> faster, less memory
- replication improvements -> checksums
- optimizer enhancements
- more
- MySQL 7.0?
- Alpha, Beta begin mid-2009
- GA expected 2009
- codename "Citadel"
- security oriented
- per-column data encryption
- external authentication methods
- online alter table -> Online DDL changes (holy crap, bring it NOW!!!)
- Infobright storage engine
- no indexes needed (wow, definitely need to research this)
- Kickfire
- Rob Young takes over
- talks about Enterprise plans, customer reported pains, a lot have to do with replication
- MySQL Load Balancer (Q3-Q4 2008)
- for high traffic, read intensive apps and websites
- application load balancing extension (not replacement)
- MySQL Enterprise Monitor
- needle in a haystack diagnosis
- MySQL Query Analyzer
- will be able to talk to the Enterprise Monitor
- MySQL Connection Manager (2009)
- connection pooler
- connection concentrator
- optimizes throughput of web applications
- multiplexing transactions onto a single connection
- Lunch time
MySQL Conference Liveblogging: Performance Guide For MySQL Cluster (Tuesday 10:50AM)
Tuesday, April 15th, 2008
- Speaker: Mikael Ronstrom, PhD, the creator of the Cluster engine
- Explains the cluster structure
- Aspects of performance
- Response times
- Throughput
- Low variation of response times
- Improving performance
- use low level API (NDB API), expensive, hard
- use new features in MySQL Cluster Carrier Grade Edition 6.3 (currently 6.3.13), more on this later
- proper partitioning of tables, minimize communication
- use of hardware
- NDB API is a C++ record access API
- supports sending parallel record operations within the same transaction or in different transactions
- asynchronous and synchronous
- NDB kernel is programmed entirely asynchronously
- Looking at performance
- Five synchronous insert transactions - 10x TCP/IP time cost
- Five inserts in one synchronous transaction - 2x TCP/IP time cost
- Five asynchronous insert transactions - 2x TCP/IP time cost
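The three cost figures above follow from a simple round-trip model, sketched here in Python. The assumption (mine, not stated in the talk notes) is that a synchronous transaction waits for one request and one response, i.e. two one-way TCP/IP message times, while batched or asynchronous operations overlap and share a single round trip. The function names are hypothetical.

```python
def sync_cost(transactions, one_way=1):
    """Sequential synchronous transactions: each waits for request + response."""
    return transactions * 2 * one_way

def batched_cost(operations, one_way=1):
    """All operations in one synchronous transaction share one round trip."""
    return 2 * one_way if operations > 0 else 0

def async_cost(transactions, one_way=1):
    """Asynchronous transactions are sent together, so their round trips overlap."""
    return 2 * one_way if transactions > 0 else 0

# Five inserts under each strategy, in units of one-way TCP/IP time.
print(sync_cost(5), batched_cost(5), async_cost(5))
```

Under this model the sequential synchronous case costs five times as much as either alternative, matching the 10x versus 2x figures in the notes.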
- Case study
- develop prototype using MySQL C API - performance X, response time Y
- develop same functionality using synchronous NDB API - performance 3X, response time ~0.5Y
- develop same functionality using asynchronous NDB API - performance 6X, response time ~0.25Y
- Conclusion on when to use NDB API
- performance is critical, need speed, response time, etc
- queries are not very complex
- Conclusion on when not to use NDB API
- when design time is critical
- when complex queries are executed, the MySQL optimizer may handle them better
- New features of MySQL Cluster Carrier Grade Edition 6.3.13
- polling based communication
- CPU used heavily even at lower throughput
- avoids interrupt and wake-up delays for new messages
- some good results in benchmarks
- decreases performance when CPU is the limiting factor
- 10% performance improvement on 2, 4, and 8 data node clusters
- 20% improvement if using Dolphin Express
- epoll replacing select system calls (Linux)
- improved performance 20% on a 32-node cluster
- send buffer gathering
- real-time scheduler for threads
- lock threads to CPU
- distribution awareness
- 100-200% improvement when application is distribution aware
- avoid read before Update/Delete with PK
- UPDATE t SET a=const1 WHERE pk=x;
- no need to do a read before UPDATE, all data is already known
- ~10% improvement
- old 'truths' revisited
- previous recommendation was to run 1 data node per computer
- this was due to bugs, which are now fixed
- partitioning tricks
- if there is a table that has a lot of index scans (not primary key) on it, partitioning this table to only be in one node group can be a good idea
- partition syntax for this: PARTITION BY KEY (id) (PARTITION p0 NODEGROUP 0);
- new performance features in MySQL Cluster 5.0
- lock memory in main memory - ensure no swapping occurs in NDB kernel
- batching IN (…) with primary keys
- 100x SELECT * FROM t WHERE pk=x;
- SELECT * FROM t WHERE pk IN (x1, …, x100)
- IN-statement is around 10x faster
- use of multi-INSERT
- similar 10x speedup
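A minimal sketch of the two batching patterns above, written in Python against an in-memory SQLite database so it runs without a cluster. SQLite is only a stand-in here: it has no network round trips, so the example counts statements rather than measuring the roughly 10x speedup quoted in the talk. The table `t` and column `pk` come from the notes; the column `val` is a hypothetical addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, val TEXT)")

# Multi-row INSERT: one statement carrying 100 rows instead of 100 statements.
rows = [(i, f"row-{i}") for i in range(1, 101)]
placeholders = ", ".join("(?, ?)" for _ in rows)
flat_params = [value for row in rows for value in row]
conn.execute(f"INSERT INTO t (pk, val) VALUES {placeholders}", flat_params)

keys = list(range(1, 101))

# Unbatched: one statement (one round trip on a networked server) per key.
single_results = []
single_statements = 0
for key in keys:
    single_results.extend(
        conn.execute("SELECT * FROM t WHERE pk = ?", (key,)).fetchall()
    )
    single_statements += 1

# Batched: a single statement with an IN list covering all keys.
in_list = ", ".join("?" for _ in keys)
batched_results = conn.execute(
    f"SELECT * FROM t WHERE pk IN ({in_list}) ORDER BY pk", keys
).fetchall()
batched_statements = 1

print(single_statements, batched_statements, single_results == batched_results)
```

Both approaches return the same rows; the batched form simply needs one statement where the unbatched form needs a hundred.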
- new features in MySQL Cluster CGE version 6.4 (beta, only available in bitkeeper for now)
- multi-threaded data nodes - currently no benefit using DBT2 but 40% increase in throughput for some NDB API benchmarks
- DBT2 improvements to follow later
- use of hardware, CPU choice
- Pentium D @ 2.8GHz -> Core 2 Duo @ 2.8GHz => 75% improvement
- doubling L2 cache doubles thread scalability
- choice of Dolphin Express interconnect increases throughput 10-400%
- scalability of DBT2 threads
- 1-2-4 threads - linear
- 4-8 threads - 40-70%
- 8-16 threads - 10-30%
- decreasing scalability over 16 threads
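To see what these figures add up to, here is a small worked calculation in Python. It assumes (my reading of the notes, not something the speaker confirmed) that each percentage is the throughput gain from doubling the thread count, with "linear" meaning +100% per doubling. The function name is hypothetical.

```python
def cumulative_speedup(gains):
    """Multiply per-doubling throughput gains (0.4 means +40%) into a total speedup."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total

# Doublings: 1->2 and 2->4 (linear, +100% each), then 4->8 and 8->16.
low = cumulative_speedup([1.0, 1.0, 0.40, 0.10])
high = cumulative_speedup([1.0, 1.0, 0.70, 0.30])

print(f"16 threads: {low:.2f}x to {high:.2f}x of single-thread throughput")
```

Under that reading, 16 threads would deliver somewhere between roughly 6x and 9x the throughput of a single thread, rather than 16x.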
- current recommendation by Mikael himself: use twice as many SQL nodes as data nodes
- future software performance improvements
- batched key access - 0-400% performance improvement
- improved scan protocol - ~15% improvement
- incremental backups
- optimized backup code
- parallel I/O on index scans using disk data
- Niagara-II benchmark from 2002
- simple read, simple update, both transactional
- 72-CPU Sun Fire 15K, 256GB RAM
- CPUs: UltraSPARC-III @ 900MHz
- 32-node NDB Cluster, 1 data node locked to 1 CPU
- db size 88GB, 900 mil records
- simple reads 1.5mil reads per second
- simple update 340,000 per second
- Everyone is overwhelmed, so no questions are asked
My MySQL Conference Schedule
Sunday, April 13th, 2008
Were there too many "my"'s in that title? Anyway… this week's MySQL conference is promising to be really busy and exciting. I can't wait to finally be there and experience it in all its glory. Thanks to the O'Reilly personal conference planner and scheduler and the advice of my fellow conference goers, I was able to easily (not really) pick out the speeches I am most interested in attending.
Here goes (my pass doesn't include Monday):
Tuesday
8:30am Tuesday, 04/15/2008
Keynote Ballroom E
Mårten Mickos (MySQL)
In his annual State of MySQL keynote, Marten discusses the current and future role of MySQL in the modern online world. The presentation also covers the acquisition by Sun of MySQL, the role open source is playing for users and customers all over the planet, and what the visions for the future are. Read more.
9:05am Tuesday, 04/15/2008
Open Source: The Heart of the Network Economy
Keynote Ballroom E
Jonathan Schwartz (Sun Microsystems)
Free software and open communities are the lifeblood of network innovation. Sun Microsystems CEO Jonathan Schwartz will highlight the rising open source tide and how Sun's recently announced acquisition of MySQL furthers free software as a platform for the web economy. Read more.
9:40am Tuesday, 04/15/2008
A Head in the Cloud - The Power of Infrastructure as a Service
Keynote Ballroom E
Werner Vogels (Amazon.com)
There are many challenges when building a reliable, flexible architecture that can manage unpredictable behaviors of today's internet business. This presentation will review some of the lessons learned from building one of the world's largest distributed systems; Amazon.com. Read more.
10:50am Tuesday, 04/15/2008
Performance Guide for MySQL Cluster
MySQL Cluster and High Availability, Performance Tuning and Benchmarks Ballroom D
Mikael Ronstrom (MySQL)
Learn about all the tricks required to make MySQL Cluster high performance. This includes using real-time scheduling, batching in all its form, cluster interconnects, and locking threads to CPUs. Read more.
11:55am Tuesday, 04/15/2008
The Future of MySQL: What You Need to Know About What's Coming
Architecture and Technology, General Ballroom B
Robin Schumacher (Sun/MySQL), Rob Young (Sun/MySQL)
What enhancements can you expect in the MySQL Server in the next few years? What new tools, services, and software is MySQL going to deliver this year and next to help you deploy and maintain MySQL applications? This session will let you in on all the plans MySQL has for the server, the Enterprise Monitor, the upcoming Load Balancer and Query Analyzer, management tools, and more. Read more.
2:00pm Tuesday, 04/15/2008
InnoDB: Status, Architecture, and New Features
Architecture and Technology Ballroom F
Heikki Tuuri (Innobase / Oracle Corp.), Ken Jacobs (Oracle / Innobase)
Ken Jacobs and Heikki Tuuri will describe the InnoDB architecture in depth, and discuss the new powerful performance-enhancing capabilities in InnoDB. Read more.
3:05pm Tuesday, 04/15/2008
Investigating Innodb Scalability Limits
Performance Tuning and Benchmarks Ballroom F
Peter Zaitsev (MySQL Performance Blog), Vadim Tkachenko (MySQLPerformanceBlog.com)
You may have heard Innodb has limited scalability with multiple CPUs and some of these were fixed in recent MySQL 5.0 versions. In this presentation we will look into which problems are fixed. Read more.
4:25pm Tuesday, 04/15/2008
Disaster is Inevitable—Are You Prepared?
Security and Database Administration Ballroom B
Farhan Mashraqi (Fotolog)
What’s the worst disaster you expect to happen? What can you do to better prepare for the disaster? Join us in this heart-racing, real-life inspired presentation for answers to these questions and more. Read more.
5:15pm Tuesday, 04/15/2008
Mitigating Replication Latency in a Distributed Application Environment
Architecture and Technology, Business and Case Studies, Replication and Scale-Out Ballroom E
Jeff Freund (Clickability)
Master-Master replication provides high availability and serviceability for the applications. Publishing web sites is a read-intensive operation, and the combination of Master-Slave replication with an application layer that intelligently splits database read and write operations allows for rapid scale out. Hear how Clickability solves issues for both environments. Read more.
Wednesday
8:30am Wednesday, 04/16/2008
Copyright Regime vs. Civil Liberties
Keynote Ballroom E
Rick Falkvinge (Swedish Pirate Party)
Rick Falkvinge, founder of the Swedish Pirate Party, talks about the rise and success of pirates and why pirates are necessary in today's politics. He'll also outline the next steps in the pirates' strategy to change global copyright laws. Read more.
9:15am Wednesday, 04/16/2008
Keynote Ballroom E
John Allspaw (Flickr (Yahoo!)), Jeff Rothschild (Facebook.com), Monty Taylor (MySQL), Domas Mituzas (MySQL), Paul Tuckfield (YouTube)
This lively panel discussion keynote will address the challenges large, modern web properties face in scaling MySQL. Panelists from Facebook, YouTube, and Flickr pair up with MySQL engineers in discussing the current and future problem domain and possible solutions. Read more.
10:00am Wednesday, 04/16/2008
Faster, Greener, Cheaper: Why Every MySQL Database Server Will One Day Have a SQL Chip
Keynote Ballroom E
Raj Cherabuddi (Kickfire)
The history of computing is full of algorithms such as graphics processing that are fine-tuned in general purpose CPUs over decades. Only when they are finally ported to dedicated hardware are tremendous improvements in speed, cost, and power realized. Raj Cherabuddi explains how a new SQL chip will revolutionize today’s database query processing. Read more.
10:50am Wednesday, 04/16/2008
Portable Scale-out Benchmarks for MySQL
Architecture and Technology, Performance Tuning and Benchmarks, Replication and Scale-Out Ballroom D
Robert Hodges (Continuent.com)
This talk presents new open source tools that allow users to set up and run database scale-out benchmarks easily. Hodges illustrates with benchmark results from your favorite MySQL configurations. Read more.
11:55am Wednesday, 04/16/2008
Applied Partitioning and Scaling Your Database System
General Ballroom D
Phil Hildebrand (thePlatform)
Take advantage of MySQL partitioning to allow your database applications to scale in both size and performance. A practical look at applying partitioning to OLTP database systems. Read more.
2:00pm Wednesday, 04/16/2008
Architecture of Maria: A New Storage Engine with a Transactional Design
Architecture and Technology, Performance Tuning and Benchmarks Ballroom E
Michael Widenius (MySQL)
A deep tour into the design of Maria, a new MVCC storage engine for MySQL from the original authors of MySQL that is designed to support transactions and automatic recovery. Read more.
3:05pm Wednesday, 04/16/2008
An Introduction to BLOB Streaming for MySQL Project
Java, Storage Engine Development and Optimization Ballroom A
Paul McCullagh (PrimeBase Technologies GmbH)
This session explains how the BLOB Streaming engine solves the problems involved in storing pictures, films, MP3 files, and other binary and text objects (BLOBs) in the database. Read more.
4:25pm Wednesday, 04/16/2008
Benchmarking and Monitoring: Tools of the Trade (Part I)
Performance Tuning and Benchmarks Ballroom D
Tom Hanlon (MySQL)
Benchmarking and Profiling are extremely important and a large array of tools exist for the job. Join Tom Hanlon for a tour of the current landscape. Demos of each tool will be shown. Read more.
5:15pm Wednesday, 04/16/2008
Benchmarking and Monitoring: Tools of the Trade (Part II)
Performance Tuning and Benchmarks, Security and Database Administration Ballroom D
Tom Hanlon (MySQL)
Join us for a presentation of the wonderful world of benchmarks and monitoring tools. Here you will learn what is available, how each tool works, and a demonstration using each tool against a running database from a veteran MySQL expert. Read more.
8:30pm Wednesday, 04/16/2008
Event Ballroom F
Have a drink, mingle with fellow conference participants, and enter our raffle to win great prizes, including a Sony PS3! Sponsored by Sun Microsystems. Read more.
Thursday
8:30am Thursday, 04/17/2008
Keynote Ballroom E
Dick Hardt (Sxip Identity Corporation)
Much of the data in a database is about people. Identity 2.0 technologies will lower the friction for people to provide and easily move data about themselves online. This fast paced keynote will offer a background on Identity 2.0, discuss current roadblocks and future opportunities, and explore the potential impacts these will have on databases. Read more.
9:15am Thursday, 04/17/2008
A Match Made in Heaven? The Social Graph and the Database
Keynote Ballroom E
Jeff Rothschild (Facebook.com)
Social applications integrate information about many different facets of people’s lives. Join us as Jeff Rothschild from Facebook looks at the power of the social graph, how it can increase the utility and adoption of applications, and its implications on storage architectures. Read more.
10:50am Thursday, 04/17/2008
MySQL Proxy, the Friendly Man in the Middle
Architecture and Technology Ballroom F
Jan Kneschke (MySQL), Jimmy Guerrero (Sun-MySQL)
MySQL Proxy is a tool to route, rewrite, handle, and block queries on the MySQL Protocol level. Load Balancing, Query Replay, Online Query Rewrites, and more with a grain of scripting. Read more.
11:55am Thursday, 04/17/2008
Sphinx: High Performance Full Text Search for MySQL
General Ballroom C
Andrew Aksyonoff (Sphinx Technologies), Peter Zaitsev (MySQL Performance Blog)
Sphinx is an open source full-text search engine designed for indexing databases and integrated especially well with MySQL. We'll talk about its features, capabilities, and real-world applications. Read more.
2:00pm Thursday, 04/17/2008
Top 20 DB Design Tips Every Architect Needs to Know
Architecture and Technology, Data Warehousing and Business Intelligence, Security and Database Administration Ballroom B
Ronald Bradford (Primebase Technologies)
Each database product has strengths and weaknesses. Having chosen MySQL as your database product, leverage the strengths of the product to maximize design and performance. Learn the things to avoid. Read more.
2:50pm Thursday, 04/17/2008
Architecture and Technology, Java, Ruby and MySQL Ballroom G
Farhan Mashraqi (Fotolog)
Lucene is a high performance, scalable, full-text search engine library that allows you to add search to any application. This presentation shows you how you can use Lucene within your environment. Read more.
3:50pm Thursday, 04/17/2008
The Science and Fiction of Petascale Analytics
Keynote Ballroom E
Jacek Becla (Stanford Linear Accelerator Center)
Scientists are trying to understand dark matter, discover distant galaxies, hunt for the Higgs boson, detect asteroids, and take movies of molecules. Their science is fascinating but their analysis requirements may seem like science fiction. Few have experienced the reality of petascale analytics so far, but everybody, including you, will face it tomorrow. Are we ready? Read more.
4:35pm Thursday, 04/17/2008
Event Ballroom E
Take the opportunity to network one last time at this closing event. Say thank you and exchange contact information until next year. Read more.
Phew. I think I've picked out the most interesting topics. I'm excited to see Peter, Farhan, Ron, Paul, Jan, and everyone else. I hope I didn't skip anything interesting…

beer planet is Artem Russakovskii's blog. Artem is a software engineer at