Do NOT Use This Perl Module: Passwd::Unix
Updated: April 29th, 2008
Update: The author of the module contacted me the same day and promised to fix it in the next version. Version 0.40 did appear on CPAN as promised, but I haven't tested it yet.
Passwd::Unix will corrupt your /etc/shadow file and rearrange login names and their corresponding password hashes.
The current version of Passwd::Unix corrupted my /etc/shadow as soon as I
called the passwd() function. Users immediately started reporting that they
could not log in.
After examining the situation, I found that Passwd::Unix reorders the
users in /etc/shadow in some way, but it reorders only the
usernames, not the password hashes. Thus, you will get corrupted accounts. Moreover,
users are now able to log in to one OTHER account, not …
One thing that still springs to mind when I think of the MySQL User Conference last week is Sun's opening keynote. While talking about Sun's market penetration with open source software, Jonathan Schwartz, Sun's CEO, slipped in a short mention of the mobile market saying something along the lines of "Sun is going to be entering the mobile market later on this year". He didn't spend more than 5 seconds talking about it, moving on to the acquisition of MySQL.
Last year, Sun already announced JavaFX, a Java-based mobile platform, but didn't provide any concrete timelines, so I was excited to hear more on the subject. With Apple iPhone's advent last year and …
MySQL Conference Liveblogging: Optimizing MySQL For High Volume Data Logging Applications (Thursday 2:50PM)
- http://en.oreilly.com/mysql2008/public/schedule/detail/874
- presented by Charles Lee of Hyperic
- Hyperic has the best performance with MySQL out of MySQL, Oracle, and Postgres in their application
- I suddenly remember that Hyperic was highly recommended over Nagios in MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)
- performance bottleneck
- the database
- CPU
- memory
- disk latency
- network latency
- 300 platforms (300 remote agents collecting data)
- 2,100 servers
- 21,000 services (10 services per server), sounds feasible
- 468,000 metrics (20 metrics per service)
- 28,800,000 metric data rows per day
- larger deployments have a lot more of these (sounds crazy)
- measurement_id
- timestamp
- value
- primary key (timestamp, measurement_id)
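To put the row counts above in perspective, here is a rough back-of-the-envelope sketch; it was not shown in the talk. It uses Python's built-in sqlite3 purely as a stand-in for MySQL. The table name metric_data, the column types, and the sample values are assumptions; only the three columns and the (timestamp, measurement_id) primary key come from the notes.

```python
import sqlite3

ROWS_PER_DAY = 28_800_000          # figure quoted in the notes above
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_insert_rate(rows_per_day: int) -> float:
    """Average inserts per second needed to keep up with the daily volume."""
    return rows_per_day / SECONDS_PER_DAY


def create_metric_table(conn: sqlite3.Connection) -> None:
    """Create a table shaped like the one in the notes (types are assumed)."""
    conn.execute(
        """
        CREATE TABLE metric_data (
            timestamp      INTEGER NOT NULL,
            measurement_id INTEGER NOT NULL,
            value          REAL,
            PRIMARY KEY (timestamp, measurement_id)
        )
        """
    )


conn = sqlite3.connect(":memory:")
create_metric_table(conn)

# Two measurements taken at the same (made-up) timestamp.
conn.executemany(
    "INSERT INTO metric_data VALUES (?, ?, ?)",
    [(1208300000, 1, 0.5), (1208300000, 2, 0.7)],
)
row_count = conn.execute("SELECT COUNT(*) FROM metric_data").fetchone()[0]

print(f"rows stored: {row_count}")
print(f"average inserts/second: {sustained_insert_rate(ROWS_PER_DAY):.1f}")
```

Leading the key with the timestamp means rows for the same point in time sit next to each other, and the same (timestamp, measurement_id) pair cannot be stored twice.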
MySQL Conference Liveblogging: MySQL Hidden Treasures (Thursday 11:55PM)
- Damien Seguy of Nexen Services presents
- easiest session of all (phew, that's a relief)
- clever SQL recipes
- tweaking SQL queries
- shows an example where SELECT is ORDERED by a column that is actually an enum.
- an enum is both a string and a number
- sorted by number
- displayed as string
- can be sorted by string if it's cast as string
- compacts storage
- faster to search
- if (var)char is turned into enum, some space can be saved, shows example
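The "both a string and a number" point can be modelled in a few lines of Python. This is only an illustration of the sorting behaviour described above, not the example from the session; the ENUM members and the sample rows are hypothetical.

```python
# Hypothetical ENUM('small', 'medium', 'large'): members are numbered
# 1, 2, 3 in definition order, which here differs from alphabetical order.
SIZES = ["small", "medium", "large"]


def enum_index(label: str) -> int:
    """Return the 1-based position of a label in the ENUM definition."""
    return SIZES.index(label) + 1


rows = ["large", "small", "medium", "small"]

# Sorting on the column itself uses the internal number ...
by_number = sorted(rows, key=enum_index)

# ... while sorting on the value cast to a string is alphabetical.
by_string = sorted(rows)

print(by_number)
print(by_string)
```

The stored value is the small integer, which is where the space saving over a (var)char column comes from.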
MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)
Updated: April 18th, 2008
- Tom Hanlon of MySQL presents
- monitoring tool basics
- SHOW FULL PROCESSLIST
- SHOW GLOBAL STATUS
- SHOW GLOBAL VARIABLES
- basic tools
- mysqladmin is provided with the server
- mysqladmin -i 10 extended-status: will repeat the same command every 10 seconds. Pipe through grep "and smoke it" (bad pun, hah hah)
- -r: show only changed values
- MySQL Administrator
- cacti
- rrdtool based network graphing tool
- uses snmp
- PHP apache and MySQL based solution
- MySQL plugins, download and install
- "poller" gathers data and populates the graphs
- someone offers munin as an alternative
- not snmp based, its own agent is used
- pros
- cacti is fairly easy to configure
- cons
- could be CPU intensive with lots of machines (Perl polling seems to be the
…
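The idea behind combining -i with -r in the mysqladmin notes above is to look at how counters move between two samples rather than at their absolute values. A minimal sketch of that idea follows; it is not how mysqladmin is implemented, and the counter values are made up.

```python
def relative_status(previous: dict, current: dict) -> dict:
    """Return per-counter differences between two status snapshots."""
    return {name: current[name] - previous.get(name, 0) for name in current}


def changed_only(deltas: dict) -> dict:
    """Keep only the counters that moved between the two snapshots."""
    return {name: delta for name, delta in deltas.items() if delta != 0}


# Two hypothetical snapshots taken 10 seconds apart.
snapshot_a = {"Questions": 1000, "Com_select": 700, "Slow_queries": 3}
snapshot_b = {"Questions": 1250, "Com_select": 880, "Slow_queries": 3}

deltas = relative_status(snapshot_a, snapshot_b)
print(changed_only(deltas))
```

Dividing each delta by the sampling interval gives a per-second rate, which is usually easier to graph than a counter that only ever grows.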
MySQL Conference Liveblogging: Benchmarking Tools (Wednesday 4:25PM)
- Tom Hanlon of MySQL presents
- Benchmarking tools
- mysqlslap (with MySQL 5.1)
- sql-bench
- supersmack – Jeremy Zawodny's tool
- Apache Bench (combined with some sample PHP scripts)
- MySQL's benchmark() function
- mybench
- WAST
- JMeter
- sql-bench
- pros
- ubiquitous
- long history of use
- cons
- single thread
- Perl
- not always real-life test cases (create 10k tables?)
- list of tests follows
- pros
- supersmack
- configurable, flexible
- 1000 queries, 50 users
- super-smack -d mysql select-key-smack 50 1000
- can modify queries to be closer to what your own application uses
- pros
- benches concurrent connections
- well documented
- cons
- test language sucks
- Apache Bench
- webserver benchmarking tool
- point to a webserver, utilizes concurrent users
- siege, httperf, httpload are similar
- 404 errors deliver really quickly, so make sure to check for those
- benchmark()
- tests
…
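The warning above about 404 errors is worth a small illustration: fast error responses can make an average response time look far better than it is. The sketch below is an assumption-laden example, not the output format of Apache Bench or any other tool; the (status, seconds) samples are invented.

```python
from collections import Counter
from statistics import mean


def summarize(samples):
    """Summarize (http_status, seconds) pairs from a benchmark run."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples to summarize")
    ok_times = [seconds for status, seconds in samples if status == 200]
    return {
        # How many responses came back with each status code.
        "by_status": dict(Counter(status for status, _ in samples)),
        # Average over everything, including fast error pages.
        "mean_all": mean(seconds for _, seconds in samples),
        # Average over successful responses only.
        "mean_ok": mean(ok_times) if ok_times else None,
    }


# Hypothetical run: half of the requests hit a missing page.
run = [(200, 0.120), (200, 0.080), (404, 0.002), (404, 0.002)]
report = summarize(run)
print(report)
```

In this made-up run the overall mean is roughly half the mean of the successful requests, which is exactly the kind of flattering number to watch out for.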
MySQL – Sun – Flickr – Fotolog – Wikipedia – Facebook – YouTube Comparison – MySQL Conference Day 2 Keynote
Updated: April 24th, 2008
Unfortunately I couldn't find a seat to take notes, but a very interesting keynote took place this morning. Representatives from the 7 large companies mentioned in the title gathered on stage and answered various questions from MySQL's Kaj Arno.
These questions included things like "how many MySQL servers do you have", "how many DBAs", etc. It was a lot of fun; hopefully someone (Sheeri) will edit and post the video soon.
Keith has a nice summary of everything that went on together with the numbers here.
Update: Venu has even better notes here….
MySQL Conference Liveblogging: Introduction To The BLOB Streaming Project (Wednesday 3:00PM)
- Paul McCullagh presents
- BLOB
- invented by Jim Starkey
- Basic Large OBject
- Binary Large OBject
- photos, films, mp4 files, pdfs, etc
- mysql client send buffer -> receive buffer on the server (max_allowed_packet)
- streaming a BLOB
- continuous data stream
- stream BLOB data directly in and out of the database
- store BLOBs of any size (>4GB) in the database
- create a scalable back-end that can handle any throughput and storage requirements. Wouldn't need to know in advance how big the database will get
- provide an open system that can be used by all engines
- provide extensions for BLOB streaming to existing MySQL clients
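The contrast drawn above is between buffering a whole BLOB (bounded by max_allowed_packet) and moving it as a continuous stream. Here is a generic sketch of chunked streaming between two file-like objects; it is not the BLOB Streaming project's API, and the function names and the 64 KiB chunk size are assumptions chosen for illustration.

```python
import io


def stream_chunks(source, chunk_size=64 * 1024):
    """Yield successive chunks from a binary file-like object."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def copy_stream(source, sink, chunk_size=64 * 1024):
    """Copy source to sink chunk by chunk and return the byte count."""
    total = 0
    for chunk in stream_chunks(source, chunk_size):
        sink.write(chunk)
        total += len(chunk)
    return total


# A 200,000-byte in-memory payload stands in for a large BLOB.
payload = io.BytesIO(b"x" * 200_000)
destination = io.BytesIO()
copied = copy_stream(payload, destination)
print(f"copied {copied} bytes")
```

Memory use is bounded by the chunk size rather than by the size of the object, which is what makes very large values practical.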
MySQL Conference Liveblogging: MySQL Performance Under A Microscope: The Tobias And Jay Show (Wednesday 2:00PM)
- Jay Pipes, Tobias Asplund
- Finding out the number of rows that would have been returned (MyISAM and InnoDB)
- SQL_CALC_FOUND_ROWS and FOUND_ROWS()
- COUNT(*)
- MEMORY table
- if query cache is on, then it makes no difference
- if it's off
- Memory MyISAM is fastest
- FOUND_ROWS() is slightly slower than count(*)
- SELECT … WHERE a UNION SELECT … WHERE b vs SELECT … WHERE a AND b
- index_merge wins
- composite index is faster
- of course, multiple indexes are more flexible than composite index
- …
MySQL Conference Liveblogging: Applied Partitioning And Scaling your (OLTP) Database System (Wednesday 11:55AM)
- Phil Hilderbrand of thePlatform for Media, Inc presents
- classic partitioning
- old school – union in the archive tables
- auto partitioning and partition pruning
- great for data warehousing
- query performance improved
- maintenance is clearly improved
- often id driven access vs date driven access
- 1 big client could be 80% of the whole database, so it is difficult to select a partitioning scheme
- reducing seek and scan set sizes
- improving insert/update durations
- making maintenance easier
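Partition pruning, mentioned in the notes above, boils down to skipping every partition whose range cannot contain matching rows. The following is a toy model of that idea, not MySQL's implementation; the monthly partition names and date ranges are hypothetical.

```python
from datetime import date

# Hypothetical monthly RANGE partitions: name -> (inclusive start, exclusive end).
PARTITIONS = {
    "p2008_01": (date(2008, 1, 1), date(2008, 2, 1)),
    "p2008_02": (date(2008, 2, 1), date(2008, 3, 1)),
    "p2008_03": (date(2008, 3, 1), date(2008, 4, 1)),
    "p2008_04": (date(2008, 4, 1), date(2008, 5, 1)),
}


def prune(partitions, query_start, query_end):
    """Return the partitions that overlap the range [query_start, query_end)."""
    return [
        name
        for name, (start, end) in partitions.items()
        if start < query_end and query_start < end
    ]


# A date-driven query only has to touch two of the four partitions.
to_scan = prune(PARTITIONS, date(2008, 3, 15), date(2008, 4, 10))
print(to_scan)
```

This also shows why id-driven access is awkward with date-based partitions: a lookup that carries no date range gives the pruning step nothing to work with, so every partition has to be considered.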
MySQL Conference Liveblogging: Portable Scale-out Benchmarks For MySQL (Wednesday 10:50AM)
- Robert Hodges from Continuent presents
- About Continuent
- leading provider of open source database availability and scaling solutions
- uni/cluster – multi-master database clustering that replicates data across multiple databases and load balances reads
- uses "database virtualization"
- protection from db and site failures
- continuous operation during upgrades
- Brewer's conjecture
- DDL support
- inconsistent reads between replicas
- deadlocks
- sequences
- non-deterministic SQL
- data replication
- where are updates processed? master/master vs master/slave
- when are updates replicated? sync vs async
MySQL Conference Liveblogging: Disaster Is Inevitable – Are You Prepared? (Tuesday 4:25PM)
- Suicide
- having no backups
- depending on slaves for backup
- keeping backups on same SAN
- having a single DBA – Frank didn't like this one at all
- not keeping binlogs
- how much time?
- uncompressed backup ready to mount?
- separate network for recovery?
- first problem: backup was highly compressed (tar.gz)
- uncompressing took hours
- so keep uncompressed backups (at least last N days)
- it should be mountable, rather than transferable
MySQL Conference: Presentation At The Kickfire Booth
Updated: April 17th, 2008
I had a chance to visit the Kickfire booth after the keynotes and before the first presentation. They gave me a kicking t-shirt, followed by a presentation on the newly announced Kickfire appliance (now in beta, shipping in Fall 2008). Here are some notes I jotted down:
- von Neumann bottleneck
- SQL chip (SQC), packs the power of 10s of conventional CPUs
- Query parallelization on the chip
- On-chip memory – 64GB. No registers – no von Neumann bottleneck
- Beats the performance of a given 3 server, 32 CPU, 130TB box (1TB of actual data – space is used for distributing IO)
- SQC uses column-store, compression, intelligent indexing
- SQL Chip, PCI connection, plugs into a Linux server
- SQL execution
- Memory management
- Loader
…
MySQL Conference Liveblogging: EXPLAIN Demystified (Tuesday 2:00PM)
- Baron Schwartz presents
- only works for SELECTs
- nobody dares admit if they've never seen EXPLAIN
- MySQL actually executes the query
- at each JOIN, instead of executing the query, it fills the EXPLAIN result set
- everything is a JOIN (even SELECT 1)
- Columns in EXPLAIN
- id: which SELECT the row belongs to
- select_type
- simple
- subquery
- derived
- union
- union result
- join
- range
- …
MySQL Conference Liveblogging: The Future Of MySQL (Tuesday 11:55AM)
- Robin Schumacher
- gives overview of MySQL products
- MySQL Enterprise
- MySQL 5.1 announced
- table/index partitioning -> great for data warehouses, range, hash, key, list, composite, subpartitioning. Partition pruning. Response time greatly improved with proper partitioning.
- row-based/hybrid replication -> safer and smarter
- disk-based cluster -> supports bigger DBs
- built-in job scheduler -> simplified task management
- problem SQL identification -> easier troubleshooting. Dynamic query tracing is now available, no need to trace things in slow query logs.
- faster full-text search -> 500% increase in some cases
- 5.1.24RC available for the conference
- MySQL 6.0
- Falcon engine – transactional engine
- new backup (version 1.0) -> cross engine, non-blocking, to replace mysqldump
- Falcon
- planned default transactional storage engine. Q4 GA (general availability).
- not InnoDB replacement
- most
…