Beer Planet Upgraded To Wordpress 2.5
Saturday, March 29th, 2008
Updated: April 2nd, 2008
svn sw http://svn.automattic.com/wordpress/tags/2.5
and Beer Planet is on 2.5. There were no problems with the upgrade itself or any of the plugins. Great work on the new clean interface and multi-file upload, Wordpress!
Info here: http://codex.wordpress.org/Version_2.5 (short) and http://wordpress.org/development/2008/03/wordpress-25-brecker/ (more detailed)
Parsing JSON In Perl By Example - SouthParkStudios.com South Park Episodes
Thursday, March 27th, 2008
In this tutorial, I'll show you how to parse JSON using Perl. As a fun example, I'll use the new SouthParkStudios.com site released earlier this week, which contains full legal episodes of South Park. I guess the TV companies are finally getting a clue about what users want. I will parse the first season's JSON and pull out information about individual episodes (like title, description, air date, etc) from http://www.southparkstudios.com/includes/utils/proxy_feed.php?html=season_json.jhtml%3fseason=1. Feel free to replace '1' with any valid season number.
Here's a short snippet of the JSON:
{
season:{
episode:[
{
title:'Cartman Gets an Anal Probe',
description:'While the boys are waiting for the school bus, Cartman explains the odd nightmare he had the previous night involving alien visitors.',
thumbnail:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=55&quality=100',
thumbnail_larger:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=63&quality=100',
thumbnail_190:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=190&quality=100',
id:'103511',
airdate:'08.13.97',
episodenumber:'101',
available:'true',
when:'08.13.97'
}
,
{
title:'Weight Gain 4000',
description:'When Cartman\'s environmental essay wins a national contest, America\'s sweetheart, Kathie Lee Gifford, comes to South Park to present the award.',
thumbnail:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=55&quality=100',
thumbnail_larger:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=63&quality=100',
thumbnail_190:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=190&quality=100',
id:'103516',
airdate:'08.20.97',
episodenumber:'102',
available:'true',
when:'08.20.97'
}
...
]
}
}
Before you can parse JSON, you need to have a few libraries. Install them using CPAN, for example:
cpan
install JSON
install JSON::XS
install WWW::Mechanize # my favorite library for browsing
Now the script.
#!/usr/bin/perl -w
# $Rev: 1 $
# $Author: artem $
# $Date: 2008-03-25 14:28:39 -0800 (Tue, 25 Mar 2008) $
use strict;
use WWW::Mechanize;
use JSON -support_by_pp;
fetch_json_page("http://www.southparkstudios.com/includes/utils/proxy_feed.php?html=season_json.jhtml%3fseason=1");
sub fetch_json_page
{
my ($json_url) = @_;
my $browser = WWW::Mechanize->new();
eval{
# download the json page:
print "Getting json $json_url\n";
$browser->get( $json_url );
my $content = $browser->content();
my $json = JSON->new; # direct method call instead of indirect object syntax
# these are some nice json options to relax restrictions a bit:
my $json_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);
# iterate over each episode in the JSON structure:
my $episode_num = 1;
foreach my $episode(@{$json_text->{season}->{episode}}){
my %ep_hash = ();
$ep_hash{title} = "Episode $episode_num: $episode->{title}";
$ep_hash{description} = $episode->{description};
$ep_hash{url} = "http://www.southparkstudios.com/episodes/" . $episode->{id};
$ep_hash{publish_date} = $episode->{airdate};
$ep_hash{thumbnail_url} = $episode->{thumbnail_190} || $episode->{thumbnail_larger};
# print episode information:
while (my($k, $v) = each (%ep_hash)){
print "$k => $v\n";
}
print "\n";
$episode_num++;
}
};
# catch crashes:
if($@){
print "[[JSON ERROR]] JSON parser crashed! $@\n";
}
}
Here's the output:
Getting json http://www.southparkstudios.com/includes/utils/proxy_feed.php?html=season_json.jhtml%3fseason=1
publish_date => 08.13.97
url => http://www.southparkstudios.com/episodes/103511
thumbnail_url => http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=190&quality=100
title => Episode 1: Cartman Gets an Anal Probe
description => While the boys are waiting for the school bus, Cartman explains the odd nightmare he had the previous night involving alien visitors.

publish_date => 08.20.97
url => http://www.southparkstudios.com/episodes/103516
thumbnail_url => http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=190&quality=100
title => Episode 2: Weight Gain 4000
description => When Cartman's environmental essay wins a national contest, America's sweetheart, Kathie Lee Gifford, comes to South Park to present the award.

...
Of particular interest here is the way JSON accepts settings:
my $json_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);
I found that these settings fix most of the crashes and incompatibilities while parsing various JSON pages.
Is there something I've missed? Do you know a better way to parse JSON in Perl? Unclear about something? Don't hesitate to share in the comments.
Getting The Most Out Of The MySQL Conference
Wednesday, March 26th, 2008
As half of the world population already knows, the MySQL conference is coming in less than 3 weeks. Since this event only happens once a year, lasts only 4 days, and costs more than a Russian mail-order bride, I'd really like to get the most out of it. Considering that the schedule is completely packed, with 8 (!!) events going on in parallel, I imagine things can get a little frantic. Additionally, I've never been to a conference of such size before and I'm not sure what to expect.
So… I'm contemplating:
- printing out the event schedule and drawing a zig-zagging "map" of exactly where I'll be jumping to next, once the previous presentation ends. I'm actually wondering if I'll need to figure out where all the events are located exactly in advance. How big is that place? Did Google invent in-building walking maps yet? Do people normally jump from one presentation to another parallel one or is that unheard of?
- bringing a laptop to take notes. I find that my brain tends to retain mostly the general ideas for a good period of time. Code details and specifics tend to flush a lot sooner. Keeping notes (and publishing them online) is the best way to retain all this tasty information. Learn it and start doing it, don't be lazy. For my note taking application, I actually prefer Microsoft (:gasp:) OneNote. It keeps things organized and has a couple of neat tricks up its sleeve, like built-in OCR, Win-S shortcut for a quick area-defined screenshot, integration with Outlook, audio note-taking. Aha!..
- recording audio at every presentation, is that allowed?
- getting plenty of sleep the night before each conference day, as the amount of information is going to be simply crushing. I guess I'm going to have to postpone my 3am sessions until Friday or so.
- bribing an organ thief to steal Peter Zaitsev's brain and replace it with a statistical computer chip capable of running 17 billion MySQL benchmarks a second. Nobody is going to notice the difference anyway.
Do you have any tips? How do YOU handle conferences? Please share in the comments.
Setting Up A MySQL Cluster
Wednesday, March 26th, 2008
Updated: March 29th, 2008
This article contains my notes and detailed instructions on setting up a MySQL cluster. After reading it, you should have a good understanding of what a MySQL cluster is capable of, how and why it works, and how to set one of these bad boys up. Note that I'm primarily a developer with an interest in systems administration, but I think that every developer should be able to understand and set up a MySQL cluster, at least to make the dev environment more robust.
- In short, a MySQL cluster allows a user to set up a MySQL database shared between a number of machines. Here are some benefits:
- High availability. If one or some of the machines go down, the cluster will stay up, as long as there is at least one copy of all data still present. The more redundant copies of data there are, the more machines you can afford to lose.
- Scalability. Distributed architecture allows for load balancing. If your MySQL database is getting hit with lots of queries, consider setting up a cluster to spread this load in almost linear fashion. A 4 node cluster should be able to handle twice as many queries as a 2 node cluster.
- Online backups.
- Full support for transactions.
- Must-have manual: MySQL Clustering by Alex Davies and Harrison Fisk, MySQL Press.
- First and foremost, I would like to get this out of the way (from MySQL Clustering):
- Response time with MySQL Cluster is quite commonly worse than it is with the traditional setup. Yes, response time is quite commonly worse with clustering than with a normal system. If you consider the architecture of MySQL Cluster, this will begin to make more sense.
- When you do a query with a cluster, it has to first go to the MySQL server, and then it goes to storage nodes and sends the data back the same way. When you do a query on a normal system, all access is done within the MySQL server itself. It is clearly faster to access local resources than to read the same thing across a network. Response time is very much dependent on network latency because of the extra network traffic. Some queries may be faster than others due to the parallel scanning that is possible, but you cannot expect all queries to have a better response time.
- So if the response time is worse, why would you use a cluster? First, response time isn't normally very important. For the vast majority of applications, 10ms versus 15ms isn't considered a big difference.
- Where MySQL Cluster shines is in relation to the other two metrics: throughput and scalability.
- A typical MySQL cluster setup involves 3 components in at least this configuration:
- 1 management (ndb_mgmd) node.
- Management nodes contain the cluster configuration.
- A management node is only needed to connect new storage and query nodes to the cluster and do some arbitration.
- Existing storage and query nodes continue to operate normally if the management node goes down.
- Therefore, it's relatively safe to have only 1 management node running on a very low spec machine (configuring 2 management nodes is possible but is slightly more complex and less dynamic).
- Interfacing with a management node is done via an ndb_mgm utility.
- Management nodes are configured using config.ini.
- My setup here involves 1 management node.
- 2 storage (ndbd) nodes.
- You do not interface directly with those nodes, instead you go through SQL nodes, described next.
- It is possible to have more storage nodes than SQL nodes.
- It is possible to host storage nodes on the same machines as SQL nodes.
- It is possible, although not recommended, to host storage nodes on the same machines as management nodes.
- Storage nodes will split up the data between themselves automatically. For example, if you want to store each row on 2 machines for redundancy (NoOfReplicas=2) and you have 6 storage nodes, your data is going to be split up into 3 distinct non-intersecting chunks, called node groups.
- Given a correctly formulated query, it is possible to make MySQL scan all 3 chunks in parallel, thus returning the result set quicker.
- Node groups are formed implicitly, meaning you cannot assign a storage node to a specific node group. What you can do, however, is manipulate the IDs of the nodes in such a way that the servers you want will get assigned to the node groups you want. Nodes with consecutive IDs get assigned to the same node group until there are NoOfReplicas nodes in that group, at which point a new node group starts.
- Storage nodes are configured using /etc/my.cnf. They are also affected by settings in config.ini on the management node.
- My setup here involves 4 storage nodes.
- 2 query (SQL) nodes.
- SQL nodes are regular mysqld processes that access data in the cluster. You guessed it right - the data sits in storage nodes, and SQL nodes just serve as gateways to them.
- Your application will connect to these SQL node IPs and will have no knowledge of storage nodes.
- It is possible to have more SQL nodes than storage nodes.
- It is possible to host SQL nodes on the same machines as storage nodes.
- It is possible, although not recommended, to host SQL nodes on the same machines as management nodes.
- SQL nodes are configured using /etc/my.cnf. They are also affected by settings in config.ini on the management node.
- My setup here involves 4 SQL nodes.
- Normally a cluster doesn't want to start if not all the storage nodes are connected (from MySQL Clustering).
- Therefore, the cluster waits longer during the restart if the nodes aren't all connected so that the other storage nodes can connect. This period of time is specified in the setting StartPartialTimeout, which defaults to 30 seconds. If at the end of 30 seconds, a cluster is possible (that is, it has one node from each node group) and it can't be in a network partitioned situation (that is, it has all of one node group), the cluster will perform a partial cluster restart, in which it starts up even though storage nodes are missing.
- If the cluster is in a potential network partitioned setup, where it doesn't have all of a single node group, then it will wait even longer, with a setting called StartPartitionedTimeout, which defaults to 60 seconds.
- Adding databases propagates to all SQL nodes (at least with the latest version of MySQL), so when you create a new database, you only need to do it once on any SQL node. However, users don't propagate, so each SQL node will need to have its own users set up. Warning: do NOT try to change the MySQL internal tables (the ones in database mysql) to type ndbcluster as the cluster will break.
- I will think of something else to put here.
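The start-up rules quoted above can be sketched as a small decision function. This is only an illustration in Python, not anything MySQL ships: the function name, its return values, and the node IDs are made up for the example. Only the two setting names and their defaults (StartPartialTimeout = 30s, StartPartitionedTimeout = 60s) come from the text.

```python
# Defaults as quoted from MySQL Clustering above.
START_PARTIAL_TIMEOUT = 30      # seconds
START_PARTITIONED_TIMEOUT = 60  # seconds


def startup_wait(node_groups, connected):
    """Return (which rule applies, its default timeout in seconds).

    node_groups: list of lists of storage node IDs, one list per node group.
    connected:   IDs of the storage nodes that have connected so far.
    """
    connected = set(connected)
    live = [[n for n in group if n in connected] for group in node_groups]

    # Every storage node is connected: nothing to wait for.
    if all(len(l) == len(g) for l, g in zip(live, node_groups)):
        return ("start immediately", 0)
    # Some node group has no node at all: part of the data is missing.
    if any(not l for l in live):
        return ("no cluster possible", None)
    # At least one node group is complete, so a network partition is ruled out.
    if any(len(l) == len(g) for l, g in zip(live, node_groups)):
        return ("StartPartialTimeout", START_PARTIAL_TIMEOUT)
    # One node from each group, but no complete group: possibly partitioned.
    return ("StartPartitionedTimeout", START_PARTITIONED_TIMEOUT)


# Hypothetical layout: 4 storage nodes, NoOfReplicas=2, so 2 node groups.
groups = [[2, 3], [4, 5]]
print(startup_wait(groups, [2, 3, 4, 5]))  # ('start immediately', 0)
print(startup_wait(groups, [2, 3, 4]))     # ('StartPartialTimeout', 30)
print(startup_wait(groups, [2, 4]))        # ('StartPartitionedTimeout', 60)
print(startup_wait(groups, [2, 3]))        # ('no cluster possible', None)
```

The sketch only says which timeout governs the wait. It does not model what the cluster does once that timeout expires.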
Notes
My Setup
This is my sample configuration with sample IPs:
- mysql-5.1.22-rc-linux-i686-icc-glibc23
- 1x management node (OpenSUSE): 10.0.0.1
- 4x storage (ndbd) nodes (OpenSUSE): 10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5.
- 4x query (SQL) nodes (OpenSUSE): 10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5.
- NoOfReplicas = 2, meaning there will be 2 copies of all data and therefore 4/2=2 node groups.
- Cluster data will sit in /var/lib/mysql-cluster.
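To make the node group arithmetic concrete, here is a small Python sketch of the rule described in the notes: nodes with consecutive IDs share a node group until the group holds NoOfReplicas nodes. The node IDs 2 to 5 are an assumption for illustration (real IDs depend on config.ini), and both function names are hypothetical.

```python
def assign_node_groups(node_ids, no_of_replicas):
    """Group storage nodes by consecutive ID, NoOfReplicas nodes per group."""
    ids = sorted(node_ids)
    if len(ids) % no_of_replicas != 0:
        raise ValueError("storage node count must be a multiple of NoOfReplicas")
    return [ids[i:i + no_of_replicas] for i in range(0, len(ids), no_of_replicas)]


def cluster_has_all_data(node_groups, failed_ids):
    """True while every node group still has at least one live node."""
    failed = set(failed_ids)
    return all(any(n not in failed for n in group) for group in node_groups)


# Assumed IDs 2..5 for the four storage nodes (10.0.0.2 to 10.0.0.5).
groups = assign_node_groups([2, 3, 4, 5], no_of_replicas=2)
print(groups)                                # [[2, 3], [4, 5]]
print(cluster_has_all_data(groups, [2, 4]))  # True: one node left in each group
print(cluster_has_all_data(groups, [2, 3]))  # False: a whole node group is gone
```

With 6 storage nodes and NoOfReplicas=2, the same function yields the 3 node groups mentioned in the notes. The second function shows why the choice of IDs matters: losing two machines is survivable only if they sit in different node groups.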
Sample Screenshot
Here is a sample screenshot of another one of my configurations showing a similar setup. This is the output of show on the management node:
Setup Instructions
On the management node (as root):
groupadd mysql
useradd -g mysql mysql
mkdir -p /root/src/
cd /root/src/
wget http://dev.mysql.com/get/Downloads/MySQL-5.1/mysql-5.1.22-rc-linux-i686-icc-glibc23.tar.gz/from/http://mysql.he.net/
tar xvzf mysql-*.tar.gz
rm mysql-*.tar.gz
- ndb_mgmd is the management server
- ndb_mgm is the management client
cp mysql-*/bin/ndb_mg* /usr/bin/
chmod +x /usr/bin/ndb_mg*
mkdir /var/lib/mysql-cluster
chown mysql:mysql /var/lib/mysql-cluster
vi /var/lib/mysql-cluster/config.ini
Download /var/lib/mysql-cluster/config.ini
ndb_mgmd -f /var/lib/mysql-cluster/config.ini
ndb_mgm
show
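The linked config.ini is not reproduced on this page. As a rough idea of what a minimal one might look like for the sample setup, here is a Python sketch that renders it. Treat it as an assumption, not the actual file: it uses only the standard section names ([ndbd default], [ndb_mgmd], [ndbd], [mysqld]) and the NoOfReplicas, DataDir and HostName parameters, and a real config would likely carry more tuning. The function name is hypothetical.

```python
def render_config_ini(mgm_host, storage_hosts, sql_hosts,
                      no_of_replicas=2, data_dir="/var/lib/mysql-cluster"):
    """Render a minimal MySQL Cluster config.ini as a string (sketch only)."""
    lines = [
        "[ndbd default]",
        f"NoOfReplicas={no_of_replicas}",
        f"DataDir={data_dir}",
        "",
        "[ndb_mgmd]",
        f"HostName={mgm_host}",
        f"DataDir={data_dir}",
    ]
    # One [ndbd] section per storage node; order matters for node groups
    # when node IDs are left to be assigned automatically.
    for host in storage_hosts:
        lines += ["", "[ndbd]", f"HostName={host}"]
    # One [mysqld] section per SQL node.
    for host in sql_hosts:
        lines += ["", "[mysqld]", f"HostName={host}"]
    return "\n".join(lines) + "\n"


# Sample IPs from "My Setup": storage and SQL nodes share the same machines.
nodes = ["10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
config = render_config_ini("10.0.0.1", nodes, nodes)
print(config)
```

Writing the returned string to /var/lib/mysql-cluster/config.ini would take the place of the vi step above. Generating the file this way is just a convenience for keeping the host list in one place.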
On each storage and SQL node (as root):
groupadd mysql
useradd -g mysql mysql
cd /usr/local
wget http://dev.mysql.com/get/Downloads/MySQL-5.1/mysql-5.1.22-rc-linux-i686-icc-glibc23.tar.gz/from/http://mysql.he.net/
tar xvzf mysql-*.tar.gz
rm mysql-*.tar.gz
ln -s `echo mysql-*` mysql
cd mysql
chown -R root .
chown -R mysql data
chgrp -R mysql .
scripts/mysql_install_db --user=mysql
cp support-files/mysql.server /etc/init.d/
chmod +x /etc/init.d/mysql.server
vi /etc/my.cnf
Download /etc/my.cnf
mkdir /var/lib/mysql-cluster
chown mysql:mysql /var/lib/mysql-cluster
cd /var/lib/mysql-cluster
su mysql
/usr/local/mysql/bin/ndbd --initial # start the storage node and force it to (re)read the config
exit
echo "/usr/local/mysql/bin/ndbd" > /etc/init.d/ndbd
chmod +x /etc/init.d/ndbd
/etc/init.d/mysql.server restart # start the query node
SUSE:
chkconfig --add mysql.server # this is SUSE's way of starting applications on system boot
chkconfig --add ndbd
chkconfig --list mysql.server
chkconfig --list ndbd
Ubuntu:
sudo apt-get install sysv-rc-conf # this is chkconfig's equivalent in Ubuntu
sysv-rc-conf mysql.server on
sysv-rc-conf ndbd on
sysv-rc-conf --list mysql.server
sysv-rc-conf --list ndbd
That's it! At this point you should go back to the management console that you logged into earlier (ndb_mgm) and issue the 'show' command again. If everything is fine, you should see your storage and SQL nodes connected. Now you can log in to any SQL node, make some users, and create new ndb tables. If you're experiencing problems, do leave a message in the comments.
In the next mysql cluster article, I will explore various cluster error messages I have encountered as well as config file tweaking. Now go and spend some time outside in the sun - life is too short to waste it at a dark office.
Navicat For MySQL Bugs Filed
Tuesday, March 25th, 2008
Updated: April 22nd, 2008
Update: Looks like both of these have been fixed in 8.0.26.
Navicat For MySQL is a GUI for MySQL developers. I've tried a few tools before but somehow got attached to Navicat due to a few nice features that I'm not going to go into right now. Navicat suffers from a couple of annoying bugs and random crashes. I don't know if I can help fix the random ones but if I can at least file the ones I can reproduce, everyone wins. I have version 8.0.23, the latest as of today.
Bug [NAL-15328]: Structure Sync Fails to notice encoding differences
| Last Update: | 13 Mar 2008 12:38 PM |
| Last Replier: | Mayho Ho |
| Status: | Open |
| Department: | Navicat Support Center |
| Created On: | 13 Mar 2008 09:52 AM |
Structure sync doesn't see the difference between my columns that are utf8 and that are latin1. This is a severe bug. I relied on it to compare some production tables and wasn't aware some fields weren't utf8 in one of the tables until I dumped the DDL.
In 2 different databases:
-- Table "test" DDL
CREATE TABLE `test` (
`a` varchar(255) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

-- Table "test" DDL
CREATE TABLE `test` (
`a` varchar(255) character set utf8 default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Now run Structure Sync. It will not report any differences and won't even show utf8ness in the 2nd table.
Bug [VXL-67626]: Navicat crash
| Last Update: | 25 Mar 2008 10:37 AM |
| Last Replier: | Mayho Ho |
| Status: | Open |
| Department: | Navicat Support Center |
| Created On: | 07 Mar 2008 04:50 AM |
If I open the Server Monitor and there's a really long query running (I don't have the exact length), for example a really long SELECT with A OR B OR C, etc, it just crashes. I think there's a buffer overrun somewhere in Navicat's code, so when it does SHOW FULL PROCESSLIST internally, it overruns that buffer.
I would appreciate a fix for this as I lose all my open queries and tables when this happens. I'd be glad to answer any additional questions, but I hope the gist of it is clear.
A fix would involve at least letting a user select between SHOW PROCESSLIST and SHOW FULL PROCESSLIST as well as fixing the crash.
I was able to come up with a test case. I did warn you the query had to be really large :-].
CREATE TABLE `media` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`index_status` enum('none') DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Open an instance of Navicat and start a transaction:
> begin;
> select * from media for update;
Now open another Navicat session and run the attached, quite long SQL in vxl-67626.txt. Because I locked up the table in the previous step, you'll have plenty of time to open up the Server Monitor. The query should remain in the process list until the first transaction is canceled or completed. Now launch Server Monitor and observe it crash.

beer planet is Artem Russakovskii's blog. Artem is a software engineer at