Apparently it's not straightforward to install SOAP::Lite, even using CPAN.

Check this out.

cpan[1]> install SOAP::Lite
CPAN: Storable loaded ok (v2.18)
Going to read /root/.cpan/Metadata
  Database was generated on Tue, 29 Apr 2008 18:29:45 GMT
CPAN: YAML loaded ok (v0.66)
Going to read /root/.cpan/build/
............................................................................DONE
Found 149 old builds, restored the state of 109
Warning: Cannot install SOAP::Lite, don't know what it is.
Try the command
 
    i /SOAP::Lite/
 
to find objects with matching identifiers.
CPAN: Time::HiRes loaded ok (v1.9713)

Huh? Okay…

cpan[2]> i /SOAP::Lite/    
Module    ResourcePool::Command::SOAP::Lite::Call (MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz)
Module    ResourcePool::Factory::SOAP::Lite (MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz)
Module    ResourcePool::Resource::SOAP::Lite (MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz)
Module    SOAP::Lite::Deserializer::XMLSchema1999 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::Deserializer::XMLSchema2001 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::Deserializer::XMLSchemaSOAP1_1 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::Deserializer::XMLSchemaSOAP1_2 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::InstanceExporter (SMEISNER/SOAP-Lite-InstanceExporter-0.02.tar.gz)
Module    SOAP::Lite::Packager   (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::Simple     (LLAP/SOAP-Lite-Simple-1.9.tar.gz)
Module    SOAP::Lite::Simple::DotNet (LLAP/SOAP-Lite-Simple-1.4.tar.gz)
Module    SOAP::Lite::Simple::Real (LLAP/SOAP-Lite-Simple-1.4.tar.gz)
Module    SOAP::Lite::Utility    (BRYCE/SOAP-Lite-Utility-0.01.tar.gz)
Module    SOAP::Lite::Utils      (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
14 items found

Wtf? Let's try something else.

cpan[8]> i /SOAP.*Lite/                           
Distribution    BRYCE/SOAP-Lite-Utility-0.01.tar.gz
Distribution    BYRNE/SOAP/SOAP-Lite-0.60a.tar.gz
Distribution    DYACOB/SOAP-Lite-ActiveWorks-0.10.tar.gz
Distribution    DYACOB/SOAP-Lite-SmartProxy-0.11.tar.gz
Distribution    LLAP/SOAP-Lite-Simple-1.4.tar.gz
Distribution    LLAP/SOAP-Lite-Simple-1.9.tar.gz
Distribution    MKUTTER/SOAP-Lite-0.71.04.tar.gz
Distribution    MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz
Distribution    SMEISNER/SOAP-Lite-InstanceExporter-0.02.tar.gz
Module    Catalyst::Action::SOAP::DocumentLiteral (DRUOSO/Catalyst-Controller-SOAP-0.8.tar.gz)
Module    Catalyst::Action::SOAP::DocumentLiteralWrapped (DRUOSO/Catalyst-Controller-SOAP-0.8.tar.gz)
Module    Catalyst::Action::SOAP::RPCLiteral (DRUOSO/Catalyst-Controller-SOAP-0.8.tar.gz)
Module    Catalyst::Controller::SOAP::DocumentLiteralWrapped (DRUOSO/Catalyst-Controller-SOAP-0.8.tar.gz)
Module    Net::DRI::Transport::HTTP::SOAPLite (PMEVZEK/Net-DRI-0.85.tar.gz)
Module    ResourcePool::Command::SOAP::Lite::Call (MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz)
Module    ResourcePool::Factory::SOAP::Lite (MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz)
Module    ResourcePool::Resource::SOAP::Lite (MWS/ResourcePool-Resource-SOAP-Lite-1.0101.tar.gz)
Module  = SOAP::Lite::Deserializer::XMLSchema1999 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module  = SOAP::Lite::Deserializer::XMLSchema2001 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module  = SOAP::Lite::Deserializer::XMLSchemaSOAP1_1 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module  = SOAP::Lite::Deserializer::XMLSchemaSOAP1_2 (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::InstanceExporter (SMEISNER/SOAP-Lite-InstanceExporter-0.02.tar.gz)
Module  = SOAP::Lite::Packager   (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
Module    SOAP::Lite::Simple     (LLAP/SOAP-Lite-Simple-1.9.tar.gz)
Module    SOAP::Lite::Simple::DotNet (LLAP/SOAP-Lite-Simple-1.4.tar.gz)
Module    SOAP::Lite::Simple::Real (LLAP/SOAP-Lite-Simple-1.4.tar.gz)
Module    SOAP::Lite::Utility    (BRYCE/SOAP-Lite-Utility-0.01.tar.gz)
Module  = SOAP::Lite::Utils      (MKUTTER/SOAP-Lite-0.71.04.tar.gz)
28 items found

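Aha! There's no plain SOAP::Lite module entry in the index, but it's hiding under a Distribution. Tricky, tricky. Installing by the distribution path works: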

cpan
install MKUTTER/SOAP-Lite-0.71.04.tar.gz
  CPAN.pm: Going to build M/MK/MKUTTER/SOAP-Lite-0.71.04.tar.gz
 
We are about to install SOAP::Lite and for your convenience will provide
you with list of modules and prerequisites, so you'll be able to choose
only modules you need for your configuration.
 
XMLRPC::Lite, UDDI::Lite, and XML::Parser::Lite are included by default.
Installed transports can be used for both SOAP::Lite and XMLRPC::Lite.
 
Press  to see the detailed list.  
 
Feature                       Prerequisites                Install?
----------------------------- ---------------------------- --------
Core Package                  [*] Scalar::Util             always  
                              [*] Test::More                       
                              [*] URI                              
                              [*] MIME::Base64                     
                              [*] version                          
                              [*] XML::Parser (v2.23)              
Client HTTP support           [*] LWP::UserAgent           always  
Client HTTPS support          [*] Crypt::SSLeay            [ yes ] 
Client SMTP/sendmail support  [ ] MIME::Lite               [ no ]  
Client FTP support            [*] IO::File                 [ yes ] 
                              [*] Net::FTP                         
Standalone HTTP server        [*] HTTP::Daemon             [ yes ] 
Apache/mod_perl server        [ ] Apache                   [ no ]  
FastCGI server                [ ] FCGI                     [ no ]  
POP3 server                   [*] MIME::Parser             [ yes ] 
                              [*] Net::POP3                        
IO server                     [*] IO::File                 [ yes ] 
MQ transport support          [ ] MQSeries                 [ no ]  
JABBER transport support      [ ] Net::Jabber              [ no ]  
MIME messages                 [*] MIME::Parser             [ yes ] 
DIME messages                 [*] IO::Scalar (v2.105)      [ no ]  
                              [ ] DIME::Tools (v0.03)              
                              [ ] Data::UUID (v0.11)               
SSL Support for TCP Transport [ ] IO::Socket::SSL          [ no ]  
Compression support for HTTP  [*] Compress::Zlib           [ yes ] 
MIME interoperability w/ Axis [ ] MIME::Parser (v6.106)    [ no ]  
--- An asterix '[*]' indicates if the module is currently installed.
 
Do you want to proceed with this configuration? [yes] 
Checking if your kit is complete...
Looks good
Writing Makefile for SOAP::Lite
cp lib/SOAP/Packager.pm blib/lib/SOAP/Packager.pm
cp lib/XML/Parser/Lite.pm blib/lib/XML/Parser/Lite.pm
...
Writing /usr/lib/perl5/site_perl/5.10.0/i686-linux/auto/SOAP/Lite/.packlist
Appending installation info to /usr/lib/perl5/5.10.0/i686-linux/perllocal.pod
  MKUTTER/SOAP-Lite-0.71.04.tar.gz
  /usr/bin/make install  -- OK

The latest version of SOAP::Lite is installed. Time to pat yourself on the back and write some code that actually uses it.
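Before writing real code, it may be worth confirming that Perl can actually load the module. Here is a minimal sketch; `module_version` is a hypothetical helper name of mine, not something provided by CPAN or SOAP::Lite.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# 'unknown' if it loads but declares no version, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # SOAP::Lite -> SOAP/Lite.pm
    return eval { require $file; $module->VERSION // 'unknown' };
}

my $version = module_version('SOAP::Lite');
if (defined $version) {
    print "SOAP::Lite $version is installed\n";
}
else {
    print "SOAP::Lite could not be loaded\n";
}
```

The same helper works for any other module name, which is handy when a CPAN install scrolls past too quickly to tell whether it succeeded.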

I'm sure most Perl coders face this annoying problem at one point or another: how do you consistently get the exit status of an external command, be it executed via backticks or system()? Backticks return the output of the program, with no exit status in the return value, while system() returns the status but prints the output instead of putting it into a variable.

The best solution I could find to this problem to date was posted at http://www.perlmonks.org/?node_id=19119 and involved opening a piped filehandle. It worked quite well but always felt like a hack (which it was). Having used the new Perl 5.10 for a few months, I was shocked today to find this new variable that I've been dreaming about for years:

${^CHILD_ERROR_NATIVE}

This variable gives the native status returned by the last pipe close, backtick command, successful call to wait() or waitpid(), or from the system() operator. See perlrun for details. (Contributed by Gisle Aas.)

http://search.cpan.org/dist/perl-5.10.0/pod/perl5100delta.pod#New_internal_variables

I've just tested it and it works as described. Finally! What else can I say?
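Here is a small sketch of what that looks like in practice, assuming Perl 5.10 or later on a Unix-like system. The child command (the running perl printing "hello" and exiting with code 3) is made up for the demonstration. For comparison, the sketch also reads `$?`, which Perl sets after backticks and pipe closes as well; `$? >> 8` is the decoded exit code, while `${^CHILD_ERROR_NATIVE}` is the raw status as the operating system reported it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;    # ${^CHILD_ERROR_NATIVE} first appeared in Perl 5.10

# Backticks: the output lands in a variable, the status in the special variables.
# $^X is the path of the running perl; the child command is made up for the demo.
my $output = `$^X -e "print 'hello'; exit 3"`;
my $native = ${^CHILD_ERROR_NATIVE};    # raw status, platform dependent
my $exit   = $? >> 8;                   # exit code decoded from $?

print "backticks output: $output\n";
print "backticks exit code: $exit\n";
print "backticks native status: $native\n";

# The older piped-filehandle approach: read the output, then close() sets $?.
open(my $fh, '-|', $^X, '-e', 'print "hello"; exit 3')
    or die "cannot start child: $!";
my $pipe_output = do { local $/; <$fh> };
close($fh);    # returns false here because the child exited non-zero
my $pipe_exit = $? >> 8;

print "pipe output: $pipe_output\n";
print "pipe exit code: $pipe_exit\n";
```

The native value is deliberately not decoded here, since its layout depends on the platform.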

Do NOT Use This Perl Module: Passwd::Unix

Tuesday, April 22nd, 2008

Updated: April 29th, 2008

Update: The author of the module contacted me the same day and promised to fix it in the next version. Version 0.40 did appear on CPAN as promised, but I haven't tested it yet.

Passwd::Unix will corrupt your /etc/shadow file and rearrange login names and their corresponding password hashes.

The current version of Passwd::Unix corrupted my /etc/shadow as soon as I
called the passwd() function. Users immediately started reporting that
they couldn't log in.

After examining the situation, I found that Passwd::Unix reorders the
entries in /etc/shadow, but it only moves the usernames, not the
password hashes. The result is corrupted accounts. Worse, each user can
now log in to one OTHER account instead of their own, depending on how
the usernames got shuffled.

Thankfully, I had a recent backup but I definitely don’t want anyone
else to suffer.
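If you want to try a module like this yourself, one cautious approach is to run it against a copy and compare the login-to-hash pairs before and after. The following is a minimal sketch of such a check, not anything Passwd::Unix provides: `shadow_map` and `changed_logins` are hypothetical helpers, and the sample entries and hashes are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: parse shadow-format text ("login:hash:...")
# into a hash reference mapping login => password hash.
sub shadow_map {
    my ($text) = @_;
    my %map;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($login, $hash) = split /:/, $line, 3;
        $map{$login} = $hash // '';
    }
    return \%map;
}

# Hypothetical helper: list logins whose hash differs between the two maps,
# or which exist in only one of them.
sub changed_logins {
    my ($before, $after) = @_;
    my %seen = map { $_ => 1 } keys %$before, keys %$after;
    my @changed = grep {
        !exists $before->{$_}
            || !exists $after->{$_}
            || $before->{$_} ne $after->{$_}
    } sort keys %seen;
    return @changed;
}

# Made-up sample data: the second text has the usernames swapped
# while the hashes stay in place, as described above.
my $before = shadow_map("alice:HASH_A:13900:0:99999:7:::\nbob:HASH_B:13900:0:99999:7:::\n");
my $after  = shadow_map("bob:HASH_A:13900:0:99999:7:::\nalice:HASH_B:13900:0:99999:7:::\n");

my @changed = changed_logins($before, $after);
print "changed: @changed\n";
```

In real use the two texts would come from a backup copy and the modified copy, never from the live /etc/shadow.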

I’m using Perl 5.10 on SUSE 10.3. If the module is incompatible with SUSE, it needs
to say so and exit.

I've filed the bug here: http://rt.cpan.org/Public/Bug/Display.html?id=35323.

You have been warned.

In this tutorial, I'll show you how to parse JSON using Perl. As a fun example, I'll use the new SouthParkStudios.com site released earlier this week, which contains full legal episodes of South Park. I guess the TV companies are finally getting a clue about what users want. I will parse the first season's JSON and pull out information about individual episodes (like title, description, air date, etc) from http://www.southparkstudios.com/includes/utils/proxy_feed.php?html=season_json.jhtml%3fseason=1. Feel free to replace '1' with any valid season number.

Here's a short snippet of the JSON. Note the unquoted keys and single-quoted strings, which strict JSON doesn't allow:

{
season:{
 
episode:[
 
{	
title:'Cartman Gets an Anal Probe',
description:'While the boys are waiting for the school bus, Cartman explains the odd nightmare he had the previous night involving alien visitors.',
thumbnail:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=55&quality=100',
thumbnail_larger:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=63&quality=100',
thumbnail_190:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=190&quality=100',	
id:'103511',
airdate:'08.13.97',
episodenumber:'101',
available:'true',
when:'08.13.97'
}
 
,
 
{	
title:'Weight Gain 4000',
description:'When Cartman\'s environmental essay wins a national contest, America\'s sweetheart, Kathie Lee Gifford, comes to South Park to present the award.',
thumbnail:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=55&quality=100',
thumbnail_larger:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=63&quality=100',
thumbnail_190:'http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=190&quality=100',	
id:'103516',
airdate:'08.20.97',
episodenumber:'102',
available:'true',
when:'08.20.97'
}
 
...
 
]
}
}

Before you can parse JSON, you need to have a few libraries. Install them using CPAN, for example:

cpan
install JSON
install JSON::XS
install WWW::Mechanize # my favorite library for browsing

Now the script.

#!/usr/bin/perl -w
# $Rev: 1 $
# $Author: artem $
# $Date: 2008-03-25 14:28:39 -0800 (Tue, 25 Mar 2008) $
 
use strict;
use WWW::Mechanize;
use JSON -support_by_pp;
 
fetch_json_page("http://www.southparkstudios.com/includes/utils/proxy_feed.php?html=season_json.jhtml%3fseason=1");
 
sub fetch_json_page
{
  my ($json_url) = @_;
  my $browser = WWW::Mechanize->new();
  eval{
    # download the json page:
    print "Getting json $json_url\n";
    $browser->get( $json_url );
    my $content = $browser->content();
    my $json = JSON->new;
 
    # these are some nice json options to relax restrictions a bit:
    my $json_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);
 
    # iterate over each episode in the JSON structure:
    my $episode_num = 1;
    foreach my $episode(@{$json_text->{season}->{episode}}){
      my %ep_hash = ();
      $ep_hash{title} = "Episode $episode_num: $episode->{title}";
      $ep_hash{description} = $episode->{description};
      $ep_hash{url} = "http://www.southparkstudios.com/episodes/" . $episode->{id};
      $ep_hash{publish_date} = $episode->{airdate};
      $ep_hash{thumbnail_url} = $episode->{thumbnail_190} || $episode->{thumbnail_larger};
 
      # print episode information:
      while (my($k, $v) = each (%ep_hash)){
        print "$k => $v\n";
      }
      print "\n";
 
      $episode_num++;
    }
  };
  # catch crashes:
  if($@){
    print "[[JSON ERROR]] JSON parser crashed! $@\n";
  }
}

Here's the output:

Getting json http://www.southparkstudios.com/includes/utils/proxy_feed.php?html=season_json.jhtml%3fseason=1
publish_date => 08.13.97
url => http://www.southparkstudios.com/episodes/103511
thumbnail_url => http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e01_480.jpg&width=190&quality=100
title => Episode 1: Cartman Gets an Anal Probe
description => While the boys are waiting for the school bus, Cartman explains the odd nightmare he had the previous night involving alien visitors.
 
publish_date => 08.20.97
url => http://www.southparkstudios.com/episodes/103516
thumbnail_url => http://www.southparkstudios.com/includes/utils/proxy_resizer.php?image=/images/south_park/episode_thumbnails/s01e02_480.jpg&width=190&quality=100
title => Episode 2: Weight Gain 4000
description => When Cartman's environmental essay wins a national contest, America's sweetheart, Kathie Lee Gifford, comes to South Park to present the award.
 
...

Of particular interest here is the way JSON accepts settings:

my $json_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);

I found that these settings fix most of the crashes and incompatibilities while parsing various JSON pages.
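To see what two of these options buy you without hitting the network, here is a small self-contained sketch. It uses JSON::PP directly (the pure-Perl backend, which is where `allow_barekey` and `allow_singlequote` are implemented) instead of the JSON wrapper used in the script, and the input string is a cut-down, made-up version of the feed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Made-up sample in the same style as the feed: bare keys, single-quoted strings.
my $relaxed = q|{ season:{ episode:[ { title:'Weight Gain 4000', id:'103516' } ] } }|;

# With the two options enabled, the text decodes into a normal Perl structure.
my $parser = JSON::PP->new->allow_barekey->allow_singlequote;
my $data   = $parser->decode($relaxed);
my $title  = $data->{season}{episode}[0]{title};
print "title: $title\n";

# A parser with default (strict) settings rejects the same text.
my $strict_ok = eval { JSON::PP->new->decode($relaxed); 1 } ? 1 : 0;
print "strict parse ", ($strict_ok ? "succeeded" : "failed"), "\n";
```

Turning the options on one at a time like this is also a quick way to find out which ones a particular feed really needs.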

Is there something I've missed? Do you know a better way to parse JSON in Perl? Unclear about something? Don't hesitate to share in the comments.

Updated: May 1st, 2008

Sometimes in my line of work, I need to figure out whether a URL or filename points to a media file by checking the file extension. If it's a URL, however, the extension may be followed by various parameters. To avoid overcomplicating things, I came up with the following Perl code:

#!/usr/bin/perl -w
use strict;
my $name = "some_file.flv"; # or http://example.com/file.mp4?foo=bar
my $is_media_type = ($name =~ /\.(wmv|avi|flv|mov|mkv|mp..?|swf|ra.?|rm|as.|m4[av]|smi.?)\b/i);
if($is_media_type){
  print "media extension found\n";
}
else{
  print "not a media file\n";
}

This gets the job done without triggering any false positives (at least for the files and URLs I've been dealing with so far). Am I missing any obvious types? Do you have a better way to accomplish the same thing? If so, please share in the comments.
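One possible tightening, offered as a sketch and not as the definitive answer: because the pattern above is not anchored, it also matches names such as notes.mp3.txt. The variant below strips the query string and fragment first, then requires the extension at the very end. The explicit extension list is my assumption about what the wildcards were meant to cover, and `is_media_name` is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed extension list: my reading of what the wildcard pattern was meant to cover.
my @MEDIA_EXT = qw(wmv avi flv mov mkv mp3 mp4 mpg mpeg swf ra ram rm asf asx m4a m4v smi smil);
my $MEDIA_RE  = do {
    my $alternatives = join '|', @MEDIA_EXT;
    qr/\.(?:$alternatives)\z/i;    # extension must be the last thing in the path
};

# Hypothetical helper: true if the filename or URL ends in a known media extension.
sub is_media_name {
    my ($name) = @_;
    (my $path = $name) =~ s/[?#].*//s;    # drop query string and fragment
    return $path =~ $MEDIA_RE ? 1 : 0;
}

for my $name ('some_file.flv', 'http://example.com/file.mp4?foo=bar', 'notes.mp3.txt') {
    printf "%-40s %s\n", $name, is_media_name($name) ? 'media' : 'not media';
}
```

The trade-off is that an explicit list has to be maintained by hand, whereas the wildcard version picks up new variants (and occasional strays) on its own.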