How To Fix Incomplete WordPress (WXR) Exports

Posted by Artem Russakovskii on April 13th, 2012 in PHP, Programming, Tips, Wordpress

Having spent way more time on this problem than I really should have, I'm going to make sure everyone can actually find a solution instead of useless WordPress support threads.

The Problem

I wanted to export all the data from WordPress using its native export mechanism (located at http://YOURBLOG/wp-admin/export.php), but since the blog I was working on was pretty large (6k posts, 120k comments), I kept getting XML files that ended prematurely and for which xmllint spit out this error:

Premature end of data in tag channel

Upon closer inspection, I saw the XML file ended with a random, yet always fully closed, </item> tag, but was missing the closing </channel> and </rss> tags, as well as a whole bunch of data.

My immediate theory was that PHP was running out of memory. Interestingly, the number of <item> elements was always divisible by 20, and after looking at the code in export.php, I saw why: the loop grabs 20 posts at a time. This made it obvious that the code crashed while processing one of the batches, which only made the out-of-memory theory stronger.
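To check an export for the symptoms described above without loading the whole file into memory, something like the following rough sketch could be used. The function name wxr_export_summary() is made up for illustration and is not part of WordPress; it also assumes each <item> tag appears literally in the file, so a post body that happens to contain the text "<item>" would be miscounted.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): scan a WXR file line by line
 * and report how many <item> tags it holds and whether it ends with </rss>.
 */
function wxr_export_summary(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $items = 0;
    $lastLine = '';
    while (($line = fgets($handle)) !== false) {
        $items += substr_count($line, '<item>');
        // Remember the last non-blank line so the ending can be inspected.
        if (trim($line) !== '') {
            $lastLine = trim($line);
        }
    }
    fclose($handle);

    return [
        'items'    => $items,
        // A complete export ends with the closing </rss> tag.
        'complete' => substr($lastLine, -6) === '</rss>',
    ];
}
```

If 'complete' comes back false and 'items' is a multiple of 20, the export most likely died between batches, which matches the failure described here.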

The Solution

After raising the memory in the export.php file itself and verifying the fix, I came up with the following solution that removes the need to modify core WordPress files. Just add this somewhere in functions.php:

/**
 * Dynamically increase allowed memory limit for export.
 */
function my_export_wp() {
  ini_set('memory_limit', '1024M');
}
add_action('export_wp', 'my_export_wp');

The code above will dynamically set the memory limit high enough for all but unimaginably large jobs to complete. Feel free to adjust this limit if you have, for instance, millions of comments and lots of RAM on the web server.

And there you have it – full exports. Much better, isn't it?

P.S. Some shared hosting providers have PHP set to ignore the directive above, in which case neither this solution nor any other, short of upgrading your hosting, will do anything.
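One way to find out whether a host honors the change is to look at what ini_set() returns, since it gives back false when the new value is refused. The sketch below is only an illustration: my_ini_bytes() and my_raise_memory_limit() are hypothetical names, and the shorthand parsing covers just the K, M and G suffixes.

```php
<?php
/** Convert a php.ini shorthand value such as "128M" to bytes (-1 means unlimited). */
function my_ini_bytes(string $value): int
{
    $value  = trim($value);
    $number = (int) $value; // the cast keeps the leading digits, e.g. "128M" gives 128
    switch (strtoupper(substr($value, -1))) {
        case 'G':
            $number *= 1024; // intentional fall-through
        case 'M':
            $number *= 1024; // intentional fall-through
        case 'K':
            $number *= 1024;
    }
    return $number;
}

/** Raise memory_limit to $wanted unless it is already high enough; report success. */
function my_raise_memory_limit(string $wanted): bool
{
    $current = my_ini_bytes((string) ini_get('memory_limit'));
    if ($current === -1 || $current >= my_ini_bytes($wanted)) {
        return true; // already unlimited or at least as high as requested
    }
    // ini_set() returns false when the change is not allowed.
    return ini_set('memory_limit', $wanted) !== false;
}
```

Calling my_raise_memory_limit('1024M') from the export_wp callback and logging a false result would show fairly quickly whether the host is the limiting factor.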

P.P.S. There is a bug (#15203) in WordPress <=3.3.1 that doesn't properly escape posts containing CDATA, which breaks the XML. It is slated for a fix in WordPress 3.4, but if you haven't upgraded yet (it's not even out at the time of this writing), you can still apply the fix manually, like so. My dumps wouldn't validate before the fix, but I've confirmed that they fully validate after it.
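For context, the usual way to make arbitrary text safe inside a CDATA section is to split every "]]>" sequence across two sections. The sketch below shows that general technique under a made-up function name, my_cdata_wrap(); it is not presented as the exact patch attached to the ticket.

```php
<?php
/**
 * Hypothetical helper: wrap a string in CDATA, splitting any "]]>" it
 * contains so the section cannot be closed early by the post content.
 */
function my_cdata_wrap(string $str): string
{
    // "]]>" becomes "]]" + end of section + start of new section + ">".
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $str);
    return '<![CDATA[' . $safe . ']]>';
}
```

An XML parser reads the adjacent sections back as one continuous string, so the original post content, including its literal "]]>", survives the round trip.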

● ● ●
Artem Russakovskii is a San Francisco programmer and blogger. Follow Artem on Twitter (@ArtemR) or subscribe to the RSS feed.

  • redwall_hp

    I usually just use mysqldump. 🙂

    I've run into the memory/execution time limits when helping someone import them via phpMyAdmin, though. For that I ended up splitting the dump into multiple files and feeding them in one at a time. http://www.webmaster-source.com/2011/09/26/how-to-import-a-very-large-sql-dump-with-phpmyadmin/

    • Ah, but I wanted to get WXR working – I needed it for validating XML for importing to DISQUS. Of course, if you are doing full DB dumps, that's a whole other problem.

      • redwall_hp

        Ah, Disqus. That makes sense.

  • Jai

    Here are more ways to fix your WordPress export file for Disqus

  • maria ozawa

    thank you! Very good

  • fishsquad

    I added that to the end of my functions.php and WP immediately stopped working; it just returns a blank page in the browser.

  • fishsquad

    putting the following in my wp-config.php worked for me instead: ini_set("max_execution_time", 60);

  • riquez

    I was able to improve export results by using the above code (ram set to 2048) AND also using fishsquad's ini_set() suggestion below.
    The result was I could export in batches of 6 months.

    Without these changes the file would truncate once it got over 3.6mb. With the changes my file could reach 7.2mb.