Exporting your App Engine Data with a Sitemap of Sitemaps

Currently there are about 200,000 archived items available in the Feederator database. Hoping to attract more traffic, I wanted to make this data available to search engines. But how could I do that without putting too much of a burden on my App Engine quota? One big sitemap certainly wouldn’t work, because the request from a search engine would hit the 30-second limit of a servlet call long before all the links to the individual items could be created.
OK, that means breaking it down into smaller chunks. But how to distribute the data more or less equally? That was an easy one: since Feederator adds more or less the same number of new feed items to the archive every day, I could split the chunks by day. Every item already has a publishedDate, which marks the date it entered the datastore. But that raised another problem: how to submit all the different sitemaps to search engines like Google or Bing?
It turns out there’s a solution for exactly that: a sitemap of sitemaps, also called a sitemap index. The super sitemap file has to look as described here. This file contains one entry per day and can be generated relatively easily by iterating over all days in the last three months (the length of time items are kept in the archive). I don’t even have to know how many items are available per day at this stage; it’s enough to assume there are some.
The index file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
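
As a rough illustration of the day-by-day iteration described above, here is a minimal Java sketch that builds such an index as a string. The base URL, the `date` query parameter, the class name, and the 90-day window (standing in for "three months") are all assumptions made for this example, not the actual Feederator code. In a servlet, the returned string would simply be written to the response.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class SitemapIndexSketch {

    // Hypothetical URL of the per-day sitemap endpoint (not taken from the original post).
    static final String BASE_URL = "https://example.com/sitemap?date=";

    /** Builds a sitemap index with one <sitemap> entry per day, newest day first. */
    static String buildIndex(LocalDate today, int days) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (int i = 0; i < days; i++) {
            // ISO_LOCAL_DATE formats the day as yyyy-MM-dd, e.g. 2010-12-19.
            String day = today.minusDays(i).format(DateTimeFormatter.ISO_LOCAL_DATE);
            xml.append("  <sitemap><loc>").append(BASE_URL).append(day)
               .append("</loc></sitemap>\n");
        }
        xml.append("</sitemapindex>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Assumption: roughly three months of archive, approximated as 90 days.
        System.out.print(buildIndex(LocalDate.now(), 90));
    }
}
```

Note that no datastore query is needed here: the index only lists dates, so the cost of generating it does not depend on how many items are archived.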


In my case, the same servlet that creates the sitemapindex file also creates the individual per-date sitemap files. To avoid large amounts of XML overhead, you can also export the per-date sitemaps in plain-text format. The sitemap file for 2010-12-19 looks like this:







It contains one link per line. Writing a servlet like that is pretty easy.
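
To give an idea of how little is involved, here is a minimal sketch of the per-day part in Java. It assumes the item ids for the requested day have already been loaded (in the real servlet, presumably by a datastore query on publishedDate). The item URL prefix, class name, and method name are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

class DailySitemapSketch {

    // Hypothetical item URL prefix; the real Feederator URLs are not shown in this post.
    static final String ITEM_URL = "https://example.com/item/";

    /** Renders a plain-text sitemap: one absolute URL per line and nothing else. */
    static String buildTextSitemap(List<String> itemIds) {
        StringBuilder out = new StringBuilder();
        for (String id : itemIds) {
            out.append(ITEM_URL).append(id).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Placeholder ids standing in for the items published on one day.
        System.out.print(buildTextSitemap(Arrays.asList("101", "102", "103")));
    }
}
```

A servlet would write this string to the response as UTF-8 plain text. The sitemap protocol caps a single sitemap file at 50,000 URLs, so a day with more items than that would need to be split further.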

Posted by squix78
