Automagically ping blog search engines

I wanted to automatically ping Technorati, Icerocket, and Google Blog Search, that means with no intervention the blog search engines should be pinged. I was alright with a delay of 15 minutes.

So I went about exploiting the XML-RPC services provided by the blog search engines. I came up with this python script. I set up a cron job to invoke the script every 15 minutes. See bellow for the source.
[sourcecode language=’py’]#!/usr/bin/python

import xmlrpclib
import urllib2
import os

from hashlib import md5

feed_url = ‘[Yorur feed url]’
blog_url = ‘[Your blog url]’
blog_name = ‘[Your blog name]’
hash_file_path = os.path.expanduser(“~/.blogger/”)

def main():
req = urllib2.Request(feed_url)
response = urllib2.urlopen(req)
feed = response.read()
hash_file_name = hash_file_path + md5(blog_url).hexdigest()

if os.path.exists(hash_file_name):
hash_file = open(hash_file_name, “r+”)
last_digest = hash_file.read(os.path.getsize(hash_file_name))
else:
hash_file = open(hash_file_name, “w”)
last_digest = ”

curr_digest = md5(feed).hexdigest()

if curr_digest != last_digest:
ping = Ping(blog_name, blog_url)
responses = ping.ping_all([‘icerocket’,’technorati’,’google’])
hash_file.write(curr_digest)

hash_file.close()

class Ping:
def __init__(self, blog_name, blog_url):
self.blog_name = blog_name
self.blog_url = blog_url

def ping_all(self, down_stream_services):
responses = []

for down_stream_service in down_stream_services:
method = eval(‘self._’ + down_stream_service)
responses.append(method.__call__())

return responses

def _icerocket(self):
server = xmlrpclib.ServerProxy(‘http://rpc.icerocket.com:10080’)
response = server.ping(self.blog_name, self.blog_url)
# print “Icerocket response : ” + str(response)
return response

def _technorati(self):
server = xmlrpclib.ServerProxy(‘http://rpc.technorati.com/rpc/ping’)
response = server.weblogUpdates.ping(self.blog_name, self.blog_url)
# print “Technorati response : ” + str(response)
return response

def _google(self):
server = xmlrpclib.ServerProxy(‘http://blogsearch.google.com/ping/RPC2’)
response = server.weblogUpdates.ping(self.blog_name, self.blog_url)
# print “Google blog search response : ” + str(response)
return response

main()[/sourcecode]
When ever the script is invoked it will get the post feed content, and create a md5 hash of it and then compare the hash against the last known hash, if they differ ping the given list of service.

This is very convenient if you have someplace to run the cron job. Even your own machine is sufficient if you can keep your machine on for at least 15 minutes after the blog post is made.

To run the script you need to python 2.4 to later and the python package hashlib. Hope you will find this useful.

If you enjoyed this post, make sure you subscribe to my RSS feed!

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.