I wanted to ping Technorati, Icerocket, and Google Blog Search automatically: whenever I publish a post, the blog search engines should be pinged with no intervention from me. A delay of up to 15 minutes was acceptable.
So I set about using the XML-RPC ping services that the blog search engines provide, and came up with this Python script. A cron job invokes the script every 15 minutes. See below for the source.
[sourcecode language='py']#!/usr/bin/python
import xmlrpclib
import urllib2
import os
from hashlib import md5

feed_url = '[Your feed url]'
blog_url = '[Your blog url]'
blog_name = '[Your blog name]'
# Directory that holds the digest of the last feed content seen.
hash_file_path = os.path.expanduser("~/.blogger/")


def main():
    # Fetch the current feed content.
    req = urllib2.Request(feed_url)
    response = urllib2.urlopen(req)
    feed = response.read()

    # One hash file per blog, named after the md5 of the blog URL.
    if not os.path.isdir(hash_file_path):
        os.makedirs(hash_file_path)
    hash_file_name = os.path.join(hash_file_path, md5(blog_url).hexdigest())

    # Read the digest stored by the previous run, if any.
    if os.path.exists(hash_file_name):
        hash_file = open(hash_file_name, "r")
        last_digest = hash_file.read().strip()
        hash_file.close()
    else:
        last_digest = ''

    curr_digest = md5(feed).hexdigest()
    if curr_digest != last_digest:
        ping = Ping(blog_name, blog_url)
        responses = ping.ping_all(['icerocket', 'technorati', 'google'])
        # Overwrite (rather than append to) the stored digest.
        hash_file = open(hash_file_name, "w")
        hash_file.write(curr_digest)
        hash_file.close()


class Ping:
    def __init__(self, blog_name, blog_url):
        self.blog_name = blog_name
        self.blog_url = blog_url

    def ping_all(self, down_stream_services):
        responses = []
        for down_stream_service in down_stream_services:
            # Look up the matching _<service> method by name.
            method = getattr(self, '_' + down_stream_service)
            responses.append(method())
        return responses

    def _icerocket(self):
        server = xmlrpclib.ServerProxy('http://rpc.icerocket.com:10080')
        response = server.ping(self.blog_name, self.blog_url)
        # print "Icerocket response : " + str(response)
        return response

    def _technorati(self):
        server = xmlrpclib.ServerProxy('http://rpc.technorati.com/rpc/ping')
        response = server.weblogUpdates.ping(self.blog_name, self.blog_url)
        # print "Technorati response : " + str(response)
        return response

    def _google(self):
        server = xmlrpclib.ServerProxy('http://blogsearch.google.com/ping/RPC2')
        response = server.weblogUpdates.ping(self.blog_name, self.blog_url)
        # print "Google blog search response : " + str(response)
        return response


if __name__ == '__main__':
    main()[/sourcecode]
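To see what one of these pings actually sends, the XML-RPC request body can be built without contacting any server. This is only a rough sketch: it uses `xmlrpc.client`, the Python 3 name of the `xmlrpclib` module used above, and the blog name and URL are made-up placeholders.

```python
from xmlrpc.client import dumps, loads


def build_ping_payload(blog_name, blog_url):
    """Return the XML body of a weblogUpdates.ping call (nothing is sent)."""
    return dumps((blog_name, blog_url), methodname='weblogUpdates.ping')


if __name__ == '__main__':
    # Placeholder values, for illustration only.
    payload = build_ping_payload('My Blog', 'http://example.com/')
    print(payload)
    # Decoding the payload gives back the parameters and the method name.
    params, method = loads(payload)
    print(method, params)
```

The request carries just two string parameters, the blog name and the blog URL, which is why each `_<service>` method in the script is only a couple of lines long.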
Whenever the script is invoked, it fetches the post feed, computes an MD5 hash of its content, and compares that hash against the last known hash. If the two differ, it pings each service in the given list and stores the new hash.
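The change check on its own can be sketched as a small function. This is a minimal illustration rather than the script itself: `feed_changed` and the temporary digest file are hypothetical names chosen for the example, and it is written so that it also runs on a current Python 3.

```python
import hashlib
import os
import tempfile


def feed_changed(feed_bytes, digest_path):
    """Return True if feed_bytes differs from the digest stored at digest_path.

    The stored digest is overwritten whenever a change is detected.
    """
    curr_digest = hashlib.md5(feed_bytes).hexdigest()
    last_digest = ''
    if os.path.exists(digest_path):
        with open(digest_path, 'r') as f:
            last_digest = f.read().strip()
    if curr_digest == last_digest:
        return False
    # Mode 'w' truncates the file, so only the latest digest is kept.
    with open(digest_path, 'w') as f:
        f.write(curr_digest)
    return True


if __name__ == '__main__':
    path = os.path.join(tempfile.mkdtemp(), 'digest')
    print(feed_changed(b'<feed>first post</feed>', path))   # True: nothing stored yet
    print(feed_changed(b'<feed>first post</feed>', path))   # False: same content
    print(feed_changed(b'<feed>second post</feed>', path))  # True: content changed
```

Because the first run finds no stored digest, it always reports a change, so the services get pinged once when the script is first set up.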
This is very convenient if you have somewhere to run the cron job. Even your own machine is sufficient, as long as it stays on for at least 15 minutes after the blog post is made.
To run the script you need Python 2.4 or later and the hashlib package (part of the standard library from Python 2.5 onwards). I hope you find this useful.