{"id":36,"date":"2008-01-23T20:50:00","date_gmt":"2008-01-23T20:50:00","guid":{"rendered":"http:\/\/www.mohanjith.net\/wordpress\/?p=36"},"modified":"2008-10-29T21:07:50","modified_gmt":"2008-10-29T21:07:50","slug":"automagically-ping-blog-search-engines","status":"publish","type":"post","link":"https:\/\/mohanjith.net\/blog\/2008\/01\/automagically-ping-blog-search-engines.html","title":{"rendered":"Automagically ping blog search engines"},"content":{"rendered":"<p>I wanted to automatically ping <a href=\"http:\/\/www.technorati.com\/\">Technorati<\/a>, <a href=\"http:\/\/www.icerocket.com\/\">Icerocket<\/a>, and <a href=\"http:\/\/blogsearch.google.com\/\">Google Blog Search<\/a>, meaning the search engines would be pinged with no manual intervention on my part. I was fine with a delay of up to 15 minutes.<\/p>\n<p>So I went about using the XML-RPC ping services provided by the blog search engines and came up with <a href=\"http:\/\/www.mohanjith.net\/downloads\/scripts\/python\/xmlrpc-blog-ping.py\">this<\/a> Python script. I set up a cron job to invoke the script every 15 minutes. 
See below for the source.<\/p>\n[sourcecode language='py']\n#!\/usr\/bin\/python\n\nimport os\nimport urllib2\nimport xmlrpclib\n\nfrom hashlib import md5\n\nfeed_url = '[Your feed url]'\nblog_url = '[Your blog url]'\nblog_name = '[Your blog name]'\nhash_file_path = os.path.expanduser(\"~\/.blogger\/\")\n\ndef main():\n    req = urllib2.Request(feed_url)\n    response = urllib2.urlopen(req)\n    feed = response.read()\n\n    # Create the state directory on the first run\n    if not os.path.isdir(hash_file_path):\n        os.makedirs(hash_file_path)\n\n    hash_file_name = hash_file_path + md5(blog_url).hexdigest()\n\n    if os.path.exists(hash_file_name):\n        hash_file = open(hash_file_name, \"r+\")\n        last_digest = hash_file.read(os.path.getsize(hash_file_name))\n    else:\n        hash_file = open(hash_file_name, \"w\")\n        last_digest = ''\n\n    curr_digest = md5(feed).hexdigest()\n\n    if curr_digest != last_digest:\n        ping = Ping(blog_name, blog_url)\n        responses = ping.ping_all(['icerocket', 'technorati', 'google'])\n        # Rewind and truncate so the new digest replaces the old one\n        hash_file.seek(0)\n        hash_file.truncate()\n        hash_file.write(curr_digest)\n\n    hash_file.close()\n\nclass Ping:\n    def __init__(self, blog_name, blog_url):\n        self.blog_name = blog_name\n        self.blog_url = blog_url\n\n    def ping_all(self, down_stream_services):\n        responses = []\n\n        for down_stream_service in down_stream_services:\n            # Look up the handler method by service name\n            method = getattr(self, '_' + down_stream_service)\n            responses.append(method())\n\n        return responses\n\n    def _icerocket(self):\n        server = xmlrpclib.ServerProxy('http:\/\/rpc.icerocket.com:10080')\n        response = server.ping(self.blog_name, self.blog_url)\n        # print \"Icerocket response : \" + str(response)\n        return response\n\n    def _technorati(self):\n        server = xmlrpclib.ServerProxy('http:\/\/rpc.technorati.com\/rpc\/ping')\n        response = server.weblogUpdates.ping(self.blog_name, self.blog_url)\n        # print \"Technorati response : \" + str(response)\n        return response\n\n    def _google(self):\n        server = xmlrpclib.ServerProxy('http:\/\/blogsearch.google.com\/ping\/RPC2')\n        response = server.weblogUpdates.ping(self.blog_name, self.blog_url)\n        # print \"Google blog search response : \" + str(response)\n        return response\n\nmain()\n[\/sourcecode]\n<p>Whenever the script is invoked, it fetches the feed content, computes an MD5 hash of it, and compares that hash against the last known hash; if they differ, it pings the given list of services.<\/p>\n<p>This is very convenient if you have somewhere to run the cron job. Even your own machine is sufficient, as long as it stays on for at least 15 minutes after a post is published.<\/p>\n<p>To run the script you need Python 2.4 or later and the hashlib module, which is part of the standard library since Python 2.5 and available as a separate package for 2.4. Hope you will find this useful.<\/p>","protected":false},"excerpt":{"rendered":"<p>I wanted to automatically ping Technorati, Icerocket, and Google Blog Search, meaning the search engines would be pinged with no manual intervention on my part. I was fine with a delay of up to 15 minutes. So I went about using the XML-RPC ping services provided by the blog search engines and came up with this Python script. 
I set &#8230; <a title=\"Automagically ping blog search engines\" class=\"read-more\" href=\"https:\/\/mohanjith.net\/blog\/2008\/01\/automagically-ping-blog-search-engines.html\" aria-label=\"More on Automagically ping blog search engines\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[89,87,88],"tags":[401,399,400],"class_list":["post-36","post","type-post","status-publish","format-standard","hentry","category-blog","category-ping","category-python","tag-blog","tag-ping","tag-python"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5lUHm-A","jetpack_likes_enabled":false,"_links":{"self":[{"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/posts\/36","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/comments?post=36"}],"version-history":[{"c
ount":3,"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/posts\/36\/revisions"}],"predecessor-version":[{"id":137,"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/posts\/36\/revisions\/137"}],"wp:attachment":[{"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/media?parent=36"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/categories?post=36"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mohanjith.net\/blog\/wp-json\/wp\/v2\/tags?post=36"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}