A Simple Web Bot with Requests and BeautifulSoup

Posted on Tuesday, February 21, 2012

Today I helped a colleague debug a web bot written in Java. Since I haven’t really worked with Java for a few years, I thought it would be easier for me to reproduce (and solve) the problem with Requests and BeautifulSoup. (I’ve actually been looking for an opportunity to try Requests for a while, since I’ve heard so many good things about it.)

And what can I say—I was blown away by how easy it was to implement a simple web bot that filled out a form and grabbed a huge table of data for me.

I can’t show you that exact web bot, but here’s how you could search my website for “Python” and get the headings of the resulting posts:

# BeautifulSoup 3 import; with BeautifulSoup 4 it is: from bs4 import BeautifulSoup
from BeautifulSoup import BeautifulSoup
import requests

url = 'http://stefan.sofa-rockers.org/search/'
payload = {
    'q': 'Python',
}
# Let Requests build and URL-encode the query string (?q=Python).
r = requests.get(url, params=payload)

# Parse the response body and collect the text of every <h2 class="post_title">.
soup = BeautifulSoup(r.text)
titles = [h2.text for h2 in soup.findAll('h2', attrs={'class': 'post_title'})]

for t in titles:
    print(t)
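
If you want to play with the two steps (building the search URL and pulling out the titles) without hitting the network or installing anything, here is a rough sketch that uses only the standard library instead of Requests and BeautifulSoup. The sample markup is made up, and the names `SAMPLE_HTML`, `TitleParser`, `build_search_url` and `extract_titles` are my own inventions for this illustration; the real search page may well be structured differently.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical sample markup; the real search page may look different.
SAMPLE_HTML = """
<html><body>
  <h2 class="post_title"><a href="/a/">Python &amp; Requests</a></h2>
  <h2 class="sidebar">Archive</h2>
  <h2 class="post_title featured">A Simple Web Bot</h2>
</body></html>
"""


class TitleParser(HTMLParser):
    """Collect the text of every <h2> whose class list contains 'post_title'."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.titles = []
        self._parts = None  # text chunks of the <h2> currently being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get('class') or '').split()
        if tag == 'h2' and 'post_title' in classes:
            self._parts = []

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == 'h2' and self._parts is not None:
            self.titles.append(''.join(self._parts).strip())
            self._parts = None


def build_search_url(base, query):
    """Return base plus a URL-encoded ?q=... query string."""
    return base + '?' + urlencode({'q': query})


def extract_titles(html):
    """Run TitleParser over an HTML string and return the collected titles."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


search_url = build_search_url('http://stefan.sofa-rockers.org/search/', 'web bot')
titles = extract_titles(SAMPLE_HTML)
for t in titles:
    print(t)
```

Note that the search term is URL-encoded here (`web bot` becomes `web+bot`), which matters as soon as a query contains spaces or other special characters. For real pages I would still reach for Requests and BeautifulSoup; this is only meant to show what the two steps boil down to.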