A Simple Web Bot with Requests and BeautifulSoup
Today I helped a colleague debug a web bot written in Java. Since I haven’t really worked with Java for a few years, I thought it would be easier for me to reproduce (and solve) the problem with Requests and BeautifulSoup. (I’ve actually been looking for an opportunity to try out Requests for a while, since I’ve heard so many good things about it.)
And what can I say—I was blown away by how easy it was to implement a simple web bot that filled out a form and grabbed a huge table of data for me.
I cannot show you exactly that web bot, but here’s how you could search my website for “Python” and get the headings of the resulting posts:
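import requests
# BeautifulSoup 4 lives in the "bs4" package; the old
# "from BeautifulSoup import BeautifulSoup" only works with version 3.
from bs4 import BeautifulSoup

url = 'http://stefan.sofa-rockers.org/search/'
payload = {
    'q': 'Python',
}
# Let Requests build and URL-encode the query string (?q=Python).
r = requests.get(url, params=payload)
soup = BeautifulSoup(r.text, 'html.parser')
# Collect the text of every <h2 class="post_title"> heading.
titles = [h2.text for h2 in soup.find_all('h2', attrs={'class': 'post_title'})]
for t in titles:
    print(t)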
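The form-and-table case mentioned at the top can be sketched in much the same way. This is not the original bot, only a minimal illustration under a few assumptions: the names `parse_table` and `submit_form_and_parse` are made up for this example, the form is assumed to be submitted with a plain POST, and the sample HTML is invented so the snippet runs without network access. It uses BeautifulSoup 4 (`bs4`).

```python
import requests
from bs4 import BeautifulSoup


def parse_table(html):
    """Return the first <table> in html as a list of rows (lists of cell text)."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table')
    if table is None:
        return []
    rows = []
    for tr in table.find_all('tr'):
        # Header cells (<th>) and data cells (<td>) are treated alike.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(['th', 'td'])]
        if cells:
            rows.append(cells)
    return rows


def submit_form_and_parse(url, form_data):
    """POST form_data to url and parse the first table in the response.

    Hypothetical helper: the real form may need GET, cookies, or hidden fields.
    """
    r = requests.post(url, data=form_data, timeout=10)
    r.raise_for_status()  # fail loudly on 4xx/5xx responses
    return parse_table(r.text)


# Offline demonstration with made-up HTML, so no network access is needed.
sample_html = """
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>foo</td><td> 1 </td></tr>
  <tr><td>bar</td><td>2</td></tr>
</table>
"""
rows = parse_table(sample_html)
for row in rows:
    print(row)
```

Against a real site you would call something like `submit_form_and_parse(form_url, {'field': 'value'})`, where both the URL and the field names are placeholders that have to be read from the page's `<form>` element.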