Using SoupStrainer to parse selectively
Accepted answer
Oh boy, am I silly: I was searching for tags with the attribute id = products, but it should have been product_list.
Here's the final code, in case anyone comes searching.
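from BeautifulSoup import BeautifulSoup, SoupStrainer
import urllib
import re
import time  # needed for time.clock() below

start = time.clock()  # start timestamp (not read again in this snippet)
url = "http://someplace.com"
html = urllib.urlopen(url).read()

# Parse only the div that holds the product list, not the whole page
product = SoupStrainer('div', {'id': 'products_list'})
soup = BeautifulSoup(html, parseOnlyThese=product)

# Print the text of every link that has a non-empty title attribute
for a in soup.findAll('a', {'title': re.compile('.+')}):
    print a.string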
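
For anyone landing here on Python 3: the snippet above is written for Python 2 and BeautifulSoup 3. In the bs4 package the same idea is spelled parse_only instead of parseOnlyThese, and find_all instead of findAll. Below is a minimal sketch of that, assuming bs4 is installed. The SAMPLE_HTML markup and the product_link_texts helper are made up for illustration and stand in for the real page, which isn't shown here.

```python
import re

from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical markup standing in for the downloaded page.
SAMPLE_HTML = """
<html><body>
  <div id="header"><a href="/" title="Home">Home</a></div>
  <div id="products_list">
    <a href="/p/1" title="Widget">Widget</a>
    <a href="/p/2" title="Gadget">Gadget</a>
    <a href="/p/3">Untitled link</a>
  </div>
</body></html>
"""


def product_link_texts(html):
    """Parse only the products_list div and return the text of its titled links."""
    # The strainer tells the parser to keep just the matching div and its contents.
    only_products = SoupStrainer("div", id="products_list")
    soup = BeautifulSoup(html, "html.parser", parse_only=only_products)
    # Keep links whose title attribute is present and non-empty.
    links = soup.find_all("a", attrs={"title": re.compile(".+")})
    return [a.string for a in links]


if __name__ == "__main__":
    for text in product_link_texts(SAMPLE_HTML):
        print(text)
```

The link in the header div never reaches the tree, because the strainer drops everything outside the matching div at parse time.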
Try searching first for the product list div, and then for the a tags with a title:
product = soup.find('div', {'id': 'products'})
for a in product.findAll('a', {'title': re.compile('.+')}):
    print a.string
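
One thing to watch with this two-step approach: find returns None when no div has the requested id, so a wrong id makes the following findAll call raise an AttributeError. Here is a rough sketch of the same lookup with a guard, written for Python 3 and bs4 (assumed to be installed). The titled_links_in_div helper and the PAGE markup are hypothetical, just for illustration.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
PAGE = """
<div id="products_list">
  <a href="/p/1" title="Widget">Widget</a>
  <a href="/p/2">No title here</a>
</div>
"""


def titled_links_in_div(html, div_id):
    """Find the div with the given id, then return the text of its titled links."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id=div_id)
    if container is None:
        # No div with that id: return an empty list instead of failing on None.
        return []
    links = container.find_all("a", attrs={"title": re.compile(".+")})
    return [a.string for a in links]


if __name__ == "__main__":
    print(titled_links_in_div(PAGE, "products_list"))
    print(titled_links_in_div(PAGE, "products"))
```

Unlike the SoupStrainer version, this parses the whole document first and only then narrows the search, which is simpler but does more work on large pages.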
Source:
stackoverflow.com