Parse HTML using Python and Beautiful Soup


<div class="profile-row clearfix"><div class="profile-row-header">member since</div><div class="profile-information">january 2010</div></div>
<div class="profile-row clearfix"><div class="profile-row-header">aiga chapter</div><div class="profile-information">alaska</div></div>
<div class="profile-row clearfix"><div class="profile-row-header">title</div><div class="profile-information">owner</div></div>
<div class="profile-row clearfix"><div class="profile-row-header">company</div><div class="profile-information">mad dog graphx</div></div>

I'm using Beautiful Soup to parse the HTML code above. I want to search through the code and pull out the data "january 2010", "alaska", "owner", and "mad dog graphx". All of the data has the same class, but each value is preceded by a different label ("member since", "aiga chapter", etc.). How can I search for "member since" and get "january 2010", and then do the same for the other three fields?

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup('''<div class="profile-row clearfix"><div class="profile-row-header">member since</div><div class="profile-information">january 2010</div></div>
... <div class="profile-row clearfix"><div class="profile-row-header">aiga chapter</div><div class="profile-information">alaska</div></div>
... <div class="profile-row clearfix"><div class="profile-row-header">title</div><div class="profile-information">owner</div></div>
... <div class="profile-row clearfix"><div class="profile-row-header">company</div><div class="profile-information">mad dog graphx</div></div>
... ''')
>>> for row in soup.findAll('div', {'class': 'profile-row clearfix'}):
...     field, value = row.findAll(text=True)
...     print field, value
...
member since january 2010
aiga chapter alaska
title owner
company mad dog graphx

You can of course do whatever you want with field and value, such as creating a dict from them or storing them in a database.
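As a rough sketch of the dict idea, here is one way it might look. Note the assumptions: this uses the newer `bs4` package (installed as `beautifulsoup4`) with Python 3 rather than the older `BeautifulSoup` import shown in the session, and the `profile` variable name is just for illustration.

```python
from bs4 import BeautifulSoup

HTML = '''<div class="profile-row clearfix"><div class="profile-row-header">member since</div><div class="profile-information">january 2010</div></div>
<div class="profile-row clearfix"><div class="profile-row-header">aiga chapter</div><div class="profile-information">alaska</div></div>
<div class="profile-row clearfix"><div class="profile-row-header">title</div><div class="profile-information">owner</div></div>
<div class="profile-row clearfix"><div class="profile-row-header">company</div><div class="profile-information">mad dog graphx</div></div>
'''

# Use the parser from the standard library so no extra backend is required.
soup = BeautifulSoup(HTML, "html.parser")

# "profile" is a hypothetical name: it maps each label to its value.
profile = {}
for row in soup.find_all("div", class_="profile-row"):
    # Each row holds exactly two text nodes here: the label and the value.
    field, value = row.stripped_strings
    profile[field] = value

print(profile["member since"])
```

Looking up `profile["aiga chapter"]`, `profile["title"]`, and `profile["company"]` then works the same way for the other three fields.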

If there are other divs or other text nodes within the "profile-row clearfix" div, the two-value unpacking will fail, and you'll need something like field = row.find('div', {'class': 'profile-row-header'}).findAll(text=True), etc.
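A minimal sketch of that more defensive lookup, again assuming the newer `bs4` package on Python 3. The `read_row` helper and the extra `span` and `b` tags in the sample markup are made up for illustration and are not part of the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical row with extra nodes that would break a plain two-value unpack.
HTML = ('<div class="profile-row clearfix">'
        '<div class="profile-row-header">member since</div>'
        '<span class="note">verified</span>'
        '<div class="profile-information">january <b>2010</b></div>'
        '</div>')

def read_row(row):
    """Return (field, value) by selecting each inner div by its class."""
    header = row.find("div", class_="profile-row-header")
    info = row.find("div", class_="profile-information")
    # get_text joins all nested text nodes, so inline tags inside the div are fine.
    return header.get_text(strip=True), info.get_text(" ", strip=True)

soup = BeautifulSoup(HTML, "html.parser")
row = soup.find("div", class_="profile-row")
field, value = read_row(row)
print(field, value)
```

Selecting by class ignores the stray `span`, so the result does not depend on how many text nodes the row happens to contain.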

