python - Help parsing between <pre> tags using BeautifulSoup -


I am attempting to parse information out of a website using BeautifulSoup and Python. The HTML looks like the following. I want the parsed data to look like:

id definition
lysine.biosynthesis - burkholderia pseudomallei 17
... and the rest of the data in similar positions (within the "pre" tags but outside the "a" tags).

How can I do this?

<pre>id                   definition
----------------------------------------------------------------------------------------------------
<a href="/kegg-bin/show_pathway?bpm00300">bpm00300</a>             lysine biosynthesis - burkholderia pseudomallei 17
<a href="/kegg-bin/show_pathway?bpm00330">bpm00330</a>             arginine and proline metabolism - burkholderia pse
<a href="/kegg-bin/show_pathway?bpm01100">bpm01100</a>             metabolic pathways - burkholderia pseudomallei 171
<a href="/kegg-bin/show_pathway?bpm01110">bpm01110</a>             biosynthesis of secondary metabolites - burkholder
</pre>

I have tried this:

y = soup.find('pre')  # returns the data between the <pre> tags; specific to KEGG
for a in y:
    z = a.string

this gave me:

id                   definition
----------------------------------------------------------------------------------------------------

Thanks for the help!

BeautifulSoup() and its search methods return a hierarchical parse-tree object, not a string. Iterating through findChildren() on the node you found gives you what you want (and skips the header line):

for a in soup.find('pre').findChildren():
    z = a.string  # text of each child tag, i.e. the contents of each <a> link
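
Note that findChildren() yields only the child tags of the <pre> node, so a.string is the pathway id inside each link; the definition text lives in the text node that follows each <a> tag. Here is a minimal sketch of one way to pair the two, assuming BeautifulSoup 4 (the bs4 package) and a two-row excerpt of the sample HTML from the question. The variable names html and rows are only illustrative, and next_sibling is used here as one possible approach rather than the only one:

```python
from bs4 import BeautifulSoup

# Two-row excerpt of the sample HTML from the question (an assumption for this demo).
html = """<pre>id                   definition
----------------------------------------------------------------------------------------------------
<a href="/kegg-bin/show_pathway?bpm00300">bpm00300</a>             lysine biosynthesis - burkholderia pseudomallei 17
<a href="/kegg-bin/show_pathway?bpm00330">bpm00330</a>             arginine and proline metabolism - burkholderia pse
</pre>"""

soup = BeautifulSoup(html, "html.parser")
pre = soup.find("pre")

rows = []
for a in pre.find_all("a"):
    # The node right after each <a> tag is the text holding the definition.
    sibling = a.next_sibling
    # NavigableString subclasses str; guard against a missing or non-text sibling.
    definition = sibling.strip() if isinstance(sibling, str) else ""
    rows.append((a.get_text(), definition))

for pathway_id, definition in rows:
    print(pathway_id, definition)
```

Each entry in rows is an (id, definition) tuple, so the header and separator lines never appear: they are plain text before the first link, not tags.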
