python - Help parsing between <pre> tags using BeautifulSoup

I am attempting to parse information out of a website using BeautifulSoup and Python. The HTML looks like the following. I want the parsed data to look like:

id definition
lysine biosynthesis - burkholderia pseudomallei 17
... with the rest of the data in a similar place (within the "pre" tags but outside the "a" tags).

How can I do this?

<pre>id                   definition
----------------------------------------------------------------------------------------------------
<a href="/kegg-bin/show_pathway?bpm00300">bpm00300</a>             lysine biosynthesis - burkholderia pseudomallei 17
<a href="/kegg-bin/show_pathway?bpm00330">bpm00330</a>             arginine and proline metabolism - burkholderia pse
<a href="/kegg-bin/show_pathway?bpm01100">bpm01100</a>             metabolic pathways - burkholderia pseudomallei 171
<a href="/kegg-bin/show_pathway?bpm01110">bpm01110</a>             biosynthesis of secondary metabolites - burkholder
</pre>

I have tried this:

y = soup.find('pre')  # returns the data between the <pre> tags (specific to KEGG)
for a in y:
    z = a.string

This gave me:

 id                   definition ----------------------------------------------------------------------------------------------------

Thanks for the help!

BeautifulSoup() and its search methods return a hierarchical parse-tree object, not a string. Iterating through findChildren() on the node you found gives you what you want (and skips the header line):

for a in soup.find('pre').findChildren():
    z = a.string
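Note that findChildren() yields only the <a> tags, so a.string is the pathway id. If the definition text that sits outside the <a> tags is also needed, one possible approach is to read the text node that follows each link. The sketch below assumes the bs4 package (Beautiful Soup 4) and a shortened copy of the HTML from the question; the names html, rows, pathway_id and definition are made up for illustration.

```python
from bs4 import BeautifulSoup

# Shortened sample of the <pre> block from the question (assumed layout).
html = """<pre>id                   definition
----------------------------------------------------------------------------------------------------
<a href="/kegg-bin/show_pathway?bpm00300">bpm00300</a>             lysine biosynthesis - burkholderia pseudomallei 17
<a href="/kegg-bin/show_pathway?bpm00330">bpm00330</a>             arginine and proline metabolism - burkholderia pse
</pre>"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for a in soup.find("pre").find_all("a"):
    # The link text is the pathway id.
    pathway_id = a.get_text(strip=True)
    # The definition is the bare text node right after the link;
    # guard against the sibling being missing or another tag.
    sibling = a.next_sibling
    definition = sibling.strip() if isinstance(sibling, str) else ""
    rows.append((pathway_id, definition))

for pathway_id, definition in rows:
    print(pathway_id, definition)
```

Each entry in rows pairs an id with its definition, so the header and separator lines are skipped without any string slicing.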
