Problem with Python CSV putting each letter in new field
I'm trying to put a list of URLs into a CSV file. I'm scraping a webpage using urllib2 and BeautifulSoup. I have tried writing the links to the CSV file as unicode, and also converted to UTF-8. In both cases, each letter is inserted into a new field.
Here's my code (I've tried it at least these 2 ways):
    f = open('filename','wb')
    w = csv.writer(f,delimiter=',')
    for link in links:
        w.writerow(link['href'])
and:
    f = open('filename','wb')
    w = csv.writer(f,delimiter=',')
    for link in links:
        w.writerow(link['href'].encode('utf-8'))
The links list looks like this:
[<a href="#flyout1" accesskey="2" class="quicklinks" tabindex="1" title="skip content">quick links: skip main page content</a>, <a href="#search" class="quicklinks" tabindex="1" title="skip search">skip search</a>, <a href="#news" class="quicklinks" tabindex="1" title="skip section table of contents">skip section content menu</a>, <a href="#footer" class="quicklinks" tabindex="1" title="skip site options">skip common links</a>, <a href="http://www.hhs.gov"><img src="/ucm/groups/fdagov-public/@system/documents/system/img_fdagov_hhs_gov.png" alt="www.hhs.gov link" style="width:112px; height:18px;" border="0" /></a>]
Not all links have an 'href' key, but I check for that in code not shown here. In both cases, the correct strings are written to the CSV file, but each letter ends up in a new field.
Any thoughts?
From the docs: "A row must be a sequence of strings or numbers ..." You are passing a single string, not a sequence of strings, so the writer treats each letter as a separate item. Put the string in a list.
So change w.writerow(link['href']) to w.writerow([link['href']]).
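To see the difference without touching a real file, here is a minimal sketch. It assumes Python 3 and writes into an in-memory io.StringIO buffer instead of the 'wb' file from the question; write_rows is a hypothetical helper name, and the href value is taken from the sample links above.

```python
import csv
import io


def write_rows(rows):
    """Write each row with csv.writer and return the resulting text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=',')
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


href = "#search"

# A bare string is itself a sequence of characters,
# so writerow() turns every letter into its own field.
split_output = write_rows([href])

# Wrapping the string in a list makes it a row with a single field.
fixed_output = write_rows([[href]])

print(repr(split_output))
print(repr(fixed_output))
```

The first call reproduces the one-letter-per-field problem, and the second writes the whole href as one field.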
Note: a CSV file with a single column looks just like a flat text file. Maybe you don't need csv at all.
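If a plain text file is enough, a rough sketch of that alternative could look like the following. It again assumes Python 3; write_lines is a hypothetical helper, and the hrefs list stands in for the values pulled from the links in the question. Any writable text file object opened with open(..., 'w') could be passed in place of the buffer.

```python
import io


def write_lines(hrefs, out):
    """Write one href per line to an open text file object."""
    for href in hrefs:
        out.write(href + "\n")


# Stand-in values taken from the sample links in the question.
hrefs = ["#flyout1", "#search", "http://www.hhs.gov"]

# An in-memory buffer is used here so the sketch has no side effects.
buf = io.StringIO()
write_lines(hrefs, buf)
text = buf.getvalue()
print(text)
```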