Formatting HTTP GET request output in Python
I am trying to read data from our internal web page using the following code:
import requests
from requests_toolbelt.utils import dump

resp = requests.get('xxxxxxxxxxxxxxxx')
data = dump.dump_all(resp)
print(data.decode('utf-8'))
The output I am getting is in the following format:
<tr>
  <td bgcolor="#ffffff"><font size=2><a href=javascript:openwin(179)>kevin</a></font></td>
  <td bgcolor="#ffffff"><font size=2>45.50/week</font></td>
</tr>
<tr>
  <td bgcolor="#ffffff"><font size=2><a href=javascript:openwin(33)>eliza</a></font></td>
  <td bgcolor="#ffffff"><font size=2>220=00/week</font></td>
</tr>
<tr>
  <td bgcolor="#ffffff"><font size=2><a href=javascript:openwin(97)>sam</a></font></td>
  <td bgcolor="#ffffff"><font size=2>181=00</font></td>
</tr>
However, the only data I am interested in from the above output is the names and values, e.g.:
kevin 45.50/week
eliza 220=00/week
sam 181=00
Is there a module or some other way to format the output in the required format and put it in a file (preferably Excel)?
Try BeautifulSoup:
from bs4 import BeautifulSoup as soup

content = """<tr>
  <td bgcolor="#ffffff"><font size=2><a href=javascript:openwin(179)>kevin</a></font></td>
  <td bgcolor="#ffffff"><font size=2>45.50/week</font></td>
</tr>
<tr>
  <td bgcolor="#ffffff"><font size=2><a href=javascript:openwin(33)>eliza</a></font></td>
  <td bgcolor="#ffffff"><font size=2>220=00/week</font></td>
</tr>
<tr>
  <td bgcolor="#ffffff"><font size=2><a href=javascript:openwin(97)>sam</a></font></td>
  <td bgcolor="#ffffff"><font size=2>181=00</font></td>
</tr>"""

html = soup(content, 'lxml')
trs = html.find_all('tr')  # every table row
for row in trs:
    tds = row.find_all('td')  # the cells of this row
    for data in tds:
        # print each cell's text on the same line, separated by a space
        print(data.text.strip(), end=' ')
    print()  # end the line after each row
The output:
kevin 45.50/week
eliza 220=00/week
sam 181=00
First, find_all('tr') finds the <tr> tags; then find_all('td') finds the <td> tags inside each row; finally, data.text gives the text content of each td, which is printed.
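To get the result into a file that Excel can open, one option is to collect the cell texts into a list of rows instead of printing them, and then write that list out as CSV with the standard csv module. The snippet below is only a minimal sketch: the rows are hard-coded from the output above, and the function name rows_to_csv, the header names and the file name output.csv are assumptions made for illustration, not part of the original answer.

```python
import csv
import io

# Sample rows copied from the output above. In practice they would be
# collected inside the loop instead of printed, for example:
#     rows.append([td.text.strip() for td in row.find_all('td')])
rows = [
    ["kevin", "45.50/week"],
    ["eliza", "220=00/week"],
    ["sam", "181=00"],
]


def rows_to_csv(rows, header=("name", "value")):
    """Return the rows as CSV text, preceded by a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # "output.csv" is a hypothetical file name; newline="" stops Python
    # from translating the line endings that are already in the text.
    with open("output.csv", "w", newline="", encoding="utf-8") as f:
        f.write(rows_to_csv(rows))
```

The resulting file can be opened directly in Excel. If a real .xlsx workbook is needed, a third-party library would be required in place of the csv module.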