python - Scraping using Beautiful Soup leads to error only in a particular section (NoneType object encountered)
I'm trying to get the list of injuries for a particular team (Liverpool in this case) from the following website:
http://www.physioroom.com/news/english_premier_league/epl_injury_table.php
It works fine for some teams (Swansea), but exits with the following error for others (Liverpool, Everton):
TypeError: Can't convert 'NoneType' object to str implicitly
Here is the code I am using.
    from bs4 import BeautifulSoup
    import urllib.request

    url = "http://www.physioroom.com/news/english_premier_league/epl_injury_table.php"
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    #lp = soup.find(alt="liverpool away shirt").parent.parent.parent.next_sibling.next_sibling
    lp = soup.find(alt="swansea city away shirt").parent.parent.parent.next_sibling.next_sibling
    player_info = ""
    player_list = []
    while True:
        if(lp.has_attr('id')):
            break
        else:
            tdlist = lp.find_all('td')
            # player_info = tdlist[0].string+"\t"+tdlist[1].string+"\t"+tdlist[3].string
            #print(tdlist[0].find('a').string.strip() + "\t" + tdlist[1].string.strip() + "\t" + tdlist[3].string.strip())
            print(tdlist[0].string + "\t" + tdlist[1].string + "\t" + tdlist[3].string)
            lp=lp.findNext('tr')
Please let me know how I can fix this.
    from bs4 import BeautifulSoup
    import requests

    url = "http://www.physioroom.com/news/english_premier_league/epl_injury_table.php"
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "lxml")
    table = soup.find('table', id='epl-table')
    # Calling a tag is shorthand for find_all(); id=None keeps only rows without an id
    for tr in table('tr', id=None):
        print(tr.get_text('\t', strip=True))
Out:
    Player	Condition	Latest News	Expected Return	Available?
    D Meyler	Knock	No Return Date	Slight Doubt
    S Maloney	Ear Infection	No Return Date	Slight Doubt
    M Henriksen	Shoulder Separation	April 1, 2017	Major Doubt
    McGregor	Fitness	No Return Date	Major Doubt
    W Keane	ACL Knee Injury	No Return Date
    M Odubajo	Patella Fracture	May 1, 2017
    G Luer	Knee Injury	February 1, 2017
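The loop above prints every row of the table, while the question asks for a single team. A rough sketch of one way to narrow it down is shown here. It runs on made-up inline HTML and assumes, as the question's own loop does, that each team starts with a header row carrying an `id` and that player rows have none. The markup, the team names and the `team_rows` helper are all hypothetical, and the real page may be laid out differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, not copied from the real page: a header row with an id
# starts each team and is followed by player rows without an id.
html = """
<table>
  <tr id="team-a"><td><img alt="Team A away shirt"></td></tr>
  <tr><td><a href="#">J Doe</a></td><td>Knock</td><td></td><td>No Return Date</td></tr>
  <tr><td><a href="#">A Smith</a></td><td>Knee Injury</td><td></td><td>May 1, 2017</td></tr>
  <tr id="team-b"><td><img alt="Team B away shirt"></td></tr>
  <tr><td><a href="#">B Jones</a></td><td>Fitness</td><td></td><td>No Return Date</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")


def team_rows(soup, shirt_alt):
    """Collect player, condition and expected return for one team (illustrative helper)."""
    header = soup.find("img", alt=shirt_alt).find_parent("tr")
    rows = []
    row = header.find_next_sibling("tr")
    # Stop at the next team header (a row with an id) or at the end of the table.
    while row is not None and not row.has_attr("id"):
        cells = row.find_all("td")
        # get_text() returns a str even for cells where .string would be None.
        rows.append("\t".join(cells[i].get_text(strip=True) for i in (0, 1, 3)))
        row = row.find_next_sibling("tr")
    return rows


for line in team_rows(soup, "Team A away shirt"):
    print(line)
```

Using `find_next_sibling("tr")` rather than chained `.next_sibling` calls skips the whitespace text nodes between rows, so the loop does not depend on how the HTML is indented.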
If you only want the text part of a document or tag, you can use the get_text() method. It returns all the text in a document or beneath a tag, as a single Unicode string.
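This also suggests a likely reason for the TypeError in the question. A tag's `.string` is `None` whenever the tag has more than one child, so concatenating it with `"\t"` fails, whereas `get_text()` always returns a string. The snippet below is a small sketch on made-up HTML; that some cells on the real page contain several child nodes is an assumption, not something verified against the site.

```python
from bs4 import BeautifulSoup

# Hypothetical cells: the first holds a link plus extra markup (several
# children), the second holds plain text only.
html = """
<table><tr>
  <td><a href="#">J Doe</a> <span>(on loan)</span></td>
  <td>Knock</td>
</tr></table>
"""
soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td")

# .string is None for a tag with more than one child ...
print(cells[0].string)
# ... but works for a tag whose only child is a text node.
print(cells[1].string)

# Concatenating None with a str is what raises the TypeError.
try:
    line = cells[0].string + "\t" + cells[1].string
except TypeError as exc:
    print("TypeError:", exc)

# get_text() gathers all nested text and always returns a str.
print(cells[0].get_text(" ", strip=True) + "\t" + cells[1].get_text(strip=True))
```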
You can specify a string to be used to join the bits of text together.
You can tell Beautiful Soup to strip whitespace from the beginning and end of each bit of text.
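To make the effect of the separator and of `strip=True` concrete, here is a minimal sketch on a single made-up table row (the cell contents are illustrative, not taken from the page):

```python
from bs4 import BeautifulSoup

# A made-up row with padding whitespace inside each cell.
html = "<table><tr><td> J Doe </td><td> Knock </td><td> No Return Date </td></tr></table>"
soup = BeautifulSoup(html, "html.parser")
row = soup.find("tr")

# No arguments: the bits of text are concatenated as they are.
plain = row.get_text()
# A separator is placed between the bits, but their whitespace is kept.
joined = row.get_text("\t")
# strip=True trims each bit before joining.
clean = row.get_text("\t", strip=True)

print(repr(plain))
print(repr(joined))
print(repr(clean))
```

With `strip=True`, bits that become empty after stripping are dropped, which is why an empty cell produces no extra tab in the printed output.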