python - Scraping using Beautiful Soup leads to error only in a particular section (NullType object encountered)


I'm trying to get the list of injuries for a particular team (Liverpool in this case) from the following website:

http://www.physioroom.com/news/english_premier_league/epl_injury_table.php

It works fine for some teams (Swansea), but exits with the following error for others (Liverpool, Everton):

TypeError: Can't convert 'NoneType' object to str implicitly

Here is the code I'm using.

    from bs4 import BeautifulSoup
    import urllib.request

    url = "http://www.physioroom.com/news/english_premier_league/epl_injury_table.php"
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    #lp = soup.find(alt="liverpool away shirt").parent.parent.parent.next_sibling.next_sibling
    lp = soup.find(alt="swansea city away shirt").parent.parent.parent.next_sibling.next_sibling
    player_info = ""
    player_list = []

    while True:
        if lp.has_attr('id'):
            break
        else:
            tdlist = lp.find_all('td')
            #player_info = tdlist[0].string+"\t"+tdlist[1].string+"\t"+tdlist[3].string
            #print(tdlist[0].find('a').string.strip() + "\t" + tdlist[1].string.strip() + "\t" + tdlist[3].string.strip())
            print(tdlist[0].string + "\t" + tdlist[1].string + "\t" + tdlist[3].string)
            lp = lp.findNext('tr')

Please let me know how I can fix this.

    from bs4 import BeautifulSoup
    import requests

    url = "http://www.physioroom.com/news/english_premier_league/epl_injury_table.php"
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "lxml")
    table = soup.find('table', id='epl-table')
    # Rows without an id attribute are the player rows.
    for tr in table('tr', id=None):
        print(tr.get_text('\t', strip=True))

out:

    player    condition    latest news    expected return    available?
    d meyler    knock    no return date    slight doubt
    s maloney    ear infection    no return date    slight doubt
    m henriksen    shoulder separation    april 1, 2017    major doubt
    mcgregor    fitness    no return date    major doubt
    w keane    acl knee injury    no return date
    m odubajo    patella fracture    may 1, 2017
    g luer    knee injury    february 1, 2017
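The loop in the answer leans on two Beautiful Soup behaviours: calling a tag like a function is shorthand for find_all(), and passing id=None keeps only the tags that have no id attribute. Here is a minimal sketch on a made-up fragment. The markup is an assumption for illustration, not the real page; it assumes team header rows carry an id, as the has_attr('id') check in the question suggests.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the injury table (not the real markup).
html = """
<table id="epl-table">
  <tr id="team-1"><td>Hull City</td></tr>
  <tr><td>D Meyler</td><td>Knock</td></tr>
  <tr id="team-2"><td>Liverpool</td></tr>
  <tr><td>S Maloney</td><td>Ear Infection</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", id="epl-table")

# table(...) is shorthand for table.find_all(...);
# id=None matches only the rows that have no id attribute.
rows = [tr.get_text("\t", strip=True) for tr in table("tr", id=None)]
for row in rows:
    print(row)
```

Only the two player rows are printed; the rows with an id are skipped, so no manual walk through siblings is needed.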

get_text()

If you only want the text part of a document or tag, you can use the get_text() method. It returns all the text in a document or beneath a tag, as a single Unicode string.
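One plausible cause of the TypeError in the question is that .string returns None when a tag has more than one child, and concatenating None with a str fails. get_text() always returns a string. A small sketch with made-up markup (an assumption; the real cells on the page may differ):

```python
from bs4 import BeautifulSoup

# Hypothetical cells: the first holds a single string, the second has several children.
soup = BeautifulSoup(
    "<table><tr><td>Knock</td><td><a href='#'>D Meyler</a> <b>new</b></td></tr></table>",
    "html.parser",
)
simple, nested = soup.find_all("td")

simple_string = simple.string    # the cell's only child is a string, so this is a str
nested_string = nested.string    # None: the cell has more than one child
nested_text = nested.get_text()  # all text beneath the tag, joined into one str

print(repr(simple_string), repr(nested_string), repr(nested_text))
```

Here nested_string + "\t" would raise the same kind of TypeError, while nested_text + "\t" works.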

You can specify a string to be used to join the bits of text together.

You can tell Beautiful Soup to strip whitespace from the beginning and end of each bit of text.
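To make the two options concrete, here is a short sketch on a hypothetical row with stray whitespace around each value (the markup is invented for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical row; the whitespace around the values is deliberate.
soup = BeautifulSoup(
    "<table><tr><td> M Odubajo </td><td>\n Patella Fracture </td>"
    "<td> May 1, 2017\n</td></tr></table>",
    "html.parser",
)
row = soup.find("tr")

joined = row.get_text("|")                # separator only; whitespace is kept
cleaned = row.get_text("\t", strip=True)  # separator plus stripping of each piece

print(repr(joined))
print(repr(cleaned))
```

With strip=True, pieces that are empty after stripping are dropped, which is why the answer's output has fewer fields on some rows than the header has columns.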

