get_heading_text

pyrcs.parser.get_heading_text(heading_tag, elem_tag_name='em')

Gets the text from a given HTML heading tag.

Parameters:
  • heading_tag (bs4.element.Tag) – The HTML tag of a heading element.

  • elem_tag_name (str) – The tag name of an inner element within the heading; defaults to 'em'.

Returns:

Cleaned text of the heading tag.

Return type:

str

Examples:

>>> from pyrcs.parser import get_heading_text
>>> from pyrcs.line_data import Electrification
>>> from pyhelpers.ops import fake_requests_headers
>>> import bs4
>>> import requests
>>> elec = Electrification()
>>> url = elec.catalogue[elec.KEY_TO_INDEPENDENT_LINES]
>>> # Fetch the catalogue page and parse it
>>> source = requests.get(url=url, headers=fake_requests_headers())
>>> soup = bs4.BeautifulSoup(markup=source.content, features='html.parser')
>>> # Take the first <h3> heading on the page
>>> h3 = soup.find('h3')
>>> h3_text = get_heading_text(heading_tag=h3, elem_tag_name='em')
>>> h3_text
'Beamish Tramway'
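
To try out the input type without a network request, a heading tag can be built from an inline HTML string. The following is a minimal sketch using bs4 only: the markup is invented for illustration, and the plain-bs4 text extraction is shown only as a point of comparison, not as what get_heading_text does internally.

```python
import bs4

# Hypothetical markup, made up for illustration (not taken from the live page).
html = '<h3>Beamish Tramway <em>(example note)</em></h3>'

soup = bs4.BeautifulSoup(markup=html, features='html.parser')

# soup.find returns a bs4.element.Tag, the type expected for heading_tag.
h3 = soup.find('h3')

# The inner element that elem_tag_name='em' refers to.
inner = h3.find('em')

# Plain-bs4 text extraction, with runs of whitespace collapsed.
plain_text = ' '.join(h3.get_text().split())
inner_text = ' '.join(inner.get_text().split())
```

A tag built this way can be passed as heading_tag in the same way as the h3 obtained from the live page above.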