parser

Parses web-page contents.

Preprocess contents

parse_tr(trs, ths[, sep, as_dataframe])

Parses a list of HTML <tr> elements and extracts data from a table.
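
As an illustration of the general idea only, the sketch below turns table rows into records using the standard library. It is not the implementation of parse_tr: the names RowCollector and rows_to_records, the sample HTML, and the choice of html.parser are all assumptions made for this example.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Hypothetical helper: collects the text of each <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows without <td> cells (e.g. header rows) are skipped
                self.rows.append(self._row)
            self._row = None


def rows_to_records(html, headers):
    """Pair each data row with the given column headers (assumed helper)."""
    collector = RowCollector()
    collector.feed(html)
    return [dict(zip(headers, row)) for row in collector.rows]


# Made-up sample data for the sketch.
sample_html = """
<table>
  <tr><th>Code</th><th>Name</th></tr>
  <tr><td>ABC</td><td>Alpha</td></tr>
  <tr><td>XYZ</td><td>Omega</td></tr>
</table>
"""
records = rows_to_records(sample_html, ["Code", "Name"])
```

Here the header row is skipped because it contains no <td> cells, and each remaining row becomes a dictionary keyed by the supplied headers.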

parse_table(source[, parser, as_dataframe])

Parses HTML <tr> elements to create a table from the given source.

parse_date(str_date[, as_date_type])

Parses a string representation of a date into a formatted date.
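
A minimal sketch of this kind of conversion is shown below. The function name parse_date_sketch, the input format "%d %B %Y", and the ISO-format string output are assumptions for illustration; the actual behaviour of parse_date may differ.

```python
from datetime import datetime


def parse_date_sketch(str_date, as_date_type=False, fmt="%d %B %Y"):
    """Hypothetical example: parse a date string such as '15 March 2021'.

    Returns a datetime.date if as_date_type is True, otherwise an
    ISO-formatted string (an assumed output format). Month names are
    matched according to the current locale.
    """
    parsed = datetime.strptime(str_date, fmt).date()
    return parsed if as_date_type else parsed.isoformat()


formatted = parse_date_sketch("15 March 2021")
as_object = parse_date_sketch("15 March 2021", as_date_type=True)
```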

Extract information

get_site_map([update, ...])

Gets the site map.

get_last_updated_date(url[, parsed, ...])

Gets the date on which a specified web page was last updated.

get_financial_year(date)

Gets the financial year of a given date.
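
The calculation can be sketched as follows, assuming the financial year starts on 1 April and is labelled by the calendar year in which it begins. Both points are assumptions for this example, as are the name financial_year_sketch and the start_month parameter.

```python
from datetime import date


def financial_year_sketch(d, start_month=4):
    """Hypothetical example: return the year in which d's financial year began.

    Assumes the financial year starts on the first day of start_month
    (April by default).
    """
    return d.year if d.month >= start_month else d.year - 1


before_start = financial_year_sketch(date(2021, 3, 31))
on_start = financial_year_sketch(date(2021, 4, 1))
```

Under these assumptions, 31 March 2021 falls in the financial year that began in 2020, while 1 April 2021 starts the next one.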

get_catalogue(url[, update, json_it, ...])

Gets the catalogue of items from the main page of a data cluster.

get_category_menu(name[, update, ...])

Gets a menu of the available classes from the specified URL.

get_page_catalogue(url[, head_tag_name, ...])

Gets the catalogue of features from the main page of a data cluster.

get_heading_text(heading_tag[, elem_tag_name])

Gets the text from a given HTML heading tag.

get_hypertext(hypertext_tag[, ...])

Gets hyperlinked text from a specified HTML tag.
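
For illustration, one way to pull link text and targets out of an HTML fragment with the standard library is sketched below. LinkCollector and extract_links are hypothetical names, the sample fragment is made up, and this is not how get_hypertext itself is implemented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects (text, href) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the link currently being read
        self._text = []    # text fragments inside the current link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are ignored.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (link text, href) tuples found in html."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Made-up sample fragment for the sketch.
links = extract_links(
    '<p>See <a href="/codes">Codes</a> and <a href="/maps"><b>Maps</b></a>.</p>'
)
```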

get_introduction(url[, delimiter, update, ...])

Gets the introduction section of a specified web page.