Quick Start

This brief tutorial provides a step-by-step guide to using PyRCS and highlights its key features. It demonstrates how to retrieve three categories of codes used in the UK railway system, which are commonly applied in both practical and research contexts:

  • Location identifiers

  • ELRs and mileages

  • Railway station data

Through practical examples, this tutorial will guide you in understanding how PyRCS works and how to use it effectively.

Location Identifiers

The location identifiers, including CRS, NLC, TIPLOC and STANOX codes, are classified as line data on the Railway Codes website. To retrieve these codes using PyRCS, we can use the LocationIdentifiers class, contained in the line_data subpackage.

First, let’s import the class and create an instance:

>>> from pyrcs.line_data import LocationIdentifiers
>>> # Alternatively, from pyrcs import LocationIdentifiers
>>> lid = LocationIdentifiers()
>>> lid.NAME
'CRS, NLC, TIPLOC and STANOX codes'
>>> lid.URL
'http://www.railwaycodes.org.uk/crs/crs0.shtm'

Alternatively, we can create the instance using the LineData class:

>>> from pyrcs.collector import LineData
>>> # Alternatively, from pyrcs import LineData
>>> ld = LineData()
>>> lid_ = ld.LocationIdentifiers
>>> lid.NAME == lid_.NAME
True

Note

  • The instance ld encompasses all classes within the line data category.

  • lid_ is equivalent to lid.

Location identifiers by initial letter

We can retrieve codes (as a pandas.DataFrame) for all locations whose names start with a specific letter using the LocationIdentifiers.collect_loc_id() method. The value passed to the initial parameter is case-insensitive. For example, to get the codes for locations whose names begin with the letter 'A' (or 'a'):

>>> loc_a_codes = lid.collect_loc_id(initial='a', verbose=True)
To collect data of CRS, NLC, TIPLOC and STANOX codes beginning with "A"
? [No]|Yes: yes
Collecting the data ... Done.
>>> type(loc_a_codes)
dict
>>> list(loc_a_codes.keys())
['A', 'Notes', 'Last updated date']

As shown above, loc_a_codes is a dictionary (dict) with the following keys:

  • 'A'

  • 'Notes'

  • 'Last updated date'

The corresponding values are:

  • loc_a_codes['A'] - CRS, NLC, TIPLOC and STANOX codes for the locations whose names begin with 'A', referring to the table on the Locations beginning A web page.

  • loc_a_codes['Notes'] - Any additional information provided on the web page (if available).

  • loc_a_codes['Last updated date'] - The date when the Locations beginning A web page was last updated.
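
A result with these keys can be summarised in one step. The sketch below is a minimal illustration and not part of PyRCS: describe_result is a hypothetical helper, and the stand-in dictionary uses a plain list of rows in place of the pandas.DataFrame, so that it runs without network access.

```python
def describe_result(result, data_key):
    """Summarise a PyRCS-style result dictionary (hypothetical helper)."""
    table = result[data_key]
    # 'Notes' may be absent or None, so fall back to a readable default.
    notes = result.get('Notes') or 'no notes'
    updated = result.get('Last updated date', 'unknown')
    # len() works for both a list of rows and a pandas.DataFrame.
    return f"{len(table)} rows; {notes}; last updated {updated}"


# Stand-in with the same keys as loc_a_codes (values are illustrative only).
example = {
    'A': [{'Location': 'Aachen'}, {'Location': 'Ayr Welders'}],
    'Notes': None,
    'Last updated date': '2025-02-19',
}
print(describe_result(example, data_key='A'))
```

Because the helper relies only on dictionary access and len(), it should work unchanged when the value under 'A' is a real DataFrame.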

A snapshot of the data contained in loc_a_codes is shown below:

>>> loc_a_codes_dat = loc_a_codes['A']
>>> type(loc_a_codes_dat)
pandas.core.frame.DataFrame
>>> loc_a_codes_dat
                                 Location CRS  ... STANME_Note STANOX_Note
0                    1999 Reorganisations      ...
1                                      A1      ...
2                          A463 Traded In      ...
3     A483 Road Scheme Supervisors Closed      ...
4                                  Aachen      ...
...                                   ...  ..  ...         ...         ...
3319                       Ayr Wagon Team      ...
3320                       Ayr Wagon Team      ...
3321                       Ayr Wagon Team      ...
3322                          Ayr Welders      ...
3323                    Aztec Travel S378      ...
[3324 rows x 12 columns]
>>> print(f"Notes: {loc_a_codes['Notes']}")
Notes: None
>>> print(f"Last updated date: {loc_a_codes['Last updated date']}")
Last updated date: 2025-02-19

>>> ## Try more examples! Uncomment the lines below and run:
>>> # loc_a_codes = lid.fetch_loc_id('a')  # Fetch location codes starting with 'A'
>>> # loc_codes = lid.fetch_loc_id()  # Fetch all location codes

All available location identifiers

Beyond retrieving location codes for a specific letter, we can use the LocationIdentifiers.fetch_codes() method to obtain codes for all locations with names starting from 'A' to 'Z':

>>> loc_codes = lid.fetch_codes()
>>> type(loc_codes)
dict
>>> list(loc_codes.keys())
['Location ID', 'Other systems', 'Notes', 'Last updated date']

The loc_codes object is a dictionary with the following keys:

  • 'Location ID'

  • 'Other systems'

  • 'Notes'

  • 'Last updated date'

The corresponding values are:

  • loc_codes['Location ID'] - CRS, NLC, TIPLOC and STANOX codes for all locations listed across the relevant web pages.

  • loc_codes['Other systems'] - Codes related to the other systems.

  • loc_codes['Notes'] - Any notes and information (if available).

  • loc_codes['Last updated date'] - The most recent 'Last updated date' across all initial-specific data.
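
Conceptually, the combined result can be thought of as the per-letter results merged together, with the most recent date kept. The following sketch shows that idea only; it is not how PyRCS is implemented. Both merge_by_initial and fake_collect are hypothetical names, fake_collect stands in for a per-letter collector such as lid.collect_loc_id, and rows are plain dictionaries rather than a DataFrame.

```python
import string


def merge_by_initial(collect, data_key):
    """Merge per-letter results into one result (illustrative sketch)."""
    rows, dates = [], []
    for letter in string.ascii_uppercase:
        result = collect(letter)
        if not result:  # nothing available for this letter
            continue
        rows.extend(result.get(letter, []))
        if result.get('Last updated date'):
            dates.append(result['Last updated date'])
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as strings.
    latest = max(dates) if dates else None
    return {data_key: rows, 'Last updated date': latest}


def fake_collect(letter):
    """Stand-in collector with made-up dates; no network access."""
    sample = {
        'A': {'A': [{'Location': 'Aachen'}], 'Last updated date': '2025-02-19'},
        'Z': {'Z': [{'Location': 'ZZWMNST'}], 'Last updated date': '2025-01-05'},
    }
    return sample.get(letter)


merged = merge_by_initial(fake_collect, data_key='Location ID')
print(len(merged['Location ID']), merged['Last updated date'])
```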

Here is a snapshot of the data contained in loc_codes:

>>> lid.KEY
'Location ID'
>>> loc_codes_dat = loc_codes[lid.KEY]  # loc_codes['Location ID']
>>> type(loc_codes_dat)
pandas.core.frame.DataFrame
>>> loc_codes_dat
                                  Location CRS  ... STANME_Note STANOX_Note
0                     1999 Reorganisations      ...
1                                       A1      ...
2                           A463 Traded In      ...
3      A483 Road Scheme Supervisors Closed      ...
4                                   Aachen      ...
...                                    ...  ..  ...         ...         ...
59877                              ZZTYALS      ...
59878                              ZZTYKKH      ...
59879                              ZZTYLIN      ...
59880                              ZZTYSGY      ...
59881                              ZZWMNST      ...
[59882 rows x 12 columns]
>>> loc_codes_dat[['Location', 'Location_Note']]
                                  Location    Location_Note
0                     1999 Reorganisations
1                                       A1
2                           A463 Traded In
3      A483 Road Scheme Supervisors Closed
4                                   Aachen
...                                    ...              ...
59877                              ZZTYALS       see Alston
59878                              ZZTYKKH    see Kirkhaugh
59879                              ZZTYLIN      see Lintley
59880                              ZZTYSGY   see Slaggyford
59881                              ZZWMNST  see Westminster
[59882 rows x 2 columns]

To access codes from other systems, such as Crossrail or the Tyne & Wear Metro:

>>> lid.KEY_TO_OTHER_SYSTEMS
'Other systems'
>>> os_codes_dat = loc_codes[lid.KEY_TO_OTHER_SYSTEMS]
>>> type(os_codes_dat)
collections.defaultdict
>>> list(os_codes_dat.keys())
['Córas Iompair Éireann (Republic of Ireland)',
 'Crossrail',
 'Croydon Tramlink',
 'Docklands Light Railway',
 'Manchester Metrolink',
 'Translink (Northern Ireland)',
 'Tyne & Wear Metro']

For example, to view the data for Crossrail:

>>> crossrail_codes_dat = os_codes_dat['Crossrail']
>>> type(crossrail_codes_dat)
pandas.core.frame.DataFrame
>>> crossrail_codes_dat.head()
                                      Location  ... New operating code
0                                   Abbey Wood  ...                ABW
1  Abbey Wood Bolthole Berth/Crossrail Sidings  ...
2                           Abbey Wood Sidings  ...
3                                  Bond Street  ...                BDS
4                                 Canary Wharf  ...                CWX
[5 rows x 5 columns]
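
Since the 'Other systems' value is a collections.defaultdict, it is worth checking whether a system exists before indexing it: indexing a missing key on a defaultdict silently creates an entry, whereas a membership test or .get() does not. The sketch below uses a small stand-in dictionary; the default factory (list) and the rows are assumptions made for illustration.

```python
from collections import defaultdict

# Stand-in for loc_codes['Other systems']; the factory and rows are illustrative.
systems = defaultdict(list)
systems['Crossrail'] = [{'Location': 'Abbey Wood'}, {'Location': 'Bond Street'}]
systems['Tyne & Wear Metro'] = [{'Location': 'Airport'}]

# Membership tests and .get() do not create new entries ...
print('Glasgow Subway' in systems)
print(systems.get('Glasgow Subway'))

# ... whereas indexing a missing key adds it with the default value.
_ = systems['Glasgow Subway']
print(sorted(systems))
```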

>>> ## Try more examples! Uncomment the lines below and run:
>>> ## Get a dictionary for STANOX codes and location names
>>> # stanox_dict = lid.make_xref_dict('STANOX')
>>> ## ... and for STANOX, TIPLOC and location names starting with 'A'
>>> # stanox_tiploc_dict_a = lid.make_xref_dict(['STANOX', 'TIPLOC'], initials='a')
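
A cross-reference dictionary of this kind maps a code to a location name. The sketch below builds such a mapping by hand from a few stand-in rows. It only illustrates the idea and makes no claim about how make_xref_dict is implemented or what exactly it returns; build_xref is a hypothetical helper and the codes are made up.

```python
def build_xref(rows, key, value='Location'):
    """Map row[key] -> row[value], skipping rows with an empty key."""
    xref = {}
    for row in rows:
        code = row.get(key, '')
        if code:
            # Keep the first location seen for each code.
            xref.setdefault(code, row[value])
    return xref


# Made-up codes for illustration only.
rows = [
    {'Location': 'Example Junction', 'STANOX': '00001'},
    {'Location': 'Example Sidings', 'STANOX': ''},
    {'Location': 'Example Junction (duplicate)', 'STANOX': '00001'},
]
print(build_xref(rows, key='STANOX'))
```

Skipping empty codes matters here because many locations in the table have no value for one or more of the code columns.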

ELRs and mileages

Engineer’s Line References (ELRs) are also commonly encountered in various data sets within the UK’s railway system. To retrieve the codes for ELRs along with their associated mileage files, we can use the ELRMileages class:

>>> from pyrcs.line_data import ELRMileages
>>> # Alternatively, from pyrcs import ELRMileages
>>> em = ELRMileages()
>>> em.NAME
"Engineer's Line References (ELRs)"
>>> em.URL
'http://www.railwaycodes.org.uk/elrs/elr0.shtm'

Engineer’s Line References (ELRs)

Similar to location identifiers, the ELR codes on the Railway Codes website are arranged alphabetically based on their initial letters. We can use the ELRMileages.collect_elr() method to obtain ELRs starting with a specific letter. For example, to get the data for ELRs beginning with the letter 'A':

>>> elrs_a_codes = em.collect_elr(initial='a', verbose=True)
To collect data of Engineer's Line References (ELRs) beginning with "A"
? [No]|Yes: yes
Collecting the data ... Done.
>>> type(elrs_a_codes)
dict
>>> list(elrs_a_codes.keys())
['A', 'Last updated date']

The elrs_a_codes object is a dictionary with the following keys:

  • 'A'

  • 'Last updated date'

The corresponding values are:

  • elrs_a_codes['A'] - Data for ELRs that begin with 'A', referring to the table presented on the ELRs beginning with A web page.

  • elrs_a_codes['Last updated date'] - The date when the ELRs beginning with A web page was last updated.

Here is a snapshot of the data contained in elrs_a_codes:

>>> elrs_a_codes_dat = elrs_a_codes['A']
>>> type(elrs_a_codes_dat)
pandas.core.frame.DataFrame
>>> elrs_a_codes_dat
      ELR  ...         Notes
0     AAL  ...      Now NAJ3
1     AAM  ...  Formerly AML
2     AAV  ...
3     ABB  ...       Now AHB
4     ABB  ...
..    ...  ...           ...
186  AYR4  ...
187  AYR5  ...
188  AYR6  ...
189   AYS  ...
190   AYT  ...
[191 rows x 5 columns]
>>> print(f"Last updated date: {elrs_a_codes['Last updated date']}")
Last updated date: 2024-10-20

To retrieve data for all ELRs (from 'A' to 'Z'), we can use the ELRMileages.fetch_elr() method:

>>> elrs_codes = em.fetch_elr()
>>> type(elrs_codes)
dict
>>> list(elrs_codes.keys())
['ELRs and mileages', 'Last updated date']

Similarly, elrs_codes is a dictionary with the following keys:

  • 'ELRs and mileages'

  • 'Last updated date'

The corresponding values are:

  • elrs_codes['ELRs and mileages'] - Codes for all available ELRs (with the initial letters ranging from 'A' to 'Z').

  • elrs_codes['Last updated date'] - The most recent update date among all the ELR data.

Here is a snapshot of the data contained in elrs_codes:

>>> elrs_codes_dat = elrs_codes[em.KEY]
>>> type(elrs_codes_dat)
pandas.core.frame.DataFrame
>>> elrs_codes_dat
       ELR  ...         Notes
0      AAL  ...      Now NAJ3
1      AAM  ...  Formerly AML
2      AAV  ...
3      ABB  ...       Now AHB
4      ABB  ...
...    ...  ...           ...
4542  ZGW1  ...
4543  ZGW2  ...
4544   ZZY  ...
4545   ZZZ  ...
4546  ZZZ9  ...
[4547 rows x 5 columns]
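
Note that an ELR can appear in more than one row (ABB does in the output above), so the 'ELR' column should not be assumed to be a unique key. A quick way to spot repeated values is sketched below with a plain list; with the real data, the list could be obtained with elrs_codes_dat['ELR'].tolist().

```python
from collections import Counter

# ELR values copied from the first five rows shown above.
elrs = ['AAL', 'AAM', 'AAV', 'ABB', 'ABB']

counts = Counter(elrs)
repeated = [elr for elr, n in counts.items() if n > 1]
print(repeated)
```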

>>> ## Try more examples! Uncomment the lines below and run:
>>> # elrs_a_codes = em.fetch_elr(initial='a')  # Fetch ELRs starting with 'A'
>>> # elrs_b_codes = em.fetch_elr(initial='B')  # Fetch ELRs starting with 'B'

Mileage file of a given ELR

In addition to the codes of ELRs, each ELR is associated with a mileage file that specifies the major mileages along the line. To retrieve this data, we can use the ELRMileages.fetch_mileage_file() method.

For example, to get the mileage file for 'AAM':

>>> amm_mileage_file = em.fetch_mileage_file(elr='AAM')
>>> type(amm_mileage_file)
dict
>>> list(amm_mileage_file.keys())
['ELR', 'Line', 'Sub-Line', 'Mileage', 'Notes']

The amm_mileage_file object is also a dictionary and has the following keys:

  • 'ELR'

  • 'Line'

  • 'Sub-Line'

  • 'Mileage'

  • 'Notes'

The corresponding values are:

  • amm_mileage_file['ELR'] - The given ELR (in this example, 'AAM').

  • amm_mileage_file['Line'] - The name of the line associated with the ELR.

  • amm_mileage_file['Sub-Line'] - The sub-line name (if applicable).

  • amm_mileage_file['Mileage'] - The major mileages along the line.

  • amm_mileage_file['Notes'] - Additional notes or information (if available).

Here is a snapshot of the data contained in amm_mileage_file:

>>> amm_mileage_file['Line']
'Ashchurch and Malvern Line'
>>> amm_mileage_file['Mileage']
   Mileage Mileage_Note  ... Link_2_ELR Link_2_Mile_Chain
0   0.0000               ...
1   0.0154               ...
2   0.0396               ...
3   1.1012               ...
4   1.1408               ...
5   5.0330               ...
6   7.0374               ...
7  11.1298               ...
8  13.0638               ...
[9 rows x 11 columns]
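
The values in the 'Mileage' column above appear to follow a miles.yards convention, in which the four digits after the point are yards (so 1.1012 would read as 1 mile 1012 yards). Assuming that reading is right, a value can be converted to miles and chains (22 yards to a chain, 80 chains to a mile) as sketched below. split_mileage is a hypothetical helper written for this tutorial, not a PyRCS function.

```python
def split_mileage(mileage):
    """Split a 'miles.yards' value such as '1.1012' into (miles, chains)."""
    # Format numbers to four decimals so that 0.0154 keeps its leading zero.
    text = mileage if isinstance(mileage, str) else f"{mileage:.4f}"
    miles, _, yards = text.partition('.')
    # Pad or trim the fractional part to exactly four digits of yards.
    yards = int(yards.ljust(4, '0')[:4]) if yards else 0
    return int(miles), yards / 22  # 22 yards per chain


for value in ['0.0154', '1.1012', '11.1298']:
    miles, chains = split_mileage(value)
    print(f"{value} -> {miles}m {chains:g}ch")
```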

>>> ## Try more examples! Uncomment the lines below and run:
>>> # xre_mileage_file = em.fetch_mileage_file('XRE')  # Fetch mileage file for 'XRE'
>>> # your_mileage_file = em.fetch_mileage_file(elr='?')  # ... and for a given ELR '?'

Railway station data

The railway station data includes information such as the station name, ELR, mileage, status, owner, operator, coordinates and grid reference. This data is available in the other assets section of the Railway Codes website and can be retrieved using the Stations class contained in the other_assets subpackage.

To get the data, let’s import the Stations class and create an instance:

>>> from pyrcs.other_assets import Stations  # from pyrcs import Stations
>>> stn = Stations()
>>> stn.NAME
'Railway station data'
>>> stn.URL
'http://www.railwaycodes.org.uk/stations/station0.shtm'

Alternatively, we can create the instance using the OtherAssets class:

>>> from pyrcs.collector import OtherAssets  # from pyrcs import OtherAssets
>>> oa = OtherAssets()
>>> stn_ = oa.Stations
>>> stn.NAME == stn_.NAME
True

Note

  • The instance oa encompasses all classes within the other assets category.

  • stn_ is equivalent to stn.

Railway stations by initial letter

We can obtain railway station data based on the first letter (e.g. 'A' or 'Z') of the station’s name using the Stations.collect_locations() method. For example, to get data for stations starting with 'A':

>>> stn_loc_a_codes = stn.collect_locations(initial='a', verbose=True)
To collect data of mileages, operators and grid coordinates beginning with "A"
? [No]|Yes: yes
Collecting the data ... Done.
>>> type(stn_loc_a_codes)
dict
>>> list(stn_loc_a_codes.keys())
['A', 'Last updated date']

The dictionary stn_loc_a_codes includes the following keys:

  • 'A'

  • 'Last updated date'

The corresponding values are:

  • stn_loc_a_codes['A'] - Data for railway stations whose names begin with 'A', including mileages, operators and grid coordinates, referring to the table on the Stations beginning with A web page.

  • stn_loc_a_codes['Last updated date'] - The date when the Stations beginning with A web page was last updated.

Here is a snapshot of the data contained in stn_loc_a_codes:

>>> stn_loc_a_codes_dat = stn_loc_a_codes['A']
>>> type(stn_loc_a_codes_dat)
pandas.core.frame.DataFrame
>>> stn_loc_a_codes_dat
                                Station  ...                Former Operator
0    Abbey Wood Abbey Wood / ABBEY WOOD  ...  London & South Eastern Rai...
1    Abbey Wood Abbey Wood / ABBEY WOOD  ...  London & South Eastern Rai...
2                                  Aber  ...  Keolis Amey Operations/Gwe...
3                             Abercynon  ...  Keolis Amey Operations/Gwe...
4                             Abercynon  ...  Keolis Amey Operations/Gwe...
..                                  ...  ...                            ...
137              Aylesbury Vale Parkway  ...
138                           Aylesford  ...  London & South Eastern Rai...
139                            Aylesham  ...  London & South Eastern Rai...
140                                 Ayr  ...  Abellio ScotRail from 1 Ap...
141                                 Ayr  ...  Abellio ScotRail from 1 Ap...
[142 rows x 14 columns]
>>> stn_loc_a_codes_dat.columns.to_list()
['Station',
 'Station Note',
 'ELR',
 'Mileage',
 'Status',
 'Degrees Longitude',
 'Degrees Latitude',
 'Grid Reference',
 'CRS',
 'CRS Note',
 'Owner',
 'Former Owner',
 'Operator',
 'Former Operator']
>>> stn_loc_a_codes_dat[['Station', 'ELR', 'Mileage']]
                                Station   ELR   Mileage
0    Abbey Wood Abbey Wood / ABBEY WOOD   NKL  11m 43ch
1    Abbey Wood Abbey Wood / ABBEY WOOD   XRS  24.458km
2                                  Aber   CAR   8m 69ch
3                             Abercynon   CAM  16m 28ch
4                             Abercynon   ABD  16m 28ch
..                                  ...   ...       ...
137              Aylesbury Vale Parkway  MCJ2  40m 38ch
138                           Aylesford  PWS2  38m 74ch
139                            Aylesham   FDM  68m 66ch
140                                 Ayr  AYR6  40m 49ch
141                                 Ayr  STR1  40m 49ch
[142 rows x 3 columns]
>>> print(f"Last updated date: {stn_loc_a_codes['Last updated date']}")
Last updated date: 2025-02-12
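
The 'Mileage' column mixes two notations in the output above: miles and chains (e.g. '11m 43ch') and kilometres (e.g. '24.458km'). For numerical work it can help to bring both to a common unit. The sketch below converts either form to kilometres; mileage_to_km is a hypothetical helper, and it assumes that only these two notations occur, returning None for anything else.

```python
import re

MILE_KM = 1.609344        # kilometres per international mile
CHAIN_KM = MILE_KM / 80   # 80 chains per mile


def mileage_to_km(text):
    """Convert '11m 43ch' or '24.458km' to kilometres; None if unrecognised."""
    text = text.strip()
    match = re.fullmatch(r'(\d+)m\s+(\d+(?:\.\d+)?)ch', text)
    if match:
        return int(match.group(1)) * MILE_KM + float(match.group(2)) * CHAIN_KM
    match = re.fullmatch(r'(\d+(?:\.\d+)?)km', text)
    if match:
        return float(match.group(1))
    return None


for value in ['11m 43ch', '24.458km', 'unknown']:
    print(value, '->', mileage_to_km(value))
```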

All available railway stations

To retrieve data for all railway stations available in the other assets category, we can use the Stations.fetch_locations() method:

>>> stn_loc_codes = stn.fetch_locations()
>>> type(stn_loc_codes)
dict
>>> list(stn_loc_codes.keys())
['Mileages, operators and grid coordinates', 'Last updated date']

The dictionary stn_loc_codes includes the following keys:

  • 'Mileages, operators and grid coordinates'

  • 'Last updated date'

The corresponding values are:

  • stn_loc_codes['Mileages, operators and grid coordinates'] - Data for all railway stations, with the initial letters ranging from 'A' to 'Z'.

  • stn_loc_codes['Last updated date'] - The most recent update date among all the station data.

Here is a snapshot of the data contained in stn_loc_codes:

>>> stn.KEY_TO_STN
'Mileages, operators and grid coordinates'
>>> stn_loc_codes_dat = stn_loc_codes[stn.KEY_TO_STN]
>>> type(stn_loc_codes_dat)
pandas.core.frame.DataFrame
>>> stn_loc_codes_dat
                                 Station  ...               Former Operator
0     Abbey Wood Abbey Wood / ABBEY WOOD  ...  London & South Eastern Ra...
1     Abbey Wood Abbey Wood / ABBEY WOOD  ...  London & South Eastern Ra...
2                                   Aber  ...  Keolis Amey Operations/Gw...
3                              Abercynon  ...  Keolis Amey Operations/Gw...
4                              Abercynon  ...  Keolis Amey Operations/Gw...
...                                  ...  ...                           ...
2900                                York  ...  East Coast Main Line Comp...
2901                                York  ...  East Coast Main Line Comp...
2902                              Yorton  ...  Keolis Amey Operations/Gw...
2903                       Ystrad Mynach  ...  Keolis Amey Operations/Gw...
2904                      Ystrad Rhondda  ...  Keolis Amey Operations/Gw...
[2905 rows x 14 columns]
>>> sel_cols = ['Station', 'ELR', 'Mileage', 'Degrees Longitude', 'Degrees Latitude']
>>> stn_loc_codes_dat[sel_cols].tail()
             Station   ELR    Mileage  Degrees Longitude  Degrees Latitude
2900            York  ECM5    0m 00ch            -1.0920           53.9584
2901            York  ECM4  188m 40ch            -1.0920           53.9584
2902          Yorton   SYC   25m 14ch            -2.7360           52.8083
2903   Ystrad Mynach   CAR   13m 60ch            -3.2414           51.6414
2904  Ystrad Rhondda   THT   20m 05ch            -3.4666           51.6436
>>> print(f"Last updated date: {stn_loc_codes['Last updated date']}")
Last updated date: 2025-02-26
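
The longitude and latitude columns make simple spatial calculations possible. As one illustration, the sketch below estimates the great-circle distance between two stations with the haversine formula. It assumes a spherical Earth with a mean radius of 6371 km, so the result is approximate; haversine_km is a hypothetical helper, not part of PyRCS.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in kilometres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * asin(sqrt(a))


# Coordinates copied from the York and Yorton rows shown above.
york = (-1.0920, 53.9584)
yorton = (-2.7360, 52.8083)
print(f"{haversine_km(*york, *yorton):.1f} km")
```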

>>> ## Try more examples! Uncomment the lines below and run:
>>> # stn_loc_a_codes = stn.fetch_locations('a')  # railway stations starting with 'A'
>>> # your_stn_loc_codes = stn.fetch_locations('?')  # ... and a given letter '?'


If you run into any problems using pyrcs, please report them on the Issue Tracker.

For more details and examples, check Subpackages and Modules.