Transport Data Science with R
- Start date: TBC
- End date: TBC
- Duration: TBC
This course teaches two skill sets that are fundamental to modern transport research: programming and data analytics. Combining them enables powerful transport planning and analysis workflows for answering a wide range of questions, including:
- How can you find, download and import a range of transport datasets?
- How can you develop automated and reproducible transport planning workflows?
- How can increasingly available datasets on air quality, traffic and active travel be used to inform policy?
- How can you visualise results in an attractive, and potentially online and interactive, manner?
This course will provide tools, code, data and, above all, face-to-face teaching to answer these questions and more, with the statistical programming language R. The data science approach opens a world of possibilities for generating insight from your transport datasets. The course is suitable for researchers in the public sector, academia and industry.
By the end of the course you will be able to:
- Find, download and import a variety of transport datasets, including from OpenStreetMap and government data portals
- Work with, analyse and model transport data with spatial, temporal and demographic attributes
- Work with air pollution data in R and compare it with transport behaviours
- Generate and analyse route networks for transport planning with reference to:
  - Origin-destination (OD) data
  - Geographic desire lines
  - Route allocation using different routing services
  - Route network generation and analysis
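To give a flavour of the desire-line step listed above, here is a minimal sketch in base R. The zone codes, coordinates and trip counts are invented for illustration, and the column names are assumptions rather than course materials. With real geographic data this step is usually handled by a dedicated package, for example the `od2line()` function in stplanr.

```r
# Hypothetical zone centroids in projected coordinates (metres)
zones <- data.frame(
  zone = c("A", "B", "C"),
  x = c(0, 3000, 0),
  y = c(0, 4000, 5000)
)

# Hypothetical origin-destination (OD) table: trips between pairs of zones
od <- data.frame(
  origin = c("A", "A", "B"),
  destination = c("B", "C", "C"),
  trips = c(120, 45, 80)
)

# Look up the centroid coordinates of each origin and destination zone
o <- zones[match(od$origin, zones$zone), c("x", "y")]
d <- zones[match(od$destination, zones$zone), c("x", "y")]

# A desire line is the straight line joining the two centroids;
# here it is stored as start/end coordinates plus its length
desire_lines <- data.frame(
  od,
  x_o = o$x, y_o = o$y,
  x_d = d$x, y_d = d$y,
  length_m = sqrt((d$x - o$x)^2 + (d$y - o$y)^2)
)

print(desire_lines)
```

Each row of `desire_lines` keeps the original trip count, so the lines can later be filtered or weighted by flow before any routing is attempted.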
- Registration and refreshments (09:00 – 09:20)
- Getting set up in the cluster (09:20 – 09:30)
- Finding, downloading, importing transport data (09:30 – 11:00)
  - An overview of data portals
  - Origin-destination data
  - OpenStreetMap data
  - Other data sources
- Coffee break (11:00 – 11:10)
- Working with spatio-temporal data (11:10 – 12:30)
  - Introduction to STATS19
  - Temporal analysis
  - Spatial analysis
  - Analysis and modelling
- Lunch (12:30 – 13:30)
- Traffic data and pollution analysis with R (13:30 – 15:30, delivered by Dr James Tate)
  - An introduction to the openair package
  - Traffic count data
  - Meteorological data
  - Air pollution data: daily, weekly and seasonal variability
  - Visualising air pollution data and next steps
- Refreshments (15:30 – 15:45)
- From desire lines to route networks (15:45 – 16:45)
  - Handling OD data
  - Creating ‘desire lines’ from OD and zone data
  - Route allocation and route network creation
  - Route network analysis (comparing with other datasets)
- Discussion and applying the methods to your data (16:00 onwards)
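As an indication of the kind of temporal analysis covered in the spatio-temporal session, the following is a minimal base R sketch. The records are made up and merely stand in for timestamped road-casualty data such as STATS19; the column names `datetime` and `severity` are assumptions, not the real dataset's schema.

```r
# Hypothetical timestamped records standing in for road-casualty data
crashes <- data.frame(
  datetime = as.POSIXct(
    c("2019-01-07 08:15", "2019-01-07 08:40", "2019-01-07 17:30",
      "2019-01-12 13:05", "2019-01-14 08:55"),
    format = "%Y-%m-%d %H:%M",
    tz = "UTC"
  ),
  severity = c("Slight", "Serious", "Slight", "Slight", "Fatal")
)

# Extract the hour of day as an integer
crashes$hour <- as.integer(format(crashes$datetime, "%H"))

# Flag weekends using the ISO weekday number (1 = Monday, 7 = Sunday),
# which avoids relying on locale-specific day names
crashes$is_weekend <- format(crashes$datetime, "%u") %in% c("6", "7")

# Count records per hour of day and per weekend/weekday group
crashes_per_hour <- table(crashes$hour)
crashes_by_daytype <- table(weekend = crashes$is_weekend)

print(crashes_per_hour)
print(crashes_by_daytype)
```

The same pattern (derive a time component, then tabulate or aggregate by it) extends to daily, weekly and seasonal summaries of traffic counts or pollutant concentrations.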
Who should attend?
Prior experience with transport datasets is a prerequisite for the course. Attendees are expected to:
- Be comfortable with the use of R, using it for everyday data analysis tasks (you will find DataCamp’s free Introduction to R easy)
- Have experience with transport datasets and understand their structure (you will be familiar with the contents of the Transport chapter in Geocomputation with R)
Participants are expected to brush up on their knowledge before the course, for example by completing the exercises linked to in the bullet points above.
Computers with RStudio installed will be available for course attendees. However, for maximum benefit, we recommend that participants bring their own laptops with a recent version of R installed (3.5.0 or later). Steps to set up a suitable R/RStudio environment are described in sections 2.3 and 2.5 of the book Efficient R Programming. The following packages should be installed prior to attending the course:
Robin Lovelace is a researcher at the Leeds Institute for Transport Studies (ITS) and the Leeds Institute for Data Analytics (LIDA). Robin has many years of experience using R for academic research and has taught numerous R courses at all levels. He has developed popular R resources, including the books Efficient R Programming (Gillespie and Lovelace 2016), Spatial Microsimulation with R (Lovelace and Dumont 2016), and Geocomputation with R (Lovelace et al. 2019).
These skills have been applied in a number of projects with real-world applications, including the Propensity to Cycle Tool, a nationally scalable interactive online mapping application, and the stplanr package.
James Tate is a vehicle emissions and air quality expert focussing on the impacts of road transport on the environment. He has developed and deployed new approaches to survey and model the emission performance of the UK/EU road transport fleet. James has been using R as the primary tool in his data analysis workflow for a decade and has developed popular modules teaching R to Master’s students in ITS.
Early bird prices (valid until 1st March)
Academic, public sector and charitable sector: £300
Price (valid 1st March – 3rd April)
Academic, public sector and charitable sector: £350
The course will be held in the Leeds Institute for Data Analytics, computer cluster 11.06. It is open to students, academic staff and external delegates. Please note the fee includes learning materials, lunch and refreshments during the course.
The course is also available as bespoke or in-company training.
Institute for Transport Studies
Leeds LS2 9JT