
[2] Activity Location Assignment

Hussein Mahfouz edited this page Nov 12, 2024 · 1 revision

image

After matching each individual in the synthetic population to an activity chain, we need to assign each of their activities to a geographic location. The resulting travel patterns must be consistent with the person's assigned activity chain:

  • trip purpose should be the same
  • travel time should be as close as possible to the reported time

In general, activities can be classified as primary (work, school) or secondary (shopping, leisure, etc.). Primary activity locations tend to be fixed, while secondary activity locations are more flexible and are chosen subject to the constraints imposed by the primary location choice.

0. Input Data

Before assigning activities to locations, we prepare the datasets that our methods require. These include a dataset of facility locations and an approximation of zone-to-zone travel times by mode.

0.1. Travel time matrices

Each individual trip has a travel time reported in the NTS. Given a trip's origin zone, we need to determine which destination zones can be reached within the reported time.

Option 1: Using a routing engine

We use a multimodal routing engine to calculate travel times for all zone combinations. These travel times are calculated for {cars, public transport, cycling, and walking}.

Next Steps

Option 2: Estimating travel times

Travel time matrices are computationally expensive and require a lot of preprocessing (of the road network, GTFS feeds, etc.). If we do not have a travel time matrix from a routing engine, we use estimates based on distance and fixed mode speeds. See here
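A minimal sketch of this distance-based fallback, assuming zone centroids are given in projected coordinates (metres). The speeds, the detour factor, and the function name are illustrative assumptions, not values taken from the project:

```python
import math

# Assumed average door-to-door speeds in km/h (illustrative only).
MODE_SPEED_KMH = {"car": 30.0, "pt": 20.0, "cycle": 15.0, "walk": 5.0}


def estimate_travel_time_min(origin, destination, mode, detour_factor=1.3):
    """Estimate travel time in minutes between two (x, y) points in metres.

    The straight-line distance is inflated by a detour factor (an assumption)
    to approximate network distance, then divided by a fixed mode speed.
    """
    straight_line_m = math.dist(origin, destination)
    network_km = straight_line_m * detour_factor / 1000.0
    return network_km / MODE_SPEED_KMH[mode] * 60.0
```

For example, two centroids 5 km apart with `detour_factor=1.0` give a walking time of 60 minutes at the assumed 5 km/h.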

Intrazonal travel times

Multimodal routing engines report intrazonal travel times as 0. We instead calculate intrazonal times based on the area of each zone (see here).
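One common way to derive an intrazonal time from zone area is to treat the zone as a circle of equal area and use its radius as a typical trip distance. The sketch below follows that assumption; it is not necessarily the formula used in the linked code, and the function name is hypothetical:

```python
import math


def intrazonal_time_min(zone_area_km2, speed_kmh):
    """Approximate intrazonal travel time in minutes from zone area.

    Assumption: the typical intrazonal trip length equals the radius of a
    circle with the same area as the zone.
    """
    radius_km = math.sqrt(zone_area_km2 / math.pi)
    return radius_km / speed_kmh * 60.0
```

Larger zones therefore get longer intrazonal times, and slower modes get longer times for the same zone.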

0.2 Commuting matrices

Commuting matrices are used to ensure that our work zone assignment matches the distribution observed in actual data (e.g. census commuting flows). They contain the total number of people travelling between each OD pair for work. In the UK, census commuting matrices are available at the MSOA and OA levels. The use of these matrices is described in detail below.

0.3 Activity locations

In any study area, there will be facilities that match different activity types. Some facilities are mixed-use (e.g. a mall counts as both a work and a shopping location). We use the osmox library to extract facilities from OSM and assign them custom activity labels based on their OSM tags.

1. Assigning primary activities to zones

We assume (for now) that all primary activities are home-based (i.e. they originate from home). This assumption holds for the majority of work and school trips, although a small number of trips are not home-based. The heatmap below shows that the share of home-based trips is 84% for work and 95% for education. For a fuller breakdown of trip origins and destinations, see the heatmaps in this notebook

image

1.1. Primary (Education)

Step 0: Assuming education type

The NTS does not tell us what type of education facility a person is travelling to. It only gives us an aggregated trip purpose (education). We do have the person's age, so we match people to an education type based on their age group.

```python
# NTS age-group code -> education facility type (age range in the comment)
age_group_mapping = {
    1: "education_kg",          # "0-4"
    2: "education_school",      # "5-10"
    3: "education_school",      # "11-16"
    4: "education_university",  # "17-20"
    5: "education_university",  # "21-29"
    6: "education_university",  # "30-39"
    7: "education_university",  # "40-49"
    8: "education_university",  # "50-59"
    9: "education_university",  # "60+"
}
```

Next steps

  • Add other options for people 17 and over (e.g. add college option and distribute people between college and university)

Step 1: Getting feasible zones

The first step is to get the feasible destinations for each activity. Each person has an estimated trip time, which we use to find all destinations reachable from the home location. The logic can be found in the src code
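The filtering step can be sketched as follows. This is a minimal illustration, not the project's implementation: the travel time lookup structure, the function name, and the 20% tolerance window around the reported time are all assumptions.

```python
def feasible_zones(travel_times, home_zone, mode, reported_min, tolerance=0.2):
    """Return destination zones whose travel time is close to the reported one.

    travel_times maps (origin, destination, mode) -> minutes (assumed layout).
    A zone is feasible if its travel time from home_zone lies within
    +/- tolerance of the reported trip time (the window is an assumption).
    """
    lower = reported_min * (1 - tolerance)
    upper = reported_min * (1 + tolerance)
    return sorted(
        dest
        for (orig, dest, m), minutes in travel_times.items()
        if orig == home_zone and m == mode and lower <= minutes <= upper
    )
```

A wider tolerance yields more candidate zones at the cost of a looser match to the reported travel time.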

Step 2: Choosing a zone

After identifying a set of feasible zones, we choose one zone probabilistically, weighting each feasible zone by the total floor area of its relevant education facilities. The logic can be found in the src code
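A minimal sketch of floor-area-weighted sampling using only the standard library. The input layout and the function name are hypothetical; the project code may handle zones without facilities differently.

```python
import random


def choose_zone(floor_area_by_zone, rng=None):
    """Pick one zone with probability proportional to its facility floor area.

    floor_area_by_zone maps zone id -> total floor area of relevant facilities
    (assumed layout). Zones with no floor area are never chosen. Returns None
    if no zone has a positive floor area.
    """
    rng = rng or random.Random()
    zones = [z for z, area in floor_area_by_zone.items() if area > 0]
    if not zones:
        return None
    weights = [floor_area_by_zone[z] for z in zones]
    return rng.choices(zones, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance makes the assignment reproducible across runs.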

1.2 Primary (Work)

The logic for assigning work locations has exactly the same Step 1 as Education. There are, however, some key differences:

  • The census gives us commuting matrices that describe the number of people who commute between each zone pair (at MSOA and OA level).
  • Destination choice depends on the spatial distribution of employment by industry. Someone who works in agriculture is unlikely to commute to the City of London, even if the City is one of their feasible zones.

Constraining to commuting matrices

A few papers mention this problem. A couple of them are:

We try two different approaches for constraining our work location choices to commuting matrices:

  • iterative probabilistic: code
  • optimization problem: code

The optimisation problem can be specified as follows:

image image

Integrating employment categories (TODO)

We have explored using Standard Industrial Classification (SIC) codes. People in our synthetic population are assigned SIC codes by matching them with the Time Use Survey. We are currently revising this matching process.

Next steps

2. Assigning Secondary Activities to Zones

Secondary activities are normally assigned after getting "anchor" primary activity locations. References

We use the space-time prism approach implemented in PAM to assign secondary activities to feasible zones. The relevant code is here.
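The idea behind a space-time prism can be illustrated with a small standalone sketch: a zone is feasible for a secondary activity if travelling from the previous anchor to the zone and then on to the next anchor fits within the available time budget. This does not use or reproduce PAM's API; the function name and the travel time layout are assumptions.

```python
def prism_zones(travel_times, anchor_from, anchor_to, time_budget_min):
    """Return zones reachable between two anchors within a time budget.

    travel_times maps (origin, destination) -> minutes (assumed layout).
    A zone z is feasible if t(anchor_from, z) + t(z, anchor_to) fits in the
    budget. Zones with a missing leg are skipped.
    """
    candidates = {dest for (orig, dest) in travel_times if orig == anchor_from}
    feasible = []
    for zone in candidates:
        first_leg = travel_times.get((anchor_from, zone))
        second_leg = travel_times.get((zone, anchor_to))
        if first_leg is None or second_leg is None:
            continue
        if first_leg + second_leg <= time_budget_min:
            feasible.append(zone)
    return sorted(feasible)
```

In practice the budget would also have to account for the duration of the secondary activity itself, which this sketch leaves out.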

Next steps

3. Selecting a location (facility) for all Primary & Secondary activities

After assigning all activities to zones, we select a specific facility inside each zone for each activity. The facility should match the activity purpose.

For example, for education:

  • We sample from education facilities in the chosen zone probabilistically. This can be probabilistic sampling based on floor area. See here
  • Our candidate facilities are the education facilities that match our criteria (logic)

The script to do this is here

4. Validation

4.1. Self-consistency checks

Similar to the previous step on activity scheduling, we carry out self-consistency checks to ensure that the modelled distributions match those of the NTS. We do this for:

  • trip distance distribution by activity purpose
  • OD distribution of assigned commuting flows vs actual commuting flows (using RMSE). This checks how well our optimisation problem has distributed the flows.

4.2. Validating against other external datasets (TODO)

  • The Connected Places Catapult mobile-phone dataset
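The RMSE comparison of assigned and observed commuting flows listed under the self-consistency checks can be sketched as below. The dictionary layout and the function name are assumptions; OD pairs missing from one side are treated as zero flow.

```python
import math


def od_rmse(assigned, observed):
    """Root mean squared error between two OD flow tables.

    Both arguments map (origin, destination) -> number of commuters
    (assumed layout). Pairs present in only one table count as zero
    in the other.
    """
    pairs = set(assigned) | set(observed)
    if not pairs:
        raise ValueError("no OD pairs to compare")
    squared_error = sum(
        (assigned.get(pair, 0) - observed.get(pair, 0)) ** 2 for pair in pairs
    )
    return math.sqrt(squared_error / len(pairs))
```

A value of 0 means the assigned flows reproduce the observed matrix exactly; larger values indicate a poorer fit.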