---
title: "Exploring jittering and routing options for converting origin-destination data into route networks: towards accurate estimates of movement at the street level"
bibliography: foss4g2022.bib
author: Robin Lovelace, Rosa Félix, Dustin Carlino
output: github_document
# output:
#   bookdown::pdf_document2:
#     template: ISPRStemplate.tex
#     keep_tex: true
editor_options:
  markdown:
    wrap: sentence
csl: ispr-from-harvard.csl
---
```{r, eval=FALSE, echo=FALSE}
unzip("ISPRSguidelines_authors_fullpaper_latex_2021_09_09.zip")
tinytex::pdflatex("ISPRSguidelines_authors_fullpaper.tex")
rmarkdown::render("README.Rmd")
file.rename("README.pdf", "foss4g-paper-jittering.pdf")
browseURL("foss4g-paper-jittering.pdf")
piggyback::pb_upload("foss4g-paper-jittering.pdf")
system("gh release upload v1 foss4g-paper-jittering.pdf --clobber")
system("gh release download 1")
rbbt::bbt_update_bib(path_rmd = "README.Rmd", path_bib = "foss4g2022.bib")
file.edit("README.tex")
```
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
echo = FALSE,
message = FALSE,
cache = TRUE,
warning = FALSE,
fig.align = "center"
# eval = FALSE
)
```
```{r, include=FALSE}
# devtools::install_github("itsleeds/od")
library(sf)
library(tmap)
library(tidyverse)
library(stplanr)
library(cyclestreets)
# rbbt::bbt_update_bib(path_rmd = "README.Rmd", path_bib = "foss4g2022.bib")
```
Note: this has been submitted to the academic track of FOSS4G. See https://osf.io/4yxj7/ for the preprint.
# Introduction
Origin-destination (OD) datasets provide information on aggregate travel patterns between zones and geographic entities.
They can be obtained from a wide range of sources, making them one of the most commonly used geographic inputs in applied transport planning [@alexander_validation_2015].
OD datasets are often 'implicitly geographic', containing identification codes of the geographic objects from which trips start and end.
OD data are provided in this way, without exact coordinates of origins and destinations, for good reasons: historically, limited computational resources constrained analysis options, meaning that data reduction (by converting thousands of travel survey responses into a more compact aggregate OD dataset) was important; and privacy considerations prevent the disclosure of exact trip start and end points [@boyce_forecasting_2015].
A common approach to converting OD datasets to geographic entities, for example represented using the simple features standard [@ogcopengeospatialconsortiuminc_opengis_2011] and saved in file formats such as GeoPackage and GeoJSON, is to represent each OD record as a straight line between zone centroids.
This approach to representing OD datasets on the map has been used since at least the 1950s [@boyce_forecasting_2015].
Despite the development of various methods to add value to OD datasets by sampling start and end points and 'connectors' within each zone [@lovelace_jittering_2022b], discussed below, centroid-based geographic representations of OD data are still dominant [@rae_spatial_2009; @tennekes_design_2021].
Before explaining the methods, it is worth defining terms:
- **Origins**: locations of trip departure, typically stored as ID codes linking to zones
- **Destinations**: trip destinations, also stored as ID codes linking to zones
- **Attributes**: the number of trips made between each 'OD pair' and additional attributes such as route distance between each OD pair
- **Jittering**: The combined process of 'splitting' OD pairs representing many trips into multiple 'sub OD' pairs (disaggregation) and assigning origins and destinations to multiple unique points within each zone
Beyond simply visualising aggregate travel patterns, centroid-based geographic desire lines are also used as the basis of many transport modelling processes.
The following steps can be used to convert OD datasets into route networks, in a process that can generate nationally scalable results [@morgan_travel_2020]:
- OD data converted into centroid-based geographic desire lines
- Calculation of routes for each desire line, with start and end points at zone centroids
- Aggregation of routes into route networks, with values on each segment representing the total amount of travel ('flow') on that part of the network, using functions such as `overline()` in the open source R package `stplanr` [@lovelace_stplanr_2018]
This approach is tried and tested:
the OD $\rightarrow$ desire line $\rightarrow$ route $\rightarrow$ route network processing pipeline forms the basis of the route network results in the Propensity to Cycle Tool, an open source and publicly available map-based web application for informing strategic cycle network investment, 'visioning' and prioritisation [@lovelace_propensity_2017; @goodman_scenarios_2019].
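
The OD $\rightarrow$ desire line $\rightarrow$ route $\rightarrow$ route network pipeline described above can be illustrated with a minimal base R sketch.
All zone names, coordinates, trip counts and segment IDs below are made up for illustration, and routes are represented as hypothetical vectors of segment IDs rather than geometries returned by a routing engine.

```r
# Toy inputs (all values are made up for illustration)
zones = data.frame(
  zone = c("A", "B", "C"),
  x = c(0, 4, 4), # centroid coordinates in arbitrary units
  y = c(0, 0, 3)
)
od = data.frame(
  origin = c("A", "A", "B"),
  destination = c("B", "C", "C"),
  trips = c(100, 40, 60)
)

# Step 1: centroid-based desire lines, made by attaching the origin and
# destination centroid coordinates to each OD pair
o = zones[match(od$origin, zones$zone), ]
d = zones[match(od$destination, zones$zone), ]
desire_lines = data.frame(od, x_o = o$x, y_o = o$y, x_d = d$x, y_d = d$y)
desire_lines$length = sqrt(
  (desire_lines$x_d - desire_lines$x_o)^2 +
    (desire_lines$y_d - desire_lines$y_o)^2
)

# Step 2: routes. A routing engine would return a geometry for each desire
# line; here each route is a hypothetical vector of network segment IDs
routes = list(
  c("s1", "s2"),       # A to B
  c("s1", "s2", "s3"), # A to C
  c("s3")              # B to C
)

# Step 3: route network, with the total flow summed on each segment
segment_flows = data.frame(
  segment = unlist(routes),
  trips = rep(od$trips, times = lengths(routes))
)
rnet = aggregate(trips ~ segment, data = segment_flows, FUN = sum)
rnet
```

In the code accompanying this paper, the equivalent steps are carried out on real geometries with `od::od_to_sf()`, `route()` and `overline()`.
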
However, the approach has some key limitations:
- Flows are concentrated on transport network segments leading to zone centroids, creating distortions in the results and preventing the simulation of the diffuse networks that are particularly important for walking and cycling
- The results are highly dependent on the size and shape of geographic zones used to define OD data
- The approach is inflexible, providing few options to people who want to use valuable OD datasets in different ways
To overcome these limitations, methods of 'jittering' OD data have been developed [@lovelace_jittering_2022b].
While the results from analysis of route networks generated from jittered OD data in that paper were promising, the input datasets were small and the technique was not evaluated with reference to ground truth data.
This raised the question "Are the jittered results measurably better when compared with counter datasets on the network?" [@lovelace_jittering_2022b].
This question was partially addressed during a presentation and subsequent proceedings published as part of the GISRUK conference [@lovelace_assessing_2022].
However, the input dataset used for that conference paper was small and overly focussed on Edinburgh.
Furthermore, only a single routing option was used, raising the question:
what is the relative importance of geographic OD data pre-processing (jittering) and routing options when preparing route networks to support strategic sustainable transport plans?
We set out to address this question in this paper.
## Software and reproducibility
In this paper we present results generated using the `odjitter` Rust crate.
We developed an interface to R in the `odjitter` R package (not on CRAN at the time of writing), which can form the basis of implementations in other languages that interface with the highly efficient Rust implementation.
The results presented in this paper are fully reproducible.
See the paper's GitHub repository at https://github.com/Robinlovelace/foss4g22/ for implementation details and to reproduce the results.
# Approach
## Jittering
Jittering is a comparatively simple approach to OD data preprocessing, compared with 'connector' based methods [@jafari_investigation_2015].
Given the required inputs (a disaggregation threshold, which is a single number greater than one, and sub-points from which origin and destination points are sampled), the jittering approach consists of the following steps for each OD pair:
1. Check whether the number of trips (for a given 'disaggregation key', e.g. 'walking') is greater than the disaggregation threshold.
2. If so, disaggregate the OD pair: divide it into as many pieces ('sub-OD pairs') as needed for the number of trips in each piece to be below the disaggregation threshold, with the trip counts divided equally between the sub-OD pairs.
3. For each sub-OD pair (or each original OD pair if no disaggregation took place), randomly sample origin and destination locations from the sub-points, which optionally have weights representing the relative probability of trips starting and ending there.
This approach has been implemented efficiently in the Rust crate `odjitter`, the source code of which can be found at <https://github.com/dabreegster/odjitter>.
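
The steps above can be sketched in base R for a single OD pair.
This is a minimal illustration and not the `odjitter` implementation: the function name `jitter_od_pair()` and its arguments are hypothetical, the number of sub-OD pairs is assumed to be the trip count divided by the threshold and rounded up, and sub-points are sampled with replacement for simplicity.

```r
# Hypothetical helper for illustration only (not part of odjitter)
jitter_od_pair = function(trips, origin_points, destination_points,
                          disaggregation_threshold,
                          origin_weights = NULL, destination_weights = NULL) {
  # Steps 1 and 2: split the OD pair only if it exceeds the threshold
  # (assumption: ceiling division decides the number of sub-OD pairs)
  n_sub = if (trips > disaggregation_threshold) {
    ceiling(trips / disaggregation_threshold)
  } else {
    1
  }
  # Step 3: sample one origin and one destination sub-point per sub-OD pair,
  # optionally weighted
  i_o = sample(nrow(origin_points), size = n_sub, replace = TRUE,
               prob = origin_weights)
  i_d = sample(nrow(destination_points), size = n_sub, replace = TRUE,
               prob = destination_weights)
  data.frame(
    trips = rep(trips / n_sub, n_sub), # trips divided equally
    x_o = origin_points$x[i_o], y_o = origin_points$y[i_o],
    x_d = destination_points$x[i_d], y_d = destination_points$y[i_d]
  )
}

# Usage with made-up sub-points in two zones
set.seed(1)
origin_points = data.frame(x = runif(10), y = runif(10))
destination_points = data.frame(x = runif(10) + 2, y = runif(10))
sub_od = jitter_od_pair(1200, origin_points, destination_points,
                        disaggregation_threshold = 500)
nrow(sub_od)      # 3 sub-OD pairs, i.e. ceiling(1200 / 500)
sum(sub_od$trips) # 1200, the total number of trips is preserved
```

Lower thresholds produce more sub-OD pairs and therefore more routes to calculate, which is the trade-off between network diffuseness and computational cost explored in this paper.
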
## Case study
Lisbon, Portugal, is a city with about half a million residents.
In 2018, when a mobility survey was carried out, only about 0.5% of trips were made by bicycle.
However, investment in cycling infrastructure, with the cycling network reaching 150 km in 2021, and the implementation of a dock-based bike-sharing system have had a major impact on cycling levels [@felix_build_2020a].
Cyclist counts were performed yearly from 2017 to 2021 at more than 65 locations in Lisbon during the morning and afternoon peak hours (8-10 am and 5-7 pm).
In 2021, these were carried out in October.
The 67 locations, shown in Figure \ref{lisbonmap}, were chosen to cover existing and planned cycling infrastructure, as well as places without cycling infrastructure that already had some cyclists.
```{r lisbon1, include=FALSE, cache=TRUE, fig.cap="Illustration of jittered (left) compared with unjittered (right) origin-destination data.", out.width="100%"}
od_all = readRDS(url("https://github.com/U-Shift/biclar/releases/download/0.0.1/TRIPSmode_freguesias.Rds"))
zones = readRDS(url("https://github.com/U-Shift/biclar/releases/download/0.0.1/FREGUESIASgeo.Rds"))
osm_data_region = readRDS(url("https://github.com/U-Shift/biclar/releases/download/0.0.1/osm_data_region.Rds"))
lisbon_limit = st_read("Lisboa_limite.gpkg") %>% st_transform(4326)
```
```{r include=FALSE, cache=TRUE}
## For Lisbon only
lisbon_zones = zones %>% filter(Concelho == "Lisboa")
od_lisbon = od_all %>%
filter(DICOFREor11 %in% lisbon_zones$Dicofre & DICOFREde11 %in% lisbon_zones$Dicofre)
od_lisbon_with_bikes = od_lisbon %>% filter(Bike > 0)
od_lisbon_sf = od::od_to_sf(od_lisbon_with_bikes, lisbon_zones) #desire lines
set.seed(42)
od_lisbon_jittered = odjitter::jitter( #jitter
od = od_lisbon_with_bikes,
zones = lisbon_zones,
subpoints = osm_data_region,
disaggregation_key = "Total",
disaggregation_threshold = round(max(od_lisbon_with_bikes$Total) + 1) ##30? 50? 100?
)
od_lisbon_jittered_500 = odjitter::jitter( #jitter
od = od_lisbon_with_bikes,
zones = lisbon_zones,
subpoints = osm_data_region,
disaggregation_key = "Total",
disaggregation_threshold = 500 ##30? 50? 100?
)
od_lisbon_jittered_200 = odjitter::jitter( #jitter
od = od_lisbon_with_bikes,
zones = lisbon_zones,
subpoints = osm_data_region,
disaggregation_key = "Total",
disaggregation_threshold = 200 ##30? 50? 100?
)
# nrow(od_lisbon_jittered) # 9042 (17784 with 100 disagreg_thr)
```
```{r}
counters = readxl::read_excel("DadosAbertos_IST_CML_ContagensCiclistas_20172021.xlsx", sheet = "Out2021")
counters_sf = counters %>%
filter(TurnoNor %in% c("M1", "M2", "T1", "T2")) %>%
group_by(Ponto) %>%
summarise(SumCiclistas = sum(SumCiclistas, na.rm = TRUE), lon = mean(lon), lat = mean(lat)) %>%
select(-Ponto) %>%
sf::st_as_sf(coords = c("lon", "lat"), crs = 4326)
```
```{r lisbonmap, fig.cap="\\label{lisbonmap}Cycling infrastructure in Lisbon as of October 2021 and location of cyclist counters.", fig.ncol=2, out.width="100%"}
bikelanes = readRDS("Ciclovias2021out.Rds")
bikelanes = bikelanes %>% filter(TIPOLOGIA == "Ciclovia segregada") #do not show the sharrow ones with traffic
plot(lisbon_limit$geom, border="grey60")
plot(bikelanes, lwd = 0.6, col = "darkgreen", add = TRUE)
plot(counters_sf, col = "red", add = TRUE)
```
<!-- \begin{figure*} -->
<!-- {\centering \includegraphics[width=0.6\linewidth]{README_files/figure-latex/lisbonmap-1} -->
<!-- } @spatstat-->
<!-- \caption{\label{lisbonmap}Cycling infrastructure in Lisbon as October 2021 and location of cyclists' counters.}\label{fig:lisbonmap} -->
<!-- \end{figure*} -->
## Methods
We use data from a mobility survey [@IMOB] at district level (Lisbon has 24 districts), including `r round(sum(od_lisbon_with_bikes$Bike))` daily bicycle trips, represented by 122 desire lines.
Cycling count data include `r as.integer(sum(counters_sf$SumCiclistas))` passages in total across the 67 locations (one trip may pass more than one location).
Routes were computed with three routing services for comparison: [_CycleStreets_](https://cyclestreets.net), which relies on the 2022 road network from OpenStreetMap; the [`r5` engine](https://ipeagit.github.io/r5r/) [@pereira_r5r_2021]; and the _Google Maps_ service.
Routes were calculated using reproducible code available in the GitHub repo associated with this paper thanks to the `stplanr`, `r5r` and `cyclestreets` R packages that provide interfaces to these routing engines.
Regarding the routing options, CycleStreets provides 3 cycling route profiles: "fastest", "balanced" and "quietest".
r5r uses the Level of Traffic Stress (LTS), ranging from 1 (lowest stress, most bicycle friendly) to 4 (highest stress, least bicycle friendly) [@mekuria2012low].
Google Maps does not provide such profile options for bicycle routing.
In this research we compared CycleStreets' "quietest" and "fastest" modes, and LTS 2 and 4 [@mekuria2012low; @desjardins_correlates_2022].
This was an iterative process, and not all options were tested due to the computational requirements.
We started by generating routes with CycleStreets for the 3 routing profiles and for three OD inputs: unjittered, jittered with no disaggregation, and jittered with a disaggregation threshold of 500 trips.
Then we compared the results with routes generated by r5r for 2 levels of traffic stress (2 and 4), and with routes generated by Google.
A further disaggregation threshold of 200 trips was also compared with the previous results, for routes generated with CycleStreets ("quietest" profile) and with r5r (LTS 2).
Results were then assessed.
Count data were compared with the resulting route networks (with bike trips from the mobility survey data assigned to each segment) by taking the value of the segment nearest to each counter and computing an R^2^ correlation fit.
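
The comparison between counts and modelled flows can be sketched in base R as follows.
The coordinates, counts and flows are made up, segments are simplified to their midpoints (the code accompanying this paper uses `sf::st_nearest_feature()` on line geometries instead), and R^2^ is assumed here to be the squared Pearson correlation, which equals the R^2^ of a simple linear regression.

```r
# Made-up counter locations and observed counts
counters = data.frame(
  x = c(1, 5, 9),
  y = c(1, 5, 1),
  observed = c(120, 40, 300)
)
# Made-up route network segments, simplified to midpoints, with modelled flows
segments = data.frame(
  x = c(1.2, 4.8, 9.1, 6),
  y = c(0.9, 5.3, 1.2, 8),
  modelled = c(100, 60, 280, 10)
)

# For each counter, find the index of the nearest segment midpoint
# (squared Euclidean distance is enough to identify the minimum)
nearest = sapply(seq_len(nrow(counters)), function(i) {
  which.min((segments$x - counters$x[i])^2 + (segments$y - counters$y[i])^2)
})
counters$modelled = segments$modelled[nearest]

# R^2 as the squared Pearson correlation between observed and modelled values
r_squared = cor(counters$observed, counters$modelled)^2
r_squared
```

Repeating this calculation for each combination of jittering parameters and routing options gives one R^2^ value per scenario, allowing the scenarios to be ranked by their fit to the counter data.
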
# Results
We generated route networks based on a range of different jittering parameters and routing options.
The results presented in this section not only report estimates of model-counter fit but also provide an indication of the type of networks generated, through route network maps.
Figures \ref{poltlisbon1}, \ref{poltlisbon2} and \ref{poltlisbon3} show the difference between desire lines generated with the centroid-based approach and with the jittering approach, for bike trips in Lisbon.
```{r jitteredoverview1, echo=FALSE, fig.cap="\\label{poltlisbon1}Trips represented with desire lines between the centroids of 24 areas. The red circles represent the counter locations.", fig.ncol=2, out.width="100%"}
plot(lisbon_limit$geom, border="grey60")
plot(od_lisbon_sf$geometry, lwd = 0.2, add = TRUE)
plot(counters_sf, col = "red", add = TRUE)
```
```{r jitteredoverview2, echo=FALSE, fig.cap="\\label{poltlisbon2}Trips represented with jittered desire lines, with no disaggregation.", fig.ncol=2, out.width="100%"}
plot(lisbon_limit$geom, border="grey60")
plot(od_lisbon_jittered$geometry, lwd = 0.2, add = TRUE)
plot(counters_sf, col = "red", add = TRUE)
```
```{r jitteredoverview3, echo=FALSE, fig.cap="\\label{poltlisbon3}Trips represented with jittered desire lines, with a disaggregation threshold of 500 trips.", fig.ncol=2, out.width="100%"}
plot(lisbon_limit$geom, border="grey60")
plot(od_lisbon_jittered_500$geometry, lwd = 0.1, add = TRUE)
plot(counters_sf, col = "red", add = TRUE)
```
```{r, eval=FALSE, echo=FALSE}
# Routing unjittered:
routes_unjittered_quietest = route(l = od_lisbon_sf , route_fun = journey, plan = "quietest")
write_rds(routes_unjittered_quietest, "routes_unjittered_quietest.Rds")
routes_unjittered_balanced = route(l = od_lisbon_sf , route_fun = journey, plan = "balanced")
write_rds(routes_unjittered_balanced, "routes_unjittered_balanced.Rds")
routes_unjittered_fastest = route(l = od_lisbon_sf , route_fun = journey, plan = "fastest")
write_rds(routes_unjittered_fastest, "routes_unjittered_fastest.Rds")
# Routing jittered (no disaggregation):
routes_jittered_quietest = route(l = od_lisbon_jittered , route_fun = journey, plan = "quietest")
write_rds(routes_jittered_quietest, "routes_jittered_quietest.Rds")
routes_jittered_balanced = route(l = od_lisbon_jittered , route_fun = journey, plan = "balanced")
write_rds(routes_jittered_balanced, "routes_jittered_balanced.Rds")
routes_jittered_fastest = route(l = od_lisbon_jittered , route_fun = journey, plan = "fastest")
write_rds(routes_jittered_fastest, "routes_jittered_fastest.Rds")
# Routing jittered 500:
routes_jittered_500_quietest = route(l = od_lisbon_jittered_500 , route_fun = journey, plan = "quietest")
write_rds(routes_jittered_500_quietest, "routes_jittered_500_quietest.Rds")
routes_jittered_500_balanced = route(l = od_lisbon_jittered_500 , route_fun = journey, plan = "balanced")
write_rds(routes_jittered_500_balanced, "routes_jittered_500_balanced.Rds")
routes_jittered_500_fastest = route(l = od_lisbon_jittered_500 , route_fun = journey, plan = "fastest")
write_rds(routes_jittered_500_fastest, "routes_jittered_500_fastest.Rds")
routes_jittered_500_google = route(l = od_lisbon_jittered_500 , route_fun = stplanr::route_google, mode = "bicycling")
write_rds(routes_jittered_500_google, "routes_jittered_500_google.Rds")
# Routing jittered 200:
routes_jittered_200_quietest = route(l = od_lisbon_jittered_200 , route_fun = journey, plan = "quietest")
write_rds(routes_jittered_200_quietest, "routes_jittered_200_quietest.Rds")
```
```{r eval=FALSE, include=FALSE}
#routes with r5r
options(java.parameters = '-Xmx8G') #memory max 8GB
options(java.home="C:/Program Files/Java/jdk-11.0.11/")
library(r5r)
library(stplanr)
r5r_lts = setup_r5(data_path = "r5r_paper/", overwrite = TRUE) #includes osm from june 2022
# jittered routes with r5r for LTS 1 to 4 (LTS 2 and 4 are the ones loaded later in the analysis)
od_lisbon_jittered_500_points = line2df(od_lisbon_jittered_500)
od_lisbon_jittered_500_OR = od_lisbon_jittered_500_points[,c(1,2,3)]
names(od_lisbon_jittered_500_OR) = c("id", "lon", "lat")
od_lisbon_jittered_500_DE = od_lisbon_jittered_500_points[,c(1,4,5)]
names(od_lisbon_jittered_500_DE) = c("id", "lon", "lat")
od_lisbon_jittered_500_r5r = od_lisbon_jittered_500
od_lisbon_jittered_500_r5r$id = 1:nrow(od_lisbon_jittered_500_r5r)
routes_jittered_500_lts1 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_jittered_500_OR,
destinations = od_lisbon_jittered_500_DE,
mode = "BICYCLE",
# mode_egress = "WALK",
# departure_datetime = Sys.time(),
# time_window = 1L,
# suboptimal_minutes = 0L,
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
# walk_speed = 3.6,
bike_speed = 12,
# max_rides = 3,
max_lts = 1, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_jittered_500_lts1 = routes_jittered_500_lts1 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_jittered_500_r5r %>% st_drop_geometry(), by="id")
routes_jittered_500_lts1 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_jittered_500_lts1)),
geometry = routes_jittered_500_lts1$geometry
)
write_rds(routes_jittered_500_lts1, "routes_jittered_500_lts1.Rds")
routes_jittered_500_lts2 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_jittered_500_OR,
destinations = od_lisbon_jittered_500_DE,
mode = "BICYCLE",
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
bike_speed = 12,
max_lts = 2, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_jittered_500_lts2 = routes_jittered_500_lts2 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_jittered_500_r5r %>% st_drop_geometry(), by="id")
routes_jittered_500_lts2 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_jittered_500_lts2)),
geometry = routes_jittered_500_lts2$geometry
)
write_rds(routes_jittered_500_lts2, "routes_jittered_500_lts2.Rds")
routes_jittered_500_lts3 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_jittered_500_OR,
destinations = od_lisbon_jittered_500_DE,
mode = "BICYCLE",
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
bike_speed = 12,
max_lts = 3, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_jittered_500_lts3 = routes_jittered_500_lts3 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_jittered_500_r5r %>% st_drop_geometry(), by="id")
routes_jittered_500_lts3 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_jittered_500_lts3)),
geometry = routes_jittered_500_lts3$geometry
)
write_rds(routes_jittered_500_lts3, "routes_jittered_500_lts3.Rds")
routes_jittered_500_lts4 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_jittered_500_OR,
destinations = od_lisbon_jittered_500_DE,
mode = "BICYCLE",
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
bike_speed = 12,
max_lts = 4, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_jittered_500_lts4 = routes_jittered_500_lts4 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_jittered_500_r5r %>% st_drop_geometry(), by="id")
routes_jittered_500_lts4 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_jittered_500_lts4)),
geometry = routes_jittered_500_lts4$geometry
)
write_rds(routes_jittered_500_lts4, "routes_jittered_500_lts4.Rds")
# unjittered routes with r5r, selection of LTS 2 and 4
od_lisbon_unjittered_points = line2df(od_lisbon_sf) # line2df() needs line geometries, so use the centroid-based desire lines
od_lisbon_unjittered_OR = od_lisbon_unjittered_points[,c(1,2,3)]
names(od_lisbon_unjittered_OR) = c("id", "lon", "lat")
od_lisbon_unjittered_DE = od_lisbon_unjittered_points[,c(1,4,5)]
names(od_lisbon_unjittered_DE) = c("id", "lon", "lat")
od_lisbon_unjittered_r5r = od_lisbon_with_bikes
od_lisbon_unjittered_r5r$id = 1:nrow(od_lisbon_unjittered_r5r)
routes_unjittered_lts2 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_unjittered_OR,
destinations = od_lisbon_unjittered_DE,
mode = "BICYCLE",
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
bike_speed = 12,
max_lts = 2, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_unjittered_lts2 = routes_unjittered_lts2 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_unjittered_r5r %>% st_drop_geometry(), by="id")
routes_unjittered_lts2 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_unjittered_lts2)),
geometry = routes_unjittered_lts2$geometry
)
write_rds(routes_unjittered_lts2, "routes_unjittered_lts2.Rds")
routes_unjittered_lts4 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_unjittered_OR,
destinations = od_lisbon_unjittered_DE,
mode = "BICYCLE",
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
bike_speed = 12,
max_lts = 4, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_unjittered_lts4 = routes_unjittered_lts4 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_unjittered_r5r %>% st_drop_geometry(), by="id")
routes_unjittered_lts4 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_unjittered_lts4)),
geometry = routes_unjittered_lts4$geometry
)
write_rds(routes_unjittered_lts4, "routes_unjittered_lts4.Rds")
#jittered routes with r5r for 200, selection of LTS 2
od_lisbon_jittered_200_points = line2df(od_lisbon_jittered_200)
od_lisbon_jittered_200_OR = od_lisbon_jittered_200_points[,c(1,2,3)]
names(od_lisbon_jittered_200_OR) = c("id", "lon", "lat")
od_lisbon_jittered_200_DE = od_lisbon_jittered_200_points[,c(1,4,5)]
names(od_lisbon_jittered_200_DE) = c("id", "lon", "lat")
od_lisbon_jittered_200_r5r = od_lisbon_jittered_200
od_lisbon_jittered_200_r5r$id = 1:nrow(od_lisbon_jittered_200_r5r)
routes_jittered_200_lts2 = detailed_itineraries(
r5r_lts,
origins = od_lisbon_jittered_200_OR,
destinations = od_lisbon_jittered_200_DE,
mode = "BICYCLE",
fare_structure = NULL,
max_fare = Inf,
max_walk_time = Inf,
max_bike_time = Inf,
max_trip_duration = 180L, #in minutes
bike_speed = 12,
max_lts = 2, #1 - quietest, 4 - hardcore
shortest_path = TRUE, #FALSE?
all_to_all = FALSE,
n_threads = Inf,
verbose = FALSE,
progress = TRUE,
drop_geometry = FALSE,
output_dir = NULL
)
routes_jittered_200_lts2 = routes_jittered_200_lts2 %>% mutate(id = as.integer(from_id)) %>%
select(id, total_duration, total_distance, route) %>%
left_join(od_lisbon_jittered_200_r5r %>% st_drop_geometry(), by="id")
routes_jittered_200_lts2 = sf::st_as_sf(
as.data.frame(sf::st_drop_geometry(routes_jittered_200_lts2)),
geometry = routes_jittered_200_lts2$geometry
)
write_rds(routes_jittered_200_lts2, "routes_jittered_200_lts2.Rds")
```
```{r}
routes_unjittered_quietest = readRDS("routes_unjittered_quietest.Rds")
routes_unjittered_balanced = readRDS("routes_unjittered_balanced.Rds")
routes_unjittered_fastest = readRDS("routes_unjittered_fastest.Rds")
routes_jittered_quietest = readRDS("routes_jittered_quietest.Rds")
routes_jittered_balanced = readRDS("routes_jittered_balanced.Rds")
routes_jittered_fastest = readRDS("routes_jittered_fastest.Rds")
routes_jittered_500_quietest = readRDS("routes_jittered_500_quietest.Rds")
routes_jittered_500_balanced = readRDS("routes_jittered_500_balanced.Rds")
routes_jittered_500_fastest = readRDS("routes_jittered_500_fastest.Rds")
routes_jittered_500_lts2 = readRDS("routes_jittered_500_lts2.Rds")
routes_jittered_500_lts4 = readRDS("routes_jittered_500_lts4.Rds")
routes_unjittered_lts2 = readRDS("routes_unjittered_lts2.Rds")
routes_unjittered_lts4 = readRDS("routes_unjittered_lts4.Rds")
routes_jittered_500_google = readRDS("routes_jittered_500_google.Rds")
routes_jittered_200_quietest = readRDS("routes_jittered_200_quietest.Rds")
routes_jittered_200_lts2 = readRDS("routes_jittered_200_lts2.Rds")
rnet_unjittered_quietest = overline(routes_unjittered_quietest, attrib = "Bike")
rnet_unjittered_balanced = overline(routes_unjittered_balanced, attrib = "Bike")
rnet_unjittered_fastest = overline(routes_unjittered_fastest, attrib = "Bike")
rnet_jittered_quietest = overline(routes_jittered_quietest, attrib = "Bike")
rnet_jittered_balanced = overline(routes_jittered_balanced, attrib = "Bike")
rnet_jittered_fastest = overline(routes_jittered_fastest, attrib = "Bike")
rnet_jittered_500_quietest = overline(routes_jittered_500_quietest, attrib = "Bike")
rnet_jittered_500_balanced = overline(routes_jittered_500_balanced, attrib = "Bike")
rnet_jittered_500_fastest = overline(routes_jittered_500_fastest, attrib = "Bike")
rnet_jittered_200_quietest = overline(routes_jittered_200_quietest, attrib = "Bike")
rnet_jittered_500_lts2 = overline(routes_jittered_500_lts2, attrib = "Bike")
rnet_jittered_500_lts4 = overline(routes_jittered_500_lts4, attrib = "Bike")
rnet_unjittered_lts2 = overline(routes_unjittered_lts2, attrib = "Bike")
rnet_unjittered_lts4 = overline(routes_unjittered_lts4, attrib = "Bike")
rnet_jittered_200_lts2 = overline(routes_jittered_200_lts2, attrib = "Bike")
rnet_jittered_500_google = overline(routes_jittered_500_google, attrib = "Bike")
```
```{r, echo=FALSE}
# rnet_quiet = readRDS(url("https://github.com/U-Shift/biclar/releases/download/0.0.1/rnet_enmac_region_quietest_top_20000.Rds"))
counters_sf_joined = st_join(counters_sf,
rnet_unjittered_quietest %>% rename(Bikes_unjittered_quietest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_unjittered_balanced %>% rename(Bikes_unjittered_balanced = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_unjittered_fastest %>% rename(Bikes_unjittered_fastest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_quietest %>% rename(Bikes_jittered_quietest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_balanced %>% rename(Bikes_jittered_balanced = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_fastest %>% rename(Bikes_jittered_fastest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_500_quietest %>% rename(Bikes_jittered_500_quietest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_500_balanced %>% rename(Bikes_jittered_500_balanced = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_500_fastest %>% rename(Bikes_jittered_500_fastest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_500_lts2 %>% rename(Bikes_jittered_500_lts2 = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_500_lts4 %>% rename(Bikes_jittered_500_lts4 = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_unjittered_lts2 %>% rename(Bikes_unjittered_lts2 = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_unjittered_lts4 %>% rename(Bikes_unjittered_lts4 = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_500_google %>% rename(Bikes_jittered_500_google = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_200_quietest %>% rename(Bikes_jittered_200_quietest = Bike),
join = sf::st_nearest_feature)
counters_sf_joined = st_join(counters_sf_joined,
rnet_jittered_200_lts2 %>% rename(Bikes_jittered_200_lts2 = Bike),
join = sf::st_nearest_feature)
# # head(counters_sf_joined)
# corrplot::corrplot(counters_sf_joined %>% sf::st_drop_geometry())
# counters_sf_joined %>%
# sf::st_drop_geometry() %>%
# plot()
```
Figures \ref{map1}, \ref{map2}, \ref{map3} and \ref{map4} show examples of route networks generated from unjittered OD pairs and from jittered OD pairs with a disaggregation level of 500 trips, for different routing providers, together with the counter locations.
```{r map1, echo=FALSE, message=FALSE, warning=FALSE, fig.ncol=2, out.width="100%", fig.cap="\\label{map1}Route network from unjittered desire lines, with routes from CycleStreets, in quietest routing option."}
library(tmap)
library(biclar)
# Map of the route network and counter locations; line width (lwd) is scaled by the Bike attribute
tm_shape(rnet_unjittered_quietest) +
tmap::tm_lines(
id = NULL,
lwd = "Bike",
scale = 15,
col = "Bike",
palette = cols4all::c4a(palette = "mako") #choose a darker one!
) +
tm_shape(counters_sf) + tm_bubbles(
size = "SumCiclistas",
alpha = 0,
border.col = "red",
col = NA,
border.lwd = 1.5
)
```
```{r map2, echo=FALSE, message=FALSE, warning=FALSE, fig.ncol=2, out.width="100%", fig.cap="\\label{map2}Route network from jittered desire lines with disaggregation of 500 trips, with routes from CycleStreets, in quietest routing option."}
tm_shape(rnet_jittered_500_quietest) +
tmap::tm_lines(
id = NULL,
lwd = "Bike",
scale = 15,
col = "Bike",
palette = cols4all::c4a(palette = "mako")
)+
tm_shape(counters_sf) + tm_bubbles(
size = "SumCiclistas",
alpha = 0,
border.col = "red",
col = NA,
border.lwd = 1.5
)
```
```{r map3, echo=FALSE, message=FALSE, warning=FALSE, fig.ncol=2, out.width="100%", fig.cap="\\label{map3}Route network from jittered desire lines with disaggregation of 500 trips, with routes from r5r, level of traffic stress 2 (quiet) routing option."}
tm_shape(rnet_jittered_500_lts2) +
tmap::tm_lines(
id = NULL,
lwd = "Bike",
scale = 15,
col = "Bike",
palette = cols4all::c4a(palette = "mako")
)+
tm_shape(counters_sf) + tm_bubbles(
size = "SumCiclistas",
alpha = 0,
border.col = "red",
col = NA,
border.lwd = 1.5
)
```
```{r map4, echo=FALSE, message=FALSE, warning=FALSE, fig.ncol=2, out.width="100%", fig.cap="\\label{map4}Route network from jittered desire lines with disaggregation of 500 trips, with routes from Google."}
tm_shape(rnet_jittered_500_google) +
tmap::tm_lines(
id = NULL,
lwd = "Bike",
scale = 15,
col = "Bike",
palette = cols4all::c4a(palette = "mako")
)+
tm_shape(counters_sf) + tm_bubbles(
size = "SumCiclistas",
alpha = 0,
border.col = "red",
col = NA,
border.lwd = 1.5
)
```
```{r map5, echo=FALSE, message=FALSE, warning=FALSE, fig.ncol=2, out.width="100%", fig.cap="\\label{map5}Route network from jittered desire lines with disaggregation of 200 trips, with routes from r5r, level of traffic stress 2 (quiet) routing option."}
tm_shape(rnet_jittered_200_lts2) +
tmap::tm_lines(
id = NULL,
lwd = "Bike",
scale = 15,
col = "Bike",
palette = cols4all::c4a(palette = "mako")
)+
tm_shape(counters_sf) + tm_bubbles(
size = "SumCiclistas",
alpha = 0,
border.col = "red",
col = NA,
border.lwd = 1.5
)
```
When comparing the route network from unjittered desire lines (Figure \ref{map1}) with those from jittered desire lines (Figures \ref{map2}, \ref{map3} and \ref{map4}), we find that the jittered route networks are more diffuse rather than concentrated in a few routes. For cycling and walking, this yields more realistic routes. Nevertheless, we are aware that the "quietest" and LTS2 (quieter than LTS4) routing options give more weight to existing cycling infrastructure, so the resulting route network can resemble the silhouette of the cycling network (see Figure \ref{lisbonmap}). In fact, cyclists tend to opt for cycling infrastructure when it is available, even if it compromises the directness of their trips [@Broach2012].
We also note that the "fastest" and LTS4 routing options fit the count data less well than the "quietest" and LTS2 options.
Regarding the different disaggregation levels, Figure \ref{map5} shows a route network built from jittering with a disaggregation level of 200 trips, which results in a more diffuse network.
Although useful for visualizing the complex and spatially diffuse reality of travel patterns, we found that the most valuable use of jittering is as a pre-processing stage before routing and route network generation.
Route networks generated from jittered desire lines are more diffuse, and potentially more realistic, than centroid-based desire lines.
We also found that the approach, implemented in Rust and with bindings to R and Python (in progress), is fast.
Benchmarks show that the approach can 'jitter' desire lines representing millions of trips in a major city in less than a minute on consumer hardware.
We also found that the results of jittering depend on the geographic input datasets representing start points and trip attractors, and the use of weights.
Table \ref{tableresults} shows the fit between counter data and the modeled route networks for different routing and jittering parameters. We can observe that jittered OD pairs with disaggregation provide a better fit.
```{r}
results = tibble::tribble(
~`Jittering`, ~`Routing`, ~`Nrow`, ~`Correlation (r)`,
"Unjittered", "quietest", nrow(od_lisbon_sf), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_unjittered_quietest),
"Unjittered", "balanced", nrow(od_lisbon_sf), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_unjittered_balanced),
"Unjittered", "fastest", nrow(od_lisbon_sf), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_unjittered_fastest),
"Unjittered", "LTS2", nrow(od_lisbon_sf), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_unjittered_lts2),
"Unjittered", "LTS4", nrow(od_lisbon_sf), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_unjittered_lts4),
"Jittered, no disaggregation", "quietest", nrow(od_lisbon_jittered), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_quietest),
"Jittered, no disaggregation", "balanced", nrow(od_lisbon_jittered), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_balanced),
"Jittered, no disaggregation", "fastest", nrow(od_lisbon_jittered), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_fastest),
"Jittered, 500 disaggregation", "quietest", nrow(od_lisbon_jittered_500), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_500_quietest),
"Jittered, 500 disaggregation", "balanced", nrow(od_lisbon_jittered_500), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_500_balanced),
"Jittered, 500 disaggregation", "fastest", nrow(od_lisbon_jittered_500), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_500_fastest),
"Jittered, 500 disaggregation", "LTS2", nrow(od_lisbon_jittered_500), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_500_lts2),
"Jittered, 500 disaggregation", "LTS4", nrow(od_lisbon_jittered_500), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_500_lts4),
"Jittered, 500 disaggregation", "Google", nrow(od_lisbon_jittered_500), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_500_google),
"Jittered, 200 disaggregation", "quietest", nrow(od_lisbon_jittered_200), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_200_quietest),
"Jittered, 200 disaggregation", "LTS2", nrow(od_lisbon_jittered_200), cor(counters_sf_joined$SumCiclistas, counters_sf_joined$Bikes_jittered_200_lts2),
)
knitr::kable(results, digits = 2, booktabs = TRUE, caption = "\\label{tableresults}Results showing counter/model fit for route networks generated from different routing and jittering parameters",
linesep = c("", "", "","", "\\addlinespace","","", "\\addlinespace","", "", "","","", "\\addlinespace"))
```
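
The fit values in Table \ref{tableresults} are computed with `cor()`, which returns Pearson's correlation coefficient; for a simple linear regression with one predictor, the coefficient of determination is the square of that value. The following minimal sketch illustrates the relationship using made-up numbers rather than the Lisbon data (`observed` and `modelled` are hypothetical vectors introduced only for this example):

```r
# Hypothetical observed counter totals and modelled flows (not the study data)
observed = c(120, 340, 90, 410, 260, 75)
modelled = c(100, 300, 130, 380, 200, 110)

# Pearson correlation, as returned by cor()
r = cor(observed, modelled)

# R-squared from a simple linear regression of observed on modelled values
r_squared = summary(lm(observed ~ modelled))$r.squared

# With a single predictor, R-squared equals the squared correlation
all.equal(r^2, r_squared)
```

Reporting either quantity is reasonable, as long as the label matches the calculation.
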
A higher jittering disaggregation level (200 trips) does not yield a better fit than a lower disaggregation level (500 trips). This might be explained by the routing profiles used in the routing engines and by the location of the cycling counters, most of which are on existing cycling infrastructure.
Although a more diffuse route network is expected for active transport modes, the available data and computed routes are usually closer to where cycling infrastructure exists. Other data should be used to validate this hypothesis, such as counts from more widely distributed counter locations and/or cyclists' actual routes, for example bike sharing trip routes, although access to these is not usually guaranteed for research purposes.
The results from our analysis suggest that investment in cycle infrastructure is particularly important in a few key locations where cycling potential is high yet provision is poor.
These locations are highlighted in Figure \@ref(fig:segments), which was generated using information from three key sources:
- Estimates of cycling potential, generated using the jittering $\rightarrow$ routing $\rightarrow$ route network methods presented in this paper.
- Estimates of quietness of links on the network, computed with the open source cyclestreets R package [@desjardins_correlates_2022].
- Local knowledge, which was used to visually inspect the resulting networks and identify key 'severance' points in the network [@mindell_chapter_2020].
```{r segments, fig.cap="Segments on the transport network of Lisbon where investment in new cycling infrastructure should be prioritised according to the route networks generated using methods presented in this paper, alongside local knowledge.", out.width="100%"}
knitr::include_graphics("figures/priority-segments.jpeg")
```
Figure \@ref(fig:segments) highlights the policy-relevant nature of this research.
A key finding is that, combined with local knowledge and detailed data on existing transport infrastructure, which can be used to generate metrics such as Level of Traffic Stress (LTS) [@wang_does_2016] and Cycling Level of Service (CLoS) [@deegan_cycling_2015], route networks generated from jittered, disaggregated, and appropriately routed OD data can help prioritise investment where it is most needed.
Results were presented to stakeholders working in the local area who said that these new results would support their investment plans.
The overall finding was that OD jittering methods first developed by @lovelace_jittering_2022b are not enough on their own to generate accurate route networks.
Jittering leads to more spatially diffuse route networks than networks generated from the common approach of routing from and to zone centroids.
However, the results presented in this section show that careful consideration of routing options is needed in addition to evidence-based selection of jittering parameters.
# Conclusion
Building on previous work [@lovelace_jittering_2022b], we have explored the relative importance of jittering and routing options for generating accurate route network level estimates of movement, down to the street level.
Corroborating previous research, we found that jittering leads to more spatially diverse geographic representations of travel between zones and estimates of flow down to the link level [@lovelace_assessing_2022].
A new finding was that jittering alone cannot be guaranteed to generate accurate route network level results: appropriate routing options should be tested and identified.
The results were generated only for a single city and we did not explore the full parameter space (alternative subpoint weighting parameters in the jittering process are discussed below).
For these reasons, we cannot draw specific and universally applicable conclusions about the optimal settings for accurate route network generation in other cities: it should be remembered that route networks and cycling preferences vary from city to city [@buehler_bikeway_2016].
However, although our findings were based on a single case study, Lisbon, Portugal, the findings have implications for future work using OD data to support evidence-based investment in sustainable transport infrastructure [e.g. @vybornova_automated_2022a].
The main conclusion is that both careful translation of OD data to geographic start and end locations and disaggregation and careful selection of routing options are needed *in combination* to ensure that route networks derived from OD data are diffuse and accurate.
Accurate route network representations of transport systems are needed to support investment in a variety of transport interventions [@morgan_travel_2020].
We have focused in this study on cycleway networks because a complete cycle network represents one of the most cost-effective ways to reduce car dependence and associated environmental, economic, social and health costs [@waldykowski_sustainable_2022].
Cycleway *networks*, rather than simply isolated routes or other geographically sparse interventions, are vital for successful active travel investment [@buehler_bikeway_2016].
Our results are therefore highly policy relevant, building on established methods of adding value to OD data to support sustainable transport planning [@lovelace_propensity_2017; @larsen_build_2013; @mohammed_origindestination_2022].
The research presented in this paper is not without limitations.
We did not explore the full range of jittering and routing options available due to time and computational resource constraints.
Specifically, varying the type and weights of origin and destination subpoints, as advocated in @lovelace_jittering_2022b, could lead to improved fit.
This would require filtering the subpoints used to include only certain types of nodes on the road network (all vertices on the road network were used as the basis for both origin subpoints and destination subpoints in this study, see [documentation](https://github.com/dabreegster/odjitter) in the `odjitter` Rust crate for details).
Future work could explore including only residential roads, or increasing the weight associated with residential roads, in the origin subpoints, for example.
Likewise, destination subpoints and associated weights could be altered to prioritise key trip attractors such as schools and commercial centres.
Another limitation is the simplistic measure of accuracy used in this study.
Accuracy was inferred from goodness-of-fit between aggregated flow values at 67 counter locations and modeled flow on the nearest segment of the network.
Future work could use alternative measures of fit such as root-mean-square error (RMSE) and more sophisticated ways of comparing observed counter values to modeled networked values, e.g. using inverse distance weighted measures associated with links in close proximity to each counter, with empirically derived bandwidths.
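
As a rough illustration of these alternatives, the sketch below computes RMSE and an inverse-distance-weighted modeled flow at a single counter. It is not part of the analysis reported here: the function names `rmse()` and `idw_flow()`, the 50 m `bandwidth`, the `power` parameter, and all numbers are hypothetical choices made only for this example.

```r
# Root-mean-square error between observed and modelled values
rmse = function(observed, modelled) {
  sqrt(mean((observed - modelled)^2))
}

# Inverse-distance-weighted modelled flow at one counter:
# only links within `bandwidth` metres contribute, and closer links weigh more
idw_flow = function(flows, distances, bandwidth = 50, power = 1) {
  near = distances <= bandwidth
  if (!any(near)) return(NA_real_)
  # Floor distances at 1 m to avoid division by zero
  w = 1 / pmax(distances[near], 1)^power
  sum(w * flows[near]) / sum(w)
}

# Hypothetical observed counts and modelled flows (not the study data)
observed = c(120, 340, 90)
modelled = c(100, 300, 130)
rmse(observed, modelled)

# Three links at 5, 20 and 80 m from a counter; the 80 m link is ignored
idw_flow(flows = c(200, 100, 900), distances = c(5, 20, 80))
```

In practice the bandwidth and distance decay would need to be derived empirically, as noted above, rather than fixed in advance.
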
More broadly, the quality of the underlying route network data is imperfect.
Efforts to improve the underlying OpenStreetMap data will continue to overcome this limitation, not just in Lisbon but worldwide [@barrington-leigh_world_2017].
This will improve the results over time because all routing engines used in this study, except for Google's routing service, use OSM data.
Furthermore, alternative data sources and methods could be used to generate more accurate road networks [e.g. @leninisha_water_2015].
Future work should seek to test a wider range of jittering parameters in multiple case study areas with larger ground truth datasets.
Other fit measures, such as GEH or SQV statistics, may also be used to compare count data with simulated traffic volumes.
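
For reference, the GEH statistic for a modeled volume $M$ and an observed count $C$ is $\sqrt{2(M - C)^2 / (M + C)}$, and it is conventionally applied to hourly volumes. A minimal sketch of how it could be computed per counter is shown below; the function name `geh()` and the input values are hypothetical and are not taken from the study data.

```r
# GEH statistic for modelled volume m and observed count c_obs
geh = function(m, c_obs) {
  sqrt(2 * (m - c_obs)^2 / (m + c_obs))
}

# Hypothetical modelled flows and observed counts (not the study data)
modelled = c(100, 300, 130)
observed = c(120, 340, 90)

# One GEH value per counter; lower values indicate closer agreement
geh(modelled, observed)
```

Because GEH depends on the absolute magnitude of the volumes, modeled flows would need to be scaled to the same time unit as the counts before this comparison is meaningful.
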
Despite these limitations, and the need for future academic work, the results are already useful.
Imperfect data-driven evidence is better than no systematic evidence, especially when practitioners are aware of the mechanisms underlying route network level estimates of travel behavior such as those presented in this paper.
A benefit of the approach is that it is based on open source software and reproducible code, allowing others to build on the methods [@lovelace_open_2020].
Indeed, a next step building directly on the research presented in this paper is to use the results to support strategic cycle network planning in Lisbon and the wider area.
In parallel to efforts to improve route network representations of transport systems, we therefore advocate the use of the approach presented in this paper, and related methods [e.g. @cooper_predictive_2018; @vybornova_automated_2022a], in support of more evidence-based investment in sustainable transport infrastructure at city, regional and national scales worldwide.
\section*{ACKNOWLEDGEMENTS}\label{ACKNOWLEDGEMENTS}
We thank Lisbon Municipal Government and Transport Infrastructure Ireland for funding this research.
# References