Commit 5568808

Add TomTom comparison (#6)
* Adding TomTom comparison, refactoring code to work with multiple providers
1 parent 1c78780 commit 5568808

8 files changed: +267 -83 lines changed

README.md

Lines changed: 20 additions & 11 deletions
@@ -1,12 +1,13 @@
 # TravelTime/Google comparison tool

-This tool compares the travel times obtained from [TravelTime Routes API](https://docs.traveltime.com/api/reference/routes)
-and [Google Maps Directions API](https://developers.google.com/maps/documentation/directions/get-directions).
+This tool compares the travel times obtained from [TravelTime Routes API](https://docs.traveltime.com/api/reference/routes),
+[Google Maps Directions API](https://developers.google.com/maps/documentation/directions/get-directions),
+and [TomTom Routing API](https://developer.tomtom.com/routing-api/documentation/tomtom-maps/routing-service).
 Source code is available on [GitHub](https://github.com/traveltime-dev/traveltime-google-comparison).

 ## Features

-- Get travel times from TravelTime API and Google Maps API in parallel, for provided origin/destination pairs and a set
+- Get travel times from TravelTime API, Google Maps API and TomTom API in parallel, for provided origin/destination pairs and a set
 of departure times.
 - Departure times are calculated based on user provided start time, end time and interval.
 - Analyze the differences between the results and print out the average error percentage.
@@ -40,6 +41,12 @@ For Google Maps API:
 export GOOGLE_API_KEY=[Your Google Maps API Key]
 ```

+For TomTom API:
+
+```bash
+export TOMTOM_API_KEY=[Your TomTom API Key]
+```
+
 For TravelTime API:
 ```bash
 export TRAVELTIME_APP_ID=[Your TravelTime App ID]
@@ -76,7 +83,9 @@ Required arguments:


 Optional arguments:
-- `--google-max-rpm [int]`: Set max number of parallel requests sent to Google API per minute. Default is 60.
+- `--google-max-rpm [int]`: Set max number of parallel requests sent to Google API per minute. Default is 60.
+It is enforced on per-second basis, to avoid bursts.
+- `--tomtom-max-rpm [int]`: Set max number of parallel requests sent to TomTom API per minute. Default is 60.
 It is enforced on per-second basis, to avoid bursts.
 - `--traveltime-max-rpm [int]`: Set max number of parallel requests sent to TravelTime API per minute. Default is 60.
 It is enforced on per-second basis, to avoid bursts.
@@ -106,13 +115,13 @@ The output file will contain the `origin` and `destination` columns from input f

 ### Sample output
 ```csv
-origin,destination,departure_time,google_travel_time,tt_travel_time,error_percentage
-"52.1849867903527, 0.1809343829904072","52.202817030086266, 0.10935651695330152",2024-05-28 06:00:00+0100,718.0,1050.0,46
-"52.1849867903527, 0.1809343829904072","52.202817030086266, 0.10935651695330152",2024-05-28 09:00:00+0100,1427.0,1262.0,11
-"52.1849867903527, 0.1809343829904072","52.202817030086266, 0.10935651695330152",2024-05-28 12:00:00+0100,1064.0,1165.0,9
-"52.1849867903527, 0.1809343829904072","52.202817030086266, 0.10935651695330152",2024-05-28 15:00:00+0100,1240.0,1287.0,3
-"52.1849867903527, 0.1809343829904072","52.202817030086266, 0.10935651695330152",2024-05-28 18:00:00+0100,1312.0,1223.0,6
-"52.18553917820687, 0.12702050752253252","52.22715259892737, 0.14811674226050345",2024-05-28 06:00:00+0100,749.0,903.0,20
+origin,destination,departure_time,google_travel_time,tomtom_travel_time,tt_travel_time,error_percentage_google,error_percentage_tomtom
+"50.077012199999984, -5.2234787","50.184134100000726, -5.593753699999999",2024-09-20 07:00:00+0100,2276.0,2388.0,2071.0,9,13
+"50.077012199999984, -5.2234787","50.184134100000726, -5.593753699999999",2024-09-20 10:00:00+0100,2702.0,2578.0,2015.0,25,21
+"50.077012199999984, -5.2234787","50.184134100000726, -5.593753699999999",2024-09-20 13:00:00+0100,2622.0,2585.0,2015.0,23,22
+"50.077012199999984, -5.2234787","50.184134100000726, -5.593753699999999",2024-09-20 16:00:00+0100,2607.0,2596.0,2130.0,18,17
+"50.077012199999984, -5.2234787","50.184134100000726, -5.593753699999999",2024-09-20 19:00:00+0100,2398.0,2431.0,1960.0,18,19
+"50.09814150000003, -5.2586104000000065","50.2165765000003, -5.4758540000000036",2024-09-20 07:00:00+0100,2175.0,2357.0,1861.0,14,21
 ```

 ## License
Lines changed: 79 additions & 39 deletions
@@ -1,12 +1,22 @@
 import logging
 from dataclasses import dataclass
+from typing import List

 from pandas import DataFrame

-from traveltime_google_comparison.collect import Fields, GOOGLE_API, TRAVELTIME_API
+from traveltime_google_comparison.collect import (
+    Fields,
+    TRAVELTIME_API,
+    get_capitalized_provider_name,
+)

-ABSOLUTE_ERROR = "absolute_error"
-RELATIVE_ERROR = "error_percentage"
+
+def absolute_error(api_provider: str) -> str:
+    return f"absolute_error_{api_provider}"
+
+
+def relative_error(api_provider: str) -> str:
+    return f"error_percentage_{api_provider}"


 @dataclass
@@ -15,54 +25,84 @@ class QuantileErrorResult:
     relative_error: int


-def run_analysis(results: DataFrame, output_file: str, quantile: float):
-    results_with_differences = calculate_differences(results)
-    logging.info(
-        f"Mean relative error: {results_with_differences[RELATIVE_ERROR].mean():.2f}%"
-    )
-    quantile_errors = calculate_quantiles(results_with_differences, quantile)
-    logging.info(
-        f"{int(quantile * 100)}% of TravelTime results differ from Google API "
-        f"by less than {int(quantile_errors.relative_error)}%"
-    )
+def log_results(
+    results_with_differences: DataFrame, quantile: float, api_providers: List[str]
+):
+    for provider in api_providers:
+        capitalized_provider = get_capitalized_provider_name(provider)
+        logging.info(
+            f"Mean relative error compared to {capitalized_provider} "
+            f"API: {results_with_differences[relative_error(provider)].mean():.2f}%"
+        )
+        quantile_errors = calculate_quantiles(
+            results_with_differences, quantile, provider
+        )
+        logging.info(
+            f"{int(quantile * 100)}% of TravelTime results differ from {capitalized_provider} API "
+            f"by less than {int(quantile_errors.relative_error)}%"
+        )
+
+
+def format_results_for_csv(
+    results_with_differences: DataFrame, api_providers: List[str]
+) -> DataFrame:
+    formatted_results = results_with_differences.copy()
+
+    for provider in api_providers:
+        formatted_results = formatted_results.drop(columns=[absolute_error(provider)])
+        relative_error_col = relative_error(provider)
+        formatted_results[relative_error_col] = formatted_results[
+            relative_error_col
+        ].astype(int)
+
+    return formatted_results
+
+
+def run_analysis(
+    results: DataFrame, output_file: str, quantile: float, api_providers: List[str]
+):
+    results_with_differences = calculate_differences(results, api_providers)
+    log_results(results_with_differences, quantile, api_providers)

     logging.info(f"Detailed results can be found in {output_file} file")

-    results_with_differences = results_with_differences.drop(columns=[ABSOLUTE_ERROR])
-    results_with_differences[RELATIVE_ERROR] = results_with_differences[
-        RELATIVE_ERROR
-    ].astype(int)
+    formatted_results = format_results_for_csv(results_with_differences, api_providers)

-    results_with_differences.to_csv(output_file, index=False)
+    formatted_results.to_csv(output_file, index=False)


-def calculate_differences(results: DataFrame) -> DataFrame:
-    results_with_differences = results.assign(
-        **{
-            ABSOLUTE_ERROR: abs(
-                results[Fields.TRAVEL_TIME[GOOGLE_API]]
-                - results[Fields.TRAVEL_TIME[TRAVELTIME_API]]
-            )
-        }
-    )
+def calculate_differences(results: DataFrame, api_providers: List[str]) -> DataFrame:
+    results_with_differences = results.copy()
+
+    for provider in api_providers:
+        absolute_error_col = absolute_error(provider)
+        relative_error_col = relative_error(provider)
+
+        results_with_differences[absolute_error_col] = abs(
+            results[Fields.TRAVEL_TIME[provider]]
+            - results[Fields.TRAVEL_TIME[TRAVELTIME_API]]
+        )
+
+        results_with_differences[relative_error_col] = (
+            results_with_differences[absolute_error_col]
+            / results_with_differences[Fields.TRAVEL_TIME[provider]]
+            * 100
+        )

-    results_with_differences[RELATIVE_ERROR] = (
-        results_with_differences[ABSOLUTE_ERROR]
-        / results_with_differences[Fields.TRAVEL_TIME[GOOGLE_API]]
-        * 100
-    )
     return results_with_differences


 def calculate_quantiles(
-    results_with_differences: DataFrame, quantile: float
+    results_with_differences: DataFrame,
+    quantile: float,
+    api_provider: str,
 ) -> QuantileErrorResult:
-    quantile_absolute_error = results_with_differences[ABSOLUTE_ERROR].quantile(
-        quantile, "higher"
-    )
-    quantile_relative_error = results_with_differences[RELATIVE_ERROR].quantile(
-        quantile, "higher"
-    )
+    quantile_absolute_error = results_with_differences[
+        absolute_error(api_provider)
+    ].quantile(quantile, "higher")
+    quantile_relative_error = results_with_differences[
+        relative_error(api_provider)
+    ].quantile(quantile, "higher")
     return QuantileErrorResult(
         int(quantile_absolute_error), int(quantile_relative_error)
     )
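
To illustrate what the refactored `calculate_differences` and `calculate_quantiles` compute for each provider, here is a dependency-free sketch in plain Python. It is not the project's code, which operates on pandas DataFrames. The names `error_columns` and `higher_quantile` are hypothetical, and `higher_quantile` is only intended to approximate pandas' `quantile(q, "higher")` by rounding the position up to the next observation. The input values come from the first row of the README sample output in this commit.

```python
import math
from typing import Dict, List

TRAVELTIME = "traveltime"  # baseline that every other provider is compared against


def error_columns(row: Dict[str, float], providers: List[str]) -> Dict[str, float]:
    """Return the absolute and relative error of TravelTime against each provider."""
    out: Dict[str, float] = {}
    for provider in providers:
        abs_err = abs(row[provider] - row[TRAVELTIME])
        out[f"absolute_error_{provider}"] = abs_err
        # Relative error is a percentage of the compared provider's own travel time.
        out[f"error_percentage_{provider}"] = abs_err / row[provider] * 100
    return out


def higher_quantile(values: List[float], q: float) -> float:
    """Pick the q-quantile, rounding the fractional position up (assumed behaviour)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    return ordered[math.ceil(position)]


# Travel times in seconds, taken from the first sample row of the README.
row = {"google": 2276.0, "tomtom": 2388.0, "traveltime": 2071.0}
errors = error_columns(row, ["google", "tomtom"])
print(int(errors["error_percentage_google"]))  # 9
print(int(errors["error_percentage_tomtom"]))  # 13
```

The truncated percentages agree with the `error_percentage_google` and `error_percentage_tomtom` values shown for that row in the sample output.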

src/traveltime_google_comparison/collect.py

Lines changed: 25 additions & 3 deletions
@@ -14,15 +14,31 @@
 from traveltime_google_comparison.requests.base_handler import BaseRequestHandler

 GOOGLE_API = "google"
+TOMTOM_API = "tomtom"
 TRAVELTIME_API = "traveltime"


+def get_capitalized_provider_name(provider: str) -> str:
+    if provider == "google":
+        return "Google"
+    elif provider == "tomtom":
+        return "TomTom"
+    elif provider == "traveltime":
+        return "TravelTime"
+    else:
+        raise ValueError(f"Unsupported API provider: {provider}")
+
+
 @dataclass
 class Fields:
     ORIGIN = "origin"
     DESTINATION = "destination"
     DEPARTURE_TIME = "departure_time"
-    TRAVEL_TIME = {GOOGLE_API: "google_travel_time", TRAVELTIME_API: "tt_travel_time"}
+    TRAVEL_TIME = {
+        GOOGLE_API: "google_travel_time",
+        TOMTOM_API: "tomtom_travel_time",
+        TRAVELTIME_API: "tt_travel_time",
+    }


 logger = logging.getLogger(__name__)
@@ -100,7 +116,7 @@ def generate_tasks(


 async def collect_travel_times(
-    args, data, request_handlers: Dict[str, BaseRequestHandler]
+    args, data, request_handlers: Dict[str, BaseRequestHandler], providers: List[str]
 ) -> DataFrame:
     timezone = pytz.timezone(args.time_zone_id)
     localized_start_datetime = localize_datetime(args.date, args.start_time, timezone)
@@ -111,7 +127,12 @@ async def collect_travel_times(

     tasks = generate_tasks(data, time_instants, request_handlers, mode=Mode.DRIVING)

-    logger.info(f"Sending {len(tasks)} requests to Google and TravelTime APIs")
+    capitalized_providers_str = ", ".join(
+        [get_capitalized_provider_name(provider) for provider in providers]
+    )
+    logger.info(
+        f"Sending {len(tasks)} requests to {capitalized_providers_str} and TravelTime APIs"
+    )

     results = await asyncio.gather(*tasks)

@@ -121,6 +142,7 @@ async def collect_travel_times(
     ).agg(
         {
             Fields.TRAVEL_TIME[GOOGLE_API]: "first",
+            Fields.TRAVEL_TIME[TOMTOM_API]: "first",
             Fields.TRAVEL_TIME[TRAVELTIME_API]: "first",
         }
     )
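
The `if`/`elif` chain in `get_capitalized_provider_name` could also be expressed as a table lookup, which keeps the display names in one place as providers are added. The following is only a sketch of that alternative, not what the commit does. `DISPLAY_NAMES` and `describe_targets` are hypothetical names introduced here for illustration.

```python
from typing import Dict, List

# Hypothetical lookup table; the keys mirror the provider constants in collect.py.
DISPLAY_NAMES: Dict[str, str] = {
    "google": "Google",
    "tomtom": "TomTom",
    "traveltime": "TravelTime",
}


def get_capitalized_provider_name(provider: str) -> str:
    """Map an internal provider key to its display name."""
    try:
        return DISPLAY_NAMES[provider]
    except KeyError:
        # Same error type and message as the version in the commit.
        raise ValueError(f"Unsupported API provider: {provider}") from None


def describe_targets(task_count: int, providers: List[str]) -> str:
    """Build the log message emitted before the requests are sent."""
    names = ", ".join(get_capitalized_provider_name(p) for p in providers)
    return f"Sending {task_count} requests to {names} and TravelTime APIs"


print(describe_targets(10, ["google", "tomtom"]))
```

With this shape, supporting another provider would mean adding one dictionary entry rather than another branch.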

src/traveltime_google_comparison/config.py

Lines changed: 18 additions & 0 deletions
@@ -9,9 +9,11 @@
 )

 DEFAULT_GOOGLE_RPM = 60
+DEFAULT_TOMTOM_RPM = 60
 DEFAULT_TRAVELTIME_RPM = 60

 GOOGLE_API_KEY_VAR_NAME = "GOOGLE_API_KEY"
+TOMTOM_API_KEY_VAR_NAME = "TOMTOM_API_KEY"
 TRAVELTIME_APP_ID_VAR_NAME = "TRAVELTIME_APP_ID"
 TRAVELTIME_API_KEY_VAR_NAME = "TRAVELTIME_API_KEY"

@@ -48,13 +50,21 @@ def parse_args():
         default=DEFAULT_GOOGLE_RPM,
         help="Maximum number of requests sent to Google API per minute",
     )
+    parser.add_argument(
+        "--tomtom-max-rpm",
+        required=False,
+        type=int,
+        default=DEFAULT_TOMTOM_RPM,
+        help="Maximum number of requests sent to TomTom API per minute",
+    )
     parser.add_argument(
         "--traveltime-max-rpm",
         required=False,
         type=int,
         default=DEFAULT_TRAVELTIME_RPM,
         help="Maximum number of requests sent to TravelTime API per minute",
     )
+
     parser.add_argument(
         "--skip-data-gathering",
         action=argparse.BooleanOptionalAction,
@@ -74,6 +84,14 @@ def retrieve_google_api_key():
     return google_api_key


+def retrieve_tomtom_api_key():
+    tomtom_api_key = os.environ.get(TOMTOM_API_KEY_VAR_NAME)
+
+    if not tomtom_api_key:
+        raise ValueError(f"{TOMTOM_API_KEY_VAR_NAME} not set in environment variables.")
+    return tomtom_api_key
+
+
 def retrieve_traveltime_credentials() -> TravelTimeCredentials:
     app_id = os.environ.get(TRAVELTIME_APP_ID_VAR_NAME)
     api_key = os.environ.get(TRAVELTIME_API_KEY_VAR_NAME)
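
`retrieve_tomtom_api_key` reads one environment variable and raises if it is missing or empty. If further key-based providers were added, that logic could be shared through a single helper. A minimal sketch, assuming a hypothetical `retrieve_env_var` function that is not part of this commit:

```python
import os

TOMTOM_API_KEY_VAR_NAME = "TOMTOM_API_KEY"


def retrieve_env_var(var_name: str) -> str:
    """Return the value of an environment variable, or raise if unset or empty."""
    value = os.environ.get(var_name)
    if not value:
        raise ValueError(f"{var_name} not set in environment variables.")
    return value


def retrieve_tomtom_api_key() -> str:
    # Thin wrapper so callers keep the provider-specific function name.
    return retrieve_env_var(TOMTOM_API_KEY_VAR_NAME)
```

The error message format matches the one used in the commit, so callers that surface the exception text would see the same output.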

src/traveltime_google_comparison/main.py

Lines changed: 12 additions & 4 deletions
@@ -6,7 +6,12 @@
 from traveltime_google_comparison import collect
 from traveltime_google_comparison import config
 from traveltime_google_comparison.analysis import run_analysis
-from traveltime_google_comparison.collect import Fields, GOOGLE_API, TRAVELTIME_API
+from traveltime_google_comparison.collect import (
+    Fields,
+    GOOGLE_API,
+    TRAVELTIME_API,
+    TOMTOM_API,
+)
 from traveltime_google_comparison.requests import factory

 logging.basicConfig(
@@ -19,6 +24,7 @@


 async def run():
+    providers = [GOOGLE_API, TOMTOM_API]
     args = config.parse_args()
     csv = pd.read_csv(
         args.input, usecols=[Fields.ORIGIN, Fields.DESTINATION]
@@ -29,7 +35,7 @@ async def run():
         return

     request_handlers = factory.initialize_request_handlers(
-        args.google_max_rpm, args.traveltime_max_rpm
+        args.google_max_rpm, args.tomtom_max_rpm, args.traveltime_max_rpm
     )
     if args.skip_data_gathering:
         travel_times_df = pd.read_csv(
@@ -39,15 +45,17 @@ async def run():
                 Fields.DESTINATION,
                 Fields.DEPARTURE_TIME,
                 Fields.TRAVEL_TIME[GOOGLE_API],
+                Fields.TRAVEL_TIME[TOMTOM_API],
                 Fields.TRAVEL_TIME[TRAVELTIME_API],
             ],
         )
     else:
         travel_times_df = await collect.collect_travel_times(
-            args, csv, request_handlers
+            args, csv, request_handlers, providers
         )
     filtered_travel_times_df = travel_times_df.loc[
         travel_times_df[Fields.TRAVEL_TIME[GOOGLE_API]].notna()
+        & travel_times_df[Fields.TRAVEL_TIME[TOMTOM_API]].notna()
         & travel_times_df[Fields.TRAVEL_TIME[TRAVELTIME_API]].notna(),
         :,
     ]
@@ -62,7 +70,7 @@ async def run():
         logger.info(
             f"Skipped {skipped_rows} rows ({100 * skipped_rows / all_rows:.2f}%)"
         )
-    run_analysis(filtered_travel_times_df, args.output, 0.90)
+    run_analysis(filtered_travel_times_df, args.output, 0.90, providers)


 def main():
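
The row filter in `run()` keeps only the rows for which Google, TomTom and TravelTime all returned a travel time, and the skipped share is then logged. The same idea can be sketched without pandas. `keep_complete_rows` is a hypothetical helper, and the `None` in the second row is a made-up missing value used only to exercise the filter.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def keep_complete_rows(rows: List[Row], columns: List[str]) -> List[Row]:
    """Keep only rows where every listed travel-time column has a value."""
    return [row for row in rows if all(row.get(col) is not None for col in columns)]


columns = ["google_travel_time", "tomtom_travel_time", "tt_travel_time"]
rows: List[Row] = [
    {"google_travel_time": 2276.0, "tomtom_travel_time": 2388.0, "tt_travel_time": 2071.0},
    # Illustrative gap: pretend TomTom returned no result for this request.
    {"google_travel_time": 2702.0, "tomtom_travel_time": None, "tt_travel_time": 2015.0},
]

complete = keep_complete_rows(rows, columns)
skipped_rows = len(rows) - len(complete)
print(f"Skipped {skipped_rows} rows ({100 * skipped_rows / len(rows):.2f}%)")
```

Passing the column list as a parameter means the filter follows whatever set of providers is configured, instead of naming each provider column by hand.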
