Summary
Performance and memory usage issues in the Python 3.14 release compared to the 3.13 one:
- Scanning speed is down about 40% (14.03 vs. 23.23 files/sec).
- Peak memory usage for a scan using 21 processes increases from 3.1 GB to 25 GB.
Description
We have discovered performance and memory usage issues in the Python 3.14 release of v32.5.0 that prevent us from upgrading to that version. Most of these issues are not present in the Python 3.13 release of v32.5.0 or in v32.4.1.
The way we run ScanCode, in a containerized environment, seems to exacerbate the problem. A scan that took 30 seconds in v32.4.1 now takes 4 minutes. Passive memory usage is about 50% higher, and memory usage during maximum load is around 250% higher. Some of these issues may be attributable to our environment, but we see similar problems when running ScanCode directly from the CLI, outside a container.
Running ScanCode directly on my laptop gives the following results when scanning the async 3.2.6 npm package:
ScanCode v32.5.0_py3.14
$ ./scancode -l --json output.json ./package
Setup plugins...
Collect file inventory...
Scan files for: licenses with 21 process(es)...
[####################] 274
Scanning done.
Summary: licenses with 21 process(es)
Errors count: 0
Scan Speed: 14.03 files/sec.
Initial counts: 140 resource(s): 137 file(s) and 3 directorie(s)
Final counts: 140 resource(s): 137 file(s) and 3 directorie(s)
Timings:
scan_start: 2026-01-28T131349.526492
scan_end: 2026-01-28T131400.641108
setup_scan:licenses: 1.33s
setup: 1.33s
scan: 9.76s
total: 11.12s
Removing temporary files...done.
Memory usage during the scan peaks at 25 GB (!), and it seems to scale linearly with the number of processes.
ScanCode v32.5.0_py3.13
$ ./scancode -l --json output.json ./package
Setup plugins...
Collect file inventory...
Scan files for: licenses with 21 process(es)...
[####################] 274
Scanning done.
Summary: licenses with 21 process(es)
Errors count: 0
Scan Speed: 23.23 files/sec.
Initial counts: 140 resource(s): 137 file(s) and 3 directorie(s)
Final counts: 140 resource(s): 137 file(s) and 3 directorie(s)
Timings:
scan_start: 2026-01-28T131911.340798
scan_end: 2026-01-28T131918.676450
setup_scan:licenses: 1.42s
setup: 1.42s
scan: 5.90s
total: 7.37s
Removing temporary files...done.
Memory usage peaks at 3.1 GB. While it is lower when running with only one process, the difference is nowhere near as large as in v32.5.0_py3.14.
ScanCode v32.4.1_py3.13
$ ./scancode -l --json output.json ./package
Setup plugins...
Collect file inventory...
Scan files for: licenses with 21 process(es)...
[####################] 274
Scanning done.
Summary: licenses with 21 process(es)
Errors count: 0
Scan Speed: 23.04 files/sec.
Initial counts: 140 resource(s): 137 file(s) and 3 directorie(s)
Final counts: 140 resource(s): 137 file(s) and 3 directorie(s)
Timings:
scan_start: 2026-01-28T135229.947719
scan_end: 2026-01-28T135237.361817
setup_scan:licenses: 1.45s
setup: 1.45s
scan: 5.95s
total: 7.45s
Removing temporary files...done.
Memory usage peaks at 3.1 GB, same as with v32.5.0_py3.13.
Copyright scans are a bit slower as well, but they don't seem to have the same memory issues.
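To make the comparison between the three runs above easier to check, here is a small sketch that recomputes the relative differences from the reported figures. The numbers are copied from the output above; the `runs` dictionary and the `relative_change` and `compare` helpers are illustrative names only, not part of ScanCode.

```python
# Figures copied from the three runs reported above (21 processes each).
runs = {
    "v32.5.0_py3.14": {"files_per_sec": 14.03, "scan_s": 9.76, "peak_gb": 25.0},
    "v32.5.0_py3.13": {"files_per_sec": 23.23, "scan_s": 5.90, "peak_gb": 3.1},
    "v32.4.1_py3.13": {"files_per_sec": 23.04, "scan_s": 5.95, "peak_gb": 3.1},
}


def relative_change(new, old):
    """Return the change from old to new as a signed percentage."""
    return (new - old) / old * 100.0


def compare(baseline, candidate):
    """Compare throughput, scan time, and peak memory of two runs."""
    b, c = runs[baseline], runs[candidate]
    return {
        "speed_change_pct": relative_change(c["files_per_sec"], b["files_per_sec"]),
        "scan_time_change_pct": relative_change(c["scan_s"], b["scan_s"]),
        "memory_ratio": c["peak_gb"] / b["peak_gb"],
    }


if __name__ == "__main__":
    for baseline in ("v32.5.0_py3.13", "v32.4.1_py3.13"):
        result = compare(baseline, "v32.5.0_py3.14")
        print(
            f"py3.14 vs {baseline}: "
            f"speed {result['speed_change_pct']:+.1f}%, "
            f"scan time {result['scan_time_change_pct']:+.1f}%, "
            f"peak memory x{result['memory_ratio']:.1f}"
        )
```

Against v32.5.0_py3.13 this works out to roughly 40% lower throughput, a scan phase that takes about 65% longer, and about 8 times the peak memory.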
How To Reproduce
curl https://registry.npmjs.org/async/-/async-3.2.6.tgz | tar -xz
scancode -l --json output.json ./package
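The steps above reproduce the slowdown, but they do not show how to capture peak memory. One possible way to measure it on Linux is sketched below; it is not necessarily how the figures in this report were obtained. The script starts the scan in its own session and periodically sums the resident set size (RSS) of every process in that session by reading `/proc/<pid>/stat`. The helper names (`parse_stat`, `session_rss_bytes`, `peak_rss_of_command`) are made up for this example. Note that summed RSS counts pages shared between worker processes more than once, so it is an upper bound; the PSS value in `/proc/<pid>/smaps_rollup` would be more precise.

```python
import os
import subprocess
import sys
import time

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def parse_stat(line):
    """Return (session_id, rss_pages) from the content of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so cut at the last ")" before splitting.
    rest = line.rsplit(")", 1)[1].split()
    # rest[0] is the state field; session is field 6 and rss is field 24
    # of the full line, which maps to indexes 3 and 21 here.
    return int(rest[3]), int(rest[21])


def session_rss_bytes(session_id):
    """Sum the RSS of all live processes that belong to session_id."""
    total = 0
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                sid, rss_pages = parse_stat(f.read())
        except (OSError, IndexError, ValueError):
            continue  # the process exited while we were reading it
        if sid == session_id:
            total += rss_pages * PAGE_SIZE
    return total


def peak_rss_of_command(cmd, interval=0.05):
    """Run cmd and return (exit_code, peak summed RSS in bytes)."""
    # start_new_session=True makes the child a session leader, so its
    # session ID equals its PID and its workers inherit that session.
    proc = subprocess.Popen(cmd, start_new_session=True)
    peak = 0
    while proc.poll() is None:
        peak = max(peak, session_rss_bytes(proc.pid))
        time.sleep(interval)
    return proc.returncode, peak


if __name__ == "__main__":
    if len(sys.argv) > 1:
        code, peak = peak_rss_of_command(sys.argv[1:])
        print(f"exit code {code}, peak summed RSS {peak / 2**30:.2f} GiB")
    else:
        print("usage: python peak_rss.py COMMAND [ARGS...]")
```

Saved as, say, `peak_rss.py` (a hypothetical file name), it could be run as `python peak_rss.py ./scancode -l --json output.json ./package`. Repeating the run with different values for ScanCode's `-n`/`--processes` option would show whether peak memory grows with the number of processes.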
System configuration
- Ubuntu 24.04
- ScanCode downloaded from the release page (https://github.com/aboutcode-org/scancode-toolkit/releases/download/v32.5.0/scancode-toolkit-v32.5.0_py3.14-linux.tar.gz)