Performance degradation and memory usage in 32.5.0_py3.14 #4709

@arban

Summary

Performance and memory usage issues in the Python 3.14 release compared to the 3.13 one:

  • Scanning speed is down about 40%.
  • Memory usage for a scan using 21 processes increases from 3.1 GB to 25 GB.
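
For reference, these two figures can be re-derived from the CLI runs reported in this issue (14.03 vs. 23.23 files/sec, and 25 GB vs. 3.1 GB peak memory with 21 processes). This is only a minimal sketch: the helper name `relative_change` is made up for illustration, and the per-process figure assumes memory is split evenly across the workers, which has not been verified.

```python
def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Figures reported in this issue (license scan, 21 processes).
speed_py313, speed_py314 = 23.23, 14.03  # files/sec
mem_py313_gb, mem_py314_gb = 3.1, 25.0   # peak memory in GB
processes = 21

speed_change = relative_change(speed_py313, speed_py314)
mem_ratio = mem_py314_gb / mem_py313_gb
# Assumption: peak memory divides evenly across the worker processes.
mem_per_process_gb = mem_py314_gb / processes

print(f"scan speed change: {speed_change:.1f}%")
print(f"peak memory ratio: {mem_ratio:.1f}x")
print(f"approx. memory per process on py3.14: {mem_per_process_gb:.2f} GB")
```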

Description

We have discovered performance and memory usage issues in the Python 3.14 release of v32.5.0 that prevent us from upgrading to that version. Most of these issues are not present in the Python 3.13 release or in v32.4.1.

Running ScanCode in a containerized environment, as we do, seems to exacerbate the problem. A scan that took 30 seconds in 32.4.1 now takes 4 minutes. Passive memory usage is about 50% higher, and memory usage during maximum load is around 250% higher. Some of these issues may be attributable to our environment, but we're seeing similar problems when running ScanCode non-containerized, using the CLI.

Running ScanCode directly on my laptop gives the following results when scanning the async 3.2.6 npm package:

ScanCode v32.5.0_py3.14

$ ./scancode -l --json output.json ./package
Setup plugins...
Collect file inventory...
Scan files for: licenses with 21 process(es)...
[####################] 274                             
Scanning done.
Summary:        licenses with 21 process(es)
Errors count:   0
Scan Speed:     14.03 files/sec. 
Initial counts: 140 resource(s): 137 file(s) and 3 directorie(s) 
Final counts:   140 resource(s): 137 file(s) and 3 directorie(s) 
Timings:
  scan_start: 2026-01-28T131349.526492
  scan_end:   2026-01-28T131400.641108
  setup_scan:licenses: 1.33s
  setup: 1.33s
  scan: 9.76s
  total: 11.12s
Removing temporary files...done.

Memory usage during the scan peaks at 25 GB (!) and seems to scale linearly with the number of processes.

ScanCode v32.5.0_py3.13

$ ./scancode -l --json output.json ./package
Setup plugins...
Collect file inventory...
Scan files for: licenses with 21 process(es)...
[####################] 274                             
Scanning done.
Summary:        licenses with 21 process(es)
Errors count:   0
Scan Speed:     23.23 files/sec. 
Initial counts: 140 resource(s): 137 file(s) and 3 directorie(s) 
Final counts:   140 resource(s): 137 file(s) and 3 directorie(s) 
Timings:
  scan_start: 2026-01-28T131911.340798
  scan_end:   2026-01-28T131918.676450
  setup_scan:licenses: 1.42s
  setup: 1.42s
  scan: 5.90s
  total: 7.37s
Removing temporary files...done.

Memory usage peaks at 3.1 GB. While it is lower when running with only one process, the difference is nowhere near the one seen in v32.5.0_py3.14.

ScanCode v32.4.1_py3.13

$ ./scancode -l --json output.json ./package
Setup plugins...
Collect file inventory...
Scan files for: licenses with 21 process(es)...
[####################] 274                             
Scanning done.
Summary:        licenses with 21 process(es)
Errors count:   0
Scan Speed:     23.04 files/sec. 
Initial counts: 140 resource(s): 137 file(s) and 3 directorie(s) 
Final counts:   140 resource(s): 137 file(s) and 3 directorie(s) 
Timings:
  scan_start: 2026-01-28T135229.947719
  scan_end:   2026-01-28T135237.361817
  setup_scan:licenses: 1.45s
  setup: 1.45s
  scan: 5.95s
  total: 7.45s
Removing temporary files...done.

Memory usage peaks at 3.1 GB, same as with v32.5.0_py3.13.
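
To compare runs like the three above without reading the logs by eye, the summary lines can be pulled out with a small parser. This is a rough sketch that assumes the output format shown in this issue; `parse_scan_summary` is a hypothetical helper, not part of ScanCode.

```python
import re


def parse_scan_summary(output: str) -> dict:
    """Extract scan speed, process count and timings from ScanCode CLI output."""
    result = {}
    match = re.search(r"Scan Speed:\s+([\d.]+) files/sec", output)
    if match:
        result["files_per_sec"] = float(match.group(1))
    match = re.search(r"with (\d+) process\(es\)", output)
    if match:
        result["processes"] = int(match.group(1))
    # Timing lines look like "  scan: 9.76s". Keys may themselves contain a
    # colon ("setup_scan:licenses"), and timestamp lines such as scan_start
    # are skipped because their value does not end in "s".
    for match in re.finditer(r"^\s*([\w:]+): ([\d.]+)s\s*$", output, re.MULTILINE):
        result[match.group(1)] = float(match.group(2))
    return result


# Excerpts copied from the py3.14 and py3.13 runs of v32.5.0 in this issue.
py314 = parse_scan_summary("""\
Scan files for: licenses with 21 process(es)...
Scan Speed:     14.03 files/sec.
Timings:
  scan_start: 2026-01-28T131349.526492
  setup_scan:licenses: 1.33s
  scan: 9.76s
  total: 11.12s
""")
py313 = parse_scan_summary("""\
Scan files for: licenses with 21 process(es)...
Scan Speed:     23.23 files/sec.
Timings:
  scan_start: 2026-01-28T131911.340798
  setup_scan:licenses: 1.42s
  scan: 5.90s
  total: 7.37s
""")

scan_time_ratio = py314["scan"] / py313["scan"]
print(f"scan phase takes {scan_time_ratio:.2f}x as long on py3.14")
```

Setup time is nearly identical across the runs, so the slowdown sits in the scan phase itself.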

Copyright scans are a bit slower as well, but they don't seem to have the same memory issues.

How To Reproduce

curl https://registry.npmjs.org/async/-/async-3.2.6.tgz | tar -xz
scancode -l --json output.json ./package

System configuration
