Add comprehensive performance benchmark suite for Orso version comparison #56

Copilot · 2025-10-14T09:30:42Z

Overview

This PR adds a performance benchmark suite for comparing performance across versions of Orso. The suite focuses on the core operations of the Orso DataFrame:

- Arrow conversion (to and from PyArrow)
- Appending values to DataFrames
- Iterating over DataFrame rows
- Displaying DataFrames

Everything else is glue and sugar; these four operations determine Orso's performance characteristics.

What's Included

📊 Benchmark Suite (`tests/test_benchmark_suite.py`)

A comprehensive test suite with 17 individual performance tests organized into 4 categories:

- Arrow Conversion (5 tests): `from_arrow` (small/large), `to_arrow` (standard/decimal/wide tables)
- Append Operations (3 tests): dict append, append with schema, complete append+Arrow workflow
- Iteration (4 tests): materialized, lazy/generator, with column access, `fetchall`
- Display (5 tests): small/medium/wide tables, markdown output, lazy DataFrame display
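
The timing pattern behind tests like these can be sketched with the standard library alone. The snippet below is a minimal illustration, not the code in `tests/test_benchmark_suite.py`: `time_operation` and `build_rows` are hypothetical names, and the workload is a plain-Python stand-in rather than an Orso call.

```python
import time
from typing import Callable, Dict


def time_operation(func: Callable[[], object], rows: int, repeats: int = 5) -> Dict[str, float]:
    """Run func several times; report the best wall-clock time and rows/sec."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading on very coarse clocks.
    best = max(best, 1e-9)
    return {"seconds": best, "rows_per_sec": rows / best}


def build_rows(n: int = 10_000) -> list:
    """Stand-in workload (hypothetical): build n row tuples in plain Python."""
    return [(i, f"name_{i}", i * 0.5) for i in range(n)]


result = time_operation(build_rows, rows=10_000)
print(f"{result['seconds']:.4f}s, {result['rows_per_sec']:,.0f} rows/sec")
```

Taking the best of several repeats reduces noise from other processes; a real suite might report the median instead.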

🔧 Version Comparison Tools

- JSON output format for machine-readable results and historical tracking
- Built-in comparison tool with visual indicators (✓ for improvements, ✗ for regressions)
- Shell script (`compare_versions.sh`) for automated version comparison
- Programmatic API for custom benchmarking workflows
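
As a rough illustration of how JSON results and the ✓/✗ indicators might fit together, the sketch below compares two result sets. It assumes a flat mapping of benchmark name to seconds, which is a guess: the real file layout is not shown in this description. `compare_results` and the `iterate_materialized` entry (with its timings) are hypothetical; the `to_arrow_standard` timings are taken from the sample output in this PR.

```python
import json
from typing import Dict


def compare_results(baseline: Dict[str, float], current: Dict[str, float]) -> Dict[str, str]:
    """Map each benchmark present in both runs to a one-line verdict.

    Timings are in seconds, so a speedup of 1.0 or more counts as an improvement.
    """
    report = {}
    for name in sorted(baseline.keys() & current.keys()):
        speedup = baseline[name] / current[name]
        marker = "✓" if speedup >= 1.0 else "✗"
        report[name] = f"{marker} {name}: {speedup:.2f}x"
    return report


# Round-trip through JSON to mimic reading two saved result files.
baseline_json = json.dumps({"to_arrow_standard": 0.0172, "iterate_materialized": 0.0100})
current_json = json.dumps({"to_arrow_standard": 0.0165, "iterate_materialized": 0.0125})

report = compare_results(json.loads(baseline_json), json.loads(current_json))
for line in report.values():
    print(line)
```

Only benchmarks present in both files are compared, so adding or removing a test between versions does not break the comparison.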

📚 Documentation

- `BENCHMARK_SUITE.md`: comprehensive documentation with usage instructions, benchmark descriptions, and performance targets
- `BENCHMARK_QUICKREF.md`: quick reference guide for common tasks
- `BENCHMARK_IMPLEMENTATION.md`: implementation summary and overview
- `examples/benchmark_usage.py`: working examples showing 5 different usage patterns
- Updated `README.md` with a benchmark section

Usage

Basic benchmark run:

python tests/test_benchmark_suite.py

Compare two versions:

# Create baseline
python tests/test_benchmark_suite.py -o baseline.json

# Switch version (upgrade, git checkout, etc.)
pip install orso==<new_version>

# Compare
python tests/test_benchmark_suite.py -o current.json -c baseline.json

Automated comparison:

./compare_versions.sh 0.0.225 0.0.227

Programmatic usage:

from tests.test_benchmark_suite import PerformanceBenchmark

benchmark = PerformanceBenchmark()
results = benchmark.run_all_benchmarks()
benchmark.save_results("my_results.json")

Sample Output

================================================================================
Performance Comparison
Baseline: 0.0.225
Current:  0.0.227
================================================================================

ARROW_CONVERSION:
--------------------------------------------------------------------------------
  ✓ to_arrow_standard:
      Baseline: 0.0172s
      Current:  0.0165s
      Speedup:  1.04x
  ✓ to_arrow_standard (rows/sec):
      Baseline: 5,816,031
      Current:  6,060,606
      Change:   +4.2%
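
The derived figures in this sample can be reproduced from the raw numbers. The short check below assumes that speedup is baseline time divided by current time and that the change is the relative difference in rows/sec; the report's actual formulas are not shown here, but these definitions match the printed values.

```python
# Raw numbers copied from the sample output above.
baseline_seconds, current_seconds = 0.0172, 0.0165
baseline_rate, current_rate = 5_816_031, 6_060_606

# Speedup: how many times faster the current run is than the baseline.
speedup = baseline_seconds / current_seconds

# Change: relative difference in throughput, as a percentage.
change_pct = (current_rate - baseline_rate) / baseline_rate * 100

print(f"Speedup: {speedup:.2f}x")
print(f"Change:  {change_pct:+.1f}%")
```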

Performance Targets

The suite validates these minimum performance thresholds:

- Arrow conversion: > 500k rows/sec (standard types)
- Append operations: > 100k rows/sec
- Iteration: > 500k rows/sec (materialized)
- Display: < 1 second (100 rows with types)
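
One way to express thresholds like these as a check is sketched below. This is an illustration only: `check_targets`, the dictionary keys, and the measured values passed in at the end are hypothetical, and the suite itself may structure its assertions differently.

```python
from typing import Dict, List

# Thresholds copied from the targets listed above; the key names are hypothetical.
MIN_ROWS_PER_SEC = {
    "arrow_conversion": 500_000,
    "append": 100_000,
    "iteration": 500_000,
}
MAX_DISPLAY_SECONDS = 1.0


def check_targets(rates: Dict[str, float], display_seconds: float) -> List[str]:
    """Return a message for every target that is missed (empty list means all pass)."""
    failures = []
    for name, minimum in MIN_ROWS_PER_SEC.items():
        # Targets are strict ("> 500k"), so a rate equal to the minimum also fails.
        if name in rates and rates[name] <= minimum:
            failures.append(f"{name}: {rates[name]:,.0f} rows/sec is not above {minimum:,}")
    if display_seconds >= MAX_DISPLAY_SECONDS:
        failures.append(f"display: {display_seconds:.2f}s is not below {MAX_DISPLAY_SECONDS}s")
    return failures


# Illustrative measurements (made up, except the Arrow rate from the sample output).
failures = check_targets(
    {"arrow_conversion": 6_060_606, "append": 90_000, "iteration": 600_000},
    display_seconds=0.2,
)
print(failures)
```

Collecting all failures before reporting, rather than stopping at the first, gives a fuller picture of a slow run.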

Integration

- Compatible with the existing test framework (`run_tests()`)
- Individual test functions with performance assertions
- Can be run as a standalone script or imported as a module
- `.gitignore` updated to exclude benchmark output files

Files Changed

- 6 new files (1,387 lines): benchmark suite, documentation, examples, utilities
- 2 modified files: `README.md` (benchmark section), `.gitignore` (output patterns)
- Total: 1,410 lines added

This benchmark suite enables systematic performance tracking and regression detection across Orso versions. It makes it easy to check that optimizations do not regress other operations and that new features maintain acceptable performance.

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

esm.ubuntu.com
- Triggering command: /usr/lib/apt/methods/https (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

- Configure Actions setup steps to set up my environment, which run before the firewall is enabled
- Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Original prompt

write me a test suite to allow me to compare the performance of two different versions of orso, the key functionality are the conversion to and from arrow, the appending of values, the iterating of a dataframe and the display of a dataframe. these are the core functions, everything else is glue and sugar

Co-authored-by: joocer <[email protected]>

Initial plan

38d049c

Copilot AI assigned Copilot and joocer Oct 14, 2025

Copilot started work on behalf of joocer October 14, 2025 09:31

Copilot AI and others added 3 commits October 14, 2025 09:45

Add comprehensive performance benchmark suite for version comparison

93c2eaf

Co-authored-by: joocer <[email protected]>

Add documentation, examples, and utilities for benchmark suite

38922c2

Co-authored-by: joocer <[email protected]>

Add implementation summary for benchmark suite

e31a2a7

Co-authored-by: joocer <[email protected]>

Copilot AI changed the title ~~[WIP] Add test suite for comparing Orso versions performance~~ Add comprehensive performance benchmark suite for Orso version comparison Oct 14, 2025

Copilot AI requested a review from joocer October 14, 2025 09:52

Copilot finished work on behalf of joocer October 14, 2025 09:52

perf benchmark

8267f30

joocer approved these changes Oct 14, 2025

joocer marked this pull request as ready for review October 14, 2025 14:08

joocer merged commit af32a46 into main Oct 14, 2025
6 of 10 checks passed

joocer deleted the copilot/add-performance-test-suite branch October 14, 2025 15:11
