
Copilot AI commented Oct 14, 2025

Overview

This PR adds a performance benchmark suite for comparing performance across versions of Orso. The suite focuses on the four core operations of the Orso DataFrame:

  1. Arrow conversion (to and from PyArrow)
  2. Appending values to DataFrames
  3. Iterating over DataFrame rows
  4. Displaying DataFrames

Everything else is glue and sugar; these four operations determine Orso's performance characteristics.

What's Included

📊 Benchmark Suite (tests/test_benchmark_suite.py)

A comprehensive test suite with 17 individual performance tests organized into 4 categories:

  • Arrow Conversion (5 tests): from_arrow (small/large), to_arrow (standard/decimal/wide tables)
  • Append Operations (3 tests): dict append, append with schema, complete append+Arrow workflow
  • Iteration (4 tests): materialized, lazy/generator, with column access, fetchall
  • Display (5 tests): small/medium/wide tables, markdown output, lazy DataFrame display
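
As a rough illustration of how tests in these categories might be timed, here is a minimal sketch using only the standard library. The `time_operation` helper, the stand-in workloads, and the result keys are hypothetical and are not taken from `tests/test_benchmark_suite.py`; a real benchmark would call the Orso DataFrame operations where the plain-list workloads appear.

```python
import time
from statistics import median


def time_operation(func, rows, repeats=5):
    """Run func() several times; report the median time and derived rows/sec."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    elapsed = median(timings)
    rate = rows / elapsed if elapsed > 0 else float("inf")
    return {"seconds": elapsed, "rows_per_sec": rate}


# Stand-in data and workloads (assumption: plain Python lists, not Orso).
ROWS = 10_000
data = [{"id": i, "name": f"row-{i}"} for i in range(ROWS)]


def append_workload():
    target = []
    for row in data:
        target.append(row)


def iterate_workload():
    total = 0
    for row in data:
        total += row["id"]
    return total


results = {
    "append": time_operation(append_workload, ROWS),
    "iteration": time_operation(iterate_workload, ROWS),
}
for name, r in results.items():
    print(f"{name}: {r['seconds']:.4f}s, {r['rows_per_sec']:,.0f} rows/sec")
```

Taking the median of several runs is one way to reduce noise from a single slow run; the actual suite may use a different strategy.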

🔧 Version Comparison Tools

  • JSON output format for machine-readable results and historical tracking
  • Built-in comparison tool with visual indicators (✓ for improvements, ✗ for regressions)
  • Shell script (compare_versions.sh) for automated version comparison
  • Programmatic API for custom benchmarking workflows
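
The JSON output and the ✓/✗ comparison could work along the following lines. This is a sketch under assumptions: the JSON layout (`version` and `results` keys), the helper names, and the `display_small` timings are made up for illustration, and the suite's real file format may differ. Only the `to_arrow_standard` timings come from the sample output in this PR.

```python
import json
import os
import tempfile


def save_results(results, version, path):
    """Write timings (seconds per test) with the version that produced them."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"version": version, "results": results}, f, indent=2)


def load_results(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def compare(baseline, current):
    """Return (name, mark, speedup) for tests present in both runs.

    Timings are in seconds, so a speedup of 1.0 or more counts as an improvement.
    """
    report = []
    for name, base_seconds in baseline["results"].items():
        if name not in current["results"]:
            continue
        speedup = base_seconds / current["results"][name]
        mark = "✓" if speedup >= 1.0 else "✗"
        report.append((name, mark, round(speedup, 2)))
    return report


with tempfile.TemporaryDirectory() as tmp:
    base_path = os.path.join(tmp, "baseline.json")
    cur_path = os.path.join(tmp, "current.json")
    # display_small values are invented to show a regression.
    save_results({"to_arrow_standard": 0.0172, "display_small": 0.010}, "0.0.225", base_path)
    save_results({"to_arrow_standard": 0.0165, "display_small": 0.012}, "0.0.227", cur_path)
    report = compare(load_results(base_path), load_results(cur_path))

for name, mark, speedup in report:
    print(f"{mark} {name}: {speedup}x")
```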

📚 Documentation

  • BENCHMARK_SUITE.md - Comprehensive documentation with usage instructions, benchmark descriptions, and performance targets
  • BENCHMARK_QUICKREF.md - Quick reference guide for common tasks
  • BENCHMARK_IMPLEMENTATION.md - Implementation summary and overview
  • examples/benchmark_usage.py - Working examples showing 5 different usage patterns
  • Updated README.md with benchmark section

Usage

Basic benchmark run:

python tests/test_benchmark_suite.py

Compare two versions:

# Create baseline
python tests/test_benchmark_suite.py -o baseline.json

# Switch version (upgrade, git checkout, etc.)
pip install orso==<new_version>

# Compare
python tests/test_benchmark_suite.py -o current.json -c baseline.json

Automated comparison:

./compare_versions.sh 0.0.225 0.0.227

Programmatic usage:

from tests.test_benchmark_suite import PerformanceBenchmark

benchmark = PerformanceBenchmark()
results = benchmark.run_all_benchmarks()
benchmark.save_results("my_results.json")

Sample Output

================================================================================
Performance Comparison
Baseline: 0.0.225
Current:  0.0.227
================================================================================

ARROW_CONVERSION:
--------------------------------------------------------------------------------
  ✓ to_arrow_standard:
      Baseline: 0.0172s
      Current:  0.0165s
      Speedup:  1.04x
  ✓ to_arrow_standard (rows/sec):
      Baseline: 5,816,031
      Current:  6,060,606
      Change:   +4.2%
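
The figures in this sample are consistent with speedup being the baseline time divided by the current time, and change being the relative difference in rows/sec. The sketch below checks that arithmetic; the function names are hypothetical, and the formulas are inferred from the sample rather than taken from the suite's source.

```python
def speedup(baseline_seconds, current_seconds):
    """Ratio of old time to new time; above 1.0 means the current run is faster."""
    return baseline_seconds / current_seconds


def percent_change(baseline_rate, current_rate):
    """Relative change in throughput, as a percentage of the baseline."""
    return (current_rate - baseline_rate) / baseline_rate * 100


speedup_text = f"{speedup(0.0172, 0.0165):.2f}x"
change_text = f"{percent_change(5_816_031, 6_060_606):+.1f}%"
print("Speedup: ", speedup_text)  # 1.04x
print("Change:  ", change_text)   # +4.2%
```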

Performance Targets

The suite validates these minimum performance thresholds:

  • Arrow conversion: > 500k rows/sec (standard types)
  • Append operations: > 100k rows/sec
  • Iteration: > 500k rows/sec (materialized)
  • Display: < 1 second (100 rows with types)
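
A threshold check of this kind might look like the sketch below. The threshold values are the ones listed above; the dictionary keys, function names, and the 100,000-row example are assumptions for illustration and do not reflect the suite's actual test functions.

```python
# Minimum throughput targets from the list above (keys are hypothetical).
MIN_ROWS_PER_SEC = {
    "arrow_conversion": 500_000,
    "append": 100_000,
    "iteration": 500_000,
}
MAX_DISPLAY_SECONDS = 1.0


def check_throughput(category, rows, seconds):
    """Assert that rows/seconds exceeds the target for the category."""
    rate = rows / seconds
    minimum = MIN_ROWS_PER_SEC[category]
    assert rate > minimum, (
        f"{category}: {rate:,.0f} rows/sec is below the {minimum:,} rows/sec target"
    )
    return rate


def check_display(seconds):
    """Assert that a display call finished within the time limit."""
    assert seconds < MAX_DISPLAY_SECONDS, f"display took {seconds:.2f}s"
    return seconds


# Illustrative figures: 100,000 rows converted in 0.0165 seconds.
rate = check_throughput("arrow_conversion", rows=100_000, seconds=0.0165)
print(f"{rate:,.0f} rows/sec")
```

Fixed thresholds like these depend on the hardware running the tests, so a slower CI machine may fail a check that passes locally.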

Integration

  • Compatible with existing test framework (run_tests())
  • Individual test functions with performance assertions
  • Can be run as standalone script or imported as module
  • .gitignore updated to exclude benchmark output files
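
One way a file can serve both as an importable module and as a script accepting the `-o` and `-c` flags shown under Usage is sketched below. The long option names and the body of `main` are assumptions; only the short flags appear in this PR, and the real script's argument handling is not shown here.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Run benchmarks and optionally compare against a baseline."
    )
    # Long names are hypothetical; the PR only documents -o and -c.
    parser.add_argument("-o", "--output", help="write results to this JSON file")
    parser.add_argument("-c", "--compare", help="baseline JSON file to compare against")
    return parser


def main(argv=None):
    """Return the planned actions; a real script would run the benchmarks here."""
    args = build_parser().parse_args(argv)
    actions = ["run"]
    if args.output:
        actions.append(f"save:{args.output}")
    if args.compare:
        actions.append(f"compare:{args.compare}")
    return actions


# A script would typically call main() under an `if __name__ == "__main__":` guard.
# Passing an explicit list here keeps the example independent of sys.argv.
print(main(["-o", "current.json", "-c", "baseline.json"]))
```

Keeping the logic in `main(argv)` means the same entry point can be called from other code or from the command line.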

Files Changed

  • 6 new files (1,387 lines): benchmark suite, documentation, examples, utilities
  • 2 modified files: README.md (benchmark section), .gitignore (output patterns)
  • Total: 1,410 lines added

This benchmark suite enables systematic performance tracking and regression detection across Orso versions. It makes it easy to confirm that code changes do not introduce performance regressions and that new features maintain acceptable performance.

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • esm.ubuntu.com
    • Triggering command: /usr/lib/apt/methods/https (dns block)

Original prompt

write me a test suite to allow me to compare the performance of two different versions of orso. the key functionality is the conversion to and from arrow, the appending of values, the iterating of a dataframe and the display of a dataframe. these are the core functions, everything else is glue and sugar



Copilot AI changed the title from "[WIP] Add test suite for comparing Orso versions performance" to "Add comprehensive performance benchmark suite for Orso version comparison" on October 14, 2025
Copilot AI requested a review from joocer on October 14, 2025 09:52
joocer marked this pull request as ready for review on October 14, 2025 14:08
joocer merged commit af32a46 into main on October 14, 2025
6 of 10 checks passed
joocer deleted the copilot/add-performance-test-suite branch on October 14, 2025 15:11