⚡️ Speed up function sort_objects_by_score by 19% #36

Open
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-sort_objects_by_score-mkos98rz
codeflash-ai bot commented Jan 22, 2026

📄 19% (0.19x) speedup for sort_objects_by_score in unstructured_inference/models/table_postprocess.py

⏱️ Runtime: 760 microseconds → 640 microseconds (best of 208 runs)

📝 Explanation and details

The optimization replaces a lambda function lambda k: k["score"] with operator.itemgetter("score") as the key function for sorting.

What changed:

  • Added the import from operator import itemgetter
  • Changed key=lambda k: k["score"] to key=itemgetter("score")
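The diff itself is not shown in this description, so the following is only a sketch of what the two forms might look like. The function bodies, the `_before`/`_after` suffixes, and the `(objects, reverse=True)` signature are assumptions inferred from the description and the generated tests, not the repository's actual code.

```python
from operator import itemgetter


def sort_objects_by_score_before(objects, reverse=True):
    """Assumed original form: a lambda extracts the score from each object."""
    return sorted(objects, key=lambda k: k["score"], reverse=reverse)


def sort_objects_by_score_after(objects, reverse=True):
    """Assumed optimized form: operator.itemgetter extracts the score."""
    return sorted(objects, key=itemgetter("score"), reverse=reverse)


# Hypothetical sample data, mirroring the shape used in the generated tests.
objs = [
    {"score": 1, "name": "low"},
    {"score": 3, "name": "high"},
    {"score": 2, "name": "mid"},
]
names_before = [o["name"] for o in sort_objects_by_score_before(objs)]
names_after = [o["name"] for o in sort_objects_by_score_after(objs)]
```

Both variants pass the same key values to `sorted()`, so they should return the objects in the same order for any input.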

Why it's faster:
sorted() calls the key function once per element to build the sort keys, not once per comparison. A lambda is an ordinary Python function, so each of those calls goes through the interpreter's function-call machinery and executes bytecode. In CPython, itemgetter is implemented in C and performs the item lookup directly, which removes that per-element overhead. The absolute saving therefore grows with the list size.

The line profiler shows the optimization reduces per-hit time on the sort line from ~65.2μs to ~18.0μs (3.6x faster per hit), with an overall speedup of about 19% (760μs → 640μs) across all test workloads.
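For readers who want to check the size dependence on their own machine, a minimal timing sketch using only the standard library could look like the following. The helper names `time_sort` and `compare_keys` are hypothetical, and the harness reuses one pre-built `itemgetter`, so it measures only the per-element key cost and not any per-call setup inside `sort_objects_by_score`. Absolute numbers will differ from those reported here.

```python
import random
import timeit
from operator import itemgetter


def time_sort(key, objects, repeat=5, number=200):
    """Best per-call time in microseconds for sorted(objects, key=key, reverse=True)."""
    timings = timeit.repeat(
        lambda: sorted(objects, key=key, reverse=True),
        repeat=repeat,
        number=number,
    )
    return min(timings) / number * 1e6


def compare_keys(sizes=(3, 100, 1000), seed=42):
    """Map each list size to (lambda_us, itemgetter_us)."""
    rng = random.Random(seed)  # seeded so the input data is reproducible
    results = {}
    for n in sizes:
        objects = [{"score": rng.random(), "id": i} for i in range(n)]
        results[n] = (
            time_sort(lambda k: k["score"], objects),
            time_sort(itemgetter("score"), objects),
        )
    return results


if __name__ == "__main__":
    for n, (lambda_us, itemgetter_us) in compare_keys().items():
        print(f"n={n:5d}  lambda={lambda_us:8.2f} us  itemgetter={itemgetter_us:8.2f} us")
```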

Performance characteristics:

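  • Small lists (10 objects or fewer): 4-15% slower. The difference is a few tenths of a microsecond per call and appears even for an empty list (1.71μs → 2.02μs), so it is a fixed per-call cost rather than something that scales with the input. One plausible source, not confirmed by the measurements here, is constructing the itemgetter object on each call; the import itself runs only once at module load.
  • Medium lists (~100 objects): Expected to be near or past the break-even point, although none of the generated tests measures this size.
  • Large lists (500-1000 objects): 15-57% faster. Shuffled inputs gain about 16-17%. The largest gains come from inputs that are already ordered or consist of long runs of equal scores (51% for 500 presorted objects, 57% for 1000 identical scores): the sort needs few comparisons on such inputs, so key extraction accounts for a larger share of the runtime.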
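How much of the work is key extraction and how much is comparison depends on the input order. The counting sketch below illustrates this; `CountingScore` and `count_work` are hypothetical helpers written for this illustration, not part of the library. It counts key-function calls and `<` comparisons for one descending sort.

```python
import random


class CountingScore:
    """Wrap a score and count how often the sort compares two wrapped values."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # list.sort / sorted only ever use "<" to order elements.
        CountingScore.comparisons += 1
        return self.value < other.value


def count_work(objects):
    """Return (key_calls, comparisons) for one descending sort of objects."""
    key_calls = 0

    def key(obj):
        nonlocal key_calls
        key_calls += 1
        return CountingScore(obj["score"])

    CountingScore.comparisons = 0
    sorted(objects, key=key, reverse=True)
    return key_calls, CountingScore.comparisons


n = 1000
identical = [{"score": 42, "id": i} for i in range(n)]
shuffled = [{"score": i, "id": i} for i in range(n)]
random.Random(0).shuffle(shuffled)

identical_counts = count_work(identical)  # (key calls, comparisons)
shuffled_counts = count_work(shuffled)
```

In both cases the key function runs exactly once per element. The number of comparisons is on the order of n for the identical-score list and several times larger for the shuffled list, so a cheaper key function is a bigger fraction of the total on run-heavy inputs.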

Impact on existing workloads:
Looking at function_references, this function is called in critical hot paths:

  • nms() - Non-maxima suppression for object detection, processes potentially large object lists
  • slot_into_containers() - Called inside nested loops over packages/containers, may sort repeatedly
  • nms_by_containment(), nms_supercells(), header_supercell_tree() - All sort lists during table structure analysis

These are performance-sensitive post-processing operations in table detection pipelines where objects (cells, rows, columns) can number in the hundreds. The optimization provides meaningful speedups when processing tables with many detected elements, with negligible impact on smaller tables.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 41 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import random  # used to generate deterministic large-scale test data

import pytest  # used for our unit tests
from unstructured_inference.models.table_postprocess import \
    sort_objects_by_score

def test_basic_sort_descending_integers():
    # Basic functionality: ensure descending order (default reverse=True) for integer scores
    objs = [
        {"score": 1, "name": "low"},
        {"score": 3, "name": "high"},
        {"score": 2, "name": "mid"},
    ]
    # call the function
    codeflash_output = sort_objects_by_score(objs); sorted_objs = codeflash_output # 2.44μs -> 2.78μs (12.3% slower)

def test_basic_sort_ascending_with_reverse_false():
    # The reverse parameter should control sort direction; test reverse=False yields ascending
    objs = [
        {"score": 5},
        {"score": -1},
        {"score": 0},
    ]
    codeflash_output = sort_objects_by_score(objs, reverse=False); sorted_objs = codeflash_output # 2.91μs -> 3.06μs (4.65% slower)

def test_preserves_original_and_returns_new_list():
    # sorted() returns a new list and does not mutate the input list
    original = [{"score": 1, "id": "a"}, {"score": 0, "id": "b"}]
    copy_before = list(original)  # shallow copy for comparison
    codeflash_output = sort_objects_by_score(original); result = codeflash_output # 2.23μs -> 2.44μs (8.34% slower)

def test_tuple_input_returns_list():
    # The function should accept any iterable sorted() accepts; a tuple should yield a list result
    tpl = ({"score": 2}, {"score": 1})
    codeflash_output = sort_objects_by_score(tpl); result = codeflash_output # 2.24μs -> 2.50μs (10.3% slower)

def test_stability_with_equal_scores():
    # sorted in Python is stable; items with equal keys should preserve relative order
    objs = [
        {"score": 10, "tag": "first_10"},
        {"score": 5, "tag": "first_5"},
        {"score": 10, "tag": "second_10"},
        {"score": 5, "tag": "second_5"},
    ]
    codeflash_output = sort_objects_by_score(objs); result = codeflash_output # 2.54μs -> 2.75μs (7.70% slower)
    # both 10s should remain in their original relative order: first_10 then second_10
    tags_in_10s = [o["tag"] for o in result if o["score"] == 10]
    assert tags_in_10s == ["first_10", "second_10"]
    # both 5s should remain in their original relative order as well
    tags_in_5s = [o["tag"] for o in result if o["score"] == 5]
    assert tags_in_5s == ["first_5", "second_5"]

def test_missing_score_key_raises_keyerror():
    # Edge case: an object without 'score' should cause the key function to raise KeyError
    objs = [{"score": 1}, {"not_score": 2}]
    with pytest.raises(KeyError):
        # calling the function should raise KeyError during sorting
        sort_objects_by_score(objs) # 2.94μs -> 2.94μs (0.102% faster)

def test_non_mapping_objects_raises_typeerror():
    # Edge case: if elements are not subscriptable dict-like objects, access k["score"] raises TypeError
    objs = [{"score": 1}, 5, {"score": 2}]
    with pytest.raises(TypeError):
        sort_objects_by_score(objs) # 3.73μs -> 3.80μs (1.76% slower)

def test_mixed_number_types_sorting():
    # Ensure mixing ints and floats behaves as expected (numeric comparison)
    objs = [
        {"score": 1.5},
        {"score": 2},
        {"score": 1},
    ]
    codeflash_output = sort_objects_by_score(objs); result = codeflash_output # 2.86μs -> 3.03μs (5.51% slower)

def test_string_scores_sort_lexicographically():
    # If scores are strings, the function uses them as-is; ordering will be lexicographic
    objs = [{"score": "2", "id": "a"}, {"score": "10", "id": "b"}]
    # default reverse=True: lexicographic ascending would be ["10","2"], reversed gives ["2","10"]
    codeflash_output = sort_objects_by_score(objs); result = codeflash_output # 2.33μs -> 2.48μs (6.06% slower)
    # verify that explicit reverse=False yields lexicographic ascending
    codeflash_output = sort_objects_by_score(objs, reverse=False); result_asc = codeflash_output # 1.63μs -> 1.41μs (15.1% faster)

def test_empty_input_returns_empty_list():
    # Edge case: empty iterable should return an empty list (sorted([]) -> [])
    codeflash_output = sort_objects_by_score([]) # 1.71μs -> 2.02μs (15.3% slower)

def test_large_scale_sort_matches_builtin_sorted_for_stability_and_correctness():
    # Large scale test with deterministic random data (but kept under 1000 elements)
    random.seed(42)  # deterministic seed for reproducibility
    n = 500  # within the stated limit (<1000)
    objs = [{"score": random.randint(0, 1000), "index": i} for i in range(n)]
    # Call the function under test
    codeflash_output = sort_objects_by_score(objs); result = codeflash_output # 94.6μs -> 81.7μs (15.8% faster)
    # For correctness, compare to Python's own sorted using the same key and reverse parameters
    expected = sorted(objs, key=lambda k: k["score"], reverse=True)
    assert result == expected
    # Verify the non-increasing property across adjacent pairs (ties are possible)
    scores = [o["score"] for o in result]
    for i in range(len(scores) - 1):
        assert scores[i] >= scores[i + 1]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from unstructured_inference.models.table_postprocess import \
    sort_objects_by_score

def test_basic_sort_descending_by_default():
    """Test that objects are sorted in descending order by default (reverse=True)."""
    objects = [
        {"score": 1, "name": "low"},
        {"score": 3, "name": "high"},
        {"score": 2, "name": "medium"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.39μs -> 2.69μs (11.1% slower)

def test_basic_sort_ascending():
    """Test that objects can be sorted in ascending order (reverse=False)."""
    objects = [
        {"score": 3, "name": "high"},
        {"score": 1, "name": "low"},
        {"score": 2, "name": "medium"}
    ]
    codeflash_output = sort_objects_by_score(objects, reverse=False); result = codeflash_output # 2.66μs -> 2.88μs (7.77% slower)

def test_single_object():
    """Test sorting with a single object."""
    objects = [{"score": 5, "name": "only"}]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 1.96μs -> 2.15μs (9.23% slower)

def test_two_objects():
    """Test sorting with two objects."""
    objects = [
        {"score": 2, "name": "low"},
        {"score": 5, "name": "high"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.30μs -> 2.54μs (9.71% slower)

def test_objects_with_additional_fields():
    """Test that objects with extra fields are sorted correctly."""
    objects = [
        {"score": 1, "name": "obj1", "description": "first", "category": "A"},
        {"score": 3, "name": "obj3", "description": "third", "category": "C"},
        {"score": 2, "name": "obj2", "description": "second", "category": "B"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.36μs -> 2.65μs (10.9% slower)

def test_identical_scores():
    """Test sorting objects with identical scores."""
    objects = [
        {"score": 5, "name": "first"},
        {"score": 5, "name": "second"},
        {"score": 5, "name": "third"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.21μs -> 2.52μs (12.2% slower)

def test_negative_scores():
    """Test sorting objects with negative scores."""
    objects = [
        {"score": -1, "name": "negative"},
        {"score": 0, "name": "zero"},
        {"score": 1, "name": "positive"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.38μs -> 2.60μs (8.17% slower)

def test_float_scores():
    """Test sorting objects with floating-point scores."""
    objects = [
        {"score": 1.5, "name": "mid"},
        {"score": 3.7, "name": "high"},
        {"score": 0.2, "name": "low"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.29μs -> 2.51μs (8.85% slower)

def test_original_list_not_modified():
    """Test that the original list is not modified (sorted returns new list)."""
    objects = [
        {"score": 1, "name": "low"},
        {"score": 3, "name": "high"},
        {"score": 2, "name": "medium"}
    ]
    original_order = [obj["score"] for obj in objects]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.36μs -> 2.53μs (6.76% slower)

def test_empty_list():
    """Test sorting an empty list."""
    objects = []
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 1.73μs -> 2.02μs (14.3% slower)

def test_very_large_scores():
    """Test sorting objects with very large score values."""
    objects = [
        {"score": 1e10, "name": "huge"},
        {"score": 1e9, "name": "large"},
        {"score": 1e11, "name": "massive"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.36μs -> 2.66μs (11.3% slower)

def test_very_small_scores():
    """Test sorting objects with very small (negative) score values."""
    objects = [
        {"score": -1e10, "name": "tiny"},
        {"score": -1e9, "name": "small"},
        {"score": -1e11, "name": "minuscule"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.37μs -> 2.46μs (3.82% slower)

def test_zero_score():
    """Test sorting with zero as a score value."""
    objects = [
        {"score": 1, "name": "positive"},
        {"score": 0, "name": "zero"},
        {"score": -1, "name": "negative"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.36μs -> 2.58μs (8.82% slower)

def test_mixed_int_and_float_scores():
    """Test sorting with mixed integer and float score values."""
    objects = [
        {"score": 1, "name": "int_one"},
        {"score": 2.5, "name": "float_two_point_five"},
        {"score": 2, "name": "int_two"},
        {"score": 1.5, "name": "float_one_point_five"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 3.03μs -> 3.26μs (7.05% slower)

def test_objects_with_extra_nested_fields():
    """Test sorting objects with complex nested field structures."""
    objects = [
        {"score": 2, "name": "obj2", "metadata": {"nested": {"value": "deep"}}},
        {"score": 3, "name": "obj3", "metadata": {"nested": {"value": "deeper"}}},
        {"score": 1, "name": "obj1", "metadata": {"nested": {"value": "deepest"}}}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.37μs -> 2.64μs (10.2% slower)

def test_objects_with_string_values_in_other_fields():
    """Test that objects with string values in non-score fields are handled correctly."""
    objects = [
        {"score": 1, "name": "apple", "category": "fruit"},
        {"score": 3, "name": "zebra", "category": "animal"},
        {"score": 2, "name": "carrot", "category": "vegetable"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.38μs -> 2.61μs (8.56% slower)

def test_descending_explicitly_true():
    """Test that reverse=True explicitly specified works correctly."""
    objects = [
        {"score": 1, "name": "one"},
        {"score": 3, "name": "three"},
        {"score": 2, "name": "two"}
    ]
    codeflash_output = sort_objects_by_score(objects, reverse=True); result = codeflash_output # 2.69μs -> 3.04μs (11.6% slower)

def test_many_objects_with_same_high_score():
    """Test sorting when many objects share the highest score."""
    objects = [
        {"score": 10, "name": "high_1"},
        {"score": 10, "name": "high_2"},
        {"score": 10, "name": "high_3"},
        {"score": 5, "name": "medium"},
        {"score": 1, "name": "low"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.51μs -> 2.64μs (4.93% slower)

def test_infinity_score():
    """Test sorting with positive infinity as a score."""
    objects = [
        {"score": float('inf'), "name": "infinite"},
        {"score": 1000, "name": "large"},
        {"score": 1, "name": "small"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.48μs -> 2.81μs (11.9% slower)

def test_negative_infinity_score():
    """Test sorting with negative infinity as a score."""
    objects = [
        {"score": 1, "name": "positive"},
        {"score": -1000, "name": "very_negative"},
        {"score": float('-inf'), "name": "neg_infinity"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.52μs -> 2.77μs (8.85% slower)

def test_objects_with_unicode_names():
    """Test sorting objects with unicode characters in non-score fields."""
    objects = [
        {"score": 3, "name": "日本語"},
        {"score": 1, "name": "English"},
        {"score": 2, "name": "Español"}
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.33μs -> 2.57μs (9.42% slower)

def test_preserve_object_identity():
    """Test that the same object instances are returned (not copies)."""
    obj1 = {"score": 1, "name": "one"}
    obj2 = {"score": 3, "name": "three"}
    obj3 = {"score": 2, "name": "two"}
    objects = [obj1, obj2, obj3]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 2.34μs -> 2.52μs (7.12% slower)

def test_500_objects():
    """Test sorting 500 objects with varying scores."""
    # Create 500 objects with scores from 1 to 500
    objects = [{"score": i, "id": i} for i in range(1, 501)]
    # Shuffle to make it a more realistic scenario
    import random
    random.shuffle(objects)
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 95.2μs -> 82.0μs (16.2% faster)

def test_500_objects_ascending():
    """Test sorting 500 objects in ascending order."""
    # Create 500 objects with scores from 1 to 500
    objects = [{"score": i, "id": i} for i in range(1, 501)]
    codeflash_output = sort_objects_by_score(objects, reverse=False); result = codeflash_output # 37.2μs -> 24.6μs (51.2% faster)

def test_1000_objects_all_identical_scores():
    """Test sorting 1000 objects where all have the same score."""
    # Create 1000 objects with identical scores
    objects = [{"score": 42, "id": i} for i in range(1000)]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 75.2μs -> 47.8μs (57.4% faster)

def test_500_objects_with_duplicate_scores():
    """Test sorting 500 objects with many duplicate scores."""
    # Create objects with repeated score patterns
    objects = []
    for score in range(100):
        for duplicate in range(5):
            objects.append({"score": score, "id": len(objects)})
    # Shuffle to randomize order
    import random
    random.shuffle(objects)
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 92.2μs -> 78.8μs (17.0% faster)
    # Verify descending order (may have duplicates at same level)
    for i in range(len(result) - 1):
        assert result[i]["score"] >= result[i + 1]["score"]

def test_performance_with_large_float_scores():
    """Test sorting 500 objects with large floating-point scores."""
    import random

    # Create 500 objects with random large float scores
    objects = [
        {"score": random.uniform(0, 1e10), "id": i}
        for i in range(500)
    ]
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 90.7μs -> 78.2μs (16.0% faster)
    # Verify descending order
    for i in range(len(result) - 1):
        assert result[i]["score"] >= result[i + 1]["score"]

def test_stability_with_large_dataset():
    """Test that sorting is stable (equal elements maintain relative order) with large dataset."""
    # Create 500 objects with repeated scores
    objects = []
    for score in [10, 20, 30]:
        for i in range(167):  # ~500 total
            objects.append({"score": score, "original_id": len(objects)})
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 40.2μs -> 27.2μs (47.6% faster)
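    # Verify descending score order
    scores = [obj["score"] for obj in result]
    for i in range(len(scores) - 1):
        assert scores[i] >= scores[i + 1]
    # Verify stability: within equal scores, original_id keeps its input order
    for prev, cur in zip(result, result[1:]):
        if prev["score"] == cur["score"]:
            assert prev["original_id"] < cur["original_id"]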

def test_800_objects_with_varied_fields():
    """Test sorting 800 objects with various additional fields (realistic scenario)."""
    objects = []
    import random
    for i in range(800):
        objects.append({
            "score": random.uniform(0, 100),
            "id": i,
            "name": f"object_{i}",
            "category": random.choice(["A", "B", "C"]),
            "timestamp": 1000000 + i
        })
    codeflash_output = sort_objects_by_score(objects); result = codeflash_output # 154μs -> 133μs (16.0% faster)
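    # Verify descending order by score
    for i in range(len(result) - 1):
        assert result[i]["score"] >= result[i + 1]["score"]
    # Verify all fields are preserved
    for obj in result:
        assert set(obj) == {"score", "id", "name", "category", "timestamp"}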
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
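Most of the generated tests above rely on the `codeflash_output` comparison rather than explicit assertions. As a complement, a minimal stand-alone equivalence check between the two key functions might look like this sketch; `outcome` and `keys_agree` are hypothetical helpers, and the cases are borrowed from the tests above. It treats raising the same exception type as agreement.

```python
from operator import itemgetter


def outcome(key, objects, reverse=True):
    """Return ('ok', sorted_list) or ('error', exception_type) for one sort call."""
    try:
        return "ok", sorted(objects, key=key, reverse=reverse)
    except Exception as exc:  # record only the type so both keys can be compared
        return "error", type(exc)


def keys_agree(objects, reverse=True):
    """True if the lambda key and itemgetter give the same result or the same error type."""
    with_lambda = outcome(lambda k: k["score"], objects, reverse)
    with_itemgetter = outcome(itemgetter("score"), objects, reverse)
    return with_lambda == with_itemgetter


cases = [
    [],
    [{"score": 1, "name": "low"}, {"score": 3, "name": "high"}, {"score": 2, "name": "mid"}],
    [{"score": 10, "tag": "first_10"}, {"score": 5, "tag": "first_5"}, {"score": 10, "tag": "second_10"}],
    [{"score": 1}, {"not_score": 2}],  # missing key: KeyError with either key function
    [{"score": 1}, 5, {"score": 2}],   # non-subscriptable element: TypeError with either
]
all_agree = all(keys_agree(case, rev) for case in cases for rev in (True, False))
```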

To edit these changes, run git checkout codeflash/optimize-sort_objects_by_score-mkos98rz and push.


codeflash-ai bot requested a review from aseembits93 January 22, 2026 01:38
codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) Jan 22, 2026