⚡️ Speed up method AstraDBVectorStoreComponent.reset_database_list by 49% in PR #6048 (bugfix-dev-astradb) #6086

Open
wants to merge 9 commits into
base: main

Conversation

codeflash-ai[bot]
Contributor

@codeflash-ai codeflash-ai bot commented Feb 3, 2025

⚡️ This pull request contains optimizations for PR #6048

If you approve this dependent PR, these changes will be merged into the original PR branch bugfix-dev-astradb.

This PR will be automatically closed if the original PR is merged.


📄 49% (0.49x) speedup for AstraDBVectorStoreComponent.reset_database_list in src/backend/base/langflow/components/vectorstores/astradb.py

⏱️ Runtime: 974 microseconds → 655 microseconds (best of 113 runs)
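The headline figure can be reproduced from the two runtimes above. The following is a minimal sketch, assuming the speedup is defined as the ratio of original to optimized runtime minus one; the function name `speedup` is hypothetical and not part of Codeflash or Langflow.

```python
def speedup(original_us: float, optimized_us: float) -> float:
    """Relative speedup, assumed here to be original / optimized - 1."""
    return original_us / optimized_us - 1.0


# Runtimes reported in this PR, in microseconds.
gain = speedup(974, 655)
print(f"{gain:.0%} ({gain:.2f}x)")  # prints "49% (0.49x)"
```

Under that assumption, 974 / 655 is about 1.487, which rounds to the reported 49% (0.49x).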

📝 Explanation and details

To optimize the provided Python code for better performance, we can adopt the following strategies.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 22 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | undefined |
🌀 Generated Regression Tests Details
import pytest  # used for our unit tests
# function to test
from langflow.components.vectorstores.astradb import AstraDBVectorStoreComponent

# unit tests

# Mock class to simulate get_database_list method
class MockAstraDBVectorStoreComponent(AstraDBVectorStoreComponent):
    def __init__(self, database_list):
        self._database_list = database_list

    def get_database_list(self):
        return self._database_list


def test_basic_single_database_option():
    # Test with a single database option
    component = MockAstraDBVectorStoreComponent({
        "db1": {"collections": ["col1"], "api_endpoint": "http://example.com"}
    })
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)


def test_basic_multiple_database_options():
    # Test with multiple database options
    component = MockAstraDBVectorStoreComponent({
        "db1": {"collections": ["col1"], "api_endpoint": "http://example.com"},
        "db2": {"collections": ["col2"], "api_endpoint": "http://example2.com"}
    })
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)


def test_edge_empty_database_list():
    # Test with an empty database list
    component = MockAstraDBVectorStoreComponent({})
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)


def test_edge_missing_fields():
    # Test with database list missing fields
    component = MockAstraDBVectorStoreComponent({
        "db1": {"collections": ["col1"]}  # Missing "api_endpoint"
    })
    # Build the config outside the raises block so only the call under test can raise
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    with pytest.raises(ValueError):
        component.reset_database_list(build_config)


def test_edge_extra_fields():
    # Test with database list with extra fields
    component = MockAstraDBVectorStoreComponent({
        "db1": {"collections": ["col1"], "api_endpoint": "http://example.com", "extra_field": "extra_value"}
    })
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)


def test_error_handling_exception_during_initialization():
    # Test with an exception during database initialization
    class ExceptionMockAstraDBVectorStoreComponent(AstraDBVectorStoreComponent):
        def get_database_list(self):
            raise Exception("Test Exception")

    component = ExceptionMockAstraDBVectorStoreComponent()
    # Build the config outside the raises block; the unused `excinfo` binding is dropped
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    with pytest.raises(ValueError):
        component.reset_database_list(build_config)


def test_large_scale_many_databases():
    # Test with a large number of databases
    database_list = {f"db{i}": {"collections": [f"col{i}"], "api_endpoint": f"http://example{i}.com"} for i in range(1000)}
    component = MockAstraDBVectorStoreComponent(database_list)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)


def test_special_characters_in_database_names():
    # Test with special characters in database names
    component = MockAstraDBVectorStoreComponent({
        "db!@#": {"collections": ["col1"], "api_endpoint": "http://example.com"}
    })
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)


def test_nested_structures_in_database_list():
    # Test with nested structures in database list
    component = MockAstraDBVectorStoreComponent({
        "db1": {"collections": [{"name": "col1"}], "api_endpoint": "http://example.com"}
    })
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)




import pytest  # used for our unit tests
# function to test
from langflow.components.vectorstores.astradb import AstraDBVectorStoreComponent


# unit tests
class MockAstraDBVectorStoreComponent(AstraDBVectorStoreComponent):
    def __init__(self, mock_data):
        self.mock_data = mock_data

    def get_database_list(self):
        return self.mock_data

def test_basic_functionality():
    # Standard Input
    mock_data = {
        "db1": {"collections": ["col1"], "api_endpoint": "endpoint1"},
        "db2": {"collections": ["col2"], "api_endpoint": "endpoint2"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["db1", "db2"],
            "options_metadata": [
                {"collections": ["col1"], "api_endpoint": "endpoint1"},
                {"collections": ["col2"], "api_endpoint": "endpoint2"}
            ],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_empty_database_list():
    # Empty Database List
    mock_data = {}
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    codeflash_output = component.reset_database_list(build_config)

def test_no_api_endpoint_key():
    # No `api_endpoint` Key in `build_config`
    mock_data = {}
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {}
    with pytest.raises(KeyError):
        component.reset_database_list(build_config)

def test_no_options_key():
    # No `options` Key in `api_endpoint`
    mock_data = {
        "db1": {"collections": ["col1"], "api_endpoint": "endpoint1"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["db1"],
            "options_metadata": [{"collections": ["col1"], "api_endpoint": "endpoint1"}],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_no_options_metadata_key():
    # No `options_metadata` Key in `api_endpoint`
    mock_data = {
        "db1": {"collections": ["col1"], "api_endpoint": "endpoint1"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["db1"],
            "options_metadata": [{"collections": ["col1"], "api_endpoint": "endpoint1"}],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_exception_in_initialize_database_options():
    # Exception in `_initialize_database_options`
    def mock_get_database_list():
        raise Exception("Mock exception")
    component = MockAstraDBVectorStoreComponent({})
    component.get_database_list = mock_get_database_list
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    with pytest.raises(ValueError, match="Error fetching database options: Mock exception"):
        component.reset_database_list(build_config)

def test_large_number_of_databases():
    # Large Number of Databases
    mock_data = {f"db{i}": {"collections": [f"col{i}"], "api_endpoint": f"endpoint{i}"} for i in range(1000)}
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": [f"db{i}" for i in range(1000)],
            "options_metadata": [
                {"collections": [f"col{i}"], "api_endpoint": f"endpoint{i}"} for i in range(1000)
            ],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_special_characters_in_database_names():
    # Special Characters in Database Names
    mock_data = {
        "db@1!": {"collections": ["col1"], "api_endpoint": "endpoint1"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["db@1!"],
            "options_metadata": [{"collections": ["col1"], "api_endpoint": "endpoint1"}],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_unicode_characters_in_database_names():
    # Unicode Characters
    mock_data = {
        "数据库": {"collections": ["集合"], "api_endpoint": "端点"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["数据库"],
            "options_metadata": [{"collections": ["集合"], "api_endpoint": "端点"}],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_nested_collections():
    # Nested Collections
    mock_data = {
        "db1": {"collections": [["col1", "col2"]], "api_endpoint": "endpoint1"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["db1"],
            "options_metadata": [{"collections": [["col1", "col2"]], "api_endpoint": "endpoint1"}],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)

def test_mixed_data_types_in_collections():
    # Mixed Data Types in `collections`
    mock_data = {
        "db1": {"collections": ["col1", 2, ["col3"]], "api_endpoint": "endpoint1"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    expected_config = {
        "api_endpoint": {
            "options": ["db1"],
            "options_metadata": [{"collections": ["col1", 2, ["col3"]], "api_endpoint": "endpoint1"}],
            "value": ""
        }
    }
    codeflash_output = component.reset_database_list(build_config)




def test_verify_side_effects():
    # Verify Side Effects
    mock_data = {
        "db1": {"collections": ["col1"], "api_endpoint": "endpoint1"}
    }
    component = MockAstraDBVectorStoreComponent(mock_data)
    build_config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
    component.reset_database_list(build_config)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
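The generated tests above carry no explicit assertions, but their `expected_config` values and `pytest.raises` blocks suggest what the method is expected to do. The following is a minimal standalone sketch of that behavior, inferred only from those tests. It is not the Langflow implementation: the function `reset_database_list_sketch` and its `get_database_list` callable parameter are hypothetical, and dropping extra metadata fields is an assumption.

```python
from typing import Any, Callable, Dict


def reset_database_list_sketch(
    get_database_list: Callable[[], Dict[str, Dict[str, Any]]],
    build_config: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical stand-in for the behavior the tests appear to expect."""
    try:
        databases = get_database_list()
        names = list(databases)
        # Assumption: only these two fields are kept; a missing field raises
        # KeyError here, which is wrapped in ValueError below.
        metadata = [
            {"collections": info["collections"], "api_endpoint": info["api_endpoint"]}
            for info in databases.values()
        ]
    except Exception as e:
        msg = f"Error fetching database options: {e}"
        raise ValueError(msg) from e

    field = build_config["api_endpoint"]  # KeyError if the key is absent
    field["options"] = names
    field["options_metadata"] = metadata
    return build_config


mock_data = {
    "db1": {"collections": ["col1"], "api_endpoint": "endpoint1"},
    "db2": {"collections": ["col2"], "api_endpoint": "endpoint2"},
}
config = {"api_endpoint": {"options": [], "options_metadata": [], "value": ""}}
result = reset_database_list_sketch(lambda: mock_data, config)
print(result["api_endpoint"]["options"])  # prints ['db1', 'db2']
```

A comparison in the spirit of `codeflash_output` would run the original and the optimized method on the same `build_config` and assert that both return equal dictionaries.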


erichare and others added 8 commits January 31, 2025 11:31
… improve code structure and readability

📝 (duck_duck_go_search_run.py): update DuckDuckGoSearchComponent with new display name, description, and documentation URL
📝 (duck_duck_go_search_run.py): update DuckDuckGoSearchComponent inputs with additional information and tool mode
📝 (duck_duck_go_search_run.py): update DuckDuckGoSearchComponent outputs with new output methods and display names
📝 (duck_duck_go_search_run.py): update DuckDuckGoSearchComponent methods to improve clarity and functionality
…by 49% in PR #6048 (`bugfix-dev-astradb`)

To optimize the provided Python code for better performance, we can adopt the following strategies.
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Feb 3, 2025
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Feb 3, 2025
@dosubot dosubot bot added the python Pull requests that update Python code label Feb 3, 2025
Base automatically changed from bugfix-dev-astradb to main February 3, 2025 15:53
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Feb 3, 2025