Primary key and sequential key cannot be the same #2096

Open
npatki opened this issue Jun 27, 2024 · 0 comments · May be fixed by #2106
Labels: bug (Something isn't working), data:sequential (Related to timeseries datasets)

npatki commented Jun 27, 2024

Environment Details

  • SDV version: 1.14.0 (latest)

Error Description

For sequential data, the primary key column (or any alternate key column) must not be the same as the sequence key column. However, the metadata object currently accepts this configuration as valid.

from sdv.metadata import SingleTableMetadata

metadata = SingleTableMetadata.load_from_dict({
    'columns': {
        'A': { 'sdtype': 'id' },
        'B': { 'sdtype': 'datetime', 'datetime_format': '%Y-%m-%d' },
        'C': { 'sdtype': 'numerical' },
        'D': { 'sdtype': 'categorical' }
    },
    'primary_key': 'A',
    'sequence_key': 'A'
})

metadata.validate()  # currently passes without raising an error

Expected Behavior

The code above should throw an error because the primary key cannot be the same as the sequence key. (The same error should also be thrown if an alternate key is the same as the sequence key.)
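As a rough illustration of the check being requested, here is a minimal sketch that works on a plain metadata dictionary. It does not use SDV itself: `KeyOverlapError` and `check_key_overlap` are hypothetical names, the `alternate_keys` entry is assumed to be a list of column names, and the error SDV would actually raise may differ.

```python
class KeyOverlapError(ValueError):
    """Hypothetical error type; SDV's real exception class may differ."""


def check_key_overlap(metadata_dict):
    """Raise if the sequence key is also the primary key or an alternate key.

    Assumes single-column keys stored under 'primary_key', 'sequence_key'
    and (optionally) a list under 'alternate_keys'.
    """
    sequence_key = metadata_dict.get('sequence_key')
    if sequence_key is None:
        # Not sequential data, so there is nothing to compare.
        return

    if sequence_key == metadata_dict.get('primary_key'):
        raise KeyOverlapError(
            f"Column '{sequence_key}' cannot be both the primary key "
            'and the sequence key.'
        )

    if sequence_key in metadata_dict.get('alternate_keys', []):
        raise KeyOverlapError(
            f"Column '{sequence_key}' cannot be both an alternate key "
            'and the sequence key.'
        )
```

Calling `check_key_overlap({'primary_key': 'A', 'sequence_key': 'A'})` raises `KeyOverlapError`, while a dictionary whose keys point to different columns passes silently.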

The same error should also be thrown when these keys are set programmatically. For example:

from sdv.metadata import SingleTableMetadata

metadata = SingleTableMetadata.load_from_dict({
    'columns': {
        'A': { 'sdtype': 'id' },
        'B': { 'sdtype': 'datetime', 'datetime_format': '%Y-%m-%d' },
        'C': { 'sdtype': 'numerical' },
        'D': { 'sdtype': 'categorical' }
    },
})

metadata.set_sequence_key('A')
metadata.set_primary_key('A') # this should throw an error
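For the programmatic case, the check would presumably need to live in both setters so that the order of the calls does not matter. The sketch below uses a hypothetical stand-in class, `MetadataSketch`, rather than SDV's `SingleTableMetadata`, and raises a plain `ValueError` as a placeholder for whatever error SDV chooses.

```python
class MetadataSketch:
    """Hypothetical stand-in for a metadata object; not SDV's implementation."""

    def __init__(self):
        self.primary_key = None
        self.sequence_key = None

    def set_primary_key(self, column_name):
        # Reject a primary key that is already registered as the sequence key.
        if column_name is not None and column_name == self.sequence_key:
            raise ValueError(
                f"'{column_name}' is already the sequence key and "
                'cannot also be the primary key.'
            )
        self.primary_key = column_name

    def set_sequence_key(self, column_name):
        # Reject a sequence key that is already registered as the primary key.
        if column_name is not None and column_name == self.primary_key:
            raise ValueError(
                f"'{column_name}' is already the primary key and "
                'cannot also be the sequence key.'
            )
        self.sequence_key = column_name
```

With this shape, setting the sequence key to 'A' and then the primary key to 'A' fails on the second call, and so does the reverse order.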