Validation failing when some columns are not named and others are #247

Open
josepajay opened this issue Apr 19, 2021 · 0 comments

@josepajay

Expected Behaviour

What should happen?

When primaryKey, foreignKeys or rowTitles in the schema refer to a column by name, validation should be applied to that column.

Current Behaviour

If a column with no name is defined before a column with a name, and the named column is used as the primaryKey, validation fails because the unique constraint is applied to the wrong column. The same could happen if the named column is used in a foreignKeys or rowTitles definition.

Steps to Reproduce

csvlint -s reproduce_bug.csv-metadata.json

csv: reproduce_bug.csv

metadata (pasted here since GitHub does not support attaching this file type):

{
  "@context": [
    "http://www.w3.org/ns/csvw",
    {
      "@language": "en"
    }
  ],
  "tables": [{
    "url": "reproduce_bug.csv",
    "tableSchema": {
      "columns": [{
        "titles": "A",
        "datatype": "string"
      }, {
        "name": "b",
        "titles": "B",
        "datatype": "string"
      }],
      "primaryKey": ["b"]
    }
  }]
}

Result

csvlint -s ~/Desktop/reproduce_bug.csv-metadata.json
Resolving dependencies...
..!!!
/Users/ajayjoseph/Desktop/reproduce_bug.csv is INVALID
1. duplicate_key. Row: 3,1. I am not a primary key
2. duplicate_key. Row: 4,1. I am not a primary key
3. duplicate_key. Row: 5,1. I am not a primary key

I think the code here is the problem. columns contains every column, but column_names is populated only for columns that have a name. Both are arrays, so if a column without a name comes before a column with a name, the two fall out of alignment: you can't look up a column's index in column_names and expect the column to be at the same index in columns.
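
To make the suspected misalignment concrete, here is a minimal Python sketch. It is not csvlint's code: the names columns and column_names follow the description above, while buggy_index, aligned_index, duplicate_rows and the sample rows are hypothetical and exist only for illustration.

```python
# Column definitions mirroring the metadata above: the first column has no name.
columns = [
    {"titles": "A", "datatype": "string"},
    {"name": "b", "titles": "B", "datatype": "string"},
]

# Assumption: names are collected only for columns that define one,
# so this list is shorter than `columns` and its indexes are shifted.
column_names = [c["name"] for c in columns if "name" in c]


def buggy_index(name):
    """Index of `name` in the compacted list (wrong when unnamed columns precede it)."""
    return column_names.index(name)


def aligned_index(name):
    """Index of `name` found by scanning the full column list."""
    for i, column in enumerate(columns):
        if column.get("name") == name:
            return i
    raise KeyError(name)


def duplicate_rows(rows, key_index):
    """Return 1-based row numbers whose key value was already seen."""
    seen = set()
    dupes = []
    for row_number, row in enumerate(rows, start=1):
        value = row[key_index]
        if value in seen:
            dupes.append(row_number)
        seen.add(value)
    return dupes


# Hypothetical data: column A repeats, column B is unique.
rows = [["same", "1"], ["same", "2"], ["same", "3"]]

print(buggy_index("b"), duplicate_rows(rows, buggy_index("b")))
print(aligned_index("b"), duplicate_rows(rows, aligned_index("b")))
```

With the compacted list, the primary key "b" resolves to index 0, so the uniqueness check runs against column A and reports duplicates. Resolving the name against the full column list gives index 1 and no duplicates, which matches the expected behaviour described above.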
