Split test data into smaller sets #6

Open
Stephen-Gates opened this issue Sep 8, 2016 · 1 comment
Comments

Stephen-Gates commented Sep 8, 2016

GoodTables only returns a limited number of errors, so the test data will need to be split into smaller sets.
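As a rough sketch of what splitting could look like, the snippet below cuts CSV text into smaller sets and repeats the header line in each one. It works on raw lines rather than parsed rows so that deliberately invalid content (blank rows, stray quotes, ragged rows) is carried over unchanged. The function name `split_csv_text` and the `max_rows` parameter are hypothetical, and the sketch assumes the first line is the header and that no quoted field contains a line break.

```python
def split_csv_text(text, max_rows):
    """Split CSV text into chunks of at most max_rows data lines.

    Hypothetical helper: the header line is repeated at the top of every
    chunk, and data lines are copied verbatim so invalid rows stay invalid.
    """
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")

    lines = text.splitlines(keepends=True)
    if not lines:
        return []

    header, data = lines[0], lines[1:]
    if not data:
        # Header-only input: return it as a single chunk.
        return [header]

    return [
        header + "".join(data[start:start + max_rows])
        for start in range(0, len(data), max_rows)
    ]
```

Each returned string can be written to its own file, so every file stays under whatever error limit applies.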

Consider a naming standard for test types:

  • type and format
  • constraints
    • required
    • unique
    • minimum, maximum
    • minLength, maxLength
    • pattern
    • enum
  • missing values
  • primary and foreign keys, duplicates
  • structural errors
    • Undeclared header: if you do not specify in a machine readable way whether or not your CSV has a header row
    • Ragged rows: if the rows in the file don't all have the same number of columns
    • Blank rows: if there are any blank rows
    • Stray/Unclosed quote: if there are any unclosed quotes in the file
    • Whitespace: if there is any whitespace between commas and double quotes around fields
    • Empty column name: if any column doesn't have a name
    • Duplicate column name: if the column names aren't all unique
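To make a few of the structural error types above concrete, here is a minimal sketch that flags empty column names, duplicate column names, blank rows, and ragged rows using only Python's standard `csv` module. It is not how GoodTables detects these errors. The function name `structural_problems` and the message strings are made up for illustration, and the first row is assumed to be the header.

```python
import csv
import io
from collections import Counter


def structural_problems(text):
    """Return a list of structural problems found in CSV text.

    Illustrative only: covers empty/duplicate column names, blank rows
    and ragged rows. Row numbers count parsed rows, header = row 1.
    """
    rows = list(csv.reader(io.StringIO(text)))
    problems = []
    if not rows:
        return problems

    header = rows[0]
    if any(name.strip() == "" for name in header):
        problems.append("empty column name")
    if any(count > 1 for count in Counter(header).values()):
        problems.append("duplicate column name")

    for number, row in enumerate(rows[1:], start=2):
        if not row or all(cell.strip() == "" for cell in row):
            problems.append(f"blank row {number}")
        elif len(row) != len(header):
            problems.append(f"ragged row {number}")

    return problems
```

A small test file per error type could then be checked against the one problem it is meant to trigger.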
Stephen-Gates commented

Splitting data into smaller sets may not be needed for automated testing, as processing limits can be set in the script.
