Streamlining the SGID Data Introduction Process #74

Open
jacobdadams opened this issue Oct 28, 2020 · 6 comments

Comments

@jacobdadams
Member

jacobdadams commented Oct 28, 2020

We've had a couple different conversations about creating a better process for guiding people who want to add data to the SGID, so I'm putting down our notes in this issue. In the future we can branch out to a separate repo if necessary and close this issue.

Goal:
A streamlined way to screen requests for data to be added to the SGID (adding data to the Open SGID, sharing an AGOL item through the SGID Open Data site, or adding a link or other type of entry to the SGID Index) with as much automation as possible.

Benefits:

  • Clearer direction for people who want to share data
  • A chance to ensure datasets meet our metadata, quality, and applicability standards
  • Automatically capturing data for the SGID Index for Open SGID/Open Data datasets, ensuring every new dataset is in the Index and reducing fragmentation (helps keep the Index the primary list of SGID datasets).
    • This wouldn't solve the problem of keeping it updated, however.

Components:

  • Some sort of form or automated way to request data to be added to the SGID. The form would include required fields for the necessary tidbits for the Index.
    • If we use GitHub issues, could we use GitHub Actions to push data from the issue text into the db discussed below?
    • Some form of notification when new requests come in - Slack, email, carrier pigeon, etc.
  • Revamping the stewardship tab into a proper DB as we've discussed in the past. This gives us a solid end point for the data coming from the request.
    • Possibly extending this db to hold the metadata as well, so that this one db becomes the source of truth for the Open SGID, AGOL, and the SGID Index?
  • A staging group in AGOL for items that people want to share via Open Data.
    • This should be public so anyone can start the process without first asking for access.
    • A policy regarding how long people have to make any required metadata updates before we remove them from the group
    • A policy for removing items from the Open Data groups that haven't gone through the screening process.
    • A nagger bot to check for new or stale items in the staging group
  • A link checker for the SGID Index db to at least tell us if links to external datasets are totally dead
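
A rough sketch of what the link checker could look like, assuming the Index db can hand us a plain list of URLs. The function names and the choice to treat only 404/410 and unreachable hosts as "totally dead" are assumptions for illustration, not a settled design. The `opener` argument exists so the check can be exercised without network access.

```python
import urllib.error
import urllib.request


def check_link(url, opener=urllib.request.urlopen, timeout=10):
    """Return (url, status), where status is an HTTP code or an error label."""
    try:
        # HEAD avoids downloading the dataset itself.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return url, response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error code (404, 500, ...).
        return url, error.code
    except (urllib.error.URLError, TimeoutError, ValueError) as error:
        # No usable answer at all: DNS failure, refused connection, bad URL.
        return url, f"unreachable: {error}"


def find_dead_links(urls, opener=urllib.request.urlopen):
    """Return the (url, status) pairs that look totally dead."""
    dead = []
    for url in urls:
        _, status = check_link(url, opener=opener)
        # Assumption: only "not found", "gone", or no response counts as dead.
        if not isinstance(status, int) or status in (404, 410):
            dead.append((url, status))
    return dead
```

Some servers reject HEAD requests with a 405, which this sketch deliberately does not count as dead; a real version might fall back to GET in that case.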

Process Outline:

  1. Someone requests an addition via the form/issue
    • SGID Index links: link submitted via the form
    • Open SGID layers: email a copy/download link separately
    • Open Data items: share item with the staging group
  2. We review the request against our policies
    • SGID Index links: state agency?
    • Open SGID layers: SGID qualifications and metadata
    • Open Data items: metadata and Open Data presentation tips
  3. Open a Porter issue to add the data and perform the necessary steps
    • SGID Index links: pull the data straight from the issue into the Index db
    • Open SGID layers: sweeper, add data to internal, push out from internal to AGOL and Open SGID; pull data from issue into Index db
    • Open Data items: have user share with appropriate SGID AGOL group, remove from staging group; pull data from issue into Index db.
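
To make step 3 a bit more concrete, here is a minimal sketch of pulling fields out of an issue body and picking the follow-up steps by request type. It assumes the request arrives as text with `### Field` headings followed by the value (the layout GitHub issue forms produce). The field names, the request type labels, and the function names are all hypothetical placeholders.

```python
import re

# Step-3 actions from the outline, keyed by hypothetical request type labels.
STEPS_BY_TYPE = {
    "SGID Index link": ["pull data from issue into Index db"],
    "Open SGID layer": [
        "run sweeper",
        "add data to internal",
        "push from internal to AGOL and Open SGID",
        "pull data from issue into Index db",
    ],
    "Open Data item": [
        "have user share with appropriate SGID AGOL group",
        "remove from staging group",
        "pull data from issue into Index db",
    ],
}

# Assumed required fields; the real list would come from what the Index needs.
REQUIRED_FIELDS = ("Request type", "Dataset name", "Steward contact")


def parse_issue_body(body):
    """Split an issue body into {heading: text} using '### Heading' lines."""
    fields = {}
    current = None
    for line in body.splitlines():
        match = re.match(r"^###\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            fields[current] = []
        elif current is not None:
            fields[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in fields.items()}


def plan_request(body):
    """Return (fields, steps) for a request, or raise ValueError if incomplete."""
    fields = parse_issue_body(body)
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    request_type = fields["Request type"]
    if request_type not in STEPS_BY_TYPE:
        raise ValueError(f"unknown request type: {request_type}")
    return fields, STEPS_BY_TYPE[request_type]
```

The parsed `fields` dict is what would eventually be written to the Index db; rejecting incomplete requests up front keeps half-filled entries out of it.
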
@jacobdadams
Member Author

The topic of data sharing agreements came up during the last Dev/Data meeting. We could/should add that as a required item at some point in the process to make sure the provider understands both our and their roles and responsibilities.

Agreement Content:
SGID Index links/Open Data Items

  • Steward's responsibilities:
    • Create and update metadata according to our standards (and maybe this is a soft "should" instead of "must")
    • Not remove layer or alter schema without a heads up (2 weeks? 4 weeks?) whenever possible
    • Inform AGRC as soon as possible if layers are removed/schema changed or if the link for the SGID Index changes
    • Maintain up-to-date contact info (name/email/phone)
    • Use standard AGRC disclaimer/license wherever possible
  • AGRC Responsibilities
    • Display current info in SGID Index
    • Honor/display any custom license info

Open SGID Layers

  • Steward's Responsibilities
    • Create and update metadata according to our standards (and maybe this is a soft "should" instead of "must")
    • Maintain up-to-date contact info (name/email/phone)
    • Use standard AGRC disclaimer/license wherever possible
    • Provide data updates in a timely manner
  • AGRC Responsibilities
    • Display current info in SGID Index
    • Honor/display any custom license info
    • Not remove layer or alter schema without a heads up (2 weeks? 4 weeks?) whenever possible
    • Use the Porter process for deprecation
    • Maintain backups of database (for restoring the db in case of emergency only; not to be treated as an agency's backup for their data)
    • Abide by SGID Database policy (gis.utah.gov/policy/sgid)

Am I missing anything here? Is this something we want to do?

@steveoh
Member

steveoh commented Nov 3, 2020

Steward responsibilities might be updated to require that sweeper checks pass and that coded value domains aren't an issue.

@gregbunce
Member

a sweeper check for coded value domains is a great idea.
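
As a rough illustration of what such a check might do, here is a standalone sketch that flags values falling outside a field's coded value domain. It does not use sweeper's actual API; the function name, the in-memory row layout, and the decision to let nulls pass are all assumptions.

```python
def find_domain_violations(rows, domains):
    """Return a list of (row_index, field, value) for values outside a domain.

    rows: iterable of dicts, one per feature (hypothetical in-memory layout).
    domains: {field_name: set of allowed codes}.
    """
    violations = []
    for index, row in enumerate(rows):
        for field, allowed in domains.items():
            value = row.get(field)
            # Assumption: nulls are allowed; only non-null, unlisted codes fail.
            if value is not None and value not in allowed:
                violations.append((index, field, value))
    return violations
```

For example, with `domains = {"STATUS": {"A", "I"}}`, a row whose `STATUS` is `"X"` would be reported while rows with `"A"` or a null would pass.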

@steveoh
Member

steveoh commented Nov 17, 2020

What are the action items for this? Has everyone generally agreed that it is a good idea and we want to move forward with it?

@jacobdadams
Member Author

I created a "Data Submission Process" project board in the repo.

Do you think it's worth creating a custom issue tag and turning the board's cards into issues we can track individually, or would that just clutter up the issue page even more than it is already?

I think once we get that hammered out we can close this issue and move any discussion to the board/issues.

@steveoh
Member

steveoh commented Nov 20, 2020

I'm good with a custom label and the project board looks good.
