Add Ageing_Facility to the individual_fact table in the DW and fill it for legacy data. #71

Open
BHHorness-NOAA opened this issue May 12, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@BHHorness-NOAA
Collaborator

Age determinations can be biased between the labs that read the specimens. The ageing lab can therefore be an important parameter for some assessed fish species, but it has not yet been included in the data warehouse, although it is available in all trawl survey legacy databases. Chantel Wetzel requested that this parameter be included.

  • Add a laboratory_dim dimension table to standardize laboratory names/unique ids and fill with applicable laboratory names (dev, staging, prod).
  • Add a foreign key field to the individual_fact table for the laboratory_dim pk (dev, staging, prod).
  • Modify currently available Pentaho transformations to add the ageing lab id in staging db.
  • Run migration transformations to add ageing lab ids in the staging db, by source:

For 2016/2017: fram_central postgres schema
For 2018 (no ages yet from 2019): survey_central postgres schema
For 2001 to 2015: Oracle master schema

  • Push the new data from the staging database to prod.
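
The first two steps in the list above (a laboratory_dim dimension table plus a foreign key on individual_fact) could look roughly like the sketch below. It is only an illustration: an in-memory SQLite database stands in for the actual warehouse, the column names (`laboratory_id`, `laboratory_name`, `ageing_laboratory_id`, `age_years`) and the lab name "Example Lab A" are hypothetical, and only the two table names come from this issue.

```python
import sqlite3

# In-memory SQLite is a stand-in for the warehouse (assumption, not the real DW).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical column names; only the table names appear in the issue.
conn.executescript("""
CREATE TABLE laboratory_dim (
    laboratory_id   INTEGER PRIMARY KEY,
    laboratory_name TEXT NOT NULL UNIQUE
);
CREATE TABLE individual_fact (
    individual_id INTEGER PRIMARY KEY,
    age_years     INTEGER
);
""")

# Fill the dimension, including a -1 row for Missing/Unknown.
conn.executemany(
    "INSERT INTO laboratory_dim (laboratory_id, laboratory_name) VALUES (?, ?)",
    [(-1, "Missing/Unknown"), (1, "Example Lab A")],
)

# Add the foreign key field to the fact table.
conn.execute(
    "ALTER TABLE individual_fact "
    "ADD COLUMN ageing_laboratory_id INTEGER "
    "REFERENCES laboratory_dim (laboratory_id)"
)

# A row pointing at a known lab is accepted.
conn.execute("INSERT INTO individual_fact VALUES (1, 7, 1)")

# A row pointing at a lab id missing from laboratory_dim is rejected.
try:
    conn.execute("INSERT INTO individual_fact VALUES (2, 4, 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

fact_count = conn.execute("SELECT COUNT(*) FROM individual_fact").fetchone()[0]
```

Keeping the lab names in one dimension table means a misspelled or unregistered lab fails at load time instead of creating a silent variant in the fact table.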
@BHHorness-NOAA
Collaborator Author

Transferred the ageing lab id into the staging database. Note that for some years prior to 2016, the metadata for age reads was added but the age result itself was not stored in the results table, probably to avoid duplication with the individual table. Curiously, in the later years (starting in 2010? and through 2015) the duplication of age data was introduced. The Pentaho transformation was edited and rerun in the staging database.

One step remains before pushing to prod: age data for which the ageing facility is currently unknown needs to be set to -1 = Missing/Unknown. If someone has the time and inclination, Patrick McDonald may be able to dig out this information to fill these gaps.

Note also that this effort revealed that a considerable number of 2016 and 2017 age reads haven't been pushed to the DW (they do exist in the fram_central database). After this initial push for the ageing facility id, the transformation should be updated to push all applicable age data (the age, ageing date, and facility) to the DW (2016/2017 ages).
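
The remaining step of setting unknown facilities to -1 might be sketched as follows. As before, in-memory SQLite stands in for the staging database, and the column names and sample rows are hypothetical. The sketch also assumes that only rows that actually carry an age should receive -1, while rows without an age keep a NULL facility; that choice would need to be confirmed against the real data.

```python
import sqlite3

UNKNOWN_LAB_ID = -1  # -1 = Missing/Unknown, as stated in the comment above

# In-memory SQLite stands in for the staging database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE individual_fact (
    individual_id        INTEGER PRIMARY KEY,
    age_years            INTEGER,   -- hypothetical column name
    ageing_laboratory_id INTEGER    -- hypothetical column name
);
-- Made-up sample rows: aged with a lab, aged without a lab, not aged.
INSERT INTO individual_fact VALUES
    (1, 7,    3),
    (2, 12,   NULL),
    (3, NULL, NULL);
""")

# Flag aged specimens whose facility is unknown; leave unaged rows alone.
cur = conn.execute(
    "UPDATE individual_fact "
    "SET ageing_laboratory_id = ? "
    "WHERE age_years IS NOT NULL AND ageing_laboratory_id IS NULL",
    (UNKNOWN_LAB_ID,),
)
updated = cur.rowcount
conn.commit()

rows = conn.execute(
    "SELECT individual_id, ageing_laboratory_id "
    "FROM individual_fact ORDER BY individual_id"
).fetchall()
```

Recording the number of updated rows gives a simple count of the gaps that would still need to be filled from lab records later.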

@SaOgaz-NOAA
Contributor

Hi @BHHorness-NOAA, what's the status on this one?
