Layer not included in generated report #9

Open
SPlanzer opened this issue Sep 30, 2018 · 7 comments

@SPlanzer

Below from @SsiZhang (added to track issue and my work)

I have a question about the data publish date in the NZGOAL script.

Did you use the first revision date in the history RSS feed or some other date? If the script uses the first revision date, this might be one improvement that can be added to the current process. It looks like a few recently published layers are not in the ‘publish’ list.

e.g. Bay of Plenty 0.3m Rural Aerial Photos Index Tiles (2016-2017) - https://data.linz.govt.nz/layer/95441-bay-of-plenty-03m-rural-aerial-photos-index-tiles-2016-2017/

The date range I used is from 1/Jul/18 to 30/Sep/18. The BoP imagery index layer was uploaded on 17/May/18 but was only published in the last three months. So the index layer is not in the last NZGOAL report (2017_2018 Q4, 01/Apr/18 – 30/Jun/18) and also not in the new one (2018_2019 Q1, 01/Jul/18 - 30/Sep/18).

Please let me know if you need more information.

@SPlanzer SPlanzer self-assigned this Sep 30, 2018
@SPlanzer SPlanzer added the Epic label Sep 30, 2018
@SPlanzer

Hi SiSi.

The RSS feed has two dates available:

    <published>2016-04-21T03:43:55+00:00</published>
    
    <updated>2016-05-13T01:51:41+00:00</updated>

I am using the published date to detect any new datasets.
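
As a rough illustration of that filter (not the actual code in nzgoal_audit.py), the sketch below reads `<published>` from each entry and keeps the ones inside a date window. It assumes the feed uses the Atom namespace; the sample feed, the layer titles in it, and the function name `published_between` are made up for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumption: entries live in the standard Atom namespace.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed fragment; the real LDS feed carries more fields.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example layer A</title>
    <published>2016-04-21T03:43:55+00:00</published>
    <updated>2016-05-13T01:51:41+00:00</updated>
  </entry>
  <entry>
    <title>Example layer B</title>
    <published>2018-05-17T02:37:58+00:00</published>
    <updated>2018-05-17T02:37:58+00:00</updated>
  </entry>
</feed>"""


def published_between(feed_xml, start, end):
    """Return (title, published) pairs with start <= published < end."""
    root = ET.fromstring(feed_xml)
    hits = []
    for entry in root.iter(ATOM + "entry"):
        # fromisoformat handles the "+00:00" offset on Python 3.7+
        published = datetime.fromisoformat(entry.findtext(ATOM + "published").strip())
        if start <= published < end:
            hits.append((entry.findtext(ATOM + "title"), published))
    return hits


window_start = datetime(2018, 4, 1, tzinfo=timezone.utc)
window_end = datetime(2018, 6, 1, tzinfo=timezone.utc)
hits = published_between(SAMPLE_FEED, window_start, window_end)
```

Only `<published>` is compared here; `<updated>` is ignored, which matches the behaviour described above.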

I will investigate this particular dataset later today.

Thanks for the info.

@SsiZhang

Thank you Simon.
FYI, I expanded the date range to 01/01/2017 - 30/09/2018 and found the following layers that have the same issue:
lds_id: Date Pub: Data Set Name:
95453: 2018-05-17T02:45:18+00:00 Wellington 0.3m Rural Aerial Photos Index Tiles (2016-2017)
88089: 2017-09-11T00:46:17+00:00 West Coast 0.3m Rural Aerial Photos Index Tiles (2016-2017)
95450: 2018-05-17T02:43:28+00:00 Thames Coromandel 0.05m Urban Aerial Photos Index Tiles (2015)
95496: 2018-06-08T09:17:49+00:00 Wellington 0.3m Rural Aerial Photos (2016-2017)
53561: 2017-01-20T00:36:42+00:00 Protected Areas to Parcel Association
95551: 2018-06-29T20:28:02+00:00 Thames Coromandel 0.05m Urban Aerial Photos (2015)
53596: 2017-04-23T22:12:43+00:00 NZ Roads: Address Range Road
95552: 2018-06-28T14:26:56+00:00 West Coast 0.3m Rural Aerial Photos (2016-2017)

@SPlanzer

SPlanzer commented Oct 1, 2018

Hi @SsiZhang. Concerning layer 95441 - Bay of Plenty 0.3m Rural Aerial Photos Index Tiles (2016-2017):

When I run

    python nzgoal_audit.py -F '01/04/18' -T '01/06/18' -f '~/Downloads/nzgoal.tsv' > nzgoal_results_log.txt

against the Google sheet, I get the row below in the PUBLISH group of the report:

    95441: 2018-05-17T02:37:58+00:00 Bay of Plenty 0.3m Rural Aerial Photos Index Tiles (2016-2017)

This means the ID is in the sheet, the form's outcome was "publish without restriction", and the layer has been published to the LDS. According to the RSS feed and the LDS, it was published on 17 May 2018 and last updated on 17 May 2018.

This is as I would expect. It is probably best if we talk about this one in person tomorrow.

See full results below:


    -------------------------------------------------
     _     ____  ____       _   _   _ ____ ___ _____ 
    | |   |  _ \/ ___|     / \ | | | |  _ \_ _|_   _|
    | |   | | | \___ \    / _ \| | | | | | | |  | |  
    | |___| |_| |___) |  / ___ \ |_| | |_| | |  | |  
    |_____|____/|____/  /_/   \_\___/|____/___| |_|  
    
    This utility performs the NZGOAL LDS audit by
    comparing the LDS RSS feed with the NZGOAL
    google form questionnaire as exported in tsv.
    For more information. Please see the geodetic wiki 

    e.g.
        python nzgoal_audit.py --help
        python nzgoal_audit.py -F '30/06/15' 
                               -T '1/07/16' 
                               -f './NZ Goal Data.tsv'
    -------------------------------------------------
                                                  
     

Assessing RSS Data:
| | | | | | | | | | | | | | | | | | | | | 

>>>RESULTS:
The script has found all LDS ids of those public datasets 
published between the provided dates.The results are categorised based
on the outcomes of the forms questionnaire, except those that did not find 
a matching id in the tsv/ spreadsheet. These are out-putted here under 
the section "NO CORRESPONDING LDS ID IN FORMS SPREAD SHEET (.TSV)"

----------------------------------------------------------------------------------------------------
NO CORRESPONDING LDS ID IN FORMS SPREAD SHEET (.TSV):

lds_id:	Date Pub:		Data Set Name:
95450:	2018-05-17T02:43:28+00:00	Thames Coromandel 0.05m Urban Aerial Photos Index Tiles (2015)
95453:	2018-05-17T02:45:18+00:00	Wellington 0.3m Rural Aerial Photos Index Tiles (2016-2017)
----------------------------------------------------------------------------------------------------
PUBLISH WITH RESTRICTIONS:

lds_id:	Date Pub:		Data Set Name:
----------------------------------------------------------------------------------------------------
DO NOT PUBLISH:

lds_id:	Date Pub:		Data Set Name:
----------------------------------------------------------------------------------------------------
PUBLISH:

lds_id:	Date Pub:		Data Set Name:
95451:	2018-05-17T02:44:01+00:00	Upper Hutt 0.10m Urban Aerial Photos Index Tiles (2017)
95449:	2018-05-17T02:41:55+00:00	New Plymouth 0.10m Urban Aerial Photos Index Tiles (2017)
95447:	2018-05-17T02:40:43+00:00	Napier 0.1m Urban Aerial Photos Index Tiles (2017-2018)
95446:	2018-05-17T02:40:00+00:00	Napier 0.05m Urban Aerial Photos Index Tiles (2017-2018)
95445:	2018-05-17T02:39:46+00:00	Kapiti Coast 0.10m Urban Aerial Photos Index Tiles (2017)
95443:	2018-05-17T02:38:51+00:00	Hutt City 0.10m Urban Aerial Photos Index Tiles (2017)
95442:	2018-05-17T02:38:30+00:00	Hastings 0.1m Urban Aerial Photos Index Tiles (2017-2018)
95441:	2018-05-17T02:37:58+00:00	Bay of Plenty 0.3m Rural Aerial Photos Index Tiles (2016-2017)
95440:	2018-05-17T02:37:33+00:00	Central Hawkes Bay 0.1m Urban Aerial Photos Index Tiles (2017-2018)
95439:	2018-05-17T02:37:32+00:00	Auckland 0.075m Urban Aerial Photos Index Tiles (2017)
95452:	2018-05-17T02:44:09+00:00	Wellington 0.10m Urban Aerial Photos Index Tiles (2017)

@SPlanzer

SPlanzer commented Oct 1, 2018

With respect to -F '01/01/17' -T '30/09/18':

I get all of these in the "NO CORRESPONDING LDS ID IN FORMS SPREAD SHEET (.TSV):" section. This means they have been published to the LDS, but the data manager did not fill out the NZGOAL spreadsheet.

So from the script's point of view this appears fine to me, just not from an audit point of view.


----------------------------------------------------------------------------------------------------
NO CORRESPONDING LDS ID IN FORMS SPREAD SHEET (.TSV):

lds_id:	Date Pub:		Data Set Name:
95453:	2018-05-17T02:45:18+00:00	Wellington 0.3m Rural Aerial Photos Index Tiles (2016-2017)
88089:	2017-09-11T00:46:17+00:00	West Coast 0.3m Rural Aerial Photos Index Tiles (2016-2017)
95450:	2018-05-17T02:43:28+00:00	Thames Coromandel 0.05m Urban Aerial Photos Index Tiles (2015)
95496:	2018-06-08T09:17:49+00:00	Wellington 0.3m Rural Aerial Photos (2016-2017)
53561:	2017-01-20T00:36:42+00:00	Protected Areas to Parcel Association
95551:	2018-06-29T20:28:02+00:00	Thames Coromandel 0.05m Urban Aerial Photos (2015)
53596:	2017-04-23T22:12:43+00:00	NZ Roads: Address Range Road
95552:	2018-06-28T14:26:56+00:00	West Coast 0.3m Rural Aerial Photos (2016-2017)
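
For anyone following along, here is a rough sketch of how that grouping could work. It is not lifted from nzgoal_audit.py. The TSV column names `lds_id` and `outcome`, the sample TSV contents, and the function name `group_by_outcome` are assumptions made for illustration; the two feed rows are taken from the results above.

```python
import csv
import io

# Hypothetical TSV export of the form responses; column names are assumed.
SAMPLE_TSV = "lds_id\toutcome\n95441\tPUBLISH\n95451\tPUBLISH\n"

# (lds_id, published, name) rows as they might come out of the feed.
FEED_ROWS = [
    ("95441", "2018-05-17T02:37:58+00:00",
     "Bay of Plenty 0.3m Rural Aerial Photos Index Tiles (2016-2017)"),
    ("95453", "2018-05-17T02:45:18+00:00",
     "Wellington 0.3m Rural Aerial Photos Index Tiles (2016-2017)"),
]

NO_MATCH = "NO CORRESPONDING LDS ID IN FORMS SPREAD SHEET (.TSV)"


def group_by_outcome(feed_rows, tsv_text):
    """Group feed ids by the form outcome recorded against them in the TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    outcomes = {row["lds_id"]: row["outcome"] for row in reader}
    groups = {}
    for lds_id, _published, _name in feed_rows:
        # Ids missing from the sheet fall into the "no corresponding id" group.
        groups.setdefault(outcomes.get(lds_id, NO_MATCH), []).append(lds_id)
    return groups


groups = group_by_outcome(FEED_ROWS, SAMPLE_TSV)
```

An id lands in the "no corresponding id" group purely because it is absent from the sheet, which is why these rows point at a process gap rather than a script fault.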

@SPlanzer

SPlanzer commented Oct 1, 2018

@SsiZhang Let's make some time to go over this tomorrow.

@SsiZhang

SsiZhang commented Oct 1, 2018

Thank you Simon. Yes, it would be great if we could have a chat tomorrow. I agree it's more of an audit issue.

I feel the issue is the 'publish date'. For example, the imagery index data can be uploaded to the LDS a few months before it is actually released to the public. In the BoP index example, the publish date in the report is May, but the layer wasn't actually released until recently. The RSS feed won't show a layer until it has been made available to the public, so this layer wasn't in the previous report.

I might be making the issue sound more confusing... I'll explain this in person tomorrow.

@SPlanzer SPlanzer assigned SsiZhang and unassigned SPlanzer Oct 1, 2018
@SPlanzer

SPlanzer commented Oct 1, 2018

Below is my understanding of the use case resulting in the issue and some options.

The Issue

  • Datasets are being loaded to the LDS (let's say on 05/02/18) but not published.
  • A first-quarter audit is run (let's say for the calendar-year quarter 01/01/18-01/04/18) and our record above is not picked up, as it is not published and therefore not in the RSS feed.
  • The Data Manager then publishes the dataset (let's say on 01/05/18).
    • It does not get the publish date of 01/05/18, but the date it was loaded (05/02/18).
  • An audit is then run for the second quarter (01/04/18-01/07/18).
    • The record is not picked up, as its publish date falls in the previous quarter.

@SsiZhang is the above correct?
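
To make the scenario concrete, here is a small simulation using the example dates above. It is only a model of my understanding, not the audit script. The field names `feed_published` and `made_public`, the id `example-layer`, and the run dates (the day after each window closes) are all hypothetical.

```python
from datetime import date


def audit(datasets, start, end, run_date):
    """Ids an audit run on run_date would report for the window [start, end)."""
    found = []
    for ds in datasets:
        # Only layers already released to the public appear in the feed.
        visible = ds["made_public"] <= run_date
        # The feed's published date is the load date, not the release date.
        in_window = start <= ds["feed_published"] < end
        if visible and in_window:
            found.append(ds["id"])
    return found


example = {
    "id": "example-layer",
    "feed_published": date(2018, 2, 5),  # loaded 05/02/18
    "made_public": date(2018, 5, 1),     # released 01/05/18
}

# First quarter: not yet public, so not in the feed.
q1 = audit([example], date(2018, 1, 1), date(2018, 4, 1), run_date=date(2018, 4, 2))
# Second quarter: public, but the published date falls in the first quarter.
q2 = audit([example], date(2018, 4, 1), date(2018, 7, 1), run_date=date(2018, 7, 2))
# A window covering both quarters, run after release, does pick it up.
wide = audit([example], date(2018, 1, 1), date(2018, 7, 1), run_date=date(2018, 7, 2))
```

Under these assumptions both quarterly runs come back empty and only the wider window reports the layer.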

Options

  • Documentation is updated and all audits are run for a period longer than required. For example, if audits are run quarterly, always input dates that cover a full year.
  • The script uses the RSS feed so that users running it are not required to install extra Python libraries. We could investigate whether the Koordinates API can provide more useful date information to resolve the above scenario.
  • The Koordinates platform is changed so that the RSS feed's published entry reports the date the layer was published rather than the date it was uploaded.
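
A sketch of how the first option might look in practice, assuming a 365-day look-back. Filtering out ids that earlier reports already listed is an extra step I am adding for illustration so the longer run does not re-list everything; it is not something the script does today. The names `audit_window` and `not_yet_reported` are hypothetical.

```python
from datetime import date, timedelta


def audit_window(period_end, lookback_days=365):
    """Return (start, end) dates covering lookback_days up to period_end."""
    return period_end - timedelta(days=lookback_days), period_end


def not_yet_reported(found_ids, earlier_report_ids):
    """Keep only ids that no earlier report has listed, preserving order."""
    earlier = set(earlier_report_ids)
    return [lds_id for lds_id in found_ids if lds_id not in earlier]


start, end = audit_window(date(2018, 10, 1))
# 95451 was already reported earlier; 95441 only surfaced in the wider run.
fresh = not_yet_reported(["95441", "95451"], ["95451"])
```

The resulting dates would then be passed to the script through the existing -F and -T arguments.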

@SsiZhang I have assigned this to you. Please let me know if you need any further help

Labels
Development
