Duplicate Flowpath attributes in v2.2 #280

Open
taddyb opened this issue Nov 7, 2024 · 2 comments
taddyb commented Nov 7, 2024

Problem

  • When subsetting the v2.2 hydrofabric, I noticed that masking conus_nextgen.gpkg by ID returns duplicate IDs in the flowpath-attributes layer
  • An example is wb-2436960

Steps to reproduce

import geopandas as gpd

flowpath_attributes = gpd.read_file("/Users/taddbindas/Downloads/conus_nextgen.gpkg", layer="flowpath-attributes")
mask = flowpath_attributes["id"] == "wb-2436960"
tmp = flowpath_attributes[mask]  # two rows share this id
tmp.iloc[0]  # first duplicate; tmp.iloc[1] shows the second

Here are the outputs, shown side by side with one duplicate row per column. The only attribute that differs between the two rows is gage (08158920 vs. 08158927).

link                 wb-2436960    |    link                 wb-2436960
to                  nex-2436956    |    to                  nex-2436956
Length_m            9249.986199    |    Length_m            9249.986199
Y                      0.511353    |    Y                      0.511353
n                          0.06    |    n                          0.06
nCC                        0.12    |    nCC                        0.12
BtmWdth                6.861904    |    BtmWdth                6.861904
TopWdth                7.839788    |    TopWdth                7.839788
TopWdthCC             23.519365    |    TopWdthCC             23.519365
ChSlp                  0.956172    |    ChSlp                  0.956172
alt                  295.547058    |    alt                  295.547058
So                      0.00993    |    So                      0.00993
MusX                        0.2    |    MusX                        0.2
MusK                     3600.0    |    MusK                     3600.0
gage                   08158920    |    gage                   08158927
gage_nex_id         nex-2436956    |    gage_nex_id         nex-2436956
WaterbodyID                None    |    WaterbodyID                None
waterbody_nex_id           None    |    waterbody_nex_id           None
id                   wb-2436960    |    id                   wb-2436960
toid                nex-2436956    |    toid                nex-2436956
vpuid                        12    |    vpuid                        12
Name: 554848, dtype: object             Name: 554849, dtype: object
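
Rather than comparing the two rows by eye, the differing column can be found programmatically. The following is a minimal sketch, assuming pandas is available; the small DataFrame is hand-built from a few of the values printed above instead of being read from the GeoPackage, and the variable names n_distinct and differing are illustrative.

```python
import pandas as pd

# Two duplicate rows, rebuilt by hand from a subset of the columns printed above.
tmp = pd.DataFrame(
    {
        "id": ["wb-2436960", "wb-2436960"],
        "toid": ["nex-2436956", "nex-2436956"],
        "Length_m": [9249.986199, 9249.986199],
        "gage": ["08158920", "08158927"],
        "gage_nex_id": ["nex-2436956", "nex-2436956"],
    }
)

# Count distinct values per column; dropna=False treats None/NaN as a value too.
n_distinct = tmp.nunique(dropna=False)

# Columns with more than one distinct value are the ones that differ between rows.
differing = n_distinct[n_distinct > 1].index.tolist()
print(differing)  # ['gage']
```

The same two lines should work on the full `tmp` selection from the reproduction steps, since they only rely on per-column value counts.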

Temporary Solution:

  • Just removing duplicates since the gage field isn't important for my use case (T-route)

@mikejohnson51 (Collaborator)

Morning! Thanks for this. Assuming those gages are on the same aggregated flowpath, this is what I would expect to see. The "proper" way to work with both the flowpath-attributes and network layers (given their many-to-many properties) is to work only with distinct rows, subset to the columns you are interested in. So in this case, for t-route I would define the columns you need, extract only those, run distinct() (or the Python equivalent) on the resulting data.frame, and proceed.

Does that make sense?
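
In Python, a rough equivalent of the subset-then-distinct() approach is column selection followed by pandas' `drop_duplicates()`. This is a sketch under stated assumptions: the rows are rebuilt by hand from the output earlier in the thread, and the column list `routing_cols` is only an illustrative choice, not the set of inputs t-route actually requires.

```python
import pandas as pd

# Stand-in for the flowpath-attributes layer: the two duplicate rows from the
# issue, limited to a handful of columns.
flowpath_attributes = pd.DataFrame(
    {
        "id": ["wb-2436960", "wb-2436960"],
        "toid": ["nex-2436956", "nex-2436956"],
        "Length_m": [9249.986199, 9249.986199],
        "n": [0.06, 0.06],
        "So": [0.00993, 0.00993],
        "gage": ["08158920", "08158927"],
    }
)

# Assumption: an illustrative set of routing columns that leaves out gage.
routing_cols = ["id", "toid", "Length_m", "n", "So"]

# Subset first, then keep only distinct rows (the dplyr distinct() analogue).
routing = flowpath_attributes[routing_cols].drop_duplicates().reset_index(drop=True)

print(len(routing))            # 1
print(routing["id"].is_unique)  # True
```

Subsetting before deduplicating matters here: with the gage column still present, the two rows are not identical, so `drop_duplicates()` on the full table would keep both.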

taddyb commented Nov 7, 2024

Yup! Thanks Mike for the insight. Subsetting the flowpath columns cleared up the issue without needing to drop duplicates as the gage column was no longer needed.

I'm good to have this issue closed, since my case is working and the duplicated gage above looks like expected output for v2.2.
