Error: Some checklists in EBD are missing from sampling event data. #46
Comments
You can't use different versions of the EBD and sampling event data. You have a Mar-2020 EBD and an Aug-2020 sampling event data. I understand that since you have sensitive data you probably can't get an Aug-2020 version. It is possible to combine these, but you'll have to do it manually. I'd start by using auk to subset the sampling event data:
Then read in the EBD directly, no need to subset it first since it's a small file, and subset both the EBD and SED to have the same set of checklists.
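As a rough sketch of that subsetting step, here is a version using toy data frames in place of the real files. The names `ebd` and `sed` are hypothetical stand-ins for the imported EBD and sampling event tables, the values are made up, and only base R is used:

```r
# Hypothetical toy tables standing in for the imported EBD and sampling
# event data; only the checklist_id column matters for this step.
ebd <- data.frame(
  checklist_id = c("S1", "S2", "S4"),
  observation_count = c("2", "X", "1"),
  stringsAsFactors = FALSE
)
sed <- data.frame(
  checklist_id = c("S1", "S2", "S3"),
  all_species_reported = c(TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Checklists that appear in both tables.
common <- intersect(ebd$checklist_id, sed$checklist_id)

# Restrict both tables to that shared set.
ebd_sub <- ebd[ebd$checklist_id %in% common, ]
sed_sub <- sed[sed$checklist_id %in% common, ]
```

One thing to watch for: if the EBD holds a single species, every checklist left in `sed_sub` has a detection of that species, so it is worth checking how many non-detection checklists survive this step.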
I don't have time to actually test any of this, so you may need to try it out and adjust the code, but this should get you started. |
Thanks,
I removed unique = FALSE, because otherwise I had no column called checklist_id. But when I run the command to intersect the files, I get no absences: the zero-filled data contains only checklists where the species was recorded. Thanks again, |
Hmmm, as I think about this more, I don't think you can correctly zero fill the data without the matching sampling event data. I think you'll need to request the most recent version of the Great Green Macaw data so it will match the sampling event data. |
I wanted to follow up on this issue as I'm having a similar problem with auk_zerofill giving the error: "Some checklists in EBD are missing from sampling event data." In my case I have ensured that the versions of the EBD and sampling event data match (both are Jan-2021). However, I am using a custom downloaded EBD dataset (all observations in Canada) and the full sampling event data. Based on a previous issue (now closed -- see here), I'm wondering if a mismatch between the custom dataset and the full sampling event data is the underlying issue? Unfortunately, it seems the only way to check this would be to download the complete EBD, and at 90GB I'll admit to being a bit reluctant. I read in both of the successfully filtered EBD and sampling event files (via read_ebd and ebd_sampling, respectively) and they contain different numbers of records (2864 vs. 2052 for my particular filters -- a bounding box in Alberta). So that is probably the issue. But when I try the suggestion from @mstrimas to subset manually, I end up with only 454 checklist_id values in common. This is my first project looking at the eBird data, so maybe I'm missing something here, but it seems there is something strange, and maybe zero-filling REQUIRES the full datasets? |
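One way to narrow this down is to compare distinct checklist IDs rather than row counts, since the EBD has one row per species per checklist while the sampling event data has one row per checklist. A minimal sketch with made-up data frames (`ebd` and `sed` are hypothetical stand-ins for the two imported tables, and only base R is used):

```r
# Hypothetical toy tables; the EBD repeats a checklist_id once per species.
ebd <- data.frame(
  checklist_id = c("S1", "S1", "S2", "S4"),
  common_name = c("Mallard", "Canada Goose", "Mallard", "Mallard"),
  stringsAsFactors = FALSE
)
sed <- data.frame(
  checklist_id = c("S1", "S2", "S3"),
  stringsAsFactors = FALSE
)

# Compare distinct checklists, not rows.
ebd_ids <- unique(ebd$checklist_id)
sed_ids <- unique(sed$checklist_id)

n_ebd <- length(ebd_ids)
n_sed <- length(sed_ids)
n_common <- length(intersect(ebd_ids, sed_ids))

# Checklists present in the EBD but absent from the sampling event data.
missing_from_sed <- setdiff(ebd_ids, sed_ids)
print(missing_from_sed)
```

If `missing_from_sed` is non-empty, those are presumably the checklists the error message is complaining about, and looking at their dates or locations may show whether the two files were filtered differently.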
Hi @gking-aug, just wondering if you ever found a solution to your problem? I am having an almost identical issue, and am having trouble troubleshooting it myself. Did you end up needing to download the full EBD dataset? Or did you find a way to match up the custom download EBD and sampling event files for zero-filling? Thanks! |
Hi @BrittanyHBrown. This is a really good question -- the project was a directed reading and I haven't touched it in a while. Let me quickly investigate what I ended up doing and I will follow up and post here. |
Building off the initial question in this thread, I am also new to auk and getting the same error. In my case, I am trying to use auk_zerofill on multiple datasets independently. My code works for all but one dataset, even though, from what I can tell, it's exactly the same. I have ensured that the months the data cover are consistent and that all species are reported. Here is my code:

```r
# My code works for 2019 (in addition to 5 other years of data)
US2019checksub <- subset(US2019check, all_species_reported == TRUE)
zfUS19 <- auk_zerofill(US2019obssub, US2019checksub, collapse = TRUE)

# When I replicate this for 2020 data, I get the error that some
# checklists from the EBD are missing sampling event data
US2020checksub <- subset(US2020check, all_species_reported == TRUE)
zfUS20 <- auk_zerofill(US2020obssub, US2020checksub, collapse = TRUE)
```

If anyone has any ideas of what might be going on, I'd really appreciate some feedback! I have tried re-downloading the 2020 dataset a couple of times in case there was something wrong with the download, but I get the same error. |
First, you should be using
Should be changed to
If you're still having problems after making that change, please post the error and we can try to troubleshoot it. Thanks! |
Thanks for the catch on the read_ebd @mstrimas. I updated that portion of my code and am still getting the same error.
I am stumped because the same code is working on other datasets. Thanks! |
This is a rare bug that I've described here: #79 (comment). In your case, right before you call
|
Hello,
I'm new to auk, and working with data for Great Green Macaws to estimate presence/absence in different seasons. I've filtered my EBD and sampling event data to Costa Rica and then attempted to zero-fill them. Since I'm working with a sensitive species, I'm using a customized EBD. However, I am getting an error that some checklists in the EBD are missing from the sampling event data. I tried filtering on the last edited date to exclude checklists that were added after Mar-2020. Here is my code:
I'm wondering if anyone has any insight into why this may be the case, and how I could solve it, considering that I can't download a more recent customized EBD file.
Thanks,
Sofia