Uploading Dataset Error #894

Open
1 task done
shinbehavior opened this issue Oct 22, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@shinbehavior

shinbehavior commented Oct 22, 2024

Search before asking

  • I have searched the HUB issues and found no similar bug report.

HUB Component

No response

Bug

I have a .zip of my dataset, which is 58 GB (on HUB it shows as 54 GB). It has a structure similar to the example, but without a test set.
The upload itself completes in 1.5-2 hours, and after that I see this for 14 hours:

image

I am 100% sure that I had a stable Ethernet connection. What could the error be? Is it because of the missing test set?
The loading never ends.

Environment

  • HUB v0.1.60
  • Ubuntu 24.04, Thinkpad x1 gen 11
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36
  • Server Timestamp 1729580820

Minimal Reproducible Example

  1. Prepare a dataset of about 58 GB
  2. Create the folder structure and data.yaml without a test set
  3. Compress it into a .zip
  4. Upload it to HUB
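
A smaller archive with the same layout would make these steps easier to reproduce. The following is a minimal sketch, not an official tool: the function name `make_small_copy`, its parameters, and the example paths are hypothetical, and it assumes the images/<split> and labels/<split> layout with a data.yaml at the dataset root, as described in this issue.

```python
import shutil
from pathlib import Path


def make_small_copy(src, dst, per_split=20, splits=("train", "val"), yaml_name="data.yaml"):
    """Copy the first `per_split` images (and their labels) of each split, then zip the copy.

    Assumes the layout from this issue: images/<split>, labels/<split>, data.yaml at the root.
    Returns (path_to_zip, number_of_images_copied).
    """
    src, dst = Path(src).resolve(), Path(dst).resolve()
    copied = 0
    for split in splits:
        images = sorted(p for p in (src / "images" / split).iterdir() if p.is_file())
        (dst / "images" / split).mkdir(parents=True, exist_ok=True)
        (dst / "labels" / split).mkdir(parents=True, exist_ok=True)
        for img in images[:per_split]:
            shutil.copy2(img, dst / "images" / split / img.name)
            # A label file is optional: an image without one is simply copied alone.
            label = src / "labels" / split / (img.stem + ".txt")
            if label.exists():
                shutil.copy2(label, dst / "labels" / split / label.name)
            copied += 1
    shutil.copy2(src / yaml_name, dst / yaml_name)
    # make_archive appends ".zip" to the base name; the archive holds one top-level folder.
    archive = shutil.make_archive(str(dst), "zip", root_dir=dst.parent, base_dir=dst.name)
    return Path(archive), copied
```

For example, `make_small_copy("my_dataset", "my_dataset_small", per_split=50)` would write `my_dataset_small.zip` next to the copied folder. If the small archive uploads fine, the problem is more likely related to size than to structure.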

Additional

image

@shinbehavior shinbehavior added the bug Something isn't working label Oct 22, 2024
@UltralyticsAssistant
Member

👋 Hello @shinbehavior, thank you for raising an issue about the Ultralytics HUB 🚀! It seems like you're having some trouble uploading your dataset. Please check out our HUB Docs to make sure everything is set up correctly:

Since this is a 🐛 Bug Report, could you please provide a minimum reproducible example (MRE) to help us better understand the issue? This would include, if possible, a smaller dataset that still replicates the problem. You can learn more about creating an MRE here.

We appreciate your patience, and an Ultralytics engineer will join the conversation to assist you further soon. Thank you! 🌟

@pderrenger
Member

@shinbehavior hello!

Thank you for reaching out and providing detailed information about the issue you're experiencing. 😊

It sounds like you've done a great job preparing your dataset, but let's see if we can pinpoint the issue. Here are a few suggestions:

  1. Dataset Structure: While the absence of a test set shouldn't inherently cause an upload issue, it's crucial to ensure that your dataset's structure aligns perfectly with the expected format. Double-check that your data.yaml file is correctly placed in the root directory and that all paths are accurate.

  2. Dataset Size: A 58 GB dataset is quite large, and while Ultralytics HUB should handle it, the upload and processing times can be significant. Ensure that your internet connection remains stable throughout the upload process. You might also want to try uploading during off-peak hours to avoid potential network congestion.

  3. Version Check: Make sure you're using the latest version of Ultralytics HUB. Sometimes, updates include fixes for issues that might be causing your problem.

  4. Validation: Before uploading, validate your dataset locally to catch any potential errors early. You can use the ultralytics.hub.check_dataset function to ensure everything is in order.

  5. Logs and Errors: Check if there are any error messages or logs that might provide more insight into what's going wrong. Sometimes, the issue might be logged in the console or a log file.

If the problem persists after these checks, please let us know, and we can explore further options. Your patience is appreciated, and we're here to help you get your dataset uploaded successfully!
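
As a rough complement to the structure and validation suggestions above, the archive layout can be inspected locally without extracting 58 GB. This is only a sketch using the Python standard library, not the `ultralytics.hub.check_dataset` function mentioned in point 4: the helper name `check_zip_layout` is hypothetical, and it assumes the layout discussed in this thread (a data.yaml at the dataset root plus images/ and labels/ folders with train and val splits).

```python
import zipfile
from pathlib import PurePosixPath

# Folders assumed from this thread; a test split is deliberately not required.
REQUIRED_DIRS = ("images/train", "images/val", "labels/train", "labels/val")


def check_zip_layout(zip_path, yaml_name="data.yaml"):
    """Return a list of human-readable findings about the archive layout (empty if none)."""
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]

    yaml_files = [n for n in names if PurePosixPath(n).name == yaml_name]
    if not yaml_files:
        return [f"{yaml_name} not found in archive"]

    # Treat the folder holding the shallowest YAML file as the dataset root.
    root = str(PurePosixPath(min(yaml_files, key=len)).parent)
    prefix = "" if root == "." else root + "/"

    findings = []
    stems = {}
    for d in REQUIRED_DIRS:
        files = [n for n in names if n.startswith(prefix + d + "/")]
        if not files:
            findings.append(f"missing or empty folder: {d}")
        stems[d] = {PurePosixPath(n).stem for n in files}

    for split in ("train", "val"):
        unlabeled = stems[f"images/{split}"] - stems[f"labels/{split}"]
        if unlabeled:
            # Not necessarily an error: background images may have no label file.
            findings.append(f"{len(unlabeled)} image(s) in {split} have no label file")
    return findings
```

Calling `check_zip_layout("dataset.zip")` on a well-formed archive returns an empty list. It only reads the zip's file listing, so it should be quick even on a large archive, but it does not validate the label contents.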

@shinbehavior
Author

shinbehavior commented Nov 11, 2024

  1. Dataset structure is correct:
train: images/train
val: images/val

names:
    0: person

The labels are in labels/train and labels/val.
I trained on the same dataset locally without problems.

  2. I tried to upload three times with a 100% stable internet connection.

  3. I use the web app, so it was the latest version. You can check the date I opened this issue and the HUB version I provided.

  4. I will try it, but as you can see, the YAML is correct and the dataset is just folders of images and labels, which are definitely in YOLO detection format.

  5. I attached a screenshot of the log from the web console.

@sergiuwaxmann
Member

@shinbehavior https://docs.ultralytics.com/hub/datasets#upload-dataset
Can you please confirm your folder structure is correct?
Also, try adding path and test to your .yaml file.
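
One way to try this without hand-editing the file is a small text-level helper. This is only a sketch: `add_path_and_test` is a hypothetical name, and the values passed for `path` and `test` below are placeholders, since the thread does not say what they should be for this dataset.

```python
def add_path_and_test(yaml_text, path, test):
    """Return yaml_text with top-level `path` and `test` keys set.

    Works on plain text lines so it needs no YAML library; it only handles
    simple top-level `key: value` lines like the data.yaml shown in this issue.
    """
    # Drop any existing top-level path/test entries so they are not duplicated.
    kept = [ln for ln in yaml_text.splitlines() if not ln.startswith(("path:", "test:"))]

    out = [f"path: {path}"]
    for ln in kept:
        out.append(ln)
        if ln.startswith("val:"):
            out.append(f"test: {test}")  # keep the split keys grouped together
    if not any(ln.startswith("test:") for ln in out):
        out.append(f"test: {test}")  # no val line found: append at the end
    return "\n".join(out) + "\n"


original = "train: images/train\nval: images/val\n\nnames:\n    0: person\n"
# Placeholder values (assumptions): "." as the dataset root, val images reused as test.
print(add_path_and_test(original, path=".", test="images/val"))
```

Reusing the validation images as a test split is only a stand-in to check whether the missing key affects the upload. It is not a recommendation for evaluation.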

@shinbehavior
Author

@shinbehavior https://docs.ultralytics.com/hub/datasets#upload-dataset Can you please confirm your folder structure is correct? Also, try adding path and test to your .yaml file.

Yes, the folder structure is exactly the same, as I said earlier. About the test set: "While the absence of a test set shouldn't inherently cause an upload issue" is what @pderrenger said before.

@sergiuwaxmann
Member

@shinbehavior I understand, but I still suggest trying it that way.

@shinbehavior
Author

@sergiuwaxmann Okay, will try it later

@yogendrasinghx
Member

@shinbehavior To ensure your dataset is formatted correctly, please watch this YouTube video: Upload Datasets to Ultralytics HUB, which provides a step-by-step guide for creating a valid dataset for Ultralytics HUB. Once you've checked your dataset, try uploading it again following the instructions in the video and in the documentation here.

If the issue persists, please let us know. Thank you for your patience!
