Using junk DNA analysis + machine learning for preconception disability risk prediction: Feasibility and starting points? #353

Gayathrif · 2025-07-23T17:24:03Z

Hello everyone,

I am a final-year CS student interested in bioinformatics. I am exploring an idea for predicting disability/disease risks in offspring before conception by analyzing parental genome patterns and junk DNA (non-coding regions) using AI/ML.

What I want to know:

1️⃣ Is it currently feasible to use non-coding/junk DNA mutation patterns, combined with machine learning, to predict the risk of diseases or disabilities pre-conception?
2️⃣ What datasets, resources, or pipelines would you recommend to start learning and experimenting in this direction practically?
3️⃣ Are there ethical or privacy considerations I should deeply understand before pursuing this line of research?

I am looking for practical advice on starting small, recommendations for tools, papers, or courses, and insight into the current limitations of using junk DNA for disease prediction.

This is a dream research direction for me, and I want to build my skills responsibly while exploring this impactful idea.

Thank you for your time and any guidance you can share!

Warm regards,
Gayathri

ialbert (Maintainer) · 2025-07-23T20:59:29Z

The questions you enumerate under point 3 are the main reason that there is no publicly available large-scale data of this kind. Some projects collect tens of thousands of genomes, but all that data is under restricted access. The first step would be to join a group that is allowed to access that data.

3 replies

Gayathrif Jul 24, 2025
Author

Thank you so much for clarifying this! I now understand why access is a core challenge. Could you recommend what kinds of groups (academic labs or projects) typically work in this area, or how a student like me could approach joining or collaborating with such groups to learn while contributing?

ialbert Jul 24, 2025
Maintainer

I have not worked on this type of data and I am not knowledgeable about the requirements for joining groups that have access to this data. I would start investigating projects such as

All of Us Research Program

https://www.researchallofus.org/

UK Biobank,

https://www.ukbiobank.ac.uk/

and many similar projects that exist in the world.

Gayathrif Jul 24, 2025
Author

Thank you so much for your kind and clear guidance!

I appreciate you taking the time to help me understand the practical challenges around data access in this area and for suggesting concrete starting points like the All of Us Research Program and the UK Biobank. I will begin exploring these resources to learn about their data collection and access procedures while building the necessary skills to contribute responsibly in this field.

Your encouragement and direction mean a lot to me as I prepare for future research, and I’m grateful for your support.

Warm regards,
Gayathri
