


seunshix/recurrent_neural_networks


Project: Applying an LSTM-based recurrent neural network to protein sequences to find out how long a dependency the network can capture, to generate new protein sequences, and to build 3-gram language models.

Description:

Explored the application of an LSTM-based RNN to analyze protein sequences and evaluate its ability to capture long-range dependencies. Generated new protein sequences and created 3-gram language models based on the trained network.

Techniques:

Preprocessing of protein sequence data, design and training of LSTM-based RNN, hyperparameter tuning, validation testing, generation of new protein sequences, and analysis of 3-gram language models.
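As an illustration of the preprocessing step, the sketch below encodes protein strings as integer indices and slices them into (context, next residue) training pairs. It is a minimal example under stated assumptions: the 20-letter alphabet, the window length, and the helper names `encode` and `next_residue_pairs` are hypothetical and are not taken from this repository.

```python
# Sketch only: alphabet, window length, and helper names are assumptions,
# not the repository's actual preprocessing code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
UNK = 0  # index reserved for any symbol outside the alphabet
CHAR_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence):
    """Map a protein string to a list of integer indices."""
    return [CHAR_TO_INDEX.get(residue, UNK) for residue in sequence.upper()]


def next_residue_pairs(encoded, window):
    """Yield (context, target) pairs for next-residue prediction."""
    for start in range(len(encoded) - window):
        yield encoded[start:start + window], encoded[start + window]


if __name__ == "__main__":
    ids = encode("MKTAYIAK")  # made-up example sequence
    for context, target in next_residue_pairs(ids, window=3):
        print(context, "->", target)
```

Pairs of this shape are what a next-residue LSTM would typically be trained on; a longer window gives the network more context in which to learn long-range dependencies.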

Results:

Demonstrated that the LSTM-based RNN can capture long-range dependencies in protein sequences and generate high-quality new sequences. Identified patterns and motifs in the protein sequences from the 3-gram language models, which points to potential applications of LSTM-based RNNs in bioinformatics.
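A 3-gram language model over protein sequences can be sketched as follows. This is a hedged illustration, assuming the model is estimated by counting overlapping 3-residue windows in a set of sequences (for example, sequences sampled from the trained network); the function names and the example sequences are made up and do not come from this repository.

```python
from collections import Counter, defaultdict


def trigram_counts(sequences):
    """Count every overlapping 3-residue window across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 2):
            counts[seq[i:i + 3]] += 1
    return counts


def conditional_probabilities(counts):
    """Estimate P(third residue | first two residues) by relative frequency."""
    context_totals = defaultdict(int)
    for gram, n in counts.items():
        context_totals[gram[:2]] += n
    return {gram: n / context_totals[gram[:2]] for gram, n in counts.items()}


if __name__ == "__main__":
    generated = ["MKTAYIAKQR", "MKTLLIAKQW"]  # made-up example sequences
    counts = trigram_counts(generated)
    probs = conditional_probabilities(counts)
    for gram, n in counts.most_common(3):
        print(gram, n, round(probs[gram], 2))
```

Frequent 3-grams with high conditional probability are candidates for the recurring patterns and motifs mentioned above; no smoothing is applied here, so unseen 3-grams simply have no entry.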

This project has several potential applications in bioinformatics, including:

  1. Protein engineering: The ability to generate new high-quality protein sequences using an LSTM-based RNN could be useful in designing and optimizing proteins for specific purposes, such as drug discovery or industrial applications.

  2. Protein classification: The patterns and motifs identified in the 3-gram language models could be used to classify proteins into different functional categories or predict their biological properties.

  3. Disease diagnosis and treatment: By analyzing protein sequences using an LSTM-based RNN, researchers could potentially identify disease-causing mutations or predict the effectiveness of certain treatments.

  4. Functional genomics: The ability to capture long-range dependencies in protein sequences could be useful in understanding the function and evolution of proteins, and could potentially lead to the discovery of new biological mechanisms.

Overall, this project could contribute to new tools and techniques for analyzing protein sequences and understanding their biological functions, with a broad range of applications in biotechnology and medicine.
