Thanks for making this tool, I'm finding it very useful for my current project.
I have a profile HMM database obtained from CONJScan that I want to use to scan a FASTA file containing multiple sequences.
I am running into some issues that I can't find a workaround for.
To preface my issue, let me explain what I am trying to do:
Using the CONJScan database and Python, I iterate over the profile HMMs in a for-loop. In each iteration, I use one profile HMM to search a FASTA file containing multiple sequences. At the end of each iteration, I save a graphic made with dna_features_viewer, under a unique name, that visualizes my alignments.
There are two problems I am encountering:
Occasionally, I receive an error saying that zip argument #1 must be iterable. It refers to the line for ax, hit in zip(axes, hits):, where argument #1 (axes) is not iterable. I am not sure why this happens: aside from the ad-hoc loop I wrote to go through each profile HMM in my database, everything mimics the example provided on the readthedocs.io page.
At the end of the process, I have multiple hits from different HMM profiles on the same FASTA sequence. I would like to visualize them together rather than separately. I am unsure whether I am using the tool incorrectly or whether this is currently unsupported.
My code is copied below; please excuse the messiness, as I am still testing things out.
import pyhmmer
import os
from dna_features_viewer import GraphicFeature, GraphicRecord
import matplotlib.pyplot as plt

directory = 'profiles'

# iterate over profiles in folder
# this is to iterate over a folder containing many profile HMMs (CONJScan database)
for hmmprofile in os.listdir(directory):
    f = os.path.join(directory, hmmprofile)
    if os.path.isfile(f):
        try:
            with pyhmmer.plan7.HMMFile(f) as hmm_file:
                hmm = next(hmm_file)
            # test.fasta contains many sequences in amino acid format
            with pyhmmer.easel.SequenceFile("test.fasta", digital=True) as seq_file:
                sequences = list(seq_file)
            pipeline = pyhmmer.plan7.Pipeline(hmm.alphabet)
            hits = pipeline.search_hmm(hmm, sequences)
            ali = hits[0].domains[0].alignment
            # store the name of the HMM profile in the event that a search succeeds
            hmm_name = ali.hmm_name.decode()
            # create an index so we can retrieve a Sequence from its name
            seq_index = {seq.name: seq for seq in sequences}
            fig, axes = plt.subplots(nrows=len(hits), figsize=(30, 30), sharex=True)
            try:
                for ax, hit in zip(axes, hits):
                    # add one feature per domain, using hmm_name as the feature label
                    features = [
                        GraphicFeature(start=d.alignment.target_from-1, end=d.alignment.target_to, color='#00FF00', label=hmm_name)
                        for d in hit.domains
                    ]
                    length = len(seq_index[hit.name])
                    desc = seq_index[hit.name].description.decode()
                    # render the feature records
                    record = GraphicRecord(sequence_length=length, features=features)
                    record.plot(ax=ax)
                    ax.set_title(desc)
                    try:
                        ax.figure.tight_layout()
                        # use both the description and hmm_name to build a unique file name, and save the graphic as a PNG
                        ax.figure.savefig(desc + hmm_name + ".png")
                    except Exception as e:
                        # print(e)
                        continue
            except Exception as e:
                # print(e)
                continue
        except Exception as e:
            # print(e)
            continue
Any advice you can provide would help immensely.
Thank you.
The error you're getting, zip argument #1 must support iteration, is quite transparent: the first argument to zip, here axes, is not iterable. I cannot test right now, but I suppose axes may be None when hits is empty. In modern versions of matplotlib, passing nrows=0 to subplots raises an error, but you may be using a version that just returns None there.
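Another possible cause, offered as a guess rather than a diagnosis: with its default squeeze=True, plt.subplots returns a single Axes object instead of an array when nrows=1 and ncols=1, and a single Axes is not iterable. That would match an error that only shows up occasionally, namely when a profile has exactly one hit. Passing squeeze=False to plt.subplots always yields a 2-D array; alternatively, a small helper can normalise whatever comes back. The helper name flatten_axes below is hypothetical and not part of matplotlib or pyhmmer.

```python
def flatten_axes(axes):
    """Return a flat list of axes, whatever shape plt.subplots handed back."""
    if axes is None:
        return []                 # nothing to draw on
    if hasattr(axes, "flat"):
        return list(axes.flat)    # NumPy array of Axes (1-D or 2-D)
    try:
        return list(axes)         # any other iterable of Axes
    except TypeError:
        return [axes]             # a single, non-iterable Axes object


# Intended use inside the per-profile loop (names taken from the code above):
#     if len(hits) == 0:
#         continue                # skip profiles without hits before plotting
#     fig, axes = plt.subplots(nrows=len(hits), figsize=(30, 30), sharex=True)
#     for ax, hit in zip(flatten_axes(axes), hits):
#         ...
```

Checking for an empty hits before calling plt.subplots also avoids the nrows=0 case altogether, so the loop no longer depends on how a given matplotlib version handles it.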
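On the second problem, showing hits from several HMM profiles on the same sequence in one figure: one possible approach is to stop plotting inside the per-profile loop, collect every domain first, and group the domains by target sequence. The sketch below shows only the grouping step in plain Python. The function name group_domains_by_target and the example profile names are made up for illustration. The tuples are assumed to be filled from hit.name, d.alignment.target_from - 1 and d.alignment.target_to, as in the code posted in the question.

```python
from collections import defaultdict


def group_domains_by_target(results):
    """Group (hmm_name, target_name, start, end) tuples by target sequence.

    Returns a dict mapping each target name to a list of
    (start, end, hmm_name) spans sorted by start position.
    """
    grouped = defaultdict(list)
    for hmm_name, target_name, start, end in results:
        grouped[target_name].append((start, end, hmm_name))
    for spans in grouped.values():
        spans.sort()
    return dict(grouped)


# Hypothetical input: in the real script, append one tuple per domain inside
# the profile loop instead of plotting there.
results = [
    ("profile_A", b"seq1", 10, 50),
    ("profile_B", b"seq1", 5, 20),
    ("profile_A", b"seq2", 0, 30),
]
by_target = group_domains_by_target(results)
```

After the profile loop finishes, each entry of by_target can be turned into a single GraphicRecord: one GraphicFeature(start=..., end=..., label=hmm_name) per span, plotted on one axis per sequence, so that all profiles hitting a sequence appear together.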