Dotplots.svg and ks_anchors.tsv files are empty #78

Open
muthu1722 opened this issue Sep 16, 2022 · 9 comments

@muthu1722

Hi, I used the supplementary material to run wgd. I can run everything up to wgd syn without any issue, but in the wgd_syn results the dotplot and the ks_anchors.tsv file are empty. I also tried a sample dataset to check whether my own data was causing the trouble, and I got the same issue. Apart from these two files, the histogram and the tsv.mcl files look fine for both the sample data and my own data. I am attaching my sample data and the output for your reference. It would be really helpful if you could help me resolve this.
Thank you in advance.
Muthulakshmi
sample.tar.gz

@heche-psb
Contributor

heche-psb commented Sep 16, 2022

Hi Muthulakshmi,

I downloaded the attached file "sample.tar.gz" and reran the collinearity analysis successfully. Here are my steps:

First, I reformatted the gene IDs in the CDS file "sample.fasta" so that they match the GFF file "ath.gff", in a Python session:

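>>> import pandas as pd
>>> # Read every line of the FASTA file as one row (the file has no header row)
>>> df = pd.read_csv('sample.fasta', header=None)
>>> # Split each line so that column 0 holds only the first token (the gene ID on header lines)
>>> df2 = df[0].astype(str).str.split(' | ', expand=True)
>>> # Write the shortened headers and the untouched sequence lines to a new CDS file
>>> df2[0].to_csv('sample.fa.cds', sep="\t", header=False, index=False)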

This gave me the reformatted CDS file "sample.fa.cds" for the next step.
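For readers who prefer not to route a FASTA file through pandas, the same header trimming can be sketched with the standard library alone. This is only an illustration, not the method used in the session above: the function names `trim_fasta_headers` and `rewrite_fasta` and the example header are made up, and it assumes the gene ID is the first whitespace-delimited token of each header.

```python
from pathlib import Path


def trim_fasta_headers(lines):
    """Return FASTA lines with each header cut down to its first token."""
    trimmed = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the ID: everything before the first whitespace.
            tokens = line[1:].split()
            line = ">" + (tokens[0] if tokens else "")
        trimmed.append(line)
    return trimmed


def rewrite_fasta(src, dst):
    """Read src, trim its headers, and write the result to dst."""
    lines = Path(src).read_text().splitlines()
    Path(dst).write_text("\n".join(trim_fasta_headers(lines)) + "\n")


# Made-up header in the "ID | description" style, for illustration only.
example = [
    ">AT1G01010.1 | some description | chr1:1-100",
    "ATGGAGGATCAAGTTGGG",
]
print(trim_fasta_headers(example))
```

Calling `rewrite_fasta("sample.fasta", "sample.fa.cds")` would then produce a file comparable to the one written in the session above, under the stated assumption about the header layout.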

To infer the paralogous gene families, I used the command:

$wgd dmd sample.fa.cds
2022-09-16 13:28:04: ERROR	Translation error (First codon 'TCA' is not a start codon) in seq AT1G03325.1
2022-09-16 13:28:04: ERROR	Translation error (First codon 'AGG' is not a start codon) in seq AT1G04105.1
2022-09-16 13:28:05: ERROR	Translation error (First codon 'AAT' is not a start codon) in seq AT1G13805.1
2022-09-16 13:28:05: ERROR	Translation error (Sequence length 2351 is not a multiple of three) in seq AT1G17000.1
[...]
2022-09-16 13:28:13: ERROR	Translation error (First codon 'ACG' is not a start codon) in seq ATMG01275.1
2022-09-16 13:28:13: ERROR	Translation error (Sequence length 922 is not a multiple of three) in seq ATMG01320.1
2022-09-16 13:28:14: INFO	One CDS file: will compute paranome

Note that quite a few genes could not be translated correctly.
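To see in advance which sequences are likely to trigger the two kinds of translation error shown in the log, a rough pre-check can be sketched as follows. This is a simplified stand-in and not wgd's own validation: it treats only ATG as a start codon, ignores stop codons, and the helper names `read_fasta`, `cds_problem` and `flag_problem_cds` are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA lines into a {sequence_id: sequence} dict."""
    records, current = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def cds_problem(seq):
    """Return a short description of why a CDS looks suspicious, or None."""
    seq = seq.upper()
    if not seq.startswith("ATG"):
        # Simplification: alternative start codons are not considered here.
        return f"first codon '{seq[:3]}' is not ATG"
    if len(seq) % 3 != 0:
        return f"length {len(seq)} is not a multiple of three"
    return None


def flag_problem_cds(records):
    """Map each suspicious sequence ID to the reason it was flagged."""
    flagged = {}
    for seq_id, seq in records.items():
        reason = cds_problem(seq)
        if reason is not None:
            flagged[seq_id] = reason
    return flagged


# Tiny made-up records: geneA is fine, geneB and geneC are flagged.
demo = [">geneA", "ATGAAATGA", ">geneB", "TCAAAATGA", ">geneC", "ATGAAATG"]
for seq_id, reason in flag_problem_cds(read_fasta(demo)).items():
    print(seq_id, reason)
```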

I got two files in the "wgd_dmd" directory: "sample.fa.cds.mcl" and "sample.fa.cds_sample.fa.cds.tsv".

Next, I inferred collinearity from the newly obtained gene family information (assuming I was still in the directory containing the original data):

$wgd syn -f mRNA -a ID ath.gff wgd_dmd/sample.fa.cds.mcl
2022-09-16 13:34:28: INFO	i-adhore stdout: This is i-ADHoRe v3.0.
Copyright (c) 2002-2010, Flanders Interuniversity Institute for Biotechnology, VIB.
Algorithm designed by Klaas Vandepoele, Cedric Simillion, Jan Fostier, Dieter De Witte,
Koen Janssens, Sebastian Proost, Yvan Saeys and Yves Van de Peer.

Process 1/1 is alive on localhost.
2022-09-16 13:34:28: INFO	i-adhore stderr: Error opening the settings file: -version
2022-09-16 13:34:28: INFO	Made output directory ./wgd_syn
2022-09-16 13:34:28: INFO	Parsing GFF file
2022-09-16 13:34:30: INFO	Writing gene lists
2022-09-16 13:34:30: INFO	Writing families file
2022-09-16 13:34:30: INFO	Writing configuration file
2022-09-16 13:34:30: INFO	Running I-ADHoRe 3.0
2022-09-16 13:35:03: WARNING	WARNING: Maximum allowed number of gaps in the alignment not specified.  Setting to cluster_gap.
WARNING: Tandem gap size not correct in settings file. Using default (gap_size / 2)

2022-09-16 13:35:03: INFO	
This is i-ADHoRe v3.0.
Copyright (c) 2002-2010, Flanders Interuniversity Institute for Biotechnology, VIB.
Algorithm designed by Klaas Vandepoele, Cedric Simillion, Jan Fostier, Dieter De Witte,
Koen Janssens, Sebastian Proost, Yvan Saeys and Yves Van de Peer.

Process 1/1 is alive on localhost.


************* i-ADHoRe parameters *************
	Number of genelists = 7
	Blast table = ./wgd_syn/families.tsv
	Output path = ./wgd_syn/i-adhore-out/
	Gap size = 30
	Cluster gap size = 35
	Cloud gap size = 0
	Cloud cluster gap size = 0
	Max gaps in alignment = 35
	Tandem gap = 15
	Flush output = 1000
	Q-value = 0.75
	Anchorpoints = 3
	Probability cutoff = 0.01
	Cloud filtering method = Binomial
	Level 2 only = false
	Use family = true
	Write statistics = false
	Alignment method = GreedyGraphbased4
	Multiple hypothesis correction = FDR
	Number of threads = 1
	Compare aligners = false
	Collinear searches only
	Visualize GHM.png = false
	Visualize Alignment = true
	Verbose output = true
************ END i-AdDHoRe parameters *********

Creating dataset...			done. (time: 0.022624s)
Mapping gene families...		done. (time: 0.031497s)
Remapping tandem duplicates...	done. (time: 0.0159261s)
Writing genelists file...		done. (time: 0.113681s)
Collinear Search
Level 2 multiplicon detection...	done. (time: 1.72764s)
Profile detection...
433 multiplicons to evaluate - evaluating level 2 multiplicon... 25 new multiplicons found.
[...]
2 multiplicons to evaluate - evaluating level 2 multiplicon... 0 new multiplicons found.
1 multiplicons to evaluate - evaluating level 2 multiplicon... level-2 multiplicon is redundant
Flushing output files...Visualize AlignedProfiles
badprofile (all boxes will be black! => segmentlength differs among the segs of alignment)
badprofile (all boxes will be black! => segmentlength differs among the segs of alignment)
done.
Time for Higher Level Detection: 30.8817s.


All Done!  Bye...



2022-09-16 13:35:03: INFO	Drawing co-linearity dotplot
2022-09-16 13:35:11: INFO	Done

The resulting file "sample.fa.cds.mcl.dotplot.svg" in the "wgd_syn" directory is the inferred intraspecific collinearity dotplot, shown below.

[Image: sample.fa.cds.mcl dotplot]

Given that "wgd syn" ran successfully without Ks data, I expect a run with Ks data to work as well. It is strange that your result file "sample.fasta.blast.tsv.mcl.dotplot" was empty. Do you have the log file of that failed run? We can diagnose the problem in more detail based on that log.

@muthu1722
Author

sample.tar.gz
Hi, thank you very much for your reply.
I'm really sorry it took me so long to reply; we had system issues. I have just followed your instructions and tried again with the same sample file, but it again ended up with an empty dotplot and Ks_anchors.tsv. I have attached my output as well as the log file of the wgd_syn run for your reference. Could you please help me sort this out?
Thanks in advance

@lizhao007

Hi, I have the same problem when running wgd syn. The .tsv and .mcl files seem to be right, but the Dotplots.svg and ks_anchors.tsv files are empty. No error is reported in the log file. I get the same problem with sample.fa, ath.fa and my own data. This is the result for sample.fa after running wgd syn; could you please help me sort this out?
Thanks.

[Image: sample mcl dotplot]
[Image: sample mcl ks_anchors]

@lizhao007

Meanwhile, I ran the command above as
wgd syn -f mRNA -a ID ath.gff wgd_dmd/sample.fa.cds.mcl
but got the same empty Dotplots.svg.

@heche-psb
Contributor

Could you please provide the version of wgd you were using, and the versions of the other Python packages? Thanks.
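A quick way to collect this information from inside the environment that runs wgd is sketched below. It is only a convenience: the list of packages is an illustrative selection, and `installed_version` is a hypothetical helper name.

```python
import sys

try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Backport package, needed on Python 3.7.
    from importlib_metadata import version, PackageNotFoundError


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the interpreter first, since mixed Python versions are a common pitfall.
print("python", sys.version.split()[0])
for package in ("wgd", "pandas", "numpy", "matplotlib"):  # illustrative selection
    print(package, installed_version(package))
```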

@lizhao007

Thanks for your quick answer. The version of wgd is v1.1. I think Python is v3.7, because I created a new environment and installed Python 3.7 with Conda to run wgd, but it is possible that Python 2.7 is also in this environment.

[Image]

@lizhao007

(wgd) [zhaoli@mn02 example]$ conda list
# packages in environment at /public/home/zhaoli/software/anaconda3/envs/wgd:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
biopython 1.79 pypi_0 pypi
blast 2.13.0 hf3cf87c_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
bokeh 1.4.0 pypi_0 pypi
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
certifi 2022.9.24 pyhd8ed1ab_0 conda-forge
click 8.1.3 pypi_0 pypi
cmake 3.24.2 h5432695_0 conda-forge
coloredlogs 15.0.1 pypi_0 pypi
curl 7.85.0 h7bff187_0 conda-forge
cycler 0.11.0 pypi_0 pypi
entrez-direct 16.2 he881be0_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
ete3 3.1.2 pypi_0 pypi
expat 2.4.9 h27087fc_0 conda-forge
fastcluster 1.1.25 pypi_0 pypi
fasttree 2.1.11 hec16e2b_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
fonttools 4.37.4 pypi_0 pypi
gettext 0.21.1 h27087fc_0 conda-forge
humanfriendly 10.0 pypi_0 pypi
importlib-metadata 5.0.0 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
joblib 0.11 pypi_0 pypi
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.4 pypi_0 pypi
krb5 1.19.3 h3790be6_0 conda-forge
ld_impl_linux-64 2.39 hc81fddc_0 conda-forge
libcurl 7.85.0 h7bff187_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 12.2.0 h65d4601_18 conda-forge
libgomp 12.2.0 h65d4601_18 conda-forge
libidn2 2.3.3 h166bdaf_0 conda-forge
libnghttp2 1.47.0 hdcd2b5c_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libpng 1.6.38 h753d276_0 conda-forge
libsqlite 3.39.4 h753d276_0 conda-forge
libssh2 1.10.0 haa6b8db_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_18 conda-forge
libunistring 0.9.10 h7f98852_0 conda-forge
libuv 1.44.2 h166bdaf_0 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
mafft 7.508 hec16e2b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
markupsafe 2.1.1 pypi_0 pypi
matplotlib 3.5.3 pypi_0 pypi
mcl 14.137 pl5321hec16e2b_8 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
mpi 1.0 mpich conda-forge
muscle 5.1 h9f5acd7_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
ncurses 6.3 h27087fc_1 conda-forge
numpy 1.21.6 pypi_0 pypi
openssl 1.1.1q h166bdaf_0 conda-forge
packaging 21.3 pypi_0 pypi
paml 4.9 hec16e2b_7 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
pandas 1.2.0 pypi_0 pypi
pcre 8.45 h9c3ff4c_0 conda-forge
perl 5.32.1 2_h7f98852_perl5 conda-forge
perl-archive-tar 2.40 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-carp 1.38 pl5321hdfd78af_4 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-common-sense 3.75 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-compress-raw-bzip2 2.201 pl5321h87f3376_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-compress-raw-zlib 2.105 pl5321h87f3376_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-encode 3.19 pl5321hec16e2b_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-exporter 5.72 pl5321hdfd78af_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-exporter-tiny 1.002002 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-extutils-makemaker 7.64 pl5321hd8ed1ab_0 conda-forge
perl-io-compress 2.201 pl5321h87f3376_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-io-zlib 1.11 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-json 4.10 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-json-xs 2.34 pl5321h9f5acd7_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-list-moreutils 0.430 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-list-moreutils-xs 0.430 pl5321hec16e2b_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-parent 0.236 pl5321hdfd78af_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-pathtools 3.75 pl5321hec16e2b_3 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-scalar-list-utils 1.62 pl5321hec16e2b_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
perl-types-serialiser 1.01 pl5321hdfd78af_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
pillow 9.2.0 pypi_0 pypi
pip 22.3 pyhd8ed1ab_0 conda-forge
plumbum 1.8.0 pypi_0 pypi
prank v.170427 h9f5acd7_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
progressbar2 4.1.1 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
python 3.7.13 haa1d7c7_1 defaults
python-dateutil 2.8.2 pypi_0 pypi
python-utils 3.3.3 pypi_0 pypi
pytz 2022.5 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
readline 8.1.2 h0f457ee_0 conda-forge
rhash 1.4.3 h166bdaf_0 conda-forge
scikit-learn 1.0.2 pypi_0 pypi
scipy 1.7.3 pypi_0 pypi
seaborn 0.12.1 pypi_0 pypi
setuptools 65.5.0 pyhd8ed1ab_0 conda-forge
six 1.16.0 pypi_0 pypi
sklearn 0.0 pypi_0 pypi
sqlite 3.39.4 h4ff8645_0 conda-forge
threadpoolctl 3.1.0 pypi_0 pypi
tk 8.6.12 h27826a3_0 conda-forge
tornado 6.2 pypi_0 pypi
typing-extensions 4.4.0 pypi_0 pypi
wgd 1.2 pypi_0 pypi
wget 1.20.3 ha56f1ee_1 conda-forge
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zipp 3.9.0 pypi_0 pypi
zlib 1.2.13 h166bdaf_4 conda-forge
zstd 1.5.2 h6239696_4 conda-forge

@heche-psb
Contributor

Hi, no multiplicons were found in your run. I suspect a gene ID extraction issue. Could you make sure that you give the correct feature and attribute, matching the GFF3 and family files?

> sample.tar.gz Hi, thank you very much for your reply. I'm really sorry it took me so long to reply; we had system issues. I have just followed your instructions and tried again with the same sample file, but it again ended up with an empty dotplot and Ks_anchors.tsv. I have attached my output as well as the log file of the wgd_syn run for your reference. Could you please help me sort this out? Thanks in advance
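One way to test for the gene ID mismatch suggested in the reply above is to compare the IDs extracted from the GFF3 file with those in the family file. The sketch below is an assumption-laden illustration, not part of wgd: it assumes the family file holds one family per line with whitespace-separated gene IDs, the helper names `gff_ids`, `family_ids` and `id_overlap` are hypothetical, and the example rows are made up.

```python
def gff_ids(lines, feature="mRNA", attribute="ID"):
    """Collect attribute values from GFF3 rows whose type column equals feature."""
    ids = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        # Column 9 holds semicolon-separated key=value pairs.
        for pair in cols[8].split(";"):
            key, _, value = pair.strip().partition("=")
            if key == attribute and value:
                ids.add(value)
    return ids


def family_ids(lines):
    """Collect gene IDs from a family file (assumed: one family per line)."""
    return {gene for line in lines for gene in line.split()}


def id_overlap(gff_lines, family_lines, feature="mRNA", attribute="ID"):
    """Return (family IDs also found in the GFF, total family IDs)."""
    in_gff = gff_ids(gff_lines, feature, attribute)
    in_fam = family_ids(family_lines)
    return len(in_gff & in_fam), len(in_fam)


# Made-up rows for illustration only.
gff_example = [
    "##gff-version 3",
    "Chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=AT1G01010",
    "Chr1\tsrc\tmRNA\t1\t100\t.\t+\t.\tID=AT1G01010.1;Parent=AT1G01010",
]
family_example = ["AT1G01010.1\tAT1G02230.1"]
print(id_overlap(gff_example, family_example))
```

If the first number is zero or far smaller than the second when run on the real "ath.gff" and ".mcl" files, the values passed to `-f` and `-a`, or the FASTA headers used to build the families, probably do not match the annotation.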

@heche-psb
Contributor

I think you need to check whether any multiplicons were found in the first place. The log provided by @muthu1722 shows that i-ADHoRe found no multiplicons.

> Meanwhile, I ran the command above as wgd syn -f mRNA -a ID ath.gff wgd_dmd/sample.fa.cds.mcl but got the same empty Dotplots.svg.
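Besides reading the log, the i-ADHoRe output directory can be inspected directly. The sketch below counts data rows in two result tables; the directory is the output path printed in the i-ADHoRe parameters earlier in this thread, while the file names "multiplicons.txt" and "anchorpoints.txt" and the single header line are assumptions about the i-ADHoRe output layout that should be checked against your own run. `count_data_rows` is a hypothetical helper.

```python
from pathlib import Path


def count_data_rows(path):
    """Count non-empty lines after the header line; None if the file is missing."""
    path = Path(path)
    if not path.is_file():
        return None
    lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
    # Assumes the first non-empty line is a column header.
    return max(len(lines) - 1, 0)


out_dir = Path("wgd_syn") / "i-adhore-out"  # output path as printed by i-ADHoRe
for name in ("multiplicons.txt", "anchorpoints.txt"):  # assumed file names
    print(name, count_data_rows(out_dir / name))
```

A count of zero for both files would be consistent with an empty dotplot and an empty ks_anchors.tsv, since there would be no anchor pairs to draw.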
