result figures are empty #46

Open
shiyi-pan opened this issue Nov 5, 2020 · 6 comments

@shiyi-pan

shiyi-pan commented Nov 5, 2020

Hi, I ran wgd as follows; format.SoyC09.CDS.fasta and format.SoyC09.gff are my input files:

wgd mcl -n 8 --cds --mcl -s format.SoyC09.CDS.fasta -o SoyC09.CDS.out
wgd ksd --n_threads 8 --pairwise SoyC09.CDS.out/format.SoyC09.CDS.fasta.blast.tsv.mcl format.SoyC09.CDS.fasta
wgd syn  format.SoyC09.gff format.SoyC09.CDS.fasta
wgd kde wgd_ksd/format.SoyC09.CDS.fasta.ks.tsv
wgd mix wgd_ksd/format.SoyC09.CDS.fasta.ks.tsv

I don't see any errors in the log, but the output figures such as ks.svg and dotplot.svg are empty. Could you help me fix this problem? Thank you very much. Here is my log; I deleted some repeated INFO lines because it is too big:

2020-10-31 22:45:08: INFO	makeblastdb stdout: makeblastdb: 2.2.26+
Package: blast 2.2.26, build Feb  9 2012 16:01:46
2020-10-31 22:45:08: INFO	makeblastdb stderr: 
2020-10-31 22:45:08: INFO	blastp stdout: blastp: 2.2.26+
Package: blast 2.2.26, build Feb  9 2012 16:01:46
2020-10-31 22:45:08: INFO	blastp stderr: 
2020-10-31 22:45:09: INFO	mcl stdout: mcl 14-137
Copyright (c) 1999-2014, Stijn van Dongen. mcl comes with NO WARRANTY
to the extent permitted by law. You may redistribute copies of mcl under
the terms of the GNU General Public License.
2020-10-31 22:45:09: INFO	mcl stderr: 
2020-10-31 22:45:09: INFO	Output directory: /ds3512/home/panyp/NN1138-2/04.WGD_data/NN_data/SoyC09.CDS.out does not exist, will make it.
2020-10-31 22:45:09: INFO	CDS sequences provided, will first translate.
N/A% (0 of 55927) |                      | Elapsed Time: 0:00:00 ETA:  --:--:--
  0% (94 of 55927) |                     | Elapsed Time: 0:00:00 ETA:   0:00:59
 94% (52991 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:01
 94% (53096 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:01
 95% (53334 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:01
 95% (53589 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:01
 96% (53754 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:01
 96% (53945 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:00
 96% (54206 of 55927) |################# | Elapsed Time: 0:00:31 ETA:   0:00:00
 97% (54402 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
 97% (54606 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
 97% (54805 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
 98% (55019 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
 98% (55213 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
 99% (55420 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
 99% (55623 of 55927) |################# | Elapsed Time: 0:00:32 ETA:   0:00:00
100% (55927 of 55927) |##################| Elapsed Time: 0:00:32 Time:  0:00:32
2020-10-31 22:45:48: WARNING	There were 1 warnings during translation
2020-10-31 22:45:48: INFO	Writing blastdb sequences to db.fasta.
2020-10-31 22:45:48: INFO	Writing query sequences to query.fasta.
2020-10-31 22:45:49: INFO	Performing all-vs.-all Blastp (this might take a while)
2020-10-31 22:45:49: INFO	Making Blastdb


Building a new DB, current time: 10/31/2020 22:45:49
New DB name:   /ds3512/home/panyp/NN1138-2/04.WGD_data/NN_data/SoyC09.CDS.out/38fdb5b02beba6.db.fasta
New DB title:  /ds3512/home/panyp/NN1138-2/04.WGD_data/NN_data/SoyC09.CDS.out/38fdb5b02beba6.db.fasta
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; added 55926 sequences in 2.31642 seconds.
2020-10-31 22:45:52: INFO	Running Blastp
2020-10-31 22:45:52: INFO	blastp -db /ds3512/home/panyp/NN1138-2/04.WGD_data/NN_data/SoyC09.CDS.out/38fdb5b02beba6.db.fasta -query /ds3512/home/panyp/NN1138-2/04.WGD_data/NN_data/SoyC09.CDS.out/38fdb5b09ed912.query.fasta -evalue 1e-10 -outfmt 6 -num_threads 8 -out /ds3512/home/panyp/NN1138-2/04.WGD_data/NN_data/SoyC09.CDS.out/format.SoyC09.CDS.fasta.blast.tsv
2020-11-01 03:00:51: INFO	All versus all Blastp done
2020-11-01 03:00:51: INFO	Blast done
2020-11-01 03:00:52: INFO	Performing MCL clustering (inflation factor = 2.0)
2020-11-01 03:01:05: INFO	Started MCL clustering (mcl)
2020-11-01 03:01:50: INFO	Done
2020-11-01 03:02:01: INFO	codeml stdout: AAML in paml version 4.9j, February 2020
2020-11-01 03:02:01: INFO	codeml stderr: Error: file name empty..
2020-11-01 03:02:01: INFO	codeml found
2020-11-01 03:02:01: INFO	mafft stdout: 
2020-11-01 03:02:01: INFO	mafft stderr: v7.158b (2014/06/27)
2020-11-01 03:02:02: INFO	FastTree stdout: 
2020-11-01 03:02:02: INFO	FastTree stderr: Unknown or incorrect use of option --version
  FastTree protein_alignment > tree
  FastTree < protein_alignment > tree
  FastTree -out tree protein_alignment
  FastTree -nt nucleotide_alignment > tree
  FastTree -nt -gtr < nucleotide_alignment > tree
  FastTree < nucleotide_alignment > tree
FastTree accepts alignments in fasta or phylip interleaved formats

Common options (must be before the alignment file):
  -quiet to suppress reporting information
  -nopr to suppress progress indicator
  -log logfile -- save intermediate trees, settings, and model details
  -fastest -- speed up the neighbor joining phase & reduce memory usage
        (recommended for >50,000 sequences)
  -n <number> to analyze multiple alignments (phylip format only)
        (use for global bootstrap, with seqboot and CompareToBootstrap.pl)
  -nosupport to not compute support values
  -intree newick_file to set the starting tree(s)
  -intree1 newick_file to use this starting tree for all the alignments
        (for faster global bootstrap on huge alignments)
  -pseudo to use pseudocounts (recommended for highly gapped sequences)
  -gtr -- generalized time-reversible model (nucleotide alignments only)
  -lg -- Le-Gascuel 2008 model (amino acid alignments only)
  -wag -- Whelan-And-Goldman 2001 model (amino acid alignments only)
  -quote -- allow spaces and other restricted characters (but not ' ) in
           sequence names and quote names in the output tree (fasta input only;
           FastTree will not be able to read these trees back in)
  -noml to turn off maximum-likelihood
  -nome to turn off minimum-evolution NNIs and SPRs
        (recommended if running additional ML NNIs with -intree)
  -nome -mllen with -intree to optimize branch lengths for a fixed topology
  -cat # to specify the number of rate categories of sites (default 20)
      or -nocat to use constant rates
  -gamma -- after optimizing the tree under the CAT approximation,
      rescale the lengths to optimize the Gamma20 likelihood
  -constraints constraintAlignment to constrain the topology search
       constraintAlignment should have 1s or 0s to indicates splits
  -expert -- see more options
For more information, see http://www.microbesonline.org/fasttree/
2020-11-01 03:02:02: WARNING	Output directory exists, will possibly overwrite
2020-11-01 03:02:02: INFO	Translating CDS file
N/A% (0 of 55927) |                      | Elapsed Time: 0:00:00 ETA:  --:--:--
  0% (166 of 55927) |                    | Elapsed Time: 0:00:00 ETA:   0:00:33
  0% (351 of 55927) |                    | Elapsed Time: 0:00:00 ETA:   0:00:31
  1% (580 of 55927) |                    | Elapsed Time: 0:00:00 ETA:   0:00:28
  1% (708 of 55927) |                    | Elapsed Time: 0:00:00 ETA:   0:00:28
  1% (970 of 55927) |                    | Elapsed Time: 0:00:00 ETA:   0:00:25
  2% (1235 of 55927) |                   | Elapsed Time: 0:00:00 ETA:   0:00:24
  2% (1416 of 55927) |                   | Elapsed Time: 0:00:00 ETA:   0:00:24
  2% (1668 of 55927) |                   | Elapsed Time: 0:00:00 ETA:   0:00:23
  3% (1853 of 55927) |                   | Elapsed Time: 0:00:00 ETA:   0:00:24
  3% (2014 of 55927) |                   | Elapsed Time: 0:00:00 ETA:   0:00:24
  3% (2124 of 55927) |                   | Elapsed Time: 0:00:00 ETA:   0:00:24
  4% (2322 of 55927) |                   | Elapsed Time: 0:00:01 ETA:   0:00:25
  4% (2485 of 55927) |                   | Elapsed Time: 0:00:01 ETA:   0:00:25
  4% (2648 of 55927) |                   | Elapsed Time: 0:00:01 ETA:   0:00:25
  5% (2832 of 55927) |                   | Elapsed Time: 0:00:01 ETA:   0:00:25
  5% (3023 of 55927) |#                  | Elapsed Time: 0:00:01 ETA:   0:00:25
  5% (3207 of 55927) |#                  | Elapsed Time: 0:00:01 ETA:   0:00:26
  6% (3410 of 55927) |#                  | Elapsed Time: 0:00:01 ETA:   0:00:25
  6% (3540 of 55927) |#                  | Elapsed Time: 0:00:01 ETA:   0:00:25
 95% (53553 of 55927) |################# | Elapsed Time: 0:00:27 ETA:   0:00:01
 96% (53725 of 55927) |################# | Elapsed Time: 0:00:27 ETA:   0:00:01
 96% (53926 of 55927) |################# | Elapsed Time: 0:00:27 ETA:   0:00:01
 96% (54176 of 55927) |################# | Elapsed Time: 0:00:27 ETA:   0:00:00
 97% (54356 of 55927) |################# | Elapsed Time: 0:00:27 ETA:   0:00:00
 97% (54512 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 97% (54673 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 98% (54853 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 98% (55032 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 98% (55204 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 99% (55395 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 99% (55561 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
 99% (55814 of 55927) |################# | Elapsed Time: 0:00:28 ETA:   0:00:00
100% (55927 of 55927) |##################| Elapsed Time: 0:00:28 Time:  0:00:28
2020-11-01 03:02:31: WARNING	There were 1 warnings during translation
2020-11-01 03:02:31: INFO	Started whole paranome Ks analysis
2020-11-01 03:02:31: WARNING	Filtered out the 1 largest gene families because n*(n-1)/2 > `max_pairwise`
2020-11-01 03:02:31: WARNING	If you want to analyse these large families anyhow, please raise the `max_pairwise` parameter. 
2020-11-01 03:02:31: INFO	Started analysis in parallel (n_threads = 8)
2020-11-01 03:02:32: INFO	Performing analysis on gene family GF_000002
2020-11-01 03:02:33: INFO	Performing analysis on gene family GF_000003
2020-11-01 03:02:33: INFO	Performing analysis on gene family GF_000004
2020-11-01 03:02:34: INFO	Performing analysis on gene family GF_000005
2020-11-01 03:02:34: INFO	Performing analysis on gene family GF_000006
2020-11-01 03:02:34: INFO	Performing analysis on gene family GF_000007
2020-11-01 03:02:35: INFO	Performing analysis on gene family GF_000008
2020-11-01 03:02:35: INFO	Performing analysis on gene family GF_000009
2020-11-01 03:45:53: INFO	Performing analysis on gene family GF_000010
2020-11-01 03:49:08: INFO	Performing analysis on gene family GF_000011
2020-11-01 03:58:49: INFO	Performing analysis on gene family GF_000012
2020-11-01 04:01:43: INFO	Performing analysis on gene family GF_000013
2020-11-01 04:09:27: INFO	Performing analysis on gene family GF_000014
2020-11-01 04:12:15: INFO	Performing analysis on gene family GF_000015
2020-11-01 06:35:45: INFO	Performing analysis on gene family GF_000306
2020-11-01 06:36:03: INFO	Performing analysis on gene family GF_000307
2020-11-01 06:36:09: INFO	Performing analysis on gene family GF_000308
2020-11-01 06:36:12: INFO	Performing analysis on gene family GF_000309
2020-11-01 06:36:13: INFO	Performing analysis on gene family GF_000310
2020-11-01 06:36:16: INFO	Performing analysis on gene family GF_000311
2020-11-01 06:36:20: INFO	Performing analysis on gene family GF_000312
2020-11-01 06:36:33: INFO	Performing analysis on gene family GF_000313
2020-11-01 06:36:37: INFO	Performing analysis on gene family GF_000314
2020-11-01 06:36:52: INFO	Performing analysis on gene family GF_000315
2020-11-01 08:03:23: INFO	Performing analysis on gene family GF_011430
2020-11-01 08:03:24: INFO	Performing analysis on gene family GF_011431
2020-11-01 08:03:24: INFO	Performing analysis on gene family GF_011432
2020-11-01 08:03:24: INFO	Performing analysis on gene family GF_011433
2020-11-01 08:03:24: INFO	Performing analysis on gene family GF_011434
2020-11-01 08:03:24: INFO	Performing analysis on gene family GF_011435
2020-11-01 08:03:24: INFO	Performing analysis on gene family GF_011436
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011437
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011438
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011439
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011440
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011441
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011442
2020-11-01 08:03:25: INFO	Performing analysis on gene family GF_011443
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011444
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011445
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011446
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011447
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011448
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011449
2020-11-01 08:03:26: INFO	Performing analysis on gene family GF_011450
2020-11-01 08:03:27: INFO	Performing analysis on gene family GF_011451
2020-11-01 08:03:27: INFO	Performing analysis on gene family GF_011452
2020-11-01 08:03:28: INFO	Analysis done
2020-11-01 08:03:28: INFO	Making results data frame
2020-11-01 08:13:15: INFO	Removing tmp directory
2020-11-01 08:13:34: INFO	Computing weights, outlier cut-off at Ks > 5
2020-11-01 08:13:34: INFO	Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2020-11-01 08:13:34: INFO	NumExpr defaulting to 8 threads.
2020-11-01 08:13:39: INFO	Generating plots
2020-11-01 08:13:39: INFO	Will plot **node-weighted** histograms
2020-11-01 08:13:41: INFO	Done
2020-11-01 08:13:45: INFO	i-adhore stdout: This is i-ADHoRe v3.0.
Copyright (c) 2002-2010, Flanders Interuniversity Institute for Biotechnology, VIB.
Algorithm designed by Klaas Vandepoele, Cedric Simillion, Jan Fostier, Dieter De Witte,
Koen Janssens, Sebastian Proost, Yvan Saeys and Yves Van de Peer.

Process 1/1 is alive on compute-0-1.local.
2020-11-01 08:13:45: INFO	i-adhore stderr: Error opening the settings file: -version
2020-11-01 08:13:45: WARNING	Output directory already exists, will possibly overwrite
2020-11-01 08:13:45: INFO	Parsing GFF file
2020-11-01 08:13:48: INFO	Writing gene lists
2020-11-01 08:13:49: INFO	Writing families file
2020-11-01 08:13:51: INFO	Writing configuration file
2020-11-01 08:13:51: INFO	Running I-ADHoRe 3.0
2020-11-01 08:13:55: WARNING	WARNING: Maximum allowed number of gaps in the alignment not specified.  Setting to cluster_gap.
WARNING: Tandem gap size not correct in settings file. Using default (gap_size / 2)

2020-11-01 08:13:55: INFO	
This is i-ADHoRe v3.0.
Copyright (c) 2002-2010, Flanders Interuniversity Institute for Biotechnology, VIB.
Algorithm designed by Klaas Vandepoele, Cedric Simillion, Jan Fostier, Dieter De Witte,
Koen Janssens, Sebastian Proost, Yvan Saeys and Yves Van de Peer.

Process 1/1 is alive on compute-0-1.local.


************* i-ADHoRe parameters *************
	Number of genelists = 54
	Blast table = ./wgd_syn/families.tsv
	Output path = ./wgd_syn/i-adhore-out/
	Gap size = 30
	Cluster gap size = 35
	Cloud gap size = 0
	Cloud cluster gap size = 0
	Max gaps in alignment = 35
	Tandem gap = 15
	Flush output = 1000
	Q-value = 0.75
	Anchorpoints = 3
	Probability cutoff = 0.01
	Cloud filtering method = Binomial
	Level 2 only = false
	Use family = true
	Write statistics = false
	Alignment method = GreedyGraphbased4
	Multiple hypothesis correction = FDR
	Number of threads = 1
	Compare aligners = false
	Collinear searches only
	Visualize GHM.png = false
	Visualize Alignment = true
	Verbose output = true
************ END i-AdDHoRe parameters *********

Creating dataset...			done. (time: 0.0401988s)
Mapping gene families...		done. (time: 0.424732s)
Remapping tandem duplicates...	done. (time: 0.050889s)
Writing genelists file...		done. (time: 0.0950758s)
Collinear Search
Level 2 multiplicon detection...	done. (time: 3.18828s)
Profile detection...
Flushing output files...Visualize AlignedProfiles
done.
Time for Higher Level Detection: 0.00403285s.


All Done!  Bye...



2020-11-01 08:13:55: INFO	Drawing co-linearity dotplot
2020-11-01 08:13:55: INFO	Done
/ds3512/home/panyp/ruanjian/python3/lib/python3.6/site-packages/wgd-1.2-py3.6.egg/wgd/viz.py:223: UserWarning: Attempting to set identical left == right == 0 results in singular transformations; automatically expanding.
/ds3512/home/panyp/ruanjian/python3/lib/python3.6/site-packages/wgd-1.2-py3.6.egg/wgd/viz.py:224: UserWarning: Attempting to set identical bottom == top == 0 results in singular transformations; automatically expanding.
2020-11-01 08:14:02: INFO	Preparing data frame
2020-11-01 08:14:03: INFO	 .. max_iter = 1000
2020-11-01 08:14:03: INFO	 .. n_init   = 1
2020-11-01 08:14:03: INFO	Method is GMM, interpret best model with caution!
2020-11-01 08:14:03: INFO	Fitting GMM with 1 components
2020-11-01 08:14:04: INFO	Component mean, variance, weight: 
2020-11-01 08:14:04: INFO	.. 0.283, 1.347, 1.000
2020-11-01 08:14:04: INFO	Fitting GMM with 2 components
2020-11-01 08:14:04: INFO	Component mean, variance, weight: 
2020-11-01 08:14:04: INFO	.. 0.915, 0.486, 0.390
2020-11-01 08:14:04: INFO	.. 0.134, 0.456, 0.610
2020-11-01 08:14:04: INFO	Fitting GMM with 3 components
2020-11-01 08:14:04: INFO	Component mean, variance, weight: 
2020-11-01 08:14:04: INFO	.. 0.656, 0.142, 0.231
2020-11-01 08:14:04: INFO	.. 0.136, 0.458, 0.631
2020-11-01 08:14:04: INFO	.. 1.948, 0.079, 0.138
2020-11-01 08:14:04: INFO	Fitting GMM with 4 components
2020-11-01 08:14:04: INFO	Component mean, variance, weight: 
2020-11-01 08:14:04: INFO	.. 0.130, 0.165, 0.467
2020-11-01 08:14:04: INFO	.. 1.874, 0.098, 0.147
2020-11-01 08:14:04: INFO	.. 0.065, 0.947, 0.079
2020-11-01 08:14:04: INFO	.. 0.550, 0.206, 0.306
2020-11-01 08:14:04: INFO	
2020-11-01 08:14:04: INFO	AIC assessment:
2020-11-01 08:14:04: INFO	min(AIC) = 97487.41 for model 4
2020-11-01 08:14:04: INFO	Relative probabilities compared to model 4:
2020-11-01 08:14:04: INFO	   /                          \
2020-11-01 08:14:04: INFO	   |      (min(AIC) - AICi)/2 |
2020-11-01 08:14:04: INFO	   | p = e                    |
2020-11-01 08:14:04: INFO	   \                          /
2020-11-01 08:14:04: INFO	.. model   1: p = 0.0000
2020-11-01 08:14:04: INFO	.. model   2: p = 0.0000
2020-11-01 08:14:04: INFO	.. model   3: p = 0.0000
2020-11-01 08:14:04: INFO	.. model   4: p = 1.0000
2020-11-01 08:14:04: INFO	
2020-11-01 08:14:04: INFO	
2020-11-01 08:14:04: INFO	Delta BIC assessment: 
2020-11-01 08:14:04: INFO	min(BIC) = 97580.00 for model 4
2020-11-01 08:14:04: INFO	.. model   1: delta(BIC) =  7250.57 (    >10: Very Strong)
2020-11-01 08:14:04: INFO	.. model   2: delta(BIC) =  4139.71 (    >10: Very Strong)
2020-11-01 08:14:04: INFO	.. model   3: delta(BIC) =  2174.50 (    >10: Very Strong)
2020-11-01 08:14:04: INFO	.. model   4: delta(BIC) =     0.00 (0 to  2:   Very weak)
2020-11-01 08:14:04: INFO	
2020-11-01 08:14:04: INFO	Plotting AIC & BIC
2020-11-01 08:14:04: INFO	Plotting mixtures
2020-11-01 08:14:07: INFO	Writing component-wise probabilities to file
@arzwa
Owner

arzwa commented Nov 5, 2020

Hi, that is strange. So the .tsv files for the Ks distribution and the anchor pair Ks distribution are non-empty, but the figures are empty? Do you get a plot when using

wgd viz -ks wgd_ksd/format.SoyC09.CDS.fasta.ks.tsv

?
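Independently of the plotting code, a quick way to confirm that the Ks table holds plottable values is to count its rows directly. The sketch below is not part of wgd; it assumes the file is tab-separated with a `Ks` column, and it treats values with 0 < Ks <= 5 as usable. The upper bound mirrors the "outlier cut-off at Ks > 5" line in the log, while the lower bound and the function name `summarize_ks` are assumptions made for illustration.

```python
import csv
import math


def summarize_ks(path, ks_max=5.0):
    """Return (total_rows, usable_rows) for a tab-separated Ks table.

    A row counts as usable when its Ks value parses as a finite number
    in the interval (0, ks_max]. Both bounds are illustrative assumptions.
    """
    total = 0
    usable = 0
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            total += 1
            try:
                ks = float(row["Ks"])
            except (KeyError, TypeError, ValueError):
                # Missing column, short row, or non-numeric entry.
                continue
            if math.isfinite(ks) and 0 < ks <= ks_max:
                usable += 1
    return total, usable
```

Calling `summarize_ks("wgd_ksd/format.SoyC09.CDS.fasta.ks.tsv")` and finding a usable count well above zero would suggest the data are fine and that the problem lies in the plotting step or the installation.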

@shiyi-pan
Author

shiyi-pan commented Nov 6, 2020

Thank you for your reply.
The file format.SoyC09.CDS.fasta.ks.tsv is 40 MB, and part of it is as follows:

        AlignmentCoverage       AlignmentIdentity       AlignmentLength Distance        Family  Ka      Ks      Node    Omega   PairwiseAlignmentLength Paralog1        Paralog2        WeightOutliersExcluded  WeightOutliersIncluded
SoyC09_02G004800__SoyC09_10G003700      0.95349 0.94146 1290.0  0.08526 GF_006783       0.0419  0.1203  2.0     0.3483  1230.0  SoyC09_02G004800        SoyC09_10G003700        1.0     1.0
SoyC09_02G294600__SoyC09_08G319200      0.93605 0.81884 1032.0  0.28999 GF_002388       0.1261  0.5694  8.0     0.2214  966.0   SoyC09_02G294600        SoyC09_08G319200        0.16667 0.16667
SoyC09_02G294600__SoyC09_14G011500      0.97965 0.96835 1032.0  0.03387 GF_002388       0.0148  0.0906  6.0     0.1638  1011.0  SoyC09_02G294600        SoyC09_14G011500        1.0     1.0
SoyC09_02G294600__SoyC09_16G191200      0.84012 0.90542 1032.0  0.17804 GF_002388       0.0781  0.1935  7.0     0.4036  867.0   SoyC09_02G294600        SoyC09_16G191200        0.5     0.5
SoyC09_02G294600__SoyC09_18G072700      0.93314 0.82139 1032.0  0.28693 GF_002388       0.1259  0.5511  8.0     0.2285  963.0   SoyC09_02G294600        SoyC09_18G072700        0.16667 0.16667
SoyC09_08G319200__SoyC09_14G011500      0.94186 0.82099 1032.0  0.2691  GF_002388       0.1225  0.5687  8.0     0.2154  972.0   SoyC09_08G319200        SoyC09_14G011500        0.16667 0.16667
SoyC09_08G319200__SoyC09_16G191200      0.82267 0.8033  1032.0  0.34811 GF_002388       0.1412  0.6317  8.0     0.2235  849.0   SoyC09_08G319200        SoyC09_16G191200        0.16667 0.16667
SoyC09_08G319200__SoyC09_18G072700      0.95349 0.96138 1032.0  0.04714 GF_002388       0.0251  0.0868  5.0     0.2894  984.0   SoyC09_08G319200        SoyC09_18G072700        1.0     1.0
SoyC09_14G011500__SoyC09_16G191200      0.84302 0.91494 1032.0  0.15715 GF_002388       0.0685  0.1825  7.0     0.3755  870.0   SoyC09_14G011500        SoyC09_16G191200        0.5     0.5
SoyC09_14G011500__SoyC09_18G072700      0.93895 0.8225  1032.0  0.26604 GF_002388       0.1223  0.5586  8.0     0.2189  969.0   SoyC09_14G011500        SoyC09_18G072700        0.16667 0.16667
SoyC09_16G191200__SoyC09_18G072700      0.82267 0.80565 1032.0  0.34505 GF_002388       0.139   0.633   8.0     0.2195  849.0   SoyC09_16G191200        SoyC09_18G072700        0.16667 0.16667
SoyC09_06G212400__SoyC09_07G113300      0.33538 0.75535 1950.0  0.37721 GF_001512       0.1922  0.9434  12.0    0.2037  654.0   SoyC09_06G212400        SoyC09_07G113300        0.08333 0.08333
SoyC09_06G212400__SoyC09_07G155900      0.16    0.74038 1950.0  0.36079 GF_001512       0.2306  1.196   12.0    0.1928  312.0   SoyC09_06G212400        SoyC09_07G155900        0.08333 0.08333
SoyC09_06G212400__SoyC09_12G136800      0.26308 0.7115  1950.0  0.49301 GF_001512       0.2737  0.9403  12.0    0.291   513.0   SoyC09_06G212400        SoyC09_12G136800        0.08333 0.08333
SoyC09_06G212400__SoyC09_12G150100      0.70154 0.94737 1950.0  0.05783 GF_001512       0.0271  0.1412  10.0    0.1917  1368.0  SoyC09_06G212400        SoyC09_12G150100        1.0     1.0
SoyC09_06G212400__SoyC09_12G223100      0.70154 0.7902  1950.0  0.27494 GF_001512       0.1303  0.8582  12.0    0.1518  1368.0  SoyC09_06G212400        SoyC09_12G223100        0.08333 0.08333
SoyC09_06G212400__SoyC09_13G245700      0.70154 0.80263 1950.0  0.26643 GF_001512       0.1307  0.7062  11.0    0.185   1368.0  SoyC09_06G212400        SoyC09_13G245700        0.5     0.5
SoyC09_07G113300__SoyC09_07G155900      0.15846 0.94822 1950.0  0.09738 GF_001512       0.0446  0.079   7.0     0.5652  309.0   SoyC09_07G113300        SoyC09_07G155900        1.0     1.0
SoyC09_07G113300__SoyC09_12G136800      0.24462 0.87421 1950.0  0.23508 GF_001512       0.1435  0.1424  8.0     1.0076  477.0   SoyC09_07G113300        SoyC09_12G136800        0.5     0.5

The figure produced by your command contains four histograms, with Ks, logKs, logKa and logW on the x-axes. It shows only grey bars and nothing else, such as lines. I don't know whether this is normal or not.

@arzwa
Owner

arzwa commented Nov 6, 2020

Strange, for me it works in a fresh environment with the latest version of wgd installed from scratch, also for the fragment you pasted above. It must have something to do with your installation. I recommend using virtualenv to install wgd in a separate environment. In brief what I have done to test this:

$ virtualenv venv -p python3
$ source venv/bin/activate
$ git clone https://github.com/arzwa/wgd.git
$ pip install ./wgd
$ wgd viz -ks shiyi-pan.tsv

If you encounter issues in the last step, maybe try instead

$ python3 ./wgd/wgd_cli.py viz -ks shiyi-pan.tsv

(where shiyi-pan.tsv is the fragment you pasted above [note that the file should be tab separated not whitespace separated]). This will create a fresh environment, install the latest wgd version and run wgd viz. If I do this, I get a file wgd_hist.svg with the histogram (note that for the fragment above this is only a couple of bars, but for the full file it should be a nice histogram).
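Because a fragment copied from a web page usually comes out whitespace-aligned rather than tab-separated, it has to be converted before it can be used as a test file. A minimal sketch of that conversion follows; it is not part of wgd, the function name `whitespace_to_tsv` is hypothetical, and it assumes that no field contains a space (which holds for the fragment shown in this thread).

```python
def whitespace_to_tsv(text):
    """Convert whitespace-aligned table text into tab-separated lines.

    Assumes no individual field contains whitespace.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    # The table has an unnamed first (index) column, so the header has one
    # field fewer than the data rows; pad it with an empty leading cell.
    if body and len(header) == len(body[0]) - 1:
        header = [""] + header
    lines = ["\t".join(fields) for fields in [header] + body]
    return "\n".join(lines) + "\n"
```

The returned string can then be written to a file such as `shiyi-pan.tsv` and passed to `wgd viz -ks`.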

@shiyi-pan
Author

Thank you for your help, and sorry for the late reply. I will try your suggestion. Is the FastTree output in my log normal?

2020-11-01 03:02:02: INFO FastTree stdout:
2020-11-01 03:02:02: INFO FastTree stderr: Unknown or incorrect use of option --version
FastTree protein_alignment > tree
FastTree < protein_alignment > tree
FastTree -out tree protein_alignment
FastTree -nt nucleotide_alignment > tree
FastTree -nt -gtr < nucleotide_alignment > tree
FastTree < nucleotide_alignment > tree
FastTree accepts alignments in fasta or phylip interleaved formats

Common options (must be before the alignment file):
-quiet to suppress reporting information
-nopr to suppress progress indicator
-log logfile -- save intermediate trees, settings, and model details
-fastest -- speed up the neighbor joining phase & reduce memory usage
(recommended for >50,000 sequences)
-n <number> to analyze multiple alignments (phylip format only)
(use for global bootstrap, with seqboot and CompareToBootstrap.pl)
-nosupport to not compute support values
-intree newick_file to set the starting tree(s)
-intree1 newick_file to use this starting tree for all the alignments
(for faster global bootstrap on huge alignments)
-pseudo to use pseudocounts (recommended for highly gapped sequences)
-gtr -- generalized time-reversible model (nucleotide alignments only)
-lg -- Le-Gascuel 2008 model (amino acid alignments only)
-wag -- Whelan-And-Goldman 2001 model (amino acid alignments only)
-quote -- allow spaces and other restricted characters (but not ' ) in
sequence names and quote names in the output tree (fasta input only;
FastTree will not be able to read these trees back in)
-noml to turn off maximum-likelihood
-nome to turn off minimum-evolution NNIs and SPRs
(recommended if running additional ML NNIs with -intree)
-nome -mllen with -intree to optimize branch lengths for a fixed topology
-cat # to specify the number of rate categories of sites (default 20)
or -nocat to use constant rates
-gamma -- after optimizing the tree under the CAT approximation,
rescale the lengths to optimize the Gamma20 likelihood
-constraints constraintAlignment to constrain the topology search
constraintAlignment should have 1s or 0s to indicates splits
-expert -- see more options
For more information, see http://www.microbesonline.org/fasttree/

@arzwa
Owner

arzwa commented Nov 11, 2020

Yes, this is just the result of a check that wgd performs to see whether it can run the FastTree executable. Maybe I should hide that, because it is indeed confusing.

@shiyi-pan
Author

Hi, here is my data. Could you take a look and help me find out what the problem is? Thank you very much.

gff.zip

cds.zip
