10.24433/CO.0903909.V1
Ying Wang
Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
Kun Wang
Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
Yang Young Lu
Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
Fengzhu Sun
Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
Improving contig binning of metagenomic data using d2S oligonucleotide frequency dissimilarity
Code Ocean
2019
Capsule
Capsule
Bioinformatics
metagenomics
contigs-binning
Ying Wang
2019-05-08
en-US
d3b64a54-ecdc-442a-b624-31a34ae23d1f
4020390
http://dx.doi.org/10.1186/s12859-017-1835-1
10.1186/s12859-017-1835-1
1.0
GNU General Public License (GPL)
No Rights Reserved (CC0)
d2SBin is an easy-to-use tool for improving contig binning: it reassigns contigs among bins, starting from the output of any existing binning tool. The tool is taxonomy-free and relies only on the k-tuples of a single metagenomic sample.
d2SBin rests on the observation that relative sequence composition is similar across different regions of the same genome but differs between genomes. Current tools generally use the normalized k-tuple frequencies directly, which capture absolute rather than relative sequence composition. We therefore model the relative sequence composition and measure the dissimilarity between contigs with d2S. We applied d2SBin to adjust the outputs of five widely used contig-binning tools on six datasets. The experiments showed that d2SBin significantly improves contig-binning performance.
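To make the idea of relative composition concrete, the following is a minimal sketch of a d2S-style dissimilarity between two contigs. It is not taken from the d2SBin code. It assumes the form of d2S commonly given in the alignment-free sequence comparison literature, and it assumes a zeroth-order (i.i.d.) background model estimated from each contig's own base frequencies. The function names (`kmer_counts`, `centered_counts`, `d2s`) and the default `k=4` are illustrative choices only.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-tuples that contain only A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set(BASES):
            counts[word] += 1
    return counts


def centered_counts(seq, k):
    """Observed k-tuple counts minus their expected counts.

    Assumption: the expectation uses an i.i.d. background model built
    from the single-base frequencies of the sequence itself.
    """
    seq = seq.upper()
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    base = Counter(c for c in seq if c in BASES)
    n_bases = sum(base.values())
    freq = {b: (base[b] / n_bases if n_bases else 0.0) for b in BASES}
    centered = {}
    for letters in product(BASES, repeat=k):
        word = "".join(letters)
        expected = float(total)
        for b in word:
            expected *= freq[b]
        centered[word] = counts[word] - expected
    return centered


def d2s(seq_a, seq_b, k=4):
    """d2S-style dissimilarity in [0, 1]; 0 means identical relative composition."""
    x = centered_counts(seq_a, k)
    y = centered_counts(seq_b, k)
    num = norm_x = norm_y = 0.0
    for word in x:
        denom = sqrt(x[word] ** 2 + y[word] ** 2)
        if denom == 0:
            continue  # this k-tuple carries no signal in either contig
        num += x[word] * y[word] / denom
        norm_x += x[word] ** 2 / denom
        norm_y += y[word] ** 2 / denom
    if norm_x == 0 or norm_y == 0:
        return 0.5  # illustrative fallback: no usable signal
    return 0.5 * (1 - num / (sqrt(norm_x) * sqrt(norm_y)))
```

Subtracting the expected count is what turns an absolute k-tuple frequency into a relative one: a k-tuple that is frequent only because its bases are frequent contributes little. A binning refinement step could compare each contig against the contigs of every bin with such a measure, but how d2SBin actually performs the reassignment is not described here.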
The d2SBin pipeline was developed in Python and runs on Unix and Linux platforms. d2SBin is available at https://github.com/kunWangkun/d2SBin.