A complete pipeline of free bioinformatics tools for de novo transcriptome assembly and SSR primer design

Naranpanawa, D. N. U.; Chandrasekara, C. H. W. M. R. B.; Bandaranayake, P. C. G.; Bandaranayake, A. U.

dc.contributor.author	Naranpanawa, D. N. U.
dc.contributor.author	Chandrasekara, C. H. W. M. R. B.
dc.contributor.author	Bandaranayake, P. C. G.
dc.contributor.author	Bandaranayake, A. U.
dc.date.accessioned	2024-11-25T06:39:12Z
dc.date.available	2024-11-25T06:39:12Z
dc.date.issued	2019-09-12
dc.description.abstract	During the past few decades, next-generation sequencing technologies have advanced rapidly in throughput and speed while sequencing costs have fallen. This has revolutionized the field of genomics, allowing the production of vast datasets. However, the methods and software needed to analyze these data and interpret their biological meaning correctly have not kept pace. One such limitation is the unaffordable price of commercially available bioinformatics software. Hence, only a small fraction of genomes and transcriptomes have been completely assembled and annotated. The lack of reference genomes for comparative assembly leads to the computationally more challenging de novo assembly. In addition, obtaining an assembly is a complex process that requires many steps and several complex tools. As a result, beginners in bioinformatics may find the analysis procedures too complicated and time-consuming, given the associated learning curve. Therefore, to aid novice biologists in assembling sequence data, and to ease this bottleneck in computational biology and bioinformatics, we present a complete pipeline of freely available bioinformatics software for de novo transcriptome assembly. The pipeline was developed by combining several individual software tools through user-friendly shell scripts. To test the pipeline, we used Illumina HiSeq paired-end RNA-seq reads from four oil-producing Santalum album (sandalwood) tree samples from a published study. The raw data were first filtered for low-quality reads, trimmed for adapters and normalized. Assembly was performed with the Trinity de novo assembler. The quality of the assembly was assessed with BUSCO, Bowtie2 and TransRate, all of which indicated a high-quality assembly. To further validate the accuracy of the assembly, we used the assembled transcriptome to identify gene-specific Simple Sequence Repeat (SSR) markers. Primers were designed for eight S. album oil biosynthetic genes and two control genes, and were validated in the laboratory with the respective samples. All primers amplified successfully, confirming the designed workflow. Furthermore, five SSR markers that were polymorphic among the tested sandalwood accessions are potential markers for use in sandalwood breeding programs. To the best of our knowledge, this is the first attempt to develop a user-friendly, validated assembly pipeline with free bioinformatics software and tools, provided with detailed documentation.
dc.identifier.isbn	978-955-589-282-7
dc.identifier.uri	https://ir.lib.pdn.ac.lk/handle/20.500.14444/4011
dc.language.iso	en_US
dc.publisher	University of Peradeniya
dc.subject	De novo assembly
dc.subject	Transcriptome assembly
dc.subject	Bioinformatics
dc.subject	SSR primer design
dc.subject	Assembly pipeline
dc.title	A complete pipeline of free bioinformatics tools for de novo transcriptome assembly and SSR primer design
dc.type	Article
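
The abstract describes identifying Simple Sequence Repeat (SSR) markers in the assembled transcripts but does not name the tool used for that step. As a rough illustration of what SSR detection involves, the following is a minimal sketch in Python, not the authors' method. The function name `find_ssrs` and the minimum repeat counts per motif length are assumptions chosen for the example. It reports only perfect (uninterrupted) repeats.

```python
import re

# Hypothetical thresholds (assumption): motif length -> minimum number of repeats.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=None):
    """Return a list of (start, motif, repeat_count) for perfect SSRs.

    start is the 0-based position of the first base of the repeat run.
    """
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Capture one motif, then require it to recur at least min_rep - 1 more times.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each run is reported once.
            is_redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_redundant:
                continue
            repeat_count = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeat_count))
    return hits


if __name__ == "__main__":
    # Toy sequence (made up for illustration): an (AG)7 run and a (CTT)5 run.
    toy = "GGC" + "AG" * 7 + "TTC" + "CTT" * 5 + "GA"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

In a real workflow the input would be the assembled transcript sequences rather than a toy string, and primer design would then target the regions flanking each reported repeat.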

Files

Original bundle

Name: iPURSE 2019 Proceedings-1 [44].pdf
Size: 78.8 KB
Format: Adobe Portable Document Format

Collections

iPURSE 2019