A complete pipeline of free bioinformatics tools for de novo transcriptome assembly and SSR primer design

dc.contributor.author: Naranpanawa, D. N. U.
dc.contributor.author: Chandrasekara, C. H. W. M. R. B.
dc.contributor.author: Bandaranayake, P. C. G.
dc.contributor.author: Bandaranayake, A. U.
dc.date.accessioned: 2024-11-25T06:39:12Z
dc.date.available: 2024-11-25T06:39:12Z
dc.date.issued: 2019-09-12
dc.description.abstract: During the past few decades, next-generation sequencing technologies have advanced rapidly in throughput and speed while sequencing costs have fallen. This has revolutionized the field of genomics, allowing the production of vast datasets. However, the methods and software needed to analyze these data and extract correct biological meaning have not advanced at the same rate. One limitation is the unaffordable price of commercially available bioinformatics software. Hence, only a small fraction of genomes and transcriptomes have been completely assembled and annotated. The lack of reference genomes for comparative assembly leads to the computationally more challenging task of de novo assembly. In addition, obtaining an assembly is a complex process that requires many steps using several complex tools. As a result, beginners in bioinformatics may find analysis procedures too complicated and time-consuming given the associated learning curve. Therefore, to aid novice biologists in assembling sequence data, and to ease this bottleneck in computational biology and bioinformatics, we present a complete pipeline of freely available bioinformatics software for de novo transcriptome assembly. The pipeline was developed by combining several individual software tools through user-friendly shell scripts. To test the pipeline, we used Illumina HiSeq paired-end RNA-seq reads from four oil-producing Santalum album (sandalwood) tree samples from a published study. The raw data were first filtered for low-quality reads, trimmed for adapters and normalized. Assembly was performed with the Trinity de novo assembler. The quality of the assembly was assessed with BUSCO, Bowtie2 and TransRate, all of which indicated a high-quality assembly. To further validate the accuracy of the assembly, we used the assembled transcriptome to identify gene-specific Simple Sequence Repeat (SSR) markers. Primers were designed for eight S. album oil biosynthetic genes and two control genes, and were validated in the laboratory with the respective samples. All primers amplified successfully, confirming the designed workflow. Furthermore, five SSR markers that were polymorphic among the tested sandalwood accessions are potential markers for use in sandalwood breeding programs. To the best of our knowledge, this is the first attempt to develop a user-friendly, validated assembly pipeline with free bioinformatics software and tools, provided with detailed documentation.
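The SSR identification step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the tool or thresholds the authors used, so this is not their method: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative values,
# not taken from the abstract): dinucleotides need 6 copies, longer motifs 5.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each SSR is reported once.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical transcript fragment with one (AG)7 and one (CTT)5 repeat.
example = "GGC" + "AG" * 7 + "TTC" + "CTT" * 5 + "GA"
print(find_ssrs(example))  # [(3, 'AG', 7), (20, 'CTT', 5)]
```

In a real workflow the reported start positions (0-based here) would be used to extract flanking sequence for primer design; that step is outside the scope of this sketch.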
dc.identifier.isbn: 978-955-589-282-7
dc.identifier.uri: https://ir.lib.pdn.ac.lk/handle/20.500.14444/4011
dc.language.iso: en_US
dc.publisher: University of Peradeniya
dc.subject: De novo assembly
dc.subject: Transcriptome assembly
dc.subject: Bioinformatics
dc.subject: SSR primer design
dc.subject: Assembly pipeline
dc.title: A complete pipeline of free bioinformatics tools for de novo transcriptome assembly and SSR primer design
dc.type: Article
Files
Original bundle
Name: iPURSE 2019 Proceedings-1 [44].pdf
Size: 78.8 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission
Collections