Baiting out a full length sequence from unmapped RNA-seq data.

  • Additional Information
    • Source:
      Publisher: BioMed Central; Country of Publication: England; NLM ID: 100965258; Publication Model: Electronic; Cited Medium: Internet; ISSN: 1471-2164 (Electronic); Linking ISSN: 14712164; NLM ISO Abbreviation: BMC Genomics; Subsets: MEDLINE
    • Publication Information:
      Original Publication: London : BioMed Central, [2000-
    • Subject Terms:
    • Abstract:
      Background: RNA-Seq is a powerful tool that is widely used across many kinds of studies. Unmapped RNA-seq reads are usually considered useless and are discarded or ignored.
      Results: We develop a strategy to mine a full-length sequence from unmapped reads by combining specific reverse transcription primer design with high-throughput sequencing. In this study, we salvage 36 unmapped reads from standard RNA-Seq data and randomly select one 149 bp read as a model. Specific reverse transcription primers are designed to amplify both of its ends, followed by next-generation sequencing. We then design a statistical model based on a power-law distribution to estimate the integrity and significance of the resulting sequence, and we validate it by Sanger sequencing. The result shows that the full-length sequence is 1556 bp, with insertion mutations in a microsatellite structure.
      Conclusion: We believe this method is a useful strategy for extracting sequence information from unmapped RNA-seq data. It also offers an alternative way to obtain the full-length sequence of an unknown cDNA.
      (© 2021. The Author(s).)
    • Grant Information:
      31970592 National Natural Science Foundation of China; 32002173 National Natural Science Foundation of China; CAASQNYC-KYYJ-41 The Elite Young Scientists Program of Chinese Academy of Agricultural Sciences; 2018A0303130009 Natural Science Foundation of Guangdong Province; 2018YFA0903201 National Key Research and Development Program of China; JCYJ20180306173714935 Science and Technology Planning Project of Shenzhen Municipality
    • Contributed Indexing:
      Keywords: Full length sequence; RNA-seq; Statistical model; Unmapped reads
    • Accession Number:
      0 (DNA, Complementary)
    • Publication Date:
      Date Created: 20211128; Date Completed: 20211130; Latest Revision: 20211203
    • Publication Date:
      20220902
    • PubMed Central ID:
      PMC8626966
    • DOI:
      10.1186/s12864-021-08146-4
    • PMID:
      34837950
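
The abstract's first step, salvaging unmapped reads from standard RNA-Seq output, can be illustrated with a minimal sketch. It assumes the alignments are available as SAM text and relies only on the SAM convention that FLAG bit 0x4 marks an unmapped segment. The function name `unmapped_reads` and the toy records are hypothetical and are not taken from the paper; this is not the authors' pipeline, only one common way such reads might be collected.

```python
from typing import Iterable, Iterator, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) for every unmapped record in SAM text.

    Hypothetical helper for illustration; not part of the published method.
    """
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG:
            yield name, seq


# Toy records invented for this example (not data from the study).
EXAMPLE_SAM = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\tIIIIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGGCATCA\tIIIIIIII",
]

salvaged = list(unmapped_reads(EXAMPLE_SAM))
```

In this toy input, `read1` is mapped (FLAG 0), while `read2` (FLAG 4) and `read3` (FLAG 77, a paired read with the unmapped bit set) are kept. A salvaged read such as the 149 bp model read in the abstract would then serve as the starting point for primer design.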