HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly.

  • Additional Information
    • Source:
      Publisher: BioMed Central; Country of Publication: England; NLM ID: 100965258; Publication Model: Electronic; Cited Medium: Internet; ISSN: 1471-2164 (Electronic); Linking ISSN: 14712164; NLM ISO Abbreviation: BMC Genomics; Subsets: MEDLINE
    • Publication Information:
      Original Publication: London : BioMed Central, [2000-
    • Abstract:
      Background: Pacific Biosciences HiFi read technology is currently the industry standard for high-accuracy long-read sequencing and has been widely adopted by large sequencing and assembly initiatives to generate de novo assemblies of non-model organisms. Although adapter contamination filtering is routine in traditional short-read analysis pipelines, it has not been widely adopted for HiFi workflows.
      Results: Analysis of 55 publicly available HiFi datasets revealed that a read-sanitation step is necessary to remove sequence artifacts derived from PacBio library preparation from read pools, because adapter sequences can be erroneously integrated into assemblies.
      Conclusions: Here we describe the nature of adapter-contaminated reads and their consequences in assembly, and present HiFiAdapterFilt, a simple and memory-efficient solution for removing adapter-contaminated reads prior to assembly.
      (© 2022. The Author(s).)
    • Grant Information:
      2040-22430-027-00-D Agricultural Research Service; 0500-00093-001-00-D Oak Ridge Institute for Science and Education
    • Contributed Indexing:
      Keywords: Adapter; Circular consensus sequencing; PacBio HiFi; Sequence data filtering
    • Publication Date:
      Date Created: 20220223 Date Completed: 20220224 Latest Revision: 20220301
    • Publication Date:
      20220908
    • Accession Number:
      PMC8864876 (PMCID); 10.1186/s12864-022-08375-1 (DOI); 35193521 (PMID)