Comprehensive evaluation of structural variant genotyping methods based on long-read sequencing data.

  • Author(s): Duan X; Pan M; Fan S
  • Source:
    BMC genomics [BMC Genomics] 2022 Apr 23; Vol. 23 (1), pp. 324. Date of Electronic Publication: 2022 Apr 23.
  • Publication Type:
    Journal Article
  • Language:
    English
  • Additional Information
    • Source:
      Publisher: BioMed Central Country of Publication: England NLM ID: 100965258 Publication Model: Electronic Cited Medium: Internet ISSN: 1471-2164 (Electronic) Linking ISSN: 14712164 NLM ISO Abbreviation: BMC Genomics Subsets: MEDLINE
    • Publication Information:
      Original Publication: London : BioMed Central, [2000-
    • Subject Terms:
    • Abstract:
      Background: Structural variants (SVs) play a crucial role in gene regulation, trait association, and disease in humans. SV genotyping has been extensively applied in genomics research and clinical diagnosis. Although a growing number of SV genotyping methods for long reads have been developed, a comprehensive performance assessment of these methods has yet to be done.
      Results: Based on one simulated and three real SV datasets, we performed an in-depth evaluation of five SV genotyping methods: cuteSV, LRcaller, Sniffles, SVJedi, and VaPoR. The results show that for insertions and deletions, cuteSV and LRcaller have similar F1 scores (cuteSV, insertions: 0.69-0.90, deletions: 0.77-0.90; LRcaller, insertions: 0.67-0.87, deletions: 0.74-0.91) and are superior to the other methods. For duplications, inversions, and translocations, LRcaller yields the most accurate genotyping results (F1 scores of 0.84, 0.68, and 0.47, respectively). When genotyping SVs located in tandem repeat regions or with imprecise breakpoints, cuteSV (insertions and deletions) and LRcaller (duplications, inversions, and translocations) perform better than the other methods. In addition, we observed a decrease in F1 scores as SV size increased. Finally, our analyses suggest that the F1 scores of these methods reach the point of diminishing returns at 20× depth of coverage.
      Conclusions: We present an in-depth benchmark study of long-read SV genotyping methods. Our results highlight the advantages and disadvantages of each genotyping method, which provide practical guidance for optimal application selection and prospective directions for tool improvement.
      (© 2022. The Author(s).)
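
The abstract reports F1 scores for genotype calls but the record does not state how they were computed. The following is a minimal sketch of one common way to score genotyping, not the authors' evaluation code. It assumes that precision is the fraction of genotyped SVs whose genotype matches the truth set, that recall is the fraction of all truth SVs genotyped correctly, and that phased and unphased genotypes are treated alike. The function name `genotype_f1` and the SV identifiers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def _normalize(gt: str) -> Tuple[str, ...]:
    """Order-insensitive genotype, so that 0/1, 1/0 and 1|0 compare equal."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def genotype_f1(
    truth: Dict[str, str], calls: Dict[str, Optional[str]]
) -> Dict[str, float]:
    """Precision, recall and F1 for genotype calls keyed by SV identifier.

    Assumption: calls that are missing (None) or contain '.' count as
    not genotyped, so they lower recall but not precision.
    """
    genotyped = {
        sv: gt
        for sv, gt in calls.items()
        if sv in truth and gt is not None and "." not in gt
    }
    correct = sum(
        1 for sv, gt in genotyped.items() if _normalize(gt) == _normalize(truth[sv])
    )
    precision = correct / len(genotyped) if genotyped else 0.0
    recall = correct / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical toy data: four truth SVs, three of which received a call.
    truth = {"sv1": "0/1", "sv2": "1/1", "sv3": "0/1", "sv4": "1/1"}
    calls = {"sv1": "1|0", "sv2": "0/1", "sv3": "./."}
    print(genotype_f1(truth, calls))
```

Under these assumptions a tool that leaves many SVs ungenotyped can keep a high precision while its recall, and therefore its F1 score, drops.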
    • Grant Information:
      31970563 National Natural Science Foundation of China; 2020YFE0201600 Ministry of Science and Technology of the People's Republic of China; 19410741100 Science and Technology Commission of Shanghai Municipality
    • Contributed Indexing:
      Keywords: F1 score; Long-read sequencing; Performance evaluation; SV genotyping
    • Publication Date:
      Date Created: 20220424 Date Completed: 20220426 Latest Revision: 20220716
    • Publication Date:
      20220908
    • Accession Number:
      PMCID: PMC9034514; DOI: 10.1186/s12864-022-08548-y; PMID: 35461238