
Next Generation Biomonitoring: Implementing Machine Learning for Accurate Metabarcoding with Nanopore MinION
22 September 2022 @ 15:30 - 17:30

Bilgenur Baloğlu
University of Southern California
Metabarcoding (identification of the plant, animal, and fungal taxa present in an environmental sample) is rapidly gaining importance in ecology, food safety, pest identification, and disease surveillance. NGS metabarcoding has a compelling advantage over traditional approaches for obtaining data on species distributions; however, it is often difficult to detect all the species present in a bulk sample using NGS. This can, in part, be attributed to the shorter read lengths that most NGS instruments generate. Moreover, most NGS platforms are not portable, which makes in situ field-based sequencing infeasible. Oxford Nanopore sequencing platforms such as the MinION are an exception: they are portable and provide longer reads, albeit limited by rather high error rates (~12-15%). We used a freshwater mock community of 50 Operational Taxonomic Units (OTUs) to test the capacity of the Oxford Nanopore MinION, coupled with a rolling circle amplification protocol, to provide long-read metabarcoding results. We established a workflow for DNA metabarcoding of freshwater organisms using the Nanopore MinION sequencing platform. We also propose a new Python pipeline that explores error profiles of nanopore consensus sequences, mapping accuracy, and overall community representation of a complex bulk sample. Using our molecular and bioinformatics workflow, which implements a machine learning algorithm, we were able to accurately estimate the diversity of the tested freshwater mock community with an average sequence accuracy of >99% for 1D2 sequencing on the nanopore platform. We could also show that the high error rates associated with long-read single-molecule sequencing can be mitigated by using a rolling circle amplification protocol. Future bioassessment programs will benefit tremendously from portable, highly accurate, species-level metabarcoding, and it appears that we have reached a point where cost-effective field-based DNA metabarcoding is possible.
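To give a feel for why reading the same molecule many times can offset a high per-read error rate, here is a minimal Python sketch. It is not the pipeline presented in the seminar. It assumes substitution-only errors at roughly 13% and copies that are already aligned and of equal length, whereas real nanopore reads also contain insertions and deletions and need alignment before a consensus can be built. All function names and parameters are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter

BASES = "ACGT"


def add_substitution_errors(seq, error_rate, rng):
    """Copy seq, replacing each base with a different base with probability error_rate."""
    noisy = []
    for base in seq:
        if rng.random() < error_rate:
            noisy.append(rng.choice([b for b in BASES if b != base]))
        else:
            noisy.append(base)
    return "".join(noisy)


def majority_consensus(copies):
    """Per-position majority vote across equal-length, pre-aligned copies."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def simulate(n_copies, error_rate=0.13, length=400, seed=0):
    """Return (identity of one noisy copy, identity of the consensus) vs. the template."""
    rng = random.Random(seed)
    template = "".join(rng.choice(BASES) for _ in range(length))
    copies = [
        add_substitution_errors(template, error_rate, rng)
        for _ in range(n_copies)
    ]
    return identity(template, copies[0]), identity(template, majority_consensus(copies))


if __name__ == "__main__":
    for n in (1, 3, 5, 15):
        single, consensus = simulate(n)
        print(f"{n:>2} copies: single read {single:.3f}, consensus {consensus:.3f}")
```

Under these simplified assumptions, the consensus identity rises quickly with the number of copies, because independent errors rarely hit the same position in most copies at once.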
About Speaker
Bilgenur Baloglu, Ph.D., is a molecular biologist, bioinformatics scientist, and lecturer based in Pasadena, California. She studied Molecular Biology and Genetics at Istanbul Technical University and earned a Ph.D. in Molecular Ecology at the National University of Singapore, working on the biological assessment of Singapore’s aquatic ecosystems using Next Generation Sequencing (NGS) and Nanopore sequencing. Her research interests focus on biodiversity monitoring and on developing molecular and bioinformatics tools to make DNA sequencing technologies cheaper, faster, and more accurate. After completing her postdoctoral studies at the University of Guelph, Canada, she led bioinformatics efforts at Sequential Skin and then moved to Thermo Fisher Scientific, continuing work in bioinformatics support for NGS. She is also a part-time faculty member at the University of Southern California, teaching a course on genomic analysis using machine learning techniques. Her seminar will focus on ASHURE, a newly developed Python-based bioinformatics algorithm that could improve nanopore consensus sequence accuracy to >99% for bulk sample metabarcoding.