BIOSCAN and Taxonomy
The BIOSCAN program offers significant opportunities to support taxonomic science and to accelerate description of the world’s biota. This thread is for general exploration of the linkages between BIOSCAN and taxonomic research. The linkages run in both directions: BIOSCAN needs expert knowledge to interpret and curate data for different taxonomic groups, and in turn offers data and insights for separating and characterising species. Early planning will help to maximise benefits and to avoid potential pitfalls. If appropriate, subtopics will be broken out into separate threads for more focused discussion. This opening post will be maintained as an introduction and overview of discussion topics.
Discuss linkages between BIOSCAN and taxonomy below.
BINs, OTUs and Species
The Barcode Index Numbers (BINs) generated by BOLD clustering serve as the units of discovery for barcoding and metabarcoding. They serve in many contexts as operational taxonomic units (OTUs). Many BINs are already identified as representing known and named species. Others may represent known species that have not yet been matched to the BIN, or species that are currently unnamed. In all cases, the mapping between BINs and biological species may be many-to-many: one BIN may include several species, and one species may span several BINs. BINs are typically associated with one or more scientific names or OTU labels, based on the identifications submitted with the barcode sequences. What enhancements to BOLD or additional tools and services could be developed to improve the representation and usefulness of this information as a tool for taxonomy?
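To make the many-to-many relationship concrete, here is a minimal Python sketch. The BIN identifiers, names and record layout are invented for illustration; this is an assumption about how such data could be held, not a description of BOLD's data model or API.

```python
from collections import defaultdict

# Hypothetical (BIN, submitted identification) pairs; not real BOLD records.
records = [
    ("BIN-A", "Genus alpha"),
    ("BIN-A", "Genus alpha"),
    ("BIN-A", "Genus beta"),
    ("BIN-B", "Genus beta"),
    ("BIN-C", "Genus gamma"),
]

def build_mappings(records):
    """Return (names per BIN, BINs per name) as dictionaries of sets."""
    names_by_bin = defaultdict(set)
    bins_by_name = defaultdict(set)
    for bin_id, name in records:
        names_by_bin[bin_id].add(name)
        bins_by_name[name].add(bin_id)
    return dict(names_by_bin), dict(bins_by_name)

names_by_bin, bins_by_name = build_mappings(records)

# BINs carrying more than one name, and names spread over more than one BIN.
multi_name_bins = sorted(b for b, names in names_by_bin.items() if len(names) > 1)
multi_bin_names = sorted(n for n, bins in bins_by_name.items() if len(bins) > 1)

print(multi_name_bins)
print(multi_bin_names)
```

Holding both directions of the mapping side by side makes it cheap to list the BINs and names that need a closer look, which is one way a tool could surface candidates for expert review.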
The BIN system could provide a predictable and indeed computable basis for tracking our understanding of OTUs and their relationships with known species, provided it is maintained as a stable identifier scheme with predictable rules. Such rules matter most where additional sequences make it necessary to fuse or split the cluster of specimens associated with a BIN. What features are required to make this possible?
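One feature that might help is a public change log of fusions and splits, so that a retired identifier can always be resolved to its current successors. The sketch below assumes such a log exists; the event format, the identifiers and the function names are all hypothetical and do not describe how BOLD currently records BIN changes.

```python
# Hypothetical change log. Each event retires the "old" identifiers and names
# their successors: a merge has several old and one new, a split the reverse.
events = [
    {"kind": "merge", "old": ["BIN-1", "BIN-2"], "new": ["BIN-3"]},
    {"kind": "split", "old": ["BIN-3"], "new": ["BIN-4", "BIN-5"]},
]

def build_successors(events):
    """Map each retired identifier to the identifiers that replaced it."""
    successors = {}
    for event in events:
        for old in event["old"]:
            successors[old] = list(event["new"])
    return successors

def resolve(bin_id, successors):
    """Return the set of current BINs that a (possibly retired) identifier leads to."""
    current, stack, seen = set(), [bin_id], set()
    while stack:
        candidate = stack.pop()
        if candidate in seen:
            continue
        seen.add(candidate)
        replaced_by = successors.get(candidate)
        if replaced_by is None:
            current.add(candidate)  # never retired, so still current
        else:
            stack.extend(replaced_by)
    return current

successors = build_successors(events)
print(sorted(resolve("BIN-1", successors)))
print(sorted(resolve("BIN-5", successors)))
```

Resolution at the identifier level is deliberately coarse: after a split, an old identifier leads to every successor, and deciding which successor a given specimen belongs to would need specimen-level records as well.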
The species identifications offered when submitting barcode sequences to BOLD vary in their reliability. The same BIN may be associated with several divergent identifications. In some cases, this will represent failure of the barcode sequences to separate related species. In other cases, it will represent mistaken identifications. On the other hand, a single binomial may be associated with multiple BINs, which may indicate intraspecific variation, cryptic species or further misidentifications. Information contained in associated metadata may sometimes help with resolving these questions. In particular, barcode sequences associated with type specimens would reliably anchor the reference for a name within a BIN. However, in general, expert knowledge is required to assist with resolving such issues. What tools should be offered, and what assistance can be given to taxonomists and others, to make such curation feasible and efficient?
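As one possible starting point for such tooling, the sketch below builds a simple curation queue: it flags BINs with several names, flags names that occur in more than one BIN, and records which names in a BIN are backed by a type specimen. The records, the flag wording and the `curation_queue` function are hypothetical, and the output is only a prompt for expert review, not a verdict.

```python
from collections import defaultdict

# Hypothetical records: (BIN, submitted name, voucher is a type specimen?)
records = [
    ("BIN-A", "Genus alpha", True),
    ("BIN-A", "Genus beta", False),
    ("BIN-B", "Genus beta", True),
    ("BIN-B", "Genus beta", False),
    ("BIN-C", "Genus gamma", False),
    ("BIN-D", "Genus gamma", False),
    ("BIN-E", "Genus delta", False),
]

def curation_queue(records):
    """Summarise, per BIN, the conflicts and type-specimen anchors found."""
    names_by_bin = defaultdict(set)
    bins_by_name = defaultdict(set)
    anchors = defaultdict(set)  # names backed by a type specimen, per BIN
    for bin_id, name, is_type in records:
        names_by_bin[bin_id].add(name)
        bins_by_name[name].add(bin_id)
        if is_type:
            anchors[bin_id].add(name)

    report = {}
    for bin_id, names in names_by_bin.items():
        flags = []
        if len(names) > 1:
            flags.append("several names in one BIN")
        if any(len(bins_by_name[name]) > 1 for name in names):
            flags.append("name also used in another BIN")
        report[bin_id] = {
            "flags": flags,
            "type_anchored": sorted(anchors[bin_id]),
            "unanchored": sorted(names - anchors[bin_id]),
        }
    return report

report = curation_queue(records)
for bin_id in sorted(report):
    print(bin_id, report[bin_id])
```

In this invented data, "Genus beta" is type-anchored in BIN-B but unanchored in BIN-A, which makes the BIN-A record a candidate misidentification. Whether it really is one, or whether the barcode fails to separate the two species, remains a question for someone who knows the group.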
BINs and New Species
Mapping the barcodes from millions of specimens inevitably reveals new and apparently unnamed OTUs, many of which will correspond to valid species. Under what circumstances, following what assessment, and according to what threshold is it appropriate for these to be described as new species? This has particular relevance in the context of hyperdiverse genera, where massive numbers of apparent species may await description and where it may be difficult to offer reliable morphological characters for well-delimited OTUs. The challenge of offering diagnostic characters grows steeply with the size of the group, because each species must be distinguishable from every other member.
How should this challenge be addressed? Should BINs be promoted for wider use as interim identifiers for these taxonomic units? (This would again be a reason for ensuring that the BIN system offers predictability and transparency.) Or should new binomials be assigned to these units subject to best-practice criteria and thresholds? (This would have the advantage of placing these taxonomic units within the most widely understood framework for tracking and reporting biodiversity, but may increase the number of names requiring synonymisation in the future.) Finding the right balance will help to ensure that DNA barcoding and the BIN system are supportive of and supported by the taxonomic community.