Abstract:
Introduction: Recent regulations from United States Government agencies reshape the screening of synthetic nucleic acids. These regulations step away from categorizing hazard on the basis of “bad” taxa and instead invoke the function of the sequence in pathogenesis or intoxication. Ascertaining which functions contribute to pathogenesis, and distinguishing them from other molecular abilities that are unproblematic, is not simple. Some have suggested that this information can be readily obtained from existing databases of pathogens.
Objectives: We evaluate how virulence factors are described in current pathogen databases and whether those descriptions are adequate for biothreat data science. We discuss limitations in how virulence factors have been conceived and propose using the term “sequence of concern” (SoC) to distinguish sequences that pose a biothreat from those that do not. We discuss ways in which databases of SoCs might be implemented for research and regulatory purposes, and we describe ongoing work to improve functional descriptions of SoCs.
Methods: We assess the adequacy of descriptions of virulence factors in pathogen databases following extensive engagement with the literature in microbial pathogenesis.
Results/Conclusions: Descriptions of virulence factors in pathogen databases are inadequate for understanding biothreats. Many of the listed virulence factors are not biothreats and would not be concerning if transferred to another pathogen. New Gene Ontology terms have been authored, and those specific to pathogenic viral processes are being generalized to make them relevant to other pathogenic taxa. These terms allow better understanding by humans and better recognition by machines. A database of annotated functions of SoCs could benefit the evolving biosecurity regulatory framework in the United States.
Authors
Gene D. Godbold¹ and Matthew Scholz¹
¹ Signature Science, Austin, TX, USA