Artificial intelligence (AI) models increasingly augment the day-to-day operations of individuals, enterprises, and government organizations across the world. AI can benefit scientific research by automating or semi-automating practical chemical and biological tasks. In this context, responsible AI model developers must install safeguards to mitigate potential vulnerabilities, risks, and unintended behaviors. Signature Science subject matter experts support AI stakeholders by assessing and evaluating the efficacy of these safety measures.

Signature Science’s chemistry and biology researchers and CBRNE threat subject matter experts combine their hands-on, practical lab experience with capabilities in AI benchmarking, evaluations, and assessments. We can apply our current, practicing chemistry and biology expertise and operational laboratories to assessment methods that require lab validation.
SIGNATURE SCIENCE OFFERS
Red Teaming
Signature Science collaborates with AI developers to rigorously test AI systems and to identify and address vulnerabilities, risks, and unintended behaviors. Through red teaming, responsible developers can test the limits of the model, mitigate identified risks, implement safeguards, and enhance the model’s ability to handle unexpected inputs.
Question and Answer Pair Development
Targeted question and answer pairs can evaluate the knowledge base of AI systems, train them to provide accurate responses to user queries so that relevant information is readily found when a similar question is asked, and help prevent instances of dangerous misuse. Our chemistry and biology experts can develop challenging multiple-choice or open-ended questions across chemistry and biology sub-domains to evaluate model performance relative to human expert performance or other reference models.
Automated Grading Rubric Development
AI model assessments benefit from clear, objective, and measurable criteria by which submissions can be evaluated consistently. Science-based question and answer pairs often have numerous feasible correct answers, which complicates the development of an automated grading rubric. Our experts work to refine each question and ensure that its answer rubric is specific and comprehensive while maintaining compatibility with automated grading systems.
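As a rough illustration of why multiple feasible correct answers complicate automated grading, the sketch below scores a free-text answer against a rubric in which each criterion lists several acceptable phrasings. It is a minimal example under stated assumptions, not a description of Signature Science's method: the names RubricItem, normalize, and grade, the substring-matching approach, and the benign sample question about drying an organic extract are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RubricItem:
    """One gradable criterion; any listed phrase counts as satisfying it."""
    name: str
    accepted: list[str]
    points: float = 1.0


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so matching ignores formatting.
    return " ".join(text.lower().split())


def grade(answer: str, rubric: list[RubricItem]) -> float:
    """Return the fraction of rubric points earned by a free-text answer."""
    cleaned = normalize(answer)
    total = sum(item.points for item in rubric)
    earned = sum(
        item.points
        for item in rubric
        if any(normalize(phrase) in cleaned for phrase in item.accepted)
    )
    return earned / total if total else 0.0


# Hypothetical rubric for a benign question:
# "How would you dry an organic extract before removing the solvent?"
rubric = [
    RubricItem("drying agent", ["magnesium sulfate", "sodium sulfate", "MgSO4", "Na2SO4"]),
    RubricItem("removal step", ["filter", "filtration", "decant"]),
]

print(grade("Dry over Na2SO4, then filter.", rubric))  # 1.0
print(grade("Add magnesium sulfate.", rubric))         # 0.5
```

Simple substring matching like this misses paraphrases and can be fooled by negations, which is one reason a specific, comprehensive rubric matters before any automated grader is applied.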
Human Uplift Studies
We design approaches to empirically test the biological and chemical laboratory uplift capabilities of AI models and investigate potential areas of concern posed by those capabilities as they relate to biological and chemical threats. By employing human uplift studies that ask human participants to perform practical chemical and biological research tasks in a laboratory, AI stakeholders can gain a better understanding of the real-world impact of AI assistance on carrying out complex laboratory protocols.
For more information about AI Safety Consultation Services:

Alan Smith, PMP
AI Safety Lead, Chemistry

Danielle LeSassier, PhD
AI Biosafety Lead