How Safety Analysis Works

Every conversation captured by Sensible is automatically analyzed for safety concerns. Here's how it works.

The process

  1. The Chrome extension captures a conversation (prompt + response)
  2. The conversation is sent to the Sensible backend
  3. The backend sends the conversation text to an AI model for analysis (this step and the next two are sketched in code after this list)
  4. The AI evaluates the content against safety categories
  5. If something is flagged, an alert is created with a severity level and reason
  6. The conversation and any alerts appear on your dashboard
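
For readers curious about the mechanics, here is a minimal TypeScript sketch of steps 3 to 5. The shapes of `Conversation` and `SafetyAssessment`, the model name, and the instruction prompt are all illustrative assumptions, not Sensible's actual code; the HTTP call itself follows Anthropic's public Messages API.

```typescript
// Hypothetical shapes -- illustrative only, not Sensible's real schema.
interface Conversation {
  prompt: string;   // what the child typed
  response: string; // what the AI answered
}

interface SafetyAssessment {
  flagged: boolean;
  severity: "low" | "medium" | "high" | null;
  reason: string | null;
}

// Steps 3-5: send the conversation text to a model and parse its verdict.
async function analyzeConversation(convo: Conversation): Promise<SafetyAssessment> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": process.env.ANTHROPIC_API_KEY ?? "", // assumes a Node-style env var
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model: "claude-3-5-haiku-latest", // placeholder model name
      max_tokens: 300,
      messages: [
        {
          role: "user",
          content:
            "Assess this child-AI conversation for safety concerns. Reply with " +
            'JSON only: {"flagged": boolean, "severity": "low"|"medium"|"high"|null, "reason": string|null}.\n\n' +
            `Child: ${convo.prompt}\nAI: ${convo.response}`,
        },
      ],
    }),
  });
  const data = await res.json();
  // The model's reply text carries the JSON verdict (assuming it complies with
  // the instruction above); a production system would validate this.
  return JSON.parse(data.content[0].text) as SafetyAssessment;
}
```

If the result comes back with `flagged: true`, the backend stores it as an alert and surfaces it on the dashboard (steps 5 and 6).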

What gets analyzed

The AI analyzes the full conversation, meaning both what your child typed and what the AI responded (see the sketch after this list). This is important because:

  • A child might ask an innocent question that gets a concerning response
  • A child might ask something concerning that gets a responsible response
  • Context matters — the same words can be fine or concerning depending on the conversation
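
To make that concrete, both sides of the exchange travel together in the analysis request. A hypothetical serializer (the `Turn` shape and `formatTranscript` name are assumptions for illustration) might look like this:

```typescript
// Hypothetical transcript shape: both sides are always present.
interface Turn {
  speaker: "child" | "ai";
  text: string;
}

// Flatten a conversation into the single text block handed to the analyzer,
// preserving who said what so the model can weigh context.
function formatTranscript(turns: Turn[]): string {
  return turns
    .map((t) => `${t.speaker === "child" ? "Child" : "AI"}: ${t.text}`)
    .join("\n");
}

// Example: a concerning question met with a responsible response. The
// analyzer sees both sides, so context can raise or temper the verdict.
const transcript = formatTranscript([
  { speaker: "child", text: "Someone online keeps asking for my home address." },
  { speaker: "ai", text: "Never share your address; tell a trusted adult right away." },
]);
```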

Safety categories

The analysis checks for:

Category               Examples
Self-harm / Suicide    References to self-harm, suicidal ideation, depression
Violence / Threats     Descriptions of violence, threats toward others
Sexual content         Age-inappropriate sexual content or conversations
Illegal activities     Drug use, illegal behavior, instructions for illegal acts
Cyberbullying          Harassment, bullying language, targeting others
Personal information   Sharing addresses, phone numbers, school names, etc.
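
In code terms, this category set could be modeled as a closed union so that every alert carries exactly one known label. The names below simply mirror the table and are an illustrative assumption, not Sensible's actual schema:

```typescript
// Illustrative category labels mirroring the table above.
type SafetyCategory =
  | "self_harm"
  | "violence"
  | "sexual_content"
  | "illegal_activity"
  | "cyberbullying"
  | "personal_info";

// Hypothetical alert shape: one category, a severity, and a readable reason.
interface SafetyAlert {
  category: SafetyCategory;
  severity: "low" | "medium" | "high";
  reason: string;
}

const example: SafetyAlert = {
  category: "personal_info",
  severity: "medium",
  reason: "The child shared their school name and neighborhood.",
};
```

A closed union means an unexpected label from the analyzer fails fast instead of silently creating an unknown alert type.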

Important limitations

  • No system is perfect. The AI analysis may miss some concerning content or flag something that's actually fine. Use your judgment as a parent.
  • Context matters. A child researching history might trigger flags about violence. A child doing health homework might trigger flags about medical topics. Always read the full conversation before reacting.
  • This is a supplement, not a replacement. Sensible helps you stay informed, but it doesn't replace active parenting and open communication.

Privacy of analysis

  • Conversation text is sent to Anthropic's Claude API for analysis only
  • The text is not stored by Anthropic or used to train their models
  • Only the safety assessment (flagged/not flagged, severity, reason) is stored alongside the conversation (sketched below)
  • See our Privacy Policy for full details
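
To make the retention rule concrete, here is a sketch of what a stored record could contain under the policy above. The field names are assumptions for illustration:

```typescript
// What might be persisted alongside a conversation under the stated policy:
// only the assessment itself; the analysis provider retains nothing.
interface StoredAnalysisResult {
  conversationId: string; // links back to the captured conversation
  flagged: boolean;       // the verdict
  severity: "low" | "medium" | "high" | null; // null when not flagged
  reason: string | null;  // short explanation when flagged
  analyzedAt: string;     // ISO 8601 timestamp
}
```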