# How Safety Analysis Works
Every conversation captured by Sensible is automatically analyzed for safety concerns. Here's how it works.
## The process
1. The Chrome extension captures a conversation (prompt + response)
2. The conversation is sent to the Sensible backend
3. The backend sends the conversation text to an AI model for analysis
4. The AI evaluates the content against safety categories
5. If something is flagged, an alert is created with a severity level and a reason
6. The conversation and any alerts appear on your dashboard
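The steps above can be sketched roughly as follows. Everything here is a hypothetical illustration, not Sensible's actual code: the function and field names are invented, and a toy keyword check stands in for the real AI analysis.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Alert:
    severity: str  # e.g. "low", "medium", "high"
    reason: str

def analyze_conversation(prompt: str, response: str) -> Optional[Alert]:
    """Stand-in for the backend's call to the AI safety model.
    A toy keyword rule replaces the real model for illustration."""
    text = f"Child: {prompt}\nAI: {response}"
    if "address" in text.lower():
        return Alert(severity="medium", reason="Possible personal info shared")
    return None

def handle_capture(prompt: str, response: str) -> dict:
    """Steps 2-6: analyze the captured exchange and attach any alert.
    The returned record is what would appear on the dashboard."""
    alert = analyze_conversation(prompt, response)
    return {"prompt": prompt, "response": response, "alert": alert}
```

A benign exchange produces a record with no alert; a flagged one carries the severity and reason alongside the conversation.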
## What gets analyzed
The AI analyzes the full conversation — both what your child typed and how the AI responded. This is important because:
- A child might ask an innocent question that gets a concerning response
- A child might ask something concerning that gets a responsible response
- Context matters — the same words can be fine or concerning depending on the conversation
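Concretely, this means the analysis input always pairs both sides of the exchange, since either one alone can be misleading. A minimal sketch (the function name and prompt wording are illustrative assumptions):

```python
def build_analysis_input(child_message: str, ai_response: str) -> str:
    """Combine both turns of the exchange into a single text for
    safety analysis, so the model sees the full context."""
    return (
        "Evaluate the following exchange for safety concerns.\n"
        f"Child: {child_message}\n"
        f"Assistant: {ai_response}"
    )
```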
## Safety categories
The analysis checks for:
| Category | Examples |
|---|---|
| Self-harm / Suicide | References to self-harm, suicidal ideation, depression |
| Violence / Threats | Descriptions of violence, threats toward others |
| Sexual content | Age-inappropriate sexual content or conversations |
| Illegal activities | Drug use, other illegal behavior, instructions for illegal acts |
| Cyberbullying | Harassment, bullying language, targeting others |
| Personal information | Sharing addresses, phone numbers, school names, etc. |
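For illustration, the categories in the table could be represented as a simple enumeration. The identifiers below are hypothetical, not Sensible's internal names:

```python
from enum import Enum

class SafetyCategory(Enum):
    """One value per safety category the analysis checks for."""
    SELF_HARM = "self_harm"
    VIOLENCE = "violence"
    SEXUAL_CONTENT = "sexual_content"
    ILLEGAL_ACTIVITY = "illegal_activity"
    CYBERBULLYING = "cyberbullying"
    PERSONAL_INFO = "personal_info"
```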
## Important limitations
- No system is perfect. The AI analysis may miss some concerning content or flag something that's actually fine. Use your judgment as a parent.
- Context matters. A child researching history might trigger flags about violence. A child doing health homework might trigger flags about medical topics. Always read the full conversation before reacting.
- This is a supplement, not a replacement. Sensible helps you stay informed, but it doesn't replace active parenting and open communication.
## Privacy of analysis
- Conversation text is sent to Anthropic's Claude API for analysis only
- The text is not stored by Anthropic or used to train their models
- Only the safety assessment (flagged/not flagged, severity, reason) is stored alongside the conversation
- See our Privacy Policy for full details
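Putting the points above together, a stored record might look roughly like this. The field names and layout are illustrative assumptions, not Sensible's actual schema; the key point is that only the assessment (flagged/not flagged, severity, reason) is kept alongside the conversation:

```python
# Illustrative shape of what is persisted: the captured conversation
# plus the safety assessment fields only, with nothing retained by
# the model provider.
stored_record = {
    "conversation": {
        "prompt": "...",    # what the child typed
        "response": "...",  # what the AI answered
    },
    "analysis": {
        "flagged": True,
        "severity": "medium",
        "reason": "Possible sharing of personal information",
    },
}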