
Safety Check Operator

The Safety Check operator analyzes text-based content for potentially harmful or inappropriate material. It combines toxicity detection and profanity filtering to help ensure that generated or user-submitted text adheres to safety guidelines. It is most useful where content moderation is critical, such as chatbots, social media platforms, or any system that handles user-generated text.

  • Python Packages:
    • detoxify
    • better_profanity
    • transformers (optional, for transformer-based toxicity detection)
    These packages can be installed via the ChatTD operator’s Python manager; a sketch of how they are typically used appears after the list below.
  • ChatTD Operator: Required and must be configured.
  • Input Table (DAT): Table containing the conversation/text to analyze. Required columns: id, role, message, timestamp.
  • Toxicity Table (DAT): Toxicity scores and details. Columns: toxicity_score, severe_toxicity, obscene, threat, insult, identity_hate, message_id, role, message, timestamp.
  • Profanity Table (DAT): Profanity detection results. Columns: contains_profanity, profanity_probability, flagged_words, message_id, role, message, timestamp.
  • PII Table (DAT): Personally Identifiable Information results. Columns: contains_pii, pii_types, confidence, message_id, role, message, timestamp.
  • Summary Table (DAT): Overall safety analysis summary. Columns: metric, value.
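For reference, the sketch below shows how the underlying libraries are typically called on their own. It is illustrative only: the operator wraps these checks internally, and the exact model name and score keys may differ between detoxify versions.

# Illustrative sketch of the underlying checks (not the operator's internal code).
# Assumes the detoxify and better_profanity packages are installed.
from detoxify import Detoxify
from better_profanity import profanity

text = "example message to screen"

# Toxicity: Detoxify returns per-category scores in the 0-1 range
# (toxicity, severe_toxicity, obscene, threat, insult, ...).
tox_scores = Detoxify('original').predict(text)
is_toxic = tox_scores['toxicity'] > 0.328  # 0.328 is this operator's default Toxicity Threshold

# Profanity: better_profanity performs word-list matching.
profanity.load_censor_words()
has_profanity = profanity.contains_profanity(text)

print(tox_scores, is_toxic, has_profanity)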
Parameters:

Start Safety Checks (Check) op('safety_check').par.Check Pulse
Default: None

Status (Status) op('safety_check').par.Status String
Default: Safety checks complete

Safety Checks (Checkmodes) op('safety_check').par.Checkmodes Menu
Default: profanity
Options: toxicity, profanity

Toxicity Threshold (Toxicitythreshold) op('safety_check').par.Toxicitythreshold Float
Default: 0.328

Profanity Threshold (Profanitythreshold) op('safety_check').par.Profanitythreshold Float
Default: 0.376

Table Update Mode (Updatemode) op('safety_check').par.Updatemode Menu
Default: clear
Options: append, replace, batch, clear

Analyze Mode (Analyzemode) op('safety_check').par.Analyzemode Menu
Default: full_conversation
Options: full_conversation, last_message, specific_message, custom_text

Clear Results (Clear) op('safety_check').par.Clear Pulse
Default: None

Callbacks Header

Callback DAT (Callbackdat) op('safety_check').par.Callbackdat DAT
Default: None

Edit Callbacks (Editcallbacksscript) op('safety_check').par.Editcallbacksscript Pulse
Default: None

Create Callbacks (Createpulse) op('safety_check').par.Createpulse Pulse
Default: None

onViolation (Onviolation) op('safety_check').par.Onviolation Toggle
Default: On

Textport Debug Callbacks (Debugcallbacks) op('safety_check').par.Debugcallbacks Menu
Default: Full Details
Options: None, Errors Only, Basic Info, Full Details

Bypass (Bypass) op('safety_check').par.Bypass Toggle
Default: Off

Show Built-in Parameters (Showbuiltin) op('safety_check').par.Showbuiltin Toggle
Default: Off

Version (Version) op('safety_check').par.Version String
Default: 1.0.0

Last Updated (Lastupdated) op('safety_check').par.Lastupdated String
Default: 2024-11-10

Creator (Creator) op('safety_check').par.Creator String
Default: dotsimulate

Website (Website) op('safety_check').par.Website String
Default: https://dotsimulate.com

ChatTD Operator (Chattd) op('safety_check').par.Chattd OP
Default: /dot_lops/ChatTD
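The internal names listed above can be used to drive the operator from Python. A minimal sketch (the path 'safety_check1' is a placeholder):

# Parameter-access sketch; 'safety_check1' is a placeholder path.
sc = op('safety_check1')
sc.par.Checkmodes = 'toxicity'        # any of the menu options listed above
sc.par.Toxicitythreshold = 0.4        # illustrative value
sc.par.Clear.pulse()                  # clear previously stored results
print(sc.par.Status.eval())           # read the current status string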
Available Callbacks:
  • onViolation
Example Callback Structure:
def onViolation(info):
    # Called when a safety check fails (e.g., a toxicity/profanity threshold is exceeded).
    # The info dictionary contains details such as:
    #   - op: the Safety Check operator
    #   - checkType: 'toxicity' or 'profanity'
    #   - messageId: ID of the violating message
    #   - message: content of the violating message
    #   - role: role associated with the message
    #   - score: the calculated score (toxicity or profanity probability)
    #   - threshold: the threshold that was exceeded
    print(f"Safety violation detected: {info.get('checkType')}")
    # Example: op('path/to/notifier').par.Sendmessage.pulse()
Performance considerations:
  • Performance depends on the amount of input text and which checks are enabled.
  • Transformer-based toxicity detection can be resource-intensive.
  • Analyze only the parts of a conversation you need (e.g., last_message) for better performance.
  • The batch update mode can be faster for large inputs; a lighter configuration is sketched below.
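A sketch of a performance-oriented setup for live moderation, checking only the newest message and batching table updates (the operator path is a placeholder):

# Performance-oriented configuration sketch; 'safety_check1' is a placeholder path.
sc = op('safety_check1')
sc.par.Analyzemode = 'last_message'   # check only the most recent message
sc.par.Updatemode = 'batch'           # write results in batches for large inputs
sc.par.Checkmodes = 'profanity'       # the lighter of the two checks
sc.par.Check.pulse()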
Basic usage example (assumes a conversation DAT named 'conversation_log' exists):
safety_checker = op('safety_check1')
conversation_dat = op('conversation_log')  # the conversation to analyze
# Connect input
safety_checker.inputConnectors[0].connect(conversation_dat)
# Configure checks
safety_checker.par.Analyzemode = 'full_conversation'
safety_checker.par.Checkmodes = 'toxicity profanity' # Enable both
safety_checker.par.Toxicitythreshold = 0.5
safety_checker.par.Profanitythreshold = 0.6
# Start check
safety_checker.par.Check.pulse()
# View results
# toxicity_results = safety_checker.op('toxicity_table')
# profanity_results = safety_checker.op('profanity_table')
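Once a check completes, the result DATs can be read like any other table. A sketch, assuming the internal DAT names from the commented lines above ('toxicity_table', 'profanity_table') match your build:

# Result-reading sketch; internal DAT names may differ between operator versions.
toxicity_results = safety_checker.op('toxicity_table')
if toxicity_results is not None:
    for r in range(1, toxicity_results.numRows):  # row 0 is the header row
        score = float(toxicity_results[r, 'toxicity_score'].val)
        if score > safety_checker.par.Toxicitythreshold.eval():
            print('Flagged message:', toxicity_results[r, 'message'].val)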
Callback setup example:
# 1. Create a Text DAT (e.g., 'safety_callbacks')
# 2. Add the onViolation function (see Callbacks section above)
safety_checker = op('safety_check1')
# Configure callbacks
safety_checker.par.Callbackdat = op('safety_callbacks')
safety_checker.par.Onviolation = 1
# Run checks as usual
safety_checker.par.Check.pulse()
# The onViolation function in 'safety_callbacks' DAT will execute if thresholds are met.
Typical use cases:
  • Moderating chatbots and virtual assistants.
  • Filtering user-generated content (comments, posts).
  • Ensuring safety in text-based games or virtual worlds.
  • Flagging inappropriate language in online communities.