Content Moderator
AI Endpoint
Analyze text for toxicity, spam, and policy violations. Returns a safety score with specific flags.
What It Does
Send any user-generated text and get back a safety verdict (safe/unsafe), a confidence score (0-1), specific violation flags (toxicity, spam, hate speech, personal attacks), and the reason for the assessment. Built for automated content pipelines.
Example
Input: "This product is absolute garbage and anyone who buys it is an idiot"
Output:
{
  "safe": false,
  "score": 0.72,
  "flags": ["personal_attack", "toxicity"],
  "reason": "Contains personal insult directed at other users. The product criticism alone would be acceptable, but calling buyers idiots crosses into personal attack territory."
}
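The JSON verdict is straightforward to consume in code. A minimal sketch in Python that parses the example response above and gates publishing on it (the `should_publish` helper and its policy are illustrative, not part of the endpoint):

```python
import json

# The endpoint's response for the example input above (abbreviated reason).
raw = """{
  "safe": false,
  "score": 0.72,
  "flags": ["personal_attack", "toxicity"],
  "reason": "Contains personal insult directed at other users."
}"""

verdict = json.loads(raw)

def should_publish(verdict: dict) -> bool:
    # Illustrative policy: only publish content the endpoint marked safe.
    return bool(verdict["safe"])

print(should_publish(verdict))  # -> False for the example above
```

The same dictionary access works whatever HTTP client you use to call your deployed endpoint.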
When To Use This
- User-generated content platforms — check posts, comments, and reviews before publishing
- Chat and messaging — filter messages in real time as part of your moderation pipeline
- Marketplace listings — screen product descriptions and seller communications for policy violations
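In any of these pipelines you typically want more than a binary publish/block decision. A sketch of a three-way routing policy built on the verdict fields; the thresholds and the `route` function are assumptions for illustration, not behavior defined by the endpoint:

```python
def route(verdict: dict) -> str:
    """Map a moderation verdict to a pipeline action.

    Policy (illustrative): publish safe content, reject high-confidence
    or hate-speech violations outright, send borderline cases to review.
    """
    if verdict["safe"]:
        return "publish"
    if verdict["score"] >= 0.9 or "hate_speech" in verdict["flags"]:
        return "reject"
    return "review"  # borderline content goes to a human queue

example = {"safe": False, "score": 0.72, "flags": ["personal_attack", "toxicity"]}
print(route(example))  # -> "review"
```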
Metadata
Version: 1.0.0
Type: AI Endpoint
Category: extractor
Stars: 0
Deploys: 14
What's Included
System Prompt
Structured Output Schema
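The structured output schema itself is not shown on this page. Inferred from the example response above, a typed sketch of its plausible shape (field names match the example; the comments on allowed flag values are assumptions):

```python
from typing import List, TypedDict

class ModerationVerdict(TypedDict):
    """Plausible shape of the endpoint's structured output, inferred from the example."""
    safe: bool        # overall verdict
    score: float      # confidence, 0-1
    flags: List[str]  # e.g. "toxicity", "spam", "hate_speech", "personal_attack"
    reason: str       # human-readable explanation of the assessment

v: ModerationVerdict = {
    "safe": False,
    "score": 0.72,
    "flags": ["personal_attack", "toxicity"],
    "reason": "Contains personal insult directed at other users.",
}
print(v["flags"])  # -> ['personal_attack', 'toxicity']
```

The actual schema ships with the endpoint; treat this as documentation of the example, not the source of truth.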
Deploy this AI endpoint in minutes