Understanding the 13 Categories

Content moderation works best when you understand what is being detected and why it matters. Each category exists to address specific harms that can affect individuals and communities.

Why These Categories Exist

Online content can cause real harm. Words can traumatize, radicalize, and even incite violence. The categories described here reflect research into the types of content that most frequently harm individuals and communities.

Each category addresses a distinct type of harm. Some protect individuals from targeted attacks. Others protect vulnerable groups from discrimination. Still others prevent the spread of dangerous instructions or content that could inspire harmful actions.

Understanding these categories helps you make informed decisions about content in your care, whether you are moderating a community, reviewing AI outputs, or simply checking a message that concerns you.

Category 1 of 6

Harassment

Why This Matters

Everyone deserves to communicate without being targeted, belittled, or intimidated. Harassment erodes trust in online spaces and can cause lasting psychological harm. By detecting harassing content, we help create environments where people feel safe to participate and share.

What It Detects

Content that expresses, incites, or promotes harassing language towards any person or group. This includes bullying, intimidation, and targeted attacks designed to demean or silence others.

Examples

  • Repeated insults directed at a specific person
  • Mocking someone's appearance, abilities, or personal circumstances
  • Encouraging others to target or pile on an individual

Subcategory

Harassment with Threats

Harassment that escalates to include threats of violence or serious harm, or other credible threats against someone's safety or wellbeing.

Examples

  • Threatening physical violence against someone being harassed
  • Stating intent to harm someone or their family
  • Describing specific plans to cause someone distress or injury

Category 2 of 6

Hate Speech

Why This Matters

Hate speech targets people for who they are, attacking their fundamental identity and dignity. These attacks can fuel discrimination, social division, and real-world violence against vulnerable communities. Identifying hate speech helps protect marginalized groups and promotes inclusive dialogue.

What It Detects

Content that expresses, incites, or promotes hatred based on protected characteristics including race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.

Examples

  • Slurs or derogatory language targeting a racial or ethnic group
  • Claims that a religious group is inherently evil or dangerous
  • Statements dehumanizing people based on their sexual orientation

Subcategory

Hate with Threats

Hate speech that includes explicit threats of violence, calls for harm, or advocacy for violent action against protected groups.

Examples

  • Calling for violence against members of a religious group
  • Threatening ethnic cleansing or genocide
  • Advocating for physical attacks on LGBTQ+ individuals

Category 3 of 6

Violence

Why This Matters

Depictions of violence can normalize aggression, traumatize viewers, and in some contexts inspire imitation. While violence is sometimes discussed in legitimate news, educational, or artistic contexts, unrestricted violent content can cause harm. Detection helps platforms apply appropriate context and protections.

What It Detects

Content that depicts death, violence, or physical injury to people or animals. This covers descriptions and glorification of violent acts, regardless of whether they are real or fictional.

Examples

  • Detailed descriptions of physical fights or assaults
  • Glorifying or celebrating acts of violence
  • Describing injuries or deaths in visceral detail

Subcategory

Graphic Violence

Extremely detailed or gratuitous depictions of violence, gore, or physical trauma that go beyond typical descriptions.

Examples

  • Explicit descriptions of severe injuries or mutilation
  • Graphic depictions of torture or extreme suffering
  • Detailed accounts of gruesome deaths

Category 4 of 6

Self-Harm

Why This Matters

Content about self-harm requires special care because it can influence vulnerable individuals. Research shows that detailed descriptions or glorification of self-harm can trigger imitative behavior, especially among young people. Detecting this content allows platforms to intervene with support resources and prevent harm.

What It Detects

Content that promotes, encourages, or depicts acts of self-harm, including suicide, cutting, eating disorders, and other forms of self-injury.

Examples

  • Romanticizing or glorifying self-harm behaviors
  • Describing self-harm as a coping mechanism
  • Content that normalizes eating disorders

Subcategories

Self-Harm Intent

Content where someone expresses that they are currently engaging in, planning, or intending to engage in acts of self-harm.

Examples

  • Statements expressing suicidal ideation or plans
  • Declarations of intent to hurt oneself
  • Descriptions of ongoing self-harm behavior

Self-Harm Instructions

Content that provides guidance, methods, or step-by-step instructions for committing acts of self-harm.

Examples

  • Detailed methods for suicide
  • Instructions for self-injury techniques
  • Guidance on dangerous eating disorder behaviors

Category 5 of 6

Sexual Content

Why This Matters

Sexual content requires careful handling to protect minors, respect consent, and maintain appropriate boundaries in different contexts. What is acceptable varies widely by platform, audience, and jurisdiction. Detection enables platforms to enforce their policies and ensure content reaches only appropriate audiences.

What It Detects

Content that describes sexual activity or arousal, or explicit material intended to arouse. This includes both explicit descriptions and suggestive content that implies sexual acts.

Examples

  • Explicit descriptions of sexual acts
  • Sexually suggestive content intended to arouse
  • Detailed descriptions of sexual anatomy in arousing contexts

Subcategory

Sexual Content Involving Minors

Any sexual content that involves, depicts, or references minors in sexual situations. This is illegal in virtually all jurisdictions and causes severe harm.

Examples

  • Any sexual content depicting or involving children
  • Sexualization of minors in any context
  • Content that grooms or sexualizes young people

Category 6 of 6

Illicit Activities

Why This Matters

Instructions for illegal activities can enable real-world harm, from fraud that devastates families financially to drug manufacturing that endangers communities. While discussing laws and their implications is legitimate, providing actionable guidance for breaking them crosses a line. Detection helps prevent platforms from becoming instruction manuals for crime.

What It Detects

Content that provides advice, instructions, or detailed guidance on how to commit illegal acts, including but not limited to fraud, theft, hacking, and drug manufacturing.

Examples

  • Step-by-step instructions for committing fraud
  • Guidance on evading law enforcement
  • Tutorials for illegal hacking or unauthorized access

Subcategory

Violent Illicit Activities

Instructions or guidance for illegal activities that involve violence, weapons, or causing physical harm to others.

Examples

  • Instructions for creating weapons or explosives
  • Guidance on how to commit violent crimes
  • Detailed planning for attacks or violent acts

The Full Picture

Together, these 6 main categories and 7 subcategories (13 total) cover a broad spectrum of the harmful content that appears online. Each serves a purpose: protecting individuals, safeguarding communities, and preventing content from causing real-world damage.

  • 6 main categories: Harassment, Hate, Violence, Self-Harm, Sexual, Illicit
  • 7 subcategories: more specific variants that require additional attention
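If you need to refer to the taxonomy in code, it is small enough to express as a plain data structure. The sketch below is only an illustration: the snake_case identifiers are made up for this example, and any real moderation tool or API may name the categories differently.

```python
# Illustrative mapping of the 6 main categories to their 7 subcategories.
# The label names are hypothetical, not those of any particular API.
CATEGORIES = {
    "harassment": ["harassment_threats"],
    "hate": ["hate_threats"],
    "violence": ["violence_graphic"],
    "self_harm": ["self_harm_intent", "self_harm_instructions"],
    "sexual": ["sexual_minors"],
    "illicit": ["illicit_violent"],
}

# Flatten into the full list of 13 labels (6 main + 7 sub).
ALL_LABELS = list(CATEGORIES) + [sub for subs in CATEGORIES.values() for sub in subs]
assert len(ALL_LABELS) == 13
```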

Ready to check some content?

Use the scanner to analyze any text against all 13 categories instantly.

Go to Scanner
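If you are integrating the check into your own code rather than using the scanner page, a scan boils down to scoring the text against each of the 13 labels and surfacing the ones that cross a threshold. The sketch below assumes a hypothetical scan_text() backend and made-up label names; it is not a documented API, just the general shape of the call.

```python
# Minimal sketch: check one piece of text against the 13 category labels.
# scan_text() is a placeholder for whatever moderation service or model you
# actually call; its label names and response shape are assumptions.

LABELS = [
    "harassment", "harassment_threats",
    "hate", "hate_threats",
    "violence", "violence_graphic",
    "self_harm", "self_harm_intent", "self_harm_instructions",
    "sexual", "sexual_minors",
    "illicit", "illicit_violent",
]

def scan_text(text: str) -> dict[str, float]:
    """Placeholder backend: return a score in [0, 1] for each label.

    Swap this out for a real call to your moderation service.
    """
    return {label: 0.0 for label in LABELS}

def flagged(text: str, threshold: float = 0.5) -> list[str]:
    """Return the labels whose score meets or exceeds the threshold."""
    scores = scan_text(text)
    return [label for label, score in scores.items() if score >= threshold]

if __name__ == "__main__":
    print(flagged("Example message to check."))  # [] with the placeholder scores
```

A uniform threshold of 0.5 is only a starting point; in practice the stricter subcategories (anything involving minors, for example) usually warrant a lower threshold or mandatory human review.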