Toxic Language Classifier
A Lightweight Model for Detecting Toxic Language
About the Project
Thank you for your interest in this project!
In this project, I aimed to build an efficient, lightweight model capable of accurately classifying toxic content while requiring minimal computational resources. The model performs multilabel classification, meaning it can assign multiple labels to a single text based on its content. It is deployed in a Streamlit app, as shown in the screenshot above.
What is Multilabel Classification?
Multilabel classification is a task where each input instance (in this case, a text comment) can be assigned multiple labels simultaneously. This is in contrast to traditional single-label classification, where each instance belongs to just one class.
Key concepts of multilabel classification:
- Multiple Labels: Each comment can have more than one label. For instance, a toxic comment may be tagged with both “threat” and “identity hate”.
- Independent Labels: The labels are not mutually exclusive, meaning a comment can belong to any combination of labels, including none at all.
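Because the labels are independent, the model can output one probability per label (typically via a sigmoid) and threshold each one separately, rather than picking a single class with a softmax. Here is a minimal sketch of that decision step, using hypothetical logits:

```python
import torch

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hypothetical raw model outputs (logits) for a single comment, one per label.
logits = torch.tensor([2.1, -1.3, 0.4, -2.8, 1.7, -0.9])

# Each label gets its own independent sigmoid probability; nothing forces
# the six probabilities to sum to 1, so any combination of labels can fire.
probs = torch.sigmoid(logits)
predicted = [name for name, p in zip(labels, probs) if p > 0.5]
print(predicted)  # ['toxic', 'obscene', 'insult'] for these example logits
```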
Data
The dataset used in this project is from the Toxic Comment Classification Challenge on Kaggle. It contains a large set of Wikipedia comments that have been manually labeled for various types of toxic behavior.
Dataset Variables:
- id: Unique identifier for each comment
- comment_text: The text of the comment
- toxic: Binary (True/False) — Whether the comment is toxic
- severe_toxic: Binary (True/False) — Whether the comment is severely toxic
- obscene: Binary (True/False) — Whether the comment is obscene
- threat: Binary (True/False) — Whether the comment contains a threat
- insult: Binary (True/False) — Whether the comment is insulting
- identity_hate: Binary (True/False) — Whether the comment expresses identity hate
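For a quick orientation, the labels can be inspected with pandas. This is a hedged sketch that assumes the challenge's train.csv has been downloaded into the working directory:

```python
import pandas as pd

# Assumes train.csv from the Kaggle challenge is present locally.
df = pd.read_csv("train.csv")

label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
print(df[["id", "comment_text"] + label_cols].head())

# Fraction of comments carrying each label, and the share with two or more
# labels — the latter is why a multilabel (not single-label) setup is needed.
print(df[label_cols].mean())
print((df[label_cols].sum(axis=1) > 1).mean())
```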
Model Architecture
Given the relatively small dataset (~100K samples) of short texts, a CNN+GRU hybrid model was chosen for this task, as it balances performance and efficiency.
Architecture Overview
- Embedding Layer: Converts input text into dense vectors of fixed size.
- CNN Layer:
- Conv1D: Applies convolutional filters to capture local patterns (e.g., n-grams).
- Pooling Layer: Reduces dimensionality while preserving key features.
- GRU Layer: Captures long-range dependencies and contextual information within the sequential data.
- Dropout Layer: Helps prevent overfitting by randomly setting a fraction of input units to zero during training.
- Dense Layers: Produce the final classification decision, with one output per label.
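Putting the pieces together, here is a minimal PyTorch sketch of this architecture. The project's actual framework and layer details are assumptions; the layer sizes follow the selected hyperparameters listed below:

```python
import torch
import torch.nn as nn

class ToxicCNNGRU(nn.Module):
    """CNN+GRU hybrid for multilabel toxic comment classification (sketch).

    Sizes follow the selected hyperparameters in this README;
    vocab_size is a placeholder that depends on the tokenizer.
    """
    def __init__(self, vocab_size, embed_dim=50, num_filters=128,
                 kernel_size=3, gru_hidden=64, num_labels=6, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size, padding=1)
        self.pool = nn.MaxPool1d(kernel_size=2)  # halves the sequence length
        self.gru = nn.GRU(num_filters, gru_hidden, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(gru_hidden, num_labels)

    def forward(self, x):                             # x: (batch, seq_len) token ids
        e = self.embedding(x)                         # (batch, seq_len, embed_dim)
        c = torch.relu(self.conv(e.transpose(1, 2)))  # (batch, filters, seq_len)
        p = self.pool(c).transpose(1, 2)              # (batch, seq_len/2, filters)
        _, h = self.gru(p)                            # h: (1, batch, gru_hidden)
        return self.fc(self.dropout(h[-1]))           # (batch, num_labels) logits
```

At inference time, applying a sigmoid to the logits yields one independent probability per label, matching the multilabel setup described earlier.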
Why CNN+GRU?
- CNN: Effective at identifying local patterns (like n-grams) within text. It serves as a feature extractor, focusing on important phrases for classification.
- GRU: Ideal for processing sequential data and understanding long-range dependencies, crucial for capturing context and meaning.
Key Benefits
- Feature Extraction: The CNN extracts important local features (n-grams) before passing them to the GRU, which focuses on sequential dependencies.
- Efficiency: Pooling in the CNN reduces the sequence length, enabling the GRU to process fewer time steps and speeding up training (see the sketch after this list).
- Empirical Performance: CNN+GRU architectures often outperform other combinations (e.g., GRU-CNN) in text classification tasks.
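The efficiency point is easy to see in the tensor shapes: after max pooling with a window of 2, the GRU receives half as many time steps. A small sketch with assumed shapes:

```python
import torch
import torch.nn as nn

x = torch.randn(8, 50, 200)  # (batch, embed_dim, seq_len) — assumed shapes
conv = nn.Conv1d(in_channels=50, out_channels=128, kernel_size=3, padding=1)
pool = nn.MaxPool1d(kernel_size=2)

out = pool(torch.relu(conv(x)))
print(out.shape)  # torch.Size([8, 128, 100]) — the GRU now sees 100 steps, not 200
```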
Weights & Biases
The hyperparameter search was conducted using Weights & Biases. The search results for the selected CNN+GRU model, as well as for other models that were not chosen, can be found here.
Hyperparameter Search
- Embedding Dimension: Size of the word embedding vectors (typical range: 50-300).
- Number of Filters: Number of filters in the CNN (typical range: 64-256).
- Kernel Size: Size of the convolutional kernel, capturing different n-gram patterns (typical range: 3-7).
- GRU Hidden Size: Number of features in the GRU’s hidden state (typical range: 50-200).
- Learning Rate: Controls the step size during training (typical range: 0.0001-0.01).
- Batch Size: Number of samples per batch during training (typical range: 16-128).
- Dropout Rate: Prevents overfitting by randomly dropping units during training (typical range: 0.2-0.5).
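A Weights & Biases sweep over these ranges can be defined with a configuration like the following. This is a hedged sketch; the search method, metric name, and exact grids used in the project are assumptions:

```python
import wandb

sweep_config = {
    "method": "bayes",  # assumed; "grid" or "random" also work
    "metric": {"name": "val_roc_auc", "goal": "maximize"},  # metric name is an assumption
    "parameters": {
        "embedding_dim": {"values": [50, 100, 200, 300]},
        "num_filters": {"values": [64, 128, 256]},
        "kernel_size": {"values": [3, 5, 7]},
        "gru_hidden_size": {"values": [50, 64, 128, 200]},
        "learning_rate": {"distribution": "log_uniform_values", "min": 1e-4, "max": 1e-2},
        "batch_size": {"values": [16, 32, 64, 128]},
        "dropout_rate": {"distribution": "uniform", "min": 0.2, "max": 0.5},
    },
}

sweep_id = wandb.sweep(sweep_config, project="toxic-language-classifier")  # project name assumed
# wandb.agent(sweep_id, function=train)  # `train` is a user-defined training function
```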
Selected Hyperparameters:
| Hyperparameter | Value |
|---|---|
| Embedding Dimension | 50 |
| Number of Filters | 128 |
| Kernel Size | 3 |
| GRU Hidden Size | 64 |
| Learning Rate | 0.0008 |
| Batch Size | 16 |
| Dropout Rate | 0.3 |
Performance
Performance is measured using ROC AUC (Area Under the Receiver Operating Characteristic Curve) on the test set.
What is ROC AUC?
ROC AUC is a metric used to evaluate the performance of classification models. It measures the area under the ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The ROC AUC score ranges from 0 to 1, where:
- 1 indicates a perfect classifier
- Closer to 1 means better classification performance
- 0.5 suggests the model performs no better than random guessing
| Metric | Overall | toxic | severe_toxic | obscene | threat | insult | identity_hate |
|---|---|---|---|---|---|---|---|
| ROC AUC | 0.95895 | 0.9705 | 0.9860 | 0.9865 | 0.9557 | 0.9777 | 0.9533 |
These high ROC AUC scores indicate that the model effectively distinguishes between toxic and non-toxic comments across different categories.
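Such scores can be computed per label with scikit-learn; how the "Overall" figure aggregates them (e.g., macro vs. micro averaging) is an assumption here. A minimal sketch on toy data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy data: y_true holds binary ground-truth labels, y_prob the model's
# sigmoid outputs, both shaped (n_samples, 6) — one column per label.
y_true = np.array([[1, 0, 1, 0, 1, 0],
                   [0, 1, 0, 1, 0, 1],
                   [1, 1, 1, 0, 1, 1]])
y_prob = np.array([[0.91, 0.12, 0.80, 0.05, 0.77, 0.10],
                   [0.08, 0.70, 0.04, 0.60, 0.06, 0.55],
                   [0.95, 0.62, 0.88, 0.30, 0.85, 0.72]])

per_label = roc_auc_score(y_true, y_prob, average=None)   # one AUC per label
overall = roc_auc_score(y_true, y_prob, average="macro")  # unweighted mean
```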
Estimated Memory Usage in RAM
The approximate memory usage of the model when loaded is estimated from the number of parameters and their data type (32-bit floats, i.e., 4 bytes per parameter).
| Total Parameters | Estimated Memory Usage |
|---|---|
| 9,305,748 | 35.50 MB |
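The figure follows from simple arithmetic, assuming every parameter is stored as a 32-bit float:

```python
total_params = 9_305_748
bytes_per_param = 4                           # float32 — assumed storage type
memory_mb = total_params * bytes_per_param / (1024 ** 2)
print(f"{memory_mb:.2f} MB")                  # -> 35.50 MB
```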