LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models


October 24, 2025

Slide overview

R. Oshima, Y. Hosoda, Y. Iiguni, "LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models," in Proc. of APSIPA, 2025.


Affiliation: Assistant Professor, College of Information Science and Engineering, Ritsumeikan University. Research fields: speech processing / image processing


Text of each slide
1.

LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models
University of Osaka: Ryutaro Oshima, Yuya Hosoda, Yoji Iiguni
Note: This presentation contains content of an offensive or hateful nature, but does not refer to any specific individual or group.

2.

1/6 Background
Online platforms: hate speech attacks human identity (gender, race, religion).
✓ Real-time voice communication
✓ Advancements in speech recognition
✓ Attacks on human identity
✓ Increase in the number of incidents
※ Distribution of inappropriate remarks
※ Difficult to control and regulate

3.

1/6 Background
Online platforms: hate speech attacks human identity (gender, race, religion).
✓ Real-time voice communication
✓ Advancements in speech recognition
✓ Attacks on human identity
✓ Increase in the number of incidents
※ Distribution of inappropriate remarks
※ Difficult to control and regulate
Research objective: speech recognition that masks inappropriate remarks in hate speech audio, to prevent human rights violations.
Example: <I despise immigrants.> → "I *** immigrants."
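As a point of reference for the masking format above, here is a minimal word-replacement sketch. The word list `HATE_WORDS` is purely illustrative; the proposed model performs masking inside the LLM decoder rather than via a lookup table.

```python
import re

# Hypothetical word list, for illustration only.
HATE_WORDS = {"despise"}

def mask_transcript(text: str) -> str:
    """Replace each listed word in the transcript with ***."""
    def repl(m: re.Match) -> str:
        return "***" if m.group(0).lower() in HATE_WORDS else m.group(0)
    return re.sub(r"[A-Za-z']+", repl, text)

print(mask_transcript("I despise immigrants."))  # I *** immigrants.
```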

4.

2/6 Methods: Model Architecture
Cascade model [1]: Hate speech audio → ASR (Encoder, Decoder) → Transcript → NLP (BERT / LIME) → Censored transcript
Problem: recognition errors cause mistakenly censored transcripts.
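The cascade baseline can be sketched structurally as below. `asr`, `classify`, and `word_scores` stand in for the real ASR, BERT classifier, and LIME-style word attributions; the names and threshold are assumptions, not the paper's implementation.

```python
def cascade_censor(audio, asr, classify, word_scores, thr=0.5) -> str:
    """Cascade pipeline: transcribe, classify, then mask high-attribution words."""
    transcript = asr(audio)
    if classify(transcript) != "Harmful":
        return transcript            # benign: pass through unchanged
    words = transcript.split()
    scores = word_scores(transcript)  # one attribution score per word
    return " ".join("***" if s > thr else w for w, s in zip(words, scores))
```

Note how the masking quality is bounded by the transcript: if the ASR step misrecognizes a word, the attribution and masking stages operate on the wrong text, which is the failure mode this slide points out.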

5.

2/6 Methods: Model Architecture
Cascade model [1]: Hate speech audio <I despise immigrants.> → ASR (Encoder, Decoder) → ASR transcript → NLP (BERT / LIME) → Censored transcript
Proposed model (ASR-LLM [2]): Hate speech audio → Encoder → Q-Former → LLM decoder → Masked text "I *** immigrants."
Prompt: "Recognize the speech and give me the transcription. Hide hateful terms using ***."
✓ The LLM decoder replaces the ASR decoder and the NLP module.
✓ Performs speech recognition and multi-word masking simultaneously.

6.

3/6 Methods: Dataset Generation
CoT prompt with HateXplain (e.g. target: African, hate word: nigger) → LLM (Qwen2.5-1.5B-Instruct) → Hate speech text
Hate speech text → TTS → Audio
Hate speech text → BERT → Hate level determination
Masked text: "I *** immigrants."
→ Generated dataset

7.

3/6 Methods: Dataset Generation
HateXplain [3]
✓ Hate speech text dataset with human-annotated explanations
✓ Extract hate words, e.g. nigger, kike, blacks, …
✓ Split the text by target, e.g. African, Arab
Chain of Thought (CoT) prompt
1. Defining hate speech
2. Providing two examples → sentences from the same target
3. Generating hate speech text → instructions to include hate words
Pipeline: CoT prompt → LLM (Qwen2.5-1.5B-Instruct) → Hate speech text → TTS → Audio; BERT → Hate level determination; masked text "I *** immigrants."
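The three-step CoT prompt can be sketched as a simple template. The exact wording below is illustrative, not the paper's prompt; only the three-step structure (definition, two same-target examples, generation instruction with a required hate word) follows the slide.

```python
def build_cot_prompt(target: str, examples: list[str], hate_word: str) -> str:
    """Assemble the three CoT steps described on the slide into one prompt."""
    lines = [
        # Step 1: defining hate speech (illustrative wording).
        "Hate speech is language that attacks a person or group based on "
        "identity such as gender, race, or religion.",
        # Step 2: two examples drawn from the same target group.
        f"Here are two examples of hate speech targeting {target}:",
    ]
    lines += [f"- {ex}" for ex in examples[:2]]
    # Step 3: generation instruction that must include the hate word.
    lines.append(
        f"Generate one new hate speech sentence targeting {target} "
        f"that includes the word '{hate_word}'."
    )
    return "\n".join(lines)
```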

8.

3/6 Methods: Dataset Generation
Hate speech detection models (from Hugging Face):
- Hate-speech-CNERG/dehatebert-mono-english (BERT)
- IMSyPP/hate_speech_en
- facebook/roberta-hate-speech-dynabench-r4-target
- BERT based on HateXplain [1]
- Narrativa/byt5-base-tweet-hate-detection (T5)
Each model labels a generated sentence Normal or Harmful; the hate level is the number of Harmful votes:
Level 0 (H: 0 / N: 5), Level 1 (H: 1 / N: 4), Level 2 (H: 2 / N: 3), Level 3 (H: 3 / N: 2), Level 4 (H: 4 / N: 1), Level 5 (H: 5 / N: 0)
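The hate-level vote is simple enough to state in a few lines. `classify_fns` stands in for the five Hugging Face detectors listed above, each assumed to return "Harmful" or "Normal" for a sentence.

```python
def hate_level(text: str, classify_fns) -> int:
    """Hate level = number of detectors (out of 5) that vote Harmful (0..5)."""
    votes = [fn(text) for fn in classify_fns]
    return sum(1 for v in votes if v == "Harmful")

# Usage with dummy classifiers standing in for the real models:
fns = [lambda t: "Harmful"] * 3 + [lambda t: "Normal"] * 2
print(hate_level("example sentence", fns))  # 3
```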

9.

4/6 Fine-tuning
1st training (Encoder): Hate speech <audio> → Whisper large-v2 Encoder → Decoder → Transcript; loss against <text>
1st training: 5 epochs, lr = 5e-7; 2nd training: 3 epochs, lr = 3e-5

10.

4/6 Fine-tuning
1st training (Encoder): Hate speech <audio> → Whisper large-v2 Encoder → Decoder → Transcript; loss against <text> (5 epochs, lr = 5e-7)
2nd training (Decoder): Hate speech <audio> → Encoder → Q-Former → Vicuna 7B-v1.5 Decoder (LoRA) with Prompt → Masked text; loss against the ground-truth masked text (3 epochs, lr = 3e-5)
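For the LoRA adapter used on the Vicuna decoder in the 2nd stage, here is a minimal numeric sketch of the idea (toy sizes, NumPy only): the frozen weight W gets a trainable low-rank update B @ A, and B starts at zero so the adapted layer initially matches the pretrained one.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                       # hidden size and LoRA rank (toy values)
W = rng.normal(size=(d, d))       # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))              # trainable up-projection, zero-initialized

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Adapted linear layer: x @ (W + B A)^T; only A and B are trained."""
    return x @ (W + B @ A).T

x = rng.normal(size=(1, d))
# Before any training step, B == 0, so the adapter is a no-op:
assert np.allclose(lora_forward(x), x @ W.T)
```

Only the d*r + r*d adapter parameters receive gradients, which is why LoRA makes fine-tuning a 7B decoder tractable.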

11.

5/6 Experiment: Baselines and Metrics

Models:
  Cascade model [1]: ASR-BERT (Encoder: Whisper, Decoder: Whisper), ASR*-BERT (Encoder: Whisper, Decoder: Whisper)
  ASR-LLM integrated model: ASR-LLM [2] (Encoder: Whisper, Decoder: Vicuna), ASR*-LLM (Encoder: Whisper, Decoder: Vicuna)

Metrics:
  Masking accuracy rate (MAR) = n(M_ori ∩ M_out) / n(M_ori) × 100 [%] — target: censorship accuracy
  Word error rate (WER) = (I + S + D) / N × 100 [%] — target: ASR and censorship accuracy
  Unmasked WER (UMWER) = (Î + Ŝ + D̂) / N̂ × 100 [%], computed over the unmasked words — target: impact of censorship on ASR
  M_ori / M_out: the set of masked words (original / output); N: the number of words
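The two main metrics above can be computed directly from their definitions. This is a generic sketch (standard word-level edit distance for WER; set intersection for MAR), not the paper's evaluation script.

```python
def masking_accuracy(m_ori: set[str], m_out: set[str]) -> float:
    """MAR: share of originally masked words that are also masked in the output."""
    return 100.0 * len(m_ori & m_out) / len(m_ori)

def wer(ref: list[str], hyp: list[str]) -> float:
    """WER = (I + S + D) / N via word-level Levenshtein distance."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return 100.0 * d[-1][-1] / len(ref)
```

UMWER would apply the same `wer` routine after removing the masked positions from both reference and hypothesis, isolating the recognition errors from the censorship decisions.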

12.

6/6 Results: 1st experiment, model comparison (train: hate level 5; test: 300 sentences from HateXplain)

Model              MAR ↑   WER ↓   UMWER ↓
ASR-BERT [1]       36.7    32.6    57.1
ASR*-BERT          54.4    25.8    47.4
ASR-LLM [2]        45.8    29.6    53.8
ASR*-LLM (Prop.)   58.6    27.1    47.3

✓ ASR*-LLM recorded the best MAR and UMWER, balancing speech recognition and censorship performance.
✓ WER increased slightly compared to ASR*-BERT: the LLM-integrated models sometimes produced over-masking.

13.

6/6 Results: 1st experiment, model comparison (train: hate level 5; test: 300 sentences from HateXplain)

Model              MAR ↑   WER ↓   UMWER ↓
ASR-BERT [1]       36.7    32.6    57.1
ASR*-BERT          54.4    25.8    47.4
ASR-LLM [2]        45.8    29.6    53.8
ASR*-LLM (Prop.)   58.6    27.1    47.3

✓ All models showed worse WER than standard ASR models: the quality of the masked text in the generated dataset is insufficient, and relying on a single dataset limits the expressiveness of the generated text.