LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models


October 24, 2025

Slide overview

R. Oshima, Y. Hosoda, Y. Iiguni, "LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models," in Proc. of APSIPA, 2025.


Affiliation: Assistant Professor, College of Information Science and Engineering, Ritsumeikan University. Research fields: speech processing / image processing


Text of each slide
1.

LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models
University of Osaka: Ryutaro Oshima, Yuya Hosoda, Yoji Iiguni
Note: This presentation contains content of an offensive or hateful nature, but does not refer to any specific individual or group.

2.

1/6 Background
Online platforms: hate speech attacks human identity (gender, race, religion).
✓ Real-time voice communication
✓ Advancements in speech recognition
✓ Attacks on human identity
✓ Increase in the number of incidents
※ Distribution of inappropriate remarks
※ Difficult to control and regulate

3.

1/6 Background
Online platforms: hate speech attacks human identity (gender, race, religion).
✓ Real-time voice communication
✓ Advancements in speech recognition
✓ Attacks on human identity
✓ Increase in the number of incidents
※ Distribution of inappropriate remarks
※ Difficult to control and regulate
Research objective: speech recognition that masks inappropriate remarks in hate speech audio, to prevent human rights violations.
Example: <I despise immigrants.> → "I *** immigrants."
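As a point of reference for the masking format above, here is a minimal word-replacement sketch. The word list `HATE_WORDS` is purely illustrative; the proposed model performs masking inside the LLM decoder rather than via a lookup table.

```python
import re

# Hypothetical word list, for illustration only.
HATE_WORDS = {"despise"}

def mask_transcript(text: str) -> str:
    """Replace each listed word in the transcript with ***."""
    def repl(m: re.Match) -> str:
        return "***" if m.group(0).lower() in HATE_WORDS else m.group(0)
    return re.sub(r"[A-Za-z']+", repl, text)

print(mask_transcript("I despise immigrants."))  # I *** immigrants.
```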

4.

2/6 Methods: Model Architecture
Cascade model [1]: Hate speech audio → ASR (Encoder, Decoder) → Transcript → NLP (BERT / LIME) → Censored transcript
Problem: recognition errors cause mistakenly censored transcripts.
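The cascade baseline can be sketched structurally as below. `asr`, `classify`, and `word_scores` stand in for the real ASR, BERT classifier, and LIME-style word attributions; the names and threshold are assumptions, not the paper's implementation.

```python
def cascade_censor(audio, asr, classify, word_scores, thr=0.5) -> str:
    """Cascade pipeline: transcribe, classify, then mask high-attribution words."""
    transcript = asr(audio)
    if classify(transcript) != "Harmful":
        return transcript            # benign: pass through unchanged
    words = transcript.split()
    scores = word_scores(transcript)  # one attribution score per word
    return " ".join("***" if s > thr else w for w, s in zip(words, scores))
```

Note how the masking quality is bounded by the transcript: if the ASR step misrecognizes a word, the attribution and masking stages operate on the wrong text, which is the failure mode this slide points out.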

5.

2/6 Methods: Model Architecture
Cascade model [1]: Hate speech audio <I despise immigrants.> → ASR (Encoder, Decoder) → ASR transcript → NLP (BERT / LIME) → Censored transcript
Proposed model (ASR-LLM [2]): Hate speech audio → Encoder → Q-Former → LLM decoder → Masked text "I *** immigrants."
Prompt: "Recognize the speech and give me the transcription. Hide hateful terms using ***."
✓ The LLM decoder replaces the ASR decoder and the NLP module.
✓ Performs speech recognition and multi-word masking simultaneously.

6.

3/6 Methods: Dataset Generation
CoT prompt with HateXplain (e.g. target: African, hate word: nigger) → LLM (Qwen2.5-1.5B-Instruct) → Hate speech text
Hate speech text → TTS → Audio
Hate speech text → BERT → Hate level determination
Masked text: "I *** immigrants."
→ Generated dataset

7.

3/6 Methods: Dataset Generation
HateXplain [3]
✓ Hate speech text dataset with human-annotated explanations
✓ Extract hate words, e.g. nigger, kike, blacks, …
✓ Split the text by target, e.g. African, Arab
Chain of Thought (CoT) prompt
1. Defining hate speech
2. Providing two examples → sentences from the same target
3. Generating hate speech text → instructions to include hate words
Pipeline: CoT prompt → LLM (Qwen2.5-1.5B-Instruct) → Hate speech text → TTS → Audio; BERT → Hate level determination; masked text "I *** immigrants."
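The three-step CoT prompt can be sketched as a simple template. The exact wording below is illustrative, not the paper's prompt; only the three-step structure (definition, two same-target examples, generation instruction with a required hate word) follows the slide.

```python
def build_cot_prompt(target: str, examples: list[str], hate_word: str) -> str:
    """Assemble the three CoT steps described on the slide into one prompt."""
    lines = [
        # Step 1: defining hate speech (illustrative wording).
        "Hate speech is language that attacks a person or group based on "
        "identity such as gender, race, or religion.",
        # Step 2: two examples drawn from the same target group.
        f"Here are two examples of hate speech targeting {target}:",
    ]
    lines += [f"- {ex}" for ex in examples[:2]]
    # Step 3: generation instruction that must include the hate word.
    lines.append(
        f"Generate one new hate speech sentence targeting {target} "
        f"that includes the word '{hate_word}'."
    )
    return "\n".join(lines)
```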

8.

3/6 Methods: Dataset Generation
Hate speech detection models (from Hugging Face):
- Hate-speech-CNERG/dehatebert-mono-english (BERT)
- IMSyPP/hate_speech_en
- facebook/roberta-hate-speech-dynabench-r4-target
- BERT based on HateXplain [1]
- Narrativa/byt5-base-tweet-hate-detection (T5)
Each model labels a generated sentence Normal or Harmful; the hate level is the number of Harmful votes:
Level 0 (H: 0 / N: 5), Level 1 (H: 1 / N: 4), Level 2 (H: 2 / N: 3), Level 3 (H: 3 / N: 2), Level 4 (H: 4 / N: 1), Level 5 (H: 5 / N: 0)
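The hate-level vote is simple enough to state in a few lines. `classify_fns` stands in for the five Hugging Face detectors listed above, each assumed to return "Harmful" or "Normal" for a sentence.

```python
def hate_level(text: str, classify_fns) -> int:
    """Hate level = number of detectors (out of 5) that vote Harmful (0..5)."""
    votes = [fn(text) for fn in classify_fns]
    return sum(1 for v in votes if v == "Harmful")

# Usage with dummy classifiers standing in for the real models:
fns = [lambda t: "Harmful"] * 3 + [lambda t: "Normal"] * 2
print(hate_level("example sentence", fns))  # 3
```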

9.

4/6 Fine-tuning
1st training (Encoder): Hate speech <audio> → Whisper large-v2 Encoder → Decoder → Transcript; loss against <text>
1st training: 5 epochs, lr = 5e-7; 2nd training: 3 epochs, lr = 3e-5

10.

4/6 Fine-tuning
1st training (Encoder): Hate speech <audio> → Whisper large-v2 Encoder → Decoder → Transcript; loss against <text> (5 epochs, lr = 5e-7)
2nd training (Decoder): Hate speech <audio> → Encoder → Q-Former → Vicuna 7B-v1.5 Decoder (LoRA) with Prompt → Masked text; loss against the ground-truth masked text (3 epochs, lr = 3e-5)
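For the LoRA adapter used on the Vicuna decoder in the 2nd stage, here is a minimal numeric sketch of the idea (toy sizes, NumPy only): the frozen weight W gets a trainable low-rank update B @ A, and B starts at zero so the adapted layer initially matches the pretrained one.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                       # hidden size and LoRA rank (toy values)
W = rng.normal(size=(d, d))       # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))              # trainable up-projection, zero-initialized

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Adapted linear layer: x @ (W + B A)^T; only A and B are trained."""
    return x @ (W + B @ A).T

x = rng.normal(size=(1, d))
# Before any training step, B == 0, so the adapter is a no-op:
assert np.allclose(lora_forward(x), x @ W.T)
```

Only the d*r + r*d adapter parameters receive gradients, which is why LoRA makes fine-tuning a 7B decoder tractable.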

11.

5/6 Experiment: Baselines and Metrics

Models:
  Cascade model [1]: ASR-BERT (Encoder: Whisper, Decoder: Whisper), ASR*-BERT (Encoder: Whisper, Decoder: Whisper)
  ASR-LLM integrated model: ASR-LLM [2] (Encoder: Whisper, Decoder: Vicuna), ASR*-LLM (Encoder: Whisper, Decoder: Vicuna)

Metrics:
  Masking accuracy rate (MAR) = n(M_ori ∩ M_out) / n(M_ori) × 100 [%] — target: censorship accuracy
  Word error rate (WER) = (I + S + D) / N × 100 [%] — target: ASR and censorship accuracy
  Unmasked WER (UMWER) = (Î + Ŝ + D̂) / N̂ × 100 [%], computed over the unmasked words — target: impact of censorship on ASR
  M_ori / M_out: the set of masked words (original / output); N: the number of words
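The two main metrics above can be computed directly from their definitions. This is a generic sketch (standard word-level edit distance for WER; set intersection for MAR), not the paper's evaluation script.

```python
def masking_accuracy(m_ori: set[str], m_out: set[str]) -> float:
    """MAR: share of originally masked words that are also masked in the output."""
    return 100.0 * len(m_ori & m_out) / len(m_ori)

def wer(ref: list[str], hyp: list[str]) -> float:
    """WER = (I + S + D) / N via word-level Levenshtein distance."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return 100.0 * d[-1][-1] / len(ref)
```

UMWER would apply the same `wer` routine after removing the masked positions from both reference and hypothesis, isolating the recognition errors from the censorship decisions.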

12.

6/6 Results: 1st experiment, model comparison (train: hate level 5; test: 300 sentences from HateXplain)

Model              MAR ↑   WER ↓   UMWER ↓
ASR-BERT [1]       36.7    32.6    57.1
ASR*-BERT          54.4    25.8    47.4
ASR-LLM [2]        45.8    29.6    53.8
ASR*-LLM (Prop.)   58.6    27.1    47.3

✓ ASR*-LLM recorded the best MAR and UMWER, balancing speech recognition and censorship performance.
✓ WER increased slightly compared to ASR*-BERT: the LLM-integrated models sometimes produced over-masking.

13.

6/6 Results: 1st experiment, model comparison (train: hate level 5; test: 300 sentences from HateXplain)

Model              MAR ↑   WER ↓   UMWER ↓
ASR-BERT [1]       36.7    32.6    57.1
ASR*-BERT          54.4    25.8    47.4
ASR-LLM [2]        45.8    29.6    53.8
ASR*-LLM (Prop.)   58.6    27.1    47.3

✓ All models showed worse WER than standard ASR models: the quality of the masked text in the generated dataset is insufficient, and relying on a single dataset limits the expressiveness of the generated text.