The internet, and social media in particular, has grown exponentially over the past few decades. Social media allows anyone to go online and create content they find interesting, whether appropriate or not. One form of inappropriate content is hate speech: offensive or threatening speech that targets people based on their ethnicity, religion, sexual orientation and similar characteristics.

Hate speech detection models are computational systems that identify and classify online comments as hate speech. “These models are crucial in moderating online content and mitigating the spread of harmful speech, particularly on social media,” said Assistant Professor Roy Lee from the Singapore University of Technology and Design (SUTD). Evaluating the performance of these models is important, but traditional evaluation using held-out test sets often fails to properly assess a model’s performance because of biases inherent in the datasets.

To overcome this limitation, HateCheck and Multilingual HateCheck (MHC) were introduced as functional tests that capture the complexity and diversity of hate speech by simulating real-world scenarios. In their paper “SGHateCheck: Functional tests for detecting hate speech in low-resource languages of Singapore”, Asst Prof Lee and his team build on the frameworks of HateCheck and MHC to develop SGHateCheck, an artificial intelligence (AI)-powered tool that can distinguish between hateful and non-hateful comments in the specific context of Singapore and Southeast Asia.

An evaluation tool built specifically for the region’s linguistic and cultural context was necessary because current hate speech detection models and datasets are mostly based on Western contexts, which do not accurately represent the social dynamics and issues specific to Southeast Asia. “SGHateCheck aims to address these gaps by providing functional tests tailored to the region’s specific needs, ensuring more accurate and culturally sensitive detection of hate speech,” said Asst Prof Lee.

Unlike HateCheck and MHC, SGHateCheck uses large language models (LLMs) to translate and paraphrase test cases into Singapore’s four main languages: English, Mandarin, Tamil and Malay. Native annotators then refine these test cases to ensure cultural relevance and accuracy. The end result is over 11,000 test cases meticulously annotated as hateful or non-hateful, providing a more nuanced basis for evaluating hate speech detection models.
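
To make the idea of a functional test concrete, the sketch below shows how annotated test cases might be used to score a detection model separately for each language and each type of test, rather than with a single overall accuracy figure. It is an illustration only, not SGHateCheck's actual code or data: the `FunctionalCase` fields, the functionality names, the placeholder sentences and the deliberately naive `keyword_classifier` are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class FunctionalCase:
    """One annotated test case (field names are hypothetical)."""
    text: str
    language: str
    functionality: str  # illustrative label for the behaviour being tested
    is_hateful: bool    # gold label assigned by annotators


def accuracy_by_functionality(
    cases: Iterable[FunctionalCase],
    classify: Callable[[str], bool],
) -> Dict[Tuple[str, str], float]:
    """Return accuracy for every (language, functionality) pair."""
    correct: Dict[Tuple[str, str], int] = defaultdict(int)
    total: Dict[Tuple[str, str], int] = defaultdict(int)
    for case in cases:
        key = (case.language, case.functionality)
        total[key] += 1
        if classify(case.text) == case.is_hateful:
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


def keyword_classifier(text: str) -> bool:
    """A deliberately naive stand-in model: flags any text containing 'hate'."""
    return "hate" in text.lower()


# Placeholder sentences; [GROUP] stands in for a protected group.
cases = [
    FunctionalCase("I hate [GROUP].", "en", "direct_expression", True),
    FunctionalCase("I do not hate [GROUP].", "en", "negated_hate", False),
    FunctionalCase("I love [GROUP].", "en", "positive_statement", False),
]

scores = accuracy_by_functionality(cases, keyword_classifier)
for (language, functionality), score in sorted(scores.items()):
    print(f"{language:3} {functionality:20} {score:.2f}")
```

In this toy run the keyword model handles the direct and positive statements but wrongly flags the negated one, which is the kind of specific weakness a per-functionality breakdown is meant to expose and a single held-out accuracy score can hide.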