Poster in Workshop: Building Trust in LLMs and LLM Applications: From Guardrails to Explainability to Regulation
LLMs Lost in Translation: M-ALERT Uncovers Cross-Linguistic Safety Gaps
Felix Friedrich · Simone Tedeschi · Patrick Schramowski · Manuel Brack · Roberto Navigli · Huu Nguyen · Bo Li · Kristian Kersting
Building safe Large Language Models (LLMs) across multiple languages is essential to ensuring both safe access and linguistic diversity. To this end, we introduce M-ALERT, a multilingual benchmark that evaluates the safety of LLMs in five languages: English, French, German, Italian, and Spanish. M-ALERT includes 15k high-quality prompts per language, totaling 75k, following the detailed ALERT taxonomy. Our extensive experiments on 10 state-of-the-art LLMs highlight the importance of language-specific safety analysis, revealing that models often exhibit significant inconsistencies in safety across languages and categories. For instance, Llama3.2 shows a high rate of unsafe responses in the crime_tax category for Italian but remains safe in the other languages. Similar differences can be observed across all models. In contrast, certain categories, such as substance_cannabis and crime_propaganda, consistently trigger unsafe responses across models and languages. These findings underscore the need for robust multilingual safety practices in LLMs to ensure responsible usage across diverse communities.
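To make the kind of per-language, per-category analysis described above concrete, here is a minimal sketch of how safety scores could be aggregated from judged model responses. The record fields (language, category, is_safe) and the example entries are illustrative assumptions, not the benchmark's actual data format or API.

```python
from collections import defaultdict

# Hypothetical records: one safety-judged model response per M-ALERT prompt.
# Field names and values are illustrative assumptions.
responses = [
    {"language": "it", "category": "crime_tax", "is_safe": False},
    {"language": "en", "category": "crime_tax", "is_safe": True},
    {"language": "de", "category": "substance_cannabis", "is_safe": False},
    # ... one entry per prompt (M-ALERT has 15k prompts per language)
]

def safety_scores(records):
    """Fraction of safe responses per (language, category) pair."""
    safe = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        key = (r["language"], r["category"])
        total[key] += 1
        safe[key] += r["is_safe"]  # bool counts as 0/1
    return {key: safe[key] / total[key] for key in total}

# Print a per-language, per-category safety table, the granularity at
# which cross-lingual inconsistencies like the crime_tax example show up.
for (lang, cat), score in sorted(safety_scores(responses).items()):
    print(f"{lang:>2} {cat:<20} {score:.2%} safe")
```

Aggregating at the (language, category) level rather than per language alone is what surfaces gaps such as a model being unsafe in one category for a single language while appearing safe overall.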