New York, New York–(Newsfile Corp. – May 15, 2025) – The International Conference on Machine Learning (ICML) has officially accepted “ELITE: Enhanced Language-Image Toxicity Evaluation for Safety”, a collaborative paper from AIM Intelligence, Seoul National University, Yonsei University, KIST, Kyung Hee University, and Sookmyung Women’s University.

AIM Intelligence CI

The paper proposes ELITE, a high-quality benchmark designed to evaluate the safety of Vision-Language Models (VLMs) with greater precision. At its core is the ELITE evaluator, a rubric-based method that incorporates a toxicity score to measure harmfulness in multimodal contexts-especially where VLMs produce specific, convincing responses that may appear harmless but convey dangerous intent.

“We’re incredibly proud that ELITE is being recognized at ICML,” said Sangyoon Yu, co-author and CEO of AIM Intelligence. “This framework is designed not just for research, but to meet the demands of real-world deployment.”

ELITE Main Figure

Going Beyond Refusal Checks

Most safety benchmarks rely on simple refusal detection-whether a model rejects an unsafe prompt. ELITE takes it further by introducing a rubric-based evaluator that assigns a 0-25 score for every response. It assesses four dimensions:

RefusalSpecificityConvincingnessToxicity (0-5 scale)

This scoring system builds on the StrongREJECT framework (NeurIPS 2024) but adds a toxicity axis to better catch implicit harm, especially in safe-safe pairs-cases where both image and prompt appear safe, but the model response is not.

Designed for Real-World Attacks

To test models more thoroughly, the ELITE benchmark includes:

4,587 image-text pairs across 11 safety domains (e.g. hate, defamation, privacy, sexual content)1,054 adversarial examples, created using four techniques: BlueprintsFlowchartsFake NewsRole Play

These examples reflect the kinds of prompts that can cause real-world damage-even when they don’t look harmful on the surface.

Performance That Exposes the Gaps

ELITE was tested against 18 leading models, including GPT-4o, Gemini-2.0, and Pixtral-12B. The results speak for themselves:

MetricELITE BenchmarkPrior BenchmarksAttack Success Rate (E-ASR)2-3x more effective in detecting failuresOften underreportedAUROC vs. Human Judgment0.77 (ELITE) vs. 0.46 (StrongREJECT)Weaker evaluator alignmentPixtral-12B Failure Rate>79%Highest across all modelsGPT-4o Failure Rate15.67%Still vulnerable

Even the best models showed significant blind spots when tested with ELITE.

From Research to Product: AIM Supervisor

AIM Supervisor – AIM Red Dashboard 1

AIM Supervisor is AIM Intelligence’s enterprise AI safety platform designed to support both text-based and multimodal models, including Vision-Language Models (VLMs). It enables continuous evaluation and risk control through real-time scoring, adversarial testing, and policy-based output filtering.

The platform integrates with OpenAI-compatible and HuggingFace-based models via REST API, deployable through a container or one-line API wrapper.

Inference latency (GPU): under 700msFull evaluation cycle: under 1.5 second

AIM Guard Dashboard

Key components include:

AIM Red – an automated adversarial engine that generates jailbreak prompts across high-risk taxonomiesAIM Guard – a real-time output evaluator applying rubric-based filtersAI Safety Dashboard – a unified console for monitoring, scoring, and policy tuning

AIM Supervisor – AIM Red Dashboard 2

Together, these tools help organizations detect unsafe behavior, enforce policies, and maintain governance-all without slowing development.

Product access: https://suho.aim-intelligence.com/en

Global Adoption and Policy Recognition

AIM Guard User Input / Policy

AIM Intelligence’s safety technologies are gaining traction across both industry and policy communities.

In partnership with LG CNS, AIM conducted red teaming and guardrail implementation for a customer-facing AI assistant at Woori Bank. Within days, ELITE surfaced privacy and financial safety violations, leading to targeted architecture updates.

AIM also collaborated with KT, Korea’s largest telecom provider, to evaluate internal AI systems. The assessment revealed system-level vulnerabilities and informed new safety protocols for deployment.

AIM’s work is being recognized internationally:

Meta’s Llama Impact Innovation Award – First Korean recipientAnthropic Bug Bounty Program – Red teaming frontier modelsTTA Standardization Partner – Helping define national safety guidelines for finance, healthcare, robotics, and public-sector AI

“With ELITE and our broader safety stack, we’re giving builders and regulators the confidence they need to deploy AI responsibly,” said Yu.

AIM Supervisor – AIM Guard Graph Policy

AI Safety Market: Growing Fast, Under-Regulated

As AI systems become embedded in finance, healthcare, defense, and public infrastructure, trust and accountability are no longer optional.

According to Markets and Markets, the global AI safety market is projected to grow from $1.1 billion in 2024 to $5.4 billion by 2030, with a 30.2% CAGR. With regulation on the rise, organizations are seeking solutions that are both robust and scalable. ELITE and AIM Supervisor meet that demand.

AIM Intelligence Joint Research Team: From left: Ha-eon Park [Seoul National University], Yoo-jin Choi [Sookmyung Women’s University], Won-jun Lee [KIST (Yonsei University)], Do-hyun Lee [Seoul National University], Sang-yoon Yoo [Seoul National University]

Matribhumi Samachar English

AIM Intelligence’s ELITE Collaborative Paper Accepted by the ICML

About Saransh Kanaujia

Check Also

Anteros Metals Announces Private Placement