A new ATOM analysis of about 1,500 open language models maps downloads, derivatives, inference share and performance, and reports Chinese models surpassed U.S.
Multimodal LLMs Underperform in Real-World Dermatology Evaluation
A new study reveals that multimodal large language models struggle with clinical dermatology tasks.
Aymara AI Launches Safety Evaluation System for 20 Language Models
Aymara AI unveils a platform for custom safety evaluations of large language models, revealing performance gaps.
Research Proposes MedCheck Framework to Enhance Medical AI Benchmarks
A new framework aims to improve the assessment of medical AI benchmarks, addressing key shortcomings.
Nemobot Introduces Strategic AI Agents for Interactive Gaming
Nemobot leverages large language models to create customizable AI agents for strategic games.
NVIDIA Advances Optimizers to Speed Up LLM Training
NVIDIA introduces new higher-order optimizers to enhance training efficiency for large language models.
RepIt Framework Enables Concept-Specific Refusal in Language Models
A new framework exposes vulnerabilities in language model safety evaluations through concept-specific manipulations.