Search: Model | CurrentLens.com

AI in Coding

Export Controls Failed for PGP; Unlikely to Stop Anthropic’s Mythos

CurrentLens
Jun 21, 2026

TechCrunch argues three decades of failed export controls on security software cast doubt on any attempt to block Anthropic’s Mythos model.

Models & Launches

VibeThinker-3B Matches DeepSeek V3.2 and Kimi K2.5 on Verifiable Benchmarks

CurrentLens
Jun 21, 2026

VibeThinker-3B is a 3B MIT-licensed dense reasoning model built on Qwen2.

AI in Coding

Z.ai Ships GLM-5.2 with Usable 1M-Token Context

CurrentLens
Jun 16, 2026

GLM-5.2 arrives across GLM Coding tiers with a 1M-token context, two effort modes, Anthropic-compatible endpoints and no benchmarks at launch.

Models & Launches

Extend Vision-Language-Action Policies to New Tasks via Retrieval

CurrentLens
Jun 16, 2026

An arXiv paper shows frozen vision-language-action policies can absorb new tasks at test time by retrieving pool-side demonstrations instead of per-task fine-tuning.

Enterprise AI

NVIDIA Deploys MiniMax M3 on Blackwell for Long‑Context Multimodal Workflows

CurrentLens
Jun 16, 2026

NVIDIA is making the MiniMax M3 multimodal model available on its accelerated infrastructure, including Blackwell, to support long‑context reasoning and agentic workflows.

AI Creative

Google DeepMind Models Trained on Concept Art for Tribeca Film

CurrentLens
Jun 16, 2026

At Tribeca, creators behind Dear Upstairs Neighbors used concept art to fine-tune custom builds of Google DeepMind’s Veo and Imagen, signaling a move away from prompt-only workflows.

Models & Launches

Google Releases Gemini-SQL2; Gemini 3.1 Pro Scores 80.04% on BIRD

CurrentLens
Jun 13, 2026

Google Research announced Gemini-SQL2, a Gemini 3.1 Pro-powered text-to-SQL capability that posted 80.04% execution accuracy on the BIRD single-model leaderboard.

Models & Launches

DKPS method cuts model-evaluation queries using cached responses

CurrentLens
Jun 6, 2026

An arXiv paper introduces a DKPS-based approach that uses cached model outputs to predict benchmark scores while substantially reducing the number of queries.

Enterprise AI

NVIDIA Unveils Software and Models to Power Enterprise AI Agents

CurrentLens
Jun 2, 2026

NVIDIA announced new software, open-source models and platform partnerships to build autonomous AI agents for engineering, healthcare, development and business operations.

AI Creative

OpenAI’s frontier models and Codex go live on AWS

CurrentLens
Jun 2, 2026

OpenAI made its frontier models and Codex generally available on AWS, letting enterprises build with OpenAI inside existing AWS environments and procurement flows.

Models & Launches

PIGMENT extends quantitative diffusion MRI to sparse, multi-site and low-field scans

CurrentLens
Jun 2, 2026

A physics-informed foundation model called PIGMENT learns a universal microstructure prior and adapts zero-shot to individual diffusion MRI scans, enabling reliable maps from sparse and heterogeneous data.

Models & Launches

ATOM Report Finds Chinese Open Models Overtook Western Peers in 2025

CurrentLens
May 27, 2026

A new ATOM analysis of about 1,500 open language models maps downloads, derivatives, inference share and performance, and reports Chinese models surpassed U.S.

Models & Launches

Authors Release OpenEval and Demand Item-Level Benchmark Standards

CurrentLens
May 25, 2026

A position paper argues AI evaluation must publish item-level benchmark responses and ships OpenEval - 10M model responses across 155k items - to prove the point.

Open Source & Research

Multimodal LLMs Underperform in Real-World Dermatology Evaluation

CurrentLens
May 8, 2026

A new study reveals that multimodal large language models struggle with clinical dermatology tasks.

Policy & Safety

Pentagon Sees Opportunities in Frontier AI Models Despite Mythos Concerns

CurrentLens
May 8, 2026

Defense officials are discussing frontier AI models, focusing on potential benefits amidst risks raised by Mythos.

Models & Launches

New Study Reveals Limits of Model-Level Evaluations in Alignment Assessments

CurrentLens
May 8, 2026

A recent paper argues that alignment evaluation cannot solely rely on model-level assessments.

Open Source & Research

RPC-Bench Introduces Fine-Grained Benchmark for Research Paper Comprehension

CurrentLens
May 1, 2026

RPC-Bench is a new benchmark that addresses gaps in how AI models comprehend academic papers.

Models & Launches

Aymara AI Launches Safety Evaluation System for 20 Language Models

CurrentLens
May 1, 2026

Aymara AI unveils a platform for custom safety evaluations of large language models, revealing performance gaps.

AI in Education

Elon Musk Reveals xAI Trained Grok Using OpenAI Models

CurrentLens
Apr 30, 2026

Elon Musk testified that xAI used OpenAI's models to enhance its Grok AI, raising regulatory questions.

Science & Healthcare

Research Proposes MedCheck Framework to Enhance Medical AI Benchmarks

CurrentLens
Apr 30, 2026

A new framework aims to improve the assessment of medical AI benchmarks, addressing key shortcomings.

Models & Launches

Goodfire Launches Silico, a New Tool for Debugging LLMs

CurrentLens
Apr 30, 2026

Silico allows developers to fine-tune AI model parameters during training, enhancing control.

Chips & Infrastructure

NVIDIA Nemotron 3 Nano Omni Model Launches on Amazon SageMaker JumpStart

CurrentLens
Apr 29, 2026

NVIDIA now offers the Nemotron 3 Nano Omni model on Amazon SageMaker JumpStart for enterprise use.

Policy & Safety

AI Firms Limit Access to Models Amid Rising Dual-Use Risks

CurrentLens
Apr 28, 2026

Leading AI companies restrict access to advanced models like GPT-Rosalind due to safety concerns.

AI Defense & Warfare

Pentagon Integrates Google’s AI Model into GenAI.mil Amid Rising Usage

CurrentLens
Apr 28, 2026

The Pentagon has incorporated Google's latest AI model into GenAI.mil as user engagement surges.

Models & Launches

Microsoft Launches VibeVoice, a New Speech-to-Text Model

CurrentLens
Apr 28, 2026

Microsoft introduces VibeVoice, a Whisper-style speech-to-text model with speaker diarization.

Models & Launches

Test-Time Matching Enhances Compositional Reasoning in Multimodal Models

CurrentLens
Apr 27, 2026

A new test-time matching method improves compositional reasoning in AI models, achieving state-of-the-art results.

Open Source & Research

Civitai Launches High-Fidelity Studious Scout LoRA for Fortnite

CurrentLens
Apr 26, 2026

Civitai releases the Studious Scout 🎒 LoRA for Fortnite, designed for flexibility and character consistency.

Models & Launches

OpenAI Introduces Parameter Golf in Model Craft Initiative

CurrentLens
Apr 26, 2026

OpenAI's latest initiative, Parameter Golf, aims to refine model performance metrics.

Chips & Infrastructure

NVIDIA Optimizes Jetson Memory Efficiency to Power Physical AI

CurrentLens
Apr 26, 2026

NVIDIA reveals enhancements in Jetson's memory management, enabling larger AI models at the edge.

Models & Launches

DenoiseRank Introduces Generative Approach to Learning to Rank

CurrentLens
Apr 26, 2026

DenoiseRank leverages diffusion models for a fresh generative angle on learning to rank tasks.

AI in Coding

Claude Code Addresses Quality Complaints After Internal Review

CurrentLens
Apr 26, 2026

Claude Code's recent quality issues stem from three specific bugs, not from the models themselves.

Models & Launches

Nemobot Introduces Strategic AI Agents for Interactive Gaming

CurrentLens
Apr 26, 2026

Nemobot leverages large language models to create customizable AI agents for strategic games.

Models & Launches

AI Models Show Risks for Biological Misuse Amid Evolving Safeguards

CurrentLens
Apr 24, 2026

Recent benchmarks reveal AI models may enable biological weaponization by low-expertise users, raising urgent policy concerns.

Chips & Infrastructure

NVIDIA Advances Optimizers to Speed Up LLM Training

CurrentLens
Apr 23, 2026

NVIDIA introduces new higher-order optimizers to enhance training efficiency for large language models.

Models & Launches

Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs

CurrentLens
Apr 23, 2026

Xiaomi's new MiMo models achieve frontier benchmarks while reducing token costs significantly.

AI Creative

ChatGPT Images 2.0 Excels in Text Generation Capabilities

CurrentLens
Apr 23, 2026

OpenAI's ChatGPT Images 2.0 model showcases a surprising proficiency in text generation.

AI in Coding

Qwen 3.6-27B Model Surpasses Previous Coding Benchmarks

CurrentLens
Apr 23, 2026

The new Qwen 3.6-27B model delivers superior coding performance with a significantly reduced size.

Models & Launches

RepIt Framework Enables Concept-Specific Refusal in Language Models

CurrentLens
Apr 23, 2026

A new framework exposes vulnerabilities in language model safety evaluations through concept-specific manipulations.

Models & Launches

Full fine-tuning concentrates LLM attribution in code-compliance models

CurrentLens
Apr 21, 2026

An arXiv study uses perturbation-based attribution to compare FFT, LoRA, and quantized LoRA across model sizes and finds FFT yields more focused interpretive patterns.

Models & Launches

OpenAI Releases ChatGPT Images 2.0

CurrentLens
Apr 21, 2026

OpenAI published ChatGPT Images 2.0; Simon Willison ran a Where's‑Waldo‑style prompt to compare it with gpt-image-1 and rival models.

Models & Launches

AllenAI launches vla-eval to unify Vision-Language-Action benchmarking

CurrentLens
Apr 21, 2026

vla-eval decouples model inference from simulator execution with a WebSocket+msgpack protocol and Docker isolation, supporting 14 benchmarks and six model servers.

Models & Launches

Anthropic updates Claude Opus 4.7 system prompt with new tools and tighter safety guidance

CurrentLens
Apr 19, 2026

Anthropic revised the Claude Opus 4.7 system prompt to add a PowerPoint agent, expand child-safety rules, and change interaction guidance.

Models & Launches

Anthropic Ships Claude Opus 4.7 for Agentic Coding and High‑Res Vision

CurrentLens
Apr 19, 2026

Anthropic released Claude Opus 4.7, a focused successor to Opus 4.6 that emphasizes agentic software engineering, high-resolution vision and long-horizon autonomy.

AI in Coding

Simon Willison maps Claude system prompts into a Git commit timeline

CurrentLens
Apr 19, 2026

Simon Willison turned Anthropic’s published Claude system prompts into per-model Markdown files with fake git commits so changes can be browsed on GitHub.

Enterprise AI

NVIDIA Launches Ising Open Models to Accelerate Quantum-Processor Development

CurrentLens
Apr 17, 2026

NVIDIA introduced Ising, a family of open-source quantum AI models intended to help researchers and enterprises design quantum processors that can run useful applications.

Models & Launches

Anthropic ships Claude Opus 4.7 as its most powerful generally available model

CurrentLens
Apr 17, 2026

Opus 4.7 arrives as Anthropic’s strongest generally available Claude release, claiming upgrades for advanced coding, image analysis and instruction following.

Agents & Automation

OpenAI Launches GPT-Rosalind to Accelerate Life‑Sciences Research

CurrentLens
Apr 17, 2026

OpenAI introduced GPT‑Rosalind, a frontier reasoning model aimed at speeding drug discovery, genomics, protein reasoning, and scientific workflows.

Enterprise AI

OpenAI opens GPT‑5.4‑Cyber to security vendors with $10M Trusted Access grants

CurrentLens
Apr 17, 2026

OpenAI is placing GPT‑5.

AI Creative

Anthropic Lawsuit Exposes 'Humans-in-the-Loop' Illusion in AI Warfare

CurrentLens
Apr 17, 2026

A legal fight between Anthropic and the Pentagon centers on whether commercial models can be sold for military use as AI moves beyond purely analytic roles in the conflict with Iran.

Models & Launches

OpenAI Debuts GPT-Rosalind for Drug Discovery and Genomics

CurrentLens
Apr 17, 2026

OpenAI launched GPT-Rosalind, its first life‑sciences model aimed at accelerating drug discovery and genomic analysis and cutting long development timelines.

Models & Launches

Qwen3.6-35B-A3B bests Claude Opus 4.7 on Willison's pelican test

CurrentLens
Apr 16, 2026

Simon Willison reports that a local, quantized Qwen3.6-35B-A3B run produced better pelican and flamingo illustrations than Anthropic's Claude Opus 4.

AI in Education

Researchers Build an Index to Measure the Human Relationship with Nature

CurrentLens
Apr 16, 2026

Conservationists are moving from exclusionary models toward metrics that count human stewardship alongside ecological health.

Science & Healthcare

EVE Releases Open-Source 24B Earth-Intelligence LLM and Benchmarks

CurrentLens
Apr 16, 2026

EVE publishes EVE-Instruct, a 24B Mistral-based model and a suite of Earth-science datasets, benchmarks, and tooling for domain-specific LLM deployment.

Models & Launches

llm-anthropic 0.25 Adds Claude-Opus-4.7 with xhigh thinking_effort

CurrentLens
Apr 16, 2026

Simon Willison released llm-anthropic 0.25, which ships claude-opus-4.7 supporting thinking_effort: xhigh and new thinking flags.

Models & Launches

DeepMind Ships Gemini Robotics‑ER 1.6 for Physical Robot Reasoning

CurrentLens
Apr 15, 2026

Gemini Robotics‑ER 1.6 adds instrument-reading plus improved visual, spatial and planning skills to DeepMind's embodied-reasoning model for robots.

Enterprise AI

Anthropic Briefed Trump Administration on Mythos, Co‑Founder Confirms

CurrentLens
Apr 14, 2026

Jack Clark said at the Semafor summit that Anthropic provided a briefing on its Mythos model to the Trump administration while litigation is ongoing.

Models & Launches

NVIDIA Launches Ising AI Models to Tackle Noisy Qubits

CurrentLens
Apr 14, 2026

NVIDIA unveiled Ising, an open family of AI models with Calibration and Decoding domains designed to help build fault-tolerant quantum processors.

Models & Launches

OpenAI pushes to lock in users and expand enterprise sales in internal memo

CurrentLens
Apr 14, 2026

CRO Denise Dresser told staff to prioritize user retention and enterprise sales and to build a product 'moat' as users easily switch between top models.