Xiaomi's new MiMo models achieve frontier benchmarks while reducing token costs significantly.
12 results for: Release
Hugging Face Releases ml-intern to Automate LLM Post‑Training Workflows
ml-intern is an open-source agent that automates literature review, dataset discovery, training script runs, and iterative evaluation for LLM post-training work.
OpenAI Releases ChatGPT Images 2.0
OpenAI published ChatGPT Images 2.0; Simon Willison ran a Where's‑Waldo‑style prompt to compare it with gpt-image-1 and rival models.
Anthropic Ships Claude Opus 4.7 for Agentic Coding and High‑Res Vision
Anthropic released Claude Opus 4.7, a focused successor to Opus 4.6 that emphasizes agentic software engineering, high-resolution vision and long-horizon autonomy.
AWS launches Spring AI SDK for Amazon Bedrock AgentCore
AWS has released an open-source Spring AI AgentCore SDK that embeds Bedrock AgentCore capabilities into Spring AI and targets production-ready agent workflows.
NVIDIA releases NVbandwidth to profile GPU interconnect and memory throughput
NVIDIA published NVbandwidth, a developer tool for measuring data-transfer and memory performance in CUDA-powered single- and multi-GPU systems.
Anthropic ships Claude Opus 4.7 as its most powerful generally available model
Opus 4.7 arrives as Anthropic’s strongest generally available Claude release, claiming upgrades for advanced coding, image analysis and instruction following.
Datasette 1.0a28 fixes alpha breakages, adds shutdown and test-cleanup APIs
Release 1.0a28 repairs compatibility regressions introduced in 1.0a27, adds datasette.close and database.close behavior, and ships a pytest plugin to avoid file descriptor leaks.
Qwen3.6-35B-A3B bests Claude Opus 4.7 on Willison's pelican test
Simon Willison reports that a local, quantized Qwen3.6-35B-A3B run produced better pelican and flamingo illustrations than Anthropic's Claude Opus 4.7.
EVE Releases Open-Source 24B Earth-Intelligence LLM and Benchmarks
EVE publishes EVE-Instruct, a 24B Mistral-based model and a suite of Earth-science datasets, benchmarks, and tooling for domain-specific LLM deployment.
llm-anthropic 0.25 Adds Claude-Opus-4.7 with xhigh thinking_effort
Simon Willison released llm-anthropic 0.25, which ships claude-opus-4.7 supporting thinking_effort: xhigh and new thinking flags.
DeepMind Ships Gemini Robotics‑ER 1.6 for Physical Robot Reasoning
Gemini Robotics‑ER 1.6 adds instrument-reading plus improved visual, spatial and planning skills to DeepMind's embodied-reasoning model for robots.