Thursday, April 23, 2026
  • facebook
  • instagram
  • x
  • linkedin

CurrentLens.com

Insight Today. Impact Tomorrow.

  • Home
  • Models
  • Agents
  • Coding
  • Creative
  • Policy
  • Infrastructure
  • Topics
    • Enterprise
    • Open Source
    • Science
    • Education
    • AI & Warfare
Latest News
  • Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs
  • NVIDIA Advances Optimizers to Speed Up LLM Training
  • Space Force Accelerates Recruitment Amid Looming Budget Boost
  • Anthropic Unveils Responsible Scaling Policy for AI Governance
  • Google Launches Two New TPUs for AI Inference and Training
  • GitHub Copilot Tightens Pricing and Usage Limits for Individual Plans
  • Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs
  • NVIDIA Advances Optimizers to Speed Up LLM Training
  • Space Force Accelerates Recruitment Amid Looming Budget Boost
  • Anthropic Unveils Responsible Scaling Policy for AI Governance
  • Google Launches Two New TPUs for AI Inference and Training
  • GitHub Copilot Tightens Pricing and Usage Limits for Individual Plans
  • Home
  • Models & Launches
  • Qwen3.6-35B-A3B bests Claude Opus 4.7 on Willison's pelican test

Qwen3.6-35B-A3B bests Claude Opus 4.7 on Willison's pelican test

Posted on Apr 16, 2026 by CurrentLens in Models
Qwen3.6-35B-A3B bests Claude Opus 4.7 on Willison's pelican test

Photo by Andrey Matveev on Unsplash

AI Quick Take

  • Willison's quick comparison favors Alibaba's Qwen3.6-35B-A3B over Anthropic's Opus 4.7 on two whimsical image-generation prompts.
  • The Qwen run used a 20.

Simon Willison reports that Qwen3.6-35B-A3B produced preferable illustrations to Anthropic's Claude Opus 4.7 on his informal 'pelican riding a bicycle' test and on a separate SVG flamingo-on-a-unicycle prompt. The comparison reflects direct prompt outputs and visual judgement rather than formal metrics.

The Qwen result came from a 20.9GB gguf quantized model by Unsloth, run locally on a MacBook Pro M5 through LM Studio and the llm-lmstudio plugin. Willison also ran Opus 4.7 and retried it with thinking_level set to max; his follow-up did not close the gap in these creative examples.

Willison emphasizes that the pelican benchmark is intentionally absurd and not a robust evaluation, though he notes past informal correlation between pelican quality and broader model usefulness. He also expresses skepticism that labs specifically train for this benchmark, even as the outcome nudges that suspicion.

For practitioners, the post is a narrow datapoint: it suggests quantized local inference of a 35B model can yield strong creative outputs, but it does not replace comprehensive benchmarks or controlled comparisons. Watch for repeatable, standardized tests and larger sample sets before changing deployment or procurement choices based on this anecdote.

Posted in Models & Launches | Tags: qwen, anthropic, model-release, benchmark, quantization, inference, llm, Claude
  • Latest
  • Trending
Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs
  • Models & Launches

Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs

  • CurrentLens
  • Apr 23, 2026

Xiaomi's new MiMo models achieve frontier benchmarks while reducing token costs significantly.

Read More
OpenAI Makes ChatGPT Free for Verified U.S. Healthcare Professionals
  • Models & Launches

OpenAI Makes ChatGPT Free for Verified U.S. Healthcare Professionals

  • CurrentLens
  • Apr 23, 2026

OpenAI has announced that verified U.S. physicians, nurse practitioners, and pharmacists can now access ChatGPT for Clinicians at no charge.

Read More
RepIt Framework Enables Concept-Specific Refusal in Language Models
  • Models & Launches

RepIt Framework Enables Concept-Specific Refusal in Language Models

  • CurrentLens
  • Apr 23, 2026

A new framework exposes vulnerabilities in language model safety evaluations through concept-specific manipulations.

Read More
OpenAI Adds Codex-Powered Workspace Agents to ChatGPT
  • Models & Launches

OpenAI Adds Codex-Powered Workspace Agents to ChatGPT

  • CurrentLens
  • Apr 22, 2026

OpenAI introduced workspace agents in ChatGPT: Codex-powered cloud agents designed to automate complex workflows and scale team work across tools securely.

Read More
OpenAI Adds Codex-Powered Workspace Agents to ChatGPT
  • Models & Launches

OpenAI Adds Codex-Powered Workspace Agents to ChatGPT

  • CurrentLens
  • Apr 22, 2026

OpenAI introduced workspace agents in ChatGPT: Codex-powered cloud agents designed to automate complex workflows and scale team work across tools securely.

Read More
RepIt Framework Enables Concept-Specific Refusal in Language Models
  • Models & Launches

RepIt Framework Enables Concept-Specific Refusal in Language Models

  • CurrentLens
  • Apr 23, 2026

A new framework exposes vulnerabilities in language model safety evaluations through concept-specific manipulations.

Read More
OpenAI Makes ChatGPT Free for Verified U.S. Healthcare Professionals
  • Models & Launches

OpenAI Makes ChatGPT Free for Verified U.S. Healthcare Professionals

  • CurrentLens
  • Apr 23, 2026

OpenAI has announced that verified U.S. physicians, nurse practitioners, and pharmacists can now access ChatGPT for Clinicians at no charge.

Read More
Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs
  • Models & Launches

Xiaomi Launches MiMo-V2.5-Pro and MiMo-V2.5 at Lower Costs

  • CurrentLens
  • Apr 23, 2026

Xiaomi's new MiMo models achieve frontier benchmarks while reducing token costs significantly.

Read More

Categories

  • Models & Launches›
  • Agents & Automation›
  • AI in Coding›
  • AI Creative›
  • Policy & Safety›
  • Chips & Infrastructure›
  • Enterprise AI›
  • Open Source & Research›
  • Science & Healthcare›
  • AI in Education›
  • AI Defense & Warfare›
Advertisement
CurrentLens.com
Download on theApp Store
Get it onGoogle Play

Navigate

  • Home
  • Topics
  • About
  • Contact
  • Advertise
  • Privacy Policy

Coverage

  • Models & Launches
  • Agents & Automation
  • AI in Coding
  • AI Creative
  • Policy & Safety
  • Chips & Infrastructure

Newsletter

AI news that matters, straight to your inbox.

© 2026 CurrentLens.comAll rights reserved