Google Research announced Gemini-SQL2, a Gemini 3.1 Pro-powered text-to-SQL capability that posted 80.04% execution accuracy on the BIRD single-model leaderboard.
Research Proposes MedCheck Framework to Enhance Medical AI Benchmarks
A new framework aims to improve the assessment of medical AI benchmarks, addressing key shortcomings.