
ICLR 2026 | U2-BENCH: The First Large-Scale Comprehensive Ultrasound Multimodal Understanding Benchm

02-10-2026 11:56 PM CET | Associations & Organizations

Press release from: Getnews / PR Agency: HaiwaiPR

Dolphin AI, a Chinese startup specializing in ultrasound-specific medical intelligence, has announced the official release of U2-BENCH, a landmark evaluation standard for multimodal ultrasound AI. This research, recently accepted by ICLR 2026, represents the first systematic attempt to bridge the gap between general AI capabilities and specialized clinical ultrasound requirements.

1. The Challenges of Medical AI: From General Vision to Professional Ultrasound Understanding

Ultrasound imaging is among the most widely used diagnostic tools in global healthcare and continues to play an irreplaceable role in obstetrics and gynecology, emergency medicine, and cardiology. However, automated ultrasound image understanding has long faced significant bottlenecks:

High Variability: Image quality depends heavily on the operator's technique, leading to substantial variation between scans and numerous artifacts.

Complex Spatial Relationships: Unlike the static slices of CT/MRI, ultrasound presents dynamic structures with strong spatial-contextual relationships.

Lack of Evaluation Systems: While general large vision-language models (LVLMs) such as GPT-4V and Gemini show impressive performance, their professional capabilities in ultrasound have never been systematically evaluated.

To address these challenges, U2-BENCH has been introduced as the first comprehensive benchmark designed to evaluate LVLM capabilities in the ultrasound domain, covering four major task dimensions: classification, detection, regression, and text generation.

Image: https://www.globalnewslines.com/uploads/2026/02/424e4665a084944c9a99ccb202d9bf64.jpg

2. Core Design: Comprehensive Anatomical Coverage and Clinical-Heuristic Tasks

The core value of U2-BENCH lies in its high clinical relevance and rigorous construction pipeline:

2.1 Unprecedented Data Scale and Diversity

Breadth of Coverage: Aggregates 7,241 cases from 40 authorized datasets spanning 15 anatomical regions (including fetus, heart, breast, and thyroid).

Scenario Diversity: Covers 50 clinical use cases so that evaluation results reflect a model's performance in frontline clinical practice.

2.2 Eight Clinical Heuristic Task Categories

U2-BENCH organizes ultrasound understanding into four capability levels and eight specific tasks:

Classification Tasks: Disease Diagnosis (DD), View Recognition and Assessment (VRA).

Detection Tasks: Lesion Localization (LL), Organ Detection (OD), Keypoint Detection (KD).

Regression Tasks: Clinical Value Estimation (CVE).

Generation Tasks: Structured Report Generation (RG), Anatomical Caption Generation (CG).
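As an illustrative sketch only, the taxonomy above can be written as a mapping from capability level to task abbreviation. The names `U2_TASKS` and `capability_of` are hypothetical and are not taken from the U2-BENCH codebase; only the task names and abbreviations come from this release.

```python
# Hypothetical representation of the U2-BENCH task taxonomy described above.
# U2_TASKS and capability_of are illustrative names, not part of any released code.
U2_TASKS = {
    "classification": {
        "DD": "Disease Diagnosis",
        "VRA": "View Recognition and Assessment",
    },
    "detection": {
        "LL": "Lesion Localization",
        "OD": "Organ Detection",
        "KD": "Keypoint Detection",
    },
    "regression": {
        "CVE": "Clinical Value Estimation",
    },
    "generation": {
        "RG": "Structured Report Generation",
        "CG": "Anatomical Caption Generation",
    },
}


def capability_of(task_abbrev: str) -> str:
    """Return the capability level that a task abbreviation belongs to."""
    for capability, tasks in U2_TASKS.items():
        if task_abbrev in tasks:
            return capability
    raise KeyError(f"unknown task abbreviation: {task_abbrev}")


if __name__ == "__main__":
    total = sum(len(tasks) for tasks in U2_TASKS.values())
    print(f"{len(U2_TASKS)} capability levels, {total} tasks")
    print("KD belongs to:", capability_of("KD"))
```

Grouping tasks by capability level in this way makes it straightforward to report results per level (for example, all detection tasks together) as well as per task.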

3. Experimental Validation: Defining the Capabilities and Limitations of SOTA Models

A large-scale evaluation of 23 cutting-edge vision-language models was conducted on U2-BENCH:

3.1 Closed-Source Models Still Lead, but Significant Room for Improvement Remains

Top Performance: Dolphin-V1 ranked first with a total score (U2-Score) of 0.5835, significantly outperforming GPT-5 (0.3250) and Gemini-2.5-Pro (0.2968).

Open-Source Comparison: Among open-source models, DeepSeek-VL2 showed the strongest performance, though a generational gap remains in complex reasoning compared to top-tier closed-source models.
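To put the reported totals in perspective, the following minimal sketch computes the absolute gap and the ratio between the leading U2-Score and the other two scores quoted above. Only the three score values come from this release; the helper name `margins_over` is hypothetical, and the sketch makes no assumption about how the U2-Score itself is calculated.

```python
# U2-Score totals as quoted in this release; everything else is illustrative.
REPORTED_U2_SCORES = {
    "Dolphin-V1": 0.5835,
    "GPT-5": 0.3250,
    "Gemini-2.5-Pro": 0.2968,
}


def margins_over(scores: dict[str, float], leader: str) -> dict[str, dict[str, float]]:
    """Absolute gap and ratio of the leader's score relative to every other model."""
    top = scores[leader]
    return {
        name: {"absolute_gap": top - score, "ratio": top / score}
        for name, score in scores.items()
        if name != leader
    }


if __name__ == "__main__":
    for name, m in margins_over(REPORTED_U2_SCORES, "Dolphin-V1").items():
        print(f"Dolphin-V1 vs {name}: +{m['absolute_gap']:.4f} ({m['ratio']:.2f}x)")
```

On the reported figures, the leading score is roughly 1.8 times that of GPT-5 and just under 2 times that of Gemini-2.5-Pro.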

3.2 A Pronounced Gap Between Recognition and Reasoning

Classification vs. Spatial Reasoning: Models perform reasonably well on image-level classification tasks such as Disease Diagnosis (DD), but struggle with spatially grounded detection (KD/OD) and regression (CVE) tasks.

Challenges in Report Generation (RG): While the linguistic quality of generated text is high, serious deficiencies remain in medical accuracy and in adherence to the required report structure.

3.3 Key Conclusion: Scaling Alone is Not the Answer

Diminishing Returns from Parameter Scaling: Comparisons within the Qwen family found that increasing model size from 3B to 72B parameters brought steady overall improvements, but the gains were not significant on certain spatial reasoning tasks. This suggests that domain-specific ultrasound training is more effective than simply increasing parameter count.

4. Summary and Outlook: Moving Toward Embodied Medical Intelligence

The successful establishment of U2-BENCH proves that ultrasound AI is undergoing a paradigm shift from "single-task narrow models" toward "all-encompassing foundation models." Looking ahead, U2-BENCH is slated for expansion to include:

Dynamic Video Understanding: Moving from single frames to real-time scanning sequences.

Long-Range Embodied Perception: Integrating with hardware such as robotic arms to achieve automated ultrasound scanning.

U2-BENCH is expected to serve as a vital guide for global medical AI researchers, contributing to the construction of a safer and more professional medical world model.
Media Contact
Company Name: Dolphin AI
Contact Person: Ruier Zhao
Email: Send Email [http://www.universalpressrelease.com/?pr=iclr-2026-u2bench-the-first-largescale-comprehensive-ultrasound-multimodal-understanding-benchm]
State: Jiaxing
Country: China
Website: https://dolphin-ai.cn/




This release was published on openPR.


News-ID: 4385218
