Nvidia Unveils New AI Model That Rivals GPT-4 in Both Vision and Language Tasks

By Livevartha | 2024-10-05 | 2 min read

Nvidia has introduced an open-source artificial intelligence model positioned to rival industry-leading systems such as OpenAI’s GPT-4. The release, the NVLM 1.0 family, is headlined by NVLM-D-72B, a multimodal large language model with 72 billion parameters that performs strongly on both vision and language tasks. It can interpret visual inputs such as memes and other images, and it also handles text-only work such as solving math problems step by step.

The researchers behind NVLM 1.0 position it as competitive with proprietary models. They point out that the model’s performance on text-only tasks improved after multimodal training, by 4.3 points across key benchmarks. Similar models, by contrast, often lose text accuracy after multimodal training.

Benchmark results comparing NVIDIA’s NVLM-D model to AI giants like GPT-4, Claude 3.5, and Llama 3-V, showing NVLM-D’s competitive performance across various visual and language tasks. (Credit: arxiv.org)

Nvidia’s decision to open-source the model is a notable departure from the norm, in which leading-edge AI models are typically closed to the public. By making both the model weights and the training code accessible, Nvidia has given researchers and developers the means to study and build on the model directly. The move has been met with enthusiasm from the AI community. Some observers have compared NVLM-D-72B to Meta’s Llama 3.1 405B, noting that it performs at a similar level on math and coding evaluations while also handling vision, a rare combination.

The NVLM project introduces several innovative architectural features, including hybrid multimodal processing techniques, which could influence future research directions in the field. While the open-source release has been hailed as a significant step forward, it also introduces potential challenges. With such powerful AI technology now publicly available, concerns about ethical use and potential misuse have surfaced, underscoring the need for responsible AI development.

Wow nvidia just published a 72B model with is ~on par with llama 3.1 405B in math and coding evals and also has vision 🤯 pic.twitter.com/c46DeXql7s

— Phil (@phill__1) October 1, 2024

Nvidia’s open-source initiative may also affect the structure of the broader AI industry. If high-performing models like NVLM 1.0 are freely accessible, companies could face pressure to rethink their business models as smaller organizations and independent researchers gain access to tools previously restricted to tech giants. The move opens a new chapter in AI development, with potentially far-reaching consequences for how AI progress unfolds in the near future.
