H2O AI releases Danube, a super-tiny LLM for mobile applications

Today, H2O AI, the company working to democratize AI with a range of open-source and proprietary tools, announced the release of Danube, a new super-tiny large language model (LLM) for mobile devices.

Named after the second-largest river in Europe, the open-source model comes with 1.8 billion parameters and is claimed to match or outperform similarly sized models across a range of natural language tasks. This puts it in the same category as strong offerings from Microsoft, Stability AI and EleutherAI.

The timing of the announcement makes perfect sense. Enterprises building consumer devices are racing to explore the potential of offline generative AI, where models run locally on the product, giving users quick assistance across functions and eliminating the need to send data out to the cloud.

“We are excited to release H2O-Danube-1.8B as a portable LLM on small devices like your smartphone… The proliferation of smaller, lower-cost hardware and more efficient training now allows modestly sized models to be accessible to a wider audience… We believe H2O-Danube-1.8B will be a game changer for mobile offline applications,” Sri Ambati, CEO and co-founder of H2O, said in a statement.

What to expect from the Danube-1.8B LLM?

While Danube has only just been announced, H2O claims it can be fine-tuned to handle a range of natural language applications on small devices, including common sense reasoning, reading comprehension, summarization and translation.

To train the mini model, the company collected a trillion tokens from diverse web sources and applied techniques refined from the Llama 2 and Mistral models to enhance its generation capabilities.

“We adjusted the Llama 2 architecture for a total of around 1.8B parameters. We (then) used the original Llama 2 tokenizer with a vocabulary size of 32,000 and trained our model up to a context length of 16,384. We incorporated the sliding window attention from Mistral with a size of 4,096,” the company noted while describing the model architecture on Hugging Face.
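Those hyperparameters should be visible in the model's published configuration. Below is a minimal sketch of inspecting them with the Hugging Face transformers library; the repo name h2oai/h2o-danube-1.8b-base and the Mistral-style attribute names are assumptions based on H2O's description, not details confirmed in this article.

```python
# Minimal sketch: inspect the architecture hyperparameters H2O describes.
# Assumes the model ships a Mistral-style config under the (assumed)
# repo name "h2oai/h2o-danube-1.8b-base".
from transformers import AutoConfig

config = AutoConfig.from_pretrained("h2oai/h2o-danube-1.8b-base")
print(config.vocab_size)               # expected: 32000 (Llama 2 tokenizer)
print(config.max_position_embeddings)  # expected: 16384 (training context length)
print(config.sliding_window)           # expected: 4096 (Mistral sliding window attention)
```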

When tested on benchmarks, the model was found to perform on par with, or better than, most models in the 1B-2B-parameter category.

For example, on the Hellaswag test, which evaluates common sense natural language inference, it achieved an accuracy of 69.58%, sitting just behind Stability AI's Stable LM 2, a 1.6-billion-parameter model pre-trained on 2 trillion tokens. Similarly, on the Arc benchmark for advanced question answering, it ranks third behind Microsoft's Phi 1.5 (a 1.3-billion-parameter model) and Stable LM 2, with an accuracy of 39.42%.

H2O has released Danube-1.8B under an Apache 2.0 license for commercial use. Any team looking to implement the model for a mobile use case can download it from Hugging Face and perform application-specific fine-tuning, as in the sketch below.
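One common approach to application-specific fine-tuning is parameter-efficient tuning with LoRA adapters. The sketch below uses the PEFT library; the repo name, target modules and LoRA hyperparameters are illustrative assumptions, not a recipe recommended by H2O.

```python
# Illustrative fine-tuning setup with LoRA adapters via the PEFT library.
# The model ID and all hyperparameters below are assumed for demonstration.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("h2oai/h2o-danube-1.8b-base")
lora = LoraConfig(
    r=16,                                 # adapter rank (assumed value)
    lora_alpha=32,                        # scaling factor (assumed value)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed names)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small adapter matrices are trained
# From here, train with transformers.Trainer (or a custom loop) on task data.
```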

To make this process easier, the company also plans to release additional tooling soon. It has also launched a chat-tuned version of the model (H2O-Danube-1.8B-Chat), which can be implemented for conversational applications.
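For the chat variant, a typical conversational call via transformers might look like the sketch below. The repo name h2oai/h2o-danube-1.8b-chat is inferred from the model name in the announcement, and the presence of a chat template in the tokenizer is likewise an assumption.

```python
# Hedged sketch of a conversational call to the chat-tuned variant.
# Assumes the (unconfirmed) repo name and that a chat template is provided.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "h2oai/h2o-danube-1.8b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Summarize: the meeting moved to 3pm Friday."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```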

In the long run, the availability of Danube and similar small-sized models is expected to drive a surge in offline generative AI applications across phones and laptops, helping with tasks like email summarization, typing and image editing. In fact, Samsung has already moved in this direction with the launch of its S24 line of smartphones.
