AI language model “LLaMA”: Meta presents competitor for ChatGPT and Google Bard

The current hype around chatbots and AI search engines is driven mainly by Microsoft, with its ChatGPT integration in Bing and Edge, and by Google, with its presentation of Bard. Meta has been comparatively quiet so far, but the Facebook parent company has now announced its own large language model (LLM).

Meta's language models go by the name LLaMA (Large Language Model Meta AI) and come in four sizes, ranging from 7 billion to 65 billion parameters.

LLMs have shown promise in generating text, leading conversations, summarizing written material, as well as more complicated tasks like solving mathematical theorems or predicting protein structure.

Meta CEO Mark Zuckerberg

LLaMA models for AI research

LLaMA is thus Meta's latest attempt to establish itself in the field of complex AI language models. It is available for research under a non-commercial license. Research labs at universities, government institutions, NGOs and in industry are granted access on request.

The company describes the details of the model in a model card and a paper. According to these, only publicly available data was used to train the model. The largest share, 67 percent, comes from CommonCrawl. Other sources include Wikipedia and GitHub (4.5 percent each) as well as literary works, for which Meta used the Project Gutenberg collection, among others.

In addition, Meta has published benchmark comparisons with other popular LLMs. According to Meta, the LLaMA variants outperform OpenAI's GPT-3 language model in many cases. ChatGPT, however, already uses the improved GPT-3.5 model.

Biases in the mountains of data

LLaMA is not Meta's first attempt to provide language models for AI research. Most recently, the company released the Galactica language model in November 2022. Designed specifically for research, it was meant to answer scientific questions and help with writing texts. To that end, it was trained on a data set consisting primarily of academic literature.

However, the model was only available for three days. Incorrect and biased answers were the main reason for its quick withdrawal. Shortly afterwards, OpenAI released ChatGPT, and the hype surrounding AI language models picked up speed.

Microsoft is struggling with similar issues, though: the Bing search engine, which has been extended with a GPT language model, produces answers that range from incorrect and bizarre to biased. The developers are therefore continuously working on safety mechanisms. In addition, the number of queries per chat session and per day has been limited. LLaMA suffers from similar problems, the developers write in the model card.

Risks and harms of large language models include the generation of harmful, offensive or biased content. These models are often prone to generating incorrect information, sometimes referred to as hallucinations. We do not expect our model to be an exception in this regard.

LLaMA model card

The reason is well known: the models' training data already contains biases, stereotypes and false information. Accordingly, the models also reproduce toxic content unless the developers intervene.

Meta also uses various benchmarks to measure the extent of these problems, including individual forms of discrimination. The benchmarks distinguish between categories such as gender, skin color, age and sexual orientation. Compared with GPT-3, the largest LLaMA model is again said to perform slightly better overall, according to Meta's measurements.

