“The text below is a very recently leaked document, shared by an anonymous person on a public Discord server who has given permission for its publication. It comes from a Google researcher. The document is only the opinion of a Google employee, not of the company as a whole. We have verified its authenticity.”
This was the text that accompanied an article published last week in the newsletter of the consulting firm Semianalysis. Its authenticity has also been endorsed by various figures in the tech world. Simon Willison, co-creator of Django (and quoted in the report), gives it an “8” on his credibility scale and calls it “the most interesting writing I’ve seen about LLMs in a long time.”
“Reading it, it’s the kind of document you’d expect to see circulating within Google, and I don’t see why someone would write something this great and then decide to pretend it was a leak instead of taking credit for it.”
The basic premise of the report is that while Google and OpenAI continue to compete with each other (or, depending on your point of view, Google tries to compete against OpenAI), their advances are gradually being outpaced by the work done by the “open source community”.
Although the Google and OpenAI models still hold a slight quality advantage, the author asserts that “the gap will close surprisingly quickly”.
“Open source models are faster, more customizable, and comparatively speaking, more capable.”
The rise of open source alternatives
What does he mean by “comparatively speaking, more capable”? He means capability relative to investment and to number of parameters (the unit of measure for the complexity of language models). To use his own example, open source projects:
“they are achieving things with 100 dollars and 13 billion parameters that we struggle to achieve with 10 million dollars and 540 billion parameters”.
And here lies the key to the author’s concern: “And [all this] they are doing in weeks, not months”. The report then addresses the rapid evolution since the community gained access to Meta’s LLaMA model in March:
“[LLaMA] had no instruction tuning, no conversation tuning, and no RLHF. However, the community immediately understood the significance of what they had been given. And here we are, barely a month later, and there are already variants with instruction tuning, quantization, quality improvements, human evaluations, multimodality, RLHF, etc.”
Most important, according to the article, “is that they have solved the problem of scale,” in the sense that they have almost completely removed the barrier to entry for LLM training and experimentation, allowing “one person, in a single evening, armed with a relatively powerful computer” to achieve what was previously only possible for large corporations like OpenAI and Google.
In fact, the document includes a graphic from the website of the open source Vicuna model, modified to show how much time elapsed between LLaMA’s public release and the release of two free models based on it: a total of three weeks.
Next, the anonymous author draws a comparison between the rise of open source LLMs and that of open source image generation models:
“In many ways, this should come as no surprise to anyone. The current boom in open source LLMs comes on the heels of the one experienced in the field of image generation. […] Many call it the ‘Stable Diffusion moment’ of language models.”
“OpenAI doesn’t matter. It’s making the same mistakes we are in open source”
Can’t compete against open source
But how does all this affect Google? The report estimates that, from now on, it will be even more difficult for Google to compete in this field: “Who would pay for a Google product with usage restrictions if there is a free, high-quality alternative without them?”
In any case, the problem lies not only in the rival models’ lack of restrictions, but in Google’s own development model, which is secret and proprietary:
“Keeping our technology secret was a bet destined to fail. Google researchers leave for other companies constantly, so we can assume that [those companies] know everything we know, and that will continue as long as that pipeline is open.”
And meanwhile, research institutions around the world continue to pool their efforts, building on what others have developed, thanks to free software. “We can try to hold on tight to our secrets while outside innovation dilutes their value, or we can try to learn from each other.”
He also cites two factors that make individual developers more agile at innovating than the companies themselves:
- First, they are not constrained by licenses that restrict free access to LLMs to “personal use” (as in the case of LLaMA).
- Second, the diversity of LLMs customized for specific uses is driven by developers committed to those use cases (he gives the example of anime image generators based on Stable Diffusion):
“It’s being used and created by people deeply immersed in their particular subgenre, bringing a depth of insight and empathy that we can’t hope to match.”
Meta is the new Google
According to the document, “paradoxically, the only clear winner in all this is Meta. Since the leaked model was theirs, they have obtained free labor from the entire planet […] nothing prevents them from incorporating the innovations made directly into their products.”
“The value of owning the ecosystem cannot be overstated. Google itself has successfully used this paradigm in its open source offerings, like Chrome and Android.”
What about OpenAI?
“The more we control our models, the more attractive open alternatives become. Both Google and OpenAI have defensively gravitated toward ways of releasing models that allow them to maintain tight control over how their models are used. But this control is a fiction.”
The author of the report is blunt: “OpenAI doesn’t matter. They are making the same mistakes we are in their stance on open source.” He predicts that OpenAI will end up losing its position unless it honors its name and goes back to betting on the ‘open’. “In this sense, at least, [at Google] we can take the first step.”