Gemini 2.0 and Sam Altman's Blog Post
Google's Gemini 2.0 release with unbeatable pricing, DeepMind's new TPU textbook, and Sam Altman's observations on AGI scaling
Google released three models on Wednesday, February 5, 2025:
Gemini 2.0 Pro (Experimental): Google's "best model yet for coding performance and complex prompts", currently available as a free preview.
Gemini 2.0 Flash: "a powerful workhorse model" that is "highly capable of multimodal reasoning across vast amounts of information with a context window of 1 million tokens".
Gemini 2.0 Flash-Lite: better quality than 1.5 Flash, at the same speed and cost.
Model quality is outstanding: on the LMSYS Chatbot Arena, Gemini-2.0-Pro takes the #1 spot in every category except Math, where it is overtaken by Gemini 2.0's reasoning model. Gemini-2.0-Flash takes #3 and is on par with chatgpt-4o. The Flash-Lite version is also ranked in the top 10.
Price is unbeatable: The star of the show is Gemini 2.0 Flash.
Gemini 2.0 Flash costs 10 cents/million for text/image input, 70 cents/million for audio input, and 40 cents/million for output.
In comparison, GPT-4o-mini costs 15 cents/million for text input (1.5x), 1000 cents/million for audio input (~14x), and 60 cents/million for output (1.5x).
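To make the gap concrete, here is a quick back-of-envelope sketch in Python using the per-million-token prices above; the workload size (10M text input tokens, 2M output tokens) is a made-up example, not a benchmark.

```python
# Back-of-envelope cost comparison using the per-million-token prices quoted above.
# The workload (10M text input tokens, 2M output tokens) is a made-up example.

PRICES = {  # dollars per 1M tokens
    "gemini-2.0-flash": {"text_in": 0.10, "audio_in": 0.70, "out": 0.40},
    "gpt-4o-mini":      {"text_in": 0.15, "audio_in": 10.00, "out": 0.60},
}

def cost(model: str, text_in_m: float, out_m: float, audio_in_m: float = 0.0) -> float:
    """Cost in dollars for a workload measured in millions of tokens."""
    p = PRICES[model]
    return text_in_m * p["text_in"] + audio_in_m * p["audio_in"] + out_m * p["out"]

for model in PRICES:
    print(f"{model}: ${cost(model, text_in_m=10, out_m=2):.2f}")
# gemini-2.0-flash: $1.80
# gpt-4o-mini: $2.70
```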
The unbeatable price and generous context window (2 million tokens for Pro and 1 million for Flash), along with context caching, have sparked imaginations and even claims that "RAG is dead" (RAG: retrieval-augmented generation). Sergey Filimonov shared a blog post using Gemini Flash to extract markdown from PDFs: the accuracy is near perfect, and the cost is roughly 13x lower than chatgpt-4o-mini and 60x lower than claude-3-5-sonnet.
A textbook on the JAX stack and Google's TPU hardware
DeepMind researchers published a textbook on scaling large models, with a focus on their TPU hardware. I have only skimmed it, but it looks interesting (for example, the section on calculating transformer FLOPs). Previously, I took CMU's Deep Learning Systems course and benefited greatly from understanding the fundamentals (e.g., how to implement efficient automatic differentiation), but found the part on hardware acceleration a bit brief. This textbook seems like a good follow-up resource.
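For a taste of this kind of arithmetic, here is a minimal sketch of the widely used rule of thumb that training a dense transformer costs roughly 6 × parameters × tokens FLOPs; the model size, token count, and hardware numbers below are illustrative, not taken from the textbook.

```python
# Rough training-compute estimate for a dense transformer, using the common
# rule of thumb: total training FLOPs ~= 6 * (parameter count) * (training tokens).
# (About 2 FLOPs per parameter per token for the forward pass, ~4 for the backward pass.)

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * n_params * n_tokens

# Illustrative example: a 70B-parameter model trained on 15T tokens.
flops = training_flops(n_params=70e9, n_tokens=15e12)
print(f"~{flops:.2e} FLOPs")  # ~6.30e+24 FLOPs

# At 50% utilization on hardware sustaining 1e15 FLOP/s (made-up numbers):
seconds = flops / (0.5 * 1e15)
print(f"~{seconds / 86400:.0f} days on one such device")  # ~145833 days
```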
Speaking of tutorials, Karpathy released a new 3.5-hour lecture on LLMs aimed at a general audience, in case you are interested.
Sam Altman published a blog post sharing his optimism (and concerns) about the future of AGI, and three observations on its current trajectory:
The intelligence of an AI model roughly equals the log of the resources used to train and run it.
The cost to use a given level of AI falls about 10x every 12 months, and lower prices lead to much more use.
The socioeconomic value of linearly increasing intelligence is super-exponential in nature. A consequence of this is that we see no reason for exponentially increasing investment to stop in the near future.
I have two takeaways from this blog post:
Sam Altman doesn't believe AI is hitting a wall. He says the scaling laws continue to hold and that "you can spend arbitrary amounts of money and get continuous and predictable gains".
Super-exponential socioeconomic value can come from linearly increasing (artificial) intelligence, which in turn requires exponentially increasing resources to train and run. As a result, the exponentially increasing infrastructure investment is justified by the super-exponential value it delivers. It is a rather indirect way of saying "don't look at the model improvement, look at its socioeconomic impact".
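To make the relationship concrete, here is a toy reading of the observations in Python; the functional forms and numbers are my own illustration, not Altman's.

```python
import math

# Toy illustration of the observations above (illustrative numbers, not from the post).
# Observation 1: intelligence ~ log(resources)  =>  resources ~ base ** intelligence.
# Observation 2: the cost of a given level of AI falls ~10x every 12 months.

def resources_needed(intelligence: float, base: float = 10.0) -> float:
    """Resources to reach a given intelligence level, if intelligence = log_base(resources)."""
    return base ** intelligence

def cost_multiplier(months: float) -> float:
    """How much cheaper a fixed capability level gets after `months`, at 10x per 12 months."""
    return 10 ** (-months / 12)

# Each +1 "unit" of intelligence needs 10x more resources under this toy model...
print(resources_needed(5) / resources_needed(4))   # 10.0
# ...but waiting 12 months makes any fixed capability level 10x cheaper.
print(cost_multiplier(12))                          # 0.1
# So holding spend constant for a year buys roughly one more "unit" in this toy model.
print(math.log10(1 / cost_multiplier(12)))          # 1.0
```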
Quote of the week
From X, shared by bumblebike (@bumblebike) on Feb 17, 2017, with the comment "This is from a 1979 presentation. We are v slow learners, it seems.":

A computer can never be held accountable
Therefore a computer must never make a management decision
Thanks for reading the 1st issue of Loss Curve!