
Grok 1.5 from X, Microsoft/OpenAI $100 Billion Supercomputer, Quantized LLMs, & News in Voice AI

Grok 1.5 sets new highs, hype builds over 1-bit LLMs, and notable improvements land in LLM accuracy

Newsletter Introduction

Hi! You're receiving this newsletter because you subscribed to Activeloop updates, are one of our GenAI360 course takers, or are an Activeloop Deep Lake user. We'll curate a weekly digest of top stories in the AI industry and research, summarizing the most important news, papers, projects, and sources so you have a 360-degree view of everything happening on the market.

Key Takeaways

This week’s key developments include:

  • X.ai announced Grok 1.5, which has a 128,000-token context window and outperformed models such as Claude 3 Sonnet on benchmarks like HumanEval (programming tasks).

  • Microsoft and OpenAI plan to build a $100 billion supercomputer called “Stargate”. The computer will use specialized server chips to boost computing power.

  • Mobius Labs and two independent studies advanced 1-bit and 2-bit quantization research, building on Microsoft's work in February 2024.

  • OpenAI announced a closed-source voice-cloning model that needs just a 15-second sample to clone a voice and reproduce it in any language.

  • Hume AI unveiled its Empathic Voice Interface (EVI), which can understand the user’s emotional expressions.

  • DeepMind introduced its new Search-Augmented Factuality Evaluator (SAFE) system to improve the factuality of long-form LLM-generated content.

  • Accenture was revealed to be one of the most profitable companies in the generative AI space, securing $600 million in deal bookings in a single fiscal quarter.

The Latest AI News

Last week was an interesting one for AI, with events ranging from advancements in conversational AI to X.ai unveiling their new AI chatbot.

Grok 1.5 Revealed by X.ai

Grok 1.5 will soon be available for X users. X.ai announced its new AI chatbot, which is “capable of long context understanding and advanced reasoning”. This expands Grok’s application in processing and analyzing large documents.

It has a context length of 128,000 tokens, which allows it to better comprehend longer conversations. Additionally, Grok 1.5 outperforms its predecessor Grok 1 and other notable models like Claude 3 Sonnet.

Benchmark comparison of Grok 1.5 to other LLMs. (Source)

Grok 1.5’s performance on HumanEval was higher than that of other state-of-the-art models like GPT-4, falling short only of Claude 3 Opus. Nevertheless, this indicates that AI’s problem-solving capabilities are quickly improving.

Grok 1.5 will be included in X’s chatbot soon, available to early testers and existing Grok users.

Microsoft and OpenAI Plan for $100 Billion Supercomputer

Tech giants Microsoft and OpenAI are collaborating on a massive project to develop a $100 billion supercomputer called “Stargate”, set to launch by 2028. The project aims to significantly boost computing power using specialized server chips.

This isn’t OpenAI's first massive strategic partnership. In March 2024, Abu Dhabi was reportedly in talks to invest in OpenAI’s chip venture to launch a semiconductor business.

These ventures signal a reduced dependence on chip-manufacturing giants like Nvidia. OpenAI’s semiconductor pursuit, combined with the Stargate project’s plan to leverage Microsoft’s resources and infrastructure, hints at OpenAI’s intention to rely less on Nvidia for its advanced chips.

OpenAI's 15-Second Voice Cloning Tool

OpenAI has released a preview of a brand-new text-to-speech model called Voice Engine that can generate natural-sounding speech from a single 15-second reference voice recording. The company has been testing the technology with trusted partners in applications like translating content, supporting non-verbal individuals, and helping patients recover their voices.

Hume AI Introduces Empathic Voice Interface 

Hume AI has raised $50 million in series B funding (led by EQT Ventures) to further develop its Empathic Voice Interface (EVI).

EVI, a conversational AI with emotional intelligence, bridges the gap between human and machine communication by understanding and responding to users' emotions, enhancing user experience.

We might see improved user satisfaction and customer service in various industries because of its ability to process emotional cues and adjust responses accordingly.

Further advancements in conversational AI include a developer’s preview of Open Interpreter’s 01 Light, a portable voice interface that can control home computers.

Advancements in AI Research

AI research in the last week tackled key challenges, such as long-form factuality, aligning human values with AI, and advances in quantization that lead to more efficient computing.

Quantization Advancements Boost Memory and Energy Efficiency

In February 2024, Microsoft demonstrated that LLMs with ternary weights can match the performance of traditional 16-bit LLMs, leading to large improvements in memory efficiency and energy consumption.

Following Microsoft’s publication, two independent studies successfully replicated these results: the 1bitLLM models, trained on 100 billion tokens of the RedPajama dataset, and the Nous Research models, trained on the 60-billion-token Dolma dataset.

Mobius Labs has also made advancements in quantization for 1-bit and 2-bit models. They showed that fine-tuning a small fraction of parameters can increase the performance of 1-bit quantized models beyond that of models quantized using traditional methods.
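To make the idea concrete, here is a minimal sketch of absmean ternary quantization in the spirit of Microsoft's ternary-weight work. The function names are our own, and this is an illustration of the general technique, not the papers' exact implementation:

```python
import numpy as np

def quantize_ternary(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize a weight matrix to ternary values {-1, 0, 1}.

    Absmean scaling: divide by the mean absolute weight,
    then round and clip to the ternary range.
    """
    scale = float(np.mean(np.abs(weights))) + 1e-8  # avoid division by zero
    ternary = np.clip(np.round(weights / scale), -1, 1)
    return ternary.astype(np.int8), scale

def dequantize(ternary: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximation of the original weights."""
    return ternary.astype(np.float32) * scale

# A 16-bit weight matrix shrinks to roughly 1.58 bits per weight
# (log2 of 3 states), which is where the "1-bit LLM" shorthand comes from.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(4, 4)).astype(np.float32)
w_ternary, s = quantize_ternary(w)
```

Storing only the int8 ternary codes plus one float scale per matrix is what drives the memory savings; matrix multiplies against {-1, 0, 1} also reduce to additions and subtractions, which is where the energy savings come from.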

DeepMind Improves Long-Form Factuality in LLMs

Since LLMs are known to generate incorrect facts, DeepMind researchers have presented a new approach to improve the factuality of LLMs in generating long-form content. They provided a comprehensive method for evaluating and improving the accuracy of information produced by LLMs.

The most important contributions from the paper include:

  1. LongFact prompt set: A new set of prompts designed to benchmark the factuality of LLMs across various topics.

  2. F1@K metric: A proposed metric that balances the precision of supported facts in a response with the desired response length.

  3. Search-Augmented Factuality Evaluator (SAFE): This new method evaluates response factuality by breaking it down into individual facts and verifying them against search results.

Example of the SAFE method in action. (Source)

The SAFE method achieved top results in factuality evaluation by agreeing with human annotators 72% of the time. It offers a more cost-effective alternative to human annotators (20x cheaper), making large-scale factuality evaluation more feasible.
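The F1@K metric above can be sketched in a few lines. This is our reading of the paper's definition (precision over claimed facts, recall capped against a desired fact count K), so treat the exact formulation as an approximation:

```python
def f1_at_k(supported: int, not_supported: int, k: int) -> float:
    """F1@K-style factuality score for one long-form response.

    precision: fraction of the response's facts that are supported.
    recall@K:  supported facts relative to a desired count K,
               capped at 1 so padding an answer isn't rewarded.
    """
    total = supported + not_supported
    if total == 0 or supported == 0:
        return 0.0
    precision = supported / total
    recall = min(supported / k, 1.0)
    return 2 * precision * recall / (precision + recall)

# A response with 45 supported facts out of 50, scored against K = 64:
score = f1_at_k(supported=45, not_supported=5, k=64)
```

The cap on recall is the key design choice: without it, a model could inflate its score simply by emitting more facts, correct or not.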

Aligning AI With Human Values

Joe Edelman, Ryan Lowe, and Oliver Klingefjord explored values alignment in their new paper funded by OpenAI. 

As LLMs become more powerful, aligning these models with human values is crucial. The authors focused on two main tasks: finding out what values people hold and how these values can be combined into clear goals for training AI models.

A new method called Moral Graph Elicitation (MGE) was proposed to gather and combine people’s values by conversing with them through an LLM. The authors set six standards for a successful goal and tested MGE with 500 Americans on controversial topics, showing that it can work well.

The three-step process of MGE. (Source)

Overcoming RLHF Limitations With RLP

Researchers have developed an unsupervised framework to tackle the issue of reward model inaccuracy in the reinforcement learning from human feedback (RLHF) process for LLMs. 

In particular, the issue of policy optimization shifting the LLMs’ data distribution leads to the previously trained reward model becoming inaccurate off-distribution. This inaccuracy causes the LLMs' performance to degrade.

The new framework, Reward Learning on Policy (RLP), aims to refine a reward model using policy samples. It analyzes these policy samples using unsupervised learning and employs a method for creating simulated, high-quality data for more effective training.

Comparison of standard RLHF to RLHF with RLP. (Source)

Their framework ensures the reward model remains accurate and on-distribution, leading to improved overall performance and reliability of LLMs.

GenAI App Spotlight: NeighborhoodInsight - Chat with Multi-Modal Restaurant Review Data

What's the best burger restaurant in the Bay Area that isn't too noisy and offers vegetarian options? Let's use Google Maps photos and reviews to decide! Our editorial team has built a multi-modal RAG app with CLIP, Deep Lake, and LangGraph to chat with any neighborhood with GenAI. Read more about how to build a multi-modal RAG project like this yourself here.

Images embedded with CLIP used in the article
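The core retrieval step in a multi-modal RAG app like this boils down to cosine similarity over CLIP-style embeddings, since CLIP maps images and text into the same vector space. The sketch below uses random stand-in vectors; in the real app the embeddings would come from a CLIP model and live in a Deep Lake vector store:

```python
import numpy as np

def cosine_top_k(query: np.ndarray, corpus: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the k corpus vectors most similar to the query.

    Because CLIP embeds photos and review text in one space, a single
    text query can rank restaurant photos and reviews together.
    """
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = c @ q  # cosine similarity of each corpus vector to the query
    return list(np.argsort(-sims)[:k])

# Stand-in embeddings: 5 "documents" (photos/reviews) of dimension 512
rng = np.random.default_rng(42)
docs = rng.normal(size=(5, 512))
query = docs[2] + 0.01 * rng.normal(size=512)  # query close to document 2
top = cosine_top_k(query, docs, k=1)
```

The retrieved photos and review snippets are then passed to the LLM as context, which is the "chat with your neighborhood" step orchestrated by LangGraph.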

Frameworks We Love

Some frameworks that caught our attention in the last week include:

  1. Mini Gemini: Improves multi-modality Vision Language Models (VLMs) through high-resolution visual tokens and a high-quality dataset for better image comprehension.

  2. OmniParser: Designed for visually-situated text in documents, OmniParser can perform text spotting, key information extraction, and table recognition tasks.

  3. GauStudio: Supports advanced techniques for artifact reduction and 3D reconstructions.

If you want your framework to be featured here, get in touch with us. 

Conversations We Loved This Week

The debate about where we draw the line for generative AI use continues as brands limit its use when working with ad agencies. We also saw an eye-catching discussion about what type of generative AI company is the most profitable, with the answer being one you wouldn’t expect.

Brands Sceptical About Generative AI Usage 

Large brands are taking a stand against ad agencies using AI. (Source)

Discussions have been heating up regarding the use of AI in marketing. Ad agencies are looking to increase their use of AI, while brands are worried about the potential pitfalls.

Some brands are going a step further, requiring explicit authorization before any AI tools can be used - even for conceptual work.

Although generative AI tools like ChatGPT have gained widespread popularity, high-profile missteps have caused brands to be sceptical about them. One example of this was Google’s Gemini generating embarrassing images in February 2024.

Samuel Deats' take on the situation. (Source)

Samuel Deats described generative AI as “theft on a massive scale” last week.

He commented on the ethical and legal concerns raised by AI-generated images. There isn’t a straightforward way to ensure a generated image doesn’t closely mimic an existing copyrighted work.

The whole situation indicates that there’s a growing call for clear ethical and legal guidelines on the use of copyrighted materials in training AI models and on the commercial use of AI-generated content.

Consulting Companies More Profitable Than Development Companies?

Peter Berg’s comment about Accenture’s revenue. (Source)

The head of Visa Ventures, Peter Berg, made an interesting point about the type of company that's seemingly bagged the most money from generative AI… a global consultancy?

Accenture recently reported $600 million worth of generative AI bookings in its second fiscal quarter alone.

Development companies may need to focus on partnerships and service models emphasizing implementation and integration to capture more value (and make their offerings easier to put into production). Accenture’s profitability shows that deploying generative AI is just as—if not more—important than the development aspect.

Fundraising in AI

Skyflow, 0G, and Celestial AI all closed successful funding rounds, with Celestial AI the biggest winner.

Skyflow Raises $30 Million in Series B Funding Round 

Data privacy specialist Skyflow successfully raised $30 million in a Series B extension round led by Khosla Ventures. This funding round coincides with Skyflow's expanded focus on enhancing data privacy management for data used by LLMs.

0G Raises $35 Million in Pre-Seed Round

The web3 infrastructure firm 0G has raised $35 million in a pre-seed round to build a “full-stack blockchain-based solution for training, deploying, and running AI models”. 0G operates in the modular technology space, competing with EigenLayer.

Celestial AI Raises $175 Million in Series C Funding Round

Known for its Photonic Fabric™ optical interconnect technology platform, Celestial AI successfully raised $175 million in a funding round led by Thomas Tull’s U.S. Innovative Technology Fund (USIT). The successful Series C round validates the Photonic Fabric™ technology and its potential impact on the future of computing.