
Stability AI & Stanford's New LLMs, DeepMind's Mixture-of-Depths, Beating LLaMA-2 for <$100K

Also, major investments from Lambda, Canada's Trudeau, & OpenAI's beef with Tesla and Google

Newsletter Introduction

Hi! You're receiving this newsletter because you subscribed to Activeloop updates, are one of our GenAI360 course takers, or are an Activeloop Deep Lake user. We'll curate a weekly digest of top stories in the AI industry and research, summarizing the most important news, papers, projects, and sources so you have a 360-degree view of everything happening in the market.

Key Takeaways

This week’s key developments include:

  • Stanford’s Octopus V2 demonstrates a new method for on-device AI that surpasses GPT-4 in performance, addressing privacy and cost concerns.

  • Mixture-of-Depths (MoD), a new method from DeepMind researchers, enhances computational efficiency by achieving baseline performance with fewer FLOPs per forward pass.

  • Lambda’s $500 million funding marks a significant expansion in cloud offerings for AI start-ups by making high-performance AI compute resources more accessible.

  • OpenAI faces controversy over training methods for Sora and competition from new platforms like Higgsfield AI for video generation.

  • The open-source project JetMoE showcases cost-effective, high-performance LLM training that is accessible beyond tech giants.

Like our newsletter? Share it with a friend and get one month free on Activeloop Deep Lake premium plans.

Refer 3 friends or colleagues and get one month free for activeloop.ai. Increase knowledge retrieval accuracy by up to 41%, train ML models at scale, visualize, version control, and query multi-modal data.

The Latest AI News

We saw some big news in AI over the last week, with Lambda expanding its cloud offerings and the controversy surrounding OpenAI’s training methods for Sora.

Lambda’s $500 Million Leap

The GPU cloud company Lambda has secured $500 million to expand its on-demand cloud offering for AI start-ups using NVIDIA GPUs. This will let Lambda deploy many GPUs without requiring a long-term contract from users, making AI compute resources more affordable.

It expands Lambda’s cloud offerings by letting more AI developers train and fine-tune generative AI models, and it makes high-performance AI compute resources more accessible to AI start-ups, which should accelerate development and innovation.

State of Analytics Engineering Report 2024 Released

A new report by dbt Labs highlights some interesting trends that will shape the analytics engineering field in 2024.

Some notable trends include:

  • Increasing importance of data quality: There’s a significant rise in concerns over data quality.

  • Challenges of making data accessible: Organizations are struggling to make data easily accessible and understandable across departments.

  • Reduced budgets: Economic downturns have led to tighter budgets for data teams.

One key insight was that 57% of professionals claimed poor data quality was a major issue—an increase from 41% in 2022. This means data quality will need to be a higher priority moving forward.

The main challenges during data preparation. (Source)

Stability Audio 2.0 Announced

Stability Audio 2.0 lets users create full-length music tracks up to three minutes long from a single natural language prompt. It supports text-to-audio and introduces audio-to-audio capabilities. This means audio samples can be uploaded and turned into different sounds through natural language prompts.

A few advantages Stability Audio 2.0 has over its predecessor Stability Audio 1.0 include:

  • Sound effect generation

  • Style transfer

  • Ability to produce structured compositions with intros and outros

It’s a big leap forward in AI music generation, producing high-quality music with an architecture built on a latent diffusion model and a highly compressed autoencoder.

The autoencoder of Stability Audio 2.0 compresses and reassembles audio to its initial state. (Source)

Controversy and Competition Surrounding OpenAI’s Sora

Debates have been heating up regarding the methods that OpenAI is using to train its AI video generator, Sora.

Although it’s unclear whether OpenAI is indeed using YouTube videos as training data, YouTube CEO Neal Mohan has said doing so would violate the platform’s terms of service.

But, this isn’t the only issue that OpenAI is facing with Sora.

Former Snap AI chief Alex Mashrabov launched his own AI video generation platform, Higgsfield AI. Sora’s accessibility remains limited since it targets well-funded creatives, so Higgsfield AI positions itself as a more accessible alternative for a wider range of users.

Amazon Walks Out On… “Just Walk Out”

Amazon’s “Just Walk Out” technology, which used cameras and sensors for cashier-less shopping, was abandoned because of its heavy reliance on human contractors and privacy concerns. 

The system proved to be too costly and slow, with outsourced labor taking hours to process data for customer receipts.

Trudeau Announces Package of AI Investment Measures

Prime Minister Justin Trudeau announced a $2.4 billion investment to accelerate Canada’s AI sector. The aim is to boost job growth and establish international leadership in AI.

This is part of a broader strategy to enhance Canada’s AI capacity by focusing on AI safety and infrastructure. It’s also expected to support the development of AI applications and contribute to economic growth by creating high-tech jobs.

Advancements in AI Research

AI research has made some big strides lately, with Stanford’s Octopus V2 outperforming GPT-4 and new methods boosting faithfulness in LLMs.

Innovation in On-Device AI: Stanford’s Octopus V2 Surpasses GPT-4

The Octopus models achieved a higher function call accuracy than GPT-4. (Source)

Stanford University researchers have presented a new method that boosts the performance of on-device models. Their paper focuses on a model with 2 billion parameters that outperformed GPT-4 in function-calling accuracy and latency while reducing context length by 95%.

Serving large-scale language models to edge devices like smartphones has raised cost and privacy concerns. This new method addresses both by providing a more efficient on-device solution.
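
Under the hood, the paper describes mapping each device API to a dedicated “functional token”, so the model selects a function by emitting a single token instead of reading long tool descriptions at inference time. Here’s a minimal sketch of that idea; the token names and parsing helper are illustrative assumptions, not the authors’ code.

```python
# map each on-device API to a dedicated functional token (names are
# illustrative; the paper reserves special tokens for each function)
FUNCTIONAL_TOKENS = {
    "<nexa_0>": "take_photo",
    "<nexa_1>": "set_alarm",
    "<nexa_2>": "send_text",
}

def parse_call(generated: str):
    """Map a generated functional token plus arguments to a device call."""
    token, _, args = generated.partition("(")
    fn = FUNCTIONAL_TOKENS.get(token.strip())
    if fn is None:
        raise ValueError(f"unknown functional token: {token!r}")
    return fn, args.rstrip(")")

# e.g. the fine-tuned model emits a single token plus arguments instead
# of first reading long tool descriptions:
print(parse_call("<nexa_0>(mode='selfie')"))  # ('take_photo', "mode='selfie'")
```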

DeepMind Boosts Computational Efficiency 

DeepMind researchers introduced a new method called Mixture-of-Depths (MoD) where transformers dynamically allocate computational resources (FLOPs) across different parts of a sequence. This leads to optimized performance while sticking to a predefined compute budget.

This method achieves baseline performance with fewer FLOPs per forward pass, boosting efficiency without sacrificing accuracy.
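
As a rough illustration of how such routing could work, here is a minimal PyTorch sketch in which a learned router selects a fixed fraction of tokens to pass through a sub-block while the rest skip it along the residual path. The class and parameter names are our assumptions, not DeepMind’s implementation.

```python
import torch
import torch.nn as nn

class MoDBlock(nn.Module):
    """Route a fixed fraction of tokens through a block; the rest skip it."""

    def __init__(self, d_model: int, block: nn.Module, capacity: float = 0.125):
        super().__init__()
        self.block = block                    # any token-wise sub-block (attn + MLP)
        self.router = nn.Linear(d_model, 1)   # scalar routing score per token
        self.capacity = capacity              # fraction of tokens given compute

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape                     # (batch, seq_len, d_model)
        k = max(1, int(t * self.capacity))
        scores = self.router(x).squeeze(-1)   # (b, t)
        top = scores.topk(k, dim=-1).indices  # indices of tokens that get compute
        idx = top.unsqueeze(-1).expand(-1, -1, d)
        selected = x.gather(1, idx)           # (b, k, d)
        update = self.block(selected)         # block returns a residual update
        gate = torch.sigmoid(scores.gather(1, top)).unsqueeze(-1)
        # selected tokens receive a gated residual update; others pass through
        return x.scatter(1, idx, selected + gate * update)

# usage: wrap a stand-in sub-block and run a batch through it
layer = MoDBlock(512, nn.Sequential(nn.Linear(512, 512), nn.GELU()))
out = layer(torch.randn(2, 128, 512))         # only 16 of 128 tokens hit the block
```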

It’s a big shift in how computational resources are managed within language models, which could mean more sustainable and scalable AI development in the future.

Visual Autoregressive Modelling Boosts Image Generation Quality

Comparison of standard autoregressive modelling to the new approach. (Source)

Researchers presented a new approach to image generation called Visual AutoRegressive (VAR) modelling, which focuses on “next-scale” prediction to improve efficiency and quality. Notably, it outperforms existing models in areas like:

  • Image quality

  • Speed

  • Data efficiency

  • Scalability

Moreover, the paper highlighted VAR's potential for practical applications in image editing and content creation by demonstrating zero-shot generalization in image manipulation.
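
To make “next-scale” prediction concrete, the sketch below shows the coarse-to-fine loop at a high level: each autoregressive step predicts an entire higher-resolution token map conditioned on all coarser maps, and a VQ decoder turns the resulting pyramid into pixels. The `transformer` and `vq_decoder` interfaces are hypothetical placeholders, not the authors’ API.

```python
import torch

def var_generate(transformer, vq_decoder, scales=(1, 2, 4, 8, 16)):
    """Generate an image by predicting token maps coarse-to-fine."""
    token_maps = []
    for s in scales:
        # one autoregressive step predicts the entire s x s token map,
        # conditioned on every coarser map generated so far
        logits = transformer(token_maps, target_size=s)  # (1, s*s, vocab)
        tokens = torch.distributions.Categorical(logits=logits).sample()
        token_maps.append(tokens.view(1, s, s))
    # decode the multi-scale token pyramid back to pixels
    return vq_decoder(token_maps)
```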

Unsupervised Learning Advancements

This paper introduces CRATE-MAE, a “transformer-like masked autoencoder architecture” that applies white-box design principles to large-scale unsupervised representation learning. It does so by drawing a connection between diffusion processes and compression, meaning each layer’s role is “mathematically interpretable”.

Since it performed well on large-scale imagery datasets with fewer parameters, CRATE-MAE shows potential for more efficient deep learning models that don’t compromise performance.

Reducing Hallucinations Using Hypothesis Verification

An overview of how the TWEAK method reduces hallucinations. (Source)

TWEAK is a decoding-only method that integrates with any knowledge-to-text (K2T) generator to reduce hallucinations. This new method verifies generated text against input facts using a Hypothesis Verification Model (HVM). 

The authors also developed a new dataset called Fact-Aligned Textual Entailment (FATE) to train a task-specific HVM. FATE pairs input facts with original and perturbed descriptions, teaching the model to assess the faithfulness (accuracy and reliability) of generated text.

They showed TWEAK can boost faithfulness without compromising the quality of the generated text, a common trade-off in text generation.
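
Here’s a minimal sketch of the reranking idea, with an off-the-shelf NLI model standing in for the paper’s HVM: each candidate hypothesis is scored for entailment against the input facts and blended with the generator’s log-probability. The blending weight and helper names are assumptions.

```python
from transformers import pipeline

# stand-in HVM: an off-the-shelf NLI model scores whether the input
# facts entail a candidate hypothesis
nli = pipeline("text-classification", model="roberta-large-mnli")

def tweak_score(facts: str, hypothesis: str, gen_logprob: float,
                alpha: float = 0.5) -> float:
    """Blend the generator's log-probability with a faithfulness score."""
    results = nli({"text": facts, "text_pair": hypothesis}, top_k=None)
    entail = next(r["score"] for r in results if r["label"] == "ENTAILMENT")
    return (1 - alpha) * gen_logprob + alpha * entail

# rerank beam candidates: keep the hypothesis best supported by the facts
facts = "Alan Turing | birth place | London"
beams = [("Alan Turing was born in London.", -2.1),
         ("Alan Turing was born in Paris.", -1.9)]
best = max(beams, key=lambda b: tweak_score(facts, b[0], b[1]))
print(best[0])
```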

Frameworks We Love

Some frameworks that caught our attention in the last week include:

  1. RealHumanEval: Web interface that measures how well LLMs assist programmers via autocomplete or chat support.

  2. CodeEditorBench: Platform that assesses LLMs’ code editing capabilities across various tasks, such as debugging, translating, and requirement switching.

  3. LM-Guided COT: Leverages a small language model to guide an LLM in reasoning tasks like multi-hop question-answering.

If you want your framework to be featured here, get in touch with us. 

Conversations We Loved This Week

A couple of interesting conversations came up recently: one on how LLM training can be made more affordable, the other regarding Ethan Knight’s move to xAI.

Lowering the Cost of LLM Training

JetMoE reduces LLM training costs. (Source)

Zengyi Qin introduced JetMoE, an open-source project by MyShell that shows LLM training can be both cost-effective and high-performance. It challenges the common belief that only large organisations can develop competitive LLMs.

With a training cost of less than $0.1 million, JetMoE outperformed Meta’s LLaMA-2, an LLM backed by a billion-dollar company. It’s also worth noting that JetMoE has only 2.2 billion active parameters, a fraction of the active parameters in LLaMA-2.

Qin also mentions that JetMoE can be fine-tuned “with a very limited computing budget”.
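
JetMoE’s 2.2 billion active parameters reflect a sparse mixture-of-experts design, where each token activates only a few experts so the per-token compute is a fraction of the total parameter count. Here’s a toy sketch of that routing; the layer sizes and expert counts are illustrative, not JetMoE’s actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Each token is processed by only top_k of n_experts expert MLPs."""

    def __init__(self, d_model: int = 512, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.gate = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):        # only top_k experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# with top_k=2 of 8 equal-sized experts, roughly a quarter of the
# expert parameters are "active" for any given token
moe = SparseMoE()
y = moe(torch.randn(64, 512))
```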

The Ethan Knight Situation

Musk’s comments on Ethan Knight’s situation. (Source)

Ethan Knight, previously part of Tesla’s computer vision team, has moved to Musk’s xAI. The transition came at a highly competitive moment, with OpenAI also showing interest in Knight.

Musk clarified Knight’s previous role at Tesla while commenting on the intense competition for AI talent.

It shows that talent moving from leading companies can have a massive impact on development and innovation trajectories. Moreover, we can see just how fierce competition is becoming in the AI industry for talent, with Musk describing it as “the craziest talent war” he’s ever seen.

Fundraising in AI

Luminance, Coalesce, and SiMa.ai all successfully raised funds this week, with SiMa.ai raising the most of the three.

Luminance’s $40 Million Series B Funding

London-based legal AI company Luminance secured $40 million in Series B funding led by March Capital, with contributions from National Grid Partners. Luminance’s AI automates contract negotiation and the review of other legal documents, making legal processes simpler and quicker.

Coalesce Raises $50 Million in Series B Funding

Data transformation tech start-up Coalesce raised $50 million in Series B funding, led by Industry Ventures and Emergence Capital.

Coalesce focuses on automating SQL coding for data transformations to accelerate data pipeline development. They plan on using this funding to improve software scalability and introduce AI-driven features.

SiMa.ai Raises $70 Million to Advance AI in Automotive and Robotics

Chip start-up SiMa.ai raised $70 million to accelerate AI integration in the automotive and robotics industries. The start-up focuses on creating advanced AI chips that improve the performance and capabilities of autonomous vehicles and robotic systems.

SiMa.ai’s chips are designed to handle the high-level computations needed for AI applications, like machine learning algorithms and sensor data processing.