
Meta's SceneScript for 3D, New Model from Mistral, and Activeloop Raises Series A to Onboard Fortune 500

Meta detoxifies LLMs, Mistral introduces 32K-token model, Claude beats expectations, Activeloop announces enterprise success with Database for AI

Newsletter Introduction

Hi! You're receiving this newsletter because you're subscribed to Activeloop updates, are one of our GenAI360 course takers, or use Activeloop Deep Lake. We'll be curating a weekly digest of top stories in AI industry and research, summarizing the most important news, papers, projects, and sources so you have a 360-degree view of everything happening in the market.

Key Takeaways

  • Activeloop, the Database for AI company, announces a Series A round of $11M, with several Fortune 500 customers like Bayer Radiology in production and plans to expand its Fortune 500 enterprise footprint this year.

  • Meta’s SceneScript is a new method for accurately reconstructing 3D environments. It was trained on 100,000 unique interior environments.

  • Sakana AI’s Evolutionary Model Merge represents a significant step toward automatically generating foundation AI models and overcoming the limitations of current model merging practices.

  • Tinygrad paused their TinyBox project because of AMD RX 7900 XTX GPU firmware issues.

  • Cohere is looking to raise $500 million at a valuation of $5 billion.

  • Mistral released a new open-source model with a 32k-token context window that outperformed LLaMa-2 13B and performed similarly to LLaMa-1 34B.

The Latest AI News

It’s been an eventful week, with everything from leadership changes to a $650 million deal to acquire talent and AI model licenses.

Let’s dive into what this means for the future of AI.

Mistral Releases New Model

The reveal of the new Mistral model at the hackathon in San Francisco. (Source)

Mistral revealed their new open-source model, Mistral 7B v0.2, at the San Francisco hackathon event on March 23.

The new model upgrades its predecessor's context window from 8,000 to 32,000 tokens, allowing it to process longer text sequences and produce relevant outputs.

Mistral 7B v0.2 has also shown promising results across various performance benchmarks. It outperformed LLaMa-2 13B and had similar results to LLaMa-1 34B, despite having only 7.3 billion parameters.
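For readers who want to try the longer context window, here is a minimal sketch of loading the instruct variant with Hugging Face Transformers. The repo ID, prompt format, and generation settings are assumptions for illustration, not an official Mistral recipe.

```python
# A minimal sketch of running the Mistral 7B v0.2 instruct variant with Hugging Face
# Transformers. The repo ID and generation settings are assumptions; in practice you
# need a GPU with enough memory (or quantization) to hold the 7.3B-parameter model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed Hub repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The 32k-token context window lets you pass long documents in a single prompt.
prompt = "[INST] Summarize the following report in three sentences: ... [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```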

Anthropic Dominates the LLM Leaderboards and Banks an Extra $2.75 Billion from Amazon

Amazon has completed its $4 billion investment in Anthropic: an initial $1.25 billion last September and an additional $2.75 billion recently.

Industry benchmarks demonstrate that Claude 3 Opus, the smartest of the model family, has set a new standard, outperforming other models available today—including OpenAI’s GPT-4—in the areas of reasoning, math, and coding.

Comparison of the Claude 3 models to others across benchmark tasks.

Activeloop Raises Series A to Bring the Database for AI to Fortune 500

After recording success with enterprises like Bayer Radiology and Matterport, the startup announced a new funding round from Streamlined Ventures, Y Combinator, Samsung Next, Alumni Ventures, and Dispersion Capital. The capital will enable the onboarding of further enterprise customers, empowering anyone to organize complex unstructured data and retrieve knowledge with AI. More on the topic here.

Querying across unstructured data with natural language in Deep Lake
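For a flavor of what natural-language querying over unstructured data can look like, below is a hypothetical sketch using Deep Lake as a vector store through LangChain's integration. The dataset path, sample documents, and choice of OpenAI embeddings are assumptions for illustration, not Activeloop's production setup.

```python
# A hypothetical sketch of natural-language retrieval over a Deep Lake vector store
# via LangChain. Dataset path, documents, and embedding provider are assumptions;
# OpenAIEmbeddings requires an OpenAI API key.
from langchain_community.vectorstores import DeepLake
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
db = DeepLake(dataset_path="./my_deeplake_store", embedding=embeddings)

# Index a few unstructured text snippets (e.g., report excerpts).
db.add_texts([
    "Radiology scan 0231 shows a 4 mm nodule in the left lung.",
    "Floor plan for unit 12B includes two bedrooms and an open kitchen.",
])

# Query with plain natural language.
results = db.similarity_search("Which scans mention lung nodules?", k=1)
print(results[0].page_content)
```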

Stability CEO Steps Down

Stability AI’s former CEO, Emad Mostaque. (Source)

Stability AI announced on March 23, 2024, that Emad Mostaque had stepped down as CEO and left his position on the board of directors. According to Mostaque's tweet, he did so in pursuit of decentralized AI.

Microsoft Spends $650 Million for Inflection’s AI Talent and Models

Microsoft has paid Inflection $650 million to license its models and has hired much of its staff for its new consumer unit, Microsoft AI, including Inflection's co-founders Mustafa Suleyman and Karén Simonyan.

With this move, Microsoft aims to drastically boost its AI offerings by integrating Inflection's team and generative AI models into its own ecosystem.

Inflection said its focus has shifted toward selling models to enterprise customers. The company has likely noticed a growing demand for AI solutions in enterprise settings, a segment that is becoming increasingly competitive.

Apple in Talks to Bring Gemini AI to iPhone

Speaking of drastically boosting AI offerings, Apple is moving in a similar direction as Microsoft. But instead of acquiring an AI startup's model licenses and staff, Apple wants to bring Google's Gemini AI to iPhone.

However, this move might be seen as questionable, as Gemini has had issues with biased training data and outputs.

Nevertheless, Apple clearly sees the value of additional AI features, which are becoming a key differentiating factor in the smartphone market.

Meta’s SceneScript for 3D Environment Reconstruction

SceneScript's reconstruction of a 3D environment. (Source)

Meta has been working on a new method of reconstructing 3D environments called SceneScript. 

We mentioned that Gemini AI has had issues with biased training data, but Meta has been much more cautious than Google in its training approach. SceneScript was trained on a synthetic dataset of 100,000 unique interior environments, as the publicly available training datasets weren't suitable for this model.

Meta highlighted SceneScript's potential, such as advancing LLMs' capabilities to "answer complex spatial queries". However, at this early stage it's difficult to tell whether SceneScript will have as big an impact as Meta claims.

Advancements in AI Research

AI research has seen significant advancements this week, with novel approaches to reducing the toxicity of LLMs and combining open-source models to create foundation models.

Sakana AI’s Evolutionary Optimization Method

Sakana AI’s new approach to combining open-source models into foundation models. (Source)

Japanese AI company Sakana AI recently published research that could change how foundation models are developed.

They introduced the Evolutionary Model Merge method, which discovers optimal ways to combine open-source models into new foundation models. The approach has produced models that achieve strong benchmark results, for example on Japanese-language math reasoning.

This approach represents a big step toward fully automated foundation model development, reducing the time and resources needed to create AI models tailored to specific tasks or domains. It also overcomes the limitations of current model merging practices that rely on domain knowledge and human intuition.
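To make the idea concrete, here is a deliberately simplified sketch of parameter-space model merging, the kind of operation an evolutionary search can tune. The single global mixing weight is an assumption; Sakana AI's method evolves far richer merge configurations (and also merges in data-flow space).

```python
# A simplified sketch of parameter-space model merging. The global mixing weight
# `alpha` is an illustrative assumption; Sakana AI's evolutionary search optimizes
# per-layer merge coefficients rather than a single scalar.
import torch

def merge_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    """Linearly interpolate two state dicts with identical keys and shapes."""
    return {key: alpha * sd_a[key] + (1.0 - alpha) * sd_b[key] for key in sd_a}

# Hypothetical usage with two checkpoints sharing the same architecture:
# model_a.load_state_dict(torch.load("checkpoint_a.pt"))
# model_b.load_state_dict(torch.load("checkpoint_b.pt"))
# merged = merge_state_dicts(model_a.state_dict(), model_b.state_dict(), alpha=0.7)
# model_a.load_state_dict(merged)  # model_a now carries the merged weights
```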

OA-CNNs for 3D Semantic Segmentation Tasks

Researchers have developed a new approach to boost the performance of sparse CNNs for 3D semantic segmentation tasks.

Understanding 3D scenes is an integral part of applications like:

  • Robotics

  • Autonomous driving

  • Augmented reality

However, point transformers have performed better at 3D semantic segmentation tasks than sparse CNNs.

This paper introduces Omni-Adaptive (OA) 3D CNNs, which build adaptivity into sparse CNNs. This allows them to achieve higher accuracies in indoor and outdoor scenes while using less computational power than transformers.

Point transformers have overshadowed sparse CNNs, but this paper highlights the latter's untapped potential and demonstrates how they could be the future of 3D semantic segmentation.

Improving the Efficiency of Scaling Large Models With DSP

Existing sequence parallelism approaches are limited by their assumption of a single sequence dimension. This causes issues when adapting these approaches to multi-dimensional transformer architectures, which perform calculations across various dimensions. 

A new approach called Dynamic Sequence Parallelism (DSP) improves the efficiency of scaling large models across applications such as:

  • Language generation

  • Video generation

  • Multimodal tasks

DSP improved end-to-end throughput by 42.0% to 216.8%, a massive improvement over previous sequence parallelism approaches. As a result, DSP’s ability to adapt the parallelism dimension according to computation needs opens up new possibilities for efficient scaling of large models.

Detoxifying LLMs Using Knowledge Editing

An example of detoxifying an LLM for safer outputs. (Source)

Although LLMs like ChatGPT are becoming more capable, there are concerns that they might generate more harmful responses.

This paper reduces the toxicity of LLMs through knowledge editing, requiring only a few tuning steps.

Moreover, the researchers created their own benchmark for this study, SafeEdit. This benchmark covers many unsafe categories and contains metrics for generalization and defense success. 

They found that knowledge editing is an efficient way of detoxifying LLMs without significantly impacting their overall performance. 

Meta has also highlighted AI safety as a key component of their LLM, LLaMa. This paper lays the foundation for letting us get the best of both worlds from AI: beneficial and safe outputs.

Frameworks We Love 

Here are a few frameworks that caught our eye in the last week:

  1. AnyV2V: A training-free framework that simplifies video-to-video editing into two main steps. 

  2. DreamReward: Focuses on learning and improving text-to-3D models from human preference feedback.

  3. Multimodal Video Understanding (MVU): Improves understanding of long videos by combining multimodal information with an efficient inference technique in a single language model pass.

If you want your framework to be featured here, get in touch with us. 

Conversations We Loved This Week

There has been plenty of drama in the AI world recently, from the Tinygrad situation to the controversy surrounding the use of AI in Hollywood.

Agentic Workflows and the Future of AI Progress

Andrew Ng’s comment on AI agentic workflows. (Source)

Andrew Ng made an interesting point: he believes AI agentic workflows could lead to more AI progress than foundation models themselves.

In agentic workflows, an LLM iterates over a document multiple times, leading to better results than single-pass outputs. This resembles human cognitive processes, where refinement and iteration help achieve better outcomes.

He also mentioned that more research papers are being published on agents, which means more developers and researchers might be catching onto their potential to overcome current LLM limitations.
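As a rough illustration of the pattern Ng describes, here is a minimal draft-critique-revise loop. The call_llm helper is a hypothetical stand-in for any chat-completion API, and the prompts are assumptions.

```python
# A minimal sketch of an iterative agentic workflow: draft, critique, revise.
# `call_llm` is a hypothetical placeholder for any chat-completion client.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred chat-completion client here")

def agentic_draft(task: str, rounds: int = 3) -> str:
    """Refine a draft over several passes instead of generating it once."""
    draft = call_llm(f"Write a first draft for the following task:\n{task}")
    for _ in range(rounds):
        critique = call_llm(f"Critique this draft and list concrete improvements:\n{draft}")
        draft = call_llm(
            f"Revise the draft to address the critique.\n"
            f"Draft:\n{draft}\n\nCritique:\n{critique}"
        )
    return draft
```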

The Tinygrad and AMD Situation

George Hotz’s comment on the Tinygrad situation. (Source)

George Hotz, president of Comma AI and founder of Tinygrad, recently commented on AMD's GPU firmware challenges.

Tinygrad is developing an AI compute cluster project called TinyBox, which uses six Radeon RX 7900 XTX GPUs. However, development has been paused because of driver instability issues, such as crashes.

Tinygrad mentioned that the GPU firmware was “complex, undocumented, closed source, and signed”, making it difficult to work with.

Hotz mentioned that he had talked with AMD’s CEO Lisa Su, but she “indirectly rejected” his offer to make the GPU’s firmware open source. Open-sourcing the firmware could help drive AI advancements, but AMD is a $300 billion company that doesn’t want to take that risk just yet.

AI in Hollywood?

Samuel Deats' comments on the use of AI in Hollywood. (Source)

News has broken that Hollywood studios and talent agencies have been prodded by OpenAI to start using AI in their work, but that doesn’t sit well with everyone.

Some believe that AI doesn't belong in creative industries, as AI-generated art and entertainment lack the personal touch and emotion of human works. Even though AI can generate large amounts of content quickly, it won't convey deep, nuanced messages like human creators can.

Samuel Deats, who directed the Castlevania Netflix series, described generative AI as "theft on a massive scale". This brings up an important discussion about which applications AI should be used for, as removing the human touch from entertainment has received a lot of backlash.

On the opposite side of the spectrum, Tyler Perry, a Hollywood film mogul, put an $800 million studio expansion on hold last month after being impressed by OpenAI’s Sora.

In any case, research has been advancing the application of AI in video generation.

MovieLLM is a framework that can generate comprehensive datasets for long-video instruction tuning by synthesizing movie plots, styles, and character descriptions using GPT-4. It also overcomes the challenges of manual data collection and annotation.

AI has also been used in character design, making it “more efficient, creative, and interactive”.

This raises an interesting question: where do we draw the line regarding AI in entertainment?

AI can be used to enhance the creative process, but the issue of AI-generated content lacking the human touch needs to be considered. 

Other Fundraising in AI

We've covered fundraising news from Anthropic and Activeloop in previous sections, but that's not the only funding activity on the market! Cohere has been looking to raise funds to cover its expenses, while Borderless AI has already closed a recent funding round.

Cohere Fundraising

Cohere is an AI startup that develops foundation models. It is looking to raise $500 million at a valuation of $5 billion. It competes with OpenAI but focuses on enterprise customers instead.

The huge costs of developing foundation models are a big reason why Cohere is raising such a large amount, and it remains to be seen whether its revenue can outpace the expenses of AI model development.

However, Cohere’s rival Anthropic is projected to achieve significantly higher revenue, as it’s forecasted to reach $850 million by the end of 2024.

Borderless AI Fundraising

Borderless AI, an AI-powered human resources (HR) management platform, raised $27 million in a recent funding round led by Susquehanna and Aglaé Ventures. It was the first company to use AI agents for HR, speeding up processes like onboarding and payments.