RAG: How to Talk to Your Data

PLUS - OpenAI CEO Sam Altman Ousted Amidst Controversy

DevThink.AI

Essential AI Content for Software Devs, Minus the Hype

Welcome! I hope you find these links, gathered over the past week, useful.

In this edition:

  • 📖 TUTORIALS & CASE STUDIES

  • 🧰 TOOLS

  • 📰 NEWS

📖 TUTORIALS & CASE STUDIES

RAG: How to Talk to Your Data
read time: 8 minutes

This article provides a comprehensive guide to analyzing customer feedback with GPT using the RAG (Retrieval-Augmented Generation) approach. It covers loading and splitting documents, storing embeddings in vector stores, retrieving the most relevant chunks, and generating final answers with the RetrievalQA chain, with code examples and explanations for each step. Read more
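
Since the post walks through the standard LangChain-style pipeline (load, split, embed, retrieve, answer), here is a minimal sketch of those steps. It assumes a late-2023 LangChain release, an OPENAI_API_KEY in the environment, and a placeholder feedback file; the article's own code may differ in its details.

```python
# Minimal RAG sketch: load -> split -> embed -> retrieve -> answer.
# Assumes LangChain (circa late 2023) and OPENAI_API_KEY in the environment;
# the file path and question are placeholders.
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load the raw documents (customer feedback, docs, etc.).
docs = TextLoader("customer_feedback.txt").load()

# 2. Split them into overlapping chunks small enough to embed and retrieve.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# 3. Embed the chunks and store them in a vector store.
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

# 4. Build a RetrievalQA chain: retrieve relevant chunks, then generate an answer.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
)

print(qa.run("What do customers complain about most often?"))
```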

Running Llama 2 Locally: A Step-by-Step Guide
read time: 8 minutes
This tutorial provides a comprehensive guide on how to run Llama 2, an open-source large language model developed by Meta, locally on your PC. It covers the requirements and detailed steps for Mac, Linux, and Windows users, making it easier for developers to leverage this powerful AI tool.
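
If you would rather drive a local Llama 2 from code once the weights are downloaded, the llama-cpp-python bindings are one lightweight option (not necessarily the route the tutorial takes); the GGUF path below is a placeholder for whichever quantized checkpoint you grab.

```python
# Minimal local-inference sketch using the llama-cpp-python bindings.
# The model path is a placeholder; download a quantized GGUF checkpoint first.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",
    n_ctx=2048,  # context window
)

output = llm(
    "Q: Explain retrieval-augmented generation in one sentence. A:",
    max_tokens=128,
    stop=["Q:"],  # stop before the model invents the next question
)
print(output["choices"][0]["text"].strip())
```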

Introducing Multi-Modal Retrieval-Augmented Generation (RAG) by LlamaIndex
read time: 10 minutes
LlamaIndex introduces a new paradigm, multi-modal Retrieval-Augmented Generation (RAG), extending the capabilities of Large Language Models (LLMs) to handle both text and images. This includes support for GPT-4V, a multi-modal model released by OpenAI. The blog post provides a detailed walkthrough of the new abstractions and their potential applications.
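
LlamaIndex's multi-modal abstractions are still moving quickly, so rather than pin a specific import path, here is a version-agnostic sketch of the generation half of multi-modal RAG using the GPT-4V chat API directly; `retrieve_image_urls` is a hypothetical stand-in for whatever multi-modal retriever you plug in.

```python
# Sketch of the generation step of multi-modal RAG: after retrieving relevant
# images (and optionally text), pass them to a vision-capable model.
# `retrieve_image_urls` is a hypothetical stand-in for your multi-modal retriever.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def answer_with_images(question: str, image_urls: list[str]) -> str:
    content = [{"type": "text", "text": question}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{"role": "user", "content": content}],
        max_tokens=300,
    )
    return response.choices[0].message.content

# image_urls = retrieve_image_urls("product packaging defects", top_k=2)
# print(answer_with_images("What defects are visible in these photos?", image_urls))
```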

Rust+Wasm: A Powerful Alternative for AI Inference
read time: 15 minutes
The Rust+Wasm stack offers a lightweight, fast, and portable alternative to Python for AI inference applications. A simple Rust program demonstrates running inference on Llama 2 models at native speed, and the compiled binary is portable across devices with heterogeneous hardware accelerators. Elon Musk has even touted the Rust+Wasm stack as the future of AGI.

Enhancing Text Summarization with Chain of Density and GPT-3.5 Fine-tuning
read time: 20 minutes
This article demonstrates how to implement the Chain of Density method for AI-based text summarization and fine-tune a GPT-3.5 model to produce GPT-4-quality summaries. The approach reduces latency by 20x, cuts costs by 50x, and maintains entity density.
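
If you have not seen Chain of Density before, the core is a single prompt that asks the model to rewrite its summary several times, folding in missing entities each pass without growing longer. A rough sketch against the OpenAI API (the prompt is my paraphrase of the published technique, not the article's exact wording; the model name and article text are placeholders):

```python
# Minimal Chain of Density sketch: ask the model to iteratively densify a summary.
# The prompt wording is a paraphrase of the published CoD technique, not the
# article's exact prompt; model name and article text are placeholders.
from openai import OpenAI

client = OpenAI()

COD_PROMPT = """Article: {article}

Write a summary of the article, then improve it 4 more times.
At each step:
1. Identify 1-3 informative entities from the article missing from the previous summary.
2. Rewrite the summary to include them WITHOUT increasing its length.
Return only the final, densest summary."""

def chain_of_density_summary(article: str, model: str = "gpt-3.5-turbo") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": COD_PROMPT.format(article=article)}],
        temperature=0,
    )
    return response.choices[0].message.content
```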

Mastering the Art of Production-Grade LLM Applications
read time: 15 minutes
This guide provides developers with advanced techniques for transitioning Large Language Model (LLM) applications from development to production. It covers prompt engineering, evaluations, the use of Retrieval-Augmented Generation (RAG) for context, and fine-tuning for specialization. The guide also discusses common pitfalls and how to avoid them, ensuring the creation of scalable, stable, and cost-effective LLM applications.
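
As a concrete taste of the evaluations piece, many teams start with nothing fancier than a fixed table of prompts and string-level checks that run on every change. A minimal, assumption-laden sketch (`generate` stands in for your application's prompt-to-response call, and the cases are placeholders):

```python
# Minimal eval-harness sketch: run fixed test cases against your LLM app on every
# change and fail loudly on regressions. `generate` is a placeholder for your
# application's prompt -> response call; the cases below are invented examples.
def generate(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM application")

EVAL_CASES = [
    # (prompt, substring the answer must contain)
    ("What is our refund window?", "30 days"),
    ("Which region has the most open tickets?", "EMEA"),
]

def run_evals() -> None:
    failures = []
    for prompt, expected in EVAL_CASES:
        answer = generate(prompt)
        if expected.lower() not in answer.lower():
            failures.append((prompt, expected, answer))
    for prompt, expected, answer in failures:
        print(f"FAIL: {prompt!r} expected {expected!r}, got {answer[:80]!r}")
    if failures:
        raise SystemExit(1)
    print(f"All {len(EVAL_CASES)} eval cases passed.")

if __name__ == "__main__":
    run_evals()
```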

🧰 TOOLS

Giskard Bot: A New Tool for Testing and Debugging ML Models on Hugging Face
read time: 15 minutes

Giskard, an open-source testing framework for ML models, has been integrated with Hugging Face. The Giskard bot can scan models for vulnerabilities, generate domain-specific tests, and automate test execution. It can also publish vulnerability reports, provide qualitative content, and offer debugging assistance. The bot supports a variety of AI model types, including NLP, LLM, and tabular models.
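
For a sense of what the bot runs under the hood, Giskard's Python scan API looks roughly like the sketch below. Treat the wrapper arguments as assumptions based on the 2.x docs rather than gospel; the toy DataFrame and prediction function are placeholders.

```python
# Hedged sketch of scanning a model with Giskard's Python API (which the Hugging
# Face bot builds on). Wrapper arguments are assumptions from Giskard 2.x and may
# differ in your version; the DataFrame and prediction function are placeholders.
import pandas as pd
import giskard

df = pd.DataFrame({"text": ["great product", "terrible support"], "label": ["pos", "neg"]})

def predict(batch: pd.DataFrame):
    # Placeholder: return per-class probabilities for each row.
    return [[0.9, 0.1]] * len(batch)

model = giskard.Model(
    model=predict,
    model_type="classification",
    classification_labels=["pos", "neg"],
    feature_names=["text"],
)
dataset = giskard.Dataset(df, target="label")

results = giskard.scan(model, dataset)  # probes for bias, robustness, and other issues
results.to_html("scan_report.html")     # shareable vulnerability report
```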

AI Security Vulnerabilities: A Practical Guide
read time: 8 minutes
Protect AI has released a repository highlighting the security vulnerabilities in AI/Machine Learning infrastructure. The repository, ai-exploits, includes exploits and scanning templates for disclosed vulnerabilities affecting machine learning tools. This initiative aims to raise awareness about the security issues in the AI/ML ecosystem and provide practical solutions for security professionals.

Introducing Qwen-Audio: Alibaba's Multimodal Large Audio Language Model
read time: 8 minutes
Alibaba Cloud has proposed a multimodal large model series, Qwen-Audio, that accepts diverse audio and text inputs. It supports various tasks, languages, and audio types, and has shown impressive performance across diverse benchmark tasks. The model series includes Qwen-Audio and Qwen-Audio-Chat, with the latter enabling multi-turn dialogues and supporting diverse audio-oriented scenarios.
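
Qwen-Audio-Chat ships its own chat-style interface via trust_remote_code, so the exact call names below are assumptions taken from the model card rather than standard transformers APIs; the audio path is a placeholder.

```python
# Hedged sketch of querying Qwen-Audio-Chat via Hugging Face transformers.
# The chat interface comes from the model's remote code; call names are
# assumptions from the model card, and the audio path is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-Audio-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-Audio-Chat", device_map="auto", trust_remote_code=True
).eval()

# Build a mixed audio + text query and run one turn of dialogue.
query = tokenizer.from_list_format([
    {"audio": "path/to/clip.wav"},
    {"text": "What is the speaker talking about?"},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```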

KnowPAT: Enhancing Domain-Specific QA with LLMs
read time: 8 minutes
A new pipeline, KnowPAT, has been introduced to improve domain-specific question answering (QA) using large language models (LLMs). KnowPAT incorporates domain knowledge graphs and aligns model preferences with human preferences to generate reliable and user-friendly answers. The pipeline outperforms 15 baseline methods in real-world QA scenarios, demonstrating its effectiveness in leveraging LLMs for practical applications.


📰 NEWS

OpenAI CEO Sam Altman Ousted Amidst Controversy
read time: 8 minutes

OpenAI, the company behind ChatGPT, has ousted its CEO Sam Altman, citing a lack of candor in his communications with the board of directors. The sudden departure brings uncertainty to the AI industry. Interim CEO Mira Murati takes over while the company searches for a permanent replacement. The transition is not expected to affect OpenAI's key business partnership with Microsoft. Read more about the situation here.

Argonne National Lab Begins Training of 1-Trillion Parameter Scientific AI
read time: 5 minutes
Argonne National Laboratory (ANL) has begun training a generative AI model, AuroraGPT, on its Aurora supercomputer. The model, also referred to as 'ScienceGPT', will serve as a chatbot interface for scientific researchers. The training, which could take months, is currently limited to 256 nodes but will eventually scale to all 10,000 nodes of the supercomputer. Read more about this development here.

Thanks for reading, and we will see you next time.

Follow me on Twitter and DM me links you would like included in a future newsletter.