🆚 RAG vs. Large Context Windows: A Comparison

PLUS - JetBrains Unveils Vendor-Neutral AI Coding Assistant

DevThink.AI

Essential AI Content for Software Devs, Minus the Hype

In this edition: It’s been interesting to watch standard architectures for GenAI applications evolve. We are covering a lot of the usual suspects, RAG and agent workflows, but with examples that show how far these mechanisms have progressed. I’m also looking forward to trying out JetBrains’ coding copilot.

  • 📖 TUTORIALS & CASE STUDIES

  • 🧰 TOOLS

  • 📰 NEWS

📖 TUTORIALS & CASE STUDIES

Workshop on Fine-Tuning LLM Agents for Task Automation
read time: 1hr presentation
Join this online workshop from DeepLearning.AI to learn how to enhance the performance of Large Language Model (LLM) agents in application automation. Topics include fine-tuning techniques, metrics and logging, debugging with Weights & Biases, prompting paradigms, and practical evaluation with W&B.

Improving Contextual Recall with Claude 2.1
read time: 8 minutes
Anthropic's AI model, Claude 2.1, offers a 200K token context window, excelling at real-world retrieval tasks. However, it can be hesitant to answer questions based on out-of-place sentences. A minor prompting edit can overcome this reluctance, improving performance on these tasks.
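For a sense of what such a prompting edit can look like, here is a minimal sketch. It assumes the technique from Anthropic's post as I recall it: prefilling the start of the assistant's reply so the model begins by locating the relevant sentence. The function name and the sample strings are hypothetical, and the prefill wording should be checked against the original post before you rely on it.

```python
def build_long_context_prompt(document: str, question: str) -> str:
    """Build a Human/Assistant style prompt that ends with a prefilled assistant turn."""
    # Assumed wording of the prefill; verify against the original post.
    prefill = "Here is the most relevant sentence in the context:"
    return (
        f"\n\nHuman: <context>\n{document}\n</context>\n\n"
        f"{question}"
        f"\n\nAssistant: {prefill}"
    )


if __name__ == "__main__":
    prompt = build_long_context_prompt(
        document="...long document text goes here...",
        question="What does the document say about pricing?",
    )
    # The model continues from the prefilled sentence instead of starting cold.
    print(prompt)
```

The only change from a plain question-answering prompt is the text after "Assistant:", which is why the edit counts as minor.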

RAG vs. Context-Window in GPT-4: A Comparative Analysis
read time: 15 minutes
This article presents a detailed comparison between Retrieval Augmented Generation (RAG) and context-window stuffing in GPT-4. The study reveals that RAG, when combined with GPT-4, delivers superior performance at just 4% of the cost, making it a more efficient choice for specializing Large Language Models' responses.
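To see why retrieval can be so much cheaper, here is a back-of-the-envelope sketch. It is not the article's methodology: the per-token price, the chunk sizes, and the helper names are all made-up assumptions, and the retrieval step is a naive word-overlap ranking standing in for a real vector search.

```python
def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (not a real tokenizer)."""
    return len(text.split())


def top_k_chunks(chunks: list[str], question: str, k: int) -> list[str]:
    """Rank chunks by how many question words they share (stand-in for vector search)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def prompt_cost(texts: list[str], price_per_1k_tokens: float) -> float:
    """Input cost of sending the given texts at a hypothetical price."""
    total_tokens = sum(count_tokens(t) for t in texts)
    return total_tokens / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    # 100 synthetic chunks of 50 words each (placeholder data).
    chunks = [f"section {i} " + "filler " * 48 for i in range(100)]
    question = "what does section 7 say"
    price = 0.01  # hypothetical dollars per 1K input tokens

    stuffed = prompt_cost(chunks, price)                         # send everything
    retrieved = prompt_cost(top_k_chunks(chunks, question, 5), price)  # send top 5 only
    print(f"context stuffing: ${stuffed:.4f}  RAG: ${retrieved:.4f}")
    print(f"RAG cost as a share of stuffing: {retrieved / stuffed:.0%}")
```

In this toy setup the saving is simply the number of retrieved chunks divided by the total, so real savings depend on how large the corpus is and how many chunks you retrieve.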

Evaluating Retrieval-Augmented Generation Applications with RAGAs
read time: 15 minutes
This article introduces RAGAs, a framework for evaluating Retrieval-Augmented Generation (RAG) applications. It provides metrics for assessing the performance of RAG pipelines and leverages Large Language Models for reference-free evaluation, making it a cost-effective solution.

🧰 TOOLS

Mistral AI Launches Beta Access to Generative AI Platform
read time: 5 minutes
Mistral AI has launched beta access to its platform, offering developers powerful generative models and efficient deployment methods. The platform includes three chat endpoints for text generation and an embedding endpoint. The models are pre-trained on open web data and fine-tuned for instructions. Learn more about the platform and its capabilities in the full article.

KwaiAgents: Open-Sourced Agent-Related Works by Kuaishou Technology
read time: 15 minutes
Kuaishou Technology has open-sourced a series of agent-related works, KwaiAgents, including KAgentSys-Lite, KAgentLMs, KAgentInstruct, and KAgentBench. These tools offer capabilities such as planning, reflection, and tool-use, and provide a benchmark for testing agent capabilities. The guide also includes instructions for deploying and using these tools.

Microsoft's Phi-2: The Surprising Power of Small Language Models
read time: 10 minutes
Microsoft Research's Machine Learning Foundations team has released Phi-2, a 2.7-billion-parameter language model that outperforms models up to 25x larger on complex benchmarks. The model's performance is attributed to strategic training data selection and innovative scaling techniques.

Anyscale Endpoints Introduces JSON Mode and Function Calling
read time: 15 minutes
Anyscale Endpoints has introduced JSON mode and function calling capabilities, enhancing the usability of open models like Mistral-7B. JSON mode ensures valid JSON outputs tailored to specific schema requirements. Function calling allows models to use APIs effectively. These features are currently in preview for the Mistral-7B model. Read more about these exciting updates here.
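JSON mode matters because, without it, the calling code has to defend against malformed or incomplete output. The sketch below shows that defensive check in plain Python. It does not call Anyscale Endpoints, and the "schema" here (a dict mapping required field names to Python types) and the function name are hypothetical simplifications, not real JSON Schema.

```python
import json


def parse_model_output(raw: str, required: dict[str, type]) -> dict:
    """Parse a model reply and check it has the required fields with the right types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for field, expected_type in required.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data


if __name__ == "__main__":
    schema = {"city": str, "temperature_c": int}  # hypothetical fields
    good = '{"city": "Paris", "temperature_c": 21}'
    chatty = 'Sure! Here is the JSON: {"city": "Paris"}'  # typical failure without JSON mode

    print(parse_model_output(good, schema))
    try:
        parse_model_output(chatty, schema)
    except ValueError as err:
        print("rejected:", err)
```

With schema-constrained output from the endpoint, the rejected case should not occur, though keeping a check like this as a safety net is cheap.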

Ollama: A Local Solution for Large Language Models
read time: 8 minutes
Ollama is a tool that allows developers to run large language models locally. It supports a variety of open-source models and provides a simple API for creating, running, and managing models. It also offers customization options and a REST API for running and managing models. Learn more about it here.
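As a quick taste of the REST API, here is a minimal sketch that uses only the Python standard library. It assumes an Ollama server is running locally on its default port (11434) and that the model named below has already been pulled; the model name and prompt are just examples, and the helper function names are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local address


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model available locally.
    print(generate("mistral", "Why is the sky blue? Answer in one sentence."))
```

Setting "stream" to false returns a single JSON object; leaving it out makes the server stream the reply as a series of JSON lines instead.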

 

📰 NEWS

Leveraging AI for Secure Code Development
read time: 8 minutes
This article discusses how AI and automation can enhance DevSecOps by providing personalized training, identifying vulnerabilities, managing dependencies, and integrating security tooling into the SDLC. It also provides tips on evaluating AI tools for adoption at work, emphasizing understanding data use, inspecting IP clauses, tracking tool performance, and auditing the tool's audits.

JetBrains Unveils Vendor-Neutral AI Coding Assistant
read time: 8 minutes
JetBrains has introduced a new AI coding assistant that leverages multiple large language models (LLMs) to provide coding suggestions, refactoring, and documentation support. The assistant is vendor-neutral, using both OpenAI and Google's LLMs, along with JetBrains' own models. The AI service architecture allows for easy integration of new models. Currently, the offering is only available to paying customers. Read more about it here.

Google's Duet AI: Revolutionizing the Software Development Lifecycle
read time: 15 minutes
Google has launched Duet AI for Developers, a generative AI tool designed to assist developers throughout the entire software development lifecycle. The tool integrates with IDEs and Google's Cloud Console, offering coding assistance, chat support, and ops tooling. Duet AI aims to increase developer productivity by reducing cognitive load and streamlining workflows.

 

Thanks for reading, and we will see you next time.

Follow me on Twitter, and DM me links you would like included in a future newsletter.