The Gen AI Journey: From Simplicity to Customization

3 min readFeb 3, 2025

Understanding the Landscape of LLMs

1. SaaS-Based LLMs

Examples: GPT-4, Gemini, Claude
Description: Pre-trained, API-based models provided by tech firms, offering general-purpose capabilities without further tuning.
Use Case: Ideal for quick deployment where domain-specific customization is not required.

2. Open-Source (OSS) LLMs

Examples: MPT, Mixtral
Description: Community-driven models that allow customization and fine-tuning for proprietary domains.
Use Case: Best for organizations prioritizing cost control, privacy, and flexibility.

3. Fine-Tuned Models

Examples: OSS models customized with domain-specific data
Description: Pre-trained open-source models further trained to enhance relevance and accuracy in specialized fields.
Use Case: Effective for precision tasks requiring high domain expertise.

4. Pre-Trained from Scratch

Description: Custom-built models trained entirely on proprietary data for maximum control and specificity.
Use Case: Best for unique business needs where existing models do not suffice.

Key Architectural Patterns in Gen AI

Pattern 1: Prompt Engineering

Definition: Optimizing inputs to improve responses from a pre-trained model without modifying the model itself.
Architecture Overview: Users interact via a web app, where structured prompts are sent to the underlying model (SaaS, OSS, or fine-tuned).

When Does Prompt Engineering Work?

Best for rapid deployment and general applications.
Suitable when domain customization is needed without costly retraining.

Limitations of Prompt Engineering

Struggles in dynamic environments where data continuously evolves.
Memory limitations: Once fine-tuned, the model does not automatically adapt to new information.

Use Cases:

Define application boundaries to ensure AI-generated responses remain within relevant topics.

Pattern 2: Retrieval-Augmented Generation (RAG)

Definition: Enhances AI responses by dynamically retrieving relevant data before generation.

RAG Architecture:

User submits a query.
A vector database (e.g., Pinecone, ChromaDB, Databricks) retrieves relevant context.
Retrieved context is appended to the query and passed to the LLM.
The AI generates a context-aware response.

When Does RAG Work Well?

When domain knowledge needs frequent updates without retraining.
When specialized, context-aware responses are required.

Limitations of RAG:

Requires fine-tuning of retrieval mechanisms, embedding models, and pipeline design.
Longer development cycles due to increased complexity.

Use Case:

To retrieve historical records for context-aware AI responses using vector databases

Pattern 3: Agents (Compound AI Systems)

Definition: AI agents that dynamically break down complex queries, reason through multiple steps, and orchestrate workflows using specialized tools.

How Agents Work:

User query is received.
The agent determines necessary subtasks.
The agent selects and invokes relevant tools.
The final response is generated based on the outputs of each tool.

Use Cases for Agents:

Complex scenarios like A user asks an AI agent tobook a ticket to Paris, book a hotel and prepare for all the activities during the day.

Role of LLM in Compound Systems:

Decision-making: Determines required tools and processes.
Workflow orchestration: Dynamically sequences tasks for complex problem-solving.
Tool optimization: Minimizes unnecessary computational overhead.

Why Agents Matter?

Enables multi-step AI-to-AI interaction.
Provides better task-specific expertise.
Enhances user experience with precise, contextual answers.

Use Case:

To deploy an AI assistant for users to generate documents and manage scheduling workflows.

Databricks Genie: A Simple Agentic System

Definition: A specialized agent designed for answering structured data queries using predefined functions (trusted assets).

Key Features:

Provides verified, structured responses.
Uses predefined tools and example queries to ensure accuracy.

Common Use Cases for Gen AI Today

Chatbots for Q&A
Personalized Recommendations
Document Generation
Automated Summarization
Decision Support Systems

This roadmap outlines the journey from using generic SaaS models to fully customized AI solutions. Organizations must carefully evaluate their needs, costs, and technical feasibility to determine the best Gen AI approach for their use case.