AIhero


    The Code

    Make sure that you have the code for the project. It can be found here or by clicking the code button above.

    What You'll Learn

    Building Your Own Dataset
    Learn to structure and prepare data for AI retrieval. Understand the minimum requirements (id, subject, content) and how to work with any JSON source: emails, notes, documentation, or knowledge bases.
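    As a rough sketch of the minimum shape described above, the snippet below maps an arbitrary JSON record onto `id`, `subject`, and `content`. The names `SearchDocument` and `toSearchDocument`, and the source field names `noteId`, `title`, and `body`, are hypothetical and not taken from the workshop code.

```typescript
// Hypothetical record shape: only id, subject and content are required.
interface SearchDocument {
  id: string;
  subject: string;
  content: string;
}

// Map an arbitrary JSON record onto the three required fields.
// `fields` names the source keys to read; those keys are assumptions about your data.
function toSearchDocument(
  raw: Record<string, unknown>,
  fields: { id: string; subject: string; content: string },
): SearchDocument {
  const read = (key: string): string => {
    const value = raw[key];
    if (typeof value !== "string" && typeof value !== "number") {
      throw new Error(`Missing or non-text field: ${key}`);
    }
    return String(value);
  };
  return {
    id: read(fields.id),
    subject: read(fields.subject),
    content: read(fields.content),
  };
}

// Example: a notes export with its own field names.
const rawNotes = JSON.parse(
  '[{"noteId": 1, "title": "Standup", "body": "Discussed the search index."}]',
) as Record<string, unknown>[];

const documents: SearchDocument[] = rawNotes.map((note) =>
  toSearchDocument(note, { id: "noteId", subject: "title", content: "body" }),
);
```

    Failing loudly on a missing field is one possible design choice here; skipping incomplete records would be another.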

    Implementing BM25 Search
    Understand how keyword-based search works under the hood. Implement the BM25 algorithm to rank results by term frequency, term rarity across the collection, and document length.
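    To make the scoring concrete, here is a minimal BM25 sketch. The tokenizer, the `Doc` type, the function name `bm25Search`, and the defaults `k1 = 1.5` and `b = 0.75` are assumptions chosen for illustration; the workshop's own implementation may differ.

```typescript
type Doc = { id: string; text: string };

// Naive tokenizer (an assumption): lowercase, keep runs of letters and digits.
const tokenize = (text: string): string[] =>
  text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

function bm25Search(
  docs: Doc[],
  query: string,
  k1 = 1.5, // term-frequency saturation
  b = 0.75, // strength of document-length normalization
): { id: string; score: number }[] {
  const tokenized = docs.map((doc) => tokenize(doc.text));
  const avgLength =
    tokenized.reduce((sum, tokens) => sum + tokens.length, 0) /
    Math.max(docs.length, 1);

  // Document frequency: in how many documents each term appears.
  const documentFrequency = new Map<string, number>();
  tokenized.forEach((tokens) => {
    Array.from(new Set(tokens)).forEach((term) => {
      documentFrequency.set(term, (documentFrequency.get(term) ?? 0) + 1);
    });
  });

  const queryTerms = Array.from(new Set(tokenize(query)));

  return docs
    .map((doc, i) => {
      const tokens = tokenized[i];
      let score = 0;
      for (const term of queryTerms) {
        const tf = tokens.filter((t) => t === term).length;
        if (tf === 0) continue;
        const n = documentFrequency.get(term) ?? 0;
        // Rare terms get a higher inverse document frequency.
        const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
        const lengthNorm = 1 - b + b * (tokens.length / avgLength);
        score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
      }
      return { id: doc.id, score };
    })
    .sort((x, y) => y.score - x.score);
}
```

    A call such as `bm25Search(docs, "invoice payment")` returns every document with a score, highest first; documents that share no term with the query score 0.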

    Working with Embeddings
    Generate and cache vector embeddings efficiently. Learn to perform semantic search using cosine similarity, finding results by meaning, not just matching words.
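    The two ideas above can be sketched in a few lines: cosine similarity between two vectors, and a cache wrapped around whatever function produces embeddings. `EmbedFn` and `withCache` are hypothetical names, and the in-memory `Map` is an assumption; the workshop may cache to disk or use a different strategy.

```typescript
// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Hypothetical signature for any embedding provider.
type EmbedFn = (text: string) => Promise<number[]>;

// Wrap an embedding function so each distinct text is embedded only once.
// Storing the promise (not the result) also deduplicates concurrent calls.
function withCache(embed: EmbedFn): EmbedFn {
  const cache = new Map<string, Promise<number[]>>();
  return (text) => {
    let pending = cache.get(text);
    if (!pending) {
      pending = embed(text);
      cache.set(text, pending);
    }
    return pending;
  };
}
```

    To search, embed the query once, compare it against each cached document vector with `cosineSimilarity`, and sort by descending score.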

    Combining Rankings with Reciprocal Rank Fusion
    Master RRF, a rank fusion technique that merges the ranked results of multiple search algorithms into a single list. Understand why hybrid search can outperform either method alone.
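    A minimal sketch of Reciprocal Rank Fusion follows: each ranked list contributes `1 / (k + rank)` to a document's score, with ranks starting at 1. The function name `reciprocalRankFusion` is hypothetical, and `k = 60` is a commonly used default assumed here rather than a value taken from the workshop.

```typescript
// Fuse several ranked lists of document ids into one ranking.
// Each list is ordered best-first; a document missing from a list adds nothing.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (x, y) => y.score - x.score,
  );
}

// Example: fuse a keyword ranking with a semantic ranking.
const keywordRanking = ["a", "b", "c"];
const semanticRanking = ["b", "c", "a"];
const fused = reciprocalRankFusion([keywordRanking, semanticRanking]);
```

    Because only ranks are used, the raw BM25 and cosine scores never need to be put on the same scale, which is the main appeal of this approach.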

    Creating Agent Tools
    Define tool schemas with Zod and integrate them into your AI agent. Learn to structure tool descriptions and parameters so the LLM knows when and how to use them.
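    As an illustration of what such a schema might look like, the fragment below describes the parameters of a search tool with Zod. The tool name `searchDocuments`, its parameters, and the plain-object wrapper are all hypothetical; how the schema is registered depends on the agent framework, which is not shown here.

```typescript
import { z } from "zod";

// Parameter schema: the .describe() text is what the LLM reads
// when deciding how to fill in each argument.
const searchParameters = z.object({
  query: z
    .string()
    .min(1)
    .describe("Keywords or a natural-language question to search for"),
  limit: z
    .number()
    .int()
    .min(1)
    .max(20)
    .default(5)
    .describe("Maximum number of results to return"),
});

type SearchParameters = z.infer<typeof searchParameters>;

// Hypothetical tool definition: a name, a description that tells the model
// when to call the tool, and the schema for its arguments.
const searchTool = {
  name: "searchDocuments",
  description:
    "Search the user's documents. Use this before answering any question about their content.",
  parameters: searchParameters,
};
```

    Keeping the description focused on when to call the tool, and the per-field descriptions on what to pass, gives the model both pieces of guidance separately.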

    Prompt Engineering for Tool Use
    Write system prompts that guide the agent to use tools effectively. Implement anti-hallucination guardrails that enforce searching before answering.
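    One way such a guardrail might be phrased is sketched below. The wording, the helper `buildSystemPrompt`, and the tool name `searchDocuments` are assumptions for illustration, not the workshop's actual prompt.

```typescript
// Build a system prompt that requires a search before any factual answer.
// `toolName` is whatever name the search tool was registered under.
function buildSystemPrompt(toolName: string): string {
  return [
    "You are an assistant that answers questions about the user's documents.",
    `Before answering any factual question, call the ${toolName} tool.`,
    "Base your answer only on the results the tool returns.",
    "If the results do not contain the answer, say so instead of guessing.",
    "Mention the subject of each document you relied on.",
  ].join("\n");
}

const systemPrompt = buildSystemPrompt("searchDocuments");
```

    Stating the fallback behaviour explicitly ("say so instead of guessing") gives the model an acceptable alternative to inventing an answer.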

    By the End

    You'll understand the complete pipeline from raw data to intelligent retrieval, and how to give your AI agent the ability to search before it speaks.

    Retrieval Project

    Matt Pocock