Category: Machine Learning

  • Advanced Interpretability Techniques for Tracing LLM Activations

    Activation Logging and Internal State Monitoring: One foundational approach is activation logging, which involves recording the internal activations (neuron outputs, attention patterns, etc.) of a model during its forward pass. By inspecting these activations, researchers can identify which parts of the network are highly active or contributing to a given output. Many open-source transformer models…
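
    As a minimal sketch of what activation logging can look like in practice, the snippet below records per-block hidden states with PyTorch forward hooks on a Hugging Face GPT-2 model. The model choice, hook placement, and naming are illustrative assumptions, not details taken from the article.

    ```python
    # Minimal activation-logging sketch using PyTorch forward hooks.
    # GPT-2 via the transformers library is an illustrative choice.
    import torch
    from transformers import GPT2Model, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2")
    model.eval()

    activations = {}  # layer name -> recorded tensor

    def make_hook(name):
        def hook(module, inputs, output):
            # For a GPT-2 block, output[0] is the hidden-state tensor
            activations[name] = output[0].detach()
        return hook

    # Register a hook on every transformer block
    for i, block in enumerate(model.h):
        block.register_forward_hook(make_hook(f"block_{i}"))

    inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
    with torch.no_grad():
        model(**inputs)

    # Inspect which layers carry the largest activation norms
    for name, act in activations.items():
        print(name, tuple(act.shape), act.norm().item())
    ```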

  • Temperature Parameter for Controlling AI Randomness

    The Temperature parameter is a crucial setting used in generative AI models, such as large language models (LLMs), to influence the randomness and perceived creativity of the generated output. It directly affects the probability distribution of potential next words. Sections covered: Understanding the Basics; What the Temperature Value Does; In Practical Terms. Using the sentence “The cat sat on…
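
    As a self-contained illustration of how temperature reshapes the next-word distribution (the toy words and logits below are invented, not from the article), consider this NumPy sketch:

    ```python
    import numpy as np

    def sample_with_temperature(logits, temperature, rng=np.random.default_rng(0)):
        # Divide logits by the temperature before the softmax:
        # T < 1 sharpens the distribution, T > 1 flattens it.
        scaled = np.asarray(logits, dtype=float) / temperature
        probs = np.exp(scaled - scaled.max())  # subtract max for stability
        probs /= probs.sum()
        return probs, rng.choice(len(probs), p=probs)

    # Toy next-word candidates for "The cat sat on the ..."
    words = ["mat", "sofa", "roof", "keyboard"]
    logits = [4.0, 2.5, 1.0, 0.2]

    for t in (0.2, 1.0, 2.0):
        probs, idx = sample_with_temperature(logits, t)
        print(f"T={t}: probs={np.round(probs, 3)} -> {words[idx]}")
    ```

    At low temperature almost all of the probability mass collapses onto “mat”; at high temperature the tail words become plausible picks.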

  • Probability Threshold for Top-p (Nucleus) Sampling

    The “Probability Threshold for Top-p (Nucleus) Sampling” is a parameter used in generative AI models, like large language models (LLMs), to control the randomness and creativity of the output text. Here’s a breakdown of what it does. Sections covered: Understanding the Basics; What the Threshold Value Does; In Practical Terms. Imagine you’re asking the model to complete…
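
    The sketch below shows one common way nucleus sampling is implemented (the vocabulary and probabilities are invented for illustration): sort tokens by probability, keep the smallest prefix whose cumulative mass reaches p, renormalize, and sample only from that nucleus.

    ```python
    import numpy as np

    def top_p_sample(probs, p, rng=np.random.default_rng(0)):
        # Sort tokens by probability, keep the smallest set whose
        # cumulative probability reaches p (the "nucleus"),
        # renormalize, and sample only from that set.
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        cutoff = np.searchsorted(cumulative, p) + 1  # keep at least one token
        nucleus = order[:cutoff]
        nucleus_probs = probs[nucleus] / probs[nucleus].sum()
        return rng.choice(nucleus, p=nucleus_probs)

    words = np.array(["mat", "sofa", "roof", "keyboard", "moon"])
    probs = np.array([0.55, 0.25, 0.12, 0.05, 0.03])

    for p in (0.5, 0.9):
        picks = [words[top_p_sample(probs, p, np.random.default_rng(s))] for s in range(5)]
        print(f"p={p}: {picks}")
    ```

    With p=0.5 only “mat” survives the cutoff; with p=0.9 the top three words form the nucleus and share the samples.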

  • AlexNet: The Deep Learning Breakthrough That Reshaped Google’s AI Strategy

    When Google, in collaboration with the Computer History Museum, open-sourced the original AlexNet source code, it marked a significant moment in the history of artificial intelligence. AlexNet was more than just an academic breakthrough; it was the tipping point that launched deep learning into mainstream AI research and reshaped the future of companies like Google.…

  • Teaching AI Models to Be Better Search Engines: A New Approach to Training Data

    A recent patent application* reveals an innovative method for training AI models to become more effective at understanding and answering human queries. The approach tackles a fundamental challenge in modern search technology: how to teach AI systems to truly understand what people are looking for, rather than just matching keywords. The Core Innovation: The traditional…

  • Self-Supervised Quantized Representation for KG-LLM Integration

    Paper: https://arxiv.org/pdf/2501.18119
    This paper proposes a method called Self-Supervised Quantized Representation (SSQR) for seamlessly integrating Knowledge Graphs (KGs) with Large Language Models (LLMs). The key idea is to compress the structural and semantic information of entities in KGs into discrete codes (like tokens in natural language) that can be directly input into LLMs. Here’s a…
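
    The excerpt does not reproduce the paper’s training objective, so the toy residual vector quantizer below is only a rough sketch of the general idea (turning continuous entity embeddings into short sequences of discrete codes an LLM could consume as tokens); all sizes, names, and the quantization scheme are illustrative assumptions, not SSQR itself.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy setup: continuous entity embeddings (e.g., from a KG encoder)
    # and a codebook. Sizes are illustrative, not from the paper.
    num_entities, dim = 100, 64
    codebook_size, codes_per_entity = 256, 4

    entity_emb = rng.normal(size=(num_entities, dim))
    codebook = rng.normal(size=(codebook_size, dim))

    def quantize(vecs, codebook, k):
        # Greedy residual quantization: at each step, pick the nearest
        # codebook vector and subtract it, yielding k discrete codes
        # per entity.
        residual = vecs.copy()
        codes = np.zeros((len(vecs), k), dtype=np.int64)
        for step in range(k):
            dists = ((residual[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            codes[:, step] = dists.argmin(axis=1)
            residual -= codebook[codes[:, step]]
        return codes

    codes = quantize(entity_emb, codebook, codes_per_entity)
    print(codes[0])  # e.g. [ 17 204   3  88 ] -> "<kg_17> <kg_204> ..." tokens
    ```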

  • Introducing VecZip: Embedding Compression Algorithm

    Embeddings are vital for representing complex data in machine learning, enabling models to perform tasks such as natural language understanding and image recognition. However, these embeddings can be massive, creating challenges for storage, processing, and transmission. At DEJAN AI, we’ve developed VecZip, a novel approach to address this issue and reduce the file size…
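
    The excerpt doesn’t describe how VecZip itself works, so the snippet below is not its algorithm; it is only a generic illustration of the size/accuracy trade-off that embedding compression targets, using simple per-vector int8 quantization.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    emb = rng.normal(size=(10000, 768)).astype(np.float32)  # e.g. text embeddings

    # Per-vector symmetric int8 quantization: store one float32 scale
    # per vector plus int8 values, roughly a 4x size reduction.
    scales = np.abs(emb).max(axis=1, keepdims=True) / 127.0
    quantized = np.round(emb / scales).astype(np.int8)

    restored = quantized.astype(np.float32) * scales

    original_bytes = emb.nbytes
    compressed_bytes = quantized.nbytes + scales.astype(np.float32).nbytes
    print(f"original:   {original_bytes / 1e6:.1f} MB")
    print(f"compressed: {compressed_bytes / 1e6:.1f} MB")
    print(f"mean abs reconstruction error: {np.abs(emb - restored).mean():.4f}")
    ```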

  • Chrome AI Models

    Chrome’s AI-driven segmentation platform enhances user experiences by predicting behaviours and tailoring features accordingly. Explore the different models that power these optimizations and how they shape web interactions.

  • Attention Is All You Need

    Summary by: https://illuminate.google.com
    Paper: https://arxiv.org/abs/1706.03762
    Host: Welcome to this discussion on the groundbreaking paper, “Attention Is All You Need.” This paper introduces the Transformer, a novel neural network architecture based solely on the attention mechanism, eliminating the need for recurrence and convolutions. Let’s start with the core motivation behind this work. What were the limitations of…
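
    The core operation behind the architecture is the paper’s scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k)V. Here is a minimal single-head NumPy sketch (random toy matrices, no masking or multi-head logic):

    ```python
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V,
        # as defined in "Attention Is All You Need".
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    seq_len, d_k = 5, 8
    Q = rng.normal(size=(seq_len, d_k))
    K = rng.normal(size=(seq_len, d_k))
    V = rng.normal(size=(seq_len, d_k))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (5, 8)
    ```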

  • The State of AI

    Access the report here: stateof.ai
    Transcript: All right, let’s dive in. We’re tackling the State of AI Report 2024 this time around. Seventh year they’ve put this out. Nathan Benaich and Air Street Capital, they really have their fingers on the pulse of AI. Talk about a must-read if you want to understand what’s really happening…