Category: Machine Learning
-

Probability Threshold for Top-p (Nucleus) Sampling
The “Probability Threshold for Top-p (Nucleus) Sampling” is a parameter used in generative AI models, like large language models (LLMs), to control the randomness and creativity of the output text. Here’s a breakdown of what it does: Understanding the Basics What the Threshold Value Does In Practical Terms Imagine you’re asking the model to complete…
-

AlexNet: The Deep Learning Breakthrough That Reshaped Google’s AI Strategy
When Google, in collaboration with the Computer History Museum, open-sourced the original AlexNet source code, it marked a significant moment in the history of artificial intelligence. AlexNet was more than just an academic breakthrough; it was the tipping point that launched deep learning into mainstream AI research and reshaped the future of companies like Google.…
-

Teaching AI Models to Be Better Search Engines: A New Approach to Training Data
A recent patent application* reveals an innovative method for training AI models to become more effective at understanding and answering human queries. The approach tackles a fundamental challenge in modern search technology: how to teach AI systems to truly understand what people are looking for, rather than just matching keywords. The Core Innovation The traditional…
-

Self-Supervised Quantized Representation for KG-LLM Integration
Paper: https://arxiv.org/pdf/2501.18119 This paper proposes a method called Self-Supervised Quantized Representation (SSQR) for seamlessly integrating Knowledge Graphs (KGs) with Large Language Models (LLMs). The key idea is to compress the structural and semantic information of entities in KGs into discrete codes (like tokens in natural language) that can be directly input into LLMs. Here’s a…
-

Introducing VecZip: Embedding Compression Algorithm
Embeddings are vital for representing complex data in machine learning, enabling models to perform tasks such as natural language understanding and image recognition. However, these embeddings can be massive in size, creating challenges for storage, processing, and transmission. At DEJAN AI, we’ve developed VecZip, a novel approach to address this issue, and reduce the file size…
-

Chrome AI Frameworks & Models
Chrome’s AI-driven segmentation platform enhances user experiences by predicting behaviours and tailoring features accordingly. Explore the different models that power these optimizations and how they shape web interactions.
-

Attention Is All You Need
Summary by: https://illuminate.google.com Paper: https://arxiv.org/abs/1706.03762 Host: Welcome to this discussion on the groundbreaking paper, “Attention Is All You Need.” This paper introduces the Transformer, a novel neural network architecture based solely on the attention mechanism, eliminating the need for recurrence and convolutions. Let’s start with the core motivation behind this work. What were the limitations of…
-

The State of AI
Access the report here: stateof.ai Transcript All right, let’s dive in. We’re tackling the State of AI Report 2024 this time around. Seventh year they put this out. Nathan Benaich and Air Street Capital, they really have their fingers on the pulse of AI. Talk about a must-read if you want to understand what’s really happening…
-

ILO
The ILO App: A Step-by-Step Tool for Managing SEO Data and Improving Link Structures Managing SEO efficiently can be a complicated process, especially for websites with a large number of pages. The ILO app aims to simplify this by offering a structured, step-by-step approach. It brings together tools for handling key aspects of SEO, like…
-

Resource-Efficient Binary Vector Embeddings With Matryoshka Representation Learning
When conducting an advanced SEO analysis, I frequently utilise vector embeddings for text feature extraction, similarity searches, clustering, retrieval, ranking and so on. One of the main burdens on top of compute is storage space, as these files tend to run into terabytes for very large websites. Today I did a deep analysis and realised I’ve…
