Technical Blog

Code Walkthrough: AI Agent Memory Patterns

Azure Cosmos DB Conf 2026

A Python implementation walkthrough of four AI agent memory strategies - Direct LLM, Sliding Window, Hierarchical 3-Tier, and Entity Graph - built on Azure Cosmos DB for NoSQL and OpenAI. Covers data models, message storage, identity anchoring, tier-...

AI Agent Memory Patterns

Azure Cosmos DB Conf 2026

A hands-on benchmark of four multi-turn AI agent memory strategies - Direct LLM, Sliding Window, Hierarchical 3-Tier, and Entity Graph - measured against a 60-message Cazton seed dataset using Azure Cosmos DB for NoSQL and OpenAI. Covers recall rates...

Hey Cazton

Streamline Your Workflow With Voice to Text

A code walkthrough of Hey Cazton, a macOS menu bar app that uses on-device hotword detection, SwiftUI/AppKit architecture, chunked audio recording, and Azure Whisper transcription.

AI Agent Best Practices

AI agents are transforming digital experiences, offering intelligent, contextual, and scalable solutions across industries. This guide explores proven best practices for designing and deploying high-performing AI agents, from input/output schemas to ...

FreshDiskANN: Revolutionizing Real-Time Similarity Search

In today's data-driven world, similarity search has become a cornerstone technology powering recommendation systems, image recognition, natural language processing, and countless other applications. The ability to quickly find similar items in massive...

HNSW vs DiskANN

Searching large datasets effectively is a challenge, especially when each data item is represented as a vector. Traditional exact search methods can be too slow or require excessive memory, making them impractical for large-scale applications. Fortun...

OpenAI Agents API

Discover how OpenAI's latest Agents API is transforming AI development with autonomous agents, multi-turn conversations, and built-in tools like Web Search, File Search, and Computer Use. Learn how Cazton helps businesses integrate OpenAI, Azure Open...

Voice AI

The Future is Speaking Today

Imagine a world where interacting with technology feels as natural as having a conversation. Voice AI is transforming how we shop, work, and live - eliminating clicks and complicated interfaces in favor of simple, intuitive voice commands. From effor...

AI Voice Assistant

Voice RAG with Azure Cosmos DB and Azure OpenAI

In the ever-evolving landscape of artificial intelligence, the need for seamless, intelligent, and scalable communication between humans and AI assistants has become paramount. At our organization, we are thrilled to unveil the latest iteration of ou...

Voice RAG

Azure OpenAI and Azure Cosmos DB

At Cazton, we specialize in creating AI systems that have high accuracy and performance. In this blog post, we showcase a cutting-edge system based on OpenAI Realtime Voice API that uses Azure OpenAI, Azure Cosmos DB for MongoDB, and advanced vector ...

vCore-based Azure Cosmos DB for MongoDB vs MongoDB Atlas

This benchmark study provides a comprehensive comparison of Azure Cosmos DB for MongoDB (vCore) and MongoDB Atlas, with a focus on their performance in vector queries and how they can be leveraged for GenAI applications.

Fine-tuning vs RAG vs RAFT

Optimizing LLMs for Domain-Specific Question-Answering with Cazton's Expertise

Fine-tuning, retrieval-augmented generation (RAG), and retrieval-augmented fine-tuning (RAFT) are three approaches used to enhance the performance of large language models (LLMs) in domain-specific question-answering tasks. Now we will compare and an...