Zaytri — AI Automation Platform

Zaytri is an AI automation platform built to orchestrate multi-agent workflows, run brand-aware RAG pipelines with pgvector, and route requests across multiple LLM providers. The system uses Celery and Redis for async task queues, supports local inference for cost optimization, and exposes a premium Next.js dashboard with full observability.

2025 - 2026

📊 Key metrics & outcomes

  • 3+ LLM providers (Ollama, OpenAI, Gemini) with unified routing
  • ~40% cost reduction via local inference
  • Sub-2s RAG retrieval latency (pgvector)
  • Dashboard load <500ms; p95 LLM latency <3s
  • Celery + Redis async workflows with full observability

Project Overview

📋 Situation

The product needed a production-grade AI automation platform that could run multi-agent workflows, support brand-aware retrieval (RAG), optimize costs via local inference, and provide a single dashboard for observability across multiple LLM providers.

🎯 Task

Design and build a multi-agent AI orchestration system with RAG using pgvector, Celery async workflows with Redis, multi-LLM routing (Ollama, OpenAI, Gemini), and a premium Next.js dashboard with observability.
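The pgvector retrieval step can be sketched as a parameterized top-k similarity query. This is a minimal sketch, assuming a `documents` table with an `embedding vector(n)` column; the table and column names are illustrative, not Zaytri's actual schema.

```python
def topk_retrieval_sql(query_embedding: list[float], k: int = 5,
                       table: str = "documents") -> tuple[str, tuple]:
    """Build a parameterized top-k similarity query for pgvector.

    `<=>` is pgvector's cosine-distance operator; ordering by the
    operator expression (rather than the alias) lets an HNSW or
    IVFFlat index on the embedding column serve the query.
    """
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"
    sql = (
        f"SELECT id, content, embedding <=> %s::vector AS distance "
        f"FROM {table} "
        f"ORDER BY embedding <=> %s::vector "
        f"LIMIT %s"
    )
    return sql, (vec, vec, k)
```

In practice this would be executed through a driver such as psycopg; an approximate-nearest-neighbor index on the embedding column is what keeps retrieval in the sub-2s range.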

Action

  • Built a multi-agent AI orchestration system for complex workflows
  • Implemented brand-aware RAG retrieval with pgvector
  • Cut inference costs by running local models via Ollama
  • Designed Celery async workflows backed by a Redis queue for reliability
  • Implemented unified multi-LLM routing across Ollama, OpenAI, and Gemini
  • Delivered a premium Next.js dashboard with observability and monitoring
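The cost-optimized routing above can be approximated with a small sketch: providers are tried cheapest-first, and failures fall through to the next provider. The `Provider` type, names, and prices are illustrative assumptions, not Zaytri's actual interface.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str                   # e.g. "ollama", "openai", "gemini"
    cost_per_1k: float          # illustrative USD per 1k tokens
    call: Callable[[str], str]  # prompt -> completion

def route(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers cheapest-first; fall back on any failure."""
    for p in sorted(providers, key=lambda p: p.cost_per_1k):
        try:
            return p.name, p.call(prompt)
        except Exception:
            continue  # provider errored or is down: try next-cheapest
    raise RuntimeError("all providers failed")
```

Under this policy a local Ollama model with near-zero marginal cost is always tried first, which is where a cost reduction like the ~40% figure comes from, while hosted providers act as fallbacks.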

🏆 Result

Shipped the platform with 3+ LLM providers (Ollama, OpenAI, Gemini), ~40% cost reduction via local inference, sub-2s RAG retrieval latency on pgvector, and a dashboard that loads in under 500ms for observability across agents and queues.

Implementation Deep Dive

📋 Situation

Coordinating multiple LLM providers, keeping RAG context brand-consistent, and ensuring the dashboard remained performant while displaying real-time task and model metrics required careful architecture.

🎯 Task

Ensure consistent routing logic across providers, maintain brand-aware retrieval quality, and build a dashboard that scales with usage without impacting backend throughput.

Action

  • Designed an abstraction layer for multi-LLM routing with fallbacks and cost-based provider selection
  • Tuned pgvector indexes and RAG prompts for brand consistency
  • Propagated observability metadata through Celery task chains and Redis
  • Built the Next.js dashboard with server components and efficient polling/streaming for live metrics
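The observability-metadata pattern can be sketched as below. A plain dict stands in for Redis here (a real deployment would write each entry to a Redis hash keyed by task id, from Celery task callbacks), and all names are hypothetical.

```python
import time

class TaskTracker:
    """Record per-task metadata for a dashboard to read.

    A dict stands in for Redis in this sketch; the same shape maps
    directly onto HSET/HGETALL on a per-task Redis hash.
    """
    def __init__(self) -> None:
        self.store: dict[str, dict] = {}

    def start(self, task_id: str, provider: str) -> None:
        self.store[task_id] = {
            "provider": provider,
            "status": "running",
            "started": time.monotonic(),
        }

    def finish(self, task_id: str, ok: bool = True) -> None:
        meta = self.store[task_id]
        meta["status"] = "ok" if ok else "error"
        # Replace the start timestamp with the measured latency.
        meta["latency_s"] = time.monotonic() - meta.pop("started")
```

Keeping this metadata outside the task result itself is what lets the dashboard poll live status without touching the Celery result backend on its hot path.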

🏆 Result

Achieved multi-LLM routing with p95 latency under 3s and no provider lock-in; improved RAG relevance scores; kept dashboard TTI under 1s with no increase in backend error rate.
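A p95 figure like the one above can be computed from raw latency samples with the standard library; this is a generic sketch, not the platform's actual metrics pipeline.

```python
from statistics import quantiles

def p95(latencies_s: list[float]) -> float:
    # quantiles(n=100) returns the 1st..99th percentile cut points,
    # so index 94 is the 95th percentile.
    return quantiles(latencies_s, n=100)[94]
```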

Technology Stack

Next.js · FastAPI · Celery · Redis · PostgreSQL · pgvector · Ollama · OpenAI · Gemini · RAG · Multi-LLM Routing