Dual-Model Discord AI With Real Memory

ServerMate Discord AI

A Discord bot that decides when to be lightning-fast and when to go deep, while remembering every user it meets.

  • Dual-model intelligence keeps costs low
  • Memory + personality tuned per server
  • Imagen, video, browsing, and research in one bot

  • Servers Online: 10+ (multi-guild)
  • Memories Stored: 50k+ (PostgreSQL)
  • Image/Video Jobs: 350+ (Imagen 3.0 + Vertex)
  • API Savings: 90% (fast model routing)

Overview

ServerMate feels alive in Discord servers. It inspects every incoming message, decides whether the message needs deep reasoning or lightweight banter, and switches between Gemini 2.0 Flash and Gemini 2.5 Pro automatically. Beyond chat, it generates images and videos, browses the web without bypassing CAPTCHAs, and maintains per-user memory so conversations stay personal.

Stack

Python, Discord.py, Google Gemini, Gemini 2.0 Flash, Gemini 2.5 Pro, Imagen 3.0, Vertex AI, PostgreSQL, Serper API, AsyncIO

Visual tour

[Screenshot: web browsing flow]

Web Browsing & Safety

ServerMate records what it sees when browsing, including the CAPTCHA that stopped this request. Transparency comes first.
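
The guardrail described above could be sketched roughly as follows. This is a minimal illustration, not ServerMate's actual code: `BrowseResult`, `inspect_page`, and the marker list are hypothetical names, and matching on the reCAPTCHA and hCaptcha widget class names is only a heuristic that will miss other challenge pages.

```python
import re
from dataclasses import dataclass

# Assumed heuristic: class names used by the reCAPTCHA and hCaptcha widgets.
CAPTCHA_MARKERS = re.compile(r"g-recaptcha|h-captcha", re.IGNORECASE)


@dataclass
class BrowseResult:
    url: str
    blocked_by_captcha: bool
    summary: str


def inspect_page(url: str, html: str) -> BrowseResult:
    """Classify fetched HTML and report a CAPTCHA instead of working around it."""
    if CAPTCHA_MARKERS.search(html):
        note = f"I could not read {url}: the site showed a CAPTCHA, and I do not bypass those."
        return BrowseResult(url, True, note)
    # Crude tag stripping for the sketch; a real bot would use an HTML parser.
    text = re.sub(r"<[^>]+>", " ", html)
    text = " ".join(text.split())
    return BrowseResult(url, False, text[:200])
```

The point of returning a structured result is that the bot can show the user why a page was unreadable rather than failing silently.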

Code highlights

python

AI-Driven Model Selection

Routes casual talk to Gemini Flash and tough prompts to Pro with guardrails for image queries.

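async def decide_model(message_meta: dict) -> bool:
    """Return True when the message should go to the deep-reasoning model."""
    # Assumes message_meta carries 'wants_image_search', 'small_talk', and 'content' keys.
    if message_meta.get("wants_image_search") or message_meta.get("small_talk"):
        return False  # fast model

    # get_fast_model() is defined elsewhere in the bot; assumed here to return
    # a google-generativeai GenerativeModel.
    decision_model = get_fast_model()
    content = message_meta.get("content", "")
    prompt = f"User message: '{content}'\nDoes this need deep reasoning? Answer 'deep' or 'fast'."
    # generate_content_async is the awaitable variant of generate_content.
    decision = await decision_model.generate_content_async(prompt)
    return "deep" in decision.text.lower()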
  • Special-casing small talk and image searches
  • Fast model used unless the classifier flags deep reasoning
  • Logging choices for debugging cost/perf
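
The logging bullet is not visible in the snippet above, so here is one way it might look. This is a sketch under assumptions: `route_with_logging`, the logger name, and the `"flash"`/`"pro"` tier labels are hypothetical, and `fake_classifier` stands in for `decide_model` so the example runs without an API key.

```python
import asyncio
import logging
import time
from typing import Awaitable, Callable

logger = logging.getLogger("servermate.routing")  # hypothetical logger name


async def route_with_logging(
    message_meta: dict,
    classify: Callable[[dict], Awaitable[bool]],
) -> str:
    """Run the classifier, log the decision and its latency, return the model tier."""
    started = time.perf_counter()
    needs_deep = await classify(message_meta)
    elapsed_ms = (time.perf_counter() - started) * 1000
    tier = "pro" if needs_deep else "flash"
    logger.info(
        "routed message to %s in %.1f ms (small_talk=%s)",
        tier, elapsed_ms, message_meta.get("small_talk"),
    )
    return tier


async def fake_classifier(message_meta: dict) -> bool:
    # Stand-in for decide_model: anything that is not small talk counts as deep.
    return not message_meta.get("small_talk", False)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(asyncio.run(route_with_logging({"small_talk": True}, fake_classifier)))
```

Counting how often each tier appears in these logs is one simple way to sanity-check a cost-savings figure.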

Key outcomes

  • AI-driven model selection that routes casual chat to the fast model and complex prompts to the smart model
  • Long-term memory per user and per server with transparency commands (!memory and !forget)
  • Imagen 3.0-powered image generation plus video generation flows
  • Vision analysis so users can drop screenshots, diagrams, or equations
  • Web browsing with guardrails that explain CAPTCHAs instead of bypassing them
  • Serper-powered search for real-time research
  • Command suite covering stats, reminders, and creative tooling
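
To make the memory outcome concrete, the storage behind `!memory` and `!forget` might look something like the sketch below. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so it runs anywhere; the `memories` table, its columns, and the function names are all hypothetical, not ServerMate's real schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store keyed by (guild_id, user_id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE memories (
            id INTEGER PRIMARY KEY,
            guild_id INTEGER NOT NULL,
            user_id INTEGER NOT NULL,
            fact TEXT NOT NULL
        )
        """
    )
    return conn


def remember(conn: sqlite3.Connection, guild_id: int, user_id: int, fact: str) -> None:
    conn.execute(
        "INSERT INTO memories (guild_id, user_id, fact) VALUES (?, ?, ?)",
        (guild_id, user_id, fact),
    )


def recall(conn: sqlite3.Connection, guild_id: int, user_id: int) -> list[str]:
    """What a !memory command could show: this user's facts in this server."""
    rows = conn.execute(
        "SELECT fact FROM memories WHERE guild_id = ? AND user_id = ? ORDER BY id",
        (guild_id, user_id),
    )
    return [row[0] for row in rows]


def forget(conn: sqlite3.Connection, guild_id: int, user_id: int) -> int:
    """What a !forget command could do: delete the facts and report how many."""
    cur = conn.execute(
        "DELETE FROM memories WHERE guild_id = ? AND user_id = ?",
        (guild_id, user_id),
    )
    return cur.rowcount
```

Scoping every query by both `guild_id` and `user_id` keeps a user's memories in one server from leaking into another.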

Challenges tackled

  • Optimizing API costs while keeping responses instantaneous
  • Designing a normalized Postgres schema for memory, interactions, learned behaviours, and multimedia logs
  • Handling multi-server concurrency with AsyncIO and rate-limit-friendly queues
  • Building consistent personality while still answering technical prompts accurately
  • Blending multimedia generation and browsing without blocking Discord event loops
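
The concurrency challenges above can be illustrated with a small sketch. It assumes the slow step is a blocking call, represented here by the hypothetical `render_image_blocking`, and uses a semaphore to cap in-flight jobs while `asyncio.to_thread` keeps the event loop free. The limit of 2 is arbitrary.

```python
import asyncio
import time


def render_image_blocking(prompt: str) -> str:
    # Stand-in for a slow, blocking generation call.
    time.sleep(0.05)
    return f"image for {prompt!r}"


async def run_job(prompt: str, limiter: asyncio.Semaphore) -> str:
    # The semaphore caps concurrent jobs so bursts stay rate-limit friendly.
    async with limiter:
        # to_thread moves the blocking call off the event loop thread.
        return await asyncio.to_thread(render_image_blocking, prompt)


async def run_jobs(prompts: list[str], max_concurrent: int = 2) -> list[str]:
    limiter = asyncio.Semaphore(max_concurrent)
    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(run_job(p, limiter) for p in prompts))


if __name__ == "__main__":
    print(asyncio.run(run_jobs(["cat", "dog", "fox"])))
```

While these jobs run in worker threads, the event loop can keep answering chat messages from other servers.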

Impact

ServerMate runs across multiple Discord servers today and has handled thousands of interactions. Users notice the personality, the recall, and the fact that it can switch from joking to debugging in one turn.