DataVaultGPT

Admin Portal

TOTAL

Documents indexed

TOTAL

Queries answered

AVG

Confidence score

FEEDBACK

Helpful rate

ACTIVE

Open escalations

Query Volume (7 days)

Sources

Model Usage

Confidence Distribution

OpenRouter API Usage

Executive Intelligence

System performance insights for leadership review

Total Queries

Auto-Resolved

No escalation needed

Escalation Rate

Lower is better

Positive Feedback

Of rated responses

Confidence Distribution

🔥 Goal: shift LOW → HIGH over time by filling content gaps

Queries by Product Area

Escalation Circle Performance

Circle 1 → 2 promotion rate
Avg time to resolution

Top Negative-Feedback Sources

Escalated Topics by Volume

Product areas with the most unanswered queries

Zero-Results Queries

Questions with no matching sources in the knowledge base

Repeated Low-Confidence Questions

Same question asked multiple times, always resulting in escalation — prime candidates for KB articles

Question · Asked · Avg Confidence · Last Seen

Document Health

Top cited documents ranked by health status

Document · Citations · Verified · Neg Feedback · Age (days) · Health

Query History

Question · Confidence · Model · User · Channel · Feedback · Time
Showing of

Query Detail

Delete Sources?

This will permanently delete all selected documents and their indexed chunks. This cannot be undone.

Delete Source?

This will permanently delete the document and all its indexed chunks. This cannot be undone.

Knowledge Sources

selected
Source · Type · Authority · Chunks · Feedback · Indexed · Actions
No sources indexed yet
Showing of

Open

SLA Breached

At Risk (<1h)

Resolved

Question · Circle · Status · Area · Owner · SLA · Confidence · Created

Test Query

Knowledge Gaps

FAQ Candidates

Channel Intelligence Stats

Historical Backfill

Import historical Slack conversations to seed the intelligence database. Backfill is idempotent — running it multiple times is safe.

DataVaultGPT Docs

Slack-first AI RAG System

Overview

What DataVaultGPT is and how it works

DataVaultGPT

A Slack-first AI-powered knowledge assistant that connects your organization's scattered documentation — Confluence, Google Drive, JIRA, Zendesk, Slack history, and more — into a single, queryable intelligence layer. Ask a question in Slack; get an accurate, cited answer in seconds.

How It Works

1

Ingest

Documents are pulled from connected sources (Confluence, Google Drive, JIRA, Zendesk, uploaded files, etc.), split into chunks, and stored as vector embeddings in PostgreSQL.

2

Receive

A user asks a question in Slack (or via the REST API). The query passes through a 10-step pipeline.

3

Search

Hybrid search (semantic vector search + BM25 keyword search) finds the most relevant document chunks. Results are re-ranked by a cross-encoder model.

4

Generate

The top chunks are passed as context to an LLM (via OpenRouter) which generates a grounded, cited answer.

5

Respond

The answer is posted back to Slack (or returned via API) with source citations. If confidence is low, the query is flagged for escalation.

Architecture

Core Stack

FastAPI — REST API backend + admin portal
PostgreSQL + pgvector — document store + HNSW vector index
Redis — semantic answer cache
OpenRouter — unified LLM gateway (Claude, GPT-4, Gemini…)
Slack Bolt — Socket Mode bot

Data Sources

Confluence (wiki pages)
Google Drive (docs, sheets, slides)
JIRA (tickets & resolutions)
Zendesk (support tickets)
Slack history, Release notes, File uploads
  Slack / REST API
       │
       ▼
  ┌─────────────────────────────────────────────────────┐
  │                  Query Pipeline                      │
  │  Cache? → FAQ? → Expand → Embed → Search → Rerank   │
  │                        → LLM → Verify → Respond     │
  └──────────────┬──────────────────────────────────────┘
                 │
        ┌────────┴────────┐
        ▼                 ▼
  PostgreSQL            Redis
  (chunks +           (answer
   pgvector)           cache)
        │
        ▼
  OpenRouter (LLM)
  Claude / GPT-4 / Gemini

Key Concepts

Authority Tiers: Documents are ranked T1 (Canonical — ground truth) → T2 (Release notes) → T3 (Plans/drafts) → T4 (Slack messages). Higher-tier sources are preferred in answers.
Chunking: Documents are split into overlapping text segments (~512 tokens each) for fine-grained retrieval. Each chunk is independently embedded and searchable.
Hybrid Search: Combines dense vector similarity (semantic meaning) with sparse BM25 (keyword frequency) for best-of-both recall and precision.
Query Cache: Answers to semantically similar questions are cached in Redis. If a new query is close enough to a cached one, the stored answer is returned instantly without running the full pipeline.
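The chunking idea above can be illustrated with a minimal Python sketch. It windows a pre-tokenized document into ~512-token segments with 64-token overlap (the overlap size is stated in the Features section); the function name is illustrative, and the real pipeline uses a proper model tokenizer rather than a plain token list.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into overlapping windows (sketch of the chunking step)."""
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + size]
        chunks.append(window)
        if start + size >= len(tokens):
            break  # last window reached the end of the document
    return chunks

# Toy example: 1200 fake "tokens" yields three overlapping chunks,
# and each consecutive pair shares exactly 64 tokens.
chunks = chunk_tokens(list(range(1200)))
```

Each chunk is then embedded and indexed independently, so retrieval can land on the exact passage that answers a question instead of the whole document.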

Getting Started

Set up DataVaultGPT in minutes

Prerequisites

Docker & Docker Compose — for running all services
OpenRouter API Key — required for LLM generation (openrouter.ai)
Voyage AI Key — optional, for higher-quality embeddings
Slack App — optional, for Slack bot integration

Installation

1 — Clone the repository

git clone <repo-url>
cd DataVaultGPT

2 — Configure environment

cp .env.example .env
# Edit .env and set your API keys

At minimum, set OPENROUTER_API_KEY, ADMIN_USERNAME, and ADMIN_PASSWORD.

3 — Start all services

docker compose up -d

This starts PostgreSQL, Redis, and the app server. Database migrations run automatically on startup.

4 — Access the admin portal

open http://localhost:8000/portal

Quick Start: Your First Query

1

Upload a document

Go to Sources → Upload Document. Upload a PDF, DOCX, PPTX, XLSX, or TXT file. Set the authority tier (T1 = most authoritative).

2

Test a query

Go to Test Query tab. Type a question about the document you uploaded and click Submit.

3

Review the result

The response shows the answer, confidence level, and source citations. Check Request Performance in Settings to see pipeline timing.

4

Connect Slack (optional)

Go to Sources → Slack Configuration. Enter your Slack Bot Token, Signing Secret, and App Token. Users can then @mention the bot in any channel.

Security Note

Change ADMIN_PASSWORD from the default before exposing to any network. The app refuses to start with the default password changeme unless DEBUG=true is set.

Features & Functionality

Everything DataVaultGPT can do

Document Ingestion

Documents from any connected source are automatically processed into searchable chunks. Three upload modes are available in Sources → Knowledge Sources:

Single File

Upload one file at a time. Formats: PDF, DOCX, PPTX, XLSX, TXT, MD (up to 250 MB).

Folder Upload

Select an entire local folder. All supported files are uploaded at once, preserving the folder name in document titles.

ZIP Upload

Upload a .zip archive. Folder structure preserved as folder › file in titles. macOS resource forks and hidden files are automatically skipped. Progress tracked in real-time.

Connectors

Confluence, Google Drive, JIRA, Zendesk, Slack History, Release Notes (web), GitHub, Notion, File Upload (single/folder/ZIP)

Processing

Text extracted (PDF uses enhanced tolerance for character-positioned PDFs) → split into 512-token chunks with 64-token overlap → embedded → stored with full-text BM25 index

Authority Tiers

T1 Canonical · T2 Release · T3 Plans · T4 Slack — higher tiers are preferred in answers

Deduplication

Re-uploading the same file with unchanged content is a no-op. Changed content replaces the old chunks.

Hybrid Search & Confidence Scoring

Every query runs two search passes and merges results using Reciprocal Rank Fusion (RRF). Confidence scoring uses the best signal from both passes.

Semantic (dense): pgvector HNSW index finds chunks with similar meaning using cosine similarity. Best for conceptual questions.
BM25 (sparse): PostgreSQL full-text tsvector finds exact keyword matches. Best for specific terms, acronyms, and product names. Runs in an isolated database session to prevent transaction conflicts.
Re-ranking: If the Cohere API is configured, a cross-encoder model re-scores merged results for final precision. Falls back to the RRF score without Cohere.
Confidence score: Calculated as 0.30 × authority_signal + 0.70 × retrieval_signal. The retrieval signal uses max(vector_score, rrf_normalized) — whichever is higher — so strong BM25 hits count equally to strong vector hits. This prevents correct answers from being escalated due to low vector similarity alone.
Confidence Bands:  ✅ HIGH ≥ 0.80 — answer returned directly  ·  🟡 MEDIUM 0.60–0.79 — returned with a verify warning  ·  🔴 LOW < 0.60 — escalated to product team
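The merge and scoring rules above can be sketched in Python. The RRF constant k=60 is a common default and an assumption here (the source does not state it), and the function names are illustrative.

```python
def rrf_merge(dense_ranked, sparse_ranked, k=60):
    """Reciprocal Rank Fusion: score(d) = sum of 1/(k + rank) over both lists."""
    scores = {}
    for ranked in (dense_ranked, sparse_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def confidence(authority_signal, vector_score, rrf_normalized):
    """0.30 x authority + 0.70 x best retrieval signal, per the formula above."""
    retrieval = max(vector_score, rrf_normalized)
    return 0.30 * authority_signal + 0.70 * retrieval

def band(score):
    """Map a score to the documented confidence bands."""
    if score >= 0.80:
        return "HIGH"
    if score >= 0.60:
        return "MEDIUM"
    return "LOW"
```

For example, a T1 document with a strong vector hit (authority 1.0, vector 0.9) scores 0.93 and lands in HIGH, while a weaker authority signal with a strong BM25 hit can still clear the MEDIUM threshold thanks to the max() in the retrieval term.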

FAQ System

Pre-authored Q&A pairs that bypass the full pipeline for instant, guaranteed-accurate answers on critical questions.

When a query closely matches a FAQ question (cosine similarity > threshold), the FAQ answer is returned directly without calling the LLM. This ensures consistent answers for policy-critical questions and is significantly faster than the full pipeline.

Manage FAQs in Sources → FAQ Management. Each FAQ can be refreshed to update its embedding if the answer text changes.

Semantic Answer Cache

Redis-backed cache that stores answers keyed by query embedding. When a new question is semantically similar enough to a previously answered one (configurable threshold), the cached answer is returned immediately — zero LLM cost, sub-100ms response.
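A minimal sketch of the lookup logic, using a plain list in place of Redis and an assumed similarity threshold of 0.95 (the actual threshold is configurable, as noted above):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

CACHE = []  # list of (embedding, answer) pairs; Redis in the real system

def cache_lookup(query_emb, threshold=0.95):
    """Return a cached answer if any stored query is similar enough."""
    best = max(CACHE, key=lambda entry: cosine(query_emb, entry[0]), default=None)
    if best and cosine(query_emb, best[0]) >= threshold:
        return best[1]
    return None  # cache miss: run the full pipeline
```

A hit short-circuits the pipeline entirely, which is why cached answers carry no LLM cost and return in well under a second.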

Routing Rules

Keyword and topic-based rules that control how different types of questions are handled. Rules are evaluated in priority order.

Custom Model

Route specific topics to a different LLM (e.g., route legal questions to a more precise model)

Topic Filtering

Restrict which document product areas are searched for a given query topic
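Priority-ordered keyword matching can be sketched as follows. Field names are illustrative; the behavior (lower priority number runs first, first match wins) follows the rule-evaluation order described above.

```python
def first_matching_rule(query, rules):
    """Evaluate rules in priority order; the first keyword match wins."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if any(kw.lower() in query.lower() for kw in rule["keywords"]):
            return rule
    return None  # no rule matched: default pipeline behavior applies
```

A matched rule can then swap in a different model or restrict the searched product areas before the pipeline continues.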

Escalation System

Three-circle escalation system that routes unanswered queries to human experts, tracks SLA compliance, and closes the loop with the original asker.

Circle 1 — Topic PM

Default 4h SLA. Triggered on low confidence, missing sources, or manual escalation. Owner DM'd immediately.

Circle 2 — Area Lead

Default 8h SLA. Triggered on Circle 1 SLA breach. New owner DM'd automatically on promotion.

Circle 3 — Leadership

Default 24h SLA. Triggered on Circle 2 breach, or 3+ open escalations in the same product area (systemic pattern).

End-to-End Flow

Low-confidence answer triggers escalation; record stored with the original asker's Slack user ID
Escalation card posted to the designated Slack channel; assigned owner DM'd with the question, bot answer, and a "✍️ Provide Answer" button
Owner clicks the button → types reply in a Slack modal → submits
Reply saved to DB; escalation marked resolved; original asker DM'd with the expert answer
On SLA breach, POST /escalations/check-sla promotes the record and DMs the next-circle owner automatically

Configure topic-to-owner routing in Escalations → Escalation Mapping. Each product area can have a dedicated Circle 1 owner, Circle 2 backup, and custom SLA hours.

Performance Tracking

Every query is timed at each pipeline step. Slow steps are flagged with actionable recommendations.

Embedding

slow if >800ms

Cache lookup

slow if >300ms

Search

slow if >2s

Re-rank

slow if >2s

LLM generation

slow if >8s

Total (Slack)

slow if >20s

Enable in Settings → Performance Tracking. View history and recommendations in the Request Performance table.

Slack Reply UI

Every Slack answer is formatted using Block Kit with a consistent, confidence-aware layout.

Confidence header: Large bold header block: ✅ Answer verified against N sources · 0.85 / 🟡 Partial match — verify before sharing / 🔴 No answer found — escalated to product team
Citations: One line per verified source: 📄 Title · Last updated Date · Source. Unverified sources are omitted. Each citation has its own 👍 / 👎 feedback buttons.
Action buttons: Contextual per confidence state: HIGH → Share to Channel; MEDIUM/LOW (not escalated) → Escalate to Expert; Escalated → Try rephrasing / Wrong Question only. Feedback buttons hidden for escalated answers.
Cache indicator: ⚡ Cached response label shown in footer when the answer came from the Redis cache. A Re-ask button appears for cached responses.

Query History & Pagination

Query History and Knowledge Sources are both paginated with 25 records per page.

Paginated APIs

Both /admin/queries and /admin/documents return {total, items}. Count queries run in parallel with data fetches for minimal overhead.

User Display Names

Query History shows Slack user avatar + display name instead of raw user IDs. Names are loaded from /admin/slack-users.

API Reference

All available endpoints with request and response formats

Base URL: http://your-server:8000

Authentication: Admin endpoints require HTTP Basic Auth. Include an Authorization: Basic <base64(user:pass)> header.

POST /query Submit a question to the pipeline

Request Body (JSON)

{
  "query": "What is our AI usage policy?",  // required
  "source": "api",                          // optional
  "user_id": "U123ABC"                      // optional, for Slack user context
}

Response (200 OK)

{
  "answer": "According to the AI Use Policy...",
  "confidence_score": 0.82,
  "confidence_band": "HIGH",
  "citations": [
    { "title": "AI Policy", "source": "confluence", "tier": 1 }
  ],
  "cached": false,
  "query_id": "uuid"
}

cURL Example

curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{"query": "What is our return policy?"}'

POST /query/stream Streaming SSE response

Same request body as /query. Returns a Server-Sent Events stream. Events: token (partial answer), done (final with citations), error.
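The stream can be consumed with a small parser. This sketch assumes standard SSE framing (event:/data: lines, events separated by blank lines) and the event names listed above; in a real client you would feed it the chunked HTTP response body.

```python
import json

def parse_sse(stream_text):
    """Parse a raw SSE body into (event_name, decoded_data) pairs."""
    events = []
    for block in stream_text.strip().split("\n\n"):
        event, data = "message", ""
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        events.append((event, json.loads(data) if data else None))
    return events

# Example body shaped like the documented token/done events (payload fields assumed):
raw = (
    'event: token\ndata: {"text": "According"}\n\n'
    'event: done\ndata: {"citations": []}\n\n'
)
events = parse_sse(raw)
```

Token events arrive incrementally and can be concatenated for display; the final done event carries the citations.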

GET /health Service health check
{ "status": "ok", "db": "connected", "redis": "connected" }

Admin Endpoints (require Basic Auth)

Method · Path · Description
GET /admin/stats · Dashboard aggregate statistics
GET /admin/queries · Query history — paginated, returns {total, items}. Params: limit, offset, band, search
GET /admin/documents · Indexed documents — paginated, returns {total, items}. Params: limit, offset, search
POST /admin/upload · Upload and index a single file (multipart/form-data: file, authority_tier, product_area)
POST /admin/upload/zip · Upload a .zip archive and index all contained files. Form fields: file, authority_tier, product_area, job_id (for progress polling)
GET /admin/upload/zip/progress/:job_id · Poll ZIP upload progress. Returns {status, total, done, failed, current_file}
DELETE /admin/documents/:id · Delete a document and its chunks
GET /admin/llm/effective-chain · Show the currently wired LLM client: primary model, fallback chain, and whether it came from admin override or env default
GET /admin/faqs · List FAQ entries
POST /admin/faqs · Create a new FAQ entry
DELETE /admin/faqs/:id · Delete a FAQ entry
GET /admin/routing-rules · List routing rules
POST /admin/routing-rules · Create a routing rule
GET /admin/connectors · List all connectors and their status
GET /admin/logs · Application log entries (filterable). Supports Clear Logs action.
GET /admin/performance · Request performance records with per-step timing breakdown
GET /admin/escalations · Escalation queue
GET /admin/product-areas · Distinct product areas from queries & documents (for escalation mapping)
GET /admin/slack-users · Slack workspace users with display names and avatar URLs
PUT /admin/escalation-mapping · Save topic-to-owner mapping (Circle 1, Circle 2, SLA per area)
POST /escalations/check-sla · Promote breached escalations C1→C2→C3 and DM newly assigned owners
PUT /admin/openrouter-settings · Update LLM model/token configuration
PUT /admin/settings · Update app settings (e.g. perf tracking toggle)
POST /admin/restart · Restart the application
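All admin endpoints use the Basic Auth header described at the top of this reference. A small Python helper for building it is sketched below; the example request is left commented out because it needs a running server and real credentials.

```python
import base64
import json
import urllib.request

def basic_header(user: str, password: str) -> str:
    """Build the Authorization: Basic <base64(user:pass)> header value."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

def admin_get(path, user, password, base="http://localhost:8000"):
    """GET an admin endpoint with HTTP Basic Auth (requires a live server)."""
    req = urllib.request.Request(
        base + path,
        headers={"Authorization": basic_header(user, password)},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example (against a running instance):
# stats = admin_get("/admin/stats", "admin", "your-password")
```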

Configuration

All environment variables and their defaults

How settings work: Environment variables in .env provide the base configuration. Certain settings (OpenRouter model, connector credentials, expansion model) can also be overridden at runtime through the admin portal and are persisted to the database.

Core

Variable · Default · Description
DATABASE_URL · postgresql+asyncpg://… · PostgreSQL connection string
REDIS_URL · redis://redis:6379/0 · Redis connection string for answer cache
DEBUG · false · Enable debug mode. Do NOT use in production.
CORS_ALLOWED_ORIGINS · (empty) · Comma-separated allowed CORS origins. Empty = localhost only.

Authentication

Variable · Default · Description
ADMIN_USERNAME · admin · Admin portal username. Change before deploying.
ADMIN_PASSWORD · changeme · Must be changed. App refuses to start with the default in production. Supports bcrypt hashes (prefix: $2b$).

AI Services

Variable · Required · Description
OPENROUTER_API_KEY · required · OpenRouter key for LLM access. Get one at openrouter.ai.
OPENROUTER_PRIMARY_MODEL · claude-haiku-4.5 · Default LLM model. Overridable from admin portal.
OPENROUTER_MAX_TOKENS · 512 · Max tokens per LLM response.
OPENROUTER_TEMPERATURE · 0.2 · LLM temperature. Lower = more deterministic.
VOYAGE_API_KEY · optional · Voyage AI key for higher-quality embeddings.
COHERE_API_KEY · optional · Cohere key for cross-encoder re-ranking.

Slack

Variable · Required · Description
SLACK_BOT_TOKEN · for Slack · Bot token (xoxb-…) from Slack app settings.
SLACK_SIGNING_SECRET · for Slack · Signing secret for request verification.
SLACK_APP_TOKEN · for Slack · App-level token (xapp-…) for Socket Mode.
SLACK_ESCALATION_CHANNEL · optional · Channel ID where low-confidence queries are posted.
SLACK_HISTORY_CHANNELS · optional · Comma-separated channel IDs to ingest as knowledge.

Connectors

Confluence

CONFLUENCE_BASE_URL — e.g. https://myorg.atlassian.net/wiki

CONFLUENCE_USERNAME — your Atlassian email

CONFLUENCE_API_TOKEN — API token from id.atlassian.com

CONFLUENCE_SPACES — comma-separated space keys, e.g. PROD,ENG

Google Drive

GDRIVE_SERVICE_ACCOUNT_JSON — path to service account key file

GDRIVE_FOLDER_IDS — comma-separated folder IDs to index

JIRA

JIRA_BASE_URL · JIRA_USERNAME · JIRA_API_TOKEN · JIRA_PROJECTS

Zendesk

ZENDESK_SUBDOMAIN · ZENDESK_EMAIL · ZENDESK_API_TOKEN

Workflows & Tutorials

Step-by-step guides for common tasks

Bulk Upload via ZIP Archive

Use ZIP upload to ingest an entire folder of documents in one step — useful for initial data loads or batch re-indexing.

1

Prepare your ZIP file

Compress a folder of supported files (PDF, DOCX, PPTX, XLSX, TXT, MD). Nested folders are preserved as folder › file in document titles. macOS __MACOSX resource forks are automatically skipped.

2

Go to Sources → Knowledge Sources → Upload Document

Switch the upload mode toggle to ZIP.

3

Set authority tier and product area

All files in the ZIP will inherit the selected tier and product area. You can edit individual documents after upload if needed.

4

Click Upload and monitor progress

A real-time progress bar shows how many files have been indexed. Each file is processed sequentially. Errors are shown inline — other files continue processing.

Tip: For large ZIPs, the upload HTTP request may take several minutes. Monitor progress via the progress bar or poll GET /admin/upload/zip/progress/{job_id} directly.
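The polling loop can be sketched as follows. The terminal status values "completed" and "failed" are assumptions about the progress payload (the documented fields are status, total, done, failed, current_file), and fetch stands in for the HTTP GET against the progress endpoint.

```python
import time

def poll_zip_progress(fetch, interval=2.0, timeout=600):
    """Poll a ZIP indexing job until it reaches a terminal status.

    `fetch` is a callable returning the dict from
    GET /admin/upload/zip/progress/{job_id}.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        progress = fetch()
        if progress.get("status") in ("completed", "failed"):
            return progress
        time.sleep(interval)  # wait before the next poll
    raise TimeoutError("ZIP indexing did not finish in time")
```

In practice you would pass a closure that performs the authenticated GET for your job_id and adjust interval/timeout to the archive size.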

Adding a New Data Source

1

Go to Sources → Connectors

Find the connector for your data source (Confluence, Google Drive, etc.)

2

Enter credentials

Fill in the required fields (API token, base URL, etc.) and save. Credentials are encrypted before storage.

3

Test the connection

Click "Test Connection" to verify credentials are valid before running a full sync.

4

Trigger sync

Click "Sync Now" to begin ingesting content. Large sources may take several minutes. Check Application Logs for progress.

Creating an FAQ Entry

1

Go to Sources → FAQ Management

FAQs intercept matching queries before the full pipeline runs.

2

Click "Add FAQ"

Enter a representative question (this is what queries are matched against) and a definitive answer.

3

Save

The question is immediately embedded. Future queries that are semantically similar will return your FAQ answer instantly.

Best practice: Use FAQs for policy-critical questions where you need guaranteed, consistent answers — e.g., "What is our data retention policy?" rather than general knowledge questions.

Setting Up Routing Rules

1

Go to Sources → Routing Rules

2

Click "Create Rule"

Enter keywords that will trigger this rule (e.g., "legal", "compliance", "GDPR") and the desired behavior.

3

Set priority

Rules with lower priority numbers run first. When multiple rules match, the highest-priority rule applies.

Monitoring Query Performance

1

Enable tracking in System → Settings

Toggle "Performance Tracking" on. This records pipeline timing for every query.

2

View records in the Request Performance table

Each row shows total time and per-step breakdown. Slow steps are highlighted in red/amber.

3

Click a row for recommendations

Each flagged step has an actionable recommendation — e.g., "set a dedicated expansion model", "check pgvector index", "reduce context tokens".

Configuring Escalation Mapping

Map product areas to the right PM owners so escalations are automatically routed and owners DM'd when a low-confidence query triggers an escalation.

1

Go to Escalations → Escalation Mapping

The mapping table shows all configured product areas. A _default row acts as the fallback when no area match is found.

2

Click "Add Topic"

Choose a product area from the dropdown (populated from your ingested documents and past queries) or type a custom area name.

3

Assign owners

Use the searchable Slack user picker to assign a Circle 1 (primary PM) and Circle 2 (backup) owner. Avatars and display names are loaded from your Slack workspace.

4

Set SLA hours (optional)

Default is 4 hours for Circle 1. Adjust per-area if some topics need faster or slower triage.

5

Click "Save Mapping"

Changes are persisted to the database immediately. The next escalation uses the updated routing without a restart.

Tip: Set a _default Circle 1 owner to ensure every escalation reaches someone even when the product area is unknown or unmatched.

How the Owner Reply Loop Works

Once an escalation is created, the system handles the full conversation loop automatically — from owner DM to final delivery to the original asker.

1

Bot DMs the assigned Circle 1 owner

The DM includes the original question, the bot's answer, the confidence score, and a "✍️ Provide Answer" button.

2

Owner clicks "Provide Answer"

A Slack modal opens. The owner types their expert answer and clicks Submit.

3

Reply is saved and escalation resolved

The reply text and timestamp are stored on the escalation record. Status is set to resolved automatically.

4

Original asker receives a DM

The bot DMs the person who originally asked the question with the expert answer, attributed to the owner by name.

Note: Owner DMs are only sent when Slack is configured (SLACK_BOT_TOKEN set) and the escalation has an assigned owner. Check Application Logs if DMs are not arriving.

Automating SLA Checks

Call POST /escalations/check-sla on a schedule to automatically promote stale escalations and notify the next-circle owner.

1

Set up a cron job or scheduler

Run every 15 minutes. Example using system cron:

*/15 * * * * curl -s -X POST -u admin:pass http://localhost:8000/escalations/check-sla

2

Promotions happen automatically

Any Circle 1 escalation past its SLA deadline is promoted to Circle 2, new owner assigned, SLA clock restarted, and new owner DM'd. Same for Circle 2 → Circle 3.

3

Systemic pattern detection

If 3 or more open escalations share the same product area, the newest one is automatically promoted to Circle 3 (leadership) regardless of SLA status.
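The promotion rules above can be sketched in Python. The record fields and the next-circle SLA hours (8h for Circle 2, 24h for Circle 3, matching the defaults listed under the Escalation System feature) are assumptions about the stored shape, and "newest" is taken to be the last record in insertion order.

```python
from datetime import datetime, timedelta

def check_sla(escalations, now, area_threshold=3):
    """Promote breached open escalations one circle and restart their SLA clock."""
    promoted = []
    for esc in escalations:
        if esc["status"] == "open" and esc["circle"] < 3 and now > esc["deadline"]:
            esc["circle"] += 1
            # Restart the clock with the next circle's default SLA.
            esc["deadline"] = now + timedelta(hours=8 if esc["circle"] == 2 else 24)
            promoted.append(esc)
    # Systemic pattern: 3+ open escalations in one product area
    # sends the newest straight to Circle 3 (leadership).
    by_area = {}
    for esc in escalations:
        if esc["status"] == "open":
            by_area.setdefault(esc["product_area"], []).append(esc)
    for area_escs in by_area.values():
        if len(area_escs) >= area_threshold:
            area_escs[-1]["circle"] = 3
    return promoted
```

The real endpoint additionally reassigns the owner and sends the DM; this sketch covers only the promotion decisions.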

Optimizing for Speed

Set a dedicated expansion model — In Sources → OpenRouter Settings → Expansion Model, set a fast cheap model like google/gemini-2.0-flash. This reduces query expansion from ~8s to ~1s.
Add FAQs for common questions — FAQ lookups take <100ms vs 5–15s for a full pipeline run.
Warm the cache — The first time a question is asked it runs the full pipeline; subsequent similar questions hit the Redis cache instantly.
Reduce max tokens — Lowering OPENROUTER_MAX_TOKENS speeds up LLM generation. 256–512 is suitable for most Q&A use cases.

Troubleshooting

Common issues and how to fix them

PDF text extracted as garbled/split words (e.g. "N otification")

Some PDFs use character-level positioning where each letter is placed individually, causing word splits during extraction.

• The PDF parser uses enhanced tolerance settings (x_tolerance=10, y_tolerance=5) to merge adjacent characters into words. If you see garbled text from a document indexed before this fix, delete and re-upload the file to re-extract with the improved settings.

• If problems persist, try exporting the PDF from its source application as "PDF/A" or using pdftoppm to render pages to images and then OCR them.

"Could not extract any text from file"

Returned when uploading a document that produces no text.

Scanned PDF: pypdf cannot extract text from image-based PDFs. Convert to text-based PDF first, or use an OCR tool.

PPTX with only images/diagrams: Slides with no text shapes will be skipped. Add text descriptions or speaker notes.

Encrypted/password-protected file: Remove protection before uploading.

"No results found" on every query

The search returns no matching chunks.

• Check Sources → Documents — does the document have >0 chunks? If 0, the file may have failed to parse.

• Check that the embedding service is configured (VOYAGE_API_KEY or OpenRouter embeddings are active). View Application Logs for errors.

• Verify the document's product area matches your routing rules (if any rules restrict search scope).

Slow responses (>15 seconds)

• Enable performance tracking and check which step is slow (Settings → Request Performance).

Slow embedding: Voyage AI or OpenRouter embeddings are rate-limited. Consider caching or upgrading the plan.

Slow LLM: Switch to a faster model (e.g., Haiku instead of Opus). Reduce max tokens.

Slow expansion: Set a cheap dedicated expansion model in OpenRouter Settings.

Slow search: Ensure the pgvector HNSW index exists. Check with \d chunks in psql.

429 Too Many Requests

• The /query endpoint is rate-limited to 20 requests/minute per IP.

• If this is triggering during normal use, check that your client isn't sending redundant requests.

• API key rate limits from OpenRouter or Voyage AI will appear as 500 errors in logs.

Admin portal shows blank page or login fails

• Verify the app container is running: docker compose ps

• Check startup logs: docker compose logs app --tail=50

• Ensure ADMIN_PASSWORD is not changeme (the app won't start with the default password unless DEBUG=true).

• Check CDN connectivity — the portal requires access to cdn.tailwindcss.com and cdn.jsdelivr.net.

Slack bot not responding

• Verify SLACK_BOT_TOKEN, SLACK_SIGNING_SECRET, and SLACK_APP_TOKEN are all set correctly.

• Check that Socket Mode is enabled in your Slack app settings (api.slack.com).

• The bot needs to be invited to the channel: /invite @your-bot-name

• Check logs for "Slack not configured" or connection errors.

Escalation owner DMs not being sent

• Confirm Slack is configured and the bot is running — check logs for Slack bot started.

• Verify the escalation has an assigned owner: open Escalations and check the "Assigned To" column. If blank, no mapping entry matched — add a _default owner in Escalation Mapping.

• Check Application Logs for _dm_escalation_owner errors. A common cause is the bot lacking the im:write and users:read OAuth scopes.

• The DM fires 4 seconds after the query response is returned (background task). If the container was restarted in that window, the task is lost. It will not retry automatically.

Slack user picker shows "No users match"

• The picker loads from GET /admin/slack-users. If that returns HTTP 400, the SLACK_BOT_TOKEN is not set or not saved via the admin panel.

• Go to Sources → Connectors → Slack, enter your bot token, and click Save. The picker will reload on next open.

• Ensure the bot token has users:read scope. Bots without this scope will receive an empty user list from Slack.

Queries with known answers always escalate (confidence score too low)

• Check the confidence score in Query History — if it is below 0.60, the answer was treated as LOW and escalated.

• Make sure the relevant documents are indexed and have authority tier T1 or T2. T3/T4 documents have a lower authority signal.

• Re-upload any documents that were indexed before the PDF extraction fix — garbled text produces poor embeddings and low retrieval scores.

• If documents are older than 12 months, the freshness penalty (STALE_WEIGHT=0.50) halves their authority signal. Consider updating the documents or promoting their authority tier.

• Check GET /admin/llm/effective-chain to confirm the LLM is wired (not stub mode). If is_stub: true, set a valid OPENROUTER_API_KEY.

Database migration errors on startup

• Run docker compose exec app alembic upgrade head manually.

• Check that the PostgreSQL container is healthy before the app starts: docker compose ps

• If migrations are stuck, check alembic_version table in the database for the current state.

Glossary

Key terms and concepts

OpenRouter Settings

Configure the LLM model, parameters, and system prompt

Primary Model

Configured primary model
Primary

Query Expansion Model

Used only when expanding short queries (<12 words) into variants. Should be a fast, cheap model. Leave blank to use the primary model (slower but no extra config needed).

Recommended: google/gemini-2.5-flash or meta-llama/llama-3.1-8b-instruct:free

Embedding Model

Used to encode documents at ingest and queries at search time. Both must use the same model — changing this requires re-uploading all documents to rebuild embeddings in the new vector space.

Configured embedding model

Re-embed All Documents

Rebuilds every chunk's vector using the currently active embedding model. Run this after switching models.

Fallback Chain

Tried in order when the primary model returns 402 / 429 / 503

Generation Parameters

Temperature: 0 (deterministic) to 2 (creative)

Maximum tokens in the response

System Prompt

Instructions prepended to every query. Clear to restore the default.

Model Usage

Quick Reference

Provider OpenRouter
Active model
Embed model
Temperature
Max tokens
Fallbacks

Changes apply live — no restart needed

Restart App?

The app will be unavailable for ~10 seconds while it restarts. New credentials will be active immediately after.

Connector Configuration

Manage API keys and credentials for all data sources and AI services

Settings saved — applied live

Product Area Classification

Configure how documents and queries are automatically assigned a product area. Strategies are tried in priority order — the first one that returns a result wins. Enable the strategies you want and drag the handles to reorder within each phase.

Taxonomy Manager

Canonical label set used by all classification strategies. Labels here drive LLM prompts, keyword rules, and escalation routing.

Active labels · Total docs · Stale (need re-classification) · Unclassified · All docs current

Saved — active on next ingest / query
Document Classification Runs at ingest time · strategies 1–4
Query Classification Runs at query time · strategies 5–7

AI Services

Data Sources

Web Page Scraper

Web Page · Configured · Tier 2

Scrape and ingest any publicly accessible web page as a knowledge source

Ingested Pages

No pages ingested yet. Paste a URL above and click Scrape & Ingest.

Delete Page?

This will permanently remove the page and all its chunks from the knowledge base. Answers citing this page will no longer be available.

Application Logs

30-day history of errors, warnings, and connector activity

Clear All Logs?

This will permanently delete all log entries from the database. This cannot be undone.

Errors (24h)

Warnings (24h)

Total Errors (30d)

Total Entries (30d)

Most Recent Errors

Loading logs…
No log entries match the current filter.
Time · Level · Component · Message

Escalation Mapping

Assign a product owner to each topic — when a question is escalated, the matching owner is notified

New escalation topic

Circle 1 is the primary owner notified immediately on escalation. Circle 2 is escalated to if Circle 1 breaches the SLA.

Topic names are matched against the product_area field on each query. Unmatched queries fall through to the Default owner.

FAQ Management

Question · Answer (preview) · Status · Actions

Model Routing Rules

Rules evaluated in priority order (lowest number = highest priority)

Priority · Name · Model · Conditions · Status · Actions

Feature Toggles

Enable or disable optional pipeline features

Saved
Request Performance Tracking · Enabled / Disabled

When enabled, records end-to-end timing for every query — embedding, search, rerank, LLM generation, verification, and total wall time. Slow steps are flagged with actionable recommendations. Results appear in the Request Performance table below.

Note: adds a small DB write per request (~1–2 ms overhead).

Request Performance

Per-step timing for recent queries — slow steps highlighted in amber/red

Export & Recovery

Full backup and restore of all application data for disaster recovery

Export Backup

Downloads a ZIP archive containing all tables: documents, chunks (with embeddings), FAQs, escalations, query history, connector settings, Slack messages, and more.

The backup includes API keys and credentials stored in connector settings. Store it securely.

Import & Restore

Upload a previously exported backup ZIP. All existing data will be permanently replaced. This cannot be undone — export a backup first if you have data you want to keep.

Confirm Data Restore

This action cannot be undone

You are about to permanently replace all current data with the contents of this backup:

Documents, settings, FAQs, escalation history, and query logs will all be overwritten. The app will continue serving queries during the restore.

Assign owner