RAG Configuration
Advanced configuration options for Retrieval Augmented Generation (RAG) in Cognipeer AI.
Overview
RAG (Retrieval Augmented Generation) allows your AI Peers to answer questions using your own data sources while maintaining accuracy and providing source attribution. This guide covers all RAG configuration options and best practices.
Core Settings
Enable RAG Pipeline
Control whether the RAG pipeline is active for your Peer.
{
"enableRagPipeline": true
}
Default: true
When to disable:
- Pure conversational Peers without data requirements
- Testing general LLM behavior
- Reducing latency for simple queries
Include Data Sources as Tools
Expose data sources as callable tools in agentic workflows.
{
"includeDataSourcesAsTools": true
}
Default: true
Benefits:
- AI can explicitly query specific data sources
- Better control over when retrieval happens
- Improved reasoning about which source to use
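Conceptually, each data source is surfaced to the model as a callable tool it can choose to invoke. The shape below is a rough illustration in the style of common function-calling tool specs, not Cognipeer's actual schema:
// Hypothetical tool definition for a "Product Documentation" data source.
// Field names are illustrative; the platform generates its own tool schema.
const productDocsTool = {
  name: "search_product_docs",
  description: "Search the Product Documentation data source",
  parameters: {
    type: "object",
    properties: {
      query: { type: "string", description: "What to look up" }
    },
    required: ["query"]
  }
};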
Metadata Enrichment
Include Metadata
Add rich contextual metadata to retrieved documents.
{
"ragIncludeMetadata": true
}
Default: true
What gets included:
- Dataset items: Item identifiers, dataset names, custom fields
- Documents: Filename, author, upload date, file type
- External sources: URL, crawl date, page structure
- Collections: Custom metadata from your schema
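For illustration, a retrieved chunk with metadata attached might look roughly like this (field names are hypothetical, not the platform's actual schema):
// Hypothetical enriched chunk; field names are illustrative only.
const retrievedChunk = {
  content: "The Q4 revenue was $2.5M",
  score: 0.91,
  metadata: {
    source: "Financial Report 2024.pdf",
    author: "Jane Smith",
    uploadedAt: "2025-10-15",
    fileType: "pdf"
  }
};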
Example Output:
Context: "The Q4 revenue was $2.5M"
Without metadata:
Answer: "Q4 revenue was $2.5M."
With metadata:
Answer: "According to the Financial Report 2024 (uploaded by Jane Smith on Oct 15, 2025),
the Q4 revenue was $2.5M."
Include Conversation Sources
Add references to previous conversation context in RAG results.
{
"ragIncludeConversationSources": true
}
Default: true
Use cases:
- Multi-turn conversations requiring previous context
- Follow-up questions referencing earlier topics
- Maintaining conversation coherence
Answer Validation
Final Answer Validation
Enable dual-LLM validation to prevent hallucinations.
{
"ragValidateFinalAnswer": true,
"ragValidateFinalAnswerInstructions": "Ensure all medical information is explicitly stated in verified sources. Never extrapolate symptoms or treatments."
}
Default: false (due to latency/cost impact)
How it works:
- The primary LLM generates an answer from the RAG context
- The validator LLM receives:
  - The original question
  - The generated answer
  - The RAG context
  - The evidence sources
  - Your custom validation instructions
- The validator checks the answer against the evidence
- The original or a revised answer is returned (see the sketch below)
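For intuition, the flow amounts to a second, independent LLM call that reviews the first. The sketch below is illustrative only and does not show Cognipeer's internal implementation (callLlm is a placeholder for any chat-completion call):
// Illustrative dual-LLM validation flow (not the platform's internal code).
// callLlm() is a placeholder for any chat-completion call.
async function answerWithValidation(question, ragContext, instructions) {
  // 1. Primary LLM drafts an answer from the retrieved context
  const draft = await callLlm({
    system: "Answer using only the provided context.",
    user: `Context:\n${ragContext}\n\nQuestion: ${question}`
  });
  // 2. Validator LLM checks the draft against the evidence and instructions
  const validated = await callLlm({
    system: `You are a validator. ${instructions}`,
    user: `Question: ${question}\nAnswer: ${draft}\nEvidence:\n${ragContext}\n` +
      "Return the answer unchanged if fully supported by the evidence; otherwise revise it."
  });
  // 3. The original or revised answer is returned to the user
  return validated;
}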
Performance impact:
- Adds 500-2000ms latency (depends on model)
- Doubles LLM API calls
- Recommended for: Healthcare, legal, financial domains
Custom validation instructions:
Provide domain-specific rules:
{
"ragValidateFinalAnswerInstructions": `
- Only cite information from documents dated within last 6 months
- All numerical claims must have explicit source citations
- If multiple sources conflict, acknowledge the discrepancy
- Technical specifications must match exact product documentation
`
}
Strict Knowledge Base Mode
Force AI to only answer from indexed data.
{
"ragStrictKnowledgebaseAnswers": true
}
Default: false
Behavior:
- AI will not use general knowledge
- Missing information → "I don't know" responses
- All answers must be verifiable from sources
System prompt injection:
CRITICAL: You must ONLY provide information that is explicitly
present in the retrieved context. If the context does not contain
the answer, you MUST state that you don't have that information.
DO NOT use general knowledge, DO NOT make assumptions, and
DO NOT extrapolate beyond what is explicitly stated.
Best practices:
Combine with helpful fallback messages:
{
"ragStrictKnowledgebaseAnswers": true,
"additionalPrompt": `If the information is not found in your knowledge base, respond:
"I don't have that information in my current knowledge base.
However, I can help with [list available topics].
For additional assistance, please contact support@company.com"`
}
Query Controls
Maximum Results
Control how many document chunks to retrieve.
{
"ragMaxResults": 10
}
Default: null (system determines based on query)
Recommendations by use case:
// FAQ Chatbot - Simple, focused answers
{ "ragMaxResults": 3 }
// Research Assistant - Comprehensive analysis
{ "ragMaxResults": 20 }
// Technical Documentation - Precise examples
{ "ragMaxResults": 5 }
// General Q&A - Balanced
{ "ragMaxResults": 10 }Trade-offs:
- Too low (1-3): May miss relevant information
- Optimal (5-10): Balance relevance and cost
- Too high (20+): Token waste, slower responses, more noise
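To make the token trade-off concrete, here is a rough back-of-the-envelope estimate assuming chunks of about 500 tokens each (actual chunk sizes depend on your indexing settings):
// Approximate retrieved-context size per query, assuming ~500 tokens per chunk.
const tokensPerChunk = 500;
console.log(3 * tokensPerChunk);   // ~1,500 tokens  (FAQ chatbot)
console.log(10 * tokensPerChunk);  // ~5,000 tokens  (balanced Q&A)
console.log(20 * tokensPerChunk);  // ~10,000 tokens (research assistant), higher cost and latency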
Score Threshold
Set minimum similarity score for retrieved documents.
{
"ragScoreThreshold": 0.75
}
Default: null (system default, typically 0.7)
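The exact similarity metric depends on the underlying vector store; cosine similarity over embedding vectors is a common choice, and the sketch below shows how such a score is typically computed and compared against the threshold:
// Cosine similarity between two embedding vectors (illustrative;
// the platform's actual scoring function may differ).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
// A chunk is kept only if cosineSimilarity(queryVec, chunkVec) >= ragScoreThreshold.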
Score interpretation:
1.0 - 0.9: Exact or near-exact matches
0.9 - 0.8: Highly relevant, same topic
0.8 - 0.7: Related, conceptually similar
0.7 - 0.6: Loosely related
< 0.6: Different topics (usually filtered)
Tuning guide:
// High Precision (Strict relevance)
{
"ragScoreThreshold": 0.85,
"ragMaxResults": 3
}
// Use: Medical, legal, financial
// Balanced (Default)
{
"ragScoreThreshold": 0.7,
"ragMaxResults": 10
}
// Use: General Q&A, customer support
// High Recall (Broad coverage)
{
"ragScoreThreshold": 0.6,
"ragMaxResults": 15
}
// Use: Research, exploratory queries
Search Mode
Choose retrieval strategy.
{
"ragAllItemMode": "hybrid"
}
Options:
"semantic": Vector similarity search (default)"keyword": Full-text keyword search"hybrid": Combined semantic + keyword with RRF ranking
Semantic search:
{ "ragAllItemMode": "semantic" }- Best for: Natural language queries, conceptual searches
- Pros: Understands synonyms, handles typos, multilingual
- Cons: May miss exact terms or IDs
Keyword search:
{ "ragAllItemMode": "keyword" }- Best for: Exact phrase matching, product SKUs, code search
- Pros: Precise term matching, faster
- Cons: No semantic understanding, strict matching
Hybrid search (Recommended):
{ "ragAllItemMode": "hybrid" }- Best for: Technical documentation, mixed content
- Pros: Combines strengths of both approaches
- How it works: Parallel retrieval → RRF merge → Re-ranking (see the RRF sketch below)
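Reciprocal Rank Fusion (RRF) merges the two ranked lists by rank position rather than by raw score. A minimal sketch of the standard formula (using the conventional k = 60 constant, which may differ from the platform's internals) looks like this:
// Minimal Reciprocal Rank Fusion: each document scores sum(1 / (k + rank)).
// k = 60 is the conventional constant; the actual re-ranker may differ.
function rrfMerge(resultLists, k = 60) {
  const scores = new Map();
  for (const list of resultLists) {
    list.forEach((doc, rank) => {
      scores.set(doc, (scores.get(doc) || 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([doc]) => doc);
}

// Example: merge semantic and keyword rankings for the JWT query below
rrfMerge([
  ["Auth Guide", "OAuth Setup", "Sessions"],            // semantic results
  ["JWT Validation", "Authentication", "API Security"]  // keyword results
]);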
Hybrid search architecture:
Query: "How do I authenticate with JWT?"
↓ (parallel)
┌────────────────┐
│ │
Semantic Keyword
Search Search
│ │
├─ "Auth Guide" ├─ "JWT Validation"
├─ "OAuth Setup" ├─ "Authentication"
├─ "Sessions" ├─ "API Security"
│ │
└────────┬───────┘
↓
RRF Merge & Rerank
↓
├─ JWT Validation (0.93) ← Best!
├─ Authentication (0.87)
├─ OAuth Setup (0.76)
└─ API Security (0.74)Complete Configuration Examples
High-Stakes Application (Healthcare)
{
"enableRagPipeline": true,
"ragIncludeMetadata": true,
"ragIncludeConversationSources": true,
"ragValidateFinalAnswer": true,
"ragValidateFinalAnswerInstructions": "Only provide medical information from peer-reviewed sources dated within last 2 years. Never diagnose or prescribe. Always recommend consulting healthcare professionals.",
"ragStrictKnowledgebaseAnswers": true,
"ragMaxResults": 5,
"ragScoreThreshold": 0.85,
"ragAllItemMode": "hybrid",
"additionalPrompt": "If medical information is incomplete or outdated, clearly state limitations and recommend consulting with healthcare provider."
}
Customer Support Bot
{
"enableRagPipeline": true,
"ragIncludeMetadata": true,
"ragIncludeConversationSources": true,
"ragValidateFinalAnswer": false,
"ragStrictKnowledgebaseAnswers": false,
"ragMaxResults": 8,
"ragScoreThreshold": 0.7,
"ragAllItemMode": "hybrid",
"includeDataSourcesAsTools": true
}
Research Assistant
{
"enableRagPipeline": true,
"ragIncludeMetadata": true,
"ragIncludeConversationSources": true,
"ragValidateFinalAnswer": false,
"ragStrictKnowledgebaseAnswers": false,
"ragMaxResults": 20,
"ragScoreThreshold": 0.65,
"ragAllItemMode": "semantic",
"additionalPrompt": "Provide comprehensive analysis with citations. When multiple sources present different viewpoints, acknowledge all perspectives."
}
Technical Documentation Search
{
"enableRagPipeline": true,
"ragIncludeMetadata": true,
"ragIncludeConversationSources": false,
"ragValidateFinalAnswer": false,
"ragStrictKnowledgebaseAnswers": true,
"ragMaxResults": 5,
"ragScoreThreshold": 0.75,
"ragAllItemMode": "hybrid",
"additionalPrompt": "Always include code examples when available. Link to full documentation. If examples are incomplete, clearly indicate what's missing."
}
Monitoring and Optimization
Key Metrics to Track
Retrieval Quality:
- Average similarity score
- Number of results per query
- Cache hit rate
- Retrieval latency
Answer Quality:
- Validation pass rate
- User satisfaction scores
- Source attribution percentage
- "I don't know" response rate
Performance:
- End-to-end latency
- Token usage (prompt + completion)
- API costs per query
A/B Testing
Test different configurations:
// Configuration A: Strict
const configA = {
ragStrictKnowledgebaseAnswers: true,
ragValidateFinalAnswer: true,
ragScoreThreshold: 0.8,
ragMaxResults: 5
};
// Configuration B: Lenient
const configB = {
ragStrictKnowledgebaseAnswers: false,
ragValidateFinalAnswer: false,
ragScoreThreshold: 0.65,
ragMaxResults: 10
};
// Monitor:
// - Answer rate: % of queries answered
// - Accuracy: % of correct answers
// - Latency: Average response time
// - User satisfaction: Explicit feedback
Optimization Process
- Start with defaults and baseline metrics
- Identify problems:
  - Too many "I don't know" responses → Lower the threshold, increase max results
  - Hallucinations → Enable validation and strict mode
  - Slow responses → Reduce max results, disable validation
  - Irrelevant answers → Increase the threshold, try hybrid search
- Test changes with an evaluation dataset (see the sketch below)
- Monitor production metrics
- Iterate based on data
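A simple way to evaluate a candidate configuration is to replay a test set through the per-request overrides described later in this guide. The sketch below assumes a hypothetical evalSet of question/expected-answer pairs, a hypothetical isCorrect() check, and an illustrative response shape:
// Replay an evaluation set against a candidate configuration using
// per-request overrides (see "Per-Request Override" below).
// evalSet, isCorrect(), and response.message are assumptions for illustration.
async function evaluateConfig(peerId, overrides, evalSet) {
  let answered = 0, correct = 0;
  for (const { question, expected } of evalSet) {
    const response = await cognipeer.peer.ask(peerId, {
      message: question,
      overrides
    });
    if (!response.message.includes("I don't have that information")) answered++;
    if (isCorrect(response.message, expected)) correct++;
  }
  return {
    answerRate: answered / evalSet.length,   // % of queries answered
    accuracy: correct / evalSet.length       // % judged correct
  };
}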
API Configuration
Via Dashboard
Navigate to: Peer Settings → RAG Configuration
All options are available in the UI with a real-time preview.
Via API
const peer = await cognipeer.peer.update(peerId, {
enableRagPipeline: true,
ragIncludeMetadata: true,
ragValidateFinalAnswer: true,
ragValidateFinalAnswerInstructions: "Custom validation rules...",
ragStrictKnowledgebaseAnswers: true,
ragMaxResults: 10,
ragScoreThreshold: 0.75,
ragAllItemMode: "hybrid",
ragIncludeConversationSources: true
});
Per-Request Override
Override Peer defaults for specific queries:
const response = await cognipeer.peer.ask(peerId, {
message: "What's the return policy?",
overrides: {
ragMaxResults: 5,
ragScoreThreshold: 0.8
}
});
Common Issues and Solutions
Issue: "I don't have that information" for known content
Causes:
- Threshold too high
- Wording mismatch (semantic search limitation)
- Content not properly indexed
Solutions:
// Lower threshold
{ "ragScoreThreshold": 0.65 }
// Use hybrid search
{ "ragAllItemMode": "hybrid" }
// Increase results
{ "ragMaxResults": 15 }
// Re-index data source
await cognipeer.datasource.reindex(datasourceId);
Issue: Hallucinations in responses
Causes:
- Strict mode disabled
- No answer validation
- Low-quality source data
Solutions:
// Enable protections
{
"ragValidateFinalAnswer": true,
"ragStrictKnowledgebaseAnswers": true,
"ragScoreThreshold": 0.8
}
// Add explicit instructions
{
"additionalPrompt": "Only provide information explicitly stated in sources. If unsure, say 'I don't know.'"
}
Issue: Slow response times
Causes:
- Too many retrieved chunks
- Answer validation enabled
- Large metadata payloads
Solutions:
// Reduce retrieval
{
"ragMaxResults": 5,
"ragScoreThreshold": 0.75
}
// Disable validation for non-critical use
{
"ragValidateFinalAnswer": false
}
// Trim metadata if excessive
{
"ragIncludeMetadata": false
}
Issue: Missing context from conversations
Causes:
- Conversation sources disabled
- Short conversation history limit
Solutions:
// Enable conversation context
{
"ragIncludeConversationSources": true
}
// Increase history in Channel settings
{
"messageHistoryLimit": 20
}
Best Practices
✅ DO:
- Start with default settings and measure
- Enable metadata for production systems
- Use strict mode for compliance/regulated domains
- Validate answers in high-stakes applications
- Monitor metrics and iterate
- Test with evaluation datasets before production
- Document your configuration choices
❌ DON'T:
- Blindly copy configurations without testing
- Disable all safeguards for speed
- Set thresholds without measuring impact
- Forget to re-index after source updates
- Ignore user feedback
- Skip A/B testing for critical changes
Related Resources
Support
Questions about RAG configuration? Reach out:
- Discord: Join our community
- Email: support@cognipeer.com
- Documentation: docs.cognipeer.com

