AI-Powered Analysis & Optimization

Cognipeer AI's AI Analysis feature uses advanced language models to automatically analyze your peer's evaluation results and provide actionable improvement suggestions. This feature helps you quickly identify issues and optimize your peer's configuration for better performance.

Overview

The AI Analysis feature:

  • Analyzes Evaluation Results: Reviews failed and low-scoring questions
  • Identifies Patterns: Finds common issues across multiple failures
  • Generates Suggestions: Provides specific, actionable recommendations
  • Explains Reasoning: Details why each suggestion will help
  • Enables Quick Application: Apply changes with one click

How It Works

Analysis Process

  1. Data Collection: Gathers evaluation results, peer configuration, and context
  2. Pattern Recognition: Identifies recurring problems and failure patterns
  3. Root Cause Analysis: Determines underlying issues
  4. Suggestion Generation: Creates specific improvement recommendations
  5. Prioritization: Orders suggestions by potential impact

AI Model

The analysis uses GPT-4o to ensure high-quality, contextual recommendations. The model considers:

  • Question-answer pairs and their scores
  • Peer's current configuration (prompt, tools, datasources)
  • Evaluation metrics and failure patterns
  • Best practices for AI peer design

Using AI Analysis

Starting an Analysis

From Evaluation Results

  1. Navigate to an evaluation run's results page
  2. Click the "Analyze with AI" button
  3. Wait for analysis to complete (10-30 seconds)
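
If you prefer to trigger the same analysis programmatically, a minimal sketch using the suggest-improvements endpoint from the API Reference below might look like this (the base URL and bearer-token header here are assumptions, not confirmed defaults):

javascript
// Sketch: request AI analysis for a completed evaluation run.
// Endpoint per the API Reference section; base URL and auth scheme are assumed.
const runId = 'run_123'; // ID of a completed evaluation run

const response = await fetch(
  `https://api.cognipeer.com/api/v1/evaluation/${runId}/suggest-improvements`, // assumed base URL
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.COGNIPEER_API_KEY}`, // assumed auth header
      'Content-Type': 'application/json'
    }
  }
);

const { analysis } = await response.json();
console.log(analysis.summary);                              // high-level assessment
console.log(`${analysis.suggestions.length} suggestions returned`);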

Auto-Analysis

Enable automatic analysis after each evaluation run:

json
{
  "autoAnalyze": true,
  "autoAnalyzeThreshold": 0.7
}

Analysis runs automatically whenever the overall score falls below the configured threshold.

Viewing Analysis Results

The analysis panel shows:

Summary Section

  • Overall Assessment: High-level evaluation of peer performance
  • Key Issues Identified: Main problems affecting scores
  • Improvement Potential: Expected score increase if suggestions are applied

Detailed Suggestions

Each suggestion includes:

  1. Category: Type of improvement (Prompt, Tools, Settings, Data)
  2. Priority: High, Medium, or Low
  3. Title: Brief description
  4. Detailed Explanation: Why this change helps
  5. Specific Changes: Exact modifications to make
  6. Expected Impact: Predicted score improvement
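
Programmatically, each suggestion is returned as a JSON object with matching fields (see the API Reference at the end of this page). The example below mirrors the first suggestion from the sample analysis that follows; the inner shape of changes varies by category and is shown here only as an illustrative assumption:

json
{
  "id": "sugg_1",
  "category": "prompt",
  "priority": "high",
  "title": "Enhance System Prompt for Product Support",
  "description": "The peer lacks specific instructions for handling product-related queries.",
  "changes": { "prompt": "When discussing products: 1. Always reference official specifications..." },
  "expectedImpact": "+12%"
}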

Example Analysis Output

markdown
## Overall Assessment

Your peer is struggling with customer support scenarios, particularly 
around product specifications and troubleshooting. The main issues are:
- Insufficient context about product features
- Missing tools for real-time data access
- Overly creative responses (high temperature)

Expected improvement: +15-20% overall score

## Suggestions

### 1. Enhance System Prompt for Product Support (Priority: High)

**Current Issue:**
The peer lacks specific instructions for handling product-related queries.

**Recommendation:**
Add product support guidelines to the system prompt:

"When discussing products:
1. Always reference official specifications
2. Provide step-by-step troubleshooting
3. Offer to escalate complex technical issues
4. Use clear, non-technical language for customers"

**Expected Impact:** +12% on product-related questions

### 2. Enable Knowledge Base Tool (Priority: High)

**Current Issue:**
The peer cannot access product documentation during conversations.

**Recommendation:**
Enable the "Product Documentation" datasource tool to allow 
real-time access to specifications and guides.

**Expected Impact:** +18% on specification questions

### 3. Lower Temperature Setting (Priority: Medium)

**Current Issue:**
Temperature of 0.8 leads to inconsistent, creative responses 
where accuracy is needed.

**Recommendation:**
Reduce temperature to 0.3 for more consistent, factual responses.

**Expected Impact:** +8% on factual questions

Applying Suggestions

Individual Application

Apply suggestions one at a time:

  1. Review the suggestion details
  2. Click "Preview Changes" to see exact modifications
  3. Click "Apply" to implement the change
  4. The peer configuration updates immediately

Bulk Application

Apply multiple or all suggestions:

  1. Select suggestions using checkboxes
  2. Click "Apply Selected" or "Apply All"
  3. Review combined changes in preview modal
  4. Confirm to apply all changes
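
The same preview-then-apply flow is available through the apply-improvements endpoint documented in the API Reference below. A minimal sketch follows; the base URL, auth header, and the assumption that preview: true returns the computed changes without modifying the peer are mine, not confirmed defaults:

javascript
// Sketch: preview a set of suggestions, then apply them via the API.
// Endpoint per the API Reference; base URL and auth scheme are assumed.
const peerId = 'peer_123'; // target peer ID
const url = `https://api.cognipeer.com/api/v1/peer/${peerId}/apply-improvements`; // assumed base URL
const headers = {
  Authorization: `Bearer ${process.env.COGNIPEER_API_KEY}`, // assumed auth header
  'Content-Type': 'application/json'
};

// 1. Preview the combined changes (assumed: preview: true does not modify the peer)
const preview = await fetch(url, {
  method: 'POST',
  headers,
  body: JSON.stringify({ suggestions: ['sugg_1', 'sugg_2'], preview: true })
}).then(r => r.json());
console.log(preview.changes);

// 2. Apply for real once the preview looks right
const result = await fetch(url, {
  method: 'POST',
  headers,
  body: JSON.stringify({ suggestions: ['sugg_1', 'sugg_2'], preview: false })
}).then(r => r.json());
console.log(`Applied ${result.applied} suggestions`);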

Custom Modifications

Edit suggestions before applying:

  1. Click "Edit" on any suggestion
  2. Modify the proposed changes
  3. Save your custom version
  4. Apply the modified suggestion

Suggestion Categories

1. Prompt Improvements

Examples:

  • Add specific instructions for handling edge cases
  • Include examples of good responses
  • Clarify tone and style requirements
  • Add constraints or guidelines

Impact: Often the highest-impact category, since prompt changes affect every response

2. Tool Recommendations

Examples:

  • Enable relevant datasource tools
  • Add integration tools (API calls, database access)
  • Configure web search for real-time information
  • Enable file processing tools

Impact: Enables new capabilities, fills knowledge gaps

3. Settings Adjustments

Examples:

  • Temperature modifications
  • Model selection changes
  • Max tokens adjustments
  • Reasoning settings

Impact: Fine-tunes response quality and consistency

4. Data Source Updates

Examples:

  • Add missing documentation to knowledge base
  • Update outdated information
  • Improve data organization
  • Add contextual metadata

Impact: Improves factual accuracy and completeness

Best Practices

Before Requesting Analysis

  1. Run Complete Evaluation: Ensure you have sufficient test coverage (20+ questions)
  2. Diverse Question Set: Include various scenarios and difficulty levels
  3. Clear Expected Answers: Provide realistic, well-defined expected responses
  4. Baseline Configuration: Start with a reasonable baseline setup

Reviewing Suggestions

  1. Understand the Reasoning: Read the full explanation for each suggestion
  2. Check for Conflicts: Ensure suggestions don't contradict each other
  3. Validate Priority: Higher priority doesn't always mean apply first
  4. Test Incrementally: Apply and test suggestions one at a time when possible

After Applying Changes

  1. Re-run Evaluation: Verify improvements with the same test suite
  2. Compare Scores: Check if actual improvement matches prediction
  3. Monitor Edge Cases: Ensure changes don't hurt other areas
  4. Iterate: Request new analysis if scores still need improvement
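
Using the SDK-style calls shown in the Integration Examples below (their exact interface is assumed here), a simple before/after comparison might look like:

javascript
// Sketch: re-run the same suite after applying suggestions and compare scores.
// Uses the SDK-style calls from the Integration Examples; interface is assumed.
const suiteId = 'suite_123'; // the suite used for the baseline run

const baselineRun = await evaluation.execute(suiteId);
// ... review and apply suggestions here ...
const newRun = await evaluation.execute(suiteId);

const delta = newRun.averageScore - baselineRun.averageScore;
console.log(`Score change: ${(delta * 100).toFixed(1)} percentage points`);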

Advanced Features

Custom Analysis Criteria

Specify what to focus on:

javascript
POST /api/v1/evaluation/:runId/suggest-improvements
{
  "focus": "accuracy",  // or "speed", "cost", "tone"
  "constraints": {
    "maxTemperature": 0.5,
    "preferredTools": ["datasource"],
    "maintainTone": true
  }
}

Analysis History

Track all analyses and applied suggestions:

  • View historical recommendations
  • See which suggestions were applied
  • Track improvement trends over time
  • Revert to previous configurations
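
A quick sketch of pulling this history through the analysis-history endpoint from the API Reference below (base URL and auth header are assumptions):

javascript
// Sketch: summarize past analyses for a peer using the analysis-history endpoint.
// Endpoint per the API Reference; base URL and auth scheme are assumed.
const peerId = 'peer_123';

const history = await fetch(
  `https://api.cognipeer.com/api/v1/peer/${peerId}/analysis-history`, // assumed base URL
  { headers: { Authorization: `Bearer ${process.env.COGNIPEER_API_KEY}` } } // assumed auth header
).then(r => r.json());

for (const a of history.analyses) {
  console.log(`${a.timestamp}: applied ${a.appliedCount}/${a.suggestionsCount}, improvement ${a.scoreImprovement}`);
}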

A/B Testing Support

Test suggestions before full deployment:

  1. Clone your peer
  2. Apply suggestions to the clone
  3. Run evaluations on both versions
  4. Compare results
  5. Deploy the winner
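
A sketch of that flow with the SDK-style calls from the Integration Examples; peer.clone is a hypothetical helper (substitute whatever mechanism you use to duplicate a peer), and the suite/peer wiring is assumed:

javascript
// Sketch of the A/B flow. `peer.clone` is hypothetical; the rest follows the
// SDK-style calls used in the Integration Examples (interface assumed).
const originalPeerId = 'peer_123';
const runId = 'run_123';               // evaluation run that was analyzed
const originalSuiteId = 'suite_orig';  // suite targeting the original peer
const cloneSuiteId = 'suite_clone';    // suite targeting the clone (assumed setup)

const clone = await peer.clone(originalPeerId);               // hypothetical helper
const analysis = await evaluation.suggestImprovements(runId);

// Apply the high-priority suggestions to the clone only
await peer.applyImprovements(clone.id, {
  suggestions: analysis.suggestions
    .filter(s => s.priority === 'high')
    .map(s => s.id)
});

// Evaluate both versions and compare
const originalRun = await evaluation.execute(originalSuiteId);
const cloneRun = await evaluation.execute(cloneSuiteId);
console.log('Original:', originalRun.averageScore, 'Clone:', cloneRun.averageScore);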

Common Scenarios

Scenario 1: Low Overall Scores

Symptoms:

  • Overall score < 60%
  • Failing across multiple question types
  • Inconsistent responses

Typical Suggestions:

  1. Major prompt restructuring
  2. Enable essential tools
  3. Add comprehensive knowledge base
  4. Adjust temperature down

Approach:

  • Apply high-priority prompt changes first
  • Add tools that fill knowledge gaps
  • Re-evaluate after each major change

Scenario 2: Specific Topic Failures

Symptoms:

  • High scores overall
  • Consistent failures in one category
  • Missing domain knowledge

Typical Suggestions:

  1. Add topic-specific instructions to prompt
  2. Enable specialized tools or datasources
  3. Add targeted knowledge base content

Approach:

  • Apply category-specific suggestions
  • Test with focused evaluation suite
  • Expand to related categories

Scenario 3: Inconsistent Quality

Symptoms:

  • High variance in scores
  • Same question gets different answers
  • Unpredictable behavior

Typical Suggestions:

  1. Lower temperature setting
  2. Add more explicit constraints to prompt
  3. Use more deterministic model
  4. Add examples to prompt

Approach:

  • Apply temperature changes first
  • Add constraints incrementally
  • Test for consistency improvement

Limitations & Considerations

What AI Analysis Can Do

✅ Identify patterns in evaluation failures
✅ Suggest configuration improvements
✅ Recommend relevant tools and datasources
✅ Propose prompt enhancements
✅ Prioritize changes by expected impact

What AI Analysis Cannot Do

❌ Guarantee specific score improvements
❌ Fix fundamental knowledge gaps without data
❌ Optimize for contradictory goals simultaneously
❌ Predict all edge case behaviors
❌ Replace human judgment and testing

Important Notes

  1. Suggestions are Recommendations: Always review before applying
  2. Context Matters: AI doesn't know your full business context
  3. Test After Changes: Always validate with fresh evaluation runs
  4. Iterative Process: Multiple rounds may be needed for optimal results
  5. Data Quality: Analysis quality depends on evaluation quality

Troubleshooting

Analysis Taking Too Long

Problem: Analysis doesn't complete or times out

Solutions:

  • Ensure evaluation run completed successfully
  • Reduce number of questions if >100
  • Try again after a few minutes
  • Check API rate limits

Suggestions Don't Improve Scores

Problem: Applied suggestions but scores didn't increase

Possible Causes:

  1. Expected answers may be unrealistic: Review and adjust expectations
  2. Conflicting suggestions applied: Try one at a time
  3. Insufficient test coverage: Add more diverse questions
  4. Fundamental capability gap: May need different model or architecture

Generic or Unhelpful Suggestions

Problem: Suggestions are too vague or not actionable

Solutions:

  • Ensure evaluation has detailed failure data
  • Add more context to questions and expected answers
  • Run evaluation with multiple evaluators
  • Provide more specific peer description and purpose

API Reference

Request Analysis

javascript
POST /api/v1/evaluation/:runId/suggest-improvements

Response:
{
  "analysis": {
    "summary": "Overall assessment...",
    "suggestions": [
      {
        "id": "sugg_1",
        "category": "prompt",
        "priority": "high",
        "title": "Enhance system prompt...",
        "description": "Detailed explanation...",
        "changes": { /* specific modifications */ },
        "expectedImpact": "+12%"
      }
    ],
    "overallImprovement": "+15-20%"
  }
}

Apply Suggestions

javascript
POST /api/v1/peer/:peerId/apply-improvements
{
  "suggestions": ["sugg_1", "sugg_2"],
  "preview": false
}

Response:
{
  "applied": 2,
  "changes": { /* actual modifications made */ },
  "backup": { /* previous configuration */ }
}

Get Analysis History

javascript
GET /api/v1/peer/:peerId/analysis-history

Response:
{
  "analyses": [
    {
      "id": "analysis_1",
      "timestamp": "2025-10-20T10:30:00Z",
      "evaluationRunId": "run_123",
      "suggestionsCount": 5,
      "appliedCount": 3,
      "scoreImprovement": "+14%"
    }
  ]
}

Integration Examples

Automated Optimization Workflow

javascript
// 1. Run evaluation
const run = await evaluation.execute(suiteId);

// 2. Auto-analyze if score is low
if (run.averageScore < 0.7) {
  const analysis = await evaluation.suggestImprovements(run.id);
  
  // 3. Auto-apply high-priority suggestions
  const highPriority = analysis.suggestions
    .filter(s => s.priority === 'high');
  
  await peer.applyImprovements(peerId, {
    suggestions: highPriority.map(s => s.id)
  });
  
  // 4. Re-run evaluation
  const newRun = await evaluation.execute(suiteId);
  
  console.log(`Improvement: ${newRun.averageScore - run.averageScore}`);
}

Scheduled Optimization

javascript
// Run weekly optimization
cron.schedule('0 2 * * 0', async () => {
  const peers = await peer.list({ autoOptimize: true });
  
  for (const p of peers) {
    // Run evaluation
    const run = await evaluation.execute(p.defaultSuiteId);
    
    // Get suggestions
    const analysis = await evaluation.suggestImprovements(run.id);
    
    // Notify admin
    await notifications.send({
      to: 'admin@company.com',
      subject: `${p.name} Optimization Report`,
      body: analysis.summary,
      suggestions: analysis.suggestions
    });
  }
});

Summary

AI-Powered Analysis accelerates the optimization process by automatically identifying issues and providing expert-level recommendations. By combining evaluation data with AI insights, you can systematically improve your peers' performance with minimal manual effort.

Best Workflow:

  1. Create comprehensive evaluation suite
  2. Run initial evaluation (baseline)
  3. Request AI analysis
  4. Review and apply suggestions
  5. Re-run evaluation (measure improvement)
  6. Iterate until target performance is reached
  7. Schedule regular evaluations for monitoring

This continuous improvement loop ensures your peers maintain high quality and adapt to changing requirements over time.
