Voice Conversation
Voice Conversation enables real-time voice interactions with your AI peers, providing a natural, hands-free way to communicate. It relies on real-time audio processing and Voice Activity Detection (VAD), so you can speak and listen without pressing buttons between turns.
Overview
The Voice Conversation feature allows users to:
- Speak directly to AI peers instead of typing
- Receive audio responses from peers
- Experience real-time conversation flow
- Use hands-free interaction mode
Key Features
1. Voice Activity Detection (VAD)
VAD automatically detects when you're speaking and when you've finished:
- Auto-detection: System knows when you start and stop speaking
- Natural Flow: No need to press buttons to start/stop
- Background Noise Filtering: Distinguishes speech from ambient noise
- Smart Pausing: Waits for you to finish before peer responds
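To make the start/stop detection and "smart pausing" behaviour more concrete, here is a minimal sketch of an energy-based VAD. It is only an illustration: the product's actual detection method is not described here, and the names `detectSpeech`, `energyThreshold`, and `hangoverFrames` are hypothetical. The idea is that a frame counts as speech when its RMS level crosses a threshold, and an utterance ends only after several quiet frames in a row, so a short pause does not end the turn.

```typescript
// Illustrative energy-based VAD. Names and thresholds are assumptions,
// not the product's implementation.

type VadEvent = { type: "speechStart" | "speechEnd"; frameIndex: number };

interface VadOptions {
  energyThreshold: number; // RMS level at or above which a frame counts as speech
  hangoverFrames: number;  // consecutive quiet frames required to end an utterance
}

// Root-mean-square level of one frame of samples in the range [-1, 1].
function frameRms(frame: readonly number[]): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of frame) sumSquares += sample * sample;
  return Math.sqrt(sumSquares / frame.length);
}

// Walk the frames in order and emit a start event when speech begins and
// an end event once the speaker has been quiet for `hangoverFrames` frames.
// If the input ends while speech is still active, no end event is emitted.
function detectSpeech(
  frames: readonly (readonly number[])[],
  options: VadOptions
): VadEvent[] {
  const events: VadEvent[] = [];
  let speaking = false;
  let quietRun = 0;

  frames.forEach((frame, frameIndex) => {
    const loud = frameRms(frame) >= options.energyThreshold;
    if (loud) {
      quietRun = 0; // any speech frame resets the pause counter
      if (!speaking) {
        speaking = true;
        events.push({ type: "speechStart", frameIndex });
      }
    } else if (speaking) {
      quietRun += 1;
      if (quietRun >= options.hangoverFrames) {
        speaking = false;
        quietRun = 0;
        events.push({ type: "speechEnd", frameIndex });
      }
    }
  });

  return events;
}

// Usage: a single quiet frame in the middle of speech is treated as a pause,
// not as the end of the utterance.
const quiet = [0.01, -0.01];
const loud = [0.5, -0.5];
const vadEvents = detectSpeech(
  [quiet, loud, loud, quiet, loud, quiet, quiet, quiet],
  { energyThreshold: 0.1, hangoverFrames: 2 }
);
```

Real detectors usually add smoothing or a trained model on top of this, but the hangover counter is the part that corresponds to waiting for you to finish before the peer responds.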
2. Real-time Audio Processing
Audio is processed in real-time for immediate interaction:
- Low Latency: Minimal delay between speech and response
- High Quality: Clear audio input and output
- Continuous Conversation: Maintains context across voice exchanges
- Audio Visualization: Visual feedback during recording and playback
3. Bootstrap Audio Support
The system supports bootstrap mode for initial audio setup:
- Quick Start: Fast initialization of voice conversation
- Audio Testing: Built-in mic and speaker testing
- Settings Adjustment: Configure audio preferences before starting
Using Voice Conversation
Starting a Voice Conversation
- Navigate to your peer's chat interface
- Click the microphone icon in the message input area
- Allow microphone permissions when prompted
- Start speaking when ready
[Microphone Icon] - Click to enable voice mode
During the Conversation
While Speaking:
- Microphone icon shows active recording
- Visual waveform displays your voice input
- Speak naturally - VAD detects pauses
Peer Response:
- Audio response plays automatically
- Text transcription shown simultaneously
- Visual indicator during playback
Controls:
- Pause/Resume: Pause conversation anytime
- Stop: End voice mode and return to text
- Volume: Adjust output volume
Best Practices
For Clear Recognition
✅ Do:
- Speak clearly and at a normal pace
- Use a quality microphone
- Minimize background noise
- Wait for peer response before speaking again
❌ Avoid:
- Speaking too fast or too slow
- Multiple people speaking simultaneously
- Very noisy environments
- Interrupting during peer response
Conversation Tips
- Start with Context: Begin with clear context about your question
- Natural Language: Speak as you would to a person
- One Topic at a Time: Complete one thought before moving to the next
- Confirm Understanding: Ask peer to confirm if needed
Compatibility & Requirements
Voice Conversation requires browser permissions to function:
- Microphone Access: To capture voice input.
- Audio Playback: To play peer audio responses.
Grant these permissions when prompted by your browser to enable voice features.
Voice Settings
Configuring Voice Options
Navigate to Peer Settings > Voice to configure the following settings:
- Voice Enabled: Toggle voice interactions on or off.
- Language: Select the default language for voice conversations.
- Voice Gender: Choose between Male, Female, or Neutral.
- Speaking Rate: Adjust speed from 0.5x to 2.0x.
- Pitch: Adjust the voice pitch level.
- VAD Sensitivity: Adjust Voice Activity Detection (VAD) sensitivity.
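As an illustration of how these settings might be represented and validated in client code, the sketch below defines a hypothetical `VoiceSettings` shape and a `normalizeSettings` helper. Neither name comes from the product. The speaking-rate range (0.5 to 2.0) and pitch range (-10 to +10 semitones) are the documented limits; the default values and the `en-US` language tag format are assumptions.

```typescript
// Hypothetical settings shape mirroring the options listed above.
type VoiceGender = "male" | "female" | "neutral";
type VadSensitivity = "low" | "medium" | "high";

interface VoiceSettings {
  voiceEnabled: boolean;
  language: string;        // tag format such as "en-US" is an assumption
  voiceGender: VoiceGender;
  speakingRate: number;    // documented range: 0.5 to 2.0
  pitch: number;           // documented range: -10 to +10 semitones
  vadSensitivity: VadSensitivity;
}

// Assumed defaults. Only "medium" sensitivity is documented as recommended.
const DEFAULT_SETTINGS: VoiceSettings = {
  voiceEnabled: true,
  language: "en-US",
  voiceGender: "neutral",
  speakingRate: 1.0,
  pitch: 0,
  vadSensitivity: "medium",
};

// Clamp a number into [min, max]; fall back when it is missing or not finite.
function clampOrDefault(
  value: number | undefined,
  min: number,
  max: number,
  fallback: number
): number {
  if (value === undefined || !isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Fill in missing fields and force numeric fields into their documented ranges.
function normalizeSettings(input: Partial<VoiceSettings>): VoiceSettings {
  return {
    voiceEnabled: input.voiceEnabled ?? DEFAULT_SETTINGS.voiceEnabled,
    language: input.language ?? DEFAULT_SETTINGS.language,
    voiceGender: input.voiceGender ?? DEFAULT_SETTINGS.voiceGender,
    speakingRate: clampOrDefault(input.speakingRate, 0.5, 2.0, DEFAULT_SETTINGS.speakingRate),
    pitch: clampOrDefault(input.pitch, -10, 10, DEFAULT_SETTINGS.pitch),
    vadSensitivity: input.vadSensitivity ?? DEFAULT_SETTINGS.vadSensitivity,
  };
}

// Usage: out-of-range values are pulled back to the nearest allowed value.
const normalized = normalizeSettings({ speakingRate: 3, pitch: -25, language: "es-MX" });
```

Clamping rather than rejecting out-of-range values is a design choice made for this sketch; a settings form could equally show a validation error instead.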
Available Options
Language Selection:
- English (US, UK, AU)
- Spanish (ES, MX)
- French
- German
- And more...
Voice Characteristics:
- Gender: Male, Female, Neutral
- Rate: 0.5x - 2.0x speed
- Pitch: -10 to +10 semitones
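For a sense of what these ranges mean in practice, the following sketch converts them into more familiar quantities. It uses the standard equal-temperament relation (one semitone is a frequency factor of 2^(1/12)) and assumes that speaking rate scales playback duration linearly. Whether the voice engine applies pitch and rate exactly this way is an assumption, and both function names are hypothetical.

```typescript
// Frequency multiplier for a pitch shift in semitones (equal temperament).
// +12 semitones doubles the frequency; -12 halves it.
function semitonesToFrequencyRatio(semitones: number): number {
  return Math.pow(2, semitones / 12);
}

// Approximate time needed to speak a response at a given rate, assuming
// duration scales inversely with the rate multiplier.
function spokenDurationSeconds(baseSeconds: number, rate: number): number {
  if (rate <= 0) throw new RangeError("rate must be positive");
  return baseSeconds / rate;
}

// Usage: the extremes of the documented ranges.
const highestPitchRatio = semitonesToFrequencyRatio(10); // about 1.78x frequency
const lowestPitchRatio = semitonesToFrequencyRatio(-10); // about 0.56x frequency
const fastest = spokenDurationSeconds(10, 2.0);          // a 10 s response takes 5 s
const slowest = spokenDurationSeconds(10, 0.5);          // a 10 s response takes 20 s
```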
VAD Sensitivity:
- High: Detects even quiet speech, may trigger on noise
- Medium: Balanced detection (recommended)
- Low: Only clear speech triggers, may miss soft talking
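The trade-off between the three levels can be pictured as a detection threshold: the higher the sensitivity, the lower the signal level that counts as speech. The sketch below shows this with placeholder numbers. The threshold values and the names `ENERGY_THRESHOLDS` and `isSpeechFrame` are assumptions for illustration, not the product's real configuration.

```typescript
type SensitivityLevel = "high" | "medium" | "low";

// Placeholder RMS thresholds on a 0..1 scale. Higher sensitivity means a
// lower threshold, so quieter sounds (including noise) count as speech.
const ENERGY_THRESHOLDS: Record<SensitivityLevel, number> = {
  high: 0.01,
  medium: 0.02,
  low: 0.05,
};

// True when a frame with the given RMS level would be treated as speech.
function isSpeechFrame(rms: number, level: SensitivityLevel): boolean {
  return rms >= ENERGY_THRESHOLDS[level];
}

// Usage: a faint sound is picked up only at High, while soft talking
// is missed at Low.
const faintSoundAtHigh = isSpeechFrame(0.015, "high");     // true
const faintSoundAtMedium = isSpeechFrame(0.015, "medium"); // false
const softTalkAtLow = isSpeechFrame(0.03, "low");          // false
```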
Use Cases
Customer Support
Enable voice conversations for:
- Phone-like Experience: Familiar interaction mode
- Hands-free Support: Users can multitask
- Accessibility: Easier for some users than typing
- Quick Queries: Faster than typing for simple questions
Example:
User: "What's the status of my order number 12345?"
Peer: [Voice Response] "Your order 12345 shipped yesterday
and is expected to arrive on October 23rd."
Virtual Assistant
Use voice for:
- Task Commands: "Schedule a meeting for tomorrow at 2 PM"
- Information Lookup: "What's on my calendar today?"
- Quick Actions: "Send email to John about the project"
- Reminders: "Remind me to call Sarah in 30 minutes"
Education & Training
Voice conversations for:
- Language Learning: Practice pronunciation
- Interactive Lessons: Verbal Q&A sessions
- Accessibility: Support for reading difficulties
- Engagement: More engaging than text for some learners
Healthcare Support
Medical assistants using voice:
- Symptom Checking: Describe symptoms verbally
- Medication Reminders: Audio reminders and confirmations
- Emergency Assistance: Hands-free guidance
- Patient Comfort: More personal interaction
Troubleshooting
Microphone Not Working
Problem: Voice input not detected
Solutions:
- Check browser permissions - microphone should be allowed
- Select correct microphone in system settings
- Test microphone in browser settings
- Try a different browser
- Restart browser or device
Poor Recognition Quality
Problem: Peer doesn't understand speech correctly
Solutions:
- Reduce background noise
- Speak closer to microphone
- Adjust VAD sensitivity: raise it to "High" if quiet speech is being missed, or lower it if background noise triggers detection
- Use headset microphone instead of laptop mic
- Check internet connection stability
Audio Response Issues
Problem: Can't hear peer responses
Solutions:
- Check system volume settings
- Verify browser audio permissions
- Test with different output device
- Clear browser cache
- Check for conflicting extensions
Connection Drops
Problem: Voice conversation disconnects
Solutions:
- Check internet connection stability
- Pause video calls or streaming on the same network
- Try a wired connection instead of Wi-Fi
- Clear browser cache and cookies
- Contact support if issue persists
Privacy & Security
Data Handling
Audio Processing:
- Audio converted to text on server
- Original audio can be discarded after processing
- Text follows same security as typed messages
Storage:
- Audio files encrypted in transit
- Optional: Store audio for quality improvement
- User can request deletion of audio data
Privacy Options:
- Disable voice history storage
- Auto-delete after conversation
- GDPR-compliant data handling
Security Considerations
✅ Secure Practices:
- All audio transmitted over HTTPS
- Server-side audio processing in secure environment
- No third-party access to audio data
- Regular security audits
⚠️ User Responsibility:
- Avoid sharing sensitive information by voice if you have privacy concerns
- Use in private settings for confidential topics
- Review voice conversation transcripts
- Report any security concerns
Advanced Features
Voice Commands
Enable special voice commands:
"Start over" - Clear conversation and begin fresh
"Repeat that" - Peer repeats last response
"Slower please" - Reduce speaking rate temporarily
"Spell that" - Spell out specific words
"Switch to text" - Return to text mode
Multi-Language Support
Switch languages mid-conversation:
User: "Hola, ¿cómo estás?"
Peer: [Detects Spanish] "¡Hola! Estoy bien, gracias..."
Voice Macros
Create voice shortcuts for common requests:
"Status update" → Triggers pre-defined status report
"Daily briefing" → Morning summary of tasks/events
"Quick help" → Opens help menu
Performance Optimization
Reducing Latency
- Streaming Audio: Enable audio streaming for faster response
- Local VAD: Process VAD client-side when possible
- Compression: Use compressed audio formats
- CDN: Serve audio responses from CDN
Bandwidth Considerations
Audio Upload:
- Typical: 128 kbps (16 KB/s)
- 1 minute conversation: ~960 KB
Audio Download:
- Peer response: 64-128 kbps
- Varies by speaking rate and quality
Recommendations:
- Minimum: 256 kbps connection
- Recommended: 1 Mbps or higher
- Mobile: Consider data usage limits
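The figures above follow from simple unit conversion: 8 kilobits make 1 kilobyte, so 128 kbps is 16 KB/s, and 60 seconds at that rate is 960 KB. The sketch below reproduces that arithmetic. The function names are hypothetical. The `meetsMinimum` check assumes upload and download run at the same time, which would explain the 256 kbps minimum (128 up plus 128 down), but how that minimum was actually derived is not stated.

```typescript
// Convert a bitrate in kilobits per second to kilobytes per second.
function kilobitsToKilobytes(kilobits: number): number {
  return kilobits / 8;
}

// Data transferred, in kilobytes, for a stream of the given bitrate and length.
function transferKilobytes(bitrateKbps: number, seconds: number): number {
  return kilobitsToKilobytes(bitrateKbps) * seconds;
}

// Assumption: upload and download may overlap, so the connection needs to
// carry both streams at once.
function meetsMinimum(
  connectionKbps: number,
  uploadKbps: number,
  downloadKbps: number
): boolean {
  return connectionKbps >= uploadKbps + downloadKbps;
}

// Usage with the figures from this section.
const uploadRateKBps = kilobitsToKilobytes(128);      // 16 KB/s
const oneMinuteUploadKB = transferKilobytes(128, 60); // 960 KB
const minimumIsEnough = meetsMinimum(256, 128, 128);  // true
```

On a metered mobile plan, the same helper gives a rough estimate of usage: a ten-minute conversation at 128 kbps uploads about 9,600 KB, before counting the peer's audio responses.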
Related Features
- Introduction - Getting started with peers
- Peer Settings - Configure peer behavior
- Evaluation System - Test voice interactions
FAQs
Q: Can I use voice conversation on mobile?
A: Yes, voice conversation works on mobile browsers that support microphone capture through the MediaDevices API (getUserMedia).
Q: Is voice conversation free?
A: Voice processing may consume additional credits depending on your plan. Check pricing for details.
Q: Can multiple users use voice in the same conversation?
A: Currently, voice is single-user. Use text for multi-user conversations.
Q: What languages are supported?
A: We support 20+ languages. Check settings for the complete list.
Q: Can I download conversation audio?
A: Yes, audio can be downloaded from conversation history if storage is enabled.
Q: How accurate is voice recognition?
A: Accuracy is typically 95%+ in good conditions with clear speech.
Need Help? Contact support or visit our Community Forum for assistance with voice conversations.

