-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: update STT server urls for pt and ca #186
Conversation
WalkthroughThe pull request introduces changes to two configuration files related to speech-to-text (STT) settings. In Changes
Possibly related PRs
Suggested labels
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## dev #186 +/- ##
=====================================
Coverage ? 0.00%
=====================================
Files ? 10
Lines ? 837
Branches ? 0
=====================================
Hits ? 0
Misses ? 837
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
ovos_config/recommends/online_stt/pt-pt.conf (1)
5-9
: Consider documenting endpoint capabilitiesEach endpoint might have different characteristics (e.g., model size, supported languages, response times). Consider adding comments or documentation about their specific capabilities.
"urls": [ + # General purpose STT service "https://stt.smartgic.io/mynorthai/stt", + # Faster Whisper instance 1 "https://stt.smartgic.io/fasterwhisper/stt", + # Faster Whisper instance 2 "https://fasterwhisper.ziggyai.online/stt" ]ovos_config/recommends/online_stt/ca-es.conf (1)
Line range hint
1-13
: Consider documenting the failover behavior.The configuration now supports multiple URLs, but it might be helpful to document whether they are used in order (failover) or distributed (load balancing).
Consider adding a comment in the configuration file or documentation about the URL selection strategy.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
ovos_config/recommends/online_stt/ca-es.conf
(1 hunks)ovos_config/recommends/online_stt/pt-pt.conf
(1 hunks)
🔇 Additional comments (3)
ovos_config/recommends/online_stt/pt-pt.conf (1)
4-10
: Verify the availability and security of STT endpoints
The new configuration structure with multiple URLs provides better fallback options, which is good for reliability. However, we should verify these endpoints.
Consider implementing:
- Load balancing strategy across these endpoints
- Automatic failover mechanism
- Response time monitoring to prioritize faster endpoints
ovos_config/recommends/online_stt/ca-es.conf (2)
Line range hint 1-13
: Verify Portuguese (pt-pt) configuration changes.
The PR objectives mention updates for both Portuguese and Catalan STT servers, but only the Catalan configuration is provided for review.
Let's check for the Portuguese configuration file:
✅ Verification successful
Portuguese (pt-pt) STT configuration is present and properly configured
The Portuguese STT configuration is present at ovos_config/recommends/online_stt/pt-pt.conf
with appropriate server URLs configured:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Locate and examine the Portuguese STT configuration
# Expected: Find and show content of pt-pt.conf
echo "Searching for Portuguese STT config..."
fd -t f "pt-pt.conf" | while read -r file; do
echo "Found: $file"
cat "$file"
done
Length of output: 2329
6-8
: LGTM! The addition of Aina STT endpoints enhances Catalan language support.
The change from a single URL to an array format is a good improvement, allowing for better failover and load balancing. The new Aina STT endpoints specifically designed for Catalan should provide better recognition accuracy.
Let's verify the availability of the new endpoints:
✅ Verification successful
Both new STT endpoints return error responses (405 Method Not Allowed and 403 Forbidden)
The HTTP status checks reveal potential issues:
- https://stt.smartgic.io/aina/stt returns 405 (Method Not Allowed)
- https://ainastt.ziggyai.online/stt returns 403 (Forbidden)
While these endpoints might require specific authentication or POST requests for actual STT operations, we should verify if these are the expected responses for HEAD requests. Let's check if there's any documentation or similar configurations in the codebase.
Let's check the content of the Portuguese config since it's also mentioned in the PR objectives and uses the same "urls" format.
STT endpoints return errors, but this appears to be expected behavior
The configuration follows the established pattern seen in other language configs (e.g., pt-pt.conf) where:
- Both use the same domain pattern (smartgic.io and ziggyai.online)
- The 405/403 responses are expected as these endpoints require POST requests with audio data
- The URLs maintain fallback options (Aina-specific endpoints followed by Citrinet endpoints for Catalan)
The configuration structure is consistent with other language configs in the repository and maintains redundancy for better availability.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the new STT endpoints are accessible
# Expected: HTTP 200 OK or similar valid response
for url in "https://stt.smartgic.io/aina/stt" "https://ainastt.ziggyai.online/stt"; do
echo "Checking $url..."
curl -I -s -L "$url" | head -n 1
done
Length of output: 531
Script:
#!/bin/bash
# Search for documentation or similar STT configurations
echo "Searching for STT configuration files..."
fd -e conf -e md . ovos_config/recommends/online_stt/
echo -e "\nSearching for documentation mentioning these endpoints..."
rg -i "stt.smartgic.io|ainastt.ziggyai" --type md --type txt
echo -e "\nChecking other language configs for similar patterns..."
rg "urls|url" ovos_config/recommends/online_stt/
Length of output: 1049
Script:
#!/bin/bash
# Check content of both configs for comparison
echo "Content of ca-es.conf:"
cat ovos_config/recommends/online_stt/ca-es.conf
echo -e "\nContent of pt-pt.conf:"
cat ovos_config/recommends/online_stt/pt-pt.conf
# Check if these endpoints are used in other configurations
echo -e "\nChecking for similar endpoint patterns across configs:"
rg -A 2 "\"url.*\": \[" ovos_config/recommends/online_stt/
Length of output: 1590
new dedicated STT models for catalan and portuguese
Summary by CodeRabbit
New Features
Improvements