62 Commits

Author SHA1 Message Date
cogwheel0
d092bb2e44 fix(audio): optimize audio configuration for iOS and Android platforms 2025-11-27 18:41:41 +05:30
cogwheel0
1e841e03f6 feat(voip): add CallKit availability check for iOS devices 2025-11-26 20:09:34 +05:30
cogwheel0
bccc0135ad feat(voice-input): Optimize VAD parameters for improved speech detection 2025-11-24 14:17:51 +05:30
cogwheel0
83d59fb294 fix(voice-input): Simplify VAD recording stop logic by removing redundant condition 2025-11-24 14:00:14 +05:30
cogwheel0
d38e986d7c feat(callkit): Add CallKit service for native call UI and permissions 2025-11-24 12:29:44 +05:30
cogwheel0
4b0c16b522 feat(voice-input): handle iOS simulator speech recognition 2025-11-21 13:39:19 +05:30
cogwheel0
84af6bbe86 feat(tts): Remove auto engine and fix ios STS 2025-11-21 13:39:19 +05:30
cogwheel0
36915fba09 feat(tts): ensure Android default TTS engine is set before speaking 2025-11-21 12:22:23 +05:30
cogwheel0
64173a2168 feat(file-attachment): improve base64 image data URL parsing validation 2025-11-13 12:39:09 +05:30
cogwheel
e95ff86f31 Merge pull request #147 from cogwheel0/voice-call-send-message-tool
feat(voice-call): send message with selected tool IDs
2025-11-13 12:33:03 +05:30
cogwheel0
c4764b0075 feat(voice-call): send message with selected tool IDs 2025-11-13 12:30:07 +05:30
cogwheel0
f885513a89 feat(voice): Improve voice input service with locale handling and permission checks 2025-11-13 12:21:59 +05:30
cogwheel0
b05d9f84a5 feat(tts): Add server-side speech synthesis and playback pipeline 2025-11-10 02:43:31 +05:30
cogwheel0
62c9243e34 feat: Replace mic_stream_recorder with vad and update iOS deployment target 2025-11-10 01:57:28 +05:30
cogwheel0
0d49309ad1 feat(tts): Refactor text splitting and offset computation for TTS 2025-11-05 00:59:57 +05:30
cogwheel0
3424af60f9 feat(l10n): Add silence duration settings for speech-to-text 2025-11-05 00:48:20 +05:30
cogwheel0
1bb2cbae25 feat(voice): add voice silence duration configuration 2025-11-05 00:33:17 +05:30
cogwheel0
a3b5c4f5b7 feat(audio): replace record package with mic_stream_recorder 2025-11-05 00:09:35 +05:30
cogwheel0
715849aff3 feat(tts): add speech rate support for text-to-speech generation 2025-11-03 00:44:24 +05:30
cogwheel0
1a570f4a08 feat(voice-input): improve server-side speech detection with silence auto-stop 2025-11-03 00:36:25 +05:30
cogwheel0
cfadeffd24 feat(tts): add auto mode for text-to-speech engine selection 2025-11-02 21:31:13 +05:30
cogwheel0
86339715b1 feat(sts): add server side speech-to-text 2025-11-02 19:03:36 +05:30
cogwheel0
5d33e5fe65 fix: server side tts on ios 2025-10-31 23:20:04 +05:30
cogwheel0
de0f195aea feat(tts): Improve text-to-speech service with enhanced error handling and state management 2025-10-30 21:42:35 +05:30
cogwheel0
44149d5f81 feat(tts): add server default voice retrieval and integrate it into 2025-10-30 16:10:20 +05:30
cogwheel0
a2d786109c feat(tts): initialize and sync TTS with app settings for calls 2025-10-24 00:58:46 +05:30
cogwheel0
56246507de feat(tts): add karaoke-style TTS progress bar to assistant UI
Add rendering and support for a karaoke-style text-to-speech progress
bar in assistant messages so users can see the currently spoken
sentence and highlighted word during playback.

- Append TTS karaoke bar to AssistantMessageWidget when the message is
  the active TTS target and playback is speaking/paused/loading.
- Implement _buildKaraokeBar to render the active sentence with a
  highlighted word span, using ConduitCard and theme styles.
- Import conduit_components for shared UI primitives.
- Extend TextToSpeechState with sentence data:
  sentences, sentenceOffsets, activeSentenceIndex, and per-word
  progress (wordStartInSentence, wordEndInSentence).
- Add provider callbacks wiring: onSentenceIndex and
  onDeviceWordProgress handlers (hooked into TTS backend).
- Prepare sentence splitting and word-progress plumbing in the TTS
  provider (prepares data used to drive the karaoke display).

This change improves UX by visually indicating the spoken sentence
and current word during TTS playback, aiding comprehension and
accessibility.
2025-10-23 17:05:35 +05:30
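The sentence and word-progress plumbing described in 56246507de can be illustrated with a short sketch. This is written in Python for brevity and is not the project's Dart code; `split_sentences` and `locate_word` are hypothetical names, and the regex-based splitting rule is an assumption. Only the idea of storing per-sentence offsets and deriving in-sentence word positions comes from the commit message.

```python
import re
from bisect import bisect_right


def split_sentences(text):
    """Return (sentences, offsets); offsets[i] is where sentences[i] starts in text."""
    sentences, offsets = [], []
    # Assumed rule: a sentence is a run of non-terminators plus trailing . ! or ?
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = match.group()
        stripped = raw.strip()
        if not stripped:
            continue  # skip whitespace-only fragments
        leading = len(raw) - len(raw.lstrip())
        sentences.append(stripped)
        offsets.append(match.start() + leading)
    return sentences, offsets


def locate_word(offsets, word_start, word_end):
    """Map absolute word offsets to (sentence index, start, end) within that sentence.

    offsets must be non-empty and sorted, as produced by split_sentences.
    """
    index = max(bisect_right(offsets, word_start) - 1, 0)
    base = offsets[index]
    return index, word_start - base, word_end - base
```

Given a word-progress callback that reports absolute character offsets, `locate_word` yields the values a UI would need to highlight the word inside the active sentence.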
cogwheel0
8ec411d6aa feat(tts): server chunked playback queue on server path
Refactor server-backed TTS path to perform sentence chunking and
queued playback via a dedicated _startServerChunkedPlayback method
instead of generating a single monolithic audio blob.

This change simplifies the server flow, avoids constructing an entire
audio buffer in memory, and enables smoother playback and error
recovery. On errors, the code still falls back to device TTS.
2025-10-23 16:46:24 +05:30
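A rough sketch of the queued, per-sentence playback that 8ec411d6aa describes, again in Python rather than the project's Dart. `play_chunked` and its callback parameters are hypothetical stand-ins; the real `_startServerChunkedPlayback` method is not shown in this log and may differ substantially.

```python
from collections import deque


def play_chunked(sentences, synthesize, play, fallback):
    """Synthesize and play one sentence at a time.

    synthesize(sentence) returns audio for a single sentence, play(audio)
    plays it, and fallback(remaining) receives the unplayed sentences when
    a step raises. Returns True if every sentence was played.
    """
    queue = deque(sentences)
    while queue:
        sentence = queue[0]
        try:
            # Only one sentence's audio is held at a time, never the whole text.
            play(synthesize(sentence))
        except Exception:
            # Hand the failed sentence and everything after it to the fallback.
            fallback(list(queue))
            return False
        queue.popleft()
    return True
```

Keeping the failed sentence at the head of the queue until it has actually been played means the fallback engine can resume exactly where the server path stopped.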
cogwheel0
561e7dd616 feat(tts): server-backed TTS engine selection
Introduce server TTS support and engine selection while keeping
device TTS as the default.

- Add new persistence keys for storing TTS engine and selected
  server voice (ttsEngine, ttsServerVoiceId, ttsServerVoiceName).
- Extend TextToSpeechService to support two engines:
  TtsEngine.device (FlutterTts) and TtsEngine.server (remote audio).
- Wire in an AudioPlayer and optional ApiService to fetch raw
  audio bytes from the server and play them, with event hooks
  mapped to existing lifecycle callbacks.
- Implement fallback to device TTS on server errors or empty
  responses, and ensure player lifecycle (pause/stop/dispose)
  is handled when using server engine.
- Allow engine and preferred voice to be configured before
  initialization and updated at runtime via updateSettings.

This enables selecting a server-side voice and using a remote
TTS provider while preserving compatibility with the existing
device TTS implementation.
2025-10-23 16:31:15 +05:30
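The engine selection and fallback rule from 561e7dd616 can be summarized in a few lines. This Python sketch borrows the `TtsEngine` name from the commit message; the `speak` function and its callback parameters are assumptions made for illustration and do not reflect the actual `TextToSpeechService` API.

```python
from enum import Enum


class TtsEngine(Enum):
    DEVICE = "device"
    SERVER = "server"


def speak(text, engine, fetch_server_audio, play_audio, speak_on_device):
    """Speak text with the requested engine and return the engine actually used.

    The server path is tried only when selected; a raised error or an empty
    response falls through to device TTS.
    """
    if engine is TtsEngine.SERVER:
        try:
            audio = fetch_server_audio(text)
        except Exception:
            audio = None  # treat a server error like an empty response
        if audio:
            play_audio(audio)
            return TtsEngine.SERVER
    speak_on_device(text)
    return TtsEngine.DEVICE
```

Returning the engine that was actually used lets a caller tell whether a fallback happened without inspecting logs.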
cogwheel0
2f8fd97022 refactor: Enhance file attachment handling and UI components
- Updated the file attachment service to utilize a new LocalAttachment class, improving the management of file metadata such as display names.
- Refactored methods for picking and uploading files to accommodate the new LocalAttachment structure, ensuring consistent handling of file attributes.
- Improved the chat page to validate and manage file attachments more effectively, enhancing user experience during file uploads.
- Added functionality for image previews in the file attachment widget, allowing users to see selected images before sending.
- Introduced a remove button for attachments, improving usability by enabling users to easily discard unwanted files.
2025-10-19 13:50:54 +05:30
cogwheel0
6c81d68e59 feat: Add Text-to-Speech settings and customization options
- Introduced new preference keys for TTS settings: voice, speech rate, pitch, and volume.
- Updated SettingsService to handle TTS settings and persist them.
- Enhanced AppSettings to include TTS-related properties.
- Implemented TTS settings UI in AppCustomizationPage, allowing users to select voice and adjust speech parameters.
- Added localization support for TTS settings in multiple languages.
2025-10-17 14:40:44 +05:30
cogwheel0
4eb1191748 feat: enhance background streaming functionality with improved wake lock management
- Updated the wake lock duration in BackgroundStreamingHandler to 3 hours, ensuring the service remains active for longer periods.
- Modified the keepAlive method to support both iOS and Android, allowing for better background task management across platforms.
- Implemented a periodic keep-alive timer in VoiceCallService to refresh the wake lock every 5 minutes, enhancing service reliability during voice calls.
- Added debug logging for successful keep-alive invocations, improving traceability of background operations.
2025-10-10 19:59:17 +05:30
cogwheel0
a9030473b0 feat: enhance background streaming handler with microphone support
- Updated BackgroundStreamingHandler to include microphone permission handling for background execution.
- Modified startBackgroundExecution method to accept a requiresMicrophone parameter, allowing dynamic management of streams requiring microphone access.
- Adjusted service intent to pass microphone requirement status, improving service behavior based on app state.
- Enhanced VoiceCallService to utilize the new microphone support during voice call streaming, ensuring proper resource management.
2025-10-09 16:18:14 +05:30
cogwheel0
259fe3f9f0 feat: implement self-signed certificate support in API and UI
- Added support for self-signed TLS certificates in the ApiService, allowing configuration based on server settings.
- Introduced a toggle in the ServerConnectionPage to enable or disable trusting self-signed certificates.
- Updated localization files to include new strings for self-signed certificate settings in multiple languages.
- Enhanced the OptimizedStorageService to manage trusted servers based on user preferences for self-signed certificates.
- Improved error handling and logging throughout the affected services to ensure clarity and maintainability.
2025-10-09 01:49:56 +05:30
cogwheel0
fabb1df63a feat: enhance text-to-speech functionality with markdown support
- Integrated markdown conversion in TextToSpeechController to clean text before speech synthesis, ensuring only valid content is spoken.
- Updated VoiceCallService to utilize markdown conversion for responses, improving the clarity of spoken content.
- Enhanced VoiceCallPage to display cleaned text from markdown, providing a better user experience during voice interactions.
2025-10-09 00:20:36 +05:30
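To make the markdown cleaning step in fabb1df63a concrete, here is a minimal Python sketch of stripping markup before speech synthesis. The function name and the specific regex rules are assumptions; the commit does not say how the project's converter works, and a real implementation would more likely use a proper markdown parser.

```python
import re


def markdown_to_speech_text(markdown):
    """Reduce markdown to plain text suitable for a TTS engine (simplified rules)."""
    text = re.sub(r"`{3}.*?`{3}", " ", markdown, flags=re.DOTALL)  # drop fenced code
    text = re.sub(r"`([^`]*)`", r"\1", text)                 # unwrap inline code
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)   # images: keep alt text
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)    # links: keep link text
    text = re.sub(r"^\s{0,3}#{1,6}\s+", "", text, flags=re.MULTILINE)  # headings
    text = re.sub(r"^\s*[-*+]\s+", "", text, flags=re.MULTILINE)       # bullets
    text = re.sub(r"(\*\*|__|\*|_)(.+?)\1", r"\2", text)    # bold and italic
    return re.sub(r"\s+", " ", text).strip()                # collapse whitespace
```

One known limitation of this sketch: the emphasis rule also removes underscores inside identifiers such as `snake_case_name`, so text containing code-like words outside backticks would need extra handling.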
cogwheel0
96202c7453 refactor: streamline background streaming service and notification handling
- Updated BackgroundStreamingService to use a minimal notification for foreground service, enhancing clarity and compliance with Android requirements.
- Removed redundant notification updates and logging statements in VoiceCallService to improve code readability and maintainability.
- Adjusted notification channel settings for better background service management.
2025-10-09 00:10:08 +05:30
cogwheel0
e98f5cbf0f feat: integrate flutter_local_notifications for enhanced voice call notifications
- Added flutter_local_notifications dependency to manage notifications during voice calls.
- Implemented notification handling in VoiceCallService to update call status and manage user interactions.
- Enabled wake lock functionality to keep the screen on during calls and prevent audio interruptions.
- Updated AndroidManifest.xml to include necessary permissions for Bluetooth and foreground services.
- Enhanced notification actions to allow users to mute, unmute, or end calls directly from notifications.
2025-10-09 00:01:35 +05:30
cogwheel0
ea79a193be feat: enhance voice call functionality and response handling
- Introduced a new boolean flag `_isSpeaking` in VoiceCallService to manage speaking state during voice interactions.
- Improved response handling by extracting incremental content from socket events and updating the accumulated response accordingly.
- Updated the chat page to include a voice call button, allowing users to initiate voice calls directly from the chat interface.
- Enhanced the modern chat input widget to support voice call functionality, providing a seamless user experience for initiating calls.
2025-10-08 19:09:57 +05:30
cogwheel0
7dd41ebf60 refactor: clean up logging and improve error handling in voice call service
- Removed unnecessary print statements from VoiceCallService to enhance code clarity and maintainability.
- Improved error handling by ensuring that exceptions are properly caught and handled without excessive logging.
- Updated the VoiceCallPage to streamline error dialog presentation, removing redundant console logs while maintaining user feedback.
- Enhanced the use of color values in UI components for better readability and consistency.
2025-10-08 13:38:56 +05:30
cogwheel0
4f6c10c857 feat: enhance text-to-speech and voice call services
- Added volume, speech rate, and pitch settings to the TextToSpeechService for improved audio control.
- Reset the accumulated response in VoiceCallService before sending messages to ensure accurate response handling.
- Enhanced the handling of socket events in VoiceCallService to manage streaming content and completion more effectively.
- Improved logging for better debugging and tracking of TTS and voice call states.
2025-10-08 13:35:24 +05:30
cogwheel0
b673921002 feat: add voice call functionality to chat page
- Introduced a new button in the chat page's app bar to initiate voice calls.
- Implemented the _handleVoiceCall method to navigate to the VoiceCallPage.
- Enhanced user experience by providing a direct way to start voice calls from the chat interface.
2025-10-08 13:04:28 +05:30
cogwheel0
8a8ba76298 refactor: update providers to use keepAlive for enhanced state management
- Changed multiple provider annotations to `@Riverpod(keepAlive: true)` to improve state retention and management across the application.
- This update aligns with recent enhancements in state management practices, ensuring better performance and user experience throughout the app.
2025-10-01 18:32:16 +05:30
cogwheel0
8543f9255e refactor: migrate voiceInputAvailableProvider
Phase 5.1 Complete (1/5)
- voiceInputAvailableProvider → voiceInputAvailable
- Simple FutureProvider migration
- 2 usages updated automatically
2025-09-30 14:58:53 +05:30
cogwheel0
a63739db6b refactor: migrate Phase 1 providers (2-7/10) to @riverpod
Migrated providers:
- selectedModelProvider → SelectedModel
- isManualModelSelectionProvider → IsManualModelSelection
- reviewerModeProvider → ReviewerMode
- isLoadingConversationProvider → IsLoadingConversation
- prefilledInputTextProvider → PrefilledInputText
- inputFocusTriggerProvider → InputFocusTrigger
- composerHasFocusProvider → ComposerHasFocus
- batchModeProvider → BatchMode
- reducedMotionProvider → ReducedMotion

All provider names unchanged, no breaking changes.
Build runner successful, analyzer passing.
Only 1 WARNING (keepAlive usage) and 2 INFO items remaining.
2025-09-30 14:31:56 +05:30
cogwheel0
9210b2155a refactor: all logging 2025-09-25 22:36:42 +05:30
cogwheel0
5f013b1b73 refactor: formatting 2025-09-24 12:00:49 +05:30
cogwheel0
462bf4cde2 refactor: migrate to riverpod 3 2025-09-21 22:31:44 +05:30
cogwheel0
37e5633c5c fix: tts 2025-09-21 20:18:21 +05:30
cogwheel0
c05644f731 feat: text to speech 2025-09-20 23:58:18 +05:30
cogwheel0
7e6009d2cc refactor: text streaming 2025-09-13 10:16:58 +05:30