Personal Browser Companion - Plans & To-Do
Classification
- Core = works in extension-only mode (no extra server required).
- Optional = requires extra server/services (MCP, cloud sync, external APIs) and is opt-in per user.
Goals
- [Core] Start local-first with an option to sync to cloud.
- [Core] Online-only operation: an LLM is required for decisions, so the companion needs a network connection even when storage is local.
- [Core] Auto-start mode during meetings.
- [Optional] Integrations: calendar, email, Discord, Nextcloud.
Phase Plan
Phase 1: Local MVP (Foundation) [Core]
- [Core] Local storage for sessions, summaries, and user profile.
- [Core] Meeting/interview modes with manual start and overlay UI.
- [Core] Basic memory retrieval: recent session summaries + user profile.
- [Core] Audio capture + STT pipeline (mic + tab) and transcript display.
- [Core] Privacy controls: store/forget, per-session toggle.
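The "basic memory retrieval" item could start as simply as the sketch below, which joins the user profile with the N most recent session summaries into one context string. The record shapes and the `buildMemoryContext` name are assumptions for illustration, since the memory schema is still a to-do item.

```typescript
// Hypothetical record shapes; the real memory schema is not defined yet.
interface SessionSummary {
  sessionId: string;
  endedAt: number; // epoch milliseconds
  text: string;
}

interface UserProfile {
  displayName: string;
  notes: string;
}

// Build prompt context from the profile plus the most recent summaries.
function buildMemoryContext(
  profile: UserProfile,
  summaries: SessionSummary[],
  maxSummaries: number = 3,
): string {
  const recent = summaries
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.endedAt - a.endedAt) // newest first
    .slice(0, maxSummaries);
  const lines = [
    `User: ${profile.displayName}`,
    `Notes: ${profile.notes}`,
  ].concat(recent.map((s) => `Previous session (${s.sessionId}): ${s.text}`));
  return lines.join("\n");
}
```

Recency-only retrieval keeps Phase 1 free of any embedding dependency; the local RAG item in the MVP to-do could later replace the sort with a similarity search.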
Phase 2: Smart Auto-Start [Core]
- [Core] Detect meeting tabs (Google Meet, Zoom, Teams) and prompt to start.
- [Core] Auto-start rules (domain allowlist, time-based, calendar hints).
- [Core] Lightweight on-device heuristics for meeting detection.
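A lightweight detection heuristic can be a hostname match against a small table, followed by a per-domain trust check that decides between prompting and auto-starting. This is a sketch only: the hostname list is an assumption about where the three services are usually served, and `detectMeetingService` and `decideAutoStart` are hypothetical names.

```typescript
// Assumed hostnames for the services named above; extend or correct as needed.
const MEETING_HOSTS: { service: string; hosts: string[] }[] = [
  { service: "Google Meet", hosts: ["meet.google.com"] },
  { service: "Zoom", hosts: ["zoom.us"] },
  { service: "Teams", hosts: ["teams.microsoft.com", "teams.live.com"] },
];

// Return the matching service name, or null if the tab is not a known meeting site.
function detectMeetingService(tabUrl: string): string | null {
  let host: string;
  try {
    host = new URL(tabUrl).hostname.toLowerCase();
  } catch (err) {
    return null; // not a parseable URL
  }
  for (const entry of MEETING_HOSTS) {
    // Match the host itself or any subdomain of it (e.g. us02web.zoom.us).
    if (entry.hosts.some((h) => host === h || host.endsWith("." + h))) {
      return entry.service;
    }
  }
  return null;
}

type AutoStartDecision = "auto-start" | "prompt" | "ignore";

// Auto-start only on hosts the user has explicitly trusted; otherwise prompt.
function decideAutoStart(tabUrl: string, trustedHosts: Set<string>): AutoStartDecision {
  if (detectMeetingService(tabUrl) === null) return "ignore";
  const host = new URL(tabUrl).hostname.toLowerCase();
  return trustedHosts.has(host) ? "auto-start" : "prompt";
}
```

Matching on the parsed hostname rather than the full URL string avoids false positives such as `https://example.com/zoom.us`.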
Phase 3: Cloud Sync [Optional]
- [Optional] Opt-in cloud sync for memory + settings.
- [Optional] Conflict resolution strategy (last-write-wins, plus merge for summaries).
- [Optional] Encryption at rest, user-controlled delete/export.
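One possible reading of the conflict resolution item: single values such as settings take the most recent write, while summary lists are merged as a union keyed by session, with duplicates resolved by timestamp. The plan does not define "merge", so the union-by-`sessionId` interpretation and all type names below are assumptions.

```typescript
interface Versioned<T> {
  value: T;
  updatedAt: number; // epoch milliseconds of the last write
}

// Last-write-wins: keep whichever copy was written most recently (ties keep local).
function lastWriteWins<T>(local: Versioned<T>, remote: Versioned<T>): Versioned<T> {
  return remote.updatedAt > local.updatedAt ? remote : local;
}

interface SyncedSummary {
  sessionId: string;
  text: string;
  updatedAt: number;
}

// Merge: union of both sides by sessionId; on a clash the newer copy wins.
function mergeSummaries(local: SyncedSummary[], remote: SyncedSummary[]): SyncedSummary[] {
  const byId = new Map<string, SyncedSummary>();
  for (const s of local.concat(remote)) {
    const existing = byId.get(s.sessionId);
    if (existing === undefined || s.updatedAt > existing.updatedAt) {
      byId.set(s.sessionId, s);
    }
  }
  // Sort for a stable, deterministic result.
  return Array.from(byId.values()).sort((a, b) => a.sessionId.localeCompare(b.sessionId));
}
```

Timestamp comparison assumes device clocks are roughly in sync; a server-assigned version counter would be a sturdier alternative if that assumption does not hold.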
Phase 4: Integrations (MCP) [Optional]
- [Optional] Calendar: read upcoming meetings, attach context.
- [Optional] Email: draft follow-ups, summaries.
- [Optional] Discord: post meeting summary or action items to a channel.
- [Optional] Nextcloud: store meeting notes, transcripts, and attachments.
MVP To-Do (Local)
Core
- [Core] Define memory schema (profile, session, summary, action items).
- [Core] Implement local RAG: index summaries + profile into embeddings.
- [Core] Add session lifecycle: start, pause, end, summarize.
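The schema and lifecycle items above might look like the following sketch. Every field name is hypothetical, and the `resume` event is an added assumption (the plan lists only start, pause, end, and summarize) so that a paused session can continue.

```typescript
// Hypothetical record shapes for the entities named in the to-do.
interface ProfileRecord { id: string; label: string; prompt: string; }
interface ActionItem { text: string; done: boolean; }
interface SummaryRecord { sessionId: string; text: string; actionItems: ActionItem[]; }

type SessionState = "idle" | "listening" | "paused" | "ended" | "summarized";
type SessionEvent = "start" | "pause" | "resume" | "end" | "summarize";

interface SessionRecord { id: string; profileId: string; state: SessionState; }

// Allowed transitions; anything not listed here is rejected.
const TRANSITIONS: Record<SessionState, Partial<Record<SessionEvent, SessionState>>> = {
  idle: { start: "listening" },
  listening: { pause: "paused", end: "ended" },
  paused: { resume: "listening", end: "ended" },
  ended: { summarize: "summarized" },
  summarized: {},
};

// Return a new session record in the next state, or throw on an invalid event.
function transition(session: SessionRecord, event: SessionEvent): SessionRecord {
  const next = TRANSITIONS[session.state][event];
  if (next === undefined) {
    throw new Error(`Cannot ${event} a session in state ${session.state}`);
  }
  return { ...session, state: next };
}
```

Keeping the transitions in a table gives later features, such as running automation actions on session end, a single place to hook into.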
Audio + STT
- [Core] Implement reliable STT for tab audio (chunked transcription of tab or mixed audio via OpenAI Whisper).
- [Core] Keep mic-only STT as fallback.
- [Core] Add device selection + live mic monitor.
- [Core] Add separate STT settings (provider/model/API key) independent from chat provider.
- [Optional] Add local STT bridge support (self-hosted faster-whisper endpoint).
- [Core] Add STT "Test Connection" action in Assistant Setup.
- [Core] Add multilingual STT controls (auto/forced language, task, VAD, beam size) with session language lock in auto mode.
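The "session language lock in auto mode" could work as sketched below: in auto mode the first language the STT provider reports is pinned for the rest of the session, while a forced language always takes precedence. The settings shape mirrors the controls listed above, but the field names, the provider values, and both function names are assumptions.

```typescript
// Hypothetical STT settings, kept separate from the chat provider settings.
interface SttSettings {
  provider: string;                 // e.g. a hosted API or a local bridge
  model: string;
  language: string;                 // "auto" or a forced language code such as "de"
  task: "transcribe" | "translate";
  vad: boolean;
  beamSize: number;
}

interface LanguageLock { locked: string | null; }

// Language hint to send with the next chunk; undefined means "let the model detect".
function languageHint(settings: SttSettings, lock: LanguageLock): string | undefined {
  if (settings.language !== "auto") return settings.language;
  return lock.locked === null ? undefined : lock.locked;
}

// Record the language reported for a chunk; only the first detection sticks.
function observeDetected(settings: SttSettings, lock: LanguageLock, detected: string): LanguageLock {
  if (settings.language === "auto" && lock.locked === null) {
    return { locked: detected };
  }
  return lock;
}
```

Locking after the first detection keeps a session from switching languages because of one misdetected chunk; starting a new session would start with a fresh, unlocked `LanguageLock`.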
UI/UX
- [Core] Overlay controls: resize, hide/show, minimize.
- [Core] Auto-start toggle in side panel.
- [Core] Session summary view with “save to memory” toggle.
- [Core] Sidebar automation preset selector before Start Listening.
- [Core] One-click session context selector before Start Listening.
- [Core] Profile-scoped context loading to reduce cross-session prompt leakage.
- [Core] Profile manager UI (create/edit/delete profile with mode + prompt).
- [Core] Import/export context profiles.
Privacy
- [Core] Per-session storage consent prompt.
- [Core] “Forget session” button.
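The "Forget session" action, together with the default-retention idea raised under Open Questions, could share one small pruning module. This is a minimal sketch assuming sessions are stored as a flat list; `StoredSession`, `forgetSession`, and `pruneExpired` are hypothetical names.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface StoredSession {
  id: string;
  endedAt: number;      // epoch milliseconds
  keepForever: boolean; // per-session override of the retention window
}

// "Forget session": drop one session by id.
function forgetSession(sessions: StoredSession[], id: string): StoredSession[] {
  return sessions.filter((s) => s.id !== id);
}

// Retention: keep sessions that are pinned or still inside the window.
function pruneExpired(sessions: StoredSession[], retentionDays: number, now: number): StoredSession[] {
  const cutoff = now - retentionDays * DAY_MS;
  return sessions.filter((s) => s.keepForever || s.endedAt >= cutoff);
}
```

Passing `now` as a parameter instead of calling `Date.now()` inside the function keeps the pruning logic easy to test.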
Advanced Settings [Core]
- [Core] Open full settings window from side panel (⚙️).
- [Core] Webhook test: send sample payload and show status.
- [Core] MCP connection test (basic reachability).
- [Core] Cloud endpoint validation (basic reachability).
- [Core] Automation framework: triggers + actions + approval flow.
Integration To-Do (MCP)
MCP Server Options
- [Optional] Build a local MCP server as a bridge for integrations.
- [Optional] Use MCP tool registry for calendar/email/Discord/Nextcloud.
Automation (Rules Engine)
- [Core] Configure triggers (session start/end/manual, meeting domain filters).
- [Core] Configure actions per trigger (MCP tool + args).
- [Core] Approval mode: auto-send or review before send.
- [Core] Run actions on session end (hook into session lifecycle).
- [Core] Manual “Run Actions” button.
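The rules engine described above can be reduced to a pure function that maps a fired trigger to a list of planned actions, each flagged for review or auto-send. The sketch below is an assumption about the shape such rules might take; the tool names in the usage note are placeholders, not real MCP tool identifiers.

```typescript
type TriggerKind = "session-start" | "session-end" | "manual";
type ApprovalMode = "auto-send" | "review";

interface AutomationRule {
  trigger: TriggerKind;
  domainFilter?: string[];        // meeting hostnames the rule applies to; omitted = any
  tool: string;                   // MCP tool name (hypothetical values)
  args: Record<string, unknown>;
  approval: ApprovalMode;
}

interface PlannedAction {
  tool: string;
  args: Record<string, unknown>;
  needsReview: boolean;           // true = show to the user before sending
}

// Select the rules matching a trigger and meeting host, and plan their actions.
function planActions(
  rules: AutomationRule[],
  trigger: TriggerKind,
  meetingHost: string | null,
): PlannedAction[] {
  return rules
    .filter((r) => r.trigger === trigger)
    .filter((r) =>
      r.domainFilter === undefined ||
      (meetingHost !== null && r.domainFilter.includes(meetingHost)))
    .map((r) => ({ tool: r.tool, args: r.args, needsReview: r.approval === "review" }));
}
```

For example, calling `planActions(rules, "session-end", "meet.google.com")` from the session-end hook and `planActions(rules, "manual", null)` from the "Run Actions" button would let both paths share the same approval handling.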
Calendar
- [Optional] Read upcoming meetings and titles.
- [Optional] Auto-attach relevant context packs.
Email
- [Optional] Generate follow-up drafts from summary + action items.
Discord
- [Optional] Post meeting summary/action items to a selected channel.
Nextcloud
- [Optional] Upload meeting notes and transcripts.
Open Questions
- [Core] How do we isolate interview vs meeting prompts/contexts safely?
  - Best solution: Use explicit context profiles (e.g., Interview, Standup, Sales) with separate prompt + context store per profile, and require users to pick one profile before Start Listening.
- [Optional] Preferred cloud provider for sync?
  - Best solution: Start with Supabase (Postgres + Auth + Storage) for fastest MVP, then add S3-compatible storage as an optional backend for enterprise/self-hosting.
- [Core] How long should session memories persist by default?
  - Best solution: 90 days by default with per-session “keep forever” and a global retention slider (7/30/90/365 days).
- [Core] Should auto-start be opt-in per domain or global?
  - Best solution: Opt-in per domain, with a one-click “trust this site” prompt on first detection.
- [Optional] What data should be redacted before sync?
  - Best solution: Default to redacting emails, phone numbers, calendar IDs, and detected secrets (API keys/tokens) while letting users add custom redaction rules.
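The redaction answer could be prototyped as an ordered list of pattern rules applied before any text leaves the device. The regular expressions below are deliberately simple illustrations, not production-grade detectors: the `sk-` secret prefix is an assumed example format, calendar IDs are left out because their format depends on the provider, and `redactForSync` is a hypothetical name.

```typescript
interface RedactionRule {
  label: string;
  pattern: RegExp; // must carry the global flag so every match is replaced
}

// Default rules, applied in order: secrets first so their characters
// are not partially matched by the phone-number pattern.
const DEFAULT_REDACTION_RULES: RedactionRule[] = [
  { label: "secret", pattern: /\bsk-[A-Za-z0-9_-]{20,}/g }, // assumed token shape
  { label: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "phone", pattern: /\+?\d[\d\s().-]{7,}\d/g },
];

// Replace every match of every rule with a labelled placeholder.
function redactForSync(text: string, customRules: RedactionRule[] = []): string {
  const rules = DEFAULT_REDACTION_RULES.concat(customRules);
  return rules.reduce(
    (acc, rule) => acc.replace(rule.pattern, `[REDACTED:${rule.label}]`),
    text,
  );
}
```

User-defined rules are passed as the second argument, for example `redactForSync(transcript, [{ label: "project", pattern: /Project Falcon/g }])`.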