stan44/Thisper

stan44 3a051c8012 Initial Thisper MVP

2026-03-29 21:59:48 -05:00

Phase 3 Speech Plan

Speech is queued after Phase 2. It is not part of the current desktop completion criteria.

Scope

microphone capture
chunked or streaming transcription
pass the transcript through the same rewrite modes used for typed input
preserve the same trust rules as typed workflows
reuse the existing review and diff layer where practical

Preconditions

Phase 2 desktop/system utility work marked complete
THISPER_STATUS.md updated
tray/background behavior stable
legacy hardware validation report updated
documentation aligned with actual behavior

Required Interfaces

Add a dedicated transcription boundary instead of mixing speech into the rewrite provider:

ITranscriptionProvider
transcription request and response types
partial transcript event types
buffering rules for chunked and streaming modes
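
One way the transcription boundary above could look is sketched below in Python. Only the name ITranscriptionProvider comes from this plan. The field names, the ChunkBuffer helper, the minimum-size buffering rule, and the EchoProvider stand-in are all assumptions made for illustration, not the project's actual types.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Protocol


@dataclass(frozen=True)
class TranscriptionRequest:
    audio: bytes                  # one chunk of raw audio (format is an assumption)
    sample_rate_hz: int = 16000   # hypothetical default
    is_final_chunk: bool = False


@dataclass(frozen=True)
class TranscriptionResponse:
    text: str


@dataclass(frozen=True)
class PartialTranscript:
    text: str
    is_final: bool


class ITranscriptionProvider(Protocol):
    """Speech-to-text boundary, kept separate from the rewrite provider."""

    def transcribe(self, request: TranscriptionRequest) -> TranscriptionResponse: ...

    def stream(
        self, chunks: Iterator[TranscriptionRequest]
    ) -> Iterator[PartialTranscript]: ...


class ChunkBuffer:
    """Hypothetical buffering rule: release audio once min_bytes have accumulated."""

    def __init__(self, min_bytes: int) -> None:
        self._min_bytes = min_bytes
        self._pending = bytearray()

    def push(self, data: bytes) -> Optional[bytes]:
        self._pending.extend(data)
        if len(self._pending) < self._min_bytes:
            return None
        return self.flush()

    def flush(self) -> bytes:
        chunk = bytes(self._pending)
        self._pending.clear()
        return chunk


class EchoProvider:
    """Stand-in provider for illustration; reports chunk sizes instead of text."""

    def transcribe(self, request: TranscriptionRequest) -> TranscriptionResponse:
        return TranscriptionResponse(text=f"<{len(request.audio)} bytes>")

    def stream(
        self, chunks: Iterator[TranscriptionRequest]
    ) -> Iterator[PartialTranscript]:
        for chunk in chunks:
            yield PartialTranscript(
                text=f"<{len(chunk.audio)} bytes>", is_final=chunk.is_final_chunk
            )
```

Keeping chunked (transcribe) and streaming (stream) entry points on the same interface lets the buffering rules differ per mode without the rewrite provider knowing that speech exists.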

Required Planning Outputs

Audio Pipeline

microphone capture lifecycle
mute/start/stop controls
audio buffering strategy
failure handling for permissions and device loss
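
The capture lifecycle and its failure cases can be planned as a small state machine. The sketch below is one possible shape, assuming four states and treating permission denial and device loss as transitions into a failed state. The class, state, and method names are hypothetical, and no real audio API is called.

```python
from enum import Enum
from typing import Optional


class CaptureState(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    MUTED = "muted"
    FAILED = "failed"


class CaptureError(Exception):
    """Raised when a control is used in a state where it makes no sense."""


class CaptureSession:
    """Hypothetical microphone lifecycle; tracks state only, no audio I/O."""

    def __init__(self) -> None:
        self.state = CaptureState.IDLE
        self.failure_reason: Optional[str] = None

    def start(self, permission_granted: bool) -> None:
        if self.state not in (CaptureState.IDLE, CaptureState.FAILED):
            raise CaptureError(f"cannot start from {self.state.value}")
        if not permission_granted:
            self._fail("microphone permission denied")
            return
        self.failure_reason = None
        self.state = CaptureState.RECORDING

    def toggle_mute(self) -> None:
        if self.state is CaptureState.RECORDING:
            self.state = CaptureState.MUTED
        elif self.state is CaptureState.MUTED:
            self.state = CaptureState.RECORDING
        else:
            raise CaptureError(f"cannot mute from {self.state.value}")

    def stop(self) -> None:
        # Stopping while idle or failed is a no-op, so the UI can call it freely.
        if self.state in (CaptureState.RECORDING, CaptureState.MUTED):
            self.state = CaptureState.IDLE

    def device_lost(self) -> None:
        # Only an active session can lose its device.
        if self.state in (CaptureState.RECORDING, CaptureState.MUTED):
            self._fail("input device disconnected")

    def _fail(self, reason: str) -> None:
        self.state = CaptureState.FAILED
        self.failure_reason = reason
```

Allowing start from the failed state gives the user a retry path after fixing permissions or reconnecting a device.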

Transcript Pipeline

partial transcript UX
final transcript handoff into rewrite modes
model/provider selection rules
retry and cancellation behavior
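
Retry and cancellation behavior could be prototyped along the lines below. This is a sketch under stated assumptions: ConnectionError stands in for whatever a real provider treats as transient, the attempt count and backoff values are placeholders, and transcribe_with_retry is a hypothetical helper rather than an existing Thisper function.

```python
import threading
from typing import Callable, Optional


class TranscriptionCancelled(Exception):
    """Raised when the user cancels before a transcript is produced."""


def transcribe_with_retry(
    call: Callable[[], str],
    cancel: threading.Event,
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
) -> str:
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        # Check for cancellation before every attempt.
        if cancel.is_set():
            raise TranscriptionCancelled()
        try:
            return call()
        except ConnectionError as exc:  # assumed to be the transient failure type
            last_error = exc
        # Exponential backoff between attempts. Event.wait returns True as soon
        # as the event is set, so a cancel interrupts the delay immediately.
        is_last_attempt = attempt == max_attempts - 1
        if not is_last_attempt and cancel.wait(base_delay_s * 2 ** attempt):
            raise TranscriptionCancelled()
    raise RuntimeError("transcription failed after retries") from last_error
```

Waiting on the cancel event instead of sleeping means a stop request during backoff takes effect right away rather than after the delay expires.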

Privacy Rules

explicit handling of whether audio leaves the device
no silent cloud upload
no raw audio retention by default
no raw transcript persistence unless the user explicitly keeps it
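
The privacy rules above lend themselves to a policy object whose defaults are the safe choice. The following is a minimal sketch, assuming a hypothetical SpeechPrivacyPolicy type and guard functions; none of these names exist in the plan, and each flag is meant to be flipped only by an explicit user action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechPrivacyPolicy:
    # Every default is the restrictive option.
    allow_cloud_upload: bool = False
    retain_raw_audio: bool = False
    persist_transcript: bool = False


class PrivacyViolation(Exception):
    """Raised instead of silently sending audio off the device."""


def check_provider_allowed(policy: SpeechPrivacyPolicy, provider_is_local: bool) -> None:
    # A remote provider is only usable after explicit consent.
    if not provider_is_local and not policy.allow_cloud_upload:
        raise PrivacyViolation("cloud transcription requires explicit user consent")


def should_keep_audio(policy: SpeechPrivacyPolicy) -> bool:
    return policy.retain_raw_audio


def should_persist_transcript(policy: SpeechPrivacyPolicy, user_chose_keep: bool) -> bool:
    # Persist only on a standing opt-in or an explicit "keep" for this transcript.
    return policy.persist_transcript or user_chose_keep
```

Raising an error on a disallowed provider, rather than falling back quietly, keeps the "no silent cloud upload" rule visible to both the user and the caller.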

Acceptance Coverage

real voice samples
noisy and clean environments
short dictation and long dictation
interruption/resume behavior
factual preservation and voice-preserving cleanup after transcription

First Implementation Goal

Build a desktop speech path that feels like the typed workflow:

capture speech
show transcript progressively
run the existing rewrite modes
review changes
copy or accept output
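
The five steps above can be sketched as a single function that mirrors the typed workflow. This is an illustration only: run_speech_flow is a hypothetical name, each partial is assumed to supersede the previous one, the rewrite callable stands in for the existing rewrite modes, and Python's standard difflib stands in for the project's review and diff layer.

```python
import difflib
from typing import Callable, Iterable, List, Tuple


def run_speech_flow(
    partials: Iterable[str],
    rewrite: Callable[[str], str],
    on_partial: Callable[[str], None],
) -> Tuple[str, str, List[str]]:
    transcript = ""
    for text in partials:
        # Assumption: every partial replaces the previous one in full.
        transcript = text
        on_partial(transcript)  # progressive display hook

    # Hand the final transcript to the same rewrite path used for typed input.
    rewritten = rewrite(transcript)

    # Produce a reviewable diff; accepting or copying is left to the caller.
    diff = list(
        difflib.unified_diff(
            transcript.splitlines(),
            rewritten.splitlines(),
            fromfile="transcript",
            tofile="rewrite",
            lineterm="",
        )
    )
    return transcript, rewritten, diff
```

A caller might pass a list of partial strings from a fake provider and a lambda as the rewrite step to exercise the flow without a microphone.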

Do not start with mobile speech. Keep the first speech implementation desktop-only.