docs: add README with C# backend documentation

Co-Authored-By: Warp <agent@warp.dev>
Jacob Schmidt 2026-02-23 21:11:24 -06:00
parent d3781d6c3e
commit 08ebeeb3c6

README.md

@@ -1,264 +1,187 @@
# Journal Backend (.NET)
# Project_Journal
A .NET 10 backend for the Project Journal app. Provides core journal functionality as a class library with a sidecar console app for Tauri integration and an optional HTTP API.
A structured journaling system with encrypted monthly vaults, desktop UI, CLI tools, and optional AI-assisted analysis.
## Project Structure
## Support Matrix
```
backend/
├── Journal.Core/ Class library — all business logic
│ ├── Models/
│ │ ├── Fragment.cs Domain model (validated, owns Guid ID)
│ │ ├── Command.cs Stdin command shape for sidecar protocol
│ │ ├── ParsedSection.cs Parsed section model for entry parity work
│ │ ├── SectionTitles.cs Canonical section title list (Python parity)
│ │ └── JournalEntry.cs Entry domain (`date/raw_content/sections/fragments` + merge + markdown reconstruction)
│ ├── Dtos/
│ │ └── FragmentDtos.cs Immutable records for API boundary
│ │ ├── FragmentDto Read (what goes out)
│ │ ├── CreateFragmentDto Create (what comes in)
│ │ └── UpdateFragmentDto Update (partial, all fields optional)
│ ├── Repositories/
│ │ ├── IFragmentRepository.cs Interface (data access contract)
│ │ ├── InMemoryFragmentRepository.cs In-memory implementation (tests/dev)
│ │ └── FileFragmentRepository.cs File-backed implementation (default)
│ ├── Services/
│ │ ├── IFragmentService.cs Interface (business logic contract)
│ │ ├── FragmentService.cs Validates, calls repo, maps to DTOs
│ │ ├── IEntrySearchService.cs Entry search contract (content parity)
│ │ ├── EntrySearchService.cs Searches decrypted `.md` entries by raw content query
│ │ ├── IJournalConfigService.cs Config contract for path/vault/AI/speech settings parity
│ │ ├── JournalConfigService.cs Env/default-backed config surface aligned with Python keys
│ │ ├── IAiService.cs AI bridge contract (optional provider)
│ │ ├── DisabledAiService.cs No-op AI provider for deterministic disabled mode
│ │ ├── PythonSidecarAiService.cs Local Python sidecar adapter (stdin/stdout JSON)
│ │ ├── SidecarCli.cs CLI runner (`vault` + `search`) used by Sidecar host
│ │ ├── JournalParser.cs Date + section + checkbox + fragment parser slices (Phase 2)
│ │ ├── IVaultCryptoService.cs Vault crypto contract
│ │ ├── VaultCryptoService.cs AES-256-GCM + PBKDF2 compatibility layer
│ │ ├── IVaultStorageService.cs Vault load/workflow contract
│ │ └── VaultStorageService.cs Monthly naming + load/decrypt/extract workflow
│ ├── Entry.cs Command dispatcher (stdin/stdout)
│ ├── ServiceCollectionExtensions.cs DI registration helper
│ └── Journal.Core.csproj
├── Journal.Sidecar/ Console app — Tauri sidecar bridge
│ ├── App.cs Boots DI container, runs Entry.RunAsync()
│ └── Journal.Sidecar.csproj References Journal.Core
├── Journal.Api/ Web API — HTTP endpoint wrapper (optional)
│ ├── Program.cs
│ └── Journal.Api.csproj
└── README.md
- Python: `3.14`
- Platforms: Windows and Linux (first-class), macOS (best effort)
- Default profile: CPU
- Optional profiles: GPU, optional NLP backend
## Dependency Profiles
- `requirements_base.txt`: shared Journal runtime dependencies
- `requirements_cpu_only.txt`: base + CPU AI stack
- `requirements_gpu.txt`: base + GPU AI stack
- `requirements_nlp_optional.txt`: optional spaCy backend (auto-fallback if unavailable)
## Quickstart
### Linux (CPU default)
```bash
cd Project_Journal
python3.14 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --extra-index-url https://download.pytorch.org/whl/cpu -r requirements_cpu_only.txt
```
## Architecture
### Linux (GPU optional)
Each layer only knows about the one below it:
```
Sidecar (stdin/stdout) ──┐
├──► Services (business logic) ──► Repositories (data access)
API (HTTP/JSON) ─────────┘
```bash
cd Project_Journal
python3.14 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements_gpu.txt
```
- **Models** — Domain objects with validation. The source of truth.
- **DTOs** — Immutable records that cross the API boundary. Internal logic never leaks out.
- **Repositories** — Where data lives. Current default is file-backed; can evolve to SQLite/EF Core without touching anything above.
- **Services** — Business rules, validation, orchestration. Doesn't know about HTTP or stdin.
- **Entry** — Transport adapter. Translates stdin/stdout JSON into service calls.
## Dependencies
- **Journal.Core**: `Microsoft.Extensions.DependencyInjection.Abstractions` (interface-only, lightweight)
- **Journal.Sidecar**: `Microsoft.Extensions.DependencyInjection` (full container implementation) + references `Journal.Core`
- **Journal.Api**: `Microsoft.AspNetCore.OpenApi` + ASP.NET shared framework
## Building
### Windows PowerShell (CPU default)
```powershell
# Build everything (building Sidecar also rebuilds Core if changed)
dotnet build backend\Journal.Sidecar\Journal.Sidecar.csproj
# Build just the library
dotnet build backend\Journal.Core\Journal.Core.csproj
# Format code
dotnet format backend\Journal.Core\Journal.Core.csproj
cd Project_Journal
py -3.14 -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install --extra-index-url https://download.pytorch.org/whl/cpu -r requirements_cpu_only.txt
```
## Publishing
On Windows + Python 3.14, `pywebview` is intentionally skipped due to an upstream
`pythonnet` build incompatibility. `run_desktop.py` automatically falls back to opening
the app in your system browser.
Publish as a single-file self-contained executable (no .NET runtime install needed):
### Optional NLP backend (spaCy)
```bash
python -m pip install -r requirements_nlp_optional.txt
python -m spacy download en_core_web_sm
```
If spaCy is missing or unsupported, Journal automatically falls back to built-in NLP heuristics.
On current Python 3.14 environments, this optional install may be skipped due to upstream spaCy compatibility issues.
## Running
### Desktop App
```bash
python ./journal/run_desktop.py
```
### CLI
```bash
python -m journal.cli.main --help
python -m journal.cli.main vault load
python -m journal.cli.main search "your query"
```
## NLP Backend Control
Set `JOURNAL_NLP_BACKEND` to choose behavior:
- `auto` (default): use spaCy when available, else fallback
- `spacy`: require spaCy backend and fail clearly if unavailable
- `fallback`: always use fallback heuristics
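The three modes above could resolve along these lines. This is a minimal sketch, not the Journal implementation: `resolve_nlp_backend` and the `spacy_available` flag are hypothetical names, and rejecting unknown values is an assumption.

```python
import os

VALID_BACKENDS = {"auto", "spacy", "fallback"}


def resolve_nlp_backend(spacy_available: bool, env=None) -> str:
    """Return "spacy" or "fallback" based on JOURNAL_NLP_BACKEND (hypothetical helper)."""
    env = os.environ if env is None else env
    # An unset or empty variable behaves like the documented default, "auto".
    mode = env.get("JOURNAL_NLP_BACKEND", "auto").strip().lower() or "auto"
    if mode not in VALID_BACKENDS:
        # Assumption: unknown values are rejected rather than silently ignored.
        raise ValueError(f"Unknown JOURNAL_NLP_BACKEND value: {mode!r}")
    if mode == "fallback":
        return "fallback"
    if mode == "spacy":
        if not spacy_available:
            raise RuntimeError("JOURNAL_NLP_BACKEND=spacy but spaCy is not available")
        return "spacy"
    # "auto": prefer spaCy when it can be used, otherwise use the heuristics.
    return "spacy" if spacy_available else "fallback"
```

Passing `env` explicitly keeps the helper easy to exercise without mutating the real process environment.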
Examples:
```bash
export JOURNAL_NLP_BACKEND=fallback
python ./journal/run_desktop.py
```
```powershell
dotnet publish backend\Journal.Sidecar\Journal.Sidecar.csproj -c Release -r win-x64 --self-contained -p:PublishSingleFile=true -p:IncludeNativeLibrariesForSelfExtract=true
$env:JOURNAL_NLP_BACKEND = "spacy"
python .\journal\run_desktop.py
```
Output: `backend\Journal.Sidecar\bin\Release\net10.0\win-x64\publish\Journal.Sidecar.exe` (~70MB, everything bundled)
## Installer Script
To exclude debug symbols: add `-p:DebugType=none`
Use the Linux helper script:
For a smaller build that requires .NET 10 on the target machine:
```powershell
dotnet publish backend\Journal.Sidecar\Journal.Sidecar.csproj -c Release -r win-x64 -p:PublishSingleFile=true
```bash
./installreqs.sh
./installreqs.sh --gpu
./installreqs.sh --with-nlp
```
## Sidecar Protocol
## C# Backend
The sidecar communicates over stdin/stdout using JSON lines. One JSON line in, one JSON line out.
When run with no command-line args, this protocol mode is used by default.
The `backend/` directory contains a .NET 10 implementation that provides the same journal functionality as the Python layer, with encrypted vault support and an identical JSON command protocol.
## Sidecar CLI
### Projects
`Journal.Sidecar` also supports direct vault and search CLI commands:
- **Journal.Core** — shared library: domain models, services, repositories, DTOs
- **Journal.Api** — minimal ASP.NET Core web API (`/api/command` POST endpoint)
- **Journal.Sidecar** — console app (stdin/stdout JSON protocol or CLI with `vault` and `search` subcommands)
- **Journal.SmokeTests** — 70+ integration tests (no test framework dependency)
```powershell
# Load vaults into decrypted data workspace
dotnet run --project Journal.Sidecar/Journal.Sidecar.csproj -- vault load
# Save (rebuild) monthly vaults from decrypted markdown files
dotnet run --project Journal.Sidecar/Journal.Sidecar.csproj -- vault save
# Search entries (query + filters)
dotnet run --project Journal.Sidecar/Journal.Sidecar.csproj -- search "common text" --tag stress --type !TRIGGER --start-date 2026-02-01 --end-date 2026-02-28 --section Summary --checked "med taken"
```
Password prompt behavior:
- If `--password` is omitted, CLI prompts with `Vault password:` (hidden input in terminal mode).
- For automation/non-interactive use, pass `--password <value>`.
Optional path overrides:
- `--vault-dir <path>`
- `--data-dir <path>`
- Env fallback: `JOURNAL_VAULT_DIR`, `JOURNAL_DATA_DIR`, `JOURNAL_APP_DIR`
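The override order implied above (explicit flag first, then environment variable) can be sketched in Python as follows. The `resolve_dir` helper and the final built-in default are assumptions made for illustration; the sidecar's actual defaults are not shown here.

```python
import os
from pathlib import Path


def resolve_dir(cli_value, env_key, default, env=None) -> Path:
    """Pick a directory: CLI flag first, then environment variable, then default."""
    env = os.environ if env is None else env
    # Empty strings are treated as "not provided" at each step.
    chosen = cli_value or env.get(env_key) or default
    return Path(chosen)
```

For example, `resolve_dir(None, "JOURNAL_VAULT_DIR", "vault")` would use the environment variable when it is set and the hypothetical `vault` default otherwise.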
Search CLI flags:
- positional `query` (optional)
- `--tag` / `-t` (repeatable)
- `--type` / `-y` (repeatable)
- `--start-date` / `-s` (`yyyy-MM-dd`)
- `--end-date` / `-e` (`yyyy-MM-dd`)
- `--section` / `-sec`
- `--checked` / `-chk` (repeatable)
- `--unchecked` / `-uchk` (repeatable)
- `--data-dir <path>` (optional override)
## Config Keys (Parity Surface)
`JournalConfigService` exposes and normalizes key settings expected from Python config:
- Paths: `JOURNAL_PROJECT_ROOT`, `JOURNAL_APP_DIR`, `JOURNAL_DATA_DIR`, `JOURNAL_VAULT_DIR`, `JOURNAL_LOG_DIR`, `JOURNAL_PID_FILE`, `JOURNAL_SERVER_CONTROL_FILE`
- Vault format: `JOURNAL_MONTHLY_VAULT_FORMAT` (default `%Y-%m.vault`)
- AI endpoints/models: `CLOUDAI_API_KEY`, `CLOUDAI_API_URL`, `LLAMA_CPP_URL`, `LLAMA_CPP_MODEL`, `LLAMA_CPP_TIMEOUT`, `EMBEDDING_API_URL`, `EMBEDDING_MODEL_NAME`, `MODEL_CONTEXT_TOKENS`, `CHUNK_TOKEN_BUDGET`
- AI bridge mode: `JOURNAL_AI_PROVIDER` (`none` or `python-sidecar`), `JOURNAL_PYTHON_EXE`, `JOURNAL_AI_SIDECAR_PATH`, `JOURNAL_AI_TIMEOUT_MS`
- Speech/NLP: `MICROPHONE_DEVICE_INDEX`, `SPEECH_RECOGNITION_ENGINE`, `WHISPER_MODEL_SIZE`, `JOURNAL_NLP_BACKEND`
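The default `JOURNAL_MONTHLY_VAULT_FORMAT` value is a Python `strftime` pattern. As a small illustration of how such a pattern expands into a monthly vault file name (the `monthly_vault_name` helper is hypothetical, and how the C# side translates the pattern is not covered here):

```python
from datetime import date

DEFAULT_MONTHLY_VAULT_FORMAT = "%Y-%m.vault"


def monthly_vault_name(day: date, fmt: str = DEFAULT_MONTHLY_VAULT_FORMAT) -> str:
    """Expand a strftime-style pattern into a vault file name (illustrative helper)."""
    return day.strftime(fmt)


print(monthly_vault_name(date(2026, 2, 23)))  # prints 2026-02.vault
```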
### Command Format
```json
{
"action": "fragments.create",
"id": null,
"type": null,
"tag": null,
"payload": { "type": "!TRIGGER", "description": "stomach drop" }
}
```
**Fields:**
- `action` — The operation to perform (e.g. `fragments.list`, `fragments.create`)
- `id` — Target entity ID (for get/update/delete)
- `type` / `tag` — Filter parameters (for search)
- `payload` — Request body, deserialized into the appropriate DTO per action
### Available Actions
| Action | Description | Requires |
|--------|-------------|----------|
| `fragments.list` | List all fragments | — |
| `fragments.get` | Get fragment by ID | `id` |
| `fragments.create` | Create a new fragment | `payload` (CreateFragmentDto) |
| `fragments.update` | Update a fragment | `id`, `payload` (UpdateFragmentDto) |
| `fragments.delete` | Delete a fragment | `id` |
| `fragments.search` | Search by type/tag | `type` and/or `tag` |
| `entries.list` | List decrypted markdown entries in a data directory | optional `payload.dataDirectory` |
| `entries.load` | Load one entry file and return parsed metadata + raw content | `payload.filePath` |
| `entries.save` | Save/merge entry content to file (fragment append or full merge path) | `payload.content`, optional `payload.filePath`, `payload.mode` |
| `db.status` | Return DB key/schema compatibility status snapshot | `payload.password`, optional `payload.dataDirectory` |
| `db.initialize_schema` | Write SQL schema bootstrap (`journal_schema.sql`) for parity tables | optional `payload.dataDirectory` |
| `db.hydrate_workspace` | Perform C# DB hydration step for decrypted workspace (schema bootstrap + metadata) | `payload.password`, optional `payload.dataDirectory` |
| `config.get` | Return current backend config snapshot | — |
| `ai.health` | Return AI bridge health/provider status | — |
| `ai.summarize_entry` | Summarize one entry through AI provider | `payload.content`, optional `payload.fileStem` |
| `ai.summarize_all` | Summarize a set of entries through AI provider | `payload.entries[]` |
| `ai.chat` | Send chat prompt through AI provider bridge | `payload.prompt` |
| `ai.embed` | Generate embedding vector through AI provider bridge | `payload.content` |
| `search.entries` | Search decrypted entry content with optional parity filters | `payload.dataDirectory`, optional `payload.query`, `payload.section`, `payload.startDate`, `payload.endDate`, `payload.tags[]`, `payload.types[]`, `payload.checked[]`, `payload.unchecked[]` |
| `vault.initialize` | Ensure vault directory exists | `payload.password`, `payload.vaultDirectory` |
| `vault.load_all` | Load/decrypt all monthly vaults into data directory | `payload.password`, `payload.vaultDirectory`, `payload.dataDirectory` |
| `vault.save_current_month` | Save only current month vault (optimized path) | `payload.password`, `payload.vaultDirectory`, `payload.dataDirectory`, optional `payload.nowUtc` |
| `vault.rebuild_all` | Rebuild all monthly vaults from decrypted `.md` data | `payload.password`, `payload.vaultDirectory`, `payload.dataDirectory` |
| `vault.clear_data_directory` | Clear decrypted data directory and recreate it | `payload.dataDirectory` |
### Response Format
Success:
```json
{ "ok": true, "data": { "id": "abc-123", "type": "!TRIGGER", "description": "...", "time": "...", "tags": [] } }
```
Error:
```json
{ "ok": false, "error": "Description is required" }
```
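A client of this protocol needs little more than JSON encoding and a check of the `ok` flag. The following is a minimal Python sketch under that assumption; `encode_command` and `decode_response` are hypothetical helper names, and only the field names come from the command and response shapes shown above.

```python
import json


def encode_command(action, *, entity_id=None, type_filter=None, tag=None, payload=None) -> str:
    """Serialize one command as a single newline-terminated JSON line."""
    command = {
        "action": action,
        "id": entity_id,
        "type": type_filter,
        "tag": tag,
        "payload": payload,
    }
    # json.dumps escapes embedded newlines, so the result is always exactly one line.
    return json.dumps(command) + "\n"


def decode_response(line: str):
    """Return `data` from a success response, or raise with the reported error."""
    response = json.loads(line)
    if response.get("ok"):
        return response.get("data")
    raise RuntimeError(response.get("error", "unknown sidecar error"))
```

A caller would write the encoded line to the sidecar's stdin and pass the single line read back from stdout to `decode_response`.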
## Extending with New Modules
The `Command` class is generic — new modules use the same dot-notation pattern:
### Architecture
```
vault.unlock → IVaultService (future)
vault.lock
entries.list → IEntryService (future)
entries.create
ai.health → IAiService (implemented bridge)
ai.summarize_* → IAiService (implemented bridge)
ai.chat → IAiService (implemented bridge)
ai.embed → IAiService (implemented bridge)
db.status → IJournalDatabaseService (in-progress DB parity)
search.query → ISearchService (future)
Entry (thin command dispatcher)
├── IFragmentService → FragmentService → IFragmentRepository
├── IEntryFileService → EntryFileService → IEntryFileRepository
├── IEntrySearchService → EntrySearchService
├── IVaultStorageService → VaultStorageService → IVaultCryptoService
├── IJournalDatabaseService → JournalDatabaseService (SQLCipher)
├── IAiService → PythonSidecarAiService | DisabledAiService
├── ISpeechBridgeService → PythonSidecarSpeechService | DisabledSpeechBridgeService
├── CommandLogger
└── IJournalConfigService → JournalConfigService
```
To add a module:
1. Create model, DTO, repository, and service in `Journal.Core/`
2. Register the new service in `ServiceCollectionExtensions.cs`
3. Inject the service into `Entry.cs` and add cases to the action switch
4. No changes needed to `Command.cs` or `App.cs`
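The dot-notation dispatch idea can be illustrated with a short Python analogue. This is not the C# `Entry.cs` switch; the registry, the decorator, and the error wording are all assumptions chosen to show why adding an action does not require touching the command shape.

```python
HANDLERS = {}


def action(name):
    """Register a handler under a dot-notation action name (hypothetical registry)."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@action("fragments.list")
def list_fragments(command):
    return []  # placeholder data for illustration only


def dispatch(command: dict) -> dict:
    """Route a command to its handler and wrap the result in the response envelope."""
    name = command.get("action")
    handler = HANDLERS.get(name)
    if handler is None:
        return {"ok": False, "error": f"Unknown action: {name}"}
    try:
        return {"ok": True, "data": handler(command)}
    except ValueError as exc:
        # Assumption: validation failures surface as ValueError in this sketch.
        return {"ok": False, "error": str(exc)}
```

Adding a new module then amounts to registering more handlers; `dispatch` and the command dictionary stay unchanged.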
### Build & Run
## Dependency Injection
`ServiceCollectionExtensions.cs` wires everything up. Any host (sidecar, API, tests) calls:
```csharp
services.AddFragmentServices();
```bash
cd backend
dotnet build
```
This registers:
- `IFragmentRepository` → `FileFragmentRepository` (singleton; persisted fragment store)
- `IFragmentService` → `FragmentService` (transient; fresh instance per request)
Run the API server:
## Fragment Store Location
```bash
dotnet run --project Journal.Api
```
`FileFragmentRepository` persists data to:
Run the sidecar (stdin/stdout mode):
- default: `.journal-sidecar/fragments.json` under current working directory
- override: `JOURNAL_FRAGMENT_STORE_PATH` environment variable
```bash
dotnet run --project Journal.Sidecar
```
## Legacy Vault Compatibility Note
Sidecar CLI commands:
The legacy Python placeholder file `_init_vault.vault` is treated as obsolete.
During vault load, the C# backend ignores this file for decryption and removes it.
This preserves compatibility while migrating older vault directories forward.
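The behavior described above can be pictured with a small Python sketch. The `collect_vaults` helper and the `*.vault` glob are assumptions for illustration; the C# backend's actual enumeration logic is not shown in this document.

```python
from pathlib import Path

LEGACY_PLACEHOLDER = "_init_vault.vault"


def collect_vaults(vault_dir: Path) -> list[Path]:
    """Return vault files to decrypt, deleting the obsolete placeholder if present."""
    placeholder = vault_dir / LEGACY_PLACEHOLDER
    if placeholder.exists():
        placeholder.unlink()  # the placeholder is never decrypted
    # Assumption: monthly vaults are the remaining *.vault files in the directory.
    return sorted(vault_dir.glob("*.vault"))
```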
```bash
dotnet run --project Journal.Sidecar -- vault load --password <value>
dotnet run --project Journal.Sidecar -- vault save --password <value>
dotnet run --project Journal.Sidecar -- search "your query" --tag stress --start-date 2026-02-01
```
Run smoke tests:
```bash
dotnet run --project Journal.SmokeTests
```
### Environment Variables
- `JOURNAL_PROJECT_ROOT` — override project root detection
- `JOURNAL_DATA_DIR` / `JOURNAL_VAULT_DIR` — override data/vault paths
- `JOURNAL_AI_PROVIDER`: `none` (default) or `python-sidecar`
- `JOURNAL_PYTHON_EXE` — Python executable path (default: `python`)
- `JOURNAL_LOG_LEVEL`: `trace`, `debug`, `information`, `warning` (default), `error`, `critical`
### Encryption
- Vault: AES-256-GCM with PBKDF2-HMAC-SHA256 key derivation (600k iterations)
- Database: SQLCipher with PBKDF2-derived key
- Wire format matches the Python implementation for cross-language parity
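The key-derivation step can be reproduced with the Python standard library. This sketch covers only PBKDF2-HMAC-SHA256 with the stated iteration count; the 16-byte salt, the UTF-8 password encoding, and the `derive_vault_key` name are assumptions, and the AES-256-GCM step and vault wire format are not shown.

```python
import hashlib
import os

PBKDF2_ITERATIONS = 600_000
KEY_LENGTH = 32  # 32 bytes = 256-bit key, the size AES-256 requires


def derive_vault_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key with PBKDF2-HMAC-SHA256 (illustrative helper)."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),  # assumption: passwords are encoded as UTF-8
        salt,
        PBKDF2_ITERATIONS,
        dklen=KEY_LENGTH,
    )


salt = os.urandom(16)  # assumption: 16-byte random salt stored alongside the ciphertext
key = derive_vault_key("correct horse battery staple", salt)
```

The same password and salt always yield the same key, which is what allows two implementations that agree on these parameters to open each other's vaults.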
## Notes
- Decrypted journal data in `journal/data` is cleared on graceful shutdown.
- Vault save/load commands remain unchanged.