# Chunk Engine Progress
## Purpose
This log tracks milestone-relevant chunk-engine progress, blockers, benchmark runs, and real-app comparison data. It is not meant to capture every casual test run.
## Current headline status
- Dense is still the production/reference backend.
- The chunk backend is integrated into the main app and benchmark harness as the main experimental lane.
- Chunk now has page-local storage, active-page scheduling, dirty-chunk rendering, motion-regression coverage, and app-sized benchmark modes.
- High simultaneous move volume is now the primary chunk optimization blocker.
- Further chunk parity work is gated behind active-move instrumentation and optimization.
- Post-instrumentation baselines show chunk still ahead on settled/localized workloads, but behind dense on sustained active-move scenes.
- Live-app branch-level counters now show that the stable long-press gas case is materially more controlled than before.
- The main remaining single-thread hotspot is now mixed-scene `full-runtime` work across solids, liquids, and gases.
- New live-app stress runs show gas-specific retry/runtime work is harder to spike, while mixed material scenes still pay too much broad runtime-chain cost.
## Open blockers
- Chunk still needs more branch-level metrics to fully explain attempt spikes across all material classes.
- Gas-heavy and continuous-paint scenes still need targeted high-move optimization.
- Mixed scenes still route too many runtime-heavy particles through broad lifecycle / thermal / reaction / pressure work in the same frame.
- Move-attempt cost and stalled-motion cost need clearer measurement and trend tracking.
- Stable gas and overload behavior are improved, but mixed-scene full-runtime cost still needs to come down before parity work resumes.
- Real-app chunk results still need to beat dense consistently, not only in selected scenes.
- Chunk parity remains paused behind the active-move optimization gate.
## Manual app scenarios
### `active_sand_flood`
- Setup: continuously pour a dense falling solid stream until the scene sustains `10k+` simultaneous moves.
- Measure: average FPS, minimum FPS, move attempts, successful moves, stalled movable cells.
- Acceptance: no wake/sleep artifacts, and move cost scales better than the recorded baseline.
### `mixed_pile`
- Setup: create a layered sand/water pile over a floor and let it settle for 30 seconds.
- Measure: average FPS, minimum FPS, stability of settling.
- Acceptance: no obvious shimmer or frozen partial-settle behavior.
### `active_gas_burst`
- Setup: inject a large gas burst in a confined chamber until gas dominates movement.
- Measure: average FPS, minimum FPS, move attempts, full-runtime step count.
- Acceptance: gas remains visually coherent and does not dominate frame time disproportionately.
### `continuous_mixed_paint`
- Setup: sustain mixed solid/liquid/gas painting for 30 seconds.
- Measure: FPS plus movement/runtime bucket timings.
- Acceptance: no severe collapse when move volume spikes.
### `tool_stress`
- Setup: apply wind/air/gravity tools continuously over a mixed-material scene for 30 seconds.
- Measure: FPS, visible stability, and chunk field activity.
- Acceptance: field-driven motion remains active without repainting.
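
The scenarios above all record average and minimum FPS. A minimal sketch of one way to derive both from per-frame durations, assuming the run can export frame times in milliseconds. The `summarize_fps` name and the input shape are hypothetical and not taken from `Sand.Benchmarks`:

```python
def summarize_fps(frame_ms):
    """Return (avg_fps, min_fps) for a list of per-frame durations in ms.

    Average FPS is frame count divided by total elapsed time, so slow frames
    carry their full weight. Minimum FPS comes from the single slowest frame.
    """
    if not frame_ms:
        raise ValueError("need at least one frame")
    if min(frame_ms) <= 0:
        raise ValueError("frame durations must be positive")
    total_s = sum(frame_ms) / 1000.0
    avg_fps = len(frame_ms) / total_s
    min_fps = 1000.0 / max(frame_ms)
    return avg_fps, min_fps
```

Averaging per-frame FPS values instead would overstate the result whenever a few frames are much slower than the rest, which is exactly the case these stress scenes are meant to expose.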
## Metrics table
Required columns for recorded runs:
- date
- commit or branch label
- backend
- build
- scene name
- world size
- particle count
- avg FPS
- min FPS
- avg step ms
- avg frame/build ms
- loaded chunks
- active chunks
- stepped chunks
- dirty chunks
- field cells
- move attempts
- successful moves
- swap attempts
- stalled movable cells
- movement-only fast-path count
- full-runtime step count
- activation ms
- movement ms
- runtime ms
- field decay ms
- render ms
- notes
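
To keep recorded runs aligned with the required columns, a run could be formatted by a small helper like the one below. This is a sketch only: `COLUMNS` uses the short header names from the baseline table, and `format_row` is a hypothetical helper, not part of the existing benchmark harness.

```python
# Column order mirrors the required-columns list, using the baseline table's header names.
COLUMNS = [
    "date", "commit/label", "backend", "build", "scene", "world", "particles",
    "avg FPS", "min FPS", "avg step ms", "avg frame/build ms",
    "loaded chunks", "active chunks", "stepped chunks", "dirty chunks",
    "field cells", "move attempts", "successful moves", "swap attempts",
    "stalled movable cells", "movement-only fast-path count",
    "full-runtime step count", "activation ms", "movement ms", "runtime ms",
    "field decay ms", "render ms", "notes",
]


def format_row(run):
    """Format a dict of recorded values as one Markdown table row.

    Missing columns are written as "n/a"; unknown keys are rejected so a
    typo in a column name cannot silently drop a measurement.
    """
    unknown = set(run) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    cells = [str(run.get(name, "n/a")) for name in COLUMNS]
    return "| " + " | ".join(cells) + " |"
```

Writing `n/a` for unmeasured values matches how the baseline rows already record `min FPS`.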
## Dated entries
### 2026-03-09
- Replaced the chunk prototype hot path with page-local storage/scheduling scaffolding and removed the old global sparse/LINQ step path.
- Added app-sized benchmark mode and snapshot mode to `Sand.Benchmarks`.
- Extended shared/app frame stats with chunk workload counters and surfaced them in the app debug overlay.
- Reordered the chunk program around an active-move optimization gate before further parity expansion.
- Added move-attempt, stalled-motion, fast-path/full-runtime, and coarse timing buckets to chunk step stats, app overlay, and benchmark output.
- Recorded the first post-instrumentation high-motion baseline below.
- Added row-count guided stepping, narrower border wakes, same-page occupancy fast paths, and limited gas-runtime throttling.
- Result: `continuous_mixed_paint` improved materially and `active_sand_flood` improved versus the prior local pass, but chunk is still behind dense on sustained flood and gas-burst scenes.
- Added branch-level move-attempt counters (`vertical/diagonal/lateral`) plus full-runtime counters by particle kind (`solid/liquid/gas`) to the chunk overlay.
- Live-app chunk gas stress with `ultratanium` now shows the runtime bottleneck clearly: a representative run reported `moves 2357`, `attempts 9161`, `Att v 5467 d 2221 l 1073`, `full 5644`, `s 0 l 0 g 5644`, `run 11.52 ms`, `render 0.52 ms`.
- Interpretation: render is not the primary limiter in the observed gas stress case; the next optimization pass should reduce general `full-runtime gas` exposure and repeated vertical gas probes without particle-specific throttles.
- Follow-up retry-reduction work made the worst spikes harder to trigger in the live app, but later `ultratanium` screenshots still showed large `full-runtime gas` counts once the gas mass interacted with terrain and boundaries.
- Interpretation update: the next generic gas optimization should focus on thermally stable gas near non-gas boundaries, not only open-cloud gas and not particle-id-specific behavior.
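
The representative gas-stress reading in this entry is easier to compare across runs as ratios. The sketch below is illustrative only; `overlay_ratios` is a hypothetical helper. The three branch counters in that reading sum to 8761 rather than the reported 9161 attempts, and the log does not say what accounts for the difference, so branch shares are computed against the branch total.

```python
def overlay_ratios(moves, attempts, vertical, diagonal, lateral, run_ms, render_ms):
    """Derive comparison ratios from one overlay reading."""
    branch_total = vertical + diagonal + lateral
    return {
        # Fraction of move attempts that ended in a successful move.
        "success_rate": moves / attempts,
        # Share of branch-level attempts spent on each probe direction.
        "vertical_share": vertical / branch_total,
        "diagonal_share": diagonal / branch_total,
        "lateral_share": lateral / branch_total,
        # How many times larger runtime cost is than render cost.
        "run_to_render": run_ms / render_ms,
    }


# Values quoted from the `ultratanium` gas-stress reading above.
reading = overlay_ratios(2357, 9161, 5467, 2221, 1073, 11.52, 0.52)
```

For that reading, roughly a quarter of attempts succeed, vertical probes make up a little over 60% of branch-level attempts, and runtime cost is about 22 times render cost, which is consistent with the interpretation that render is not the limiter.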
### 2026-03-10
- Added app-shell frame instrumentation and overload control so the main app now reports total frame, update, sim-loop, build, upload, draw, and fixed-step counts.
- Result: long-press gas abuse can no longer collapse the app as easily, because under overload the fixed-step loop no longer tries to catch up on every missed step indefinitely.
- Recent live-app screenshots now show a clearer split:
  - stable gas runs can remain responsive with `Full 0`, meaning the worst gas-runtime branch is no longer the default limiter
  - mixed scenes still fall back into expensive `full-runtime` work across solids, liquids, and gases
- Updated single-thread focus: stop treating gas-only stress as the only blocker and reduce the mixed-scene runtime chain before planning multithreading.
- Current code work is trimming the chunk full-runtime path so particles only pay for lifecycle, burn, emission, special, phase, reaction, pressure, and thermal domains when they actually use them.
- Latest mixed-scene live tests after runtime trimming showed a much healthier profile at roughly `44.7k` particles:
  - `FPS 23`
  - `moves 4936`, `attempts 5466`, `stalled 218`
  - `app frame 44.07 ms`, `sim 41.55 ms`, `steps 2`
  - `render 1.18 ms`, `draw 0.97 ms`
- Interpretation update: the app is no longer mostly failing from catch-up debt or render/upload cost in this stress case. The remaining hotspot is honest mixed-scene simulation/runtime work.
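
The overload control described in this entry can be pictured as a fixed-step accumulator with a cap on catch-up work. The sketch below is an assumption about the general technique, not the app's actual implementation; `FixedStepClock`, the 60 Hz default, and the cap of 4 steps are all hypothetical.

```python
class FixedStepClock:
    """Fixed-timestep accumulator that caps catch-up work per frame."""

    def __init__(self, step_s=1.0 / 60.0, max_steps_per_frame=4):
        self.step_s = step_s
        self.max_steps_per_frame = max_steps_per_frame
        self.accumulator_s = 0.0  # simulated time still owed
        self.dropped_s = 0.0      # backlog discarded under overload

    def advance(self, frame_dt_s):
        """Add one frame's elapsed time and return the fixed steps to run."""
        self.accumulator_s += frame_dt_s
        steps = int(self.accumulator_s // self.step_s)
        if steps > self.max_steps_per_frame:
            # Overloaded: drop the excess instead of chasing it next frame.
            excess = steps - self.max_steps_per_frame
            self.dropped_s += excess * self.step_s
            self.accumulator_s -= excess * self.step_s
            steps = self.max_steps_per_frame
        self.accumulator_s -= steps * self.step_s
        return steps
```

Without the cap, a slow frame produces more owed steps, which makes the next frame slower still. Tracking the discarded time in something like `dropped_s` keeps that overload visible in frame stats instead of hiding it.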
## Baseline measurements
| date | commit/label | backend | build | scene | world | particles | avg FPS | min FPS | avg step ms | avg frame/build ms | loaded chunks | active chunks | stepped chunks | dirty chunks | field cells | move attempts | successful moves | swap attempts | stalled movable cells | movement-only fast-path count | full-runtime step count | activation ms | movement ms | runtime ms | field decay ms | render ms | notes |
| --- | --- | --- | --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| 2026-03-09 | local-working-tree | dense | Release | active_sand_flood | 265x200 | 1920 | 433.46 | n/a | 2.265 | 0.042 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.042 | dense baseline for sustained falling solids |
| 2026-03-09 | local-working-tree | chunk | Release | active_sand_flood | 265x200 | 2520 | 200.45 | n/a | 4.806 | 0.182 | 21 | 21 | 21 | 21 | 0 | 1330.0 | 1330.0 | 0.0 | 0.0 | 1330.0 | 0.0 | 0.016 | 4.722 | 0.000 | 0.001 | 0.182 | movement-only work dominates; chunk still behind dense here |
| 2026-03-09 | local-working-tree | dense | Release | active_gas_burst | 265x200 | 1566 | 1150.01 | n/a | 0.848 | 0.022 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.022 | dense baseline for sustained gas injection |
| 2026-03-09 | local-working-tree | chunk | Release | active_gas_burst | 265x200 | 2130 | 290.73 | n/a | 3.406 | 0.034 | 16 | 16 | 10 | 10 | 0 | 1143.8 | 1147.3 | 0.0 | 0.0 | 142.8 | 1009.8 | 0.005 | 0.005 | 3.024 | 0.001 | 0.034 | full-runtime gas path is the dominant blocker |
| 2026-03-09 | local-working-tree | dense | Release | continuous_mixed_paint | 265x200 | 2216 | 697.89 | n/a | 1.411 | 0.022 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.022 | dense baseline for sustained mixed painting |
| 2026-03-09 | local-working-tree | chunk | Release | continuous_mixed_paint | 265x200 | 2196 | 323.38 | n/a | 3.016 | 0.076 | 23 | 23 | 23 | 23 | 0 | 1320.5 | 991.1 | 0.4 | 0.1 | 386.8 | 388.9 | 0.008 | 0.001 | 2.233 | 0.001 | 0.076 | movement and runtime both contribute under mixed active paint |
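
The dense and chunk rows for each scene were recorded at different particle counts, so a raw step-time comparison and a per-particle comparison can tell slightly different stories. The sketch below computes both from the values in the table. `BASELINES` and `step_cost_ratio` are hypothetical names, and the per-particle figure is only a rough normalization because step cost need not scale linearly with particle count.

```python
# (particles, avg step ms) copied from the baseline table above.
BASELINES = {
    "active_sand_flood": {"dense": (1920, 2.265), "chunk": (2520, 4.806)},
    "active_gas_burst": {"dense": (1566, 0.848), "chunk": (2130, 3.406)},
    "continuous_mixed_paint": {"dense": (2216, 1.411), "chunk": (2196, 3.016)},
}


def step_cost_ratio(scene, per_particle=False):
    """Return chunk step cost divided by dense step cost for one scene.

    A value above 1.0 means chunk is slower than dense on that scene.
    """
    dense_n, dense_ms = BASELINES[scene]["dense"]
    chunk_n, chunk_ms = BASELINES[scene]["chunk"]
    if per_particle:
        return (chunk_ms / chunk_n) / (dense_ms / dense_n)
    return chunk_ms / dense_ms
```

On these baselines chunk step time is about 2.1x dense for `active_sand_flood` and `continuous_mixed_paint`, and about 4x for `active_gas_burst`. Normalizing per particle narrows the flood gap to roughly 1.6x.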
## Next actions
- reduce move-heavy cost on `active_sand_flood`, `active_gas_burst`, and `continuous_mixed_paint`
- reduce mixed-scene `full-runtime` cost by narrowing the lifecycle / reaction / pressure / thermal chain per particle
- keep stable-gas wins intact while mixed-scene runtime work is reduced
- keep `ultratanium` as a stress particle, but avoid particle-id-specific throttles or hacks
- continue reducing chunk overhead on gas-heavy and continuous-paint workloads
- use the healthier mixed-scene overlay numbers as the single-thread baseline before any multithread planning
- return to same-page falling-solid movement overhead after the mixed-scene runtime chain is under better control
- add deterministic snapshot review for the named scenes after `120` and `300` steps
- keep further parity work paused until the active-move gate shows repeatable gains
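
One way to picture the per-particle narrowing of the runtime chain is a set of domain flags that gates which handlers run. This is a sketch under assumptions, not the chunk engine's code: the domain names come from the 2026-03-10 entry, while `Domain`, `run_full_runtime`, and the handler table are hypothetical.

```python
from enum import Flag, auto


class Domain(Flag):
    """Runtime domains a particle may participate in."""
    LIFECYCLE = auto()
    BURN = auto()
    EMISSION = auto()
    SPECIAL = auto()
    PHASE = auto()
    REACTION = auto()
    PRESSURE = auto()
    THERMAL = auto()


def run_full_runtime(particle_domains, handlers):
    """Call only the handlers whose domain the particle uses.

    `handlers` maps a single Domain member to a zero-argument callable.
    Returns the names of the domains that actually ran, in handler order.
    """
    ran = []
    for domain, handler in handlers.items():
        if domain & particle_domains:
            handler()
            ran.append(domain.name)
    return ran
```

A particle whose flags are empty (`Domain(0)`) runs no handlers at all, which is the cheap case the runtime trimming is aiming for. A gas that only needs `Domain.THERMAL | Domain.PRESSURE` pays for those two domains and nothing else.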