# Migration Tasks
## Goal
Align this project with `D:\github_project\graphiti` while keeping the meeting-processing flow usable and making the codebase easier to maintain.
Target direction:
- Neo4j is the only persistence layer for graph and retrieval data
- Retrieval is hybrid: semantic similarity + keyword/fact recall + graph relationship context
- Storage is more provenance-friendly, closer to `Meeting / Episode / Entity / Fact`
- Core implementation lives in package modules instead of the repository root
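
The hybrid retrieval direction above can be illustrated with a minimal scoring sketch. It is not the project's implementation: the `Candidate` shape, the helper names, the use of relationship count as the graph signal, and the weights are all assumptions made for illustration. It only shows how a semantic score, a keyword score, and a graph-context signal might be blended into one ranking.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Candidate:
    """Hypothetical retrieval candidate, e.g. one Fact node read from Neo4j."""
    text: str
    embedding: List[float]
    graph_degree: int  # assumed signal: number of relationships on the node


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of query terms that also appear in the candidate text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_score(query, query_embedding, cand, weights=(0.6, 0.3, 0.1)):
    """Weighted blend of the three signals; the weights are illustrative."""
    w_sem, w_kw, w_graph = weights
    # Squash the relationship count into [0, 1) so it cannot dominate.
    graph_signal = cand.graph_degree / (1 + cand.graph_degree)
    return (
        w_sem * cosine(query_embedding, cand.embedding)
        + w_kw * keyword_overlap(query, cand.text)
        + w_graph * graph_signal
    )


def rank(query, query_embedding, candidates):
    """Return candidates ordered from best to worst hybrid score."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(query, query_embedding, c),
        reverse=True,
    )
```

In a real setup the candidates would come from Neo4j queries rather than in-memory lists, and the weights would need tuning against actual meeting data.
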
## In Progress
- [ ] No active migration tasks
## Todo
- [ ] Clean up any stale data directories only after explicit user confirmation
## Done
- [x] Step 1: Extract a shared embedding utility and stop coupling semantic retrieval to the old vector-store implementation
- [x] Step 2.1: Create a package structure and move shared foundations out of the repository root
- [x] Step 2.2: Move extraction, raw storage, and state tracking into package modules
- [x] Step 2.3: Move graph storage, processing, and CLI into package modules
- [x] Step 3: Redesign the Neo4j schema from the simple `Meeting -> Entity -> RELATES_TO` shape into `Meeting / Episode / Entity / Fact`
- [x] Step 4: Store semantic retrieval payload inside Neo4j instead of external vector storage
- [x] Step 5: Replace the current query path with hybrid retrieval over Neo4j candidates
- [x] Step 6: Rework duplicate detection to use Neo4j-backed semantic matching and exact meeting lookup
- [x] Step 7: Remove runtime dependency on `llama-index` and `chroma`
- [x] Step 8: Update CLI stats output to reflect hybrid retrieval structures such as episodes and facts
- [x] Step 9: Update README and environment instructions to match the new architecture
- [x] Step 10: Run end-to-end verification on `process`, `query`, and `stats` with a real Neo4j environment
- [x] Remove Obsidian from the project documentation and dependency surface
- [x] Remove Obsidian from the runtime processing pipeline
- [x] Move raw meeting archival to `data/raw`
- [x] Move meeting state storage to `data/meeting_state.json`
- [x] Introduce Neo4j configuration and a minimal graph storage layer
- [x] Write extracted meeting entities and relations into Neo4j
- [x] Add graph statistics to the CLI status output
- [x] Redesign retrieval to combine vector recall with graph facts
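
For reference, the two-stage duplicate check described in Step 6 might look roughly like the sketch below. Everything here is hypothetical: the record layout (`id`, `hash`, `embedding`), the use of a normalized SHA-256 content hash for the exact lookup, and the `0.95` similarity threshold are assumptions, and the in-memory `stored` list stands in for records that would be read from Neo4j.

```python
import hashlib
from math import sqrt


def content_hash(text):
    """Hash of whitespace-normalized, lowercased text (assumed exact-match key)."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate(text, embedding, stored, threshold=0.95):
    """Return (meeting_id, kind), where kind is 'exact', 'semantic', or 'new'.

    `stored` is a list of dicts with 'id', 'hash', and 'embedding' keys.
    """
    # Stage 1: exact lookup on the normalized content hash.
    h = content_hash(text)
    for meeting in stored:
        if meeting["hash"] == h:
            return meeting["id"], "exact"

    # Stage 2: semantic match against the closest stored embedding.
    best = max(
        stored,
        key=lambda m: cosine(embedding, m["embedding"]),
        default=None,
    )
    if best is not None and cosine(embedding, best["embedding"]) >= threshold:
        return best["id"], "semantic"
    return None, "new"
```

Running the exact lookup first keeps the cheap check ahead of the similarity scan, so an unchanged re-upload never depends on the embedding threshold.
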