# Migration Tasks

## Goal

Align this project toward `D:\github_project\graphiti` while keeping the meeting-processing flow usable and making the codebase easier to maintain.

Target direction:

- Neo4j is the only persistence layer for graph and retrieval data
- Retrieval is hybrid: semantic similarity + keyword/fact recall + graph relationship context
- Storage is more provenance-friendly, closer to `Meeting / Episode / Entity / Fact`
- Core implementation lives in package modules instead of the repository root

## In Progress

- [ ] No active migration tasks

## Todo

- [ ] Clean up any stale data directories only after explicit user confirmation

## Done

- [x] Step 1: Extract a shared embedding utility and stop coupling semantic retrieval to the old vector-store implementation
- [x] Step 2.1: Create a package structure and move shared foundations out of the repository root
- [x] Step 2.2: Move extraction, raw storage, and state tracking into package modules
- [x] Step 2.3: Move graph storage, processing, and CLI into package modules
- [x] Step 3: Redesign Neo4j schema from simple `Meeting -> Entity -> RELATES_TO` into `Meeting / Episode / Entity / Fact`
- [x] Step 4: Store semantic retrieval payload inside Neo4j instead of external vector storage
- [x] Step 5: Replace current query path with hybrid retrieval over Neo4j candidates
- [x] Step 6: Replace duplicate detection to use Neo4j-backed semantic matching and exact meeting lookup
- [x] Step 7: Remove runtime dependency on `llama-index` and `chroma`
- [x] Step 8: Update CLI stats output to reflect hybrid retrieval structures such as episodes and facts
- [x] Step 9: Update README and environment instructions to match the new architecture
- [x] Step 10: Run end-to-end verification on `process`, `query`, and `stats` with a real Neo4j environment
- [x] Remove Obsidian from the project documentation and dependency surface
- [x] Remove Obsidian from the runtime processing pipeline
- [x] Move raw meeting archival to `data/raw`
- [x] Move meeting state storage to `data/meeting_state.json`
- [x] Introduce Neo4j configuration and a minimal graph storage layer
- [x] Write extracted meeting entities and relations into Neo4j
- [x] Add graph statistics to the CLI status output
- [x] Redesign retrieval to combine vector recall with graph facts
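
For reference, the hybrid retrieval named in the target direction (semantic similarity + keyword/fact recall + graph relationship context) can be illustrated with a small scoring sketch. This is not the project's actual implementation: `Candidate`, `hybrid_rank`, the weights, and the whitespace-based keyword match are all hypothetical, and the candidates are assumed to have already been read back from Neo4j.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical fact candidate, as it might be loaded from Neo4j."""
    fact: str
    embedding: list[float]
    entities: set[str] = field(default_factory=set)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of lowercase query tokens that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_rank(query, query_embedding, query_entities, candidates,
                weights=(0.6, 0.3, 0.1)):
    """Rank candidates by a weighted sum of the three signals.

    The weights are illustrative assumptions, not tuned values.
    Returns a list of (score, candidate) pairs, best first.
    """
    w_sem, w_kw, w_graph = weights
    scored = []
    for c in candidates:
        # Graph context: 1.0 if the fact touches any entity named in the query.
        graph = 1.0 if c.entities & query_entities else 0.0
        score = (w_sem * cosine(query_embedding, c.embedding)
                 + w_kw * keyword_overlap(query, c.fact)
                 + w_graph * graph)
        scored.append((score, c))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

A weighted sum keeps each signal inspectable on its own; a real query path would more likely push candidate selection into Cypher and use the scoring only for the final re-rank.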