Migration Tasks
Goal
Align this project with D:\github_project\graphiti while keeping the meeting-processing flow usable and making the codebase easier to maintain.
Target direction:
- Neo4j is the only persistence layer for graph and retrieval data
- Retrieval is hybrid: semantic similarity + keyword/fact recall + graph relationship context
- Storage is more provenance-friendly, closer to `Meeting / Episode / Entity / Fact`
- Core implementation lives in package modules instead of the repository root
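
As an illustration of the hybrid retrieval direction above, the three signals could be blended into a single ranking score. This is a minimal sketch, not the project's implementation: the `Candidate` shape, the weights, and the use of relationship count as the graph signal are all assumptions, and in practice the candidates would come from Neo4j rather than an in-memory list.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical retrieval candidate, e.g. a Fact node loaded from Neo4j."""
    text: str
    embedding: List[float]
    graph_degree: int  # assumed: number of relationships touching this node


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the candidate text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_score(query: str, query_embedding: List[float], cand: Candidate,
                 w_sem: float = 0.6, w_kw: float = 0.3, w_graph: float = 0.1) -> float:
    """Weighted blend of semantic, keyword, and graph signals (weights are assumed)."""
    # Squash the relationship count into [0, 1) so it is comparable to the other terms.
    graph_signal = cand.graph_degree / (1 + cand.graph_degree)
    return (w_sem * cosine(query_embedding, cand.embedding)
            + w_kw * keyword_overlap(query, cand.text)
            + w_graph * graph_signal)


def rank(query: str, query_embedding: List[float],
         candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from highest to lowest hybrid score."""
    return sorted(candidates,
                  key=lambda c: hybrid_score(query, query_embedding, c),
                  reverse=True)
```

Keeping each signal in its own function makes it straightforward to swap the token-overlap heuristic for a full-text index lookup, or to replace the degree-based graph signal with something richer, without touching the ranking code.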
In Progress
- No active migration tasks
Todo
- Clean up any stale data directories only after explicit user confirmation
Done
- Step 1: Extract a shared embedding utility and stop coupling semantic retrieval to the old vector-store implementation
- Step 2.1: Create a package structure and move shared foundations out of the repository root
- Step 2.2: Move extraction, raw storage, and state tracking into package modules
- Step 2.3: Move graph storage, processing, and CLI into package modules
- Step 3: Redesign Neo4j schema from simple `Meeting -> Entity -> RELATES_TO` into `Meeting / Episode / Entity / Fact`
- Step 4: Store semantic retrieval payload inside Neo4j instead of external vector storage
- Step 5: Replace current query path with hybrid retrieval over Neo4j candidates
- Step 6: Rework duplicate detection to use Neo4j-backed semantic matching and exact meeting lookup
- Step 7: Remove runtime dependency on `llama-index` and `chroma`
- Step 8: Update CLI stats output to reflect hybrid retrieval structures such as episodes and facts
- Step 9: Update README and environment instructions to match the new architecture
- Step 10: Run end-to-end verification on `process`, `query`, and `stats` with a real Neo4j environment
- Remove Obsidian from the project documentation and dependency surface
- Remove Obsidian from the runtime processing pipeline
- Move raw meeting archival to `data/raw`
- Move meeting state storage to `data/meeting_state.json`
- Introduce Neo4j configuration and a minimal graph storage layer
- Write extracted meeting entities and relations into Neo4j
- Add graph statistics to the CLI status output
- Redesign retrieval to combine vector recall with graph facts
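
The duplicate-detection item in Step 6 pairs an exact meeting lookup with semantic matching. One way this could look is sketched below, under stated assumptions: the `known` dictionary stands in for whatever Neo4j query returns stored meetings, and `content_key`, `find_duplicate`, and the 0.95 threshold are hypothetical names and values chosen for illustration.

```python
import hashlib
import math
from typing import Dict, List, Optional, Tuple


def content_key(text: str) -> str:
    """Stable key for exact lookup: lowercase, collapse whitespace, then hash."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate(text: str, embedding: List[float],
                   known: Dict[str, Tuple[str, List[float]]],
                   threshold: float = 0.95) -> Optional[str]:
    """Return the id of a matching stored meeting, or None if nothing matches.

    `known` maps meeting id -> (content key, embedding); here it is a plain
    dict standing in for a Neo4j-backed lookup (an assumption of this sketch).
    """
    # Pass 1: exact lookup on the normalized content key.
    key = content_key(text)
    for meeting_id, (stored_key, _) in known.items():
        if stored_key == key:
            return meeting_id

    # Pass 2: semantic match, keeping the most similar meeting at or above the threshold.
    best_id, best_score = None, threshold
    for meeting_id, (_, stored_embedding) in known.items():
        score = cosine(embedding, stored_embedding)
        if score >= best_score:
            best_id, best_score = meeting_id, score
    return best_id
```

Running the exact check first avoids any similarity computation for re-submitted transcripts, and the threshold then controls how aggressively near-identical meetings are merged.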