# Migration Tasks

## Goal

Align this project toward `D:\github_project\graphiti` while keeping the meeting-processing flow usable and making the codebase easier to maintain.

Target direction:

- Neo4j is the only persistence layer for graph and retrieval data
- Retrieval is hybrid: semantic similarity + keyword/fact recall + graph relationship context
- Storage is more provenance-friendly, closer to `Meeting / Episode / Entity / Fact`
- Core implementation lives in package modules instead of the repository root
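
The hybrid retrieval direction above can be illustrated with a minimal sketch. It assumes each candidate already carries three normalized scores; the `Candidate` class, the `hybrid_score` and `rank` helpers, and the weights are all hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical retrieval candidate, e.g. one Fact pulled from Neo4j."""
    fact_id: str
    semantic: float  # embedding similarity, assumed to be in [0, 1]
    keyword: float   # keyword/fact match score, assumed to be in [0, 1]
    graph: float     # relationship-context score, assumed to be in [0, 1]


def hybrid_score(c, w_semantic=0.5, w_keyword=0.3, w_graph=0.2):
    """Weighted sum of the three signals; the weights are illustrative only."""
    return w_semantic * c.semantic + w_keyword * c.keyword + w_graph * c.graph


def rank(candidates, top_k=5):
    """Return up to top_k candidates ordered by descending hybrid score."""
    return sorted(candidates, key=hybrid_score, reverse=True)[:top_k]


candidates = [
    Candidate("fact-a", semantic=0.9, keyword=0.1, graph=0.0),
    Candidate("fact-b", semantic=0.6, keyword=0.8, graph=0.5),
    Candidate("fact-c", semantic=0.2, keyword=0.2, graph=0.9),
]
ranked_ids = [c.fact_id for c in rank(candidates)]
```

A weighted sum is only one possible way to fuse the signals; rank-based fusion would be an equally plausible choice if the raw scores are not comparable across sources.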

## In Progress

- [ ] No active migration tasks

## Todo

- [ ] Clean up any stale data directories only after explicit user confirmation

## Done

- [x] Step 1: Extract a shared embedding utility and stop coupling semantic retrieval to the old vector-store implementation
- [x] Step 2.1: Create a package structure and move shared foundations out of the repository root
- [x] Step 2.2: Move extraction, raw storage, and state tracking into package modules
- [x] Step 2.3: Move graph storage, processing, and CLI into package modules
- [x] Step 3: Redesign Neo4j schema from simple `Meeting -> Entity -> RELATES_TO` into `Meeting / Episode / Entity / Fact`
- [x] Step 4: Store semantic retrieval payload inside Neo4j instead of external vector storage
- [x] Step 5: Replace current query path with hybrid retrieval over Neo4j candidates
- [x] Step 6: Rework duplicate detection to use Neo4j-backed semantic matching and exact meeting lookup
- [x] Step 7: Remove runtime dependency on `llama-index` and `chroma`
- [x] Step 8: Update CLI stats output to reflect hybrid retrieval structures such as episodes and facts
- [x] Step 9: Update README and environment instructions to match the new architecture
- [x] Step 10: Run end-to-end verification on `process`, `query`, and `stats` with a real Neo4j environment
- [x] Remove Obsidian from the project documentation and dependency surface
- [x] Remove Obsidian from the runtime processing pipeline
- [x] Move raw meeting archival to `data/raw`
- [x] Move meeting state storage to `data/meeting_state.json`
- [x] Introduce Neo4j configuration and a minimal graph storage layer
- [x] Write extracted meeting entities and relations into Neo4j
- [x] Add graph statistics to the CLI status output
- [x] Redesign retrieval to combine vector recall with graph facts