Migration Tasks

Goal

Align this project with the graphiti project at D:\github_project\graphiti while keeping the meeting-processing flow usable and making the codebase easier to maintain.

Target direction:

  • Neo4j is the only persistence layer for graph and retrieval data
  • Retrieval is hybrid: semantic similarity + keyword/fact recall + graph relationship context
  • Storage is more provenance-friendly, closer to Meeting / Episode / Entity / Fact
  • Core implementation lives in package modules instead of the repository root
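
The hybrid retrieval direction above can be illustrated with a small scoring sketch. This is not the project's implementation: `Candidate`, `hybrid_score`, the weights, and the neighbor cap are all hypothetical, and the candidates are assumed to have already been fetched from Neo4j. It only shows one way the three signals (semantic similarity, keyword recall, graph context) might be combined into a single ranking.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrieval candidate. All field names are assumptions for illustration."""
    text: str
    embedding: list[float]
    graph_neighbors: int  # count of related entities/facts found via the graph


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the candidate text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_score(query, query_embedding, cand, w_sem=0.6, w_kw=0.3, w_graph=0.1):
    """Weighted sum of the three signals; the weights are arbitrary placeholders."""
    graph_signal = min(cand.graph_neighbors, 5) / 5  # cap so hubs do not dominate
    return (
        w_sem * cosine(query_embedding, cand.embedding)
        + w_kw * keyword_overlap(query, cand.text)
        + w_graph * graph_signal
    )


def rank(query, query_embedding, candidates):
    """Return candidates ordered from highest to lowest hybrid score."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(query, query_embedding, c),
        reverse=True,
    )
```

A weighted sum is only one option; rank-based fusion would avoid having to normalize the three signals onto a shared scale.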

In Progress

  • No active migration tasks

Todo

  • Clean up any stale data directories only after explicit user confirmation

Done

  • Step 1: Extract a shared embedding utility and decouple semantic retrieval from the old vector-store implementation
  • Step 2.1: Create a package structure and move shared foundations out of the repository root
  • Step 2.2: Move extraction, raw storage, and state tracking into package modules
  • Step 2.3: Move graph storage, processing, and CLI into package modules
  • Step 3: Redesign Neo4j schema from simple Meeting -> Entity -> RELATES_TO into Meeting / Episode / Entity / Fact
  • Step 4: Store semantic retrieval payload inside Neo4j instead of external vector storage
  • Step 5: Replace current query path with hybrid retrieval over Neo4j candidates
  • Step 6: Rework duplicate detection to use Neo4j-backed semantic matching and exact meeting lookup
  • Step 7: Remove runtime dependency on llama-index and chroma
  • Step 8: Update CLI stats output to reflect hybrid retrieval structures such as episodes and facts
  • Step 9: Update README and environment instructions to match the new architecture
  • Step 10: Run end-to-end verification on process, query, and stats with a real Neo4j environment
  • Remove Obsidian from the project documentation and dependency surface
  • Remove Obsidian from the runtime processing pipeline
  • Move raw meeting archival to data/raw
  • Move meeting state storage to data/meeting_state.json
  • Introduce Neo4j configuration and a minimal graph storage layer
  • Write extracted meeting entities and relations into Neo4j
  • Add graph statistics to the CLI status output
  • Redesign retrieval to combine vector recall with graph facts
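
For reference, the Meeting / Episode / Entity / Fact shape from Step 3 can be sketched as plain Python records. The field names and the `meeting_for_fact` helper are hypothetical and do not come from the actual Neo4j schema; the sketch only shows the provenance idea, namely that a fact points at the episode that asserted it and the episode points at its meeting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meeting:
    meeting_id: str
    title: str


@dataclass(frozen=True)
class Episode:
    episode_id: str
    meeting_id: str  # provenance: the meeting this chunk of text came from
    text: str


@dataclass(frozen=True)
class Entity:
    entity_id: str
    name: str


@dataclass(frozen=True)
class Fact:
    fact_id: str
    subject_id: str  # Entity the fact is about
    object_id: str   # Entity the fact relates the subject to
    statement: str
    episode_id: str  # provenance: the episode that asserted this fact


def meeting_for_fact(fact, episodes, meetings):
    """Trace a fact back to its source meeting via its episode.

    `episodes` and `meetings` are dicts keyed by id (an assumption here);
    in a graph store this would be a two-hop traversal instead.
    """
    episode = episodes[fact.episode_id]
    return meetings[episode.meeting_id]
```

With this shape, any retrieved fact can be shown alongside the meeting and the passage it came from.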