Modify the UI layout and the graph display logic

main
Bifang 2026-06-11 17:48:25 +08:00
parent d5c9368c1b
commit 677838c29e
27 changed files with 4708 additions and 1007 deletions


@ -1,13 +1,38 @@
# Migration Tasks
## Goal
Align this project toward `D:\github_project\graphiti` while keeping the meeting-processing flow usable and making the codebase easier to maintain.
Target direction:
- Neo4j is the only persistence layer for graph and retrieval data
- Retrieval is hybrid: semantic similarity + keyword/fact recall + graph relationship context
- Storage is more provenance-friendly, closer to `Meeting / Episode / Entity / Fact`
- Core implementation lives in package modules instead of the repository root
## In Progress
- [ ] Rework deduplication and identity resolution for entities, action items, and metrics
- [ ] No active migration tasks
## Todo
- [ ] Clean up any stale data directories only after explicit user confirmation
## Done
- [x] Step 1: Extract a shared embedding utility and stop coupling semantic retrieval to the old vector-store implementation
- [x] Step 2.1: Create a package structure and move shared foundations out of the repository root
- [x] Step 2.2: Move extraction, raw storage, and state tracking into package modules
- [x] Step 2.3: Move graph storage, processing, and CLI into package modules
- [x] Step 3: Redesign Neo4j schema from simple `Meeting -> Entity -> RELATES_TO` into `Meeting / Episode / Entity / Fact`
- [x] Step 4: Store semantic retrieval payload inside Neo4j instead of external vector storage
- [x] Step 5: Replace current query path with hybrid retrieval over Neo4j candidates
- [x] Step 6: Replace duplicate detection to use Neo4j-backed semantic matching and exact meeting lookup
- [x] Step 7: Remove runtime dependency on `llama-index` and `chroma`
- [x] Step 8: Update CLI stats output to reflect hybrid retrieval structures such as episodes and facts
- [x] Step 9: Update README and environment instructions to match the new architecture
- [x] Step 10: Run end-to-end verification on `process`, `query`, and `stats` with a real Neo4j environment
- [x] Remove Obsidian from the project documentation and dependency surface
- [x] Remove Obsidian from the runtime processing pipeline
- [x] Move raw meeting archival to `data/raw`

README.md

@ -1,103 +1,75 @@
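# Meeting Minutes Long-Term Memory System
A long-term memory prototype for meeting minutes built on `LLM + LlamaIndex + local state storage`, currently focused on three things:
A long-term memory prototype for meeting minutes. The architecture has moved from "scripts piled in the repository root + external vector store" to a cleaner package structure, and has converged on:
- Structured extraction of meeting content
- Cross-meeting tracking of action-item and metric status
- Semantic retrieval and duplicate-content filtering
- `Neo4j` as the only store for graph and retrieval data
- A hybrid retrieval mode combining `Embedding + keywords + graph facts`
- A `Meeting / Episode / Entity / Fact` data organization closer to that of `graphiti`
## Current Processing Pipeline
## Current Capabilities
- Structured extraction from meeting text
- Archiving of the original text to `data/raw`
- Cross-meeting merging of action-item and metric status
- Duplicate detection based on content hash and semantic similarity
- Graph writes backed by `Neo4j`
- Hybrid retrieval backed by `Neo4j`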
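The duplicate detection listed above pairs an exact check with a fuzzy one. The following is a minimal sketch of that idea, not the project's actual code: it assumes SHA-256 over whitespace-normalized text for the exact check and a cosine-similarity threshold of 0.95 for the semantic check. The function names, the normalization step, and the threshold are all illustrative assumptions.

```python
import hashlib
import math
from typing import Dict, Optional, Sequence


def content_hash(text: str) -> str:
    """SHA-256 of the text with whitespace runs collapsed (normalization is an assumption)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate(
    text: str,
    embedding: Sequence[float],
    known_hashes: Dict[str, str],
    known_embeddings: Dict[str, Sequence[float]],
    threshold: float = 0.95,  # hypothetical cutoff, tune per embedding model
) -> Optional[str]:
    """Return the id of a previously stored meeting that matches, or None."""
    digest = content_hash(text)
    if digest in known_hashes:  # exact match on content hash
        return known_hashes[digest]
    best_id, best_score = None, threshold
    for meeting_id, stored in known_embeddings.items():
        score = cosine_similarity(embedding, stored)
        if score >= best_score:  # keep the closest match at or above the threshold
            best_id, best_score = meeting_id, score
    return best_id
```

Running the cheap hash check first means the embedding comparison is only needed for text that is not a byte-for-byte repeat.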
## Processing Flow
```text
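meeting.md
-> content-hash dedup
-> semantic-similarity dedup
-> LLM structured extraction
-> state merge
-> archive original text to data/raw
-> write to vector index
-> Neo4j semantic-similarity dedup
-> LLM extraction of structured information
-> archive original text
-> action item / metric state merge
-> write to Neo4j: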
Meeting
Episode
Entity
Fact
```
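To make the final write step of the flow more concrete, here is a hedged sketch of how extracted relations might be turned into `Meeting / Episode / Entity / Fact` records before being written to Neo4j. The relation keys (`subject`, `subject_type`, `predicate`, `object`, `object_type`) follow the extractor's `Relation` model; the dataclasses, the `build_meeting_graph` helper, and the one-episode-per-meeting choice are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Entity:
    name: str
    entity_type: str = ""


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    object: str
    episode_id: str  # provenance: the episode that asserted this fact


@dataclass
class Episode:
    episode_id: str
    meeting_id: str
    text: str


@dataclass
class MeetingGraph:
    meeting_id: str
    title: str
    episodes: List[Episode] = field(default_factory=list)
    entities: Dict[str, Entity] = field(default_factory=dict)
    facts: List[Fact] = field(default_factory=list)


def build_meeting_graph(meeting_id: str, title: str, text: str, relations: List[dict]) -> MeetingGraph:
    """Build an in-memory subgraph; assumes a single episode per meeting."""
    graph = MeetingGraph(meeting_id=meeting_id, title=title)
    episode = Episode(episode_id=f"{meeting_id}:0", meeting_id=meeting_id, text=text)
    graph.episodes.append(episode)
    for rel in relations:
        subject = rel.get("subject", "").strip()
        predicate = rel.get("predicate", "").strip()
        obj = rel.get("object", "").strip()
        if not (subject and predicate and obj):
            continue  # skip incomplete triples
        # Entities are keyed by name, so repeated mentions collapse into one record.
        graph.entities.setdefault(subject, Entity(subject, rel.get("subject_type", "")))
        graph.entities.setdefault(obj, Entity(obj, rel.get("object_type", "")))
        graph.facts.append(Fact(subject, predicate, obj, episode.episode_id))
    return graph
```

Keeping `episode_id` on each fact is what makes the storage provenance-friendly: a fact can always be traced back to the text that produced it.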
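The current version no longer depends on `Obsidian` in the main pipeline; a graph database will be introduced later to carry the relationship layer.
## Quick Start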
```bash
cd meeting_memory
python -m venv .venv
.venv\Scripts\pip install -r requirements.txt
copy .env.example .env
# then fill in the API configuration
.venv\Scripts\python main.py process meeting_example.md
```
## Usage
```bash
python main.py
python main.py process meeting_example.md
python main.py process meeting_example.md -f
python main.py query "弱光指标目标值是多少"
python main.py stats
python main.py text "今天会议讨论了..."
python main.py batch "meetings/*.md" -f
```
## Directory Layout
## Project Structure
```text
meeting_memory/
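├── config.py Configuration
├── extractor.py LLM structured extraction
├── meeting_processor.py Main processing flow
├── meeting_state.py Cross-meeting state tracking
├── raw_store.py Raw text archival
├── vector_store.py Vector index and semantic retrieval
├── main.py CLI entry point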
├── meeting_memory/
│ ├── __init__.py
│ ├── cli.py
│ ├── config.py
│ ├── extractor.py
│ ├── graph_store.py
│ ├── meeting_processor.py
│ ├── meeting_state.py
│ ├── raw_store.py
│ └── services/
│ ├── __init__.py
│ └── embedding_service.py
├── data/
│ ├── raw/ Archive of original meeting text
│ └── meeting_state.json Persisted state
└── vector_store_data/ Persistence directory for the vector index
│ ├── raw/
│ └── meeting_state.json
├── main.py
├── MIGRATION_TASKS.md
└── requirements.txt
```
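## Core Capabilities
Notes:
### 1. Structured Extraction
- The `meeting_memory/` package directory holds the current, actual implementation
- The repository root now keeps only `main.py` as the CLI entry point; all other implementation has been consolidated into the package directory
- `vector_store.py` has been removed; retrieval now lives in the `Neo4j` graph structure
Extracted from meeting text:
## Environment Configuration
- Title, date, participants
- Entities
- Relations
- Action items
- Metrics
- Decisions
- Summary
Copy the environment variable template:
### 2. Long-Term State Tracking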
```bash
copy .env.example .env
```
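- Action items get a stable ID derived from `task + assignee`
- Metrics get a stable ID derived from `metric_name + owner`
- The history of status changes is preserved
- Cross-meeting merging is supported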
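A stable ID of this kind can be derived by hashing the identifying fields. The sketch below is an assumption about how that might look: the state file in this commit shows 8-character hexadecimal IDs, but the hash function (SHA-256 here), the `|` separator, and the case/whitespace normalization are hypothetical choices, and the helper names are invented for illustration.

```python
import hashlib


def stable_id(*parts: str, length: int = 8) -> str:
    """Derive a short, deterministic hex ID from the identifying fields."""
    # Normalize so trivial differences in spacing or case do not create new IDs.
    key = "|".join(part.strip().lower() for part in parts)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:length]


def action_item_id(task: str, assignee: str) -> str:
    return stable_id(task, assignee)


def metric_id(metric_name: str, owner: str) -> str:
    return stable_id(metric_name, owner)
```

Because the ID depends only on the identifying fields, the same action item or metric mentioned in a later meeting maps to the same record, which is what allows its history to accumulate.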
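### 3. Two-Layer Deduplication
- Exact dedup by content hash
- Dedup by semantic similarity
### 4. Semantic Retrieval
- Meeting content is written to the vector store
- Natural-language queries are supported
- The index is persisted and reloaded automatically on restart
## Configuration
Edit `.env`:
Fill in the configuration: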
```ini
LLM_API_KEY=sk-xxx
@ -107,8 +79,52 @@ LLM_MODEL=deepseek-chat
EMBEDDING_API_KEY=sk-xxx
EMBEDDING_BASE_URL=https://api.openai.com/v1
EMBEDDING_MODEL=text-embedding-3-small
NEO4J_ENABLED=true
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
NEO4J_DATABASE=neo4j
```
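For illustration, the `NEO4J_*` variables above could be read into a typed settings object along these lines. This is a sketch under stated assumptions, not the project's `config.py`: the `Neo4jSettings` class, the `load_neo4j_settings` helper, the accepted truthy strings, and the fallback defaults (taken from the example values above) are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Neo4jSettings:
    enabled: bool
    uri: str
    user: str
    password: str
    database: str


def load_neo4j_settings(env: Mapping[str, str] = os.environ) -> Neo4jSettings:
    """Read NEO4J_* variables from a mapping (defaults to the process environment)."""
    # Treat a few common spellings as "on"; anything else disables the graph store.
    enabled = env.get("NEO4J_ENABLED", "false").strip().lower() in {"1", "true", "yes"}
    return Neo4jSettings(
        enabled=enabled,
        uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        user=env.get("NEO4J_USER", "neo4j"),
        password=env.get("NEO4J_PASSWORD", ""),
        database=env.get("NEO4J_DATABASE", "neo4j"),
    )
```

Passing the mapping explicitly keeps the loader easy to test without touching the real environment.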
## Migration Plan
## Installation
For current migration progress, see [MIGRATION_TASKS.md](/d:/github_project/my_code/meeting_memory/MIGRATION_TASKS.md:1).
```bash
python -m venv .venv
.venv\Scripts\pip install -r requirements.txt
```
## Usage
```bash
python main.py
python main.py process meeting_example.md
python main.py process meeting_example.md -f
python main.py text "今天会议讨论了弱光指标和交付节奏"
python main.py query "弱光指标目标值是多少"
python main.py stats
python main.py batch "meetings/*.md" -f
```
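## Retrieval Design
Queries no longer depend on a separate vector store; instead they rank a mix of three candidate types stored in `Neo4j`:
- `Episode`: meeting-level text context
- `Entity`: entity summaries and descriptions
- `Fact`: subject-predicate-object facts
Ranking signals include:
- Semantic similarity
- Keyword hits
- Graph-fact weighting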
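The three ranking signals can be combined into a single score per candidate. The sketch below shows one plausible way to do that; it is not the project's actual ranking code. The `Candidate` class, the weights (`w_sem`, `w_kw`, `fact_bonus`), and the assumption that semantic similarity is precomputed in the range 0 to 1 are all illustrative.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    kind: str          # "episode", "entity", or "fact"
    text: str
    similarity: float  # semantic similarity to the query, assumed precomputed in [0, 1]


def keyword_score(query_terms: Sequence[str], text: str) -> float:
    """Fraction of query terms that appear in the candidate text (case-insensitive)."""
    if not query_terms:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for term in query_terms if term.lower() in lowered)
    return hits / len(query_terms)


def hybrid_rank(
    query_terms: Sequence[str],
    candidates: List[Candidate],
    top_k: int = 3,
    w_sem: float = 0.6,       # hypothetical weight for semantic similarity
    w_kw: float = 0.3,        # hypothetical weight for keyword hits
    fact_bonus: float = 0.1,  # hypothetical boost for graph facts
) -> List[Candidate]:
    """Return the top_k candidates ordered by the combined score, best first."""

    def score(candidate: Candidate) -> float:
        total = w_sem * candidate.similarity + w_kw * keyword_score(query_terms, candidate.text)
        if candidate.kind == "fact":
            total += fact_bonus
        return total

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

With these weights, a fact that matches every query term outranks an episode with the same semantic similarity, which reflects the intent of weighting graph facts more heavily.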
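## Migration Notes
The migration task log is in [MIGRATION_TASKS.md](/d:/github_project/my_code/meeting_memory/MIGRATION_TASKS.md:1).
## Current Limitations
- If the `neo4j` Python package is not installed in the current environment, the graph store module degrades to a disabled state on import
- Due to local runtime constraints, end-to-end verification still depends on an available Neo4j instance and correct credentials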


@ -0,0 +1,497 @@
{
"action_items": {
"59f75356": {
"item_id": "59f75356",
"task": "针对关键业务上量指标缺乏保障措施问题,出具具体可行方案并明确责任人",
"assignee": "建维部",
"series": "合川分公司周例会",
"created_meeting": "meeting_ed164adc704f.md",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "进行中",
"priority": "高",
"deadline": "本周内"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "进行中",
"priority": "高",
"deadline": "本周内"
}
},
"b16a65ce": {
"item_id": "b16a65ce",
"task": "完成招聘情况、农村渠道进度及营销方案汇报",
"assignee": "市场部",
"series": "合川分公司周例会",
"created_meeting": "meeting_ed164adc704f.md",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "待办",
"priority": "中",
"deadline": "本周内"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "待办",
"priority": "中",
"deadline": "本周内"
}
},
"691d7a64": {
"item_id": "691d7a64",
"task": "视频汇报运动会筹备情况",
"assignee": "市场部",
"series": "合川分公司周例会",
"created_meeting": "meeting_ed164adc704f.md",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "待办",
"priority": "中",
"deadline": "本周六上午"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "待办",
"priority": "中",
"deadline": "本周六上午"
}
},
"8d9685f0": {
"item_id": "8d9685f0",
"task": "商客经理每日发送微信日报",
"assignee": "商客经理",
"series": "合川分公司周例会",
"created_meeting": "meeting_ed164adc704f.md",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "待办",
"priority": "中",
"deadline": "每日"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"status": "待办",
"priority": "中",
"deadline": "每日"
}
},
"723bdb36": {
"item_id": "723bdb36",
"task": "跟进学校IP限速机制建立",
"assignee": "宽带/客服部",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "高",
"deadline": ""
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "高",
"deadline": ""
}
},
"c1ebcaaf": {
"item_id": "c1ebcaaf",
"task": "处理客服内部机房问题清单并与客户沟通",
"assignee": "宽带/客服部",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "中",
"deadline": ""
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "中",
"deadline": ""
}
},
"9428cf05": {
"item_id": "9428cf05",
"task": "完成食堂改造审计及自饮机引入批复跟进",
"assignee": "综合部",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "中",
"deadline": ""
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "中",
"deadline": ""
}
},
"e4df98ca": {
"item_id": "e4df98ca",
"task": "落实招待费公示及纪检报备",
"assignee": "综合部",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "高",
"deadline": ""
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "进行中",
"priority": "高",
"deadline": ""
}
},
"e5e1449e": {
"item_id": "e5e1449e",
"task": "汇报招聘进度、农村渠道进度及营销方案",
"assignee": "市场部",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "高",
"deadline": "本周内"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "高",
"deadline": "本周内"
}
},
"eb342fed": {
"item_id": "eb342fed",
"task": "每日微信发送满意度日报",
"assignee": "市场部",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "高",
"deadline": "每日"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "高",
"deadline": "每日"
}
},
"3db9820d": {
"item_id": "3db9820d",
"task": "确定体育文化节方阵补充人员名单",
"assignee": "各部门",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "高",
"deadline": "今日"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "高",
"deadline": "今日"
}
},
"684a31f4": {
"item_id": "684a31f4",
"task": "针对专线助账客指标拿出具体保障方案并回复",
"assignee": "各部门",
"series": "宽带运维、行政管理及市场业务推进会议",
"created_meeting": "meeting_5026dc1db2fe.md",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "中",
"deadline": ""
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"status": "待办",
"priority": "中",
"deadline": ""
}
}
},
"metrics": {
"a76a5616": {
"metric_id": "a76a5616",
"metric_name": "弱光指标",
"owner": "建维部",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "0.51",
"target": "",
"trend": "向好"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "0.51",
"target": "",
"trend": "向好"
}
},
"d64cea03": {
"metric_id": "d64cea03",
"metric_name": "三代终端年度目标",
"owner": "建维部",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "",
"target": "5.5",
"trend": "需压降"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "",
"target": "5.5",
"trend": "需压降"
}
},
"13144224": {
"metric_id": "13144224",
"metric_name": "九零工程月度转化率",
"owner": "建维部",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "87.35%",
"target": "90%",
"trend": "接近目标"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "87.35%",
"target": "90%",
"trend": "接近目标"
}
},
"e056b315": {
"metric_id": "e056b315",
"metric_name": "退单率",
"owner": "建维部",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "6.53%",
"target": "",
"trend": ""
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "6.53%",
"target": "",
"trend": ""
}
},
"12e0764a": {
"metric_id": "12e0764a",
"metric_name": "商客市场2月收入",
"owner": "商客市场部",
"history": [
{
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "88.5万元",
"target": "",
"trend": "增长"
}
],
"latest": {
"date": "2026-05-06",
"meeting": "meeting_ed164adc704f.md",
"value": "88.5万元",
"target": "",
"trend": "增长"
}
},
"23942096": {
"metric_id": "23942096",
"metric_name": "FPTR",
"owner": "宽带/客服部",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "达标",
"target": "达标",
"trend": "稳定"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "达标",
"target": "达标",
"trend": "稳定"
}
},
"671827ff": {
"metric_id": "671827ff",
"metric_name": "弱光指标",
"owner": "宽带/客服部",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "趋近目标",
"target": "预设目标值",
"trend": "改善"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "趋近目标",
"target": "预设目标值",
"trend": "改善"
}
},
"5d622e23": {
"metric_id": "5d622e23",
"metric_name": "投诉率",
"owner": "各部门",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "",
"target": "KPI考核核心值",
"trend": "需管控"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "",
"target": "KPI考核核心值",
"trend": "需管控"
}
},
"def6050a": {
"metric_id": "def6050a",
"metric_name": "工信部有责指标",
"owner": "各部门",
"history": [
{
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "",
"target": "KPI考核核心值",
"trend": "需管控"
}
],
"latest": {
"date": "2026-05-06 13:37",
"meeting": "meeting_5026dc1db2fe.md",
"value": "",
"target": "KPI考核核心值",
"trend": "需管控"
}
}
},
"meeting_series": {
"合川分公司周例会": {
"latest_date": "2026-05-06",
"processed_titles": [
"合川分公司周例会2026第X期"
]
},
"宽带运维、行政管理及市场业务推进会议": {
"latest_date": "2026-05-06 13:37",
"processed_titles": [
"宽带运维、行政管理及市场业务推进会议"
]
}
},
"content_hashes": {
"090ff5313a9e5c0dfd8d91c8f8aeb5246bd40a3ed92def6e498bd8254d71a9a4": {
"title": "合川分公司周例会2026第X期",
"date": "2026-05-06",
"filename": "meeting_ed164adc704f.md"
},
"64078fdfd6dbe3c094ddad97b907bbcc0404df3de912488c020efb0e76fbe048": {
"title": "宽带运维、行政管理及市场业务推进会议",
"date": "2026-05-06 13:37",
"filename": "meeting_5026dc1db2fe.md"
}
}
}
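Each tracked item in the state file above keeps a full `history` list plus a `latest` snapshot. The following sketch shows one way such a record could be updated when a new meeting mentions the same action item. The field names mirror the JSON above; the `record_action_item` helper and its signature are assumptions for illustration, not the project's `meeting_state.py`.

```python
from typing import Any, Dict


def record_action_item(
    state: Dict[str, Any],
    item_id: str,
    task: str,
    assignee: str,
    series: str,
    observation: Dict[str, str],
) -> Dict[str, Any]:
    """Append an observation to an action item's history and refresh its latest snapshot."""
    items = state.setdefault("action_items", {})
    item = items.get(item_id)
    if item is None:
        # First sighting: remember which meeting created the item.
        item = {
            "item_id": item_id,
            "task": task,
            "assignee": assignee,
            "series": series,
            "created_meeting": observation["meeting"],
            "history": [],
        }
        items[item_id] = item
    item["history"].append(dict(observation))  # copy so later edits do not alias
    item["latest"] = dict(observation)
    return item
```

Storing `latest` separately duplicates the last history entry, but it lets readers get the current status without scanning the list.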


@ -0,0 +1,35 @@
---
title: "宽带运维、行政管理及市场业务推进会议"
date: "2026-05-06 13:37"
status: archived
---
# 宽带运维、行政管理及市场业务推进会议
会议概述
会议主要围绕宽带运维指标、综合行政与工会管理、市场政企业务推进、满意度提升及年度考核准备等议题展开,旨在总结阶段性工作进度,协调跨部门资源,明确后续重点任务与考核要求。
主要讨论点
宽带运维与网络质量上周上门量及安装进度受天气影响弱光指标趋近目标FPTR达标但主动过境偏后。PCDN专线在学校端持续恶化已报市公司分析IP并拟限速。超频基站故障已恢复专线巡检进度符合预期。客服培训后内部机房问题已梳理清单。
综合行政与工会事务2025年剩余两项工作按原计划推进。工会经费压减食堂改造拟于5月结合更替修补进行拟引入自饮机降本。第四届体育文化节筹备中因主力选手受伤需各部门抽调人员补充方阵。主题教育简报已发基层党组织学习已完成。
区表彰、主题教育与招待费管理区级担当作为表彰正在对接建议争取参与以拉开竞争差距。招待费实行每年公开1次制度综合部超预算需调整26年预算其他部门严控成本。政企对外接待需统筹严禁客户经理个人垫资。
拆迁商客、满意度考核与KPI准备拆迁以二次升套为主社区与单位集中营销并行。满意度测评发现开卷考试形式导致拉分相关经理思想松懈。KPI考核已明确工信部有责及投诉率两项核心指标强调日清日结与执行力。
决策事项
二级基站拆除及下电服务费调整需在4月15日前全量完成。
招待费实行每年公开1次制度综合部超预算需调整26年预算其他部门严控成本。
满意度测评取消开卷考试形式,后续采用“后机评”模式,市场部需按四公司规定制定满意客户样板及不满客户处理流程。
第四届体育文化节方阵缺编9人需各部门协调抽调确定名单后统一采购服装并安排下班后排练。
年度考核指标已初步定调,各部门需提前与市公司沟通争取有利政策,避免起跑线落后。
待办事项
宽带/客服部跟进学校IP限速机制建立处理客服内部机房问题清单并与客户沟通。
综合部:完成食堂改造审计及自饮机引入批复跟进;落实招待费公示及纪检报备。
市场部:本周内汇报招聘进度、农村渠道进度及营销方案;每日微信发送满意度日报。
各部门:今日确定体育文化节方阵补充人员名单;针对专线助账客指标拿出具体保障方案并回复。
关键信息
会议时间2026-05-06 13:37
核心考核导向KPI考核聚焦工信部有责与投诉率强调执行力与日清日结习惯。
业务风险点PCDN专线恶化、满意度测评因形式问题导致拉分、招待费预算超支。
AI建议
针对满意度考核风险,建议市场部立即复盘测评机制,避免形式主义拉低指标,并提前演练“后机评”应对策略。
针对招待费超支及客户经理个人垫资问题,建议综合部建立统一审批与结算台账,明确费用归属与报销时效,规避财务与合规风险。
针对KPI考核准备建议建立市公司指标动态跟踪表提前模拟考核场景强化跨部门协同与数据预埋确保年底考核不被动。


@ -0,0 +1,81 @@
---
title: "合川分公司周例会2026第X期"
date: "2026-05-06"
status: archived
---
# 合川分公司周例会2026第X期
# 会议记录
议 题合川分公司周例会2026第X期
时 间2026年5月6日 13:37—14:23
地 点:分公司会议室
主持人AlanPaine
参加人:分公司领导、各部门经理及相关人员
议程:
一、各部门汇报
二、分公司领导指示部署
---
## 会议内容
### 一、各部门汇报
建维部、综合部、商客市场负责人按议程现场按顺序做汇报。建维部汇报宽带安装受天气影响进度偏后弱光指标0.51持续向好三代终端年度目标5.5需持续压降九零工程月度转化率87.35%接近90%目标退单率6.53%PCDN专线学校出口问题正协调限速机制二级基站拆除预计4月中旬完成综合部通报建委相关工作清单及投资计划已汇报打印设备已协调保障招投标需求工会经费压减后严考严用食堂改造及自饮机引入方案正在推进第四届体育文化节方阵人员招募与排练已部署商客市场2月收入88.5万元实现增长三期项目二期拆迁完成1145户社区与单位清洗服务5场落实签约量待提升。
---
### 二、部署强调
#### 建维部负责人强调:
1. **网络运维与指标管控:**
- 弱光指标0.51持续向好三代终端年度目标5.5需持续压降FPTR已达标但主动过境0.3靠后。
- 九零工程月度转化率87.35%接近90%目标退单率6.53%主要受用户原因及改约影响已建议施工优化BtoC审核撤单流程。
- PCDN专线因学校出口带宽问题持续恶化正协调限速机制超频基站故障已及时处理专线巡检按计划推进。
2. **工作反馈与执行要求:**
- 强调养成“日清日结”习惯,工作回复必须量化、有结果、有措施,杜绝工作拖延数月未动。
- 针对关键业务上量指标缺乏保障措施问题,要求本周内出具具体可行方案并明确责任人。
---
#### 综合部负责人强调:
1. **工会经费与后勤保障:**
- 工会经费全面压减,需严考严用、以更少资金办更好实事。软性工程(更衣室、食堂改造)已确定上报市公司,拟引入自饮机解决饮水问题并节约成本。
2. **奖项申报策略:**
- 针对河川区“担当作为先进集体和先进个人”申报,评选条件多为定性要求,建议提前与区领导或分管领导沟通确认意向后再行申报,避免盲目提交浪费资源。
---
#### 市场部负责人强调:
1. **季度收官与二季度谋划:**
- 市场部需提前谋划季度收官及二季度业务活动打破淡季思维全力推动商客、H业务及AI军团活动升温。
- 本周内完成招聘情况、农村渠道进度及营销方案汇报;本周六上午视频汇报运动会筹备情况。
2. **满意度与考核管控:**
- 深刻反思满意度测评前期工作未做到位问题要求主管亲自抓。明确满意度及投诉考核标准不满客户需严格按5:30及5:35节点操作报警。
- 要求商客经理每日微信发送日报,跟进考核细节,确保指标可控。
---
#### 分公司主要领导强调:
1. **强化执行力与作风:**
- 各部门及一线人员必须摒弃“知道怎么做却不去做”的作风,做到事不做好不收兵。分管领导需加强政企部等部门督导力度,必要时亲自沟通。
2. **年度考核提前摸底:**
- 针对四公司年度考核及集团相关指标提升,要求各部门提前深入了解考核细则及可能产生重大影响的不利因素并及时上报,切忌定稿后被动。
- 市公司会统筹考虑分公司整体情况,务必提前布局、赢在起跑线。


@ -1,157 +0,0 @@
import json
import logging
import re
from typing import List, Optional
from pydantic import BaseModel
from openai import OpenAI
from config import config
logger = logging.getLogger(__name__)
client = OpenAI(
api_key=config.llm.api_key or None,
base_url=config.llm.base_url if config.llm.base_url else None,
)
class Entity(BaseModel):
name: str
entity_type: str
description: str = ""
class Relation(BaseModel):
subject: str
subject_type: str
predicate: str
object: str
object_type: str
description: str = ""
class ActionItem(BaseModel):
task: str
assignee: str = ""
deadline: str = ""
status: str = "待办"
priority: str = ""
class Decision(BaseModel):
content: str
proposer: str = ""
status: str = "已决"
class MeetingMetric(BaseModel):
metric_name: str
value: str
target: str = ""
owner: str = ""
trend: str = ""
class MeetingExtraction(BaseModel):
title: str
date: str = ""
participants: List[str] = []
agenda: List[str] = []
entities: List[Entity] = []
relations: List[Relation] = []
action_items: List[ActionItem] = []
decisions: List[Decision] = []
metrics: List[MeetingMetric] = []
summary: str = ""
EXTRACTION_SYSTEM_PROMPT = """
你是一个专业的会议纪要信息抽取专家你的任务是从中文会议记录中抽取结构化信息并严格按照要求的JSON格式返回
## 抽取内容
### 1. 实体
- 人物参会人员提及的人员
- 组织/部门公司部门团队
- 项目/任务正在进行的项目任务
- 指标/KPI关键绩效指标如转化率退单率等
- 概念/制度管理概念制度要求
- 地点会议地点项目地点
### 2. 关系 (主体-关系谓词-客体)
抽取事实性关系例如
- {"subject": "建维部", "subject_type": "组织", "predicate": "负责", "object": "网络运维", "object_type": "任务", "description": ""}
- {"subject": "弱光指标", "subject_type": "指标", "predicate": "目标值", "object": "0.5以下", "object_type": "数值", "description": ""}
### 3. 行动项
谁负责什么任务截止时间优先级
### 4. 决策
做出的决定和结论
### 5. 指标数据
具体的数字指标当前值目标值负责人趋势(向好/持平/恶化)
## 规则
- 只提取事实性信息
- 过滤比喻假设主观评价
- 数字指标要精确提取
- entitiesrelationsaction_itemsdecisionsmetrics 如果没有则返回空数组
"""
def _call_llm(system: str, user: str) -> str:
response = client.chat.completions.create(
model=config.llm.model,
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user},
],
max_tokens=config.llm.max_tokens,
temperature=config.llm.temperature,
)
content = response.choices[0].message.content
if content is None:
raise ValueError("LLM returned empty response")
return content
def extract_meeting_info(text: str) -> MeetingExtraction:
user_prompt = f"""
从以下会议记录中抽取结构化信息
JSON字段说明
- title: 会议标题
- date: 会议日期
- participants: 参会人列表
- agenda: 议程列表
- entities: 实体列表每个实体包含 name(名称), entity_type(类型), description(描述)
- relations: 关系列表每个关系包含 subject(主体), subject_type(主体类型), predicate(关系谓词), object(客体), object_type(客体类型), description(描述)
- action_items: 行动项列表每条包含 task(任务), assignee(负责人), deadline(截止时间), status(状态), priority(优先级)
- decisions: 决策列表每条包含 content(决策内容), proposer(提出人), status(状态)
- metrics: 指标列表每条包含 metric_name(指标名), value(当前值), target(目标值), owner(负责人), trend(趋势)
- summary: 会议摘要
请直接返回JSON对象不要包含任何额外说明文字
会议记录
{text}
"""
content = _call_llm(EXTRACTION_SYSTEM_PROMPT, user_prompt)
data = _try_parse_json(content)
return MeetingExtraction(**data)
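def _try_parse_json(content: str) -> dict:
    """Parse the LLM reply as JSON, falling back to the outermost {...} span."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        logger.warning("JSON解析失败尝试修复...")
        # The reply may wrap the JSON in extra prose; try the outermost braces.
        match = re.search(r'\{.*\}', content, re.DOTALL)
        if match is None:
            # No JSON object found: re-raise instead of silently returning None.
            raise
        try:
            return json.loads(match.group())
        except json.JSONDecodeError as e:
            logger.error(f"修复后的JSON仍无法解析: {e}")
            raise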


@ -1,250 +0,0 @@
import logging
from typing import Any, Dict, List
from config import config
logger = logging.getLogger(__name__)
class Neo4jGraphStore:
def __init__(self):
self._driver = None
self._enabled = False
self._connect()
def _connect(self):
if not config.neo4j.enabled:
logger.info("Neo4j graph store disabled")
return
try:
from neo4j import GraphDatabase
except ImportError:
logger.warning("neo4j package is not installed")
return
if not config.neo4j.password:
logger.warning("Neo4j is enabled but NEO4J_PASSWORD is empty")
return
self._driver = GraphDatabase.driver(
config.neo4j.uri,
auth=(config.neo4j.user, config.neo4j.password),
)
self._enabled = True
@property
def enabled(self) -> bool:
return self._enabled and self._driver is not None
def close(self):
if self._driver is not None:
self._driver.close()
def run_query(self, query: str, **params) -> List[Dict[str, Any]]:
if not self.enabled:
return []
with self._driver.session(database=config.neo4j.database) as session:
result = session.run(query, **params)
return [record.data() for record in result]
def initialize_schema(self):
if not self.enabled:
return
statements = [
"CREATE CONSTRAINT meeting_id IF NOT EXISTS FOR (m:Meeting) REQUIRE m.meeting_id IS UNIQUE",
"CREATE CONSTRAINT entity_name IF NOT EXISTS FOR (e:Entity) REQUIRE e.name IS UNIQUE",
"CREATE INDEX meeting_title IF NOT EXISTS FOR (m:Meeting) ON (m.title)",
"CREATE INDEX entity_type IF NOT EXISTS FOR (e:Entity) ON (e.entity_type)",
]
for statement in statements:
self.run_query(statement)
def get_stats(self) -> Dict[str, Any]:
if not self.enabled:
return {"enabled": False}
rows = self.run_query(
"""
CALL {
MATCH (m:Meeting)
RETURN count(m) AS meetings
}
CALL {
MATCH (e:Entity)
RETURN count(e) AS entities
}
RETURN meetings, entities
"""
)
if not rows:
return {"enabled": True, "meetings": 0, "entities": 0}
return {"enabled": True, **rows[0]}
def upsert_meeting_subgraph(self, meeting_data: dict) -> None:
if not self.enabled:
return
meeting_id = meeting_data.get("_graph_meeting_id")
if not meeting_id:
return
self.initialize_schema()
self.run_query(
"""
MERGE (m:Meeting {meeting_id: $meeting_id})
SET m.title = $title,
m.date = $date,
m.summary = $summary,
m.content_hash = $content_hash,
m.updated_at = datetime()
""",
meeting_id=meeting_id,
title=meeting_data.get("title", ""),
date=meeting_data.get("date", ""),
summary=meeting_data.get("summary", ""),
content_hash=meeting_data.get("_content_hash", ""),
)
for entity in meeting_data.get("entities", []):
self._upsert_entity(meeting_id, entity)
for participant in meeting_data.get("participants", []):
self._upsert_entity(
meeting_id,
{"name": participant, "entity_type": "participant", "description": ""},
)
for relation in meeting_data.get("relations", []):
self._upsert_relation(meeting_id, relation, meeting_data.get("date", ""))
def _upsert_entity(self, meeting_id: str, entity: dict) -> None:
name = entity.get("name", "").strip()
if not name:
return
self.run_query(
"""
MATCH (m:Meeting {meeting_id: $meeting_id})
MERGE (e:Entity {name: $name})
SET e.entity_type = CASE
WHEN $entity_type <> '' THEN $entity_type
ELSE coalesce(e.entity_type, '')
END,
e.description = CASE
WHEN $description <> '' THEN $description
ELSE coalesce(e.description, '')
END,
e.updated_at = datetime()
MERGE (m)-[:MENTIONS]->(e)
""",
meeting_id=meeting_id,
name=name,
entity_type=entity.get("entity_type", ""),
description=entity.get("description", ""),
)
def _upsert_relation(self, meeting_id: str, relation: dict, meeting_date: str) -> None:
subject = relation.get("subject", "").strip()
predicate = relation.get("predicate", "").strip()
obj = relation.get("object", "").strip()
if not subject or not predicate or not obj:
return
self._upsert_entity(
meeting_id,
{
"name": subject,
"entity_type": relation.get("subject_type", ""),
"description": "",
},
)
self._upsert_entity(
meeting_id,
{
"name": obj,
"entity_type": relation.get("object_type", ""),
"description": "",
},
)
self.run_query(
"""
MATCH (s:Entity {name: $subject})
MATCH (o:Entity {name: $object})
MERGE (s)-[r:RELATES_TO {
meeting_id: $meeting_id,
predicate: $predicate,
object_name: $object
}]->(o)
SET r.description = $description,
r.meeting_date = $meeting_date,
r.updated_at = datetime()
""",
meeting_id=meeting_id,
subject=subject,
predicate=predicate,
object=obj,
description=relation.get("description", ""),
meeting_date=meeting_date,
)
def remove_meeting_subgraph(self, meeting_id: str) -> None:
if not self.enabled:
return
self.run_query(
"""
MATCH (m:Meeting {meeting_id: $meeting_id})
OPTIONAL MATCH (m)-[mentions:MENTIONS]->(:Entity)
DELETE mentions
WITH m
OPTIONAL MATCH ()-[r:RELATES_TO {meeting_id: $meeting_id}]->()
DELETE r
WITH m
DELETE m
""",
meeting_id=meeting_id,
)
def search_facts(self, question: str, limit: int = 5) -> List[Dict[str, Any]]:
if not self.enabled or not question.strip():
return []
rows = self.run_query(
"""
CALL {
MATCH (m:Meeting)
WHERE toLower(m.title) CONTAINS toLower($question)
OR toLower(coalesce(m.summary, '')) CONTAINS toLower($question)
RETURN 'meeting' AS kind,
m.title AS title,
coalesce(m.summary, '') AS text,
m.date AS date
UNION
MATCH (e:Entity)
WHERE toLower(e.name) CONTAINS toLower($question)
OR toLower(coalesce(e.description, '')) CONTAINS toLower($question)
OPTIONAL MATCH (e)-[r:RELATES_TO]-(other:Entity)
RETURN 'entity' AS kind,
e.name AS title,
coalesce(
head(collect(
e.name + ' -[' + coalesce(r.predicate, '') + ']-> ' + coalesce(other.name, '')
)),
coalesce(e.description, '')
) AS text,
'' AS date
}
RETURN kind, title, text, date
LIMIT $limit
""",
question=question,
limit=limit,
)
return rows
graph_store = Neo4jGraphStore()

main.py

@ -1,184 +1,4 @@
import argparse
import glob as glob_module
import logging
import os
import sys
if sys.stdout.encoding and sys.stdout.encoding.lower() == "gbk":
sys.stdout.reconfigure(encoding="utf-8")
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
datefmt="%H:%M:%S",
)
logger = logging.getLogger(__name__)
def cmd_process(args):
from meeting_processor import meeting_processor
filepath = args.file
if not os.path.exists(filepath):
print(f"错误:文件不存在 {filepath}")
sys.exit(1)
print(f"正在处理会议文件:{filepath}")
archive_path = meeting_processor.process_meeting_file(filepath, force=getattr(args, "force", False))
if archive_path:
print("\n处理完成")
print(f"原文归档:{archive_path}")
else:
print("\n处理失败或已跳过")
sys.exit(1)
def cmd_text(args):
from meeting_processor import meeting_processor
print("正在处理会议文本...")
archive_path = meeting_processor.process_meeting_text(args.text, force=getattr(args, "force", False))
if archive_path:
print("\n处理完成")
print(f"原文归档:{archive_path}")
else:
print("\n处理失败或已跳过")
def cmd_query(args):
from meeting_processor import meeting_processor
print(f"查询:{args.question}")
print("-" * 40)
result = meeting_processor.query(args.question, top_k=args.top_k)
print(result if result else "未找到相关信息")
def cmd_stats(args):
from meeting_processor import meeting_processor
stats = meeting_processor.stats()
print("会议记忆系统统计")
print("-" * 40)
print(f"向量节点数:{stats.get('vector_index', {}).get('node_count', 0)}")
print(f"Neo4j 启用:{stats.get('graph', {}).get('enabled', False)}")
print(f"图谱会议数:{stats.get('graph', {}).get('meetings', 0)}")
print(f"图谱实体数:{stats.get('graph', {}).get('entities', 0)}")
print(f"行动项数:{stats.get('state', {}).get('action_items_tracked', 0)}")
print(f"指标数:{stats.get('state', {}).get('metrics_tracked', 0)}")
print(f"会议系列数:{stats.get('state', {}).get('meeting_series', 0)}")
print(f"原文归档目录:{stats.get('raw_dir', '')}")
print(f"状态文件:{stats.get('state_path', '')}")
def cmd_batch(args):
from meeting_processor import meeting_processor
files = glob_module.glob(args.pattern, recursive=True)
if not files:
print(f"未匹配到任何文件:{args.pattern}")
sys.exit(1)
print(f"找到 {len(files)} 个文件,开始批量处理...")
success = 0
for path in files:
try:
print(f"\n处理:{path}")
result = meeting_processor.process_meeting_file(path, force=getattr(args, "force", False))
if result:
success += 1
except Exception as exc:
logger.error("处理失败: %s - %s", path, exc)
print(f"\n批量处理完成:{success}/{len(files)} 成功")
def cmd_interactive():
from meeting_processor import meeting_processor
print("会议纪要长期记忆系统")
print("=" * 50)
print("可用命令:")
print(" query <问题> 语义查询会议记忆")
print(" process <路径> 处理会议文件")
print(" stats 查看统计")
print(" help 显示帮助")
print(" exit/quit 退出")
print("=" * 50)
while True:
try:
line = input("\n> ").strip()
except (EOFError, KeyboardInterrupt):
print()
break
if not line:
continue
if line in ("exit", "quit", "q"):
break
if line == "help":
print(" query <问题>")
print(" process <路径>")
print(" stats")
print(" exit/quit")
continue
if line == "stats":
cmd_stats(None)
continue
if line.startswith("process "):
filepath = line[8:].strip()
if not os.path.exists(filepath):
print(f"文件不存在:{filepath}")
continue
result = meeting_processor.process_meeting_file(filepath)
print(f"完成:{result}" if result else "处理失败或已跳过")
continue
question = line[6:].strip() if line.startswith("query ") else line
result = meeting_processor.query(question, top_k=3)
print(result if result else "未找到相关信息")
print("bye!")
def main():
parser = argparse.ArgumentParser(description="会议纪要长期记忆系统")
subparsers = parser.add_subparsers(dest="command")
p_process = subparsers.add_parser("process", help="处理会议 markdown 文件")
p_process.add_argument("file", help="会议文件路径")
p_process.add_argument("-f", "--force", action="store_true", help="发现重复时自动覆盖")
p_text = subparsers.add_parser("text", help="直接处理一段会议文本")
p_text.add_argument("text", help="会议文本内容")
p_text.add_argument("-f", "--force", action="store_true", help="发现重复时自动覆盖")
p_query = subparsers.add_parser("query", help="语义查询会议记忆")
p_query.add_argument("question", help="查询问题")
p_query.add_argument("--top-k", type=int, default=3, help="返回结果数量")
subparsers.add_parser("stats", help="查看统计")
p_batch = subparsers.add_parser("batch", help="批量处理会议文件")
p_batch.add_argument("pattern", help="glob 模式,如 meetings/*.md")
p_batch.add_argument("-f", "--force", action="store_true", help="发现重复时自动覆盖")
args = parser.parse_args()
if args.command == "process":
cmd_process(args)
elif args.command == "text":
cmd_text(args)
elif args.command == "query":
cmd_query(args)
elif args.command == "stats":
cmd_stats(args)
elif args.command == "batch":
cmd_batch(args)
else:
cmd_interactive()
from meeting_memory.cli import main
if __name__ == "__main__":
    main()


@ -0,0 +1,3 @@
from meeting_memory.config import config
__all__ = ["config"]


@ -0,0 +1,202 @@
import argparse
import glob as glob_module
import logging
import os
import sys
from meeting_memory.meeting_processor import meeting_processor
from meeting_memory.web_demo import run_demo_server
if sys.stdout.encoding and sys.stdout.encoding.lower() == "gbk":
sys.stdout.reconfigure(encoding="utf-8")
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
datefmt="%H:%M:%S",
)
logger = logging.getLogger(__name__)
def cmd_process(args):
filepath = args.file
if not os.path.exists(filepath):
print(f"错误:文件不存在 {filepath}")
sys.exit(1)
print(f"正在处理会议文件:{filepath}")
archive_path = meeting_processor.process_meeting_file(
filepath,
force=getattr(args, "force", False),
)
if archive_path:
print("\n处理完成")
print(f"原文归档:{archive_path}")
else:
print("\n处理失败或已跳过")
sys.exit(1)
def cmd_text(args):
print("正在处理会议文本...")
archive_path = meeting_processor.process_meeting_text(
args.text,
force=getattr(args, "force", False),
)
if archive_path:
print("\n处理完成")
print(f"原文归档:{archive_path}")
else:
print("\n处理失败或已跳过")
def cmd_query(args):
print(f"查询:{args.question}")
print("-" * 40)
result = meeting_processor.query(args.question, top_k=args.top_k)
print(result if result else "未找到相关信息")
def cmd_stats(_args):
stats = meeting_processor.stats()
graph = stats.get("graph", {})
state = stats.get("state", {})
print("会议纪要长期记忆系统统计")
print("-" * 40)
print(f"Neo4j 启用:{graph.get('enabled', False)}")
print(f"图谱会议数:{graph.get('meetings', 0)}")
print(f"图谱 Episode 数:{graph.get('episodes', 0)}")
print(f"图谱实体数:{graph.get('entities', 0)}")
print(f"图谱 Fact 数:{graph.get('facts', 0)}")
print(f"行动项数:{state.get('action_items_tracked', 0)}")
print(f"指标数:{state.get('metrics_tracked', 0)}")
print(f"会议系列数:{state.get('meeting_series', 0)}")
print(f"原文归档目录:{stats.get('raw_dir', '')}")
print(f"状态文件:{stats.get('state_path', '')}")
def cmd_batch(args):
files = glob_module.glob(args.pattern, recursive=True)
if not files:
print(f"未匹配到任何文件:{args.pattern}")
sys.exit(1)
print(f"找到 {len(files)} 个文件,开始批量处理...")
success = 0
for path in files:
try:
print(f"\n处理:{path}")
result = meeting_processor.process_meeting_file(
path,
force=getattr(args, "force", False),
)
if result:
success += 1
except Exception as exc:
logger.error("处理失败: %s - %s", path, exc)
print(f"\n批量处理完成:{success}/{len(files)} 成功")
def cmd_web(args):
run_demo_server(
host=getattr(args, "host", "127.0.0.1"),
port=getattr(args, "port", 8765),
)
def cmd_interactive():
print("会议纪要长期记忆系统")
print("=" * 50)
print("可用命令:")
print(" query <问题> 查询会议记忆")
print(" process <路径> 处理会议文件")
print(" stats 查看统计")
print(" help 显示帮助")
print(" exit/quit 退出")
print("=" * 50)
while True:
try:
line = input("\n> ").strip()
except (EOFError, KeyboardInterrupt):
print()
break
if not line:
continue
if line in ("exit", "quit", "q"):
break
if line == "help":
print(" query <问题>")
print(" process <路径>")
print(" stats")
print(" exit/quit")
continue
if line == "stats":
cmd_stats(None)
continue
if line.startswith("process "):
filepath = line[8:].strip()
if not os.path.exists(filepath):
print(f"文件不存在:{filepath}")
continue
result = meeting_processor.process_meeting_file(filepath)
print(f"完成:{result}" if result else "处理失败或已跳过")
continue
question = line[6:].strip() if line.startswith("query ") else line
result = meeting_processor.query(question, top_k=3)
print(result if result else "未找到相关信息")
print("bye!")
def main():
parser = argparse.ArgumentParser(description="会议纪要长期记忆系统")
subparsers = parser.add_subparsers(dest="command")
p_process = subparsers.add_parser("process", help="处理会议 markdown 文件")
p_process.add_argument("file", help="会议文件路径")
p_process.add_argument("-f", "--force", action="store_true", help="发现重复时自动覆盖")
p_text = subparsers.add_parser("text", help="直接处理一段会议文本")
p_text.add_argument("text", help="会议文本内容")
p_text.add_argument("-f", "--force", action="store_true", help="发现重复时自动覆盖")
p_query = subparsers.add_parser("query", help="查询会议记忆")
p_query.add_argument("question", help="查询问题")
p_query.add_argument("--top-k", type=int, default=3, help="返回结果数量")
subparsers.add_parser("stats", help="查看统计")
p_batch = subparsers.add_parser("batch", help="批量处理会议文件")
p_batch.add_argument("pattern", help="glob 模式,如 meetings/*.md")
p_batch.add_argument("-f", "--force", action="store_true", help="发现重复时自动覆盖")
p_web = subparsers.add_parser("web", help="启动 Web 界面")
p_web.add_argument("--host", default="127.0.0.1", help="绑定地址")
p_web.add_argument("--port", type=int, default=8765, help="服务端口")
args = parser.parse_args()
if args.command == "process":
cmd_process(args)
elif args.command == "text":
cmd_text(args)
elif args.command == "query":
cmd_query(args)
elif args.command == "stats":
cmd_stats(args)
elif args.command == "batch":
cmd_batch(args)
elif args.command == "web":
cmd_web(args)
else:
cmd_interactive()
if __name__ == "__main__":
main()
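
Review note: the `if/elif` chain in `main()` has to be extended by hand for every new subcommand (as `web` was in this commit). One possible alternative is to bind each handler to its subparser with `set_defaults(func=...)`. The sketch below is illustrative only: `build_parser`, `dispatch`, `handle_query`, and `handle_stats` are hypothetical names that do not exist in this commit, and the handlers return strings instead of calling `meeting_processor`.

```python
import argparse
from typing import List


def handle_query(args: argparse.Namespace) -> str:
    # Hypothetical stand-in for cmd_query; returns text instead of printing.
    return f"query:{args.question}:top_k={args.top_k}"


def handle_stats(args: argparse.Namespace) -> str:
    # Hypothetical stand-in for cmd_stats.
    return "stats"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="dispatch sketch")
    subparsers = parser.add_subparsers(dest="command")

    p_query = subparsers.add_parser("query")
    p_query.add_argument("question")
    p_query.add_argument("--top-k", type=int, default=3)
    p_query.set_defaults(func=handle_query)

    p_stats = subparsers.add_parser("stats")
    p_stats.set_defaults(func=handle_stats)
    return parser


def dispatch(argv: List[str]) -> str:
    args = build_parser().parse_args(argv)
    # With no subcommand there is no "func" attribute; fall back the way
    # main() falls back to cmd_interactive().
    handler = getattr(args, "func", None)
    return handler(args) if handler else "interactive"
```

With this shape, adding a subcommand touches only the parser definition; the dispatch code stays unchanged.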

View File

@ -5,7 +5,8 @@ from pydantic import BaseModel, Field
load_dotenv()
PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))
PACKAGE_ROOT = os.path.dirname(os.path.abspath(__file__))
PROJECT_ROOT = os.path.dirname(PACKAGE_ROOT)
class LLMConfig(BaseModel):
@ -26,11 +27,6 @@ class StorageConfig(BaseModel):
data_dir: str = Field(default=os.path.join(PROJECT_ROOT, "data"))
raw_dir: str = Field(default=os.path.join(PROJECT_ROOT, "data", "raw"))
class VectorStoreConfig(BaseModel):
persist_dir: str = Field(default=os.path.join(PROJECT_ROOT, "vector_store_data"))
class Neo4jConfig(BaseModel):
enabled: bool = Field(default=os.getenv("NEO4J_ENABLED", "false").lower() == "true")
uri: str = Field(default=os.getenv("NEO4J_URI", "bolt://localhost:7687"))
@ -43,7 +39,6 @@ class ProjectConfig(BaseModel):
llm: LLMConfig = Field(default_factory=LLMConfig)
embedding: EmbeddingConfig = Field(default_factory=EmbeddingConfig)
storage: StorageConfig = Field(default_factory=StorageConfig)
vector_store: VectorStoreConfig = Field(default_factory=VectorStoreConfig)
neo4j: Neo4jConfig = Field(default_factory=Neo4jConfig)
state_path: str = Field(default=os.path.join(PROJECT_ROOT, "data", "meeting_state.json"))
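
Review note: in `Neo4jConfig`, `os.getenv(...)` is passed to `Field(default=...)`, so it is evaluated once, when the class body runs at import time, and only the exact string `true` (case-insensitive) enables Neo4j. A minimal sketch of reading the flag at instantiation time instead is shown below. It uses stdlib `dataclasses` as a stand-in for the pydantic models in this commit, and `env_flag`, `Neo4jSettings`, and the accepted values `1/yes/on` are assumptions for illustration, not part of the commit.

```python
import os
from dataclasses import dataclass, field

_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean environment variable, tolerating case and whitespace."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


@dataclass
class Neo4jSettings:
    # default_factory defers the environment lookup until an instance is created.
    enabled: bool = field(default_factory=lambda: env_flag("NEO4J_ENABLED"))
    uri: str = field(
        default_factory=lambda: os.getenv("NEO4J_URI", "bolt://localhost:7687")
    )
```

Deferring the lookup mainly helps tests that change environment variables after the module has been imported.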

View File

@ -0,0 +1,364 @@
import json
import logging
import re
import sys
from typing import List, Optional
from openai import OpenAI
from pydantic import BaseModel, Field
from meeting_memory.config import config
logger = logging.getLogger(__name__)
client = OpenAI(
api_key=config.llm.api_key or None,
base_url=config.llm.base_url if config.llm.base_url else None,
)
class Entity(BaseModel):
name: str
entity_type: str
description: str = ""
class Relation(BaseModel):
subject: str
subject_type: str
predicate: str
object: str
object_type: str
description: str = ""
fact: str = ""
qualifiers: List[str] = Field(default_factory=list)
evidence: str = ""
confidence: float = 0.0
valid_at: str = ""
invalid_at: str = ""
class ActionItem(BaseModel):
task: str
assignee: str = ""
deadline: str = ""
status: str = "待办"
priority: str = ""
class Decision(BaseModel):
content: str
proposer: str = ""
status: str = "已决"
class MeetingMetric(BaseModel):
metric_name: str
value: str
target: str = ""
owner: str = ""
trend: str = ""
class MeetingExtraction(BaseModel):
title: str
date: str = ""
participants: List[str] = Field(default_factory=list)
agenda: List[str] = Field(default_factory=list)
entities: List[Entity] = Field(default_factory=list)
relations: List[Relation] = Field(default_factory=list)
action_items: List[ActionItem] = Field(default_factory=list)
decisions: List[Decision] = Field(default_factory=list)
metrics: List[MeetingMetric] = Field(default_factory=list)
summary: str = ""
EXTRACTION_SYSTEM_PROMPT = """
你是一个专业的会议知识抽取助手你的任务是从中文会议记录中抽取结构化事实尤其要抽出更细粒度更有语义深度的关系
输出要求
1. 只输出一个 JSON 对象不要输出解释文字
2. 关系抽取不要停留在部门汇报了工作这种浅层描述要尽可能向下细化到
- 责任归属
- 目标值 / 当前值 / 趋势
- 约束条件
- 因果 / 影响
- 时间要求
- 依赖关系
- 部署 / 决策 / 要求 / 风险 / 支撑关系
3. 每条关系尽量同时给出
- subject / predicate / object
- fact: 一句自然语言事实表述
- qualifiers: 限定条件范围状态数值约束等
- evidence: 原文中的关键短句或压缩证据
- confidence: 0 1 之间
- valid_at / invalid_at: 如果文中明确提到时间可填写否则留空
4. 如果原文存在多个事实不要只抽象概括要拆成多条关系
5. 避免空泛关系词优先使用更具体的谓词例如
- 负责 / 汇报 / 目标值 / 当前值 / 低于 / 高于 / 要求 / 督导 / 推进 / 影响 / 支撑 / 依赖 / 计划 / 完成 / 截止于
"""
def _call_llm(system: str, user: str, stream: bool = False) -> str:
if not stream:
response = client.chat.completions.create(
model=config.llm.model,
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user},
],
max_tokens=config.llm.max_tokens,
temperature=config.llm.temperature,
)
content = response.choices[0].message.content
if content is None:
raise ValueError("LLM returned empty response")
return content
response = client.chat.completions.create(
model=config.llm.model,
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user},
],
max_tokens=config.llm.max_tokens,
temperature=config.llm.temperature,
stream=True,
)
chunks: List[str] = []
print("\n[LLM] 开始抽取,流式输出中:")
for event in response:
if not event.choices:
continue
delta = event.choices[0].delta.content
if not delta:
continue
chunks.append(delta)
sys.stdout.write(delta)
sys.stdout.flush()
print("\n[LLM] 抽取输出结束")
return "".join(chunks)
def extract_meeting_info(text: str, stream: bool = False) -> MeetingExtraction:
user_prompt = f"""
请从下面会议记录中提取结构化信息并重点做深层关系抽取
输出 JSON 字段
- title
- date
- participants
- agenda
- entities: name, entity_type, description
- relations:
- subject
- subject_type
- predicate
- object
- object_type
- description
- fact
- qualifiers
- evidence
- confidence
- valid_at
- invalid_at
- action_items: task, assignee, deadline, status, priority
- decisions: content, proposer, status
- metrics: metric_name, value, target, owner, trend
- summary
关系抽取规则
1. 不要只抽汇报了工作这种会议动作要尽量继续下钻出具体事实
2. 如果一句话里同时包含主体 + 指标 + 当前值 + 目标值 + 负责人 + 趋势应拆成多条关系或在 qualifiers 中保留这些细节
3. 对于要求部署负责依赖影响约束目标风险类信息优先保留
4. fact 必须是一句完整自然可检索的事实描述
5. qualifiers 用于补充数值范围状态条件截止时间优先级等信息
6. evidence 用原文中的关键词短句不要太长
7. confidence 取值 0 1
会议记录如下
{text}
"""
content = _call_llm(EXTRACTION_SYSTEM_PROMPT, user_prompt, stream=stream)
data = _try_parse_json(content)
data = _normalize_meeting_data(data)
return MeetingExtraction(**data)
def _try_parse_json(content: str) -> dict:
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        logger.warning("JSON parsing failed; trying to repair extracted block")
        match = re.search(r"\{.*\}", content, re.DOTALL)
        if match:
            try:
                return json.loads(match.group())
            except json.JSONDecodeError as exc:
                logger.error("Repaired JSON still failed to parse: %s", exc)
        # Re-raise the original JSONDecodeError; this must stay inside the outer except block.
        raise
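def _normalize_meeting_data(data: dict) -> dict:
    if not isinstance(data, dict):
        # Continue with an empty payload so every field still gets a default.
        # Returning {} here would make MeetingExtraction(**data) fail on the required title.
        data = {}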
return {
"title": _as_str(data.get("title")),
"date": _as_str(data.get("date")),
"participants": _as_str_list(data.get("participants")),
"agenda": _as_str_list(data.get("agenda")),
"entities": _normalize_entities(data.get("entities")),
"relations": _normalize_relations(data.get("relations")),
"action_items": _normalize_action_items(data.get("action_items")),
"decisions": _normalize_decisions(data.get("decisions")),
"metrics": _normalize_metrics(data.get("metrics")),
"summary": _as_str(data.get("summary")),
}
def _as_str(value) -> str:
if value is None:
return ""
if isinstance(value, str):
return value
return str(value)
def _as_float(value) -> float:
if value is None or value == "":
return 0.0
try:
numeric = float(value)
return max(0.0, min(1.0, numeric))
except (TypeError, ValueError):
return 0.0
def _as_str_list(value) -> List[str]:
if isinstance(value, dict):
items = []
for key, item in value.items():
key_text = _as_str(key)
value_text = _as_str(item)
if key_text and value_text:
items.append(f"{key_text}: {value_text}")
elif key_text:
items.append(key_text)
elif value_text:
items.append(value_text)
return items
if not isinstance(value, list):
return []
return [_as_str(item) for item in value if item is not None]
def _normalize_entities(value) -> List[dict]:
if not isinstance(value, list):
return []
items = []
for entity in value:
if not isinstance(entity, dict):
continue
items.append(
{
"name": _as_str(entity.get("name")),
"entity_type": _as_str(entity.get("entity_type")),
"description": _as_str(entity.get("description")),
}
)
return items
def _normalize_relations(value) -> List[dict]:
if not isinstance(value, list):
return []
items = []
for relation in value:
if not isinstance(relation, dict):
continue
subject = _as_str(relation.get("subject"))
predicate = _as_str(relation.get("predicate"))
obj = _as_str(relation.get("object"))
description = _as_str(relation.get("description"))
fact = _as_str(relation.get("fact"))
if not fact and subject and predicate and obj:
fact = f"{subject} {predicate} {obj}"
items.append(
{
"subject": subject,
"subject_type": _as_str(relation.get("subject_type")),
"predicate": predicate,
"object": obj,
"object_type": _as_str(relation.get("object_type")),
"description": description,
"fact": fact,
"qualifiers": _as_str_list(relation.get("qualifiers")),
"evidence": _as_str(relation.get("evidence")),
"confidence": _as_float(relation.get("confidence")),
"valid_at": _as_str(relation.get("valid_at")),
"invalid_at": _as_str(relation.get("invalid_at")),
}
)
return items
def _normalize_action_items(value) -> List[dict]:
if not isinstance(value, list):
return []
items = []
for action in value:
if not isinstance(action, dict):
continue
items.append(
{
"task": _as_str(action.get("task")),
"assignee": _as_str(action.get("assignee")),
"deadline": _as_str(action.get("deadline")),
"status": _as_str(action.get("status")) or "待办",
"priority": _as_str(action.get("priority")) or "",
}
)
return items
def _normalize_decisions(value) -> List[dict]:
if not isinstance(value, list):
return []
items = []
for decision in value:
if not isinstance(decision, dict):
continue
items.append(
{
"content": _as_str(decision.get("content")),
"proposer": _as_str(decision.get("proposer")),
"status": _as_str(decision.get("status")) or "已决",
}
)
return items
def _normalize_metrics(value) -> List[dict]:
if not isinstance(value, list):
return []
items = []
for metric in value:
if not isinstance(metric, dict):
continue
items.append(
{
"metric_name": _as_str(metric.get("metric_name")),
"value": _as_str(metric.get("value")),
"target": _as_str(metric.get("target")),
"owner": _as_str(metric.get("owner")),
"trend": _as_str(metric.get("trend")),
}
)
return items
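
Review note: the parsing path above relies on two defensive steps, recovering a JSON object from a reply that wraps it in extra text, and clamping `confidence` into the 0 to 1 range. The following standalone sketch mirrors that behavior so it can be tried without an LLM call. `parse_llm_json`, `clamp_confidence`, and the sample reply are hypothetical and written for illustration; they are not the functions from this commit.

```python
import json
import re


def parse_llm_json(content: str) -> dict:
    """Parse JSON from an LLM reply, falling back to the outermost {...} block."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Greedy match with DOTALL spans from the first "{" to the last "}".
        match = re.search(r"\{.*\}", content, re.DOTALL)
        if match is None:
            raise
        return json.loads(match.group())


def clamp_confidence(value) -> float:
    """Coerce a confidence value into [0.0, 1.0]; unparseable input becomes 0.0."""
    try:
        return max(0.0, min(1.0, float(value)))
    except (TypeError, ValueError):
        return 0.0


# A reply that wraps the JSON object in prose and a markdown fence.
reply = 'Here is the result:\n```json\n{"title": "Weekly sync", "relations": []}\n```'
parsed = parse_llm_json(reply)
```

The greedy pattern assumes the reply contains a single top-level object; a reply with two separate objects would be matched as one span and fail to parse.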

View File

@ -0,0 +1,784 @@
import hashlib
import json
import logging
import re
import time
from typing import Any, Dict, List, Optional
from meeting_memory.config import config
from meeting_memory.services.embedding_service import embedding_service
logger = logging.getLogger(__name__)
def _cosine_similarity(left: List[float], right: List[float]) -> float:
if not left or not right or len(left) != len(right):
return 0.0
dot = sum(a * b for a, b in zip(left, right))
left_norm = sum(a * a for a in left) ** 0.5
right_norm = sum(b * b for b in right) ** 0.5
if left_norm == 0 or right_norm == 0:
return 0.0
return dot / (left_norm * right_norm)
def _keyword_score(text: str, question: str) -> float:
source = (text or "").lower()
terms = _keyword_terms(question)
if not source or not terms:
return 0.0
hits = sum(1 for term in terms if term in source)
return hits / len(terms)
def _keyword_terms(text: str) -> List[str]:
normalized = (text or "").lower()
raw_terms = re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]{2,}", normalized)
stopwords = {"是什么", "多少", "分别", "以及", "还有", "当前值", "目标值"}
terms: List[str] = []
for raw in raw_terms:
if raw in stopwords:
continue
if raw not in terms:
terms.append(raw)
if re.fullmatch(r"[\u4e00-\u9fff]{4,}", raw):
for size in (2, 3, 4):
for idx in range(0, len(raw) - size + 1):
piece = raw[idx : idx + size]
if piece not in stopwords and piece not in terms:
terms.append(piece)
return terms
class Neo4jGraphStore:
def __init__(self):
self._driver = None
self._enabled = False
self._uri = config.neo4j.uri
self._last_failure_at = 0.0
self._retry_cooldown_seconds = 10.0
self._connect()
def _connect(self):
if not config.neo4j.enabled:
logger.info("Neo4j graph store disabled")
return
try:
from neo4j import GraphDatabase
except ImportError:
logger.warning("neo4j package is not installed")
return
if not config.neo4j.password:
logger.warning("Neo4j is enabled but NEO4J_PASSWORD is empty")
return
tried_uris = [self._uri]
if self._uri.startswith("neo4j://"):
tried_uris.append("bolt://" + self._uri[len("neo4j://") :])
for uri in tried_uris:
driver = None
try:
driver = GraphDatabase.driver(
uri,
auth=(config.neo4j.user, config.neo4j.password),
)
driver.verify_connectivity()
self._driver = driver
self._uri = uri
self._enabled = True
self._last_failure_at = 0.0
if uri != config.neo4j.uri:
logger.warning("Neo4j routing URI unavailable; fell back to %s", uri)
return
            except Exception as exc:
                logger.warning("Neo4j connection failed for %s: %s", uri, exc)
                # driver is still None when GraphDatabase.driver() itself raised.
                if driver is not None:
                    try:
                        driver.close()
                    except Exception:
                        pass
self._mark_unavailable("Neo4j is currently unreachable")
@property
def enabled(self) -> bool:
if not self._enabled and self._should_retry_connect():
self._connect()
return self._enabled and self._driver is not None
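    def _should_retry_connect(self) -> bool:
        # Never retry when Neo4j is switched off in config; otherwise every
        # access to `enabled` would call _connect() and log "disabled" again.
        if not config.neo4j.enabled:
            return False
        return (time.time() - self._last_failure_at) >= self._retry_cooldown_seconds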
def _mark_unavailable(self, reason: str = "") -> None:
if reason:
logger.warning("Neo4j temporarily disabled: %s", reason)
self._enabled = False
self._last_failure_at = time.time()
if self._driver is not None:
try:
self._driver.close()
except Exception:
pass
self._driver = None
@staticmethod
def meeting_id(meeting_data: dict) -> str:
title = meeting_data.get("title", "")
date = meeting_data.get("date", "")
raw = f"{date}_{title}"
return f"meeting_{hashlib.md5(raw.encode('utf-8')).hexdigest()[:12]}"
def close(self):
if self._driver is not None:
self._driver.close()
def run_query(self, query: str, **params) -> List[Dict[str, Any]]:
if not self.enabled:
return []
try:
with self._driver.session(database=config.neo4j.database) as session:
result = session.run(query, **params)
return [record.data() for record in result]
except Exception as exc:
logger.warning("Neo4j query failed: %s", exc)
self._mark_unavailable(str(exc))
return []
def initialize_schema(self):
if not self.enabled:
return
statements = [
"CREATE CONSTRAINT meeting_id IF NOT EXISTS FOR (m:Meeting) REQUIRE m.meeting_id IS UNIQUE",
"CREATE CONSTRAINT episode_id IF NOT EXISTS FOR (e:Episode) REQUIRE e.episode_id IS UNIQUE",
"CREATE CONSTRAINT entity_name IF NOT EXISTS FOR (e:Entity) REQUIRE e.name IS UNIQUE",
"CREATE CONSTRAINT fact_id IF NOT EXISTS FOR (f:Fact) REQUIRE f.fact_id IS UNIQUE",
"CREATE INDEX meeting_title IF NOT EXISTS FOR (m:Meeting) ON (m.title)",
"CREATE INDEX episode_title IF NOT EXISTS FOR (e:Episode) ON (e.title)",
"CREATE INDEX entity_type IF NOT EXISTS FOR (e:Entity) ON (e.entity_type)",
"CREATE INDEX fact_predicate IF NOT EXISTS FOR (f:Fact) ON (f.predicate)",
]
for statement in statements:
self.run_query(statement)
def get_stats(self) -> Dict[str, Any]:
if not self.enabled:
return {"enabled": False}
rows = self.run_query(
"""
CALL () {
MATCH (m:Meeting)
RETURN count(m) AS meetings
}
CALL () {
MATCH (ep:Episode)
RETURN count(ep) AS episodes
}
CALL () {
MATCH (e:Entity)
RETURN count(e) AS entities
}
CALL () {
MATCH (f:Fact)
RETURN count(f) AS facts
}
RETURN meetings, episodes, entities, facts
"""
)
if not rows:
return {"enabled": False, "meetings": 0, "episodes": 0, "entities": 0, "facts": 0}
return {"enabled": True, **rows[0]}
def upsert_meeting_subgraph(self, meeting_data: dict) -> None:
if not self.enabled:
return
meeting_id = meeting_data.get("_graph_meeting_id") or self.meeting_id(meeting_data)
episode_text = self._build_episode_text(meeting_data)
episode_embedding = embedding_service.embed_text(episode_text)
self.initialize_schema()
self.run_query(
"""
MERGE (m:Meeting {meeting_id: $meeting_id})
SET m.title = $title,
m.date = $date,
m.summary = $summary,
m.content_hash = $content_hash,
m.raw_path = $raw_path,
m.updated_at = datetime()
MERGE (ep:Episode {episode_id: $meeting_id})
SET ep.title = $title,
ep.date = $date,
ep.summary = $summary,
ep.content = $content,
ep.content_hash = $content_hash,
ep.raw_path = $raw_path,
ep.participants = $participants,
ep.content_embedding = $content_embedding,
ep.updated_at = datetime()
MERGE (m)-[:HAS_EPISODE]->(ep)
""",
meeting_id=meeting_id,
title=meeting_data.get("title", ""),
date=meeting_data.get("date", ""),
summary=meeting_data.get("summary", ""),
content_hash=meeting_data.get("_content_hash", ""),
raw_path=meeting_data.get("_original_text_path", ""),
content=episode_text,
participants=meeting_data.get("participants", []),
content_embedding=episode_embedding,
)
for entity in meeting_data.get("entities", []):
self._upsert_entity(meeting_id, entity)
for participant in meeting_data.get("participants", []):
self._upsert_entity(
meeting_id,
{"name": participant, "entity_type": "participant", "description": ""},
)
for relation in meeting_data.get("relations", []):
self._upsert_relation(meeting_id, relation, meeting_data.get("date", ""))
def _upsert_entity(self, meeting_id: str, entity: dict) -> None:
name = entity.get("name", "").strip()
if not name:
return
summary = self._entity_summary(entity)
name_embedding = embedding_service.embed_text(summary or name)
self.run_query(
"""
MATCH (:Meeting {meeting_id: $meeting_id})-[:HAS_EPISODE]->(ep:Episode {episode_id: $meeting_id})
MERGE (e:Entity {name: $name})
SET e.entity_type = CASE
WHEN $entity_type <> '' THEN $entity_type
ELSE coalesce(e.entity_type, '')
END,
e.description = CASE
WHEN $description <> '' THEN $description
ELSE coalesce(e.description, '')
END,
e.summary = CASE
WHEN $summary <> '' THEN $summary
ELSE coalesce(e.summary, '')
END,
e.name_embedding = CASE
WHEN size($name_embedding) > 0 THEN $name_embedding
ELSE coalesce(e.name_embedding, [])
END,
e.updated_at = datetime()
MERGE (ep)-[:MENTIONS]->(e)
""",
meeting_id=meeting_id,
name=name,
entity_type=entity.get("entity_type", ""),
description=entity.get("description", ""),
summary=summary,
name_embedding=name_embedding,
)
def _upsert_relation(self, meeting_id: str, relation: dict, meeting_date: str) -> None:
subject = relation.get("subject", "").strip()
predicate = relation.get("predicate", "").strip()
obj = relation.get("object", "").strip()
if not subject or not predicate or not obj:
return
self._upsert_entity(
meeting_id,
{
"name": subject,
"entity_type": relation.get("subject_type", ""),
"description": "",
},
)
self._upsert_entity(
meeting_id,
{
"name": obj,
"entity_type": relation.get("object_type", ""),
"description": "",
},
)
fact_text = self._fact_text(relation)
fact_id = hashlib.md5(
f"{meeting_id}|{subject}|{predicate}|{obj}".encode("utf-8")
).hexdigest()
fact_embedding = embedding_service.embed_text(fact_text)
self.run_query(
"""
MATCH (:Meeting {meeting_id: $meeting_id})-[:HAS_EPISODE]->(ep:Episode {episode_id: $meeting_id})
MATCH (s:Entity {name: $subject})
MATCH (o:Entity {name: $object})
MERGE (f:Fact {fact_id: $fact_id})
SET f.fact = $fact,
f.predicate = $predicate,
f.description = $description,
f.qualifiers = $qualifiers,
f.evidence = $evidence,
f.confidence = $confidence,
f.valid_at = $valid_at,
f.invalid_at = $invalid_at,
f.meeting_id = $meeting_id,
f.meeting_date = $meeting_date,
f.fact_embedding = $fact_embedding,
f.updated_at = datetime()
MERGE (ep)-[:HAS_FACT]->(f)
MERGE (s)-[:FACT_SOURCE]->(f)
MERGE (f)-[:FACT_TARGET]->(o)
""",
meeting_id=meeting_id,
subject=subject,
predicate=predicate,
object=obj,
fact_id=fact_id,
fact=fact_text,
description=relation.get("description", ""),
qualifiers=relation.get("qualifiers", []),
evidence=relation.get("evidence", ""),
confidence=relation.get("confidence", 0.0),
valid_at=relation.get("valid_at", ""),
invalid_at=relation.get("invalid_at", ""),
meeting_date=meeting_date,
fact_embedding=fact_embedding,
)
def remove_meeting_subgraph(self, meeting_id: str) -> None:
if not self.enabled:
return
self.run_query(
"""
MATCH (m:Meeting {meeting_id: $meeting_id})-[:HAS_EPISODE]->(ep:Episode {episode_id: $meeting_id})
OPTIONAL MATCH (ep)-[mention:MENTIONS]->(entity:Entity)
OPTIONAL MATCH (ep)-[has_fact:HAS_FACT]->(fact:Fact)
OPTIONAL MATCH (fact)-[target_rel:FACT_TARGET]->(:Entity)
OPTIONAL MATCH (:Entity)-[source_rel:FACT_SOURCE]->(fact)
DELETE mention, has_fact, target_rel, source_rel
WITH m, ep, collect(DISTINCT fact) AS facts, collect(DISTINCT entity) AS entities
FOREACH (fact IN facts | DELETE fact)
DETACH DELETE ep, m
WITH entities
UNWIND entities AS entity
WITH DISTINCT entity WHERE entity IS NOT NULL
OPTIONAL MATCH (entity)<-[m1:MENTIONS]-(:Episode)
OPTIONAL MATCH (entity)-[m2:FACT_SOURCE|FACT_TARGET]-(:Fact)
WITH entity, count(m1) + count(m2) AS refs
WHERE refs = 0
DELETE entity
""",
meeting_id=meeting_id,
)
def get_meeting(self, title: str, date: str = "") -> Optional[Dict[str, Any]]:
if not self.enabled:
return None
rows = self.run_query(
"""
MATCH (m:Meeting)
WHERE m.title = $title
AND ($date = '' OR m.date = $date)
RETURN m.meeting_id AS meeting_id,
m.title AS title,
m.date AS date,
m.summary AS summary,
m.content_hash AS content_hash
LIMIT 1
""",
title=title,
date=date,
)
return rows[0] if rows else None
def find_similar_episode(self, text: str, threshold: float = 0.92) -> Optional[Dict[str, Any]]:
if not self.enabled or not text.strip():
return None
query_embedding = embedding_service.embed_text(text)
rows = self.run_query(
"""
MATCH (m:Meeting)-[:HAS_EPISODE]->(ep:Episode)
RETURN m.meeting_id AS meeting_id,
m.title AS title,
m.date AS date,
m.content_hash AS content_hash,
ep.content_embedding AS content_embedding
"""
)
best_match = None
for row in rows:
score = _cosine_similarity(query_embedding, row.get("content_embedding", []))
if score >= threshold and (best_match is None or score > best_match["score"]):
best_match = {
"metadata": {
"meeting_id": row.get("meeting_id", ""),
"title": row.get("title", ""),
"date": row.get("date", ""),
"content_hash": row.get("content_hash", ""),
},
"score": score,
}
return best_match
def hybrid_search(self, question: str, limit: int = 5) -> List[Dict[str, Any]]:
if not self.enabled or not question.strip():
return []
query_embedding = embedding_service.embed_text(question)
candidates = self._load_fact_candidates()
candidates.extend(self._load_entity_candidates())
candidates.extend(self._load_episode_candidates())
scored = []
for item in candidates:
combined_text = " ".join(
[
str(item.get("title") or ""),
str(item.get("text") or ""),
str(item.get("meeting_title") or ""),
str(item.get("date") or ""),
]
)
semantic = _cosine_similarity(query_embedding, item.get("embedding", []))
lexical = _keyword_score(combined_text, question)
graph_bonus = 0.1 if item.get("kind") == "fact" else 0.05
score = semantic * 0.7 + lexical * 0.2 + graph_bonus
if score <= 0:
continue
scored.append(
{
**item,
"score": round(score, 4),
"semantic_score": round(semantic, 4),
"keyword_score": round(lexical, 4),
}
)
scored.sort(key=lambda row: row["score"], reverse=True)
return scored[:limit]
def search_facts(self, question: str, limit: int = 5) -> List[Dict[str, Any]]:
return self.hybrid_search(question, limit=limit)
def get_graph_kinds(self) -> List[Dict[str, Any]]:
if not self.enabled:
return []
rows = self.run_query(
"""
MATCH (n)
WHERE n:Meeting OR n:Episode OR n:Entity OR n:Fact
WITH [lbl IN labels(n) WHERE lbl IN ['Meeting','Episode','Entity','Fact']][0] AS kind
RETURN kind, count(*) AS count
ORDER BY count DESC
"""
)
return rows
def get_entity_types(self) -> List[Dict[str, Any]]:
if not self.enabled:
return []
return self.run_query(
"""
MATCH (e:Entity)
WHERE coalesce(e.entity_type, '') <> ''
RETURN e.entity_type AS entity_type, count(*) AS count
ORDER BY count DESC
"""
)
def get_graph_snapshot(
self,
query: str = "",
entity_types: Optional[List[str]] = None,
kinds: Optional[List[str]] = None,
limit_nodes: int = 80,
limit_edges: int = 160,
) -> Dict[str, Any]:
if not self.enabled:
return {"nodes": [], "edges": [], "stats": {"enabled": False}}
keyword_terms = _keyword_terms(query) if query else []
raw_nodes = self.run_query(
"""
MATCH (n)
WHERE (n:Meeting OR n:Episode OR n:Entity OR n:Fact)
AND ($kinds = [] OR [lbl IN labels(n) WHERE lbl IN ['Meeting','Episode','Entity','Fact']][0] IN $kinds)
AND ($terms = []
OR (n:Meeting AND any(t IN $terms WHERE toLower(coalesce(n.title,'')) CONTAINS t OR toLower(coalesce(n.summary,'')) CONTAINS t))
OR (n:Episode AND any(t IN $terms WHERE toLower(coalesce(n.title,'')) CONTAINS t OR toLower(coalesce(n.content,'')) CONTAINS t))
OR (n:Entity AND any(t IN $terms WHERE toLower(coalesce(n.name,'')) CONTAINS t OR toLower(coalesce(n.summary,'')) CONTAINS t OR toLower(coalesce(n.description,'')) CONTAINS t))
OR (n:Fact AND any(t IN $terms WHERE toLower(coalesce(n.fact,'')) CONTAINS t OR toLower(coalesce(n.predicate,'')) CONTAINS t OR toLower(coalesce(n.description,'')) CONTAINS t))
)
AND ($types = [] OR NOT n:Entity OR coalesce(n.entity_type, '') IN $types)
OPTIONAL MATCH (n)-[r]-()
RETURN n.meeting_id AS meeting_id,
n.episode_id AS episode_id,
n.name AS entity_name,
n.fact_id AS fact_id,
n.title AS title,
n.summary AS summary,
n.date AS date,
n.entity_type AS entity_type,
n.description AS description,
n.predicate AS predicate,
n.fact AS fact,
n.confidence AS confidence,
n.meeting_date AS meeting_date,
[lbl IN labels(n) WHERE lbl IN ['Meeting','Episode','Entity','Fact']][0] AS kind,
count(DISTINCT r) AS degree
ORDER BY degree DESC, coalesce(n.title, n.name, n.fact) ASC
LIMIT $limit_nodes
""",
terms=keyword_terms,
types=entity_types or [],
kinds=kinds or [],
limit_nodes=limit_nodes,
)
if not raw_nodes:
return {"nodes": [], "edges": [], "stats": self.get_stats()}
all_raw_ids = set()
nodes = []
for row in raw_nodes:
kind = row.get("kind", "")
if kind == "Meeting":
raw_id = row.get("meeting_id", "")
label = row.get("title", "") or raw_id
elif kind == "Episode":
raw_id = row.get("episode_id", "")
label = row.get("title", "") or raw_id
elif kind == "Entity":
raw_id = row.get("entity_name", "")
label = raw_id
elif kind == "Fact":
raw_id = row.get("fact_id", "")
label = row.get("predicate", "") or row.get("fact", "") or raw_id
else:
continue
if not raw_id:
continue
nid = f"{kind}:{raw_id}"
all_raw_ids.add(raw_id)
nodes.append({
"id": nid,
"label": label,
"kind": kind,
"entity_type": row.get("entity_type", "") if kind == "Entity" else "",
"description": row.get("description", "") or row.get("summary", "") or "",
"date": row.get("date", "") or row.get("meeting_date", "") or "",
"degree": row.get("degree", 0),
"fact": row.get("fact", "") if kind == "Fact" else "",
"summary": row.get("summary", "") or "",
})
if not nodes:
return {"nodes": [], "edges": [], "stats": self.get_stats()}
ids_list = list(all_raw_ids)
edges_raw = self.run_query(
"""
MATCH (s)-[r]->(t)
WHERE type(r) IN ['HAS_EPISODE','MENTIONS','HAS_FACT','FACT_SOURCE','FACT_TARGET']
AND (
(s:Meeting AND s.meeting_id IN $ids)
OR (s:Episode AND s.episode_id IN $ids)
OR (s:Entity AND s.name IN $ids)
OR (s:Fact AND s.fact_id IN $ids)
)
AND (
(t:Meeting AND t.meeting_id IN $ids)
OR (t:Episode AND t.episode_id IN $ids)
OR (t:Entity AND t.name IN $ids)
OR (t:Fact AND t.fact_id IN $ids)
)
RETURN type(r) AS predicate,
CASE WHEN s:Meeting THEN s.meeting_id
WHEN s:Episode THEN s.episode_id
WHEN s:Entity THEN s.name
WHEN s:Fact THEN s.fact_id END AS source_raw,
CASE WHEN t:Meeting THEN t.meeting_id
WHEN t:Episode THEN t.episode_id
WHEN t:Entity THEN t.name
WHEN t:Fact THEN t.fact_id END AS target_raw,
CASE WHEN s:Meeting THEN 'Meeting' WHEN s:Episode THEN 'Episode'
WHEN s:Entity THEN 'Entity' WHEN s:Fact THEN 'Fact' END AS source_kind,
CASE WHEN t:Meeting THEN 'Meeting' WHEN t:Episode THEN 'Episode'
WHEN t:Entity THEN 'Entity' WHEN t:Fact THEN 'Fact' END AS target_kind,
CASE WHEN s:Fact THEN coalesce(s.predicate, '')
WHEN t:Fact THEN coalesce(t.predicate, '') ELSE '' END AS fact_predicate,
CASE WHEN s:Fact THEN coalesce(s.fact, '')
WHEN t:Fact THEN coalesce(t.fact, '') ELSE '' END AS fact_text,
CASE WHEN s:Fact THEN coalesce(s.description, '')
WHEN t:Fact THEN coalesce(t.description, '') ELSE '' END AS fact_description,
CASE WHEN s:Fact THEN coalesce(s.confidence, 0.0)
WHEN t:Fact THEN coalesce(t.confidence, 0.0) ELSE 0.0 END AS fact_confidence,
CASE WHEN s:Fact THEN coalesce(s.meeting_date, '')
WHEN t:Fact THEN coalesce(t.meeting_date, '') ELSE '' END AS fact_date,
CASE WHEN s:Fact THEN coalesce(s.meeting_id, '')
WHEN t:Fact THEN coalesce(t.meeting_id, '') ELSE '' END AS fact_meeting_id
LIMIT $limit_edges
""",
ids=list(all_raw_ids),
limit_edges=limit_edges,
)
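        degree_map: Dict[str, int] = {}
        for row in edges_raw:
            # The edge query returns *_kind / *_raw columns, not "source"/"target";
            # rebuild the same "Kind:raw_id" keys that the node ids use.
            src = f"{row.get('source_kind', '')}:{row.get('source_raw', '')}"
            tgt = f"{row.get('target_kind', '')}:{row.get('target_raw', '')}"
            degree_map[src] = degree_map.get(src, 0) + 1
            degree_map[tgt] = degree_map.get(tgt, 0) + 1
        for node in nodes:
            node["degree"] = degree_map.get(node["id"], node.get("degree", 0))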
edges = []
for idx, row in enumerate(edges_raw, start=1):
sk = row.get("source_kind", "")
tk = row.get("target_kind", "")
edges.append({
"id": f"edge_{idx}",
"source": f"{sk}:{row['source_raw']}" if sk and row.get("source_raw") else "",
"target": f"{tk}:{row['target_raw']}" if tk and row.get("target_raw") else "",
"predicate": row.get("predicate", ""),
"fact": row.get("fact_text", "") or row.get("fact_description", "") or "",
"description": row.get("fact_description", "") or "",
"confidence": row.get("fact_confidence", 0.0),
"date": row.get("fact_date", "") or "",
"meeting_id": row.get("fact_meeting_id", "") or "",
})
return {
"nodes": nodes,
"edges": edges,
"stats": self.get_stats(),
"query": query,
}
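The degree bookkeeping in `get_graph_snapshot` can be looked at in isolation. The following is a minimal sketch, assuming each edge row carries `source` and `target` keys that match node `id` values; `annotate_degrees` is a hypothetical helper name and is not part of this commit.

```python
from collections import Counter
from typing import Any, Dict, List


def annotate_degrees(
    nodes: List[Dict[str, Any]], edge_rows: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Set node["degree"] to the number of edge endpoints that touch the node."""
    degree: Counter = Counter()
    for row in edge_rows:
        # Each edge contributes one to both of its endpoints.
        degree[row.get("source", "")] += 1
        degree[row.get("target", "")] += 1
    for node in nodes:
        # Keep a previously stored degree when no edge in this snapshot touches the node.
        node["degree"] = degree.get(node["id"], node.get("degree", 0))
    return nodes


nodes = [{"id": "Entity:Alice"}, {"id": "Entity:Bob"}, {"id": "Meeting:m1", "degree": 3}]
edges = [
    {"source": "Entity:Alice", "target": "Entity:Bob"},
    {"source": "Entity:Alice", "target": "Fact:f1"},
]
annotate_degrees(nodes, edges)
```

If the rows only expose the `source_kind` / `source_raw` pair, the counter key would probably need to be rebuilt as `f"{kind}:{raw}"` first so that it lines up with the node ids used for the emitted edges.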
def format_search_context(self, question: str, top_k: int = 5) -> str:
results = self.hybrid_search(question, limit=top_k)
if not results:
return ""
lines = []
for idx, row in enumerate(results, start=1):
date = row.get("date", "")
meeting_title = row.get("meeting_title", "")
title = row.get("title", row.get("kind", "item"))
suffix = f" ({date})" if date else ""
source = f" | 来源会议: {meeting_title}" if meeting_title else ""
lines.append(
f"[{idx}] {title}{suffix}{source}\n"
f"{row.get('text', '')}\n"
f"score={row.get('score', 0):.4f}, semantic={row.get('semantic_score', 0):.4f}, keyword={row.get('keyword_score', 0):.4f}"
)
return "\n\n".join(lines)
def _load_fact_candidates(self) -> List[Dict[str, Any]]:
return self.run_query(
"""
MATCH (ep:Episode)-[:HAS_FACT]->(f:Fact)
OPTIONAL MATCH (s:Entity)-[:FACT_SOURCE]->(f)
OPTIONAL MATCH (f)-[:FACT_TARGET]->(o:Entity)
RETURN 'fact' AS kind,
coalesce(s.name + ' -[' + coalesce(f.predicate, '') + ']-> ' + o.name, f.fact) AS title,
coalesce(
f.description + CASE
WHEN size(coalesce(f.qualifiers, [])) > 0 THEN ' | ' + reduce(acc = '', item IN f.qualifiers |
acc + CASE WHEN acc = '' THEN item ELSE '; ' + item END
)
ELSE ''
END,
f.fact,
''
) AS text,
ep.date AS date,
ep.title AS meeting_title,
f.fact_embedding AS embedding
"""
)
def _load_entity_candidates(self) -> List[Dict[str, Any]]:
return self.run_query(
"""
MATCH (e:Entity)
OPTIONAL MATCH (ep:Episode)-[:MENTIONS]->(e)
RETURN 'entity' AS kind,
e.name AS title,
coalesce(e.summary, e.description, '') AS text,
max(ep.date) AS date,
head(collect(DISTINCT ep.title)) AS meeting_title,
e.name_embedding AS embedding
"""
)
def _load_episode_candidates(self) -> List[Dict[str, Any]]:
return self.run_query(
"""
MATCH (m:Meeting)-[:HAS_EPISODE]->(ep:Episode)
RETURN 'episode' AS kind,
m.title AS title,
coalesce(ep.summary, ep.content, '') AS text,
ep.date AS date,
m.title AS meeting_title,
ep.content_embedding AS embedding
"""
)
@staticmethod
def _entity_summary(entity: dict) -> str:
entity_type = entity.get("entity_type", "").strip()
name = entity.get("name", "").strip()
description = entity.get("description", "").strip()
parts = [part for part in [entity_type, name, description] if part]
return " | ".join(parts)
@staticmethod
def _fact_text(relation: dict) -> str:
subject = relation.get("subject", "").strip()
predicate = relation.get("predicate", "").strip()
obj = relation.get("object", "").strip()
description = relation.get("description", "").strip()
fact = relation.get("fact", "").strip() or f"{subject} {predicate} {obj}".strip()
qualifiers = relation.get("qualifiers") or []  # also covers an explicit None value
qualifier_text = "; ".join(item for item in qualifiers if item)
if description and qualifier_text:
return f"{fact}. {description}. {qualifier_text}"
if description:
return f"{fact}. {description}"
if qualifier_text:
return f"{fact}. {qualifier_text}"
return fact
@staticmethod
def _build_episode_text(meeting_data: dict) -> str:
payload = {
"title": meeting_data.get("title", ""),
"date": meeting_data.get("date", ""),
"participants": meeting_data.get("participants", []),
"summary": meeting_data.get("summary", ""),
"entities": meeting_data.get("entities", []),
"relations": meeting_data.get("relations", []),
"action_items": meeting_data.get("action_items", []),
"metrics": meeting_data.get("metrics", []),
"decisions": meeting_data.get("decisions", []),
"original_text": meeting_data.get("_original_text", ""),
}
return json.dumps(payload, ensure_ascii=False)
graph_store = Neo4jGraphStore()
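The candidate loaders above return rows with `kind`, `title`, `text`, and `embedding`, and `format_search_context` prints `score`, `semantic_score`, and `keyword_score`. The body of `hybrid_search` is not visible in this part of the diff, so the following is only a sketch of one plausible way to combine the two signals: a weighted sum of cosine similarity and keyword overlap. The 0.7 weight and the helper names `cosine`, `keyword_overlap`, and `rank_candidates` are assumptions made for illustration.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 for empty, mismatched, or zero-length vectors."""
    if not a or not b or len(a) != len(b):
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of whitespace-separated query terms that occur in the text."""
    terms = [term for term in query.lower().split() if term]
    if not terms:
        return 0.0
    haystack = text.lower()
    return sum(1 for term in terms if term in haystack) / len(terms)


def rank_candidates(
    query: str,
    query_embedding: Sequence[float],
    candidates: List[Dict[str, Any]],
    limit: int = 5,
    semantic_weight: float = 0.7,  # assumed weight, not taken from the commit
) -> List[Dict[str, Any]]:
    ranked = []
    for row in candidates:
        embedding: Optional[Sequence[float]] = row.get("embedding")
        semantic = cosine(query_embedding, embedding or [])
        keyword = keyword_overlap(query, f"{row.get('title', '')} {row.get('text', '')}")
        score = semantic_weight * semantic + (1 - semantic_weight) * keyword
        ranked.append(
            {**row, "semantic_score": semantic, "keyword_score": keyword, "score": score}
        )
    ranked.sort(key=lambda item: item["score"], reverse=True)
    return ranked[:limit]


candidates = [
    {"kind": "fact", "title": "budget", "text": "Q3 budget approved", "embedding": [1.0, 0.0]},
    {"kind": "episode", "title": "hiring", "text": "hiring plan", "embedding": [0.0, 1.0]},
    {"kind": "entity", "title": "budget review", "text": "", "embedding": None},
]
results = rank_candidates("budget", [1.0, 0.0], candidates, limit=2)
```

Whitespace splitting is a simplification: Chinese queries contain no spaces, so a real keyword signal would likely need a tokenizer or character n-grams. Rows without an embedding still rank through the keyword term alone.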


@ -1,38 +1,56 @@
import hashlib
import logging
import os
from typing import Optional
from typing import Callable, Optional
from config import config
from extractor import MeetingExtraction, extract_meeting_info
from graph_store import graph_store
from meeting_state import MeetingStateStore
from raw_store import raw_meeting_store
from vector_store import meeting_vector_store
from meeting_memory.config import config
from meeting_memory.extractor import MeetingExtraction, extract_meeting_info
from meeting_memory.graph_store import graph_store
from meeting_memory.meeting_state import MeetingStateStore
from meeting_memory.raw_store import raw_meeting_store
logger = logging.getLogger(__name__)
state_store = MeetingStateStore(config.state_path)
ProgressCallback = Callable[[int, int, str], None]
class MeetingProcessor:
def process_meeting_file(self, filepath: str, force: bool = False) -> Optional[str]:
with open(filepath, "r", encoding="utf-8") as f:
text = f.read()
with open(filepath, "r", encoding="utf-8") as file_obj:
text = file_obj.read()
return self.process_meeting_text(text, force=force)
def process_meeting_text(self, text: str, force: bool = False) -> Optional[str]:
def process_meeting_text(
self,
text: str,
force: bool = False,
interactive: bool = True,
progress_callback: Optional[ProgressCallback] = None,
) -> Optional[str]:
def report(step: int, message: str) -> None:
if progress_callback:
progress_callback(step, 7, message)
print(f"[{step}/7] {message}")
report(1, "计算内容哈希")
content_hash = self._compute_content_hash(text)
if not force and state_store.has_content_hash(content_hash):
print("\n检测到重复内容,已跳过。")
logger.info("Duplicate content hash skipped: %s", content_hash[:12])
return None
if not force:
similar = meeting_vector_store.find_similar_text(text, threshold=0.92)
report(2, "Neo4j 语义相似去重检索")
similar = graph_store.find_similar_episode(text, threshold=0.92)
if similar:
meta = similar["metadata"]
if not interactive:
logger.info(
"Skipped similar meeting in non-interactive mode: %s",
meta.get("title", ""),
)
return None
print(
f"\n发现相似会议:{meta.get('title', '')} ({meta.get('date', '')}) "
f"相似度 {similar['score']:.2%}"
@ -46,7 +64,10 @@ class MeetingProcessor:
force = True
break
print("请输入 s 或 o。")
else:
report(2, "跳过语义去重,按覆盖模式继续")
report(3, "调用大模型抽取结构化信息")
meeting_data = self._extract(text)
if not meeting_data:
logger.error("Failed to extract meeting information")
@ -54,21 +75,24 @@ class MeetingProcessor:
data_dict = meeting_data.model_dump()
data_dict["_content_hash"] = content_hash
data_dict["_graph_meeting_id"] = meeting_vector_store._meeting_id(data_dict)
data_dict["_graph_meeting_id"] = graph_store.meeting_id(data_dict)
should_skip = self._handle_duplicate(data_dict, force)
report(4, "检查标题和日期重复")
should_skip = self._handle_duplicate(data_dict, force=force, interactive=interactive)
if should_skip:
return None
meeting_title = data_dict.get("title", "")
meeting_date = data_dict.get("date", "")
raw_path = raw_meeting_store.save(text, title=meeting_title, date=meeting_date)
report(5, "归档原始会议文本")
raw_path = raw_meeting_store.save(text, title=meeting_title, date=meeting_date)
data_dict["_original_text"] = text
data_dict["_original_text_path"] = raw_path
meeting_filename = f"{meeting_vector_store._meeting_id(data_dict)}.md"
meeting_filename = f"{graph_store.meeting_id(data_dict)}.md"
report(6, "合并行动项和指标状态")
data_dict["action_items"] = state_store.merge_action_items(
data_dict.get("action_items", []),
meeting_title,
@ -84,16 +108,17 @@ class MeetingProcessor:
state_store.add_content_hash(content_hash, meeting_title, meeting_date, meeting_filename)
state_store.save()
meeting_vector_store.add_meeting(data_dict)
report(7, "写入 Neo4j 图谱和检索数据")
graph_store.upsert_meeting_subgraph(data_dict)
logger.info("Meeting processed: %s", meeting_title)
return raw_path
def _handle_duplicate(self, data_dict: dict, force: bool) -> bool:
def _handle_duplicate(self, data_dict: dict, force: bool, interactive: bool = True) -> bool:
title = data_dict.get("title", "")
date = data_dict.get("date", "")
existing = meeting_vector_store.find_meeting(title, date)
existing = graph_store.get_meeting(title, date)
if not existing:
return False
@ -103,6 +128,10 @@ class MeetingProcessor:
self._remove_old(data_dict, existing)
return False
if not interactive:
logger.info("Skipped duplicate meeting in non-interactive mode: %s", title)
return True
print(f"\n发现重复会议:{title} ({date})")
while True:
choice = input("选择 [s]跳过 / [o]覆盖(默认 s): ").strip().lower() or "s"
@ -114,9 +143,8 @@ class MeetingProcessor:
return False
print("请输入 s 或 o。")
def _remove_old(self, data_dict: dict, existing: Optional[dict] = None):
meeting_id = meeting_vector_store._meeting_id(data_dict)
meeting_vector_store.remove_meeting(meeting_id)
def _remove_old(self, data_dict: dict, existing: Optional[dict] = None) -> None:
meeting_id = graph_store.meeting_id(data_dict)
graph_store.remove_meeting_subgraph(meeting_id)
new_hash = data_dict.get("_content_hash", "")
@ -136,34 +164,16 @@ class MeetingProcessor:
def _extract(self, text: str) -> Optional[MeetingExtraction]:
try:
return extract_meeting_info(text)
return extract_meeting_info(text, stream=True)
except Exception as exc:
logger.error("LLM extraction failed: %s", exc)
return None
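Step 1 of the pipeline relies on `_compute_content_hash`, whose body is outside this diff. A minimal sketch of an exact-duplicate fingerprint, assuming SHA-256 over whitespace-normalized text; the normalization rule and the name `compute_content_hash` are assumptions and may differ from the project's own implementation.

```python
import hashlib


def compute_content_hash(text: str) -> str:
    """Hash the text after dropping blank lines and surrounding whitespace."""
    lines = (line.strip() for line in text.strip().splitlines())
    normalized = "\n".join(line for line in lines if line)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


original = compute_content_hash("Weekly sync\n\n  Budget approved  \n")
reformatted = compute_content_hash("Weekly sync\nBudget approved")
different = compute_content_hash("Weekly sync\nBudget rejected")
```

An exact hash only catches re-imports of the same text, which is why the pipeline follows it with the semantic check in step 2.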
def query(self, question: str, top_k: int = 3) -> str:
vector_context = meeting_vector_store.query_as_context(question, top_k=top_k)
graph_results = graph_store.search_facts(question, limit=top_k)
parts = []
if vector_context:
parts.append("=== Vector Context ===\n" + vector_context)
if graph_results:
graph_lines = []
for idx, row in enumerate(graph_results, start=1):
title = row.get("title", row.get("kind", "graph"))
text = row.get("text", "")
date = row.get("date", "")
suffix = f" ({date})" if date else ""
graph_lines.append(f"[{idx}] {title}{suffix}\n{text}")
parts.append("=== Graph Facts ===\n" + "\n\n".join(graph_lines))
return "\n\n".join(parts)
return graph_store.format_search_context(question, top_k=top_k)
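The `ProgressCallback` type introduced in this file lets a caller such as the web demo observe pipeline steps without the processor knowing who is listening. A small sketch of the pattern, assuming a hypothetical `run_pipeline` driver that derives the total from its step list instead of hard-coding it:

```python
from typing import Callable, List, Optional, Tuple

# Same shape as the alias in the diff: (step, total, message) -> None
ProgressCallback = Callable[[int, int, str], None]


def run_pipeline(
    steps: List[str], progress_callback: Optional[ProgressCallback] = None
) -> None:
    total = len(steps)
    for index, message in enumerate(steps, start=1):
        if progress_callback:
            progress_callback(index, total, message)
        # The real work for each step would run here.


events: List[Tuple[int, int, str]] = []
run_pipeline(
    ["hash", "dedupe", "extract"],
    lambda step, total, message: events.append((step, total, message)),
)
```

Deriving the total from the list keeps the numbers consistent if a step is added later; the commit itself passes the literal `7`.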
def stats(self) -> dict:
return {
"vector_index": meeting_vector_store.get_stats(),
"graph": graph_store.get_stats(),
"state": state_store.get_stats(),
"raw_dir": config.storage.raw_dir,


@ -2,8 +2,8 @@ import hashlib
import json
import logging
import os
from datetime import datetime
from typing import Dict, List, Optional
import re
from typing import List, Optional
logger = logging.getLogger(__name__)
@ -28,8 +28,8 @@ class MeetingStateStore:
try:
with open(self.state_path, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.warning(f"加载状态文件失败,将创建新状态: {e}")
except Exception as exc:
logger.warning("Failed to load state file, creating a new one: %s", exc)
return {
"action_items": {},
"metrics": {},
@ -55,11 +55,10 @@ class MeetingStateStore:
return series_name
def _detect_series(self, title: str) -> str:
import re
cleaned = re.sub(r"\d{4}\w+期)", "", title)
cleaned = re.sub(r"\(\d{4}\w+期\)", "", cleaned)
cleaned = re.sub(r"\d{4}\w+期", "", cleaned)
cleaned = re.sub(r"\d{4}年第\w+次", "", cleaned)
cleaned = re.sub(r"\uFF08\d{4}\u7B2C\w+\u671F\uFF09", "", title)
cleaned = re.sub(r"\(\d{4}\u7B2C\w+\u671F\)", "", cleaned)
cleaned = re.sub(r"\d{4}\u7B2C\w+\u671F", "", cleaned)
cleaned = re.sub(r"\d{4}\u5E74\u7B2C\w+\u6B21", "", cleaned)
cleaned = cleaned.strip("-_ ")
return cleaned or title
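The four substitutions in `_detect_series` strip issue markers such as "2024第3期" from a title so that recurring meetings map to one series name. The sketch below restates the new patterns as a standalone function; `detect_series` and `_SERIES_SUFFIXES` are names chosen here for illustration, and the sample titles are made up.

```python
import re

_SERIES_SUFFIXES = [
    r"\uFF08\d{4}\u7B2C\w+\u671F\uFF09",  # full-width parentheses: (2024第3期)
    r"\(\d{4}\u7B2C\w+\u671F\)",  # ASCII parentheses: (2024第3期)
    r"\d{4}\u7B2C\w+\u671F",  # bare marker: 2024第3期
    r"\d{4}\u5E74\u7B2C\w+\u6B21",  # year form: 2024年第3次
]


def detect_series(title: str) -> str:
    cleaned = title
    for pattern in _SERIES_SUFFIXES:
        cleaned = re.sub(pattern, "", cleaned)
    cleaned = cleaned.strip("-_ ")
    # Fall back to the full title when stripping leaves nothing.
    return cleaned or title


full_width = detect_series("产品周会(2024第3期)")
ascii_parens = detect_series("产品周会 (2024第12期)")
year_form = detect_series("季度复盘-2024年第二次")
marker_only = detect_series("2024第1期")
```

Writing the CJK characters as `\u` escapes keeps the patterns independent of the source file's encoding, which appears to be the motivation for the change in this hunk.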
@ -122,26 +121,25 @@ class MeetingStateStore:
) -> List[dict]:
merged = []
for m in new_metrics:
metric_name = m.get("metric_name", "")
owner = m.get("owner", "")
for metric in new_metrics:
metric_name = metric.get("metric_name", "")
owner = metric.get("owner", "")
mid = _metric_id(metric_name, owner)
history_entry = {
"date": meeting_date,
"meeting": meeting_filename,
"value": m.get("value", ""),
"target": m.get("target", ""),
"trend": m.get("trend", ""),
"value": metric.get("value", ""),
"target": metric.get("target", ""),
"trend": metric.get("trend", ""),
}
existing = self._state["metrics"].get(mid)
if existing:
existing["history"].append(history_entry)
existing["latest"] = history_entry
item = m
item["_metric_id"] = mid
item["_history"] = list(existing["history"])
metric["_metric_id"] = mid
metric["_history"] = list(existing["history"])
else:
self._state["metrics"][mid] = {
"metric_id": mid,
@ -150,10 +148,10 @@ class MeetingStateStore:
"history": [history_entry],
"latest": history_entry,
}
m["_metric_id"] = mid
m["_history"] = [history_entry]
metric["_metric_id"] = mid
metric["_history"] = [history_entry]
merged.append(m)
merged.append(metric)
return merged
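The loop in `merge_metrics` keeps one record per metric and appends a history entry per meeting. A compact sketch of the same idea over a plain dict; `metric_id` is a hypothetical stand-in for the project's `_metric_id` (whose body is not shown), and this variant returns copies instead of mutating the input items.

```python
import hashlib
from typing import Any, Dict, List


def metric_id(metric_name: str, owner: str) -> str:
    # Hypothetical identity rule: a short hash of name and owner.
    return hashlib.sha256(f"{metric_name}|{owner}".encode("utf-8")).hexdigest()[:12]


def merge_metrics(
    state: Dict[str, Dict[str, Any]],
    new_metrics: List[Dict[str, Any]],
    meeting_date: str,
    meeting_filename: str,
) -> List[Dict[str, Any]]:
    merged = []
    for metric in new_metrics:
        name = metric.get("metric_name", "")
        owner = metric.get("owner", "")
        mid = metric_id(name, owner)
        entry = {
            "date": meeting_date,
            "meeting": meeting_filename,
            "value": metric.get("value", ""),
            "target": metric.get("target", ""),
            "trend": metric.get("trend", ""),
        }
        # Create the record on first sight, then always append and update "latest".
        record = state.setdefault(
            mid, {"metric_id": mid, "metric_name": name, "owner": owner, "history": []}
        )
        record["history"].append(entry)
        record["latest"] = entry
        merged.append({**metric, "_metric_id": mid, "_history": list(record["history"])})
    return merged


state: Dict[str, Dict[str, Any]] = {}
first = merge_metrics(
    state, [{"metric_name": "DAU", "owner": "Ana", "value": "100"}], "2026-06-01", "m1.md"
)
second = merge_metrics(
    state, [{"metric_name": "DAU", "owner": "Ana", "value": "120"}], "2026-06-08", "m2.md"
)
```

Using `setdefault` folds the two branches of the original loop into one path, at the cost of building the default record on every call.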


@ -2,7 +2,7 @@ import logging
import os
from datetime import datetime
from config import config
from meeting_memory.config import config
logger = logging.getLogger(__name__)
@ -23,9 +23,12 @@ class RawMeetingStore:
os.makedirs(self.raw_dir, exist_ok=True)
def save(self, text: str, title: str = "", date: str = "") -> str:
os.makedirs(self.raw_dir, exist_ok=True)
date_str = date or datetime.now().strftime("%Y-%m-%d")
safe_date = _sanitize_filename(date_str)[:40]
safe_title = _sanitize_filename(title)[:60]
filename = f"{date_str}_{safe_title}.md"
filename = f"{safe_date}_{safe_title}.md"
filepath = os.path.join(self.raw_dir, filename)
content = "\n".join(


@ -0,0 +1,3 @@
from meeting_memory.services.embedding_service import EmbeddingService, embedding_service
__all__ = ["EmbeddingService", "embedding_service"]


@ -0,0 +1,29 @@
from typing import List, Optional
from openai import OpenAI as OpenAIClient
from meeting_memory.config import config
class EmbeddingService:
def __init__(
self,
model: Optional[str] = None,
api_key: Optional[str] = None,
api_base: Optional[str] = None,
):
self._client = OpenAIClient(
api_key=api_key or config.embedding.api_key or "not-needed",
base_url=api_base or config.embedding.api_base or None,
)
self._model = model or config.embedding.model
def embed_text(self, text: str) -> List[float]:
response = self._client.embeddings.create(
model=self._model,
input=text,
)
return response.data[0].embedding
embedding_service = EmbeddingService()
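Because `EmbeddingService` exposes a single `embed_text(text)` method, code that depends on it can be exercised offline with a stand-in of the same shape. The class below is a hypothetical test double, not part of the commit; its vectors are deterministic but carry no semantic meaning.

```python
import hashlib
from typing import List


class FakeEmbeddingService:
    """Offline stand-in that mirrors the embed_text(text) -> List[float] interface."""

    def __init__(self, dimensions: int = 8):
        self._dimensions = dimensions

    def embed_text(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map digest bytes into [0, 1]; the same text always yields the same vector.
        return [digest[i % len(digest)] / 255.0 for i in range(self._dimensions)]


fake = FakeEmbeddingService(dimensions=8)
vector_a = fake.embed_text("周会纪要")
vector_b = fake.embed_text("周会纪要")
vector_c = fake.embed_text("季度复盘")
```

This only helps if the consumer accepts the service as a parameter; code that imports the module-level `embedding_service` singleton directly would need patching in tests.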


@ -0,0 +1,3 @@
from meeting_memory.web_demo.server import run_demo_server
__all__ = ["run_demo_server"]


@ -0,0 +1,400 @@
import json
import logging
import mimetypes
import sys
import threading
import time
import uuid
from http import HTTPStatus
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path
from urllib.parse import parse_qs, urlparse
if __package__ in (None, ""):
sys.path.insert(0, str(Path(__file__).resolve().parents[2]))
from meeting_memory.config import config
from meeting_memory.graph_store import graph_store
from meeting_memory.meeting_processor import meeting_processor, state_store
logger = logging.getLogger(__name__)
STATIC_DIR = Path(__file__).resolve().parent / "static"
RAW_DIR = Path(config.storage.raw_dir)
IMPORT_JOBS = {}
IMPORT_JOBS_LOCK = threading.Lock()
class GraphDemoHandler(SimpleHTTPRequestHandler):
def __init__(self, *args, **kwargs):
super().__init__(*args, directory=str(STATIC_DIR), **kwargs)
def do_GET(self):
parsed = urlparse(self.path)
if parsed.path == "/api/dashboard":
self._handle_dashboard()
return
if parsed.path == "/api/graph":
self._handle_graph(parsed.query)
return
if parsed.path == "/api/graph-types":
self._handle_graph_types()
return
if parsed.path == "/api/graph-kinds":
self._handle_graph_kinds()
return
if parsed.path == "/api/search":
self._handle_search(parsed.query)
return
if parsed.path == "/api/meetings":
self._handle_meetings(parsed.query)
return
if parsed.path == "/api/meeting":
self._handle_meeting(parsed.query)
return
if parsed.path == "/api/import-status":
self._handle_import_status(parsed.query)
return
if parsed.path in ("/", "/index.html"):
self.path = "/index.html"
elif parsed.path == "/graph":
self.path = "/graph.html"
super().do_GET()
def do_POST(self):
parsed = urlparse(self.path)
if parsed.path == "/api/import":
self._handle_import()
return
self.send_error(HTTPStatus.NOT_FOUND, "Unsupported endpoint")
def log_message(self, format, *args):
logger.info("%s - %s", self.address_string(), format % args)
def end_headers(self):
self.send_header("Cache-Control", "no-store")
super().end_headers()
def guess_type(self, path):
guessed = super().guess_type(path)
if guessed == "application/octet-stream":
return mimetypes.guess_type(path)[0] or guessed
return guessed
def _handle_graph(self, raw_query: str):
params = parse_qs(raw_query)
query = (params.get("q") or [""])[0].strip()
limit_nodes = self._safe_int((params.get("limit_nodes") or ["80"])[0], default=80)
limit_edges = self._safe_int((params.get("limit_edges") or ["160"])[0], default=160)
entity_types = params.get("entity_types")
kinds = params.get("kinds")
payload = graph_store.get_graph_snapshot(
query=query,
entity_types=entity_types if entity_types else None,
kinds=kinds if kinds else None,
limit_nodes=limit_nodes,
limit_edges=limit_edges,
)
self._write_json(payload)
def _handle_graph_types(self):
types = graph_store.get_entity_types()
self._write_json({"types": types})
def _handle_graph_kinds(self):
kinds = graph_store.get_graph_kinds()
self._write_json({"kinds": kinds})
def _handle_search(self, raw_query: str):
params = parse_qs(raw_query)
query = (params.get("q") or [""])[0].strip()
limit = self._safe_int((params.get("limit") or ["8"])[0], default=8)
payload = {
"query": query,
"results": graph_store.hybrid_search(query, limit=limit) if query else [],
}
self._write_json(payload)
def _handle_dashboard(self):
meetings = _load_recent_meetings(limit=6)
action_items = _state_items("action_items", limit=6)
metrics = _state_items("metrics", limit=6)
series = _load_series(limit=6)
graph_stats = graph_store.get_stats()
payload = {
"graph": graph_stats,
"state": state_store.get_stats(),
"meetings": meetings,
"action_items": action_items,
"metrics": metrics,
"series": series,
"highlights": _build_highlights(meetings, action_items, metrics, graph_stats),
}
self._write_json(payload)
def _handle_meetings(self, raw_query: str):
params = parse_qs(raw_query)
limit = self._safe_int((params.get("limit") or ["24"])[0], default=24)
self._write_json({"meetings": _load_recent_meetings(limit=limit)})
def _handle_meeting(self, raw_query: str):
params = parse_qs(raw_query)
filename = (params.get("filename") or [""])[0].strip()
if not filename:
self._write_json({"error": "filename is required"}, status=HTTPStatus.BAD_REQUEST)
return
file_path = RAW_DIR / filename
if not file_path.exists() or file_path.parent != RAW_DIR:
self._write_json({"error": "meeting not found"}, status=HTTPStatus.NOT_FOUND)
return
self._write_json(_serialize_meeting(file_path, include_content=True))
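`_handle_meeting` guards against path traversal by comparing `file_path.parent` with `RAW_DIR`, without resolving either path first. A sketch of a slightly stricter variant that resolves both sides, assuming a hypothetical helper named `resolve_inside`:

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_inside(base_dir: Path, filename: str) -> Optional[Path]:
    """Return the resolved file path only if it is a file directly inside base_dir."""
    base = base_dir.resolve()
    candidate = (base / filename).resolve()
    # After resolve(), ".." segments and symlinks are gone, so the parent check is reliable.
    if candidate.parent != base or not candidate.is_file():
        return None
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "2026-06-11_weekly.md").write_text("# Weekly", encoding="utf-8")
    inside = resolve_inside(root, "2026-06-11_weekly.md")
    escaped = resolve_inside(root, "../2026-06-11_weekly.md")
    missing = resolve_inside(root, "missing.md")
```

Resolving first also covers the case where the configured raw directory is a relative path or contains a symlink, which a purely lexical comparison would not normalize.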
def _handle_import(self):
payload = self._read_json_body()
if payload is None:
self._write_json({"ok": False, "error": "invalid json body"}, status=HTTPStatus.BAD_REQUEST)
return
text = str(payload.get("text") or "").strip()
force = bool(payload.get("force", False))
if not text:
self._write_json({"ok": False, "error": "text is required"}, status=HTTPStatus.BAD_REQUEST)
return
job_id = str(uuid.uuid4())
with IMPORT_JOBS_LOCK:
IMPORT_JOBS[job_id] = {
"job_id": job_id,
"status": "queued",
"message": "任务已创建,等待处理",
"archive_path": "",
"created_at": time.time(),
"updated_at": time.time(),
"steps": [],
}
thread = threading.Thread(
target=_run_import_job,
args=(job_id, text, force),
daemon=True,
)
thread.start()
self._write_json({"ok": True, "job_id": job_id, "status": "queued"})
def _handle_import_status(self, raw_query: str):
params = parse_qs(raw_query)
job_id = (params.get("job_id") or [""])[0].strip()
if not job_id:
self._write_json({"error": "job_id is required"}, status=HTTPStatus.BAD_REQUEST)
return
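with IMPORT_JOBS_LOCK:
job = IMPORT_JOBS.get(job_id)
# Copy under the lock so the worker thread cannot mutate the dict while it is serialized.
payload = {**job, "steps": list(job["steps"])} if job else None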
if not payload:
self._write_json({"error": "job not found"}, status=HTTPStatus.NOT_FOUND)
return
self._write_json(payload)
def _read_json_body(self):
length = self._safe_int(self.headers.get("Content-Length"), default=0)
if length <= 0:
return None
try:
body = self.rfile.read(length)
return json.loads(body.decode("utf-8"))
except Exception:
return None
def _write_json(self, payload, status: HTTPStatus = HTTPStatus.OK):
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
self.send_response(status)
self.send_header("Content-Type", "application/json; charset=utf-8")
self.send_header("Content-Length", str(len(body)))
self.end_headers()
self.wfile.write(body)
@staticmethod
def _safe_int(raw_value, default: int) -> int:
try:
value = int(raw_value)
except (TypeError, ValueError):
return default
return max(0, value)
def run_demo_server(host: str = "127.0.0.1", port: int = 8765) -> None:
server = ThreadingHTTPServer((host, port), GraphDemoHandler)
logger.info("Graph demo server started at http://%s:%s", host, port)
print(f"Graph demo server started: http://{host}:{port}")
print("Press Ctrl+C to stop.")
try:
server.serve_forever()
except KeyboardInterrupt:
print("\nServer stopped.")
finally:
server.server_close()
def _run_import_job(job_id: str, text: str, force: bool) -> None:
def update(status: str | None = None, message: str | None = None, *, append_step: bool = False):
with IMPORT_JOBS_LOCK:
job = IMPORT_JOBS.get(job_id)
if not job:
return
if status:
job["status"] = status
if message:
job["message"] = message
if append_step:
job["steps"].append(message)
job["updated_at"] = time.time()
def progress(step: int, total: int, message: str):
update(
"running",
f"步骤 {step}/{total}:{message}",
append_step=True,
)
update("running", "开始处理会议文本", append_step=True)
try:
archive_path = meeting_processor.process_meeting_text(
text,
force=force,
interactive=False,
progress_callback=progress,
)
if not archive_path:
update("error", "处理被跳过:可能是重复内容,或结构化抽取失败", append_step=True)
return
with IMPORT_JOBS_LOCK:
job = IMPORT_JOBS.get(job_id)
if job:
job["status"] = "done"
job["message"] = "导入完成"
job["archive_path"] = archive_path
job["updated_at"] = time.time()
job["dashboard"] = {
"graph": graph_store.get_stats(),
"state": state_store.get_stats(),
"meetings": _load_recent_meetings(limit=6),
}
job["steps"].append("导入完成")
except Exception as exc:
logger.exception("Meeting import failed")
update("error", f"处理失败:{exc}", append_step=True)
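The import flow above is a small in-memory job registry: the request thread creates a record, a daemon thread updates it, and the status endpoint reads it. A self-contained sketch of that pattern follows; `start_job`, `get_job`, and `fake_import` are hypothetical names, and unlike the handler in this commit the reader returns a copy taken under the lock.

```python
import threading
import time
import uuid
from typing import Any, Callable, Dict, Optional, Tuple

JOBS: Dict[str, Dict[str, Any]] = {}
JOBS_LOCK = threading.Lock()


def start_job(work: Callable[[Callable[[str], None]], str]) -> Tuple[str, threading.Thread]:
    job_id = str(uuid.uuid4())
    with JOBS_LOCK:
        JOBS[job_id] = {"status": "queued", "steps": [], "result": "", "updated_at": time.time()}

    def record(message: str) -> None:
        with JOBS_LOCK:
            JOBS[job_id]["steps"].append(message)
            JOBS[job_id]["updated_at"] = time.time()

    def runner() -> None:
        with JOBS_LOCK:
            JOBS[job_id]["status"] = "running"
        try:
            final = {"status": "done", "result": work(record)}
        except Exception as exc:  # surface any failure through the job record
            final = {"status": "error", "result": str(exc)}
        with JOBS_LOCK:
            JOBS[job_id].update(final, updated_at=time.time())

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return job_id, thread


def get_job(job_id: str) -> Optional[Dict[str, Any]]:
    with JOBS_LOCK:
        job = JOBS.get(job_id)
        # Return a snapshot so callers never serialize a dict that is still being mutated.
        return {**job, "steps": list(job["steps"])} if job else None


def fake_import(report: Callable[[str], None]) -> str:
    report("step 1/2 hashing")
    report("step 2/2 writing")
    return "raw/2026-06-11_weekly.md"


job_id, worker = start_job(fake_import)
worker.join(timeout=5)
snapshot = get_job(job_id)
```

The sketch never evicts finished jobs, so a long-running server would probably want a size cap or an age-based cleanup on top of it.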
def _load_recent_meetings(limit: int = 6):
if not RAW_DIR.exists():
return []
files = sorted(
RAW_DIR.glob("*.md"),
key=lambda path: path.stat().st_mtime,
reverse=True,
)
return [_serialize_meeting(path) for path in files[:limit]]
def _serialize_meeting(path: Path, include_content: bool = False):
raw_text = path.read_text(encoding="utf-8")
title = ""
date = ""
lines = raw_text.splitlines()
for line in lines[:12]:
if line.startswith('title: "'):
title = line[len('title: "') : -1]
elif line.startswith('date: "'):
date = line[len('date: "') : -1]
content_start = 0
for idx, line in enumerate(lines):
if line.startswith("# "):
content_start = idx + 2
if not title:
title = line[2:].strip()
break
body = "\n".join(lines[content_start:]).strip()
snippet = body[:180] + ("..." if len(body) > 180 else "")
payload = {
"filename": path.name,
"title": title or path.stem,
"date": date,
"snippet": snippet,
"updated_at": int(path.stat().st_mtime),
}
if include_content:
payload["content"] = body
return payload
def _state_items(key: str, limit: int = 6):
bucket = getattr(state_store, "_state", {}).get(key, {})
items = []
for item in bucket.values():
latest = item.get("latest", {})
items.append({**item, "latest": latest})
items.sort(key=lambda row: str(row.get("latest", {}).get("date", "")), reverse=True)
return items[:limit]
def _load_series(limit: int = 6):
series = getattr(state_store, "_state", {}).get("meeting_series", {})
rows = []
for name, payload in series.items():
rows.append(
{
"name": name,
"latest_date": payload.get("latest_date", ""),
"processed_titles": payload.get("processed_titles", []),
"meeting_count": len(payload.get("processed_titles", [])),
}
)
rows.sort(key=lambda row: row.get("latest_date", ""), reverse=True)
return rows[:limit]
def _build_highlights(meetings, action_items, metrics, graph_stats):
latest_meeting = meetings[0] if meetings else {}
top_action = action_items[0] if action_items else {}
top_metric = metrics[0] if metrics else {}
return [
{
"label": "最近归档",
"value": latest_meeting.get("title", "暂无会议"),
"meta": latest_meeting.get("date", ""),
},
{
"label": "待跟进事项",
"value": str(len(action_items)),
"meta": top_action.get("task", ""),
},
{
"label": "图谱节点",
"value": str(graph_stats.get("entities", 0)),
"meta": "Neo4j 实体总数",
},
{
"label": "关键指标",
"value": str(len(metrics)),
"meta": top_metric.get("metric_name", ""),
},
]
if __name__ == "__main__":
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
datefmt="%H:%M:%S",
)
run_demo_server()


@ -0,0 +1,399 @@
const dashboardUrl = "/api/dashboard";
let currentImportJobId = null;
let importPollTimer = null;
const highlightGrid = document.getElementById("highlightGrid");
const statsList = document.getElementById("statsList");
const meetingCards = document.getElementById("meetingCards");
const actionList = document.getElementById("actionList");
const metricList = document.getElementById("metricList");
const seriesList = document.getElementById("seriesList");
const searchForm = document.getElementById("searchForm");
const searchInput = document.getElementById("searchInput");
const searchResults = document.getElementById("searchResults");
const refreshDashboardBtn = document.getElementById("refreshDashboardBtn");
const importForm = document.getElementById("importForm");
const importFieldset = document.getElementById("importFieldset");
const importSubmitBtn = document.getElementById("importSubmitBtn");
const importFile = document.getElementById("importFile");
const importText = document.getElementById("importText");
const importForce = document.getElementById("importForce");
const importStatus = document.getElementById("importStatus");
const importProgress = document.getElementById("importProgress");
const meetingDialog = document.getElementById("meetingDialog");
const closeDialogBtn = document.getElementById("closeDialogBtn");
const dialogTitle = document.getElementById("dialogTitle");
const dialogMeta = document.getElementById("dialogMeta");
const dialogContent = document.getElementById("dialogContent");
function escapeHtml(value) {
return String(value ?? "")
.replaceAll("&", "&amp;")
.replaceAll("<", "&lt;")
.replaceAll(">", "&gt;")
.replaceAll('"', "&quot;")
.replaceAll("'", "&#39;");
}
function emptyMarkup(message) {
return `<div class="empty-state">${escapeHtml(message)}</div>`;
}
function setImportBusy(isBusy) {
importFieldset.disabled = isBusy;
importSubmitBtn.textContent = isBusy ? "处理中..." : "开始导入";
}
function setImportStatus(message, kind = "info") {
importStatus.textContent = message;
importStatus.dataset.kind = kind;
}
function renderProgress(steps = []) {
if (!steps.length) {
importProgress.innerHTML = emptyMarkup("导入开始后,这里会实时显示处理步骤。");
return;
}
importProgress.innerHTML = steps.map((step, index) => `
<div class="progress-item">
<span class="progress-index">${index + 1}</span>
<span>${escapeHtml(step)}</span>
</div>
`).join("");
}
function renderHighlights(items) {
if (!items?.length) {
highlightGrid.innerHTML = emptyMarkup("暂无概览数据");
return;
}
const colors = ["#4a90d9", "#34c759", "#ff9500", "#53c2da"];
highlightGrid.innerHTML = items.map((item, i) => `
<article class="highlight-card" style="--card-accent: ${colors[i % colors.length]}">
<div class="hc-bar"></div>
<p class="eyebrow">${escapeHtml(item.label)}</p>
<strong>${escapeHtml(item.value)}</strong>
<p>${escapeHtml(item.meta || "")}</p>
</article>
`).join("");
}
function renderStats(graph = {}, state = {}) {
const cards = [
{ label: "Neo4j", value: graph.enabled ? "在线" : "离线", icon: "⬡", color: graph.enabled ? "#34c759" : "#b3261e" },
{ label: "会议", value: graph.meetings ?? 0, icon: "📋", color: "#4a90d9" },
{ label: "实体", value: graph.entities ?? 0, icon: "◆", color: "#53c2da" },
{ label: "关系", value: graph.facts ?? 0, icon: "↗", color: "#ff9500" },
{ label: "行动项", value: state.action_items_tracked ?? 0, icon: "☐", color: "#7f8bff" },
{ label: "指标", value: state.metrics_tracked ?? 0, icon: "📊", color: "#af52de" },
];
statsList.innerHTML = cards.map((c) => `
<div class="mini-stat" style="--stat-color: ${c.color}">
<span class="ms-icon">${c.icon}</span>
<div class="ms-body">
<strong>${escapeHtml(c.value)}</strong>
<p>${escapeHtml(c.label)}</p>
</div>
</div>
`).join("");
}
function renderMeetings(items) {
if (!items?.length) {
meetingCards.innerHTML = emptyMarkup("还没有归档会议");
return;
}
meetingCards.innerHTML = items.map((item) => `
<article class="meeting-card" data-filename="${escapeHtml(item.filename)}">
<div class="mc-date">${escapeHtml(item.date || "??")}</div>
<div class="mc-body">
<h4>${escapeHtml(item.title)}</h4>
<p>${escapeHtml(item.snippet || "暂无摘要")}</p>
</div>
</article>
`).join("");
}
function renderActionItems(items) {
if (!items?.length) {
actionList.innerHTML = emptyMarkup("暂无行动项");
return;
}
const priorityColors = { "高": "#b3261e", "中": "#ff9500", "低": "#34c759" };
actionList.innerHTML = items.map((item) => {
const pri = item.latest?.priority || "普通";
const priColor = priorityColors[pri] || "#68709d";
return `
<article class="list-item">
<div class="li-priority" style="--pri-color: ${priColor}"></div>
<div class="li-body">
<strong>${escapeHtml(item.task || "未命名任务")}</strong>
<p>${escapeHtml(item.assignee || "未分配")} · ${escapeHtml(item.series || "未归类")}</p>
<div class="chip-row">
<span class="chip status-${escapeHtml((item.latest?.status || "unknown").toLowerCase())}">${escapeHtml(item.latest?.status || "未知")}</span>
<span class="chip">${escapeHtml(pri)}</span>
${item.latest?.deadline ? `<span class="chip">${escapeHtml(item.latest.deadline)}</span>` : ""}
</div>
</div>
</article>`;
}).join("");
}
function renderMetrics(items) {
if (!items?.length) {
metricList.innerHTML = emptyMarkup("暂无指标");
return;
}
metricList.innerHTML = items.map((item) => {
const val = parseFloat(item.latest?.value) || 0;
const tgt = parseFloat(item.latest?.target) || 100;
const pct = Math.max(0, Math.min(100, Math.round((val / tgt) * 100)));
return `
<article class="metric-card">
<div class="mc-head">
<strong>${escapeHtml(item.metric_name || "未命名指标")}</strong>
<span class="mc-value">${escapeHtml(item.latest?.value || "—")}</span>
</div>
<p>${escapeHtml(item.owner || "未指定负责人")}</p>
<div class="mc-bar-track">
<div class="mc-bar-fill" style="width: ${pct}%"></div>
</div>
<div class="chip-row">
${item.latest?.target ? `<span class="chip">目标 ${escapeHtml(item.latest.target)}</span>` : ""}
${item.latest?.trend ? `<span class="chip">${escapeHtml(item.latest.trend)}</span>` : ""}
</div>
</article>`;
}).join("");
}
function renderSeries(items) {
if (!items?.length) {
seriesList.innerHTML = emptyMarkup("暂无会议系列");
return;
}
seriesList.innerHTML = items.map((item) => `
<article class="series-card">
<div class="sc-count">${escapeHtml(item.meeting_count)}</div>
<div class="sc-body">
<strong>${escapeHtml(item.name)}</strong>
<p>最近:${escapeHtml(item.latest_date || "未知")}</p>
</div>
</article>
`).join("");
}
function updateDashboard(payload) {
renderHighlights(payload.highlights || []);
renderStats(payload.graph || {}, payload.state || {});
renderMeetings(payload.meetings || []);
renderActionItems(payload.action_items || []);
renderMetrics(payload.metrics || []);
renderSeries(payload.series || []);
}
async function loadDashboard() {
const response = await fetch(dashboardUrl);
const payload = await response.json();
updateDashboard(payload);
}
async function runSearch(query) {
if (!query.trim()) {
searchResults.innerHTML = emptyMarkup("输入问题后,这里会展示混合检索结果。");
return;
}
searchResults.innerHTML = emptyMarkup("正在检索...");
const response = await fetch(`/api/search?q=${encodeURIComponent(query)}&limit=8`);
const payload = await response.json();
const items = payload.results || [];
if (!items.length) {
searchResults.innerHTML = emptyMarkup("没有找到匹配结果");
return;
}
searchResults.innerHTML = items.map((item) => `
<article class="result-card">
<div class="rc-kind">${escapeHtml(item.kind || "item")}</div>
<strong>${escapeHtml(item.title || "结果")}</strong>
<p>${escapeHtml(item.text || "")}</p>
<div class="meta-row">
${item.meeting_title ? `<span class="chip">${escapeHtml(item.meeting_title)}</span>` : ""}
${item.date ? `<span class="chip">${escapeHtml(item.date)}</span>` : ""}
<span class="chip">score ${escapeHtml(item.score)}</span>
</div>
</article>
`).join("");
}
async function openMeeting(filename) {
const response = await fetch(`/api/meeting?filename=${encodeURIComponent(filename)}`);
const payload = await response.json();
dialogTitle.textContent = payload.title || "会议详情";
dialogMeta.textContent = payload.date || payload.filename || "";
dialogContent.textContent = payload.content || "没有可展示的原文";
meetingDialog.showModal();
}
async function readImportText() {
const directText = importText.value.trim();
if (directText) {
return directText;
}
const file = importFile.files?.[0];
if (!file) {
return "";
}
return await file.text();
}
async function pollImportStatus(jobId) {
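const response = await fetch(`/api/import-status?job_id=${encodeURIComponent(jobId)}`);
// A failed status request carries no usable `status` field; throw so polling stops instead of looping forever.
if (!response.ok) {
throw new Error(`HTTP ${response.status}`);
}
const payload = await response.json();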
renderProgress(payload.steps || []);
if (payload.status === "done") {
currentImportJobId = null;
clearTimeout(importPollTimer);
importPollTimer = null;
setImportBusy(false);
setImportStatus(`导入完成:${payload.archive_path || "已归档"}`, "success");
importText.value = "";
importFile.value = "";
await loadDashboard();
return;
}
if (payload.status === "error") {
currentImportJobId = null;
clearTimeout(importPollTimer);
importPollTimer = null;
setImportBusy(false);
setImportStatus(payload.message || "导入失败", "error");
return;
}
setImportStatus(payload.message || "正在处理中...", "info");
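importPollTimer = setTimeout(() => {
pollImportStatus(jobId).catch((error) => {
// Clear the job state as well; otherwise submitImport() keeps returning early and the form stays stuck.
currentImportJobId = null;
importPollTimer = null;
setImportBusy(false);
setImportStatus(`进度查询失败: ${error}`, "error");
});
}, 900);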
}
async function submitImport() {
if (currentImportJobId) {
return;
}
const text = (await readImportText()).trim();
if (!text) {
setImportStatus("请先选择文件或粘贴会议文本。", "error");
return;
}
setImportBusy(true);
renderProgress(["任务已提交,准备开始处理"]);
setImportStatus("正在创建导入任务...", "info");
const response = await fetch("/api/import", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
text,
force: importForce.checked,
}),
});
const payload = await response.json();
if (!response.ok || !payload.ok) {
setImportBusy(false);
setImportStatus(payload.error || "导入失败", "error");
return;
}
currentImportJobId = payload.job_id;
setImportStatus("任务已创建,正在处理中...", "info");
await pollImportStatus(currentImportJobId);
}
meetingCards?.addEventListener("click", (event) => {
const card = event.target.closest("[data-filename]");
if (!card) {
return;
}
openMeeting(card.dataset.filename).catch((error) => {
dialogTitle.textContent = "加载失败";
dialogMeta.textContent = "";
dialogContent.textContent = String(error);
meetingDialog.showModal();
});
});
searchForm?.addEventListener("submit", (event) => {
event.preventDefault();
runSearch(searchInput.value).catch((error) => {
searchResults.innerHTML = emptyMarkup(`检索失败: ${error}`);
});
});
importForm?.addEventListener("submit", (event) => {
event.preventDefault();
submitImport().catch((error) => {
currentImportJobId = null;
setImportBusy(false);
setImportStatus(`导入失败: ${error}`, "error");
});
});
refreshDashboardBtn?.addEventListener("click", () => {
loadDashboard().catch((error) => {
highlightGrid.innerHTML = emptyMarkup(`刷新失败: ${error}`);
});
});
closeDialogBtn?.addEventListener("click", () => meetingDialog.close());
// Unified panel tab switching
(function initUnifiedTabs() {
const tabs = document.querySelectorAll(".unified-tab");
const panes = {
import: document.getElementById("unifiedImport"),
search: document.getElementById("unifiedSearch"),
stats: document.getElementById("unifiedStats"),
};
tabs.forEach((tab) => {
tab.addEventListener("click", () => {
const target = tab.dataset.tab;
tabs.forEach((t) => t.classList.toggle("active", t === tab));
Object.values(panes).forEach((p) => p?.classList.add("hidden"));
const pane = panes[target];
if (pane) {
// Stats are already rendered by loadDashboard, so showing the pane is enough.
pane.classList.remove("hidden");
}
});
});
})();
renderProgress([]);
loadDashboard().catch((error) => {
highlightGrid.innerHTML = emptyMarkup(`加载失败: ${error}`);
});

@ -0,0 +1,69 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Neo4j Graph Explorer</title>
<link rel="stylesheet" href="/styles.css">
</head>
<body>
<div class="shell graph-shell">
<aside class="sidebar">
<div class="brand">
<div class="brand-mark">G</div>
<div>
<p class="brand-kicker">Graph Explorer</p>
<h1>Neo4j 图谱</h1>
</div>
</div>
<nav class="nav">
<a class="nav-link" href="/index.html">总览面板</a>
<a class="nav-link active" href="/graph.html">图谱浏览</a>
</nav>
<div class="legend">
<p class="eyebrow" style="margin-bottom:6px">图例</p>
<span><i class="legend-dot meeting"></i>会议</span>
<span><i class="legend-dot episode"></i>片段</span>
<span><i class="legend-dot entity"></i>实体</span>
<span><i class="legend-dot fact"></i>事实</span>
</div>
</aside>
<main class="main">
<div class="graph-toolbar panel">
<form class="graph-controls" id="graphSearchForm">
<input id="graphQueryInput" type="text" placeholder="搜索节点名称或关键词…" class="search-input">
<label class="field-label">节点 <input id="graphNodeLimit" type="number" min="10" max="200" step="10" value="60"></label>
<label class="field-label">关系 <input id="graphEdgeLimit" type="number" min="10" max="300" step="10" value="120"></label>
<button class="btn" type="submit">更新</button>
</form>
<div class="graph-toolbar-row">
<div class="graph-type-filter" id="graphTypeFilter"></div>
<div class="graph-actions">
<span class="graph-meta" id="graphMeta"></span>
</div>
</div>
</div>
<div class="graph-layout">
<div class="panel graph-stage-panel">
<div class="graph-stage" id="graphStage">
<svg id="graphSvg" viewBox="0 0 960 640" preserveAspectRatio="xMidYMid meet"></svg>
</div>
</div>
<div class="panel detail-panel">
<div class="detail-card" id="graphDetail">
<div class="empty-state">点击节点或关系查看详情</div>
</div>
<div class="related-search" id="relatedSearch"></div>
</div>
</div>
</main>
</div>
<script src="/graph.js"></script>
</body>
</html>

@ -0,0 +1,517 @@
const graphForm = document.getElementById("graphSearchForm");
const graphQueryInput = document.getElementById("graphQueryInput");
const graphNodeLimit = document.getElementById("graphNodeLimit");
const graphEdgeLimit = document.getElementById("graphEdgeLimit");
const graphSvg = document.getElementById("graphSvg");
const graphMeta = document.getElementById("graphMeta");
const graphDetail = document.getElementById("graphDetail");
const relatedSearch = document.getElementById("relatedSearch");
const graphTypeFilter = document.getElementById("graphTypeFilter");
let selectedEntityTypes = null;
let selectedKinds = null;
const TRUNCATE_LENGTH = 16;
function h(value) {
return String(value ?? "")
.replaceAll("&", "&amp;")
.replaceAll("<", "&lt;")
.replaceAll(">", "&gt;")
.replaceAll('"', "&quot;")
.replaceAll("'", "&#39;");
}
function truncate(text, maxLen) {
if (!text || text.length <= maxLen) return text || "";
return text.slice(0, maxLen - 1) + "…";
}
function empty(message) {
return `<div class="empty-state">${h(message)}</div>`;
}
async function loadGraphKinds() {
try {
const kindRes = await fetch("/api/graph-kinds");
const kindData = await kindRes.json();
const kinds = kindData.kinds || [];
if (kinds.length) {
selectedKinds = new Set(kinds.map((k) => k.kind));
}
let html = "";
if (kinds.length) {
html += `<span class="field-label" style="margin-right:4px">节点类型:</span>`;
html += kinds.map((k) =>
`<label><input type="checkbox" class="kind-cb" value="${h(k.kind)}" checked> ${h(k.kind)} (${k.count})</label>`
).join("");
html += `<label><input type="checkbox" class="kind-cb" id="kindSelectAll" checked> 全选</label>`;
}
graphTypeFilter.innerHTML = html;
graphTypeFilter.querySelectorAll(".kind-cb").forEach((cb) => {
cb.addEventListener("change", () => {
if (cb.id === "kindSelectAll") {
const checked = cb.checked;
graphTypeFilter.querySelectorAll(".kind-cb:not(#kindSelectAll)").forEach((c) => c.checked = checked);
selectedKinds = checked ? new Set(kinds.map((k) => k.kind)) : new Set();
} else {
if (!cb.checked) {
document.getElementById("kindSelectAll").checked = false;
selectedKinds.delete(cb.value);
} else {
selectedKinds.add(cb.value);
if (selectedKinds.size === kinds.length) {
document.getElementById("kindSelectAll").checked = true;
}
}
}
fetchGraph().catch((error) => renderInspector(empty(`图谱加载失败: ${error}`)));
});
});
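} catch (error) {
// The kind filter is optional: log the failure and leave the graph unfiltered.
console.warn("Failed to load graph kinds", error);
}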
}
function renderInspector(content) {
graphDetail.innerHTML = content;
}
async function loadRelated(query) {
if (!query) {
relatedSearch.innerHTML = "";
return;
}
const response = await fetch(`/api/search?q=${encodeURIComponent(query)}&limit=4`);
const payload = await response.json();
const results = payload.results || [];
if (!results.length) {
relatedSearch.innerHTML = empty("没有更多相关检索结果");
return;
}
relatedSearch.innerHTML = `
<div class="panel-head">
<div>
<p class="eyebrow">Related</p>
<h3>相关检索</h3>
</div>
</div>
${results.map((item) => `
<article class="result-card">
<strong>${h(item.title || item.kind || "结果")}</strong>
<p>${h(item.text || "")}</p>
</article>
`).join("")}
`;
}
function renderGraph(payload) {
const nodes = payload.nodes || [];
const edges = payload.edges || [];
const stage = document.getElementById("graphStage");
const rect = stage.getBoundingClientRect();
const svgW = Math.max(600, rect.width - 4);
const svgH = Math.max(400, rect.height - 4);
graphSvg.setAttribute("viewBox", `0 0 ${svgW} ${svgH}`);
graphSvg.setAttribute("width", svgW);
graphSvg.setAttribute("height", svgH);
graphMeta.textContent = `节点 ${nodes.length} · 关系 ${edges.length} · Neo4j ${payload.stats?.enabled ? "已启用" : "未启用"}`;
if (!nodes.length) {
graphSvg.innerHTML = "";
renderInspector(empty("当前没有可显示的图谱数据"));
relatedSearch.innerHTML = "";
return;
}
const nodeRadius = (node) => Math.max(14, Math.min(28, 12 + (node.degree || 0) * 1.4));
const dataNodes = nodes.map((node, i) => ({
...node,
x: svgW / 2 + Math.cos((Math.PI * 2 * i) / nodes.length) * Math.min(svgW, svgH) * 0.28,
y: svgH / 2 + Math.sin((Math.PI * 2 * i) / nodes.length) * Math.min(svgW, svgH) * 0.26,
vx: 0, vy: 0,
pinned: false,
radius: nodeRadius(node),
}));
const nodeById = new Map(dataNodes.map((n) => [n.id, n]));
const dataEdges = edges
.map((e) => ({ ...e, sourceNode: nodeById.get(e.source), targetNode: nodeById.get(e.target) }))
.filter((e) => e.sourceNode && e.targetNode);
const n = dataNodes.length;
const area = svgW * svgH;
const repulsionStr = Math.max(3000, area / Math.max(n, 1));
const linkDist = Math.max(80, Math.min(200, 240 - n));
const linkStr = Math.min(0.06, 4 / Math.max(n, 1));
let zoomTransform = { x: 0, y: 0, k: 1 };
let isDragging = false;
let dragOccurred = false;
let dragNode = null;
let dragOffX = 0, dragOffY = 0;
let isPanning = false;
let panStartX = 0, panStartY = 0;
let simAlpha = 1;
let simRunning = false;
let simId = null;
const mainGroup = document.createElementNS("http://www.w3.org/2000/svg", "g");
graphSvg.innerHTML = "";
graphSvg.appendChild(mainGroup);
const edgeEls = dataEdges.map((edge) => {
const g = document.createElementNS("http://www.w3.org/2000/svg", "g");
g.setAttribute("data-edge-id", edge.id);
g.setAttribute("class", "edge-wrap");
g.style.cursor = "pointer";
const line = document.createElementNS("http://www.w3.org/2000/svg", "line");
line.setAttribute("class", "graph-edge");
g.appendChild(line);
if (edge.predicate) {
const text = document.createElementNS("http://www.w3.org/2000/svg", "text");
text.setAttribute("text-anchor", "middle");
text.setAttribute("font-size", "11");
text.setAttribute("fill", "#7d86b4");
text.setAttribute("data-type", "edge-label");
text.textContent = truncate(edge.predicate, 20);
g.appendChild(text);
}
mainGroup.appendChild(g);
return g;
});
const nodeEls = dataNodes.map((node) => {
const r = node.radius;
const g = document.createElementNS("http://www.w3.org/2000/svg", "g");
const kindClass = `graph-node--${(node.kind || 'entity').toLowerCase()}`;
g.setAttribute("class", `graph-node ${kindClass}`);
g.setAttribute("data-node-id", node.id);
g.style.cursor = "grab";
const circle = document.createElementNS("http://www.w3.org/2000/svg", "circle");
circle.setAttribute("r", r);
g.appendChild(circle);
const text = document.createElementNS("http://www.w3.org/2000/svg", "text");
text.setAttribute("y", r + 16);
text.setAttribute("text-anchor", "middle");
text.setAttribute("font-size", "11");
text.setAttribute("fill", "#22264d");
text.setAttribute("data-type", "node-label");
text.textContent = truncate(node.label, TRUNCATE_LENGTH);
g.appendChild(text);
mainGroup.appendChild(g);
return g;
});
function syncDom() {
for (let i = 0; i < dataEdges.length; i++) {
const edge = dataEdges[i], el = edgeEls[i];
if (!el) continue;
const line = el.querySelector("line");
if (line) {
line.setAttribute("x1", edge.sourceNode.x.toFixed(1));
line.setAttribute("y1", edge.sourceNode.y.toFixed(1));
line.setAttribute("x2", edge.targetNode.x.toFixed(1));
line.setAttribute("y2", edge.targetNode.y.toFixed(1));
}
const label = el.querySelector("text[data-type='edge-label']");
if (label) {
const mx = (edge.sourceNode.x + edge.targetNode.x) / 2;
const my = (edge.sourceNode.y + edge.targetNode.y) / 2;
const angle = Math.atan2(edge.targetNode.y - edge.sourceNode.y, edge.targetNode.x - edge.sourceNode.x) * (180 / Math.PI);
label.setAttribute("x", mx.toFixed(1));
label.setAttribute("y", (my + (Math.abs(angle) < 30 || Math.abs(angle) > 150 ? -12 : 4)).toFixed(1));
}
}
for (let i = 0; i < dataNodes.length; i++) {
const node = dataNodes[i], el = nodeEls[i];
if (el) el.setAttribute("transform", `translate(${node.x.toFixed(1)} ${node.y.toFixed(1)})`);
}
}
function tick() {
const alpha = simAlpha;
if (alpha < 0.001) {
simRunning = false;
simId = null;
syncDom();
return;
}
for (let i = 0; i < dataNodes.length; i++) {
for (let j = i + 1; j < dataNodes.length; j++) {
const a = dataNodes[i], b = dataNodes[j];
let dx = b.x - a.x, dy = b.y - a.y;
const dist = Math.sqrt(dx * dx + dy * dy) || 1;
const minDist = (a.radius + b.radius) * 1.6;
if (dist < minDist) {
const push = (minDist - dist) / dist * 0.5;
a.vx -= dx * push; a.vy -= dy * push;
b.vx += dx * push; b.vy += dy * push;
}
const force = (repulsionStr * alpha) / (dist * dist + 1);
const fx = force * dx / dist, fy = force * dy / dist;
a.vx -= fx; a.vy -= fy;
b.vx += fx; b.vy += fy;
}
}
for (const edge of dataEdges) {
const s = edge.sourceNode, t = edge.targetNode;
const dx = t.x - s.x, dy = t.y - s.y;
const dist = Math.sqrt(dx * dx + dy * dy) || 1;
const force = (dist - linkDist) * linkStr * alpha;
const fx = force * dx / dist, fy = force * dy / dist;
s.vx += fx; s.vy += fy;
t.vx -= fx; t.vy -= fy;
}
const cx = svgW / 2, cy = svgH / 2;
const grav = 0.005 * alpha;
for (const node of dataNodes) {
if (node.pinned) continue;
node.vx += (cx - node.x) * grav;
node.vy += (cy - node.y) * grav;
}
simAlpha *= 0.992;
if (simAlpha < 0.001) simAlpha = 0;
for (const node of dataNodes) {
if (node.pinned) continue;
node.vx *= 0.6;
node.vy *= 0.6;
node.x += node.vx;
node.y += node.vy;
node.x = Math.max(20, Math.min(svgW - 20, node.x));
node.y = Math.max(20, Math.min(svgH - 20, node.y));
}
syncDom();
simId = requestAnimationFrame(tick);
}
function startSim() {
if (simRunning) return;
simRunning = true;
simAlpha = Math.max(simAlpha, 0.15);
if (simId) cancelAnimationFrame(simId);
simId = requestAnimationFrame(tick);
}
function wakeSim() {
simAlpha = Math.max(simAlpha, 0.3);
if (!simRunning) startSim();
}
function applyTransform() {
mainGroup.setAttribute("transform", `translate(${zoomTransform.x} ${zoomTransform.y}) scale(${zoomTransform.k})`);
}
// Detach the listeners and stop the simulation left over from a previous renderGraph() call.
// Without this, every re-render stacks another set of wheel/mouse handlers on graphSvg and window.
renderGraph.listenerAbort?.abort();
renderGraph.listenerAbort = new AbortController();
const { signal } = renderGraph.listenerAbort;
signal.addEventListener("abort", () => {
simRunning = false;
if (simId) cancelAnimationFrame(simId);
simId = null;
});
graphSvg.addEventListener("wheel", (e) => {
e.preventDefault();
const delta = e.deltaY > 0 ? 0.9 : 1.1;
const newK = Math.max(0.15, Math.min(6, zoomTransform.k * delta));
const r = graphSvg.getBoundingClientRect();
const cx = e.clientX - r.left, cy = e.clientY - r.top;
// Keep the point under the cursor fixed while the scale changes.
zoomTransform.x = cx - (cx - zoomTransform.x) * (newK / zoomTransform.k);
zoomTransform.y = cy - (cy - zoomTransform.y) * (newK / zoomTransform.k);
zoomTransform.k = newK;
applyTransform();
}, { passive: false, signal });
graphSvg.addEventListener("mousedown", (e) => {
const target = e.target.closest("[data-node-id]");
if (target) {
isDragging = true;
dragOccurred = false;
// dataset values are always strings, so compare as strings in case node ids are numeric.
dragNode = dataNodes.find((n) => String(n.id) === target.dataset.nodeId);
if (dragNode) {
const r = graphSvg.getBoundingClientRect();
dragOffX = (e.clientX - r.left - zoomTransform.x) / zoomTransform.k - dragNode.x;
dragOffY = (e.clientY - r.top - zoomTransform.y) / zoomTransform.k - dragNode.y;
target.style.cursor = "grabbing";
wakeSim();
}
return;
}
if (e.target === graphSvg || e.target === mainGroup) {
isPanning = true;
panStartX = e.clientX - zoomTransform.x;
panStartY = e.clientY - zoomTransform.y;
graphSvg.style.cursor = "grabbing";
}
}, { signal });
window.addEventListener("mousemove", (e) => {
if (isDragging && dragNode) {
dragOccurred = true;
const r = graphSvg.getBoundingClientRect();
dragNode.x = (e.clientX - r.left - zoomTransform.x) / zoomTransform.k - dragOffX;
dragNode.y = (e.clientY - r.top - zoomTransform.y) / zoomTransform.k - dragOffY;
dragNode.pinned = true;
syncDom();
} else if (isPanning) {
zoomTransform.x = e.clientX - panStartX;
zoomTransform.y = e.clientY - panStartY;
applyTransform();
}
}, { signal });
window.addEventListener("mouseup", () => {
if (isDragging && dragNode) {
const el = graphSvg.querySelector(`[data-node-id="${dragNode.id}"]`);
if (el) el.style.cursor = "grab";
dragNode.vx = 0;
dragNode.vy = 0;
wakeSim();
}
isDragging = false;
dragNode = null;
isPanning = false;
graphSvg.style.cursor = "";
}, { signal });
nodeEls.forEach((el, index) => {
el.addEventListener("click", (e) => {
if (dragOccurred) { dragOccurred = false; return; }
// nodeEls[i] was built from dataNodes[i], so an index lookup avoids a string-vs-number id mismatch on dataset values.
const node = dataNodes[index];
const related = edges.filter((edge) => edge.source === node.id || edge.target === node.id);
const kind = (node.kind || "Entity").toLowerCase();
let body = "";
if (kind === "meeting" || kind === "episode") {
body = `
<p>${h(node.description || node.summary || "暂无描述")}</p>
<div class="chip-row">
${node.date ? `<span class="chip">${h(node.date)}</span>` : ""}
<span class="chip">关系 ${h(related.length)}</span>
</div>`;
} else if (kind === "fact") {
body = `
<p>${h(node.fact || node.description || "暂无描述")}</p>
<div class="chip-row">
${node.date ? `<span class="chip">${h(node.date)}</span>` : ""}
<span class="chip">关系 ${h(related.length)}</span>
</div>`;
} else {
body = `
<p>${h(node.description || "暂无描述")}</p>
<div class="chip-row">
${node.entity_type ? `<span class="chip">${h(node.entity_type)}</span>` : ""}
${node.date ? `<span class="chip">${h(node.date)}</span>` : ""}
<span class="chip">关系 ${h(related.length)}</span>
</div>`;
}
renderInspector(`
<div class="detail-card">
<p class="eyebrow">${h(node.kind)}</p>
<h3>${h(node.label)}</h3>
${body}
</div>
${related.map((edge) => `
<article class="result-card">
<strong>${h(edge.source)} ${h(edge.target)}</strong>
<p>${h(edge.fact || edge.description || edge.predicate || "")}</p>
</article>
`).join("")}
`);
loadRelated(node.label).catch(() => relatedSearch.innerHTML = empty("相关检索加载失败"));
});
});
edgeEls.forEach((el, index) => {
el.addEventListener("click", () => {
graphSvg.querySelectorAll(".graph-edge.active").forEach((item) => item.classList.remove("active"));
const line = el.querySelector(".graph-edge");
line?.classList.add("active");
// edgeEls[i] was built from dataEdges[i]; an index lookup still works when edge ids are numeric or missing.
const edge = dataEdges[index];
renderInspector(`
<div class="detail-card">
<p class="eyebrow">Edge</p>
<h3>${h(edge.source)} ${h(edge.target)}</h3>
<p>${h(edge.fact || edge.description || "暂无补充描述")}</p>
<div class="chip-row">
${edge.predicate ? `<span class="chip">${h(edge.predicate)}</span>` : ""}
${edge.date ? `<span class="chip">${h(edge.date)}</span>` : ""}
<span class="chip">置信度 ${h(edge.confidence ?? 0)}</span>
${edge.meeting_id ? `<span class="chip">${h(edge.meeting_id)}</span>` : ""}
</div>
</div>
`);
loadRelated(`${edge.source} ${edge.predicate} ${edge.target}`).catch(() => relatedSearch.innerHTML = empty("相关检索加载失败"));
});
});
const resetBtn = document.createElement("button");
resetBtn.className = "btn ghost zoom-reset-btn";
resetBtn.textContent = "重置视图";
resetBtn.addEventListener("click", () => {
zoomTransform = { x: 0, y: 0, k: 1 };
applyTransform();
});
const pauseBtn = document.createElement("button");
pauseBtn.className = "btn ghost pause-btn";
pauseBtn.textContent = "⏸ 暂停";
pauseBtn.addEventListener("click", () => {
if (simRunning) {
simRunning = false;
if (simId) cancelAnimationFrame(simId);
simId = null;
pauseBtn.textContent = "▶ 继续";
} else {
wakeSim();
pauseBtn.textContent = "⏸ 暂停";
}
});
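// Remove the hint and button row added by a previous render so they do not pile up on every refresh.
document.querySelectorAll(".zoom-hint, .graph-sim-actions, .zoom-reset-btn, .pause-btn").forEach((el) => el.remove());
const toolbar = document.querySelector(".graph-toolbar .graph-controls");
if (toolbar) {
const wrap = toolbar.parentElement;
const hint = document.createElement("div");
hint.className = "zoom-hint";
hint.textContent = "滚轮缩放 · 空白拖拽平移 · 拖拽节点重排 · 物理动画自动冷却";
wrap.appendChild(hint);
const btnRow = document.createElement("div");
// The extra class marks this row as generated, so it can be told apart from the static .graph-toolbar-row in graph.html.
btnRow.className = "graph-toolbar-row graph-sim-actions";
btnRow.appendChild(resetBtn);
btnRow.appendChild(pauseBtn);
wrap.appendChild(btnRow);
}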
startSim();
syncDom();
}
async function fetchGraph() {
const query = graphQueryInput.value.trim();
const limitNodes = graphNodeLimit.value || "60";
const limitEdges = graphEdgeLimit.value || "120";
const params = new URLSearchParams();
if (query) params.set("q", query);
params.set("limit_nodes", limitNodes);
params.set("limit_edges", limitEdges);
if (selectedKinds && selectedKinds.size > 0) {
selectedKinds.forEach((k) => params.append("kinds", k));
}
renderInspector(empty("图谱加载中..."));
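const response = await fetch(`/api/graph?${params.toString()}`);
// Report HTTP failures through the caller's catch handler rather than showing a misleading "no data" state.
if (!response.ok) {
throw new Error(`HTTP ${response.status}`);
}
const payload = await response.json();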
renderGraph(payload);
}
graphForm?.addEventListener("submit", (event) => {
event.preventDefault();
fetchGraph().catch((error) => renderInspector(empty(`图谱加载失败: ${error}`)));
});
loadGraphKinds().catch(() => {});
fetchGraph().catch((error) => renderInspector(empty(`图谱加载失败: ${error}`)));

@ -0,0 +1,141 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Meeting Memory Console</title>
<link rel="stylesheet" href="/styles.css">
</head>
<body>
<div class="shell">
<aside class="sidebar">
<div class="brand">
<div class="brand-mark">M</div>
<div>
<p class="brand-kicker">Meeting Memory</p>
<h1>会议记忆中枢</h1>
</div>
</div>
<nav class="nav">
<a class="nav-link active" href="/index.html">总览面板</a>
<a class="nav-link" href="/graph.html">图谱浏览</a>
</nav>
<div class="side-card sidebar-shortcuts">
<a class="pill-link" href="#import-panel">导入会议</a>
<a class="pill-link" href="#search-panel">知识检索</a>
<a class="pill-link" href="/graph.html">图谱页</a>
</div>
</aside>
<main class="main">
<div class="main-toolbar">
<div>
<p class="eyebrow">Dashboard</p>
<h2>会议知识库</h2>
</div>
<div class="main-toolbar-actions">
<button class="btn" id="refreshDashboardBtn">刷新</button>
</div>
</div>
<section class="stats-grid" id="highlightGrid"></section>
<section class="panel unified-panel">
<div class="unified-tabs">
<button class="unified-tab active" data-tab="import">导入</button>
<button class="unified-tab" data-tab="search">检索</button>
<button class="unified-tab" data-tab="stats">统计</button>
</div>
<div class="unified-pane" id="unifiedImport">
<form class="import-form" id="importForm">
<fieldset id="importFieldset" class="import-fieldset">
<label class="field-label" for="importFile">选择文件</label>
<input id="importFile" type="file" accept=".md,.txt,text/markdown,text/plain">
<label class="field-label" for="importText">或直接粘贴会议文本</label>
<textarea id="importText" rows="6" placeholder="把会议纪要、聊天整理稿或录音转写文本粘贴到这里"></textarea>
<label class="check-row">
<input id="importForce" type="checkbox">
<span>发现重复时允许覆盖</span>
</label>
<button class="btn" id="importSubmitBtn" type="submit">开始导入</button>
</fieldset>
</form>
<div class="status-box" id="importStatus">支持 <code>.md</code> / <code>.txt</code>,也可以直接粘贴文本。</div>
<div class="progress-list" id="importProgress">
<div class="empty-state">导入开始后,这里会实时显示处理步骤。</div>
</div>
</div>
<div class="unified-pane hidden" id="unifiedSearch">
<form class="search-box" id="searchForm">
<input id="searchInput" type="text" placeholder="搜索会议主题、负责人、指标、关系事实...">
<button class="btn" type="submit">搜索</button>
</form>
<div class="search-results" id="searchResults">
<div class="empty-state">输入问题后,这里会展示混合检索结果。</div>
</div>
</div>
<div class="unified-pane hidden" id="unifiedStats">
<div class="mini-stats" id="statsList"></div>
</div>
</section>
<div class="content-grid">
<section class="panel" id="meeting-list">
<div class="panel-head">
<p class="eyebrow">Recent Archives</p>
<h3>最近会议</h3>
</div>
<div class="card-list" id="meetingCards"></div>
</section>
<section class="panel">
<div class="panel-head">
<p class="eyebrow">Action Items</p>
<h3>待跟进行动项</h3>
</div>
<div class="list-stack" id="actionList"></div>
</section>
<section class="panel">
<div class="panel-head">
<p class="eyebrow">Metrics</p>
<h3>关键指标</h3>
</div>
<div class="list-stack" id="metricList"></div>
</section>
<section class="panel">
<div class="panel-head">
<p class="eyebrow">Series</p>
<h3>会议系列</h3>
</div>
<div class="list-stack" id="seriesList"></div>
</section>
</div>
</main>
</div>
<dialog class="detail-modal" id="meetingDialog">
<div class="dialog-head">
<div>
<p class="eyebrow">Archived Meeting</p>
<h3 id="dialogTitle">会议详情</h3>
</div>
<button class="icon-btn" id="closeDialogBtn" type="button">×</button>
</div>
<p class="dialog-meta" id="dialogMeta"></p>
<pre class="dialog-content" id="dialogContent"></pre>
</dialog>
<script src="/app.js"></script>
</body>
</html>

@ -0,0 +1,977 @@
:root {
--primary: #5d67f5;
--primary-2: #7f8bff;
--primary-soft: #edf1ff;
--accent: #53c2da;
--bg: #f5f7ff;
--bg-2: #fbfcff;
--panel: rgba(255, 255, 255, 0.9);
--panel-strong: rgba(255, 255, 255, 0.96);
--border: rgba(212, 221, 247, 0.95);
--text: #22264d;
--muted: #68709d;
--danger: #b3261e;
--success: #11693c;
--shadow: 0 12px 28px rgba(73, 81, 141, 0.08);
--shadow-sm: 0 6px 16px rgba(73, 81, 141, 0.06);
--radius-xl: 20px;
--radius-lg: 16px;
--radius-md: 12px;
--radius-sm: 10px;
}
* { box-sizing: border-box; }
html, body {
margin: 0;
min-height: 100%;
}
body {
font-family: "Segoe UI", "PingFang SC", "Microsoft YaHei", sans-serif;
font-size: 13px;
color: var(--text);
background:
radial-gradient(circle at 10% 10%, rgba(126, 186, 255, 0.16), transparent 24%),
radial-gradient(circle at 88% 14%, rgba(132, 121, 255, 0.12), transparent 22%),
linear-gradient(135deg, #f8faff 0%, var(--bg) 55%, var(--bg-2) 100%);
}
a { color: inherit; text-decoration: none; }
button, input, textarea { font: inherit; }
.shell {
display: grid;
grid-template-columns: 220px minmax(0, 1fr);
gap: 14px;
min-height: 100vh;
padding: 14px;
}
.sidebar, .panel, .detail-modal::backdrop {
backdrop-filter: blur(12px);
}
.sidebar {
display: flex;
flex-direction: column;
gap: 10px;
padding: 14px;
border: 1px solid var(--border);
border-radius: 22px;
background: linear-gradient(180deg, rgba(236, 243, 255, 0.92), rgba(255, 255, 255, 0.8));
box-shadow: var(--shadow);
}
.brand {
display: flex;
gap: 10px;
align-items: center;
}
.brand-mark {
width: 40px;
height: 40px;
display: grid;
place-items: center;
border-radius: 14px;
color: #fff;
font-size: 17px;
font-weight: 800;
background: linear-gradient(135deg, var(--primary), var(--primary-2));
}
.brand-kicker, .eyebrow {
margin: 0 0 3px;
color: var(--primary);
font-size: 10px;
font-weight: 700;
letter-spacing: 0.08em;
text-transform: uppercase;
}
.brand h1, .panel h3, .dialog-head h3 {
margin: 0;
}
.brand h1 { font-size: 18px; }
.nav {
display: grid;
gap: 6px;
}
.nav-link {
padding: 10px 12px;
border: 1px solid transparent;
border-radius: var(--radius-md);
color: var(--muted);
font-size: 13px;
font-weight: 700;
transition: 0.2s ease;
}
.nav-link:hover, .nav-link.active {
color: var(--primary);
border-color: rgba(109, 123, 255, 0.16);
background: rgba(255, 255, 255, 0.78);
}
.side-card, .panel {
border: 1px solid var(--border);
border-radius: var(--radius-xl);
background: var(--panel);
box-shadow: var(--shadow-sm);
}
.panel { padding: 14px; }
.panel-head {
display: flex;
justify-content: space-between;
align-items: start;
gap: 10px;
margin-bottom: 10px;
}
.panel h3 { font-size: 17px; }
.sidebar-shortcuts {
display: flex;
flex-wrap: wrap;
gap: 6px;
padding: 10px;
margin-top: auto;
}
.pill-link, .chip {
display: inline-flex;
align-items: center;
min-height: 24px;
padding: 0 9px;
border-radius: 999px;
font-size: 11px;
font-weight: 700;
}
.pill-link {
background: rgba(255, 255, 255, 0.9);
border: 1px solid var(--border);
}
.chip {
background: var(--primary-soft);
color: var(--primary);
}
.chip.status-done, .chip.status-completed { background: #edfdf4; color: var(--success); }
.chip.status-pending, .chip.status-todo { background: #fff8e7; color: #b8860b; }
.chip.status-in_progress, .chip.status-active { background: #e8f4fd; color: #4a90d9; }
.chip.status-blocked { background: #fff4f2; color: var(--danger); }
.main {
display: flex;
flex-direction: column;
gap: 12px;
min-height: 0;
}
.main-toolbar {
display: flex;
justify-content: space-between;
align-items: center;
gap: 16px;
padding: 16px 18px;
border: 1px solid var(--border);
border-radius: 22px;
background:
radial-gradient(circle at top right, rgba(134, 144, 255, 0.12), transparent 28%),
linear-gradient(180deg, rgba(255, 255, 255, 0.94), rgba(244, 248, 255, 0.96));
box-shadow: var(--shadow);
}
.main-toolbar h2 {
margin: 0;
font-size: 22px;
}
.main-toolbar-actions {
display: flex;
gap: 8px;
}
.btn, .icon-btn {
border: none;
cursor: pointer;
transition: 0.2s ease;
}
.btn {
display: inline-flex;
align-items: center;
justify-content: center;
min-height: 36px;
padding: 0 14px;
border-radius: 11px;
font-size: 12px;
font-weight: 700;
color: #fff;
background: linear-gradient(135deg, var(--primary), var(--primary-2));
box-shadow: 0 8px 18px rgba(93, 103, 245, 0.18);
}
.btn:hover, .icon-btn:hover { transform: translateY(-1px); }
.btn:disabled {
opacity: 0.68;
cursor: not-allowed;
transform: none;
}
.btn.ghost {
color: var(--primary);
background: rgba(255, 255, 255, 0.94);
box-shadow: none;
border: 1px solid var(--border);
}
.stats-grid, .content-grid, .workspace-grid {
display: grid;
gap: 12px;
}
.stats-grid { grid-template-columns: repeat(4, minmax(0, 1fr)); }
.highlight-card {
padding: 0;
border: 1px solid var(--border);
border-radius: var(--radius-lg);
background: var(--panel-strong);
box-shadow: var(--shadow-sm);
overflow: hidden;
}
.highlight-card .hc-bar {
height: 4px;
background: var(--card-accent);
}
.highlight-card .eyebrow {
padding: 12px 14px 0;
}
.highlight-card strong {
display: block;
margin: 4px 0 2px;
padding: 0 14px;
font-size: 26px;
color: var(--card-accent);
}
.highlight-card p:last-child {
padding: 0 14px 14px;
margin: 0;
color: var(--muted);
}
.dashboard-grid {
grid-template-columns: minmax(330px, 1.1fr) minmax(340px, 1fr) minmax(220px, 0.72fr);
align-items: start;
}
.search-box, .import-form, .import-fieldset {
display: grid;
gap: 8px;
}
.import-fieldset {
margin: 0;
padding: 0;
border: 0;
min-width: 0;
}
.import-fieldset:disabled { opacity: 0.6; }
.search-box input, .graph-controls input, textarea, input[type="file"] {
width: 100%;
min-height: 38px;
padding: 9px 12px;
border: 1px solid var(--border);
border-radius: 11px;
background: rgba(255, 255, 255, 0.94);
color: var(--text);
}
textarea {
min-height: 138px;
resize: vertical;
}
.field-label {
font-size: 11px;
font-weight: 700;
color: var(--muted);
}
.check-row {
display: flex;
align-items: center;
gap: 8px;
font-size: 12px;
color: var(--muted);
}
.status-box {
margin-top: 10px;
padding: 10px 12px;
border-radius: 12px;
border: 1px solid var(--border);
background: rgba(255, 255, 255, 0.76);
font-size: 12px;
color: var(--muted);
}
.status-box[data-kind="error"] {
color: var(--danger);
background: #fff4f2;
}
.status-box[data-kind="success"] {
color: var(--success);
background: #edfdf4;
}
.progress-list, .search-results, .mini-stats, .card-list, .list-stack, .related-search {
display: grid;
gap: 8px;
}
.progress-item, .mini-stat, .card, .list-item, .result-card, .detail-card {
padding: 12px;
border: 1px solid var(--border);
border-radius: 14px;
background: rgba(255, 255, 255, 0.88);
}
.progress-item {
display: grid;
grid-template-columns: 24px 1fr;
gap: 8px;
align-items: start;
}
.progress-index {
width: 24px;
height: 24px;
display: grid;
place-items: center;
border-radius: 999px;
background: var(--primary-soft);
color: var(--primary);
font-size: 11px;
font-weight: 700;
}
.mini-stat {
display: flex;
align-items: center;
gap: 10px;
padding: 10px 12px;
}
.ms-icon {
width: 32px;
height: 32px;
display: grid;
place-items: center;
border-radius: 10px;
font-size: 15px;
background: color-mix(in srgb, var(--stat-color) 14%, transparent);
color: var(--stat-color);
flex-shrink: 0;
}
.ms-body strong {
display: block;
font-size: 16px;
line-height: 1.2;
}
.ms-body p {
margin: 0;
font-size: 11px;
color: var(--muted);
}
.mini-stat strong, .card h4, .list-item strong, .result-card strong {
display: block;
margin-bottom: 4px;
}
.card { cursor: pointer; }
.card:hover, .result-card:hover, .list-item:hover {
border-color: rgba(120, 132, 255, 0.34);
}
.content-grid { grid-template-columns: repeat(2, minmax(0, 1fr)); }
/* ── Meeting card ── */
.meeting-card {
display: flex;
gap: 10px;
padding: 12px;
border: 1px solid var(--border);
border-radius: 14px;
background: rgba(255, 255, 255, 0.88);
cursor: pointer;
transition: 0.2s ease;
}
.meeting-card:hover {
border-color: rgba(120, 132, 255, 0.34);
}
.mc-date {
flex-shrink: 0;
width: 44px;
height: 44px;
display: grid;
place-items: center;
border-radius: 10px;
background: var(--primary-soft);
color: var(--primary);
font-size: 11px;
font-weight: 700;
text-align: center;
line-height: 1.2;
}
.mc-body h4 {
margin: 0 0 4px;
font-size: 13px;
}
.mc-body p {
margin: 0;
font-size: 12px;
color: var(--muted);
display: -webkit-box;
-webkit-line-clamp: 2;
-webkit-box-orient: vertical;
overflow: hidden;
}
/* ── List item with priority dot ── */
.list-item {
display: flex;
gap: 10px;
padding: 12px;
border: 1px solid var(--border);
border-radius: 14px;
background: rgba(255, 255, 255, 0.88);
}
.li-priority {
flex-shrink: 0;
width: 4px;
border-radius: 2px;
background: var(--pri-color);
}
.li-body {
flex: 1;
min-width: 0;
}
.li-body strong {
display: block;
margin-bottom: 2px;
}
.li-body p {
margin: 0 0 6px;
font-size: 12px;
color: var(--muted);
}
/* ── Metric card ── */
.metric-card {
padding: 12px;
border: 1px solid var(--border);
border-radius: 14px;
background: rgba(255, 255, 255, 0.88);
}
.mc-head {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 2px;
}
.mc-head strong {
display: block;
}
.mc-value {
font-size: 16px;
font-weight: 700;
color: var(--primary);
}
.metric-card p {
margin: 0 0 8px;
font-size: 12px;
color: var(--muted);
}
.mc-bar-track {
height: 4px;
border-radius: 2px;
background: rgba(212, 221, 247, 0.5);
margin-bottom: 8px;
overflow: hidden;
}
.mc-bar-fill {
height: 100%;
border-radius: 2px;
background: linear-gradient(90deg, var(--primary), var(--primary-2));
transition: width 0.4s ease;
}
/* ── Series card ── */
.series-card {
display: flex;
gap: 10px;
align-items: center;
padding: 12px;
border: 1px solid var(--border);
border-radius: 14px;
background: rgba(255, 255, 255, 0.88);
}
.sc-count {
flex-shrink: 0;
width: 36px;
height: 36px;
display: grid;
place-items: center;
border-radius: 10px;
font-size: 14px;
font-weight: 700;
background: var(--primary-soft);
color: var(--primary);
}
.sc-body strong {
display: block;
margin-bottom: 2px;
}
.sc-body p {
margin: 0;
font-size: 12px;
color: var(--muted);
}
/* ── Unified Import / Search panel ── */
.unified-panel {
display: flex;
flex-direction: column;
}
.unified-tabs {
display: flex;
gap: 4px;
margin-bottom: 12px;
padding: 3px;
border-radius: 11px;
background: rgba(212, 221, 247, 0.3);
}
.unified-tab {
flex: 1;
padding: 7px 12px;
border: none;
border-radius: 8px;
font-size: 12px;
font-weight: 700;
cursor: pointer;
background: transparent;
color: var(--muted);
transition: 0.2s ease;
}
.unified-tab.active {
background: #fff;
color: var(--primary);
box-shadow: 0 2px 6px rgba(73, 81, 141, 0.1);
}
.unified-tab:hover:not(.active) {
color: var(--text);
}
.unified-pane.hidden {
display: none;
}
/* ── Result card with kind badge ── */
.result-card {
position: relative;
}
.rc-kind {
display: inline-block;
padding: 1px 7px;
border-radius: 4px;
font-size: 10px;
font-weight: 700;
text-transform: uppercase;
background: var(--primary-soft);
color: var(--primary);
margin-bottom: 4px;
}
.empty-state {
padding: 16px 14px;
text-align: center;
border: 1px dashed var(--border);
border-radius: 14px;
color: var(--muted);
}
.detail-modal {
width: min(820px, calc(100vw - 24px));
border: 1px solid var(--border);
border-radius: 20px;
padding: 0;
background: rgba(255, 255, 255, 0.97);
box-shadow: var(--shadow);
}
.detail-modal::backdrop {
background: rgba(37, 44, 78, 0.28);
}
.dialog-head {
display: flex;
justify-content: space-between;
gap: 10px;
padding: 16px 16px 6px;
}
.dialog-meta { padding: 0 16px 6px; color: var(--muted); }
.dialog-content {
margin: 0;
padding: 0 16px 16px;
white-space: pre-wrap;
font-family: "Consolas", "Courier New", monospace;
max-height: 60vh;
overflow: auto;
color: var(--muted);
}
.icon-btn {
width: 30px;
height: 30px;
border-radius: 10px;
background: rgba(242, 245, 255, 0.92);
color: var(--primary);
font-size: 20px;
}
/* ── Graph page ── */
.graph-shell {
height: 100vh;
overflow: hidden;
gap: 10px;
padding: 10px;
}
.graph-shell .sidebar {
flex-shrink: 0;
}
.graph-shell .main {
gap: 8px;
}
.graph-shell .graph-layout {
gap: 8px;
}
.graph-shell .graph-layout .panel {
padding: 10px;
}
.graph-layout {
display: grid;
grid-template-columns: 1fr 300px;
gap: 12px;
flex: 1;
min-height: 0;
}
.graph-stage-panel {
display: flex;
flex-direction: column;
padding: 0;
overflow: hidden;
}
.graph-stage {
flex: 1;
min-height: 0;
position: relative;
background:
linear-gradient(180deg, rgba(251, 253, 255, 0.96), rgba(241, 246, 255, 0.94)),
radial-gradient(circle at center, rgba(133, 196, 255, 0.08), transparent 36%);
}
#graphSvg {
width: 100%;
height: 100%;
display: block;
}
.detail-panel {
display: flex;
flex-direction: column;
gap: 8px;
overflow: hidden;
}
.detail-panel .detail-card,
.detail-panel .related-search {
overflow-y: auto;
}
.detail-card {
flex-shrink: 0;
word-break: break-all;
}
.detail-card strong {
word-break: break-word;
}
.related-search {
flex-shrink: 0;
}
.related-search .result-card {
word-break: break-all;
}
/* ── Graph toolbar ── */
.graph-toolbar { padding: 8px 12px; }
.graph-controls {
display: flex;
gap: 6px;
align-items: center;
}
.graph-controls .search-input {
flex: 1;
min-height: 30px;
padding: 6px 10px;
}
.graph-controls label.field-label {
display: flex;
align-items: center;
gap: 2px;
white-space: nowrap;
font-size: 10px;
}
.graph-controls label.field-label input {
width: 44px;
min-height: 26px;
padding: 4px 6px;
}
.graph-controls .btn {
min-height: 30px;
padding: 0 12px;
font-size: 11px;
}
.graph-toolbar-row {
display: flex;
justify-content: space-between;
align-items: center;
flex-wrap: wrap;
gap: 6px;
margin-top: 6px;
}
.graph-actions {
display: flex;
align-items: center;
gap: 8px;
font-size: 11px;
color: var(--muted);
}
.graph-type-filter {
display: flex;
flex-wrap: wrap;
align-items: center;
gap: 4px 10px;
}
.graph-type-filter label {
display: inline-flex;
align-items: center;
gap: 3px;
font-size: 11px;
color: var(--muted);
cursor: pointer;
user-select: none;
}
.graph-type-filter label input {
margin: 0;
accent-color: var(--primary);
}
.graph-meta { font-size: 11px; color: var(--muted); }
/* ── Graph nodes & edges ── */
.graph-node { cursor: pointer; }
.graph-node circle {
stroke: rgba(255, 255, 255, 0.85);
stroke-width: 2;
transition: filter 0.15s;
}
.graph-node--meeting circle { fill: #4a90d9; }
.graph-node--episode circle { fill: #34c759; }
.graph-node--entity circle { fill: var(--accent); }
.graph-node--fact circle { fill: #ff9500; }
.graph-node:hover circle { filter: brightness(1.2); }
.graph-node text {
font-size: 11px;
fill: var(--text);
pointer-events: none;
user-select: none;
}
.graph-edge {
stroke: rgba(120, 136, 194, 0.42);
stroke-width: 1.6;
cursor: pointer;
transition: stroke 0.15s, stroke-width 0.15s;
}
.edge-wrap:hover .graph-edge {
stroke: rgba(120, 136, 194, 0.7);
stroke-width: 2;
}
.graph-edge.active {
stroke: var(--primary);
stroke-width: 2.4;
}
.edge-wrap text {
pointer-events: none;
user-select: none;
}
/* ── Legend ── */
.legend { font-size: 11px; color: var(--muted); }
.legend-dot {
display: inline-block;
width: 9px;
height: 9px;
border-radius: 50%;
margin-right: 6px;
}
.legend-dot.meeting { background: #4a90d9; }
.legend-dot.episode { background: #34c759; }
.legend-dot.entity { background: var(--accent); }
.legend-dot.fact { background: #ff9500; }
.graph-shell .sidebar {
gap: 8px;
padding: 10px;
}
.graph-shell .sidebar .legend {
display: flex;
flex-direction: column;
gap: 3px;
font-size: 11px;
padding: 0 4px;
}
.graph-shell .sidebar .legend .eyebrow {
margin-bottom: 4px;
}
/* ── Graph controls overlay ── */
.zoom-reset-btn, .pause-btn {
font-size: 11px;
min-height: 28px;
padding: 0 10px;
}
.zoom-hint {
font-size: 11px;
color: var(--muted);
padding: 4px 0;
}
/* ── Responsive ── */
@media (max-width: 1240px) {
.shell, .graph-shell, .dashboard-grid, .content-grid, .graph-layout, .stats-grid {
grid-template-columns: 1fr;
}
.sidebar { order: 2; }
.graph-shell { height: auto; overflow: auto; }
}
@media (max-width: 720px) {
.shell, .graph-shell {
padding: 10px;
gap: 10px;
}
.sidebar, .panel { border-radius: 18px; }
.search-box { grid-template-columns: 1fr; }
.graph-stage { min-height: 250px; }
.graph-controls { flex-wrap: wrap; }
.graph-controls .search-input { min-width: 100%; }
}

@@ -1,8 +1,4 @@
openai>=1.0.0
pydantic>=2.0.0
llama-index>=0.10.0
llama-index-embeddings-openai>=0.1.0
llama-index-vector-stores-chroma>=0.1.0
chromadb>=0.5.0
python-dotenv>=1.0.0
neo4j>=5.26.0

@@ -1,259 +0,0 @@
import hashlib
import json
import logging
import os
import re
from typing import List, Optional
from openai import OpenAI as OpenAI_Client
from llama_index.core import (
    Document,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.embeddings import BaseEmbedding
from llama_index.core.settings import Settings
from config import config
logger = logging.getLogger(__name__)
class CustomOpenAIEmbedding(BaseEmbedding):
    def __init__(
        self,
        model: str = "text-embedding-ada-002",
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        **kwargs,
    ):
        super().__init__(model_name=model, **kwargs)
        self._client = OpenAI_Client(
            api_key=api_key or "not-needed",
            base_url=api_base,
        )
        self._model = model
    async def _aget_query_embedding(self, query: str) -> List[float]:
        return self._get_embedding(query)
    async def _aget_text_embedding(self, text: str) -> List[float]:
        return self._get_embedding(text)
    def _get_query_embedding(self, query: str) -> List[float]:
        return self._get_embedding(query)
    def _get_text_embedding(self, text: str) -> List[float]:
        return self._get_embedding(text)
    def _get_embedding(self, text: str) -> List[float]:
        resp = self._client.embeddings.create(
            model=self._model,
            input=text,
        )
        return resp.data[0].embedding
class MeetingVectorStore:
    def __init__(self):
        embed_model = CustomOpenAIEmbedding(
            model=config.embedding.model,
            api_key=config.embedding.api_key or None,
            api_base=config.embedding.api_base if config.embedding.api_base else None,
        )
        Settings.embed_model = embed_model
        self.persist_dir = config.vector_store.persist_dir
        self._index: Optional[VectorStoreIndex] = None
        self._load_or_create_index()
    def _load_or_create_index(self):
        # Reuse the persisted index when present; otherwise start from an empty one.
        if os.path.exists(os.path.join(self.persist_dir, "docstore.json")):
            try:
                storage_context = StorageContext.from_defaults(persist_dir=self.persist_dir)
                self._index = load_index_from_storage(storage_context)
                logger.info(f"Loaded vector index from disk: {self.persist_dir}")
                return
            except Exception as e:
                logger.warning(f"Failed to load vector index; creating a new one: {e}")
        self._index = VectorStoreIndex.from_documents([])
        logger.info("Created a new vector index")
    def _save(self):
        if self._index:
            os.makedirs(self.persist_dir, exist_ok=True)
            self._index.storage_context.persist(persist_dir=self.persist_dir)
    def _meeting_id(self, meeting_data: dict) -> str:
        title = meeting_data.get("title", "")
        date = meeting_data.get("date", "")
        raw = f"{date}_{title}"
        return f"meeting_{hashlib.md5(raw.encode('utf-8')).hexdigest()[:12]}"
    def find_meeting(self, title: str, date: str = "") -> Optional[dict]:
        if not self._index:
            return None
        # Query labels are kept in Chinese: "会议标题" = meeting title, "日期" = date.
        query_text = f"会议标题: {title}"
        if date:
            query_text += f" 日期: {date}"
        try:
            results = self.query(query_text, top_k=3)
            for r in results:
                meta = r.get("metadata", {})
                meta_title = meta.get("title", "")
                if meta_title == title or (date and meta.get("date") == date):
                    return meta
            return None
        except Exception as e:
            logger.warning(f"Meeting duplicate-check query failed: {e}")
            return None
    def find_similar_text(self, text: str, threshold: float = 0.92) -> Optional[dict]:
        if not self._index:
            return None
        try:
            retriever = self._index.as_retriever(similarity_top_k=3)
            nodes = retriever.retrieve(text)
            # Return the first hit whose similarity score exceeds the threshold.
            for node in nodes:
                if node.score is not None and node.score > threshold:
                    return {
                        "metadata": node.metadata,
                        "score": node.score,
                    }
            return None
        except Exception as e:
            logger.warning(f"Text-similarity duplicate check failed: {e}")
            return None
    def remove_meeting(self, meeting_id: str) -> bool:
        if not self._index:
            return False
        try:
            # Each meeting is stored as one document per field, keyed "<meeting_id>_<field>".
            for field in self._FIELD_TYPES:
                self._index.delete_ref_doc(f"{meeting_id}_{field}")
            self._save()
            logger.info(f"Removed meeting from vector index: {meeting_id}")
            return True
        except Exception as e:
            logger.warning(f"Failed to remove meeting from vector index: {e}")
            return False
    _FIELD_TYPES = ["header", "summary", "action_items", "metrics", "decisions", "relations", "entities"]
    def add_meeting(self, meeting_data: dict) -> bool:
        try:
            meeting_id = self._meeting_id(meeting_data)
            original_text_path = meeting_data.get("_original_text_path", "")
            original_text = meeting_data.get("_original_text", "")
            base_metadata = {
                "title": meeting_data.get("title", ""),
                "date": meeting_data.get("date", ""),
                "participants": ", ".join(meeting_data.get("participants", [])),
                "type": "meeting",
                "content_hash": meeting_data.get("_content_hash", ""),
                "original_text_path": original_text_path,
                "original_text_excerpt": original_text[:500] if original_text else "",
                "meeting_id": meeting_id,
            }
            docs = self._build_field_docs(meeting_data, base_metadata, meeting_id)
            if self._index:
                for doc in docs:
                    self._index.insert(doc)
                self._save()
            logger.info(f"Meeting '{meeting_data.get('title')}' added to vector index (id={meeting_id}, fields={len(docs)})")
            return True
        except Exception as e:
            logger.error(f"Failed to add meeting to vector index: {e}")
            return False
    def _build_field_docs(self, data: dict, base: dict, meeting_id: str) -> List[Document]:
        # Indexed text keeps its Chinese labels: "日期" = date, "参会人" = participants,
        # "负责人" = assignee, "截止" = deadline, "优先级" = priority, "演变" = history,
        # "目标" = target, "趋势" = trend, "待办" = to-do.
        docs = []
        header = f"# {data.get('title', '')}"
        if data.get("date"):
            header += f"\n日期: {data['date']}"
        if data.get("participants"):
            header += f"\n参会人: {', '.join(data['participants'])}"
        docs.append(Document(text=header, metadata={**base, "field": "header"}, doc_id=f"{meeting_id}_header"))
        if data.get("summary"):
            docs.append(Document(text=data["summary"], metadata={**base, "field": "summary"}, doc_id=f"{meeting_id}_summary"))
        if data.get("action_items"):
            lines = []
            for item in data["action_items"]:
                status = item.get('status', '待办')
                lines.append(f"- [{status}] {item.get('task', '')} (负责人: {item.get('assignee', '')}, 截止: {item.get('deadline', '')}, 优先级: {item.get('priority', '')})")
                history = item.get("_history", [])
                if len(history) > 1:
                    lines.append(" 演变: " + "".join(f"{h.get('date','')}({h.get('status','')})" for h in history))
            docs.append(Document(text="\n".join(lines), metadata={**base, "field": "action_items"}, doc_id=f"{meeting_id}_action_items"))
        if data.get("metrics"):
            lines = []
            for m in data["metrics"]:
                lines.append(f"- {m.get('metric_name', '')}: {m.get('value', '')} (目标: {m.get('target', '')}, 趋势: {m.get('trend', '')})")
            docs.append(Document(text="\n".join(lines), metadata={**base, "field": "metrics"}, doc_id=f"{meeting_id}_metrics"))
        if data.get("decisions"):
            lines = [f"- {d.get('content', '')}" for d in data["decisions"]]
            docs.append(Document(text="\n".join(lines), metadata={**base, "field": "decisions"}, doc_id=f"{meeting_id}_decisions"))
        if data.get("relations"):
            lines = [f"- {r.get('subject', '')} --{r.get('predicate', '')}--> {r.get('object', '')}" for r in data["relations"]]
            docs.append(Document(text="\n".join(lines), metadata={**base, "field": "relations"}, doc_id=f"{meeting_id}_relations"))
        if data.get("entities"):
            lines = [f"- [{e.get('entity_type', '')}] {e.get('name', '')}: {e.get('description', '')}" for e in data["entities"]]
            docs.append(Document(text="\n".join(lines), metadata={**base, "field": "entities"}, doc_id=f"{meeting_id}_entities"))
        return docs
    def query(self, question: str, top_k: int = 5) -> List[dict]:
        if not self._index:
            return []
        try:
            retriever = self._index.as_retriever(similarity_top_k=top_k)
            nodes = retriever.retrieve(question)
            results = []
            for node in nodes:
                results.append({
                    "text": node.text,
                    "score": node.score,
                    "metadata": node.metadata,
                })
            return results
        except Exception as e:
            logger.error(f"Vector index query failed: {e}")
            return []
    def query_as_context(self, question: str, top_k: int = 3) -> str:
        results = self.query(question, top_k=top_k)
        if not results:
            return ""
        parts = []
        for i, r in enumerate(results):
            metadata = r.get("metadata", {})
            parts.append(f"[{i+1}] {metadata.get('title', '未知会议')} ({metadata.get('date', '')})\n{r['text']}\n")
        return "\n".join(parts)
    def get_stats(self) -> dict:
        if not self._index:
            return {"doc_count": 0, "node_count": 0}
        try:
            docstore = self._index.docstore
            docs = list(docstore.docs.values()) if hasattr(docstore, 'docs') else []
            return {
                "doc_count": len(docstore.docs) if hasattr(docstore, 'docs') else 0,
                "node_count": len(docs),
            }
        except Exception:
            return {"doc_count": 0, "node_count": 0}
meeting_vector_store = MeetingVectorStore()
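
The removed module keyed every indexed document on a deterministic meeting ID, which is what allowed `remove_meeting` to delete all per-field documents without a lookup. The sketch below isolates that scheme using only the standard library. The function names `derive_meeting_id` and `field_doc_ids` are hypothetical, introduced here for illustration; the hashing recipe and the field list mirror `_meeting_id` and `_FIELD_TYPES` above.

```python
import hashlib
from typing import List

# Same field list as _FIELD_TYPES in the removed module.
FIELD_TYPES = ["header", "summary", "action_items", "metrics", "decisions", "relations", "entities"]


def derive_meeting_id(meeting_data: dict) -> str:
    """Hypothetical helper: hash "<date>_<title>" and keep the first 12 hex digits."""
    raw = f"{meeting_data.get('date', '')}_{meeting_data.get('title', '')}"
    return f"meeting_{hashlib.md5(raw.encode('utf-8')).hexdigest()[:12]}"


def field_doc_ids(meeting_id: str) -> List[str]:
    """Hypothetical helper: one document ID per field, "<meeting_id>_<field>"."""
    return [f"{meeting_id}_{field}" for field in FIELD_TYPES]


# Example input is made up for illustration.
example = {"title": "Weekly sync", "date": "2026-06-01"}
example_id = derive_meeting_id(example)
print(example_id)
print(field_doc_ids(example_id))
```

Because the ID depends only on the date and title, two meetings that share both values map to the same ID, so re-importing such a meeting reuses the same document IDs.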