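# TraceStudio v2.0 Advanced Features

## 🎯 New Features Overview

This document describes the advanced features introduced in TraceStudio v2.0:

1. **Special node types** - InputNode, OutputNode, FunctionNode
2. **Edge classification** - thick edges (arrays) vs. thin edges (scalars)
3. **Dimension conversion** - expand, collapse, and broadcast operations
4. **Nested function nodes** - reusable sub-workflows
5. **Array operation nodes** - a set of nodes designed specifically for arrays

---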

## 📦 Special Node Types

### 1. InputNode - Workflow Entry

**Purpose**: serves as the input entry point of a sub-workflow.

```python
from typing import Any, Dict, Optional


@register_node
class InputNodeImpl(InputNode):
    """Input node."""

    @output_port("output", "Any", description="Emits all received inputs")
    async def process(self, inputs: Dict[str, Any], context: Optional[Dict] = None):
        # Pass the external inputs through unchanged on the "output" port
        return {
            "outputs": {"output": inputs},
            "context": context or {}
        }
```

**Characteristics**:

- ✅ Has no input ports (output only)
- ✅ Passes external input straight into the workflow
- ✅ Required in every function workflow
- ✅ Usually the first node of the workflow

**Position in the workflow**:

```
Outside world
    ↓
[InputNode]  ← receives data from the global context
    ↓
[Business-logic nodes]
    ↓
[OutputNode]
    ↓
Returned result
```

### 2. OutputNode - Workflow Exit

**Purpose**: serves as the output exit point of a sub-workflow.

```python
from typing import Any, Dict, Optional


@register_node
class OutputNodeImpl(OutputNode):
    """Output node."""

    @input_port("input", "Any", description="Data to output")
    async def process(self, inputs: Dict[str, Any], context: Optional[Dict] = None):
        # Return whatever arrived on the input ports as the workflow result
        return {
            "outputs": inputs,
            "context": context or {}
        }
```

**Characteristics**:

- ✅ Has no output ports (input only)
- ✅ Collects the results produced inside the workflow
- ✅ Required in every function workflow
- ✅ Usually the last node of the workflow

### 3. FunctionNode - Reusable Function

**Purpose**: wraps an entire sub-workflow as a single node.

```python
{
    "id": "multiply_and_sum",
    "type": "FunctionNode",
    "display_name": "Multiply and Sum",
    "sub_workflow": {
        "nodes": [
            {"id": "input", "type": "InputNodeImpl"},
            {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
            {"id": "sum", "type": "ArrayReduceNode"},
            {"id": "output", "type": "OutputNodeImpl"}
        ],
        "edges": [...]
    }
}
```

**Characteristics**:

- ✅ Encapsulates a complex sub-workflow as a black box
- ✅ Supports nesting (a function node may contain other function nodes)
- ✅ Behaves like an ordinary node from the outside
- ✅ Reusable (the same function can be used many times)

**Nesting example**:

```
[Main FunctionNode]
 └─ [Sub FunctionNode 1]
     └─ [Node A]
     └─ [Node B]
 └─ [Sub FunctionNode 2]
     └─ [Node C]
```
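
The depth of a nested definition like the one above can be measured directly from the node dict. The following is a minimal sketch, assuming node definitions are plain dicts shaped like the `multiply_and_sum` example (a `type` field and, for function nodes, `sub_workflow.nodes`). `nesting_depth` is a hypothetical helper for illustration, not part of TraceStudio; a check like this could back a depth guideline for nested function nodes.

```python
from typing import Any, Dict


def nesting_depth(node: Dict[str, Any]) -> int:
    """Return the number of FunctionNode levels in a node definition (0 for plain nodes)."""
    if node.get("type") != "FunctionNode":
        return 0
    children = node.get("sub_workflow", {}).get("nodes", [])
    # One level for this function node plus the deepest level among its children
    return 1 + max((nesting_depth(child) for child in children), default=0)


# Hypothetical definition mirroring the nesting diagram
main = {
    "id": "main",
    "type": "FunctionNode",
    "sub_workflow": {"nodes": [
        {"id": "sub1", "type": "FunctionNode",
         "sub_workflow": {"nodes": [{"id": "a", "type": "NodeA"},
                                    {"id": "b", "type": "NodeB"}]}},
        {"id": "sub2", "type": "FunctionNode",
         "sub_workflow": {"nodes": [{"id": "c", "type": "NodeC"}]}},
    ]},
}

print(nesting_depth(main))  # 2
```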

---

## 🔌 Edge Classification

### EdgeType - Edge Types

```python
from enum import Enum


class EdgeType(Enum):
    SCALAR = "scalar"  # thin edge: a single element
    ARRAY = "array"    # thick edge: an array
```

**Front-end rendering**:

- Thick edge 🟦 = array type
- Thin edge ▬ = scalar type

**Example edge definitions**:

```python
edges = [
    {
        "source": "node1",
        "sourcePort": "output",
        "target": "node2",
        "targetPort": "input",
        "edgeType": "array"  # thick edge: carries an array
    },
    {
        "source": "node2",
        "sourcePort": "result",
        "target": "node3",
        "targetPort": "input",
        "edgeType": "scalar"  # thin edge: carries a scalar
    }
]
```
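
Edge definitions of this shape lend themselves to a simple design-time check. The sketch below is illustrative only: `infer_edge_type` and `find_invalid_edges` are hypothetical helpers, and two assumptions are made that the document does not confirm, namely that lists and tuples count as arrays and that an edge without an `edgeType` key is acceptable.

```python
from enum import Enum
from typing import Any, Dict, List


class EdgeType(Enum):
    SCALAR = "scalar"
    ARRAY = "array"


def infer_edge_type(value: Any) -> EdgeType:
    """Classify a runtime value (assumption: lists and tuples are arrays)."""
    return EdgeType.ARRAY if isinstance(value, (list, tuple)) else EdgeType.SCALAR


def find_invalid_edges(edges: List[Dict[str, Any]]) -> List[int]:
    """Return indices of edges whose "edgeType" is present but not a known EdgeType value."""
    valid = {t.value for t in EdgeType}
    return [i for i, e in enumerate(edges)
            if "edgeType" in e and e["edgeType"] not in valid]


edges = [
    {"source": "node1", "target": "node2", "edgeType": "array"},
    {"source": "node2", "target": "node3", "edgeType": "matrix"},  # unknown type
    {"source": "node3", "target": "node4"},                        # no edgeType given
]

print(find_invalid_edges(edges))        # [1]
print(infer_edge_type([1, 2, 3]).value)  # array
```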

---

## ⬆️ ⬇️ Dimension Conversion

### DimensionMode - Conversion Modes

```python
from enum import Enum


class DimensionMode(Enum):
    NONE = "none"            # no conversion
    EXPAND = "expand"        # expand: array → single elements (iterate)
    COLLAPSE = "collapse"    # collapse: single elements → array (pack)
    BROADCAST = "broadcast"  # broadcast: single value → array
```

### Scenario 1: Expand (EXPAND)

**Description**: an array is connected to a single-element input.

```
Input array: [1, 2, 3]
    ↓ (EXPAND)
Executed once per element (the node's other operand is 9 in this example):
  - AddNode(1) → 10
  - AddNode(2) → 11
  - AddNode(3) → 12
    ↓ (packed into an array)
Output array: [10, 11, 12]
```

**Implementation**:

```python
# Edge definition
{
    "source": "array_source",
    "sourcePort": "values",
    "target": "add_node",
    "targetPort": "a",
    "dimensionMode": "expand"  # expand the array into single elements
}

# AddNode is executed 3 times, once per array element;
# the outputs are then packed back into an array
```
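
The expand behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the executor's actual code: `add_node` is a hypothetical stand-in for AddNode with a fixed second operand of 9, and `run_expanded` is a made-up helper that runs a scalar coroutine once per element and packs the results.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def add_node(a: int, b: int = 9) -> int:
    """Stand-in for AddNode: adds the fixed operand b to the scalar input a."""
    return a + b


async def run_expanded(process: Callable[[Any], Awaitable[Any]],
                       values: List[Any]) -> List[Any]:
    """Run a scalar node once per element, in order, and pack the results into a list."""
    results = []
    for value in values:
        results.append(await process(value))
    return results


print(asyncio.run(run_expanded(add_node, [1, 2, 3])))  # [10, 11, 12]
```

Because the node runs once per element, the cost grows linearly with the array length.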

### Scenario 2: Collapse (COLLAPSE)

**Description**: several single-element edges converge on one array input.

```
Edge 1: value_a ──┐
                  ├→ ArrayConcatNode(arrays=[]) → [a, b]
Edge 2: value_b ──┘
```

**Implementation**:

```python
# Multiple edges into the same port are packed automatically
edges = [
    {"source": "node1", "sourcePort": "output", "target": "concat", "targetPort": "arrays"},
    {"source": "node2", "sourcePort": "output", "target": "concat", "targetPort": "arrays"}
]

# The "arrays" input of the concat node receives [value_a, value_b]
```
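
One way to picture the packing step is to group edge values by their target port. The sketch below assumes upstream results are stored in a dict keyed by `(node_id, port)` and that values are packed in edge order; both are assumptions for illustration, and `collapse_inputs` is a hypothetical helper rather than a TraceStudio function.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Port = Tuple[str, str]  # (node id, port name)


def collapse_inputs(edges: List[Dict[str, str]],
                    outputs: Dict[Port, Any]) -> Dict[Port, List[Any]]:
    """Pack the values of all edges that share a target port into one list, in edge order."""
    packed: Dict[Port, List[Any]] = defaultdict(list)
    for edge in edges:
        value = outputs[(edge["source"], edge["sourcePort"])]
        packed[(edge["target"], edge["targetPort"])].append(value)
    return dict(packed)


edges = [
    {"source": "node1", "sourcePort": "output", "target": "concat", "targetPort": "arrays"},
    {"source": "node2", "sourcePort": "output", "target": "concat", "targetPort": "arrays"},
]
outputs = {("node1", "output"): "value_a", ("node2", "output"): "value_b"}

print(collapse_inputs(edges, outputs))
# {('concat', 'arrays'): ['value_a', 'value_b']}
```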

### Scenario 3: Broadcast (BROADCAST)

**Description**: a single value is expanded into an array.

```
Input value: 42
    ↓ (BROADCAST)
Output array: [42, 42, 42]  # broadcast 3 times
```

**Implementation**:

```python
# Via a BroadcastNode
{
    "id": "broadcast",
    "type": "BroadcastNode",
    "params": {"count": 3}  # broadcast 3 times
}

# Input 42 → output [42, 42, 42]
```
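
In plain Python the same effect is a one-line list repetition. This is only a sketch of the idea, with `broadcast` as a hypothetical function name; note that list repetition copies references, so a mutable value would be shared by every slot, and whether BroadcastNode copies its input is not stated in this document.

```python
from typing import Any, List


def broadcast(value: Any, count: int) -> List[Any]:
    """Repeat a single value count times (the list holds references to the same object)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return [value] * count


print(broadcast(42, 3))  # [42, 42, 42]
```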

---

## 📊 Array Operation Nodes

### Included nodes

| Node | Input | Output | Description |
|------|-------|--------|-------------|
| `ArrayMapNode` | array | array | Map (element-wise transform) |
| `ArrayFilterNode` | array | array | Filter (select by condition) |
| `ArrayReduceNode` | array | scalar | Reduce (sum/product/max/min) |
| `ArrayConcatNode` | multiple arrays | array | Concatenate (flatten several arrays into one) |
| `ArrayZipNode` | array × 2 | array | Zip (merge by position) |
| `BroadcastNode` | scalar | array | Broadcast (expand to an array) |

### Usage examples

#### Example 1: Array map

```python
# Goal: multiply every number in the array by 2
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "output", "type": "OutputNodeImpl"}
]

edges = [
    {"source": "input", "sourcePort": "output", "target": "map", "targetPort": "values", "edgeType": "array"},
    {"source": "map", "sourcePort": "mapped", "target": "output", "targetPort": "input", "edgeType": "array"}
]

# Input:  {values: [1, 2, 3]}
# Output: [2, 4, 6]
```

#### Example 2: Array reduce

```python
# Goal: compute the sum of the array
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"}
]

# Input:  {values: [1, 2, 3, 4, 5]}
# Output: 15
```

#### Example 3: Chained array operations

```python
# Goal: multiply by 2, then sum
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"}
]

edges = [
    {"source": "input", "sourcePort": "output", "target": "map", "targetPort": "values"},
    {"source": "map", "sourcePort": "mapped", "target": "reduce", "targetPort": "values"},
    {"source": "reduce", "sourcePort": "result", "target": "output", "targetPort": "input"}
]

# Input:        [1, 2, 3]
# Intermediate: [2, 4, 6]
# Output:       12
```
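
For readers who want to sanity-check expected outputs, the array nodes correspond roughly to the plain-Python operations below. This is an approximation, not the nodes' implementation: the strict `>` comparison for the filter threshold and the list-of-pairs shape of the zip result are assumptions.

```python
from functools import reduce
from itertools import chain
import operator

values = [1, 2, 3]

mapped = [v * 2 for v in values]                    # ArrayMapNode, multiplier=2
filtered = [v for v in mapped if v > 3]             # ArrayFilterNode, threshold=3 (assumed strict ">")
total = sum(mapped)                                 # ArrayReduceNode, operation="sum"
product = reduce(operator.mul, mapped, 1)           # ArrayReduceNode, operation="product"
concatenated = list(chain([1, 2], [3], [4, 5]))     # ArrayConcatNode: flatten several arrays
zipped = [list(p) for p in zip([1, 2, 3], "abc")]   # ArrayZipNode: merge by position

print(mapped, total)  # [2, 4, 6] 12
```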

---

## 🔄 Complete Workflow Example

### Example: processing student grades

```python
# Scenario:
# 1. Input: list of student IDs [1, 2, 3, 4, 5]
# 2. Fetch the grade for each ID → [85, 92, 78, 88, 95]
# 3. Keep passing grades (≥60) → [85, 92, 78, 88, 95]
# 4. Compute the average → 87.6
# 5. Output the final result

main_workflow = {
    "nodes": [
        {
            "id": "input",
            "type": "InputNodeImpl"
        },
        {
            "id": "fetch_grades",
            "type": "ArrayMapNode",
            "params": {"multiplier": 1}  # placeholder; a real workflow would query the database
        },
        {
            "id": "filter_pass",
            "type": "ArrayFilterNode",
            "params": {"threshold": 59}
        },
        {
            "id": "avg",
            "type": "ArrayReduceNode",
            "params": {"operation": "sum"}  # the sum must then be divided by the array length
        },
        {
            "id": "output",
            "type": "OutputNodeImpl"
        }
    ],
    "edges": [
        {
            "source": "input",
            "sourcePort": "output",
            "target": "fetch_grades",
            "targetPort": "values",
            "edgeType": "array"
        },
        {
            "source": "fetch_grades",
            "sourcePort": "mapped",
            "target": "filter_pass",
            "targetPort": "values",
            "edgeType": "array"
        },
        {
            "source": "filter_pass",
            "sourcePort": "filtered",
            "target": "avg",
            "targetPort": "values",
            "edgeType": "array"
        },
        {
            "source": "avg",
            "sourcePort": "result",
            "target": "output",
            "targetPort": "input",
            "edgeType": "scalar"
        }
    ]
}

# Execute (must run inside an async function)
executor = AdvancedWorkflowExecutor(user_id="teacher1")
success, report = await executor.execute(
    nodes=main_workflow["nodes"],
    edges=main_workflow["edges"],
    global_context={"student_ids": [1, 2, 3, 4, 5]}
)

# Result:
# - Grades:  [85, 92, 78, 88, 95]
# - Average: 87.6
```
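
The workflow above reduces with `"sum"` and leaves the division by the array length as a follow-up step. The arithmetic behind the 87.6 figure can be checked with a short sketch; `average_passing` is a hypothetical helper, and the strict `>` comparison against the threshold of 59 is an assumption consistent with the "≥60" passing rule for integer grades.

```python
from typing import List, Optional


def average_passing(grades: List[float], threshold: float = 59) -> Optional[float]:
    """Average the grades strictly above threshold; return None if none pass."""
    passing = [g for g in grades if g > threshold]
    if not passing:
        return None  # avoid dividing by zero on an empty array
    return sum(passing) / len(passing)


grades = [85, 92, 78, 88, 95]
print(sum(grades))              # 438
print(average_passing(grades))  # 87.6
```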
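
---

## 🎓 Design Rationale

### Why special nodes?

1. **Explicit interfaces** - InputNode/OutputNode define where a workflow starts and ends
2. **Nesting support** - FunctionNode makes it possible to build reusable workflows
3. **Modularity** - large workflows can be broken down into small functions
4. **Black-box encapsulation** - users do not need to know the details of a sub-workflow

### Why edge classification?

1. **Type safety** - incompatible connections are caught at design time
2. **Dimension-conversion hints** - the editor can prompt when an expand or collapse is needed
3. **Front-end visualization** - thick and thin edges show the data-flow type at a glance
4. **Performance optimization** - the execution strategy can be chosen from the dimension mode

### Why dimension conversion?

1. **Flexibility** - supports a range of usage scenarios
2. **Code reuse** - the same node can handle both scalars and arrays
3. **Performance** - avoids unnecessary loop unrolling
4. **Expressiveness** - supports parallel processing natively

---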

## 📁 File Structure

```
server/app/core/
├── advanced_nodes.py               # special node definitions
├── advanced_workflow_graph.py      # extended workflow graph
└── advanced_workflow_executor.py   # extended execution engine

server/app/nodes/
└── advanced_example_nodes.py       # 10 example nodes

server/tests/
└── test_advanced_features.py       # full test cases
```

---

## 🧪 Test Coverage

| Test | Description | Status |
|------|-------------|--------|
| test_special_nodes | Special node registration | ✅ |
| test_dimension_inference | Dimension conversion inference | ✅ |
| test_simple_workflow | Simple workflow execution | ✅ |
| test_array_operations | Array operations | ✅ |
| test_workflow_graph | Workflow graph operations | ✅ |
| test_nested_function_workflow | Nested function nodes | ✅ |

---

## 🚀 Usage Guide

### Creating a function-node workflow

```python
from server.app.core.advanced_nodes import WorkflowPackager
# Assumed import path, inferred from the file layout in this document
from server.app.core.advanced_workflow_executor import AdvancedWorkflowExecutor

# Step 1: define the sub-workflow
sub_nodes = [...]
sub_edges = [...]

# Step 2: validate the workflow
valid, error = WorkflowPackager.validate_function_workflow(sub_nodes, sub_edges)
if not valid:
    print(f"Validation failed: {error}")

# Step 3: package it as a function node
function_def = WorkflowPackager.package_as_function(
    node_id="my_function",
    nodes=sub_nodes,
    edges=sub_edges,
    display_name="My function",
    description="A reusable workflow function"
)

# Step 4: use it in another workflow
main_nodes = [..., function_def, ...]
main_edges = [...]

# Step 5: execute (must run inside an async function)
executor = AdvancedWorkflowExecutor()
success, report = await executor.execute(main_nodes, main_edges)
```

### Processing array data

```python
# Build a workflow made of array operations
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "filter", "type": "ArrayFilterNode", "params": {"threshold": 5}},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"}
]

# Wire the nodes together
edges = [
    {"source": "input", "sourcePort": "output", "target": "map", "targetPort": "values", "edgeType": "array"},
    {"source": "map", "sourcePort": "mapped", "target": "filter", "targetPort": "values", "edgeType": "array"},
    {"source": "filter", "sourcePort": "filtered", "target": "reduce", "targetPort": "values", "edgeType": "array"},
    {"source": "reduce", "sourcePort": "result", "target": "output", "targetPort": "input", "edgeType": "scalar"}
]

# Execute (must run inside an async function)
executor = AdvancedWorkflowExecutor()
success, report = await executor.execute(nodes, edges, {"values": [1, 2, 3, 4, 5, 6, 7]})
# Result:
# - ×2:  [2, 4, 6, 8, 10, 12, 14]
# - >5:  [6, 8, 10, 12, 14]
# - sum: 50
```

---

## 📈 Performance Considerations

| Operation | Complexity | Time |
|-----------|------------|------|
| Node registration | O(1) | <1 ms |
| Graph validation | O(N+E) | <10 ms (100 nodes) |
| Topological sort | O(N+E) | <10 ms (100 nodes) |
| Expand execution (single node) | O(N) | N × per-node time |
| Cache lookup | O(1) | <1 ms |
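
The O(N+E) bound for topological sorting is what a standard algorithm such as Kahn's achieves: every node is enqueued once and every edge is examined once. The sketch below shows that technique on edge dicts with `source` and `target` keys; it is a generic illustration, not TraceStudio's actual scheduler, and `topological_order` is a hypothetical name.

```python
from collections import deque
from typing import Dict, List


def topological_order(node_ids: List[str], edges: List[Dict[str, str]]) -> List[str]:
    """Kahn's algorithm in O(N + E); raises ValueError if the graph contains a cycle."""
    indegree = {n: 0 for n in node_ids}
    successors: Dict[str, List[str]] = {n: [] for n in node_ids}
    for edge in edges:
        successors[edge["source"]].append(edge["target"])
        indegree[edge["target"]] += 1

    # Start with nodes that have no incoming edges
    ready = deque(n for n in node_ids if indegree[n] == 0)
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(node_ids):
        raise ValueError("workflow graph contains a cycle")
    return order


node_ids = ["output", "reduce", "map", "input"]
edges = [
    {"source": "input", "target": "map"},
    {"source": "map", "target": "reduce"},
    {"source": "reduce", "target": "output"},
]
print(topological_order(node_ids, edges))  # ['input', 'map', 'reduce', 'output']
```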
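
---

## ⚠️ Notes

1. **Function nodes must contain an InputNode and an OutputNode**
   - Otherwise `WorkflowPackager.validate_function_workflow()` reports an error

2. **Dimension conversion is not automatic**
   - It must be specified explicitly through `edgeType` and `dimensionMode`
   - The front end should provide visual hints

3. **Expand operations execute the node repeatedly**
   - For example, an array of 100 elements makes the node run 100 times
   - Keep the performance impact in mind

4. **Keep nesting depth limited**
   - There is no hard limit in theory, but no more than 5 levels is recommended
   - Excessive depth hurts both debugging and performance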
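
The first rule can be illustrated with a small stand-alone check that returns the same `(valid, error)` pair shape used by `WorkflowPackager.validate_function_workflow`. This is a hedged sketch, not the real validator: `check_function_workflow` is a hypothetical helper, and matching on the type strings `InputNodeImpl` and `OutputNodeImpl` is an assumption (the actual validator may accept any InputNode or OutputNode subclass).

```python
from typing import Any, Dict, List, Optional, Tuple


def check_function_workflow(nodes: List[Dict[str, Any]]) -> Tuple[bool, Optional[str]]:
    """Return (True, None) if the node list has an input and an output node, else (False, reason)."""
    types = {node.get("type") for node in nodes}
    if "InputNodeImpl" not in types:
        return False, "sub-workflow has no InputNode"
    if "OutputNodeImpl" not in types:
        return False, "sub-workflow has no OutputNode"
    return True, None


sub_nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
]
print(check_function_workflow(sub_nodes))  # (False, 'sub-workflow has no OutputNode')
```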

---

## 🔮 Future Extensions

1. **Parallel execution**
   - Run nodes that have no mutual dependencies in parallel

2. **Conditional branching**
   - Support if-then-else logic

3. **Loop constructs**
   - Support for-loops and while-loops

4. **Error handling**
   - Support a try-catch mechanism

5. **Dynamic workflows**
   - Build workflows dynamically from runtime conditions

---

**Next step**: [API Integration Guide](./API_INTEGRATION.md)