ouczbs/TraceStudio-dev

Boshuang Zhao 021d032b7a TreaceStudio initial version: implements resource preloading

2026-01-09 21:37:02 +08:00

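TraceStudio v2.0 Advanced Features

🎯 New Features Overview

This document describes the advanced features introduced in TraceStudio v2.0:

Special node types - InputNode, OutputNode, FunctionNode
Edge classification - thick edges (arrays) vs. thin edges (scalars)
Dimension conversion - expand, collapse, and broadcast operations
Nested function nodes - reusable sub-workflows
Array operation nodes - a set of nodes designed for arrays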

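📦 Special Node Types

1. InputNode - Workflow Entry

Purpose: serves as the input entry point of a sub-workflow.

from typing import Any, Dict, Optional

# register_node, InputNode and output_port come from the project's node framework.
@register_node
class InputNodeImpl(InputNode):
    """Input node: the entry point of a workflow."""
    @output_port("output", "Any", description="Emits all inputs received")
    async def process(self, inputs: Dict[str, Any], context: Optional[Dict] = None):
        # Forward the external inputs unchanged on the "output" port.
        return {
            "outputs": {"output": inputs},
            "context": context or {}
        }

Characteristics:

✅ No input ports (output only)
✅ Passes external input straight into the workflow
✅ Required in every function workflow
✅ Usually the first node of a workflow

Position in a workflow:

Outside world
   ↓
[InputNode] ← receives data from the global context
   ↓
[Business-logic nodes]
   ↓
[OutputNode]
   ↓
Returned result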

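2. OutputNode - Workflow Exit

Purpose: serves as the output exit point of a sub-workflow.

from typing import Any, Dict, Optional

# register_node, OutputNode and input_port come from the project's node framework.
@register_node
class OutputNodeImpl(OutputNode):
    """Output node: the exit point of a workflow."""
    @input_port("input", "Any", description="Data to output")
    async def process(self, inputs: Dict[str, Any], context: Optional[Dict] = None):
        # Return the collected inputs unchanged as the workflow result.
        return {
            "outputs": inputs,
            "context": context or {}
        }

Characteristics:

✅ No output ports (input only)
✅ Collects the results produced inside the workflow
✅ Required in every function workflow
✅ Usually the last node of a workflow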
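
The entry/exit contract described above can be tried without the TraceStudio runtime. The following is a minimal sketch under stated assumptions: `PassThroughInput`, `PassThroughOutput` and `run_boundary` are hypothetical stand-ins written for this illustration, not the project's classes, and the wiring between the two nodes is done by hand rather than by an executor.

```python
import asyncio
from typing import Any, Dict, Optional


class PassThroughInput:
    """Hypothetical stand-in for InputNodeImpl: forwards external inputs."""

    async def process(self, inputs: Dict[str, Any], context: Optional[Dict] = None):
        return {"outputs": {"output": inputs}, "context": context or {}}


class PassThroughOutput:
    """Hypothetical stand-in for OutputNodeImpl: returns what it receives."""

    async def process(self, inputs: Dict[str, Any], context: Optional[Dict] = None):
        return {"outputs": inputs, "context": context or {}}


async def run_boundary(external: Dict[str, Any]) -> Dict[str, Any]:
    entry = await PassThroughInput().process(external)
    # Wire the entry node's "output" port to the exit node's "input" port.
    exit_result = await PassThroughOutput().process(
        {"input": entry["outputs"]["output"]}
    )
    return exit_result["outputs"]


result = asyncio.run(run_boundary({"values": [1, 2, 3]}))
print(result)  # {'input': {'values': [1, 2, 3]}}
```

With no business-logic node in between, the external payload comes back unchanged under the exit node's `input` port.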

3. FunctionNode - Reusable Function

Purpose: wraps an entire sub-workflow as a single node.

{
    "id": "multiply_and_sum",
    "type": "FunctionNode",
    "display_name": "Multiply and Sum",
    "sub_workflow": {
        "nodes": [
            {"id": "input", "type": "InputNodeImpl"},
            {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
            {"id": "sum", "type": "ArrayReduceNode"},
            {"id": "output", "type": "OutputNodeImpl"}
        ],
        "edges": [...]
    }
}

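Characteristics:

✅ Encapsulates a complex sub-workflow as a black box
✅ Supports nesting (a function node may contain other function nodes)
✅ Appears to the outside as an ordinary node
✅ Reusable (the same function can be used many times)

Nesting example: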

[Main FunctionNode]
  └─ [Sub FunctionNode 1]
     └─ [Node A]
     └─ [Node B]
  └─ [Sub FunctionNode 2]
     └─ [Node C]
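
Nesting can be pictured as plain recursion: running a function node means running its sub-workflow, which may itself contain function nodes. The sketch below assumes a simplified model in which nodes run in list order and each passes a single value to the next; the `Double` and `Sum` node types and the `run_chain` helper are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict, List

# Hypothetical primitive operations, keyed by node type.
PRIMITIVES = {
    "Double": lambda xs: [x * 2 for x in xs],
    "Sum": lambda xs: sum(xs),
}


def run_chain(nodes: List[Dict[str, Any]], value: Any) -> Any:
    """Run nodes in list order, feeding each result into the next node."""
    for node in nodes:
        if node["type"] == "FunctionNode":
            # A function node runs its own sub-workflow recursively.
            value = run_chain(node["sub_workflow"]["nodes"], value)
        else:
            value = PRIMITIVES[node["type"]](value)
    return value


inner = {
    "id": "double_twice",
    "type": "FunctionNode",
    "sub_workflow": {"nodes": [{"id": "d1", "type": "Double"},
                               {"id": "d2", "type": "Double"}]},
}
outer = {
    "id": "main",
    "type": "FunctionNode",
    "sub_workflow": {"nodes": [inner, {"id": "s", "type": "Sum"}]},
}

nested_result = run_chain([outer], [1, 2, 3])
print(nested_result)  # 24
```

The real executor works on a graph with ports and edges rather than a flat list, so this only illustrates the recursive structure.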

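🔌 Edge Classification

EdgeType - Edge Types

from enum import Enum

class EdgeType(Enum):
    SCALAR = "scalar"    # thin edge: a single element
    ARRAY = "array"      # thick edge: an array

Front-end representation:

Thick edge 🟦 = array type
Thin edge ▬ = scalar type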

Example edge definitions:

edges = [
    {
        "source": "node1",
        "sourcePort": "output",
        "target": "node2",
        "targetPort": "input",
        "edgeType": "array"        # thick edge: carries an array
    },
    {
        "source": "node2",
        "sourcePort": "result",
        "target": "node3",
        "targetPort": "input",
        "edgeType": "scalar"       # thin edge: carries a scalar
    }
]
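
One way to make use of the declared `edgeType` is to compare it against the value that actually travels along the edge. The sketch below is an assumption about how such a check could look, not the project's validation code: `matches_edge_type` is a hypothetical helper, and it treats Python lists and tuples as "arrays" and everything else as "scalars".

```python
from enum import Enum
from typing import Any


class EdgeType(Enum):
    SCALAR = "scalar"
    ARRAY = "array"


def matches_edge_type(value: Any, edge_type: EdgeType) -> bool:
    """Return True if a runtime value fits the declared edge type."""
    is_array = isinstance(value, (list, tuple))
    return is_array if edge_type is EdgeType.ARRAY else not is_array


edge = {"source": "node1", "target": "node2", "edgeType": "array"}
declared = EdgeType(edge["edgeType"])  # look up the enum member by its value

print(matches_edge_type([1, 2, 3], declared))  # True
print(matches_edge_type(42, declared))         # False
```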

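⬆️ ⬇️ Dimension Conversion

DimensionMode - Conversion Modes

from enum import Enum

class DimensionMode(Enum):
    NONE = "none"             # no conversion
    EXPAND = "expand"         # expand: array → single elements (iterate)
    COLLAPSE = "collapse"     # collapse: single elements → array (pack)
    BROADCAST = "broadcast"   # broadcast: single value → array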

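Scenario 1: Expand (EXPAND)

Description: an array is connected to a single-element input.

Input array: [1, 2, 3]
         ↓ (EXPAND)
Iterated execution (assuming AddNode's other operand is 9):
  - AddNode(1) → 10
  - AddNode(2) → 11
  - AddNode(3) → 12
         ↓ (packed into an array)
Output array: [10, 11, 12]

Implementation:

# Edge definition
{
    "source": "array_source",
    "sourcePort": "values",
    "target": "add_node",
    "targetPort": "a",
    "dimensionMode": "expand"    # expand
}

# AddNode runs 3 times, once per array element;
# the outputs are then packed into an array.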
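
The expand behaviour amounts to "call the node once per element, then pack the results". A minimal sketch, assuming a node can be modelled as a plain function of one value; `run_expanded` and `add_nine` are hypothetical names, and the constant 9 is chosen only so that the numbers match the diagram above.

```python
from typing import Any, Callable, List


def run_expanded(node_fn: Callable[[Any], Any], values: List[Any]) -> List[Any]:
    """Call node_fn once per element and pack the results into a list."""
    return [node_fn(v) for v in values]


def add_nine(a: int) -> int:
    # Hypothetical stand-in for AddNode with its other operand fixed at 9.
    return a + 9


expanded = run_expanded(add_nine, [1, 2, 3])
print(expanded)  # [10, 11, 12]
```

The cost is one node execution per element, which is why the notes later in this document warn about large arrays.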

Scenario 2: Collapse (COLLAPSE)

Description: several single-element edges converge on one array input.

Edge 1: value_a ──┐
                  ├→ ArrayConcatNode(arrays=[]) → [a, b]
Edge 2: value_b ──┘

Implementation:

# Multiple edges into the same port are packed automatically
edges = [
    {"source": "node1", "sourcePort": "output", "target": "concat", "targetPort": "arrays"},
    {"source": "node2", "sourcePort": "output", "target": "concat", "targetPort": "arrays"}
]

# The concat node's `arrays` input receives [value_a, value_b]
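
Packing can be sketched as grouping edges by their `(target, targetPort)` pair and collecting the upstream values in edge order. This is an illustration under assumptions: `collect_inputs` is a hypothetical helper, and `outputs` is an assumed in-memory map of already-computed node outputs, not a structure taken from the project.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Edge = Dict[str, str]


def collect_inputs(
    edges: List[Edge], outputs: Dict[str, Dict[str, Any]]
) -> Dict[Tuple[str, str], List[Any]]:
    """Group upstream values by (target node, target port), in edge order."""
    packed: Dict[Tuple[str, str], List[Any]] = defaultdict(list)
    for edge in edges:
        value = outputs[edge["source"]][edge["sourcePort"]]
        packed[(edge["target"], edge["targetPort"])].append(value)
    return dict(packed)


edges = [
    {"source": "node1", "sourcePort": "output", "target": "concat", "targetPort": "arrays"},
    {"source": "node2", "sourcePort": "output", "target": "concat", "targetPort": "arrays"},
]
outputs = {"node1": {"output": "value_a"}, "node2": {"output": "value_b"}}

collapsed = collect_inputs(edges, outputs)
print(collapsed[("concat", "arrays")])  # ['value_a', 'value_b']
```

In this sketch a port fed by a single edge also ends up with a one-element list; a real executor would need a rule for when to unwrap it.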

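Scenario 3: Broadcast (BROADCAST)

Description: a single value is expanded into an array.

Input value: 42
   ↓ (BROADCAST)
Output array: [42, 42, 42]  # broadcast 3 times

Implementation:

# Via a BroadcastNode
{
    "id": "broadcast",
    "type": "BroadcastNode",
    "params": {"count": 3}  # broadcast 3 times
}

# Input 42 → output [42, 42, 42]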
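
In plain Python the broadcast step is list repetition. The function below is a sketch of the behaviour described above, not the `BroadcastNode` source; the name `broadcast` and the rejection of negative counts are assumptions made for this illustration.

```python
from typing import Any, List


def broadcast(value: Any, count: int) -> List[Any]:
    """Repeat a single value `count` times."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return [value] * count


broadcasted = broadcast(42, 3)
print(broadcasted)  # [42, 42, 42]
```

Note that `[value] * count` repeats the same object reference, so broadcasting a mutable value such as a list yields entries that all alias one object.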

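📊 Array Operation Nodes

Included nodes

Node	Input	Output	Description
`ArrayMapNode`	Array	Array	Map (element-wise transform)
`ArrayFilterNode`	Array	Array	Filter (conditional selection)
`ArrayReduceNode`	Array	Scalar	Reduce (sum/product/max/min)
`ArrayConcatNode`	Multiple arrays	Array	Concatenate (flattens multiple arrays)
`ArrayZipNode`	Array ×2	Array	Zip (merge by position)
`BroadcastNode`	Scalar	Array	Broadcast (expand to an array)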

Usage examples

Example 1: Array map

# Purpose: multiply every number in the array by 2
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "output", "type": "OutputNodeImpl"}
]

edges = [
    {"source": "input", "sourcePort": "output", "target": "map", "targetPort": "values", "edgeType": "array"},
    {"source": "map", "sourcePort": "mapped", "target": "output", "targetPort": "input", "edgeType": "array"}
]

# Input: {values: [1, 2, 3]}
# Output: [2, 4, 6]

Example 2: Array reduce

# Purpose: compute the sum of an array
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"}
]

# Input: {values: [1, 2, 3, 4, 5]}
# Output: 15

Example 3: Chained array operations

# Purpose: multiply by 2, then sum
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"}
]

edges = [
    {"source": "input", "sourcePort": "output", "target": "map", "targetPort": "values"},
    {"source": "map", "sourcePort": "mapped", "target": "reduce", "targetPort": "values"},
    {"source": "reduce", "sourcePort": "result", "target": "output", "targetPort": "input"}
]

# Input: {values: [1, 2, 3]}
# Intermediate: [2, 4, 6]
# Output: 12

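🔄 Complete Workflow Example

Example: processing student grades

# Scenario:
# 1. Input: a list of student IDs [1, 2, 3, 4, 5]
# 2. Fetch the grade for each ID → [85, 92, 78, 88, 95]
# 3. Keep passing grades (≥60) → [85, 92, 78, 88, 95]
# 4. Compute the average → 87.6
# 5. Output the final result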

main_workflow = {
    "nodes": [
        {
            "id": "input",
            "type": "InputNodeImpl"
        },
        {
            "id": "fetch_grades",
            "type": "ArrayMapNode",
            "params": {"multiplier": 1}  # placeholder; a real node would query the database
        },
        {
            "id": "filter_pass",
            "type": "ArrayFilterNode",
            "params": {"threshold": 59}
        },
        {
            "id": "avg",
            "type": "ArrayReduceNode",
            "params": {"operation": "sum"}  # the sum must still be divided by the array length
        },
        {
            "id": "output",
            "type": "OutputNodeImpl"
        }
    ],
    "edges": [
        {
            "source": "input",
            "sourcePort": "output",
            "target": "fetch_grades",
            "targetPort": "values",
            "edgeType": "array"
        },
        {
            "source": "fetch_grades",
            "sourcePort": "mapped",
            "target": "filter_pass",
            "targetPort": "values",
            "edgeType": "array"
        },
        {
            "source": "filter_pass",
            "sourcePort": "filtered",
            "target": "avg",
            "targetPort": "values",
            "edgeType": "array"
        },
        {
            "source": "avg",
            "sourcePort": "result",
            "target": "output",
            "targetPort": "input",
            "edgeType": "scalar"
        }
    ]
}

# Execute
executor = AdvancedWorkflowExecutor(user_id="teacher1")
success, report = await executor.execute(
    nodes=main_workflow["nodes"],
    edges=main_workflow["edges"],
    global_context={"student_ids": [1, 2, 3, 4, 5]}
)

# Result:
# - Grades: [85, 92, 78, 88, 95]
# - Average: 87.6
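
The `avg` node above is configured with `"operation": "sum"`, so the division by the array length is a separate step. The arithmetic behind the 87.6 figure can be checked in plain Python; this snippet only reproduces the numbers from the scenario and does not call the executor.

```python
grades = [85, 92, 78, 88, 95]

# threshold = 59, read as "strictly greater than", keeps every grade >= 60
passing = [g for g in grades if g > 59]

total = sum(passing)
# Guard against an empty list so the division cannot fail.
average = total / len(passing) if passing else 0.0

print(total)    # 438
print(average)  # 87.6
```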

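🎓 Design Rationale

Why special nodes?

Clearly defined interfaces - InputNode/OutputNode define a workflow's entry and exit
Nesting support - FunctionNode makes reusable workflows possible
Modularity - large workflows can be broken into small functions
Black-box encapsulation - users need not know a sub-workflow's internals

Why edge classification?

Type safety - incompatible connections are caught at design time
Dimension-conversion hints - the editor can flag where an expand or collapse is needed
Front-end visualization - thick and thin edges show the data-flow type at a glance
Performance - the execution strategy can be chosen from the dimension mode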
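
A design-time hint of this kind could be derived from the array-ness of the two ports an edge connects. The rule below is a guess at a reasonable inference that follows the EXPAND and COLLAPSE scenarios in this document; `infer_dimension_mode` is a hypothetical function, and the project's actual inference logic may differ.

```python
from enum import Enum


class DimensionMode(Enum):
    NONE = "none"
    EXPAND = "expand"
    COLLAPSE = "collapse"
    BROADCAST = "broadcast"  # never inferred in this sketch


def infer_dimension_mode(source_is_array: bool, target_is_array: bool) -> DimensionMode:
    """Suggest a conversion mode from the two port kinds of an edge."""
    if source_is_array and not target_is_array:
        return DimensionMode.EXPAND    # iterate the array over a scalar input
    if not source_is_array and target_is_array:
        return DimensionMode.COLLAPSE  # pack scalar edges into an array input
    return DimensionMode.NONE          # kinds already match


print(infer_dimension_mode(True, False).value)   # expand
print(infer_dimension_mode(False, True).value)   # collapse
print(infer_dimension_mode(True, True).value)    # none
```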

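Why dimension conversion?

Flexibility - supports a variety of usage scenarios
Code reuse - the same node can handle both scalars and arrays
Performance - avoids unnecessary loop unrolling
Expressiveness - per-element (data-parallel) processing is expressed natively in the graph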

📁 File Structure

server/app/core/
├── advanced_nodes.py              # special node definitions
├── advanced_workflow_graph.py      # extended workflow graph
└── advanced_workflow_executor.py   # extended execution engine

server/app/nodes/
└── advanced_example_nodes.py       # 10 example nodes

server/tests/
└── test_advanced_features.py       # full test cases

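🧪 Test Coverage

Test	Description	Status
test_special_nodes	Special node registration	✅
test_dimension_inference	Dimension-conversion inference	✅
test_simple_workflow	Simple workflow execution	✅
test_array_operations	Array operations	✅
test_workflow_graph	Workflow graph operations	✅
test_nested_function_workflow	Nested function nodes	✅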

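🚀 Usage Guide

Creating a function-node workflow

from server.app.core.advanced_nodes import WorkflowPackager
# Assumed module path, inferred from the file structure listed in this document
from server.app.core.advanced_workflow_executor import AdvancedWorkflowExecutor

# Step 1: define the sub-workflow
sub_nodes = [...]
sub_edges = [...]

# Step 2: validate the workflow
valid, error = WorkflowPackager.validate_function_workflow(sub_nodes, sub_edges)
if not valid:
    print(f"Validation failed: {error}")

# Step 3: package it as a function node
function_def = WorkflowPackager.package_as_function(
    node_id="my_function",
    nodes=sub_nodes,
    edges=sub_edges,
    display_name="My function",
    description="A reusable workflow function"
)

# Step 4: use it inside another workflow
main_nodes = [..., function_def, ...]
main_edges = [...]

# Step 5: execute
executor = AdvancedWorkflowExecutor()
success, report = await executor.execute(main_nodes, main_edges)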
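
To give a feel for what the validation in step 2 might check, here is a standalone sketch. It is not the implementation of `WorkflowPackager.validate_function_workflow`: the only rule taken from this document is that an InputNode and an OutputNode must be present, while the dangling-edge check, the error messages, and the function name `validate_function_workflow_sketch` are assumptions.

```python
from typing import Any, Dict, List, Optional, Tuple


def validate_function_workflow_sketch(
    nodes: List[Dict[str, Any]], edges: List[Dict[str, Any]]
) -> Tuple[bool, Optional[str]]:
    """Return (True, None) if the sub-workflow looks packageable, else (False, reason)."""
    types = [node.get("type") for node in nodes]
    if "InputNodeImpl" not in types:
        return False, "missing InputNode"
    if "OutputNodeImpl" not in types:
        return False, "missing OutputNode"
    # Assumed extra check: every edge must connect two known nodes.
    ids = {node["id"] for node in nodes}
    for edge in edges:
        if edge["source"] not in ids or edge["target"] not in ids:
            return False, f"edge references unknown node: {edge['source']} -> {edge['target']}"
    return True, None


good_nodes = [{"id": "input", "type": "InputNodeImpl"},
              {"id": "output", "type": "OutputNodeImpl"}]
good_edges = [{"source": "input", "target": "output"}]

print(validate_function_workflow_sketch(good_nodes, good_edges))      # (True, None)
print(validate_function_workflow_sketch(good_nodes[:1], []))          # (False, 'missing OutputNode')
```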

Processing array data

# Create a workflow that contains array operations
nodes = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "filter", "type": "ArrayFilterNode", "params": {"threshold": 5}},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"}
]

# Define the connections
edges = [
    {"source": "input", "sourcePort": "output", "target": "map", "targetPort": "values", "edgeType": "array"},
    {"source": "map", "sourcePort": "mapped", "target": "filter", "targetPort": "values", "edgeType": "array"},
    {"source": "filter", "sourcePort": "filtered", "target": "reduce", "targetPort": "values", "edgeType": "array"},
    {"source": "reduce", "sourcePort": "result", "target": "output", "targetPort": "input", "edgeType": "scalar"}
]

# Execute (execute() returns a success flag and a report, as in the examples above)
success, report = await executor.execute(nodes, edges, {"values": [1, 2, 3, 4, 5, 6, 7]})
# Result:
# - ×2: [2, 4, 6, 8, 10, 12, 14]
# - >5: [6, 8, 10, 12, 14]
# - Sum: 50
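
The expected numbers can be reproduced with a toy interpreter that walks the node list in order. This is a sketch under assumptions: `run_linear` is hypothetical, it ignores edges and ports entirely, and the node semantics (multiply by `multiplier`, keep values strictly above `threshold`, sum) are inferred from the result comments above rather than taken from the node implementations.

```python
from typing import Any, Dict, List


def run_linear(nodes: List[Dict[str, Any]], values: List[float]) -> Any:
    """Apply each node to the running data, in list order."""
    data: Any = values
    for node in nodes:
        params = node.get("params", {})
        kind = node["type"]
        if kind == "ArrayMapNode":
            data = [v * params["multiplier"] for v in data]
        elif kind == "ArrayFilterNode":
            data = [v for v in data if v > params["threshold"]]
        elif kind == "ArrayReduceNode":
            if params["operation"] != "sum":
                raise ValueError("only 'sum' is sketched here")
            data = sum(data)
        elif kind not in ("InputNodeImpl", "OutputNodeImpl"):
            raise ValueError(f"unknown node type: {kind}")
        # Input and output nodes pass the data through unchanged.
    return data


pipeline = [
    {"id": "input", "type": "InputNodeImpl"},
    {"id": "map", "type": "ArrayMapNode", "params": {"multiplier": 2}},
    {"id": "filter", "type": "ArrayFilterNode", "params": {"threshold": 5}},
    {"id": "reduce", "type": "ArrayReduceNode", "params": {"operation": "sum"}},
    {"id": "output", "type": "OutputNodeImpl"},
]

pipeline_total = run_linear(pipeline, [1, 2, 3, 4, 5, 6, 7])
print(pipeline_total)  # 50
```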

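📈 Performance Considerations

Operation	Complexity	Time
Node registration	O(1)	<1ms
Graph validation	O(N+E)	<10ms (100 nodes)
Topological sort	O(N+E)	<10ms (100 nodes)
Expanded execution (single node)	O(N)	N × per-node time
Cache lookup	O(1)	<1ms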
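
The O(N+E) bound for topological sorting is what Kahn's algorithm achieves: every node is enqueued once and every edge is examined once. The sketch below shows that algorithm on the edge format used in this document; whether the project's executor uses Kahn's algorithm or another method is not stated, and `topological_order` is a hypothetical helper.

```python
from collections import deque
from typing import Dict, List


def topological_order(node_ids: List[str], edges: List[Dict[str, str]]) -> List[str]:
    """Kahn's algorithm: repeatedly emit nodes whose dependencies are all done."""
    indegree = {n: 0 for n in node_ids}
    successors: Dict[str, List[str]] = {n: [] for n in node_ids}
    for edge in edges:
        successors[edge["source"]].append(edge["target"])
        indegree[edge["target"]] += 1

    ready = deque(n for n in node_ids if indegree[n] == 0)
    order: List[str] = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(node_ids):
        raise ValueError("cycle detected: no valid execution order")
    return order


ids = ["output", "reduce", "filter", "map", "input"]
chain_edges = [
    {"source": "input", "target": "map"},
    {"source": "map", "target": "filter"},
    {"source": "filter", "target": "reduce"},
    {"source": "reduce", "target": "output"},
]

execution_order = topological_order(ids, chain_edges)
print(execution_order)  # ['input', 'map', 'filter', 'reduce', 'output']
```

The same pass doubles as cycle detection, since nodes on a cycle never reach in-degree zero.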

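⚠️ Notes

Function nodes must contain an InputNode and an OutputNode
- Otherwise WorkflowPackager.validate_function_workflow() reports an error
Dimension conversion is not automatic
- It must be specified explicitly via edgeType and dimensionMode
- The front end should provide visual hints
Expand operations execute a node repeatedly
- For example, an array of 100 elements runs the node 100 times
- Mind the performance impact
Keep nesting shallow
- There is no hard limit in theory, but no more than 5 levels is recommended
- Deep nesting hurts both debugging and performance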
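
The recommended depth can be checked before execution by walking the `sub_workflow` structure. A minimal sketch, assuming function nodes carry their children under `sub_workflow.nodes` as in the FunctionNode example earlier; `nesting_depth` and `MAX_RECOMMENDED_DEPTH` are hypothetical names.

```python
from typing import Any, Dict

MAX_RECOMMENDED_DEPTH = 5  # the limit suggested in the notes above


def nesting_depth(node: Dict[str, Any]) -> int:
    """Return 0 for an ordinary node, 1 + deepest child for a function node."""
    if node.get("type") != "FunctionNode":
        return 0
    children = node.get("sub_workflow", {}).get("nodes", [])
    return 1 + max((nesting_depth(child) for child in children), default=0)


leaf = {"id": "a", "type": "ArrayMapNode"}
inner_fn = {"id": "inner", "type": "FunctionNode", "sub_workflow": {"nodes": [leaf]}}
outer_fn = {"id": "outer", "type": "FunctionNode", "sub_workflow": {"nodes": [inner_fn, leaf]}}

depth = nesting_depth(outer_fn)
print(depth)                           # 2
print(depth <= MAX_RECOMMENDED_DEPTH)  # True
```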

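🔮 Future Extensions

Parallel execution
- Process nodes that have no mutual dependencies in parallel
Conditional branching
- Support if-then-else logic
Loop constructs
- Support for-loop and while-loop
Error handling
- Support a try-catch mechanism
Dynamic workflows
- Build workflows dynamically from runtime conditions

Next: API Integration Guide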
