函数封装机制设计

目标

支持用户封装函数复用基础设施配置过程。

用户故事

我在创建一个基于 LLM 的 Chat 机器人，使用 Pluto 构建应用的后端，支持多种 LLM，每个 LLM 都有两个接口：1）获取当前 LLM 的基本信息 /{model}/info，2）调用 LLM /{model}/invoke。为了复用配置路由的过程，需要实现一个函数为单一 LLM 配置路由，调用多次该函数为每一个 LLM 配置路由。

限制

运行时相关的代码不在编译时代码中调用。
将用户封装的函数（包含了基础设施代码的函数）看作是基础设施函数，类似 Infra API，但区别在于参数列表里可能包含运行时函数需要的闭包变量。
用户自定义封装出的函数，其参数不支持函数变量，即不支持高阶函数。
用户自定义封装出的函数，其参数不可被修改，且保证传入 Infra API 的参数是静态可推导的
只支持在全局作用域和封装的普通函数中调用 Infra API，不支持在 Lambda、Class 等类型的作用域内调用 Infra API

最简单的情况 - 封装的函数仅一层，且仅包含 Infra API 调用

class LLM:
    def __init__(self, model: str, memory: KVStore, info: str):
        self.model = model
        self.memory = memory
        self.info = info
 
    def invoke(self, question: str):
        ans = f"Answer for {question} by {self.model}"
        self.memory.set(question, ans)
        return ans
 
 
def add_routes(router: Router, model: str, llm: LLM):
    def invoke_handler(req):
        question = req.query.get("question")
        return llm.invoke(question)
    
    # 这里先不关心 LLM 的状态存储，假设 LLM 的记忆等状态存储已经在创建时设置了 KV 数据库等外部存储实例
    router.get(f"/{model}/info", lambda x: llm.info)
    router.get(f"/{model}/invoke", invoke_handler)
 
    # 禁止下面这种行为，这其实是编译时访问了运行时的变量
    # llm.invoke("Hi")
    # router.get(f"/{llm["model"]}/info", lambda x: x["info"])
 
 
router = Router("router")
 
llm1 = LLM("model1", KVStore("memory1"), "LLM Info")
llm2 = LLM("model2", KVStore("memory2"), "LLM Info")
 
add_routes(router, "model1", llm1)
add_routes(router, "model2", llm2)

实现方式：

add_routes 包含基础设施代码，因此在 deduce 阶段，会对两个方法的定义与调用进行分析。

对 add_routes 方法定义进行分析，找出运行时相关代码，以及访问的闭包变量，构建局部图
1. 采用已实现的思路构建局部 arch ref， router -> invoke_handler, router -> lambda, invoke_handler -set-> llm.memory
  1. router 和 llm.memory 都依赖于外部参数，留有 placeholder，待参数确定后填充
2. 通过静态分析可以获知 invoke_handler 函数是运行时入口函数
3. invoke_handler 该函数依赖外部变量 llm，llm 是 add_routes 的第三个参数
4. 打包 invoke_handler 函数，但对 llm 留有 placeholder
对 add_routes 方法调用进行分析，填充局部图，构建全局图
1. 上一步知道局部的 arch ref 以及内部打包的运行时函数，将参数带入到上一步分析出的信息中，填充 placeholder

封装的函数嵌套调用函数

def nested_add_routes(router: Router, model: str, llm2: LLM):
    router.get(f"/{model}/info", lambda x: llm2.info)
 
 
def add_routes(router: Router, model: str, llm: LLM):
    def invoke_handler(req):
        question = req.query.get("question")
        return llm.invoke(question)
    
    nested_add_routes(router, model, llm)
    router.get(f"/{model}/invoke", invoke_handler)

封装的函数中包含编译时需执行的代码

def create_react_web(project_path: str, name: str | None):
    def build_react(project_path: str, dist_path: str):
        # 执行 react 编译命令
        pass
 
    # 禁止下面这种行为，否则 Infra API 参数在编译时不可知
    # 这也要求用户感知编译时与运行时差异
    # dist_path = build_react(project_path, dist_path)
 
    dist_path = "path/to/your/dist"
    build_react(project_path, dist_path)
    return Website(dist_path, name)
 
 
website = Website("dist")
 
react_path = "path/to/your/react/app"
website = create_react_web(react_path, "react-app")

封装的函数中包含用户自定义的、编译时需执行的函数 build_react ，有几个问题：

Pluto 设计的 biz code -> arch ref -> IaC，没有添加用户自定义编译时代码的能力。
并且，arch ref 要求所有参数在 deduce 阶段已知，而上述函数封装的情况，arch ref 的参数可能与用户自定义函数有关，注释中的 dist_path。
IaC 代码是 TypeScript 代码，无法支持用户使用 Python 自定义编译时代码

Deducer 需求与设计