---
tags: []
date created: Thursday, November 20th 2025, 8:52:43 pm
date modified: Sunday, December 7th 2025, 9:26:50 pm
---

Stream design pattern: B. Multi-Stream Ping-Pong / multi-buffering (recommended). However, A. Serial Stream is designed first, as the lightweight option for the code debugging phase.

1. Buffer management state machine

```mermaid
stateDiagram-v2
    %% State definitions
    state "HOST_OWNED<br/>(owned by host)" as HOST
    state "DEVICE_OWNED_H2D<br/>(in transfer: H->D)" as H2D
    state "DEVICE_OWNED_COMPUTE<br/>(computing: Kernel)" as COMPUTE
    state "DEVICE_OWNED_D2H<br/>(in transfer: D->H)" as D2H
    state "RELEASED<br/>(pending return)" as RELEASED

    %% Transitions
    [*] --> HOST : Acquired from MemoryPool

    HOST --> H2D : I/O thread fills data\nand calls cudaMemcpyAsync
    note right of HOST
        Data is in page-locked (pinned) memory
        CPU write is complete
    end note

    H2D --> COMPUTE : Record H2D_Event\nStreamWaitEvent
    note right of H2D
        DMA engine is copying
        CPU is not blocked
    end note

    COMPUTE --> D2H : Kernel finishes\nD2H triggered automatically
    note right of COMPUTE
        GPU cores are computing
        Data resides in device memory
    end note

    D2H --> RELEASED : D2H completion callback\nor Event synchronization
    note right of D2H
        Result has been written back to host
    end note

    RELEASED --> HOST : DataPacket destructor\nreturns buffer to Pool automatically
    RELEASED --> [*]
```
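
The transitions in the diagram can also be enforced in host code, which may help during the serial-stream debugging phase by catching a buffer that is reused too early. The following is a minimal sketch, not the project's actual implementation: `BufferState` reuses the state names from the diagram, while `can_transition`, `TrackedBuffer`, `advance`, and `example_round_trip` are hypothetical names introduced for illustration. It contains no CUDA calls; real code would presumably call `advance` around `cudaMemcpyAsync`, the kernel launch, and the D2H completion callback.

```cpp
#include <cassert>
#include <cstdint>

// Buffer ownership states, named after the states in the diagram.
enum class BufferState : std::uint8_t {
    HOST_OWNED,
    DEVICE_OWNED_H2D,
    DEVICE_OWNED_COMPUTE,
    DEVICE_OWNED_D2H,
    RELEASED
};

// Returns true only for the edges drawn between states in the diagram.
inline bool can_transition(BufferState from, BufferState to) {
    switch (from) {
        case BufferState::HOST_OWNED:
            return to == BufferState::DEVICE_OWNED_H2D;
        case BufferState::DEVICE_OWNED_H2D:
            return to == BufferState::DEVICE_OWNED_COMPUTE;
        case BufferState::DEVICE_OWNED_COMPUTE:
            return to == BufferState::DEVICE_OWNED_D2H;
        case BufferState::DEVICE_OWNED_D2H:
            return to == BufferState::RELEASED;
        case BufferState::RELEASED:
            return to == BufferState::HOST_OWNED;
    }
    return false;
}

// Hypothetical wrapper that tracks the state of one pooled buffer.
class TrackedBuffer {
public:
    BufferState state() const { return state_; }

    // Moves to `next` if the diagram allows it; otherwise leaves the
    // state unchanged and returns false.
    bool advance(BufferState next) {
        if (!can_transition(state_, next)) {
            return false;
        }
        state_ = next;
        return true;
    }

private:
    // A buffer freshly acquired from the pool starts out host-owned.
    BufferState state_ = BufferState::HOST_OWNED;
};

// Usage sketch: a legal first step, then a rejected shortcut.
inline void example_round_trip() {
    TrackedBuffer buf;
    const bool started = buf.advance(BufferState::DEVICE_OWNED_H2D);
    const bool skipped = buf.advance(BufferState::RELEASED);  // skips compute and D2H
    assert(started && !skipped);
    assert(buf.state() == BufferState::DEVICE_OWNED_H2D);
    (void)started;
    (void)skipped;
}
```

Returning `false` instead of throwing keeps the check cheap enough to leave enabled; a debug build could assert on the result instead.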