1 TensorRT的多线程并发推理方案
TensorRT在对模型推理速度已经有了非常大的提升了,那如果能够基于TensorRT做并行推理,既可以有效降低推理延迟,也能增加服务吞吐量,那岂不是酷毙了?
那么能用TensorRT做多线程并发吗?
我们看看TensorRT的官方开发者文档怎么说:
In general, TensorRT objects are not thread safe; accesses to an object from different threads must be serialized by the client.
The expected runtime concurrency model is that different threads will operate on different execution contexts. The context contains the state of the network (activation values, and so on) during execution, so using a context concurrently in different threads results in undefined behavior.
To support this model, the following operations are thread safe:
- Nonmodifying operations on a runtime or engine.
- Deserializing an engine from a TensorRT runtime.
- Creating an execution context from an engine.
- Registering and deregistering plug-ins.
There are no thread-safety issues with using multiple builders in different threads; however, the builder uses timing to determine the fastest kernel for the parameters provided, and using multiple builders with the same GPU will perturb the timing and TensorRT’s ability to construct optimal engines. There are no such issues using multiple threads to build with different GPUs.
上面的第一句话清楚的提到,TensorRT不是线程安全的,需要我们自己管理不同线程之间的对象访问。
(1)资源收集自互联网,仅供自我学习,请在下载后24小时内删除该资源,如下载者将此资源用于其他非法用途,本站不承担任何法律责任;如有侵权,请立即联系我,马上删除!
(2)下载单个资源则点击立即下载或者立即购买按钮;本站VIP可下载本站所有资源。
(3)请不要使用手机以及电脑浏览器的无痕模式进行支付操作,以免造成支付成功但未显示下载链接。
(4)如遇支付问题或者资源失效问题请点击按钮点击反馈进行反馈或者发送说明邮件到stubbornhuang@qq.com
本文作者:StubbornHuang
版权声明:本文为站长原创文章,如果转载请注明原文链接!
原文标题:TensorRT – 基于TensorRT的多线程并发推理方案
原文链接:https://www.stubbornhuang.com/2536/
发布于:2023年03月06日 17:48:01
修改于:2023年06月21日 17:06:30
声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。
评论
52