IBM/onnx-mlir-serving - Gitstar Ranking

IBM

Fetched on 2025/01/20 09:02

ONNX Serving is a C++ project that serves onnx-mlir compiled models over gRPC and other protocols. Thanks to its C++ implementation, ONNX Serving has very low latency overhead and high throughput. ONNX Serving provides dynamic batch aggregation and a worker pool to fully utilize the AI accelerators on the machine.

Star

Rank: 838961
