Semantic caching demo with real-time streaming and a cost & sizing calculator, powered by Azure Managed Redis and Azure OpenAI.
Updated Nov 12, 2025 - Python
Triton inference benchmark with telemetry, correctness gates, and cost-to-serve modeling.
A full-stack GPU profiling and simulation framework that bridges high-level Python ML code with low-level hardware metrics (SM Banks, Tensor Cores) for precise performance analysis.
Driver-based model to estimate infrastructure cost impact of product experiments (Lambda, DynamoDB, CloudWatch).
Reproducible microbenchmark for modeling domain crossing energy in heterogeneous compute systems.
Distributed engineering cost modeling and team topology pricing platform for CTO decision making.
Quantifies vehicle design complexity cost using ML and portfolio optimization. Gradient Boosting achieves R²=0.93 on MSRP prediction across 11,914 vehicles.