SKU/Item: AMZ-B0G1H5HHLK

LLM DEPLOYMENT & MLOps: Serving Large Language Models from Prototype to Production: A Practical Guide to FastAPI, Kubernetes, and Monitoring

Availability:
In stock
Weight with packaging:
0.57 kg
Returns:
Condition
New
Product from:
Amazon

About this product
  • Low-Latency API Design: Build high-speed, asynchronous LLM endpoints using FastAPI to minimize latency and maximize throughput, moving beyond basic REST APIs.
  • Kubernetes Orchestration (K8s): Learn how to configure robust Kubernetes clusters, manage massive model weights, and implement advanced GPU scheduling and resource quotas.
  • Scalability and Cost Control: Implement the Horizontal Pod Autoscaler (HPA) for dynamic scaling and learn the secrets of scaling to zero to eliminate idle cloud compute costs.
  • High-Performance Serving: Maximize GPU utilization using specialized inference servers like vLLM and Triton, leveraging dynamic batching and PagedAttention to achieve state-of-the-art speeds.
  • LLMOps Monitoring: Set up a complete observability stack using Prometheus and Grafana to track critical metrics like P99 latency, cost-per-query, and early detection of model drift.
  • Safe CI/CD: Implement automated, zero-downtime deployment strategies, including Canary Releases and automated rollbacks, ensuring every model update is safe and reliable.
U$S 76,98
55% OFF
U$S 34,99

EASY IMPORT

By buying this product you can deduct VAT with your RUT number

DOES NOT USE YOUR DUTY-FREE ALLOWANCE

If your cart contains only books or CDs, it does not use your duty-free allowance, and you can buy up to U$S 1000 per year.

Pay in up to 12 interest-free installments with all your cards!

10% OFF on your first purchase in the APP with coupon: APP10

Arrives in 5 to 11 business days
with shipping
Delivery is guaranteed
This product travels from the USA to your hands in