This document describes the reference architecture (RA) for Canonical Ubuntu Kubernetes for AI on Lenovo ThinkSystem infrastructure. It is intended for technical IT architects, system administrators, and managers who want to run machine learning and artificial intelligence (ML/AI) development workloads on a Kubernetes-based cluster. The RA gives a technical overview of the hardware and software configurations needed for deployment, and explains why the combination of Lenovo servers and Canonical Charmed Kubernetes provides a flexible, best-of-breed solution for running Kubernetes-based AI workloads.
This document includes an overview of the business problem and business value that are addressed by Canonical Ubuntu Kubernetes. A description of customer requirements is followed by an architectural overview of the hardware and software solution components and of the operational model for lifecycle management of the Kubernetes cluster. Workload deployment platform options are provided to address end-user requirements for ML/AI workload deployment.
Table of Contents
Business problem and business value
Appendix: Lenovo Bill of materials
Changes in the January 16, 2022 update:
- Updated to the latest hardware and software versions:
  - Software: Ubuntu 20.04.3 LTS (Linux kernel 5.4), Kubernetes 1.21, MAAS 2.9, Juju 2.9.21
  - Hardware: Lenovo ThinkSystem SR630 V2, Lenovo ThinkSystem SR650 V2, NVIDIA A100 GPUs
- Updated procedures and tools to reflect the latest hardware and software
Related product families
Product families related to this document are the following: