Lenovo EveryScale HPC & AI Software Stack Product Guide

The Lenovo EveryScale HPC & AI Software Stack combines open-source with proprietary best-of-breed supercomputing software to provide the most consumable open-source HPC software stack embraced by all Lenovo HPC customers.

This product guide provides essential pre-sales information to understand the key features and components of the EveryScale HPC & AI Software Stack. The product guide is intended for technical specialists, sales specialists, sales engineers, IT architects, and other IT professionals who want to learn more about the Lenovo EveryScale HPC & AI Software Stack.

Changes in the July 31, 2025 update:

Replaced the image in the Introduction section
Added the following new subsections to the Software components section:
- SchedMD Slurm Support service
- AI: application workload and advanced GPU orchestration
- Energy monitoring and optimization

Introduction

The Lenovo EveryScale HPC & AI Software Stack combines open-source with proprietary best-of-breed supercomputing software to provide the most consumable open-source HPC software stack embraced by all Lenovo HPC customers.

It provides a fully tested and supported, complete but customizable HPC software stack that enables administrators and users to use their Lenovo supercomputers optimally and in an environmentally sustainable way.

The software stack is built on the most widely adopted and maintained HPC community software for orchestration and management. It integrates third party components especially around programming environments and performance optimization to complement and enhance the capabilities, creating the organic umbrella in software and service to add value for our customers.

The software stack offers key software and support components for orchestration and management, programming environments, and services and support, as shown in the following figure.

Figure 1. Lenovo EveryScale HPC & AI Software Stack

Did you know?

Lenovo EveryScale HPC & AI Software Stack is a modular software stack tailored to our customers' needs. Thoroughly tested, supported, and periodically updated, it combines the latest open-source HPC software releases to provide organizations with an agile and scalable IT infrastructure.

Benefits

The Lenovo EveryScale HPC & AI Software Stack provides the following benefits to customers.

Overcoming the Complexity of HPC Software

An HPC system software stack consists of dozens of components that administrators must integrate and validate before an organization’s HPC applications can run on top of the stack. Ensuring stable, reliable versions of all stack components is an enormous task because of the numerous interdependencies. The task is also very time-consuming because individual components are constantly released and updated.

The Lenovo EveryScale HPC & AI Software Stack is fully tested, supported and periodically updated to combine the latest open-source HPC software releases, enabling organizations with an agile and scalable IT infrastructure.

Benefits of the Open-source Model

Going forward, in IDC's opinion, the development model exemplified by Linux is more workable. In this model, stack development is driven primarily by the open-source community and vendors offer supported distributions with additional capabilities for customers that require and are willing to pay for them. As the Linux initiative demonstrates, a community-based model like this has major advantages for enabling software to keep pace with requirements for HPC computing and storage hardware systems.

This model delivers new capabilities to users faster and makes HPC systems more productive, higher-return investments.

A fair number of foundational open source HPC software components already exist (e.g., Open MPI, Rocky Linux, Slurm, OpenStack, and others). Many HPC community members are already taking advantage of these.

Customers will benefit from the HPC community, as the community works to integrate a multitude of components that are commonly used in HPC systems and are freely available for open source distribution.

The key open-source components of the software stack are:

Confluent Management
Confluent is Lenovo-developed open-source software designed to discover, provision, and manage HPC clusters and the nodes that comprise them. Confluent provides powerful tooling to deploy and update software and firmware to multiple nodes simultaneously, with simple and readable modern software syntax.
Slurm Orchestration
Slurm is integrated as an open-source, flexible, and modern workload manager. It schedules complex workloads for faster processing and optimal use of the large-scale, specialized HPC and AI resources that Lenovo systems provide. Lenovo provides support in partnership with SchedMD.
LiCO Webportal
Lenovo Intelligent Computing Orchestration (LiCO) is a Lenovo-developed consolidated Graphical User Interface (GUI) for monitoring, managing and using cluster resources. The web portal provides workflows for both AI and HPC, and supports multiple AI frameworks, including TensorFlow, Caffe, Neon, and MXNet, allowing you to leverage a single cluster for diverse workload requirements.
Energy Aware Runtime
EAR is a powerful European open-source energy management suite that supports everything from monitoring and power capping to live optimization during application runtime. Lenovo is collaborating with Barcelona Supercomputing Centre (BSC) and EAS4DC on the continuous development and support of EAR, and offers three versions with differing capabilities.

Software components

Components are covered in the following sections:

Orchestration and management
Programming environment
AI: application workload and advanced GPU orchestration
Energy Monitoring and Optimization

Orchestration and management

The following orchestration and management software is available with Lenovo EveryScale HPC & AI Software Stack:

Confluent (Best Recipe interoperability)
Confluent is Lenovo-developed open source software designed to discover, provision, and manage HPC clusters and the nodes that comprise them. Our Confluent management system and LiCO Web portal provide an interface designed to abstract the users from the complexity of HPC cluster orchestration and AI workloads management, making open-source HPC software consumable for every customer. Confluent provides powerful tooling to deploy and update software and firmware to multiple nodes simultaneously, with simple and readable modern software syntax. Additionally, Confluent’s performance scales seamlessly from small workstation clusters to thousand-plus node supercomputers. For more information, see the Confluent documentation and the Lenovo Confluent Management Software paper.
Lenovo Intelligent Computing Orchestration (Best Recipe interoperability)
Lenovo Intelligent Computing Orchestration (LiCO) is a Lenovo-developed software solution that simplifies the management and use of distributed clusters for High Performance Computing (HPC) and Artificial Intelligence (AI) environments. LiCO provides a consolidated Graphical User Interface (GUI) for monitoring and usage of cluster resources, allowing you to easily run both HPC and AI workloads across a choice of Lenovo infrastructure, including both CPU and GPU solutions to suit varying application requirements.

LiCO Web portal provides workflows for both AI and HPC, and supports multiple AI frameworks, including TensorFlow, Caffe, Neon, and MXNet, allowing you to leverage a single cluster for diverse workload requirements. For more information, see the LiCO product guide.
LiCO customization service
Lenovo Intelligent Computing Orchestration (LiCO) customization services enable customers to request customized features tailored for their own needs. The service is evaluated and quoted in the form of man-days, based on the actual order list.
HPC solution sellers need to provide pre-sales support, collaborate with HPC architects, and communicate with LiCO's R&D team (Ding Hong dinghong1@lenovo.com) for workload evaluations. A quote is provided by Lenovo based on the output SOW and analysis of the workload. After the sellers place an order, they will email the R&D team to request implementation. The LiCO R&D team will deliver the work based on the order content and SOW.
Slurm
Slurm is a modern, open-source scheduler designed specifically to satisfy the demanding needs of high-performance computing (HPC), high-throughput computing (HTC), and AI. Slurm is developed and maintained by SchedMD® and integrated within LiCO. Slurm maximizes workload throughput, scale, and reliability, and delivers results in the fastest possible time while optimizing resource utilization and meeting organizational priorities. Slurm automates job scheduling to help administrators and users manage the complexities of on-premises, hybrid, or cloud workspaces. The Slurm workload manager executes quickly and reliably, increasing productivity while decreasing costs. Slurm’s modern, plug-in-based architecture offers a RESTful API and supports both large and small HPC, HTC, and AI environments. Allow your teams to focus on their work while Slurm manages their workloads.
SchedMD Slurm Support service capabilities for Lenovo HPC and AI systems include:
- Level 3 Support: High-performance systems must run at high utilization and performance to meet the return-on-investment expectations of end users and management. Customers covered by a support contract can reach out to SchedMD engineers to promptly resolve complex workload management issues and get quick answers to complex configuration questions, instead of spending weeks or even months trying to resolve them in-house.
- Remote Consulting: Valuable assistance and implementation expertise that speeds custom configuration tuning to increase throughput and utilization efficiency on complex and large-scale systems. Customers can review cluster requirements, operating environment, and organizational goals directly with a Slurm engineer to optimize the configuration and meet organizational needs.
- Tailored Slurm Training: Expert training tailored to the site that shows users how to harness Slurm capabilities to speed projects and increase technology adoption. A customer scoping call before the onsite instruction ensures coverage of the specific use cases that address the organization's needs. In-depth, comprehensive technical training is delivered in a hands-on lab workshop format so that users can apply Slurm best practices to their site-specific use cases and configuration.
NVIDIA Unified Fabric Manager (UFM) (ISV supported)
NVIDIA Unified Fabric Manager (UFM) is InfiniBand networking management software that combines enhanced, real-time network telemetry with fabric visibility and control to support scale-out InfiniBand data centers. For more information, see the NVIDIA UFM product page.

The two UFM offerings available from Lenovo are as follows:
- UFM Telemetry for Real-Time Monitoring
  The UFM Telemetry platform provides network validation tools to monitor network performance and conditions, capturing and streaming rich real-time network telemetry information, application workload usage, and system configuration to an on-premises or cloud-based database for further analysis.
- UFM Enterprise for Fabric Visibility and Control
  The UFM Enterprise platform combines the benefits of UFM Telemetry with enhanced network monitoring and management. It performs automated network discovery and provisioning, traffic monitoring, and congestion discovery. It also enables job schedule provisioning and integrates with industry-leading job schedulers and cloud and cluster managers, including Slurm and Platform Load Sharing Facility (LSF).

The following tables list all Orchestration and Management software available with Lenovo EveryScale HPC & AI Software Stack.

Table 1. Confluent
Part number	Feature code	Description
Lenovo Confluent support
7S090039WW	S9VH	Lenovo Confluent 1 Year Support per managed node
7S09003AWW	S9VJ	Lenovo Confluent 3 Year Support per managed node
7S09003BWW	S9VK	Lenovo Confluent 5 Year Support per managed node
7S09003CWW	S9VL	Lenovo Confluent 1 Extension Year Support per managed node

Table 2. Lenovo Intelligent Computing Orchestration (LiCO)
Part number	Feature code	Description
Lenovo Intelligent Computing Orchestration (LiCO) HPC AI version
7S090004WW	B1YC	Lenovo HPC AI LiCO Software 90 Day Evaluation License
7S09002BWW	S93A	Lenovo HPC AI LiCO Webportal w/1 yr S&S
7S09002CWW	S93B	Lenovo HPC AI LiCO Webportal w/3 yr S&S
7S09002DWW	S93C	Lenovo HPC AI LiCO Webportal w/5 yr S&S
LiCO customization service
7S09004HWW	SBF2	Lenovo HPC AI LiCO customization Service per man day

Table 3. SchedMD Slurm Support
Part number	Description
SchedMD Slurm Support for Lenovo HPC Systems
7S09001MWW	SchedMD Slurm Onsite or Remote 3-day Training*
7S09001NWW	SchedMD Slurm Consulting w/Sr.Engineer 2REMOTE Sessions**
7S09001PWW	SchedMD L3 Slurm support up to 100 Sockets/GPUs 3Y
7S09001QWW	SchedMD L3 Slurm support up to 100 Sockets/GPUs 5Y
7S09001RWW	SchedMD L3 Slurm support up to 100 Sockets/GPUs additional 1Y
7S09001SWW	SchedMD L3 Slurm support 101-1000 Sockets/GPUs 3Y
7S09001TWW	SchedMD L3 Slurm support 101-1000 Sockets/GPUs 5Y
7S09001UWW	SchedMD L3 Slurm support 101-1000 Sockets/GPUs additional 1Y
7S09001VWW	SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 3Y
7S09001WWW	SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 5Y
7S09001XWW	SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs additional 1Y
7S09001YWW	SchedMD L3 Slurm support up to 100 Sockets/GPUs 3Y EDU&GOV
7S09001ZWW	SchedMD L3 Slurm support up to 100 Sockets/GPUs 5Y EDU&GOV
7S090022WW	SchedMD L3 Slurm support up to 100 Sockets/GPUs additional 1Y EDU&GOV
7S090023WW	SchedMD L3 Slurm support 101-1000 Sockets/GPUs 3Y EDU&GOV
7S090024WW	SchedMD L3 Slurm support 101-1000 Sockets/GPUs 5Y EDU&GOV
7S090026WW	SchedMD L3 Slurm support 101-1000 Sockets/GPUs additional 1Y EDU&GOV
7S090027WW	SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 3Y EDU&GOV
7S090028WW	SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 5Y EDU&GOV
7S09002AWW	SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs additional 1Y EDU&GOV

* SchedMD Slurm Onsite or Remote 3-day Training: in-depth and comprehensive site-specific technical training. Can only be added to a support purchase.
** SchedMD Slurm Consulting w/Sr.Engineer 2REMOTE Sessions (Up to 8 hrs): review initial Slurm setup, in-depth technical chats around specific Slurm topics & review site config for optimization & best practices. Required with support purchase, cannot be purchased separately.

Note: SchedMD Slurm Consulting w/Sr.Engineer 2REMOTE Sessions option must be selected and locked in for every SchedMD support selection.

SchedMD Slurm Onsite or Remote 3-day Training option must be selected and locked in for every SchedMD Commercial support selection. Optional for EDU & Government support selections.
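The ordering rules in the notes above can be sketched as a small lookup. This is an illustrative sketch only, not a Lenovo configurator: the function names, the `segment` and `term` labels, and the `add_training` flag are hypothetical, and the guide does not specify how the "Sockets/GPUs" count is determined, so the count is taken as a single input. Part numbers are copied from Table 3 (3-year and 5-year terms only, for brevity).

```python
# Part numbers copied from Table 3. Keys are (tier, segment, term).
# The key labels are hypothetical names chosen for this sketch.
L3_SUPPORT_PARTS = {
    ("up to 100", "commercial", "3Y"): "7S09001PWW",
    ("up to 100", "commercial", "5Y"): "7S09001QWW",
    ("101-1000", "commercial", "3Y"): "7S09001SWW",
    ("101-1000", "commercial", "5Y"): "7S09001TWW",
    ("1001-5000+", "commercial", "3Y"): "7S09001VWW",
    ("1001-5000+", "commercial", "5Y"): "7S09001WWW",
    ("up to 100", "edu_gov", "3Y"): "7S09001YWW",
    ("up to 100", "edu_gov", "5Y"): "7S09001ZWW",
    ("101-1000", "edu_gov", "3Y"): "7S090023WW",
    ("101-1000", "edu_gov", "5Y"): "7S090024WW",
    ("1001-5000+", "edu_gov", "3Y"): "7S090027WW",
    ("1001-5000+", "edu_gov", "5Y"): "7S090028WW",
}
TRAINING_PART = "7S09001MWW"    # Onsite or Remote 3-day Training
CONSULTING_PART = "7S09001NWW"  # Consulting, 2 remote sessions


def slurm_support_tier(sockets_and_gpus: int) -> str:
    """Map a Sockets/GPUs count to the tier label used in Table 3."""
    if sockets_and_gpus < 1:
        raise ValueError("count must be positive")
    if sockets_and_gpus <= 100:
        return "up to 100"
    if sockets_and_gpus <= 1000:
        return "101-1000"
    return "1001-5000+"


def slurm_support_order(sockets_and_gpus: int, segment: str, term: str,
                        add_training: bool = False) -> list[str]:
    """Return the part numbers implied by the notes under Table 3.

    Consulting accompanies every support selection. Training is
    mandatory for commercial selections and optional for EDU & GOV.
    """
    tier = slurm_support_tier(sockets_and_gpus)
    parts = [L3_SUPPORT_PARTS[(tier, segment, term)], CONSULTING_PART]
    if segment == "commercial" or add_training:
        parts.append(TRAINING_PART)
    return parts


print(slurm_support_order(64, "commercial", "3Y"))
print(slurm_support_order(2000, "edu_gov", "5Y"))
```

Under these assumptions, a commercial order always carries three part numbers (support, consulting, training), while an EDU & GOV order carries two unless training is requested explicitly.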

Table 4. NVIDIA Unified Fabric Manager (UFM)
Part number	Feature code	Description
UFM Telemetry
7S090011WW	S921	NVIDIA UFM Telemetry 1-year License and 24/7 Support for Lenovo clusters
7S090012WW	S922	NVIDIA UFM Telemetry 3-year License and 24/7 Support for Lenovo clusters
7S090013WW	S923	NVIDIA UFM Telemetry 5-year License and 24/7 Support for Lenovo clusters
UFM Enterprise
7S09000XWW	S91Y	NVIDIA UFM Enterprise 1-year License and 24/7 Support for Lenovo clusters
7S09000YWW	S91Z	NVIDIA UFM Enterprise 3-year License and 24/7 Support for Lenovo clusters
7S09000ZWW	S920	NVIDIA UFM Enterprise 5-year License and 24/7 Support for Lenovo clusters

Programming environment

The following programming software is available with Lenovo EveryScale HPC & AI Software Stack.

NVIDIA CUDA
NVIDIA CUDA is a parallel computing platform and programming model for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs. When using CUDA, developers program in popular languages such as C, C++, Fortran, Python, and MATLAB, and express parallelism through extensions in the form of a few basic keywords. For more information, see the NVIDIA CUDA Zone.
NVIDIA HPC Software Development Kit
The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC directives, and CUDA. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming. Performance profiling and debugging tools simplify porting and optimization of HPC applications, and containerization tools enable easy deployment on-premises or in the cloud. For more information, see the NVIDIA HPC SDK.
Intel oneAPI
The Intel oneAPI Base & HPC Toolkit is a comprehensive software development suite designed to empower developers in creating HPC & AI solutions that exploit the full potential of modern hardware architectures. This toolkit encompasses an array of advanced tools, libraries, and compilers, enabling programmers to efficiently design, optimize, and deploy parallel applications across diverse computing platforms, including CPUs, GPUs, and FPGAs. With a focus on fostering code portability and performance scalability, the Intel oneAPI Base & HPC Toolkit equips developers with the means to enhance productivity, streamline software development, and achieve exceptional performance outcomes in the realm of high-performance computing.
- For more information, see Intel® oneAPI Base and HPC Toolkit
- System-based pricing (introduced in mid-2023) is available for small systems (64-256 nodes), medium systems (257-512 nodes), and large systems (512+ nodes).
- Developer-based pricing is for systems with fewer than 64 nodes and offers support for a limited number of users.
- Part numbers are available for Commercial or Academic customers.
- Support Renewals are available.
- Commercial parts have different part numbers if they are quoted with or without Intel hardware.
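The node-count ranges above can be expressed as a simple classification. The following is a minimal sketch, not an official sizing tool: the function name and the returned labels are hypothetical, and because the stated medium (257-512) and large (512+) ranges both mention 512 nodes, the sketch assumes a 512-node system is treated as medium.

```python
def oneapi_pricing_model(nodes: int) -> str:
    """Classify a cluster by node count using the ranges in this guide.

    Assumption: a system with exactly 512 nodes is classified as
    medium, because the medium range explicitly includes 512.
    """
    if nodes < 1:
        raise ValueError("nodes must be positive")
    if nodes < 64:
        return "developer-based"   # priced per concurrent user
    if nodes <= 256:
        return "small system"
    if nodes <= 512:
        return "medium system"
    return "large system"


for n in (32, 64, 300, 1024):
    print(n, oneapi_pricing_model(n))
```

The result selects the group of rows to consult in the Intel oneAPI part number table; the commercial or academic segment, the support term, and (for commercial parts) whether the quote includes Intel hardware then determine the exact part number.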

The following tables list the relevant ordering part numbers.

Table 5. NVIDIA CUDA part numbers
Part number	Description
NVIDIA CUDA
7S09001EWW	NVIDIA CUDA Support and Maintenance (up to 200 GPUs), 1 Year
7S09001FWW	NVIDIA CUDA Support and Maintenance (up to 500 GPUs), 1 Year
7S09002EWW	NVIDIA CUDA Support and Maintenance (up to 1000 GPUs), 1 Year
7S09002FWW	NVIDIA CUDA Support and Maintenance (up to 5000 GPUs), 1 Year

Table 6. NVIDIA HPC SDK part numbers
Part number	Description
NVIDIA HPC SDK
7S090014WW	NVIDIA HPC Compiler Support Services, 1 Year
7S090015WW	NVIDIA HPC Compiler Support Services, 3 Years
7S09002GWW	NVIDIA HPC Compiler Support Services, 5 Years
7S090016WW	NVIDIA HPC Compiler Support Services, EDU, 1 Year
7S090017WW	NVIDIA HPC Compiler Support Services, EDU, 3 Years
7S09002HWW	NVIDIA HPC Compiler Support Services, EDU, 5 Years
7S09001CWW	NVIDIA HPC Compiler Support Services - Additional Contact, 1 Year
7S09002JWW	NVIDIA HPC Compiler Support Services - Additional Contact, 3 Years
7S09002KWW	NVIDIA HPC Compiler Support Services - Additional Contact, 5 Years
7S09001DWW	NVIDIA HPC Compiler Support Services - Additional Contact, EDU, 1 Year
7S09002LWW	NVIDIA HPC Compiler Support Services - Additional Contact, EDU, 3 Years
7S09002MWW	NVIDIA HPC Compiler Support Services - Additional Contact, EDU, 5 Years
7S09001AWW	NVIDIA HPC Compiler Premier Support Services, 1 Year
7S09002NWW	NVIDIA HPC Compiler Premier Support Services, 3 Years
7S09002PWW	NVIDIA HPC Compiler Premier Support Services, 5 Years
7S09001BWW	NVIDIA HPC Compiler Premier Support Services, EDU, 1 Year
7S09002QWW	NVIDIA HPC Compiler Premier Support Services, EDU, 3 Years
7S09002RWW	NVIDIA HPC Compiler Premier Support Services, EDU, 5 Years
7S090018WW	NVIDIA HPC Compiler Premier Support Services - Additional Contact, 1 Year
7S09002SWW	NVIDIA HPC Compiler Premier Support Services - Additional Contact, 3 Years
7S09002TWW	NVIDIA HPC Compiler Premier Support Services - Additional Contact, 5 Years
7S090019WW	NVIDIA HPC Compiler Premier Support Services - Additional Contact, EDU, 1 Year
7S09002UWW	NVIDIA HPC Compiler Premier Support Services - Additional Contact, EDU, 3 Years
7S09002VWW	NVIDIA HPC Compiler Premier Support Services - Additional Contact, EDU, 5 Years

Table 7. Intel oneAPI part numbers
Part number	Description
Commercial - Small System (64 - 256 nodes)
7S09003DWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 1YSupp with Intel HW
7S09003EWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 3YSupp with Intel HW
7S09003FWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 4YSupp with Intel HW
7S09003GWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 5YSupp with Intel HW
Commercial - Medium System (257 - 512 nodes)
7S09003HWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 1YSupp with Intel HW
7S09003JWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 3YSupp with Intel HW
7S09003KWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 4YSupp with Intel HW
7S09003LWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 5YSupp with Intel HW
Commercial - Large System (512+ nodes)
7S09003QWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 1YSupp with Intel HW
7S09003PWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 3YSupp with Intel HW
7S09003NWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 4YSupp with Intel HW
7S09003MWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 5YSupp with Intel HW
Commercial – Developer Based (for systems < 64 nodes)
7S09003RWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 1YSupp with Intel HW
7S09003SWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 3YSupp with Intel HW
7S09003TWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Commercial 1YSupp with Intel HW
7S09003UWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Commercial 3YSupp with Intel HW
7S09003VWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Commercial 1YSupp with Intel HW
7S09003WWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Commercial 3YSupp with Intel HW
7S09003XWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - Enterprise-Above 10 Concurrent Users
Commercial – Small System (64-256 nodes) with non-Intel HW
7S09004JWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 1YSupp with non-Intel HW
7S09004KWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 3YSupp with non-Intel HW
7S09004LWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 4YSupp with non-Intel HW
7S09004MWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 5YSupp with non-Intel HW
Commercial – Medium System (257 - 512 nodes) with non-Intel HW
7S09004NWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 1YSupp with non-Intel HW
7S09004PWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 3YSupp with non-Intel HW
7S09004QWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 4YSupp with non-Intel HW
7S09004RWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 5YSupp with non-Intel HW
Commercial – Large System (512+ nodes) with non-Intel HW
7S09004VWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 1YSupp with non-Intel HW
7S09004UWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 3YSupp with non-Intel HW
7S09004TWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 4YSupp with non-Intel HW
7S09004SWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 5YSupp with non-Intel HW
Commercial – Developer Based (for systems < 64 nodes) with non-Intel HW
7S09004WWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 1YSupp with non-Intel HW
7S09004XWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 3YSupp with non-Intel HW
7S09004YWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Commercial 1YSupp with non-Intel HW
7S09004ZWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent Users Commercial 3YSupp with non-Intel HW
7S090050WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Commercial 1YSupp with non-Intel HW
7S090051WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent Users Commercial 3YSupp with non-Intel HW
Academic - Small System (64 - 256 nodes)
7S09003YWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 1YSupp
7S09003ZWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 3YSupp
7S090040WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 4YSupp
7S090041WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 5YSupp
Academic - Medium System (257 - 512 nodes)
7S090042WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 1YSupp
7S090043WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 3YSupp
7S090044WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 4YSupp
7S090045WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 5YSupp
Academic - Large System (512+ nodes)
7S090046WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 1YSupp
7S090047WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 3YSupp
7S090048WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 4YSupp
7S090049WW	Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 5YSupp
Academic - Developer Based (for systems < 64 nodes)
7S09004AWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Academic 1YSupp with Intel HW
7S09004BWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Academic 3YSupp with Intel HW
7S09004CWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Academic 1YSupp with Intel HW
7S09004DWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Academic 3YSupp with Intel HW
7S09004EWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Academic 1YSupp with Intel HW
7S09004FWW	Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Academic 3YSupp with Intel HW

AI: application workload and advanced GPU orchestration

The following software suite is available with Lenovo EveryScale HPC & AI Software Stack.

NVIDIA AI Enterprise
NVAIE is an end-to-end, cloud-native suite of AI and data analytics software, optimized, certified, and supported by NVIDIA to run on VMware vSphere and bare-metal with NVIDIA-Certified Systems™.  It includes key enabling technologies from NVIDIA for rapid deployment, management, and scaling of AI workloads in the modern hybrid cloud. NVAIE is licensed on a per-GPU basis and can be purchased as either a perpetual license with support services, or as an annual or multi-year subscription.
- The perpetual license provides the right to use the NVIDIA AI Enterprise software indefinitely, with no expiration. NVIDIA AI Enterprise with perpetual licenses must be purchased in conjunction with one-year, three-year, or five-year support services. A one-year support service is also available for renewals.
- The subscription offerings are an affordable option that gives IT departments more flexibility in managing license volumes. NVIDIA AI Enterprise software products with subscription include support services for the duration of the software’s subscription license.

The features of NVAIE Software are listed in the following table.

Table 8. Features of NVAIE software
Features	Supported in NVIDIA AI Enterprise
Per GPU Licensing	Yes
Compute Virtualization	Supported
Windows Guest OS Support	No support
Linux Guest OS Support	Supported
Maximum Displays	1
Maximum Resolution	4096 x 2160 (4K)
OpenGL and Vulkan	In-situ Graphics only
CUDA and OpenCL Support	Supported
ECC and Page Retirement	Supported
MIG GPU Support	Supported
Multi-vGPU	Supported
NVIDIA GPUDirect	Supported
Peer-to-Peer over NVLink	Supported
GPU Pass Through Support	Supported
Baremetal Support	Supported
AI and Data Science applications and Frameworks	Supported
Cloud Native ready	Supported

Note: Maximum 10 concurrent VMs per product license.

NVIDIA Run:ai

NVIDIA Run:ai is an enterprise platform for AI workloads and GPU orchestration. It accelerates AI operations with dynamic orchestration across the AI life cycle, maximizes GPU efficiency, and integrates seamlessly into hybrid AI infrastructure. The platform provides features such as AI-native workload orchestration, unified AI infrastructure management, flexible AI deployment, and an open architecture, and it supports public clouds, private clouds, hybrid environments, and on-premises data centers. For more information, see the NVIDIA Run:ai product page.

The following tables list the ordering part numbers.

Table 9. NVIDIA AI Enterprise Software (NVAIE)
Part number	Description
7S02001BWW	NVIDIA AI Enterprise Perpetual License and Support per GPU Socket, 5 Years
7S02001EWW	NVIDIA AI Enterprise Perpetual License and Support per GPU Socket, EDU, 5 Years
7S02001FWW	NVIDIA AI Enterprise Subscription License and Support per GPU Socket, 1 Year
7S02001GWW	NVIDIA AI Enterprise Subscription License and Support per GPU Socket, 3 Years
7S02001HWW	NVIDIA AI Enterprise Subscription License and Support per GPU Socket, 5 Years
7S02001JWW	NVIDIA AI Enterprise Subscription License and Support per GPU Socket, EDU, 1 Year
7S02001KWW	NVIDIA AI Enterprise Subscription License and Support per GPU Socket, EDU, 3 Years
7S02001LWW	NVIDIA AI Enterprise Subscription License and Support per GPU Socket, EDU, 5 Years

Table 10. NVIDIA Run:ai
Part number	Description
Software subscription
7S02004UWW	NVIDIA Run:ai Subscription per GPU 1 Year
7S02004XWW	NVIDIA Run:ai Subscription per GPU 3 Years
7S020050WW	NVIDIA Run:ai Subscription per GPU 5 Years
7S02004VWW	NVIDIA Run:ai Subscription per GPU EDU 1 Year
7S02004YWW	NVIDIA Run:ai Subscription per GPU EDU 3 Year
7S020051WW	NVIDIA Run:ai Subscription per GPU EDU 5 Year
7S02004WWW	NVIDIA Run:ai Subscription per GPU INC 1 Year
7S02004ZWW	NVIDIA Run:ai Subscription per GPU INC 3 Year
7S020052WW	NVIDIA Run:ai Subscription per GPU INC 5 Year
Support Services subscription
7S020053WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU 1 Year
7S020056WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU 3 Year
7S020059WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU 5 Year
7S020054WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU EDU 1 Year
7S02005AWW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU EDU 3 Year
7S020057WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU EDU 5 Year
7S020055WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU INC 1 Year
7S020058WW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU INC 3 Year
7S02005BWW	24x7 Support Services for NVIDIA Run:ai Subscription per GPU INC 5 Year

Energy Monitoring and Optimization

The following Energy Monitoring and Optimization software is available with Lenovo EveryScale HPC & AI Software Stack.

Energy Aware Runtime (EAR)
EAR is licensed under EPL 2.0 and developed by BSC and Energy Aware Solutions (EAS). While the EAR core version remains open-source, EAR also features extensions developed by the EAS team and provided under a proprietary license. EAR packs are the different solutions provided by EAS, which include the EAR core, EAR extensions, and EAS installation and support services. EAR packs can be purchased from Lenovo under the EveryScale HPC & AI Software Stack CTO and are delivered through EAS.

EAS offers three EAR packs:

EAR Detective Pro – Includes Data Center monitoring, Analysis and Accounting modules.
EAR Optimizer – Includes all features of EAR Detective Pro, with the addition of the energy optimization module for both CPU and GPU, and the Job and System Analytics tools.
EAR Optimizer Pro – Builds on EAR Optimizer by adding the smart power capping module.

EAR is compatible with Slurm, PBS Pro and Kubernetes clusters.

The following table lists the ordering part numbers for EAR licensing, support, and services.

Table 11. Energy Aware Runtime (EAR) Service and Support
Part number	Description
EAR Detective Pro
7S09005FWW	EAR Detective Pro Worldwide Installation and Training price up to 1000 sockets
7S09005GWW	EAR Detective Pro Worldwide License & Support 1yr - clusters up to 1000 sockets (CPUs and GPUs, up to 1 type of each)
7S09005HWW	EAR Detective Pro Worldwide License & Support 3yr - clusters up to 1000 sockets (CPUs and GPUs, up to 1 type of each)
7S09005JWW	EAR Detective Pro Worldwide License & Support 5yr - clusters up to 1000 sockets (CPUs and GPUs, up to 1 type of each)
EAR Optimizer
7S09005KWW	EAR Optimizer Worldwide Installation and Training - clusters up to 1000 sockets (CPUs only)
7S09005LWW	EAR Optimizer Worldwide Installation and Training - clusters up to 1000 sockets (CPUs and GPUs)
7S09005MWW	EAR Optimizer Worldwide License & Support price 1yr up to 1000 sockets per CPU socket
7S09005NWW	EAR Optimizer Worldwide License & Support price 3yr up to 1000 sockets per CPU socket
7S09005PWW	EAR Optimizer Worldwide License & Support price 5yr up to 1000 sockets per CPU socket
7S09005QWW	EAR Optimizer Worldwide License & Support price 1yr up to 1000 sockets per GPU socket
7S09005RWW	EAR Optimizer Worldwide License & Support price 3yr up to 1000 sockets per GPU socket
7S09005SWW	EAR Optimizer Worldwide License & Support price 5yr up to 1000 sockets per GPU socket

EAR Optimizer Pro* (special BID required)

Note: These part numbers are applicable only to clusters that meet all of the following criteria:

Contain a single CPU type and a single GPU type
Have a combined total of up to 1,000 sockets (including both CPU and GPU sockets)
Do not require smart power capping (i.e. do not use EAR Optimizer Pro)

*Special BID Requirement: A Special BID from EAS is required in the following scenarios:

The cluster exceeds 1,000 sockets
The cluster includes multiple CPU or GPU types
The deployment requires smart power capping, available only through EAR Optimizer Pro
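
The eligibility criteria and Special BID scenarios above can be read as a simple decision rule. The following Python sketch is illustrative only: the `Cluster` class, its field names, and the `special_bid_reasons` function are hypothetical and are not part of EAR or any Lenovo or EAS tool. It assumes that the 1,000-socket limit applies to the combined CPU and GPU socket count, as stated in the note.

```python
from dataclasses import dataclass

SOCKET_LIMIT = 1000  # combined CPU + GPU socket ceiling from the note above


@dataclass
class Cluster:
    """Hypothetical cluster description; field names are illustrative."""
    cpu_types: int
    gpu_types: int
    cpu_sockets: int
    gpu_sockets: int
    needs_smart_power_capping: bool = False


def special_bid_reasons(cluster: Cluster) -> list[str]:
    """Return the reasons a Special BID from EAS is required (empty if none)."""
    reasons = []
    if cluster.cpu_sockets + cluster.gpu_sockets > SOCKET_LIMIT:
        reasons.append("cluster exceeds 1,000 sockets")
    if cluster.cpu_types > 1 or cluster.gpu_types > 1:
        reasons.append("multiple CPU or GPU types")
    if cluster.needs_smart_power_capping:
        reasons.append("smart power capping requires EAR Optimizer Pro")
    return reasons


# Example: one CPU type, one GPU type, 768 sockets in total, no power capping.
small = Cluster(cpu_types=1, gpu_types=1, cpu_sockets=512, gpu_sockets=256)
print(special_bid_reasons(small))  # → []
```

An empty list means the standard part numbers in Table 11 apply. Any returned reason indicates that a Special BID from EAS is needed.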

Seller training courses

The following sales training courses are offered for employees and partners (login required). Courses are listed in date order.

VTT HPC: LiCO-Computing Orchestration for AI and HPC
2024-07-30 | 92 minutes | Employees Only

Note: To download the attached PPT, launch the course, exit the player to return to this screen, then scroll down to find the PPT to download.

Please view this session as Ana Irimiea, AI Systems and Solutions Product Manager at ISG, speaks with us about LiCO version 7.2.1.

She will talk about:

- Overview of LiCO
- Administrator and user capabilities
- Deployment options
- Ordering LiCO
- Roadmap

Tags: High-Performance Computing (HPC), Sales, Technical Sales
Published: 2024-07-30
Length: 92 minutes

Start the training:
Employee link: Grow@Lenovo

Course code: DVHPC214

Enterprise Deployment of AI and Phases of Model Development
2024-05-23 | 12 minutes | Employees and Partners

Lenovo Senior AI Data Scientist Dr David Ellison whiteboards the concepts of using data from multiple sources to derive customer benefits through Artificial Intelligence and LiCO (Lenovo Intelligent Computing Orchestration) software.

By the end of this training, you should be able to:

• Describe enterprise deployment of Artificial Intelligence
• Explain the process of model development
• State the purpose of LiCO (Lenovo Intelligent Computing Orchestration) software
• List three examples of Artificial Intelligence solutions

Tags: Artificial Intelligence (AI), Data & Analytics, Technical Sales
Published: 2024-05-23
Length: 12 minutes

Start the training:
Employee link: Grow@Lenovo
Partner link: Lenovo 360 Learning Center

Course code: DAIS101

Selling Lenovo Intelligent Computing Orchestration
2021-08-25 | 18 minutes | Employees and Partners

The goal of this course is to help ISG and Business Partner sellers understand Lenovo Intelligent Computing Orchestration (LiCO) software. Learn when and how to propose LiCO to continue the conversation with the customer and make a sale.

Tags: Artificial Intelligence (AI)
Published: 2021-08-25
Length: 18 minutes

Start the training:
Employee link: Grow@Lenovo
Partner link: Lenovo 360 Learning Center

Course code: DAIS202

Resources

For more information, see these resources:

LiCO Product Guide:
https://lenovopress.lenovo.com/lp0858-lenovo-intelligent-computing-orchestration-lico#product-families
LiCO website:
https://www.lenovo.com/us/en/data-center/software/lico/
Lenovo DCSC configurator:
https://dcsc.lenovo.com
Optimizing Power and Energy in HPC data centers with Energy Aware Runtime:
https://lenovopress.lenovo.com/lp1646
Energy Aware Runtime software and documentation:
http://www.eas4dc.com
Lenovo Confluent documentation:
https://hpc.lenovo.com/users/documentation/
Lenovo Compute Orchestration in HPC Data Centers with Slurm - Solution Brief:
https://lenovopress.lenovo.com/lp1701-lenovo-compute-orchestration-in-hpc-data-centers-with-slurm

Related product families

Product families related to this document are the following:

Trademarks

Lenovo and the Lenovo logo are trademarks or registered trademarks of Lenovo in the United States, other countries, or both. A current list of Lenovo trademarks is available on the Web at https://www.lenovo.com/us/en/legal/copytrade/.

The following terms are trademarks of Lenovo in the United States, other countries, or both:
Lenovo®

The following terms are trademarks of other companies:

Intel® is a trademark of Intel Corporation or its subsidiaries.

Linux® is the trademark of Linus Torvalds in the U.S. and other countries.

Windows® is a trademark of Microsoft Corporation in the United States, other countries, or both.

LSF® is a trademark of IBM in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.
