
Lenovo EveryScale HPC & AI Software Stack

Product Guide

Updated
31 Jul 2025
Form Number
LP1651
PDF size
17 pages, 274 KB

Abstract

The Lenovo EveryScale HPC & AI Software Stack combines open-source with proprietary best-of-breed supercomputing software to provide the most consumable open-source HPC software stack embraced by all Lenovo HPC customers.

This product guide provides essential pre-sales information to understand the key features and components of the EveryScale HPC & AI Software Stack. The product guide is intended for technical specialists, sales specialists, sales engineers, IT architects, and other IT professionals who want to learn more about the Lenovo EveryScale HPC & AI Software Stack.

Change History

Changes in the July 31, 2025 update:

  • Replaced the image in the Introduction section
  • Added the following new subsections to the Software components section:
    • SchedMD Slurm Support service
    • AI: application workload and advanced GPU orchestration
    • Energy monitoring and optimization
       

Introduction

The Lenovo EveryScale HPC & AI Software Stack combines open-source with proprietary best-of-breed supercomputing software to provide the most consumable open-source HPC software stack embraced by all Lenovo HPC customers.

It provides a fully tested and supported, complete but customizable HPC software stack that enables administrators and users to use their Lenovo supercomputers optimally and in an environmentally sustainable way.

The software stack is built on the most widely adopted and maintained HPC community software for orchestration and management. It integrates third-party components, especially for programming environments and performance optimization, to complement and enhance these capabilities, bringing software and services together under a single umbrella that adds value for our customers.

The software stack offers key software and support components for orchestration and management, programming environments, and services and support, as shown in the following figure.

Figure 1. Lenovo EveryScale HPC & AI Software Stack

Did you know?

Lenovo EveryScale HPC & AI Software Stack is a modular software stack tailored to our customers' needs. Thoroughly tested, supported, and periodically updated, it combines the latest open-source HPC software releases to provide organizations with an agile and scalable IT infrastructure.

Benefits

The Lenovo EveryScale HPC & AI Software Stack provides the following benefits to customers.

Overcoming the Complexity of HPC Software

An HPC system software stack consists of dozens of components that administrators must integrate and validate before an organization’s HPC applications can run on top of the stack. Ensuring stable, reliable versions of all stack components is an enormous task because of the numerous interdependencies, and it is very time consuming because individual components are constantly released and updated.

The Lenovo EveryScale HPC & AI Software Stack is fully tested, supported, and periodically updated to combine the latest open-source HPC software releases, providing organizations with an agile and scalable IT infrastructure.

Benefits of the Open-source Model

Going forward, in IDC's opinion, the development model exemplified by Linux is more workable. In this model, stack development is driven primarily by the open-source community and vendors offer supported distributions with additional capabilities for customers that require and are willing to pay for them. As the Linux initiative demonstrates, a community-based model like this has major advantages for enabling software to keep pace with requirements for HPC computing and storage hardware systems.

This model delivers new capabilities to users faster and makes HPC systems more productive, higher-return investments.

A fair number of foundational open-source HPC software components already exist (e.g., Open MPI, Rocky Linux, Slurm, OpenStack, and others). Many HPC community members are already taking advantage of these.

Customers will benefit from the HPC community, as the community works to integrate a multitude of components that are commonly used in HPC systems and are freely available for open source distribution.

The key open-source components of the software stack are:

  • Confluent Management

    Confluent is Lenovo-developed open-source software designed to discover, provision, and manage HPC clusters and the nodes that comprise them. Confluent provides powerful tooling to deploy and update software and firmware to multiple nodes simultaneously, with simple and readable modern software syntax.

  • Slurm Orchestration

    Slurm is integrated as an open-source, flexible, and modern workload manager. It handles complex workloads for faster processing and optimal use of the large-scale and specialized high-performance and AI resources that Lenovo systems provide for each workload. Lenovo provides support in partnership with SchedMD.

  • LiCO Webportal

    Lenovo Intelligent Computing Orchestration (LiCO) is a Lenovo-developed consolidated Graphical User Interface (GUI) for monitoring, managing and using cluster resources. The web portal provides workflows for both AI and HPC, and supports multiple AI frameworks, including TensorFlow, Caffe, Neon, and MXNet, allowing you to leverage a single cluster for diverse workload requirements.

  • Energy Aware Runtime

    EAR is a powerful European open-source energy management suite that supports everything from monitoring to power capping to live optimization during application runtime. Lenovo is collaborating with Barcelona Supercomputing Centre (BSC) and EAS4DC on its continuous development and support, and offers three versions with differentiated capabilities.

Software components

Orchestration and management

The following orchestration and management software is available with Lenovo EveryScale HPC & AI Software Stack:

  • Confluent (Best Recipe interoperability)

    Confluent is Lenovo-developed open-source software designed to discover, provision, and manage HPC clusters and the nodes that comprise them. Our Confluent management system and LiCO Web portal provide an interface designed to shield users from the complexity of HPC cluster orchestration and AI workload management, making open-source HPC software consumable for every customer. Confluent provides powerful tooling to deploy and update software and firmware to multiple nodes simultaneously, with simple and readable modern software syntax. Additionally, Confluent’s performance scales seamlessly from small workstation clusters to thousand-plus node supercomputers. For more information, see the Confluent documentation and the Lenovo Confluent Management Software paper.

  • Lenovo Intelligent Computing Orchestration (Best Recipe interoperability)

    Lenovo Intelligent Computing Orchestration (LiCO) is a Lenovo-developed software solution that simplifies the management and use of distributed clusters for High Performance Computing (HPC) and Artificial Intelligence (AI) environments. LiCO provides a consolidated Graphical User Interface (GUI) for monitoring and usage of cluster resources, allowing you to easily run both HPC and AI workloads across a choice of Lenovo infrastructure, including both CPU and GPU solutions to suit varying application requirements.

    LiCO Web portal provides workflows for both AI and HPC, and supports multiple AI frameworks, including TensorFlow, Caffe, Neon, and MXNet, allowing you to leverage a single cluster for diverse workload requirements. For more information, see the LiCO product guide.

  • LiCO customization service

    Lenovo Intelligent Computing Orchestration (LiCO) customization services enable customers to request features tailored to their own needs. The service is evaluated and quoted in man-days, based on the actual order list.

    HPC solution sellers need to provide pre-sales support, collaborate with HPC architects, and communicate with LiCO's R&D team (Ding Hong dinghong1@lenovo.com) for workload evaluations. Lenovo provides a quote based on the resulting SOW and analysis of the workload. After placing an order, sellers email the R&D team to request implementation. The LiCO R&D team delivers the work based on the order content and SOW.

  • Slurm

    Slurm is a modern, open-source scheduler designed specifically to satisfy the demanding needs of high-performance computing (HPC), high throughput computing (HTC) and AI. Slurm is developed and maintained by SchedMD® and integrated within LiCO. Slurm maximizes workload throughput, scale, and reliability, and delivers results in the fastest possible time while optimizing resource utilization and meeting organizational priorities. Slurm automates job scheduling to help administrators and users manage the complexities of on-prem, hybrid, or cloud workspaces. The Slurm workload manager executes quickly and reliably, increasing productivity while decreasing costs. Slurm’s modern, plug-in-based architecture is accessible through a RESTful API and supports both large and small HPC, HTC, and AI environments. Allow your teams to focus on their work while Slurm manages their workloads.

  • SchedMD Slurm Support service capabilities for Lenovo HPC and AI systems include:
    • Level 3 Support: High-performance systems must run at high utilization and performance to meet the return-on-investment expectations of end users and management. Customers covered by a support contract can reach SchedMD engineering experts to promptly resolve complex workload management issues and get quick answers to complex configuration questions, instead of spending weeks or even months trying to resolve them in-house.
    • Remote Consulting: Valuable assistance and implementation expertise that speeds custom configuration tuning to increase throughput and utilization efficiency on complex and large-scale systems. Customers can review cluster requirements, operating environment, and organizational goals directly with a Slurm engineer to optimize the configuration and meet organizational needs.
    • Tailored Slurm Training: Expert Slurm training that empowers users to harness Slurm capabilities to speed projects and increase technology adoption. A customer scoping call before the onsite instruction ensures coverage of specific use cases that address the organization's needs. In-depth, comprehensive technical training is delivered in a hands-on lab workshop format to help users apply Slurm best practices to their site-specific use cases and configuration.
  • NVIDIA Unified Fabric Manager (UFM) (ISV supported)

    NVIDIA Unified Fabric Manager (UFM) is InfiniBand networking management software that combines enhanced, real-time network telemetry with fabric visibility and control to support scale-out InfiniBand data centers. For more information, see the NVIDIA UFM product page.

    The two UFM offerings available from Lenovo are as follows:

    • UFM Telemetry for Real-Time Monitoring

      The UFM Telemetry platform provides network validation tools to monitor network performance and conditions, capturing and streaming rich real-time network telemetry information, application workload usage, and system configuration to an on-premises or cloud-based database for further analysis.

    • UFM Enterprise for Fabric Visibility and Control

      The UFM Enterprise platform combines the benefits of UFM Telemetry with enhanced network monitoring and management. It performs automated network discovery and provisioning, traffic monitoring, and congestion discovery. It also enables job schedule provisioning and integrates with industry-leading job schedulers and cloud and cluster managers, including Slurm and Platform Load Sharing Facility (LSF).
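
As an illustration of the Slurm job scheduling described in this list, the following minimal Python sketch generates a batch script of the kind a user would submit with sbatch. The helper name build_sbatch_script, the job name, and the train.py command are hypothetical and not part of the software stack. The #SBATCH options shown (--job-name, --nodes, --ntasks-per-node, --time, --gres) are standard Slurm options, but whether a "gpu" generic resource is available depends on how the cluster is configured.

```python
def build_sbatch_script(job_name: str, nodes: int, tasks_per_node: int,
                        gpus_per_node: int, time_limit: str, command: str) -> str:
    """Return the text of a Slurm batch script for the requested resources."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",  # wall-clock limit, HH:MM:SS
    ]
    if gpus_per_node > 0:
        # Assumes the cluster defines a generic resource (GRES) named "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus_per_node}")
    # srun launches the command on the resources allocated to the job.
    lines += ["", f"srun {command}"]
    return "\n".join(lines) + "\n"


# Hypothetical job: 2 nodes, 4 tasks and 4 GPUs per node, 1 hour limit.
script = build_sbatch_script("train-demo", 2, 4, 4, "01:00:00", "python train.py")
print(script)
```

The generated text would typically be saved to a file and submitted with sbatch; Slurm then queues the job until the requested nodes and GPUs are free.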

The following tables list the ordering part numbers for the Orchestration and Management software available with Lenovo EveryScale HPC & AI Software Stack.

Table 1. Confluent
Part number Feature code Description
Lenovo Confluent support
7S090039WW S9VH Lenovo Confluent 1 Year Support per managed node
7S09003AWW S9VJ Lenovo Confluent 3 Year Support per managed node
7S09003BWW S9VK Lenovo Confluent 5 Year Support per managed node
7S09003CWW S9VL Lenovo Confluent 1 Extension Year Support per managed node
Table 2. Lenovo Intelligent Computing Orchestration (LiCO)
Part number Feature code Description
Lenovo Intelligent Computing Orchestration (LiCO) HPC AI version
7S090004WW B1YC Lenovo HPC AI LiCO Software 90 Day Evaluation License
7S09002BWW S93A Lenovo HPC AI LiCO Webportal w/1 yr S&S
7S09002CWW S93B Lenovo HPC AI LiCO Webportal w/3 yr S&S
7S09002DWW S93C Lenovo HPC AI LiCO Webportal w/5 yr S&S
LiCO customization service
7S09004HWW SBF2 Lenovo HPC AI LiCO customization Service per man day
Table 3. SchedMD Slurm Support
Part number Description
SchedMD Slurm Support for Lenovo HPC Systems
7S09001MWW SchedMD Slurm Onsite or Remote 3-day Training*
7S09001NWW SchedMD Slurm Consulting w/Sr.Engineer 2REMOTE Sessions**
7S09001PWW SchedMD L3 Slurm support up to 100 Sockets/GPUs 3Y
7S09001QWW SchedMD L3 Slurm support up to 100 Sockets/GPUs 5Y
7S09001RWW SchedMD L3 Slurm support up to 100 Sockets/GPUs additional 1Y
7S09001SWW SchedMD L3 Slurm support 101-1000 Sockets/GPUs 3Y
7S09001TWW SchedMD L3 Slurm support 101-1000 Sockets/GPUs 5Y
7S09001UWW SchedMD L3 Slurm support 101-1000 Sockets/GPUs additional 1Y
7S09001VWW SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 3Y
7S09001WWW SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 5Y
7S09001XWW SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs additional 1Y
7S09001YWW SchedMD L3 Slurm support up to 100 Sockets/GPUs 3Y EDU&GOV
7S09001ZWW SchedMD L3 Slurm support up to 100 Sockets/GPUs 5Y EDU&GOV
7S090022WW SchedMD L3 Slurm support up to 100 Sockets/GPUs additional 1Y EDU&GOV
7S090023WW SchedMD L3 Slurm support 101-1000 Sockets/GPUs 3Y EDU&GOV
7S090024WW SchedMD L3 Slurm support 101-1000 Sockets/GPUs 5Y EDU&GOV
7S090026WW SchedMD L3 Slurm support 101-1000 Sockets/GPUs additional 1Y EDU&GOV
7S090027WW SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 3Y EDU&GOV
7S090028WW SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs 5Y EDU&GOV
7S09002AWW SchedMD L3 Slurm support 1001-5000+ Sockets/GPUs additional 1Y EDU&GOV

* SchedMD Slurm Onsite or Remote 3-day Training: in-depth and comprehensive site-specific technical training. Can only be added to a support purchase.
** SchedMD Slurm Consulting w/Sr.Engineer 2REMOTE Sessions (Up to 8 hrs): review initial Slurm setup, in-depth technical chats around specific Slurm topics & review site config for optimization & best practices. Required with support purchase, cannot be purchased separately.

Note: SchedMD Slurm Consulting w/Sr.Engineer 2REMOTE Sessions option must be selected and locked in for every SchedMD support selection.

SchedMD Slurm Onsite or Remote 3-day Training option must be selected and locked in for every SchedMD Commercial support selection. Optional for EDU & Government support selections.
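
The selection rules in the notes above can be sketched as a small lookup. This is an illustration only, not a Lenovo configurator: the part numbers are copied from Table 3, while the function names, the segment keys ("commercial", "edu_gov"), and the treatment of "Sockets/GPUs" as a single count supplied by the caller are assumptions made for this example.

```python
# Part numbers copied from Table 3. Names and keys below are illustrative.
TRAINING = "7S09001MWW"    # Onsite or Remote 3-day Training
CONSULTING = "7S09001NWW"  # Consulting w/Sr.Engineer, 2 remote sessions

L3_SUPPORT = {
    ("commercial", "up to 100"): {"3Y": "7S09001PWW", "5Y": "7S09001QWW", "additional 1Y": "7S09001RWW"},
    ("commercial", "101-1000"): {"3Y": "7S09001SWW", "5Y": "7S09001TWW", "additional 1Y": "7S09001UWW"},
    ("commercial", "1001-5000+"): {"3Y": "7S09001VWW", "5Y": "7S09001WWW", "additional 1Y": "7S09001XWW"},
    ("edu_gov", "up to 100"): {"3Y": "7S09001YWW", "5Y": "7S09001ZWW", "additional 1Y": "7S090022WW"},
    ("edu_gov", "101-1000"): {"3Y": "7S090023WW", "5Y": "7S090024WW", "additional 1Y": "7S090026WW"},
    ("edu_gov", "1001-5000+"): {"3Y": "7S090027WW", "5Y": "7S090028WW", "additional 1Y": "7S09002AWW"},
}


def support_tier(sockets_or_gpus: int) -> str:
    """Map a socket/GPU count to the size band used in Table 3."""
    if sockets_or_gpus < 1:
        raise ValueError("count must be at least 1")
    if sockets_or_gpus <= 100:
        return "up to 100"
    if sockets_or_gpus <= 1000:
        return "101-1000"
    return "1001-5000+"


def schedmd_order(segment: str, sockets_or_gpus: int, term: str,
                  with_training: bool = False) -> list:
    """Return the part numbers for one SchedMD support selection."""
    parts = [L3_SUPPORT[(segment, support_tier(sockets_or_gpus))][term]]
    parts.append(CONSULTING)  # required with every support selection
    # Training is mandatory for commercial selections, optional for EDU & GOV.
    if segment == "commercial" or with_training:
        parts.append(TRAINING)
    return parts


print(schedmd_order("commercial", 64, "3Y"))
print(schedmd_order("edu_gov", 500, "5Y"))
```

In this sketch a commercial selection always carries three part numbers (support, consulting, training), while an EDU & GOV selection carries two unless training is explicitly requested.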

Table 4. NVIDIA Unified Fabric Manager (UFM)
Part number Feature code Description
UFM Telemetry
7S090011WW S921 NVIDIA UFM Telemetry 1-year License and 24/7 Support for Lenovo clusters
7S090012WW S922 NVIDIA UFM Telemetry 3-year License and 24/7 Support for Lenovo clusters
7S090013WW S923 NVIDIA UFM Telemetry 5-year License and 24/7 Support for Lenovo clusters
UFM Enterprise
7S09000XWW S91Y NVIDIA UFM Enterprise 1-year License and 24/7 Support for Lenovo clusters
7S09000YWW S91Z NVIDIA UFM Enterprise 3-year License and 24/7 Support for Lenovo clusters
7S09000ZWW S920 NVIDIA UFM Enterprise 5-year License and 24/7 Support for Lenovo clusters

Programming environment

The following programming software is available with Lenovo EveryScale HPC & AI Software Stack.

  • NVIDIA CUDA

    NVIDIA CUDA is a parallel computing platform and programming model for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs. When using CUDA, developers program in popular languages such as C, C++, Fortran, Python and MATLAB and express parallelism through extensions in the form of a few basic keywords. For more information, see the NVIDIA CUDA Zone.

  • NVIDIA HPC Software Development Kit

    The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC directives, and CUDA. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming. Performance profiling and debugging tools simplify porting and optimization of HPC applications, and containerization tools enable easy deployment on-premises or in the cloud. For more information, see the NVIDIA HPC SDK.

  • Intel oneAPI
    The Intel oneAPI Base & HPC Toolkit is a comprehensive software development suite designed to empower developers in creating HPC & AI solutions that exploit the full potential of modern hardware architectures. This toolkit encompasses an array of advanced tools, libraries, and compilers, enabling programmers to efficiently design, optimize, and deploy parallel applications across diverse computing platforms, including CPUs, GPUs, and FPGAs. With a focus on fostering code portability and performance scalability, the Intel oneAPI Base & HPC Toolkit equips developers with the means to enhance productivity, streamline software development, and achieve exceptional performance outcomes in the realm of high-performance computing.
    • For more information, see Intel® oneAPI Base and HPC Toolkit
    • System-based pricing (introduced in mid-2023) is available for small systems (64-256 nodes), medium systems (257-512 nodes), and large systems (512+ nodes).
    • Developer-based pricing is for systems with fewer than 64 nodes and offers support for a limited number of users.
    • Part numbers are available for Commercial or Academic customers.
    • Support Renewals are available.
    • Commercial parts have different part numbers if they are quoted with or without Intel hardware.
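
The node-count bands in the list above can be expressed as a short function. This is a sketch for orientation, not an Intel or Lenovo pricing tool; the function name and return strings are made up for this example. The list assigns 512 nodes to both the medium (257-512) and large (512+) bands, so the sketch assumes that a system with exactly 512 nodes is medium.

```python
def oneapi_pricing_model(nodes: int) -> str:
    """Return the pricing model and system size band for a node count."""
    if nodes < 1:
        raise ValueError("node count must be at least 1")
    if nodes < 64:
        return "developer-based"       # fewer than 64 nodes, limited users
    if nodes <= 256:
        return "system-based: small"   # 64-256 nodes
    if nodes <= 512:
        # Assumption: exactly 512 nodes counts as medium, not large.
        return "system-based: medium"  # 257-512 nodes
    return "system-based: large"       # more than 512 nodes


for n in (32, 64, 300, 2048):
    print(n, oneapi_pricing_model(n))
```

The band then determines which section of the part number table applies, together with the Commercial or Academic segment and, for Commercial parts, whether the quote includes Intel hardware.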

The following tables list the relevant ordering part numbers.

Table 5. NVIDIA CUDA part numbers
Part number Description
NVIDIA CUDA
7S09001EWW NVIDIA CUDA Support and Maintenance (up to 200 GPUs), 1 Year
7S09001FWW NVIDIA CUDA Support and Maintenance (up to 500 GPUs), 1 Year
7S09002EWW NVIDIA CUDA Support and Maintenance (up to 1000 GPUs), 1 Year
7S09002FWW NVIDIA CUDA Support and Maintenance (up to 5000 GPUs), 1 Year
Table 6. NVIDIA HPC SDK part numbers
Part number Description
NVIDIA HPC SDK
7S090014WW NVIDIA HPC Compiler Support Services, 1 Year
7S090015WW NVIDIA HPC Compiler Support Services, 3 Years
7S09002GWW NVIDIA HPC Compiler Support Services, 5 Years
7S090016WW NVIDIA HPC Compiler Support Services, EDU, 1 Year
7S090017WW NVIDIA HPC Compiler Support Services, EDU, 3 Years
7S09002HWW NVIDIA HPC Compiler Support Services, EDU, 5 Years
7S09001CWW NVIDIA HPC Compiler Support Services - Additional Contact, 1 Year
7S09002JWW NVIDIA HPC Compiler Support Services - Additional Contact, 3 Years
7S09002KWW NVIDIA HPC Compiler Support Services - Additional Contact, 5 Years
7S09001DWW NVIDIA HPC Compiler Support Services - Additional Contact, EDU, 1 Year
7S09002LWW NVIDIA HPC Compiler Support Services - Additional Contact, EDU, 3 Years
7S09002MWW NVIDIA HPC Compiler Support Services - Additional Contact, EDU, 5 Years
7S09001AWW NVIDIA HPC Compiler Premier Support Services, 1 Year
7S09002NWW NVIDIA HPC Compiler Premier Support Services, 3 Years
7S09002PWW NVIDIA HPC Compiler Premier Support Services, 5 Years
7S09001BWW NVIDIA HPC Compiler Premier Support Services, EDU, 1 Year
7S09002QWW NVIDIA HPC Compiler Premier Support Services, EDU, 3 Years
7S09002RWW NVIDIA HPC Compiler Premier Support Services, EDU, 5 Years
7S090018WW NVIDIA HPC Compiler Premier Support Services - Additional Contact, 1 Year
7S09002SWW NVIDIA HPC Compiler Premier Support Services - Additional Contact, 3 Years
7S09002TWW NVIDIA HPC Compiler Premier Support Services - Additional Contact, 5 Years
7S090019WW NVIDIA HPC Compiler Premier Support Services - Additional Contact, EDU, 1 Year
7S09002UWW NVIDIA HPC Compiler Premier Support Services - Additional Contact, EDU, 3 Years
7S09002VWW NVIDIA HPC Compiler Premier Support Services - Additional Contact, EDU, 5 Years
Table 7. Intel oneAPI part numbers
Part number Description
Commercial - Small System (64 - 256 nodes)
7S09003DWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 1YSupp with Intel HW
7S09003EWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 3YSupp with Intel HW
7S09003FWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 4YSupp with Intel HW
7S09003GWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 5YSupp with Intel HW
Commercial - Medium System (257 - 512 nodes)
7S09003HWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 1YSupp with Intel HW
7S09003JWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 3YSupp with Intel HW
7S09003KWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 4YSupp with Intel HW
7S09003LWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 5YSupp with Intel HW
Commercial - Large System (512+ nodes)
7S09003QWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 1YSupp with Intel HW
7S09003PWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 3YSupp with Intel HW
7S09003NWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 4YSupp with Intel HW
7S09003MWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 5YSupp with Intel HW
Commercial – Developer Based (for systems < 64 nodes)
7S09003RWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 1YSupp with Intel HW
7S09003SWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 3YSupp with Intel HW
7S09003TWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Commercial 1YSupp with Intel HW
7S09003UWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Commercial 3YSupp with Intel HW
7S09003VWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Commercial 1YSupp with Intel HW
7S09003WWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Commercial 3YSupp with Intel HW
7S09003XWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - Enterprise-Above 10 Concurrent Users
Commercial – Small System (64-256 nodes) with non-Intel HW
7S09004JWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 1YSupp with non-Intel HW
7S09004KWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 3YSupp with non-Intel HW
7S09004LWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 4YSupp with non-Intel HW
7S09004MWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Small System support for oneAPI Cluster runtimes 5YSupp with non-Intel HW
Commercial – Medium System (257 - 512 nodes) with non-Intel HW
7S09004NWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 1YSupp with non-Intel HW
7S09004PWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 3YSupp with non-Intel HW
7S09004QWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 4YSupp with non-Intel HW
7S09004RWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Medium System support for oneAPI Cluster runtimes 5YSupp with non-Intel HW
Commercial – Large System (512+ nodes) with non-Intel HW
7S09004VWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 1YSupp with non-Intel HW
7S09004UWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 3YSupp with non-Intel HW
7S09004TWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 4YSupp with non-Intel HW
7S09004SWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Commercial Large System support for oneAPI Cluster runtimes 5YSupp with non-Intel HW
Commercial – Developer Based (for systems < 64 nodes) with non-Intel HW
7S09004WWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 1YSupp with non-Intel HW
7S09004XWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Commercial 3YSupp with non-Intel HW
7S09004YWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Commercial 1YSupp with non-Intel HW
7S09004ZWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent Users Commercial 3YSupp with non-Intel HW
7S090050WW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Commercial 1YSupp with non-Intel HW
7S090051WW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent Users Commercial 3YSupp with non-Intel HW
Academic - Small System (64 - 256 nodes)
7S09003YWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 1YSupp
7S09003ZWW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 3YSupp
7S090040WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 4YSupp
7S090041WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Small System support for oneAPI Cluster runtimes with 5YSupp
Academic - Medium System (257 - 512 nodes)
7S090042WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 1YSupp
7S090043WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 3YSupp
7S090044WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 4YSupp
7S090045WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Medium System support for oneAPI Cluster runtimes with 5YSupp
Academic - Large System (512+ nodes)
7S090046WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 1YSupp
7S090047WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 3YSupp
7S090048WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 4YSupp
7S090049WW Intel oneAPI Base & HPC Toolkit (Multi-Node) Academic Large System support for oneAPI Cluster runtimes with 5YSupp
Academic - Developer Based (for systems < 64 nodes)
7S09004AWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Academic 1YSupp with Intel HW
7S09004BWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 2 Concurrent User Academic 3YSupp with Intel HW
7S09004CWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Academic 1YSupp with Intel HW
7S09004DWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 5 Concurrent User Academic 3YSupp with Intel HW
7S09004EWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Academic 1YSupp with Intel HW
7S09004FWW Intel oneAPI Base & HPC Toolkit (Multi-Node) - 10 Concurrent User Academic 3YSupp with Intel HW

AI: application workload and advanced GPU orchestration

The following software suite is available with Lenovo EveryScale HPC & AI Software Stack.

  • NVIDIA AI Enterprise
    NVIDIA AI Enterprise (NVAIE) is an end-to-end, cloud-native suite of AI and data analytics software, optimized, certified, and supported by NVIDIA to run on VMware vSphere and bare-metal with NVIDIA-Certified Systems™. It includes key enabling technologies from NVIDIA for rapid deployment, management, and scaling of AI workloads in the modern hybrid cloud. NVAIE is licensed on a per-GPU basis and can be purchased as either a perpetual license with support services, or as an annual or multi-year subscription.
    • The perpetual license provides the right to use the NVIDIA AI Enterprise software indefinitely, with no expiration. NVIDIA AI Enterprise with perpetual licenses must be purchased in conjunction with one-year, three-year, or five-year support services. A one-year support service is also available for renewals.
    • The subscription offerings are an affordable option that gives IT departments more flexibility in managing license volumes. NVIDIA AI Enterprise software products with subscription include support services for the duration of the software’s subscription license.

The features of NVAIE Software are listed in the following table.

Table 8. Features of NVAIE software
Features Supported in NVIDIA AI Enterprise
Per GPU Licensing Yes
Compute Virtualization Supported
Windows Guest OS Support No support
Linux Guest OS Support Supported
Maximum Displays 1
Maximum Resolution 4096 x 2160 (4K)
OpenGL and Vulkan In-situ Graphics only
CUDA and OpenCL Support Supported
ECC and Page Retirement Supported
MIG GPU Support Supported
Multi-vGPU Supported
NVIDIA GPUDirect Supported
Peer-to-Peer over NVLink Supported
GPU Pass Through Support Supported
Baremetal Support Supported
AI and Data Science applications and Frameworks Supported
Cloud Native ready Supported

Note: Maximum 10 concurrent VMs per product license.

  • NVIDIA Run:ai

    NVIDIA Run:ai is an enterprise platform for AI workloads and GPU orchestration, accelerating AI operations with dynamic orchestration across the AI life cycle, maximizing GPU efficiency, and integrating seamlessly into hybrid AI infrastructure. The platform provides features such as AI-native workload orchestration, unified AI infrastructure management, flexible AI deployment, and open architecture, supporting public clouds, private clouds, hybrid environments, and on-premises data centers. For more information, see the NVIDIA Run:ai product page.

The following tables list the ordering part numbers.

Table 9. NVIDIA AI Enterprise Software (NVAIE)
Part number Description
7S02001BWW NVIDIA AI Enterprise Perpetual License and Support per GPU Socket, 5 Years
7S02001EWW NVIDIA AI Enterprise Perpetual License and Support per GPU Socket, EDU, 5 Years
7S02001FWW NVIDIA AI Enterprise Subscription License and Support per GPU Socket, 1 Year
7S02001GWW NVIDIA AI Enterprise Subscription License and Support per GPU Socket, 3 Years
7S02001HWW NVIDIA AI Enterprise Subscription License and Support per GPU Socket, 5 Years
7S02001JWW NVIDIA AI Enterprise Subscription License and Support per GPU Socket, EDU, 1 Year
7S02001KWW NVIDIA AI Enterprise Subscription License and Support per GPU Socket, EDU, 3 Years
7S02001LWW NVIDIA AI Enterprise Subscription License and Support per GPU Socket, EDU, 5 Years
Table 10. NVIDIA Run:ai
Part number Description
Software subscription
7S02004UWW NVIDIA Run:ai Subscription per GPU 1 Year
7S02004XWW NVIDIA Run:ai Subscription per GPU 3 Years
7S020050WW NVIDIA Run:ai Subscription per GPU 5 Years
7S02004VWW NVIDIA Run:ai Subscription per GPU EDU 1 Year
7S02004YWW NVIDIA Run:ai Subscription per GPU EDU 3 Year
7S020051WW NVIDIA Run:ai Subscription per GPU EDU 5 Year
7S02004WWW NVIDIA Run:ai Subscription per GPU INC 1 Year
7S02004ZWW NVIDIA Run:ai Subscription per GPU INC 3 Year
7S020052WW NVIDIA Run:ai Subscription per GPU INC 5 Year
Support Services subscription
7S020053WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU 1 Year
7S020056WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU 3 Year
7S020059WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU 5 Year
7S020054WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU EDU 1 Year
7S02005AWW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU EDU 3 Year
7S020057WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU EDU 5 Year
7S020055WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU INC 1 Year
7S020058WW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU INC 3 Year
7S02005BWW 24x7 Support Services for NVIDIA Run:ai Subscription per GPU INC 5 Year

Energy Monitoring and Optimization

The following Energy Monitoring and Optimization software is available with Lenovo EveryScale HPC & AI Software Stack.

  • Energy Aware Runtime (EAR)
    EAR is licensed under EPL 2.0 and developed by the Barcelona Supercomputing Center (BSC) and Energy Aware Solutions (EAS). The EAR core remains open source, while extensions developed by the EAS team are provided under a proprietary license. EAS packages these as EAR packs, each of which combines the EAR core, EAR extensions, and EAS installation and support services. EAR packs can be purchased from Lenovo under the EveryScale HPC & AI Software Stack CTO and are delivered through EAS.


EAS offers three EAR packs:

  • EAR Detective Pro – Includes Data Center monitoring, Analysis and Accounting modules.
  • EAR Optimizer – Includes all features of EAR Detective Pro, with the addition of the energy optimization module for both CPU and GPU, and the Job and System Analytics tools.
  • EAR Optimizer Pro – Builds upon EAR Optimizer by introducing the smart power capping module.

EAR is compatible with Slurm, PBS Pro, and Kubernetes clusters.

The following table lists the ordering part numbers for EAR licensing, support, and services.

Table 11. Energy Aware Runtime (EAR) Service and Support
Part number Description
EAR Detective Pro
7S09005FWW EAR Detective Pro Worldwide Installation and Training price up to 1000 sockets
7S09005GWW EAR Detective Pro Worldwide License & Support 1yr - clusters up to 1000 sockets (CPUs and GPUs, up to 1 type of each)
7S09005HWW EAR Detective Pro Worldwide License & Support 3yr - clusters up to 1000 sockets (CPUs and GPUs, up to 1 type of each)
7S09005JWW EAR Detective Pro Worldwide License & Support 5yr - clusters up to 1000 sockets (CPUs and GPUs, up to 1 type of each)
EAR Optimizer
7S09005KWW EAR Optimizer Worldwide Installation and Training - clusters up to 1000 sockets (CPUs only)
7S09005LWW EAR Optimizer Worldwide Installation and Training - clusters up to 1000 sockets (CPUs and GPUs)
7S09005MWW EAR Optimizer Worldwide License & Support price 1yr up to 1000 sockets per CPU socket
7S09005NWW EAR Optimizer Worldwide License & Support price 3yr up to 1000 sockets per CPU socket
7S09005PWW EAR Optimizer Worldwide License & Support price 5yr up to 1000 sockets per CPU socket
7S09005QWW EAR Optimizer Worldwide License & Support price 1yr up to 1000 sockets per GPU socket
7S09005RWW EAR Optimizer Worldwide License & Support price 3yr up to 1000 sockets per GPU socket
7S09005SWW EAR Optimizer Worldwide License & Support price 5yr up to 1000 sockets per GPU socket

EAR Optimizer Pro* (special BID required)

Note: These part numbers are applicable only to clusters that meet all of the following criteria:

  • Contain a single CPU type and a single GPU type
  • Have a combined total of up to 1,000 sockets (including both CPU and GPU sockets)
  • Do not require smart power capping (i.e. do not use EAR Optimizer Pro)

*Special BID Requirement: A Special BID from EAS is required in the following scenarios:

  • The cluster exceeds 1,000 sockets
  • The cluster includes multiple CPU or GPU types
  • The deployment requires smart power capping, available only through EAR Optimizer Pro
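
The eligibility rules above can be expressed as a simple check. The following Python sketch is illustrative only: the Cluster class, its field names, and the special_bid_reasons function are hypothetical and are not part of EAR or any Lenovo ordering tool. It assumes that a cluster with no GPUs (zero GPU types) still qualifies for the standard part numbers, in line with the "up to 1 type of each" wording in Table 11.

```python
from dataclasses import dataclass

MAX_STANDARD_SOCKETS = 1000  # combined CPU + GPU socket limit from the note above


@dataclass
class Cluster:
    """Hypothetical cluster description; field names are illustrative only."""
    cpu_sockets: int
    gpu_sockets: int
    cpu_types: int
    gpu_types: int
    needs_power_capping: bool = False


def special_bid_reasons(cluster: Cluster) -> list[str]:
    """Return why a Special BID is needed; an empty list means the
    standard part numbers apply."""
    reasons = []
    if cluster.cpu_sockets + cluster.gpu_sockets > MAX_STANDARD_SOCKETS:
        reasons.append("cluster exceeds 1,000 sockets")
    if cluster.cpu_types > 1 or cluster.gpu_types > 1:
        reasons.append("multiple CPU or GPU types")
    if cluster.needs_power_capping:
        reasons.append("smart power capping requires EAR Optimizer Pro")
    return reasons


if __name__ == "__main__":
    small = Cluster(cpu_sockets=512, gpu_sockets=256, cpu_types=1, gpu_types=1)
    large = Cluster(cpu_sockets=800, gpu_sockets=400, cpu_types=2, gpu_types=1,
                    needs_power_capping=True)
    print(special_bid_reasons(small))
    print(special_bid_reasons(large))
```

An empty result means the cluster fits the standard Table 11 part numbers; any returned reason indicates that a Special BID from EAS is required.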

Seller training courses

The following sales training courses are offered for employees and partners (login required). Courses are listed in date order.

  1. VTT HPC: LiCO-Computing Orchestration for AI and HPC
    2024-07-30 | 92 minutes | Employees Only

    **NOTE: To download the attached PPT, launch the course, exit the player to return to the course page, then scroll down to find the PPT.**

    Please view this session as Ana Irimiea, AI Systems and Solutions Product Manager at ISG, speaks with us about LiCO version 7.2.1.

    She will talk about:

    - Overview of LiCO
    - Administrator and user capabilities
    - Deployment options
    - Ordering LiCO
    - Roadmap

    Tags: High-Performance Computing (HPC), Sales, Technical Sales

    Published: 2024-07-30
    Length: 92 minutes

    Start the training:
    Employee link: Grow@Lenovo

    Course code: DVHPC214
  2. Enterprise Deployment of AI and Phases of Model Development
    2024-05-23 | 12 minutes | Employees and Partners

    Lenovo Senior AI Data Scientist Dr David Ellison whiteboards the concepts of using data from multiple sources to derive customer benefits through Artificial Intelligence and LiCO (Lenovo Intelligent Computing Orchestration) software.

    By the end of this training, you should be able to:

    • Describe enterprise deployment of Artificial Intelligence
    • Explain the process of model development
    • State the purpose of LiCO (Lenovo Intelligent Computing Orchestration) software
    • List three examples of Artificial Intelligence solutions

    Tags: Artificial Intelligence (AI), Data & Analytics, Technical Sales

    Published: 2024-05-23
    Length: 12 minutes

    Start the training:
    Employee link: Grow@Lenovo
    Partner link: Lenovo 360 Learning Center

    Course code: DAIS101
  3. Selling Lenovo Intelligent Computing Orchestration
    2021-08-25 | 18 minutes | Employees and Partners

    The goal of this course is to help ISG and Business Partner sellers understand Lenovo Intelligent Computing Orchestration (LiCO) software. Learn when and how to propose LiCO in order to continue the conversation with the customer and make a sale.

    Tags: Artificial Intelligence (AI)

    Published: 2021-08-25
    Length: 18 minutes

    Start the training:
    Employee link: Grow@Lenovo
    Partner link: Lenovo 360 Learning Center

    Course code: DAIS202

Trademarks

Lenovo and the Lenovo logo are trademarks or registered trademarks of Lenovo in the United States, other countries, or both. A current list of Lenovo trademarks is available on the Web at https://www.lenovo.com/us/en/legal/copytrade/.

The following terms are trademarks of Lenovo in the United States, other countries, or both:
Lenovo®

The following terms are trademarks of other companies:

Intel® is a trademark of Intel Corporation or its subsidiaries.

Linux® is the trademark of Linus Torvalds in the U.S. and other countries.

Windows® is a trademark of Microsoft Corporation in the United States, other countries, or both.

LSF® is a trademark of IBM in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.