
Lenovo Big Data Reference Design for Cloudera Data Platform on ThinkSystem Servers

Reference Architecture

27 Mar 2024
Form Number
PDF size: 38 pages, 1.4 MB


This document describes the reference design for Cloudera Data Platform software on ThinkSystem servers. It provides architecture guidance for designing optimized hardware infrastructure for the Cloudera Data Platform Private Cloud edition, a distribution of Apache Hadoop and Apache Spark with enterprise-ready capabilities from Cloudera. The reference design covers planning and design considerations and best practices for implementing Cloudera Data Platform with Lenovo products. It also includes considerations for GPU acceleration of Apache Spark 3.0 on ThinkSystem servers.

The intended audience for this reference design is IT professionals, technical architects, sales engineers, and consultants who assist in planning, designing, and implementing a big data solution with Lenovo hardware. Readers are assumed to be familiar with Cloudera Data Platform components and capabilities.

Table of Contents

  1. Introduction
  2. Business problem and business value
  3. Requirements
  4. Architectural Overview
  5. Component Model
  6. Operational Model
  7. Customer Case
  8. Resources


Change History

Changes in the March 27, 2024 update:

  • Added Intel Emerald Rapids description to section 6.1.1
  • Updated accelerators in section 6.1.3


Related product families

Product families related to this document are the following: