Authors
Kelvin He, David Ellison
Published
17 Sep 2025
Form Number
LP2269
PDF size
7 pages, 79 KB
Abstract
Accelerating data science workflows means shortening every stage of the analytics pipeline, from data ingest and transformation through model training and inference, so teams iterate faster at lower infrastructure cost.
In this paper, we quantify generation‑to‑generation performance on common data‑science workloads running entirely on CPUs. We compare Modin data manipulation, scikit‑learn training (DBSCAN, K‑means, KNN‑Classifier, Logistic/Linear Regression, Random Forests), and inference (including LightGBM, XGBoost, and CatBoost) across 3rd, 5th, and 6th Gen Intel Xeon processors.
The results of this analysis show that 6th Gen Xeon improves the following:
- Data manipulation (Modin) by up to 6.13× vs 3rd Gen (at 1.6M rows) and 2.64× vs 5th Gen
- Training by up to 7.23× vs 3rd Gen and 3.00× vs 5th Gen across algorithms
- Inference by up to 4.37× vs 3rd Gen and 2.38× vs 5th Gen across algorithms
A balanced composite view yields an overall gain of ≈1.6×–2.6× (6th vs 5th Gen) and ≈2.5×–6.1× (6th vs 3rd Gen) for the end‑to‑end pipeline.
This paper targets data scientists, ML engineers, and performance‑minded architects who already understand core scikit‑learn APIs (fit/predict, pipelines) and basic ML concepts (train‑test split, common metrics). We assume readers are comfortable with Python and want practical guidance on extracting more performance from Intel Xeon‑based infrastructure without rewriting their code.
Introduction
Most enterprise analytics pipelines still lean on Python DataFrames (pandas/Modin) and classical ML libraries (scikit‑learn, XGBoost), so CPU‑only efficiency directly impacts cost, latency, and throughput. As Intel Xeon generations advance, pairing Modin with Intel Extension for Scikit‑learn turns architectural gains into real end‑to‑end time savings with minimal code change.
Three ingredients make this possible:
- Pandas ↔ Modin. Pandas is the de‑facto DataFrame API; Modin keeps that API while parallelizing execution across cores or cluster back‑ends such as Ray. This enables parallel I/O and compute with minimal code change (an import swap).
- Intel Extension for Scikit‑learn (sklearn‑intelex). A single call to patch_sklearn() dynamically patches popular estimators to highly optimized C++ kernels (oneDAL), accelerating both fit and predict without rewriting pipelines. A combined sketch of both changes follows this list.
- Xeon generations. We focus on realistic CPU‑only deployments comparing 3rd Gen Xeon Scalable, 5th Gen Xeon, and 6th Gen Xeon ("Xeon 6").
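As a minimal sketch of the two code changes involved (the CSV file name, label column, and estimator choice below are illustrative placeholders, not the benchmark code):

    # Minimal adoption sketch: swap the pandas import for Modin and patch
    # scikit-learn before importing estimators. File name, column name, and
    # estimator are illustrative placeholders.
    from sklearnex import patch_sklearn
    patch_sklearn()  # re-binds supported estimators to oneDAL kernels

    import modin.pandas as pd  # drop-in replacement for "import pandas as pd"
    from sklearn.linear_model import LogisticRegression  # now the patched class

    df = pd.read_csv("data.csv")  # parallel ingest across cores via Modin/Ray
    X, y = df.drop(columns=["label"]), df["label"]

    clf = LogisticRegression().fit(X, y)  # accelerated fit
    preds = clf.predict(X)                # accelerated predict

Everything else in an existing pandas/scikit‑learn code path stays the same, which is what keeps adoption low‑friction.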
Series of papers
This paper represents Part 3 of a series on Accelerating Data Science Workflows:
- Part 1: Modin vs. Pandas for data manipulation (I/O, transforms)
- Part 2: Intel Extension for Scikit‑learn (training & inference)
- Part 3 (this paper): End‑to‑end gains across 3rd, 5th, and 6th Gen Intel Xeon CPUs, covering data manipulation + model training + inference, with composite pipeline speed‑ups
Algorithms and Datasets
To ensure apples‑to‑apples comparisons, we reuse the workloads from Parts 1–2 and run them on identical software stacks across 3rd, 5th, and 6th Gen Intel Xeon CPUs. The list below specifies the workloads, metrics, and environment; a sketch of one representative training run follows it.
- Workloads:
- Data manipulation (Modin): CSV ingest + transforms at 400K / 800K / 1.6M rows.
- Training (sklearn‑intelex): DBSCAN, K‑means, KNeighborsClassifier, Logistic/Linear Regression, Random Forest Classifier/Regressor.
- Inference: CatBoost, LightGBM, XGBoost, plus the scikit‑learn models above.
- Metrics: Wall‑clock time for each operation; composite results use the median to reduce skew from outliers. We also report 6th‑vs‑5th and 6th‑vs‑3rd Gen speed‑up factors.
- Environment: Identical software stack on every platform; only the CPU generation (and its supporting platform) changes, as detailed in the Lab Configurations section.
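To make the training workloads concrete, here is an illustrative K‑means run under the patched scikit‑learn; the synthetic data shape and K‑means settings are assumptions, not the exact benchmark parameters from Part 2.

    # Illustrative training workload: K-means on synthetic data.
    from sklearnex import patch_sklearn
    patch_sklearn()  # must run before the sklearn import below

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)          # fixed seed, per the methodology
    X = rng.standard_normal((400_000, 16))  # synthetic stand-in dataset

    model = KMeans(n_clusters=8, n_init=10, random_state=0)
    model.fit(X)  # the fit dispatches to the optimized oneDAL kernel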
Methodology Notes & Reproducibility
To keep results fair and repeatable, we standardize seeds, force garbage collection between iterations, and repeat runs to smooth variance; a sketch of the timing harness follows this list. Use the notes below to replicate our setup and verify the numbers.
- All timings are wall‑clock.
- Each test was repeated for multiple iterations, with garbage collection (gc.collect()) forced between runs; the median is reported.
- Data manipulation used Modin; ML used scikit‑learn patched with Intel Extension for Scikit‑learn.
- The same software versions, BIOS settings, and datasets were used on all three platforms (see Lab Configurations).
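A minimal timing harness capturing this discipline might look as follows; run_workload and the iteration count are placeholders for the actual benchmark code.

    # Repeat each workload, collect garbage between runs, report the median.
    import gc
    import statistics
    import time

    def time_workload(run_workload, n_iter=5):
        timings = []
        for _ in range(n_iter):
            gc.collect()                 # clear garbage before each run
            start = time.perf_counter()  # wall-clock timer
            run_workload()
            timings.append(time.perf_counter() - start)
        return statistics.median(timings)  # median reduces outlier skew

    # Example: median_seconds = time_workload(lambda: model.fit(X))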
Results
In this section, we list the raw wall‑clock times and normalized speed‑ups (5th Gen to 6th Gen, and 3rd Gen to 6th Gen). Lower time values are better; the ratios in the two rightmost columns highlight the 6th Gen uplifts.
- Data Manipulation (Modin)
- Model Training (Intel Extension for Scikit‑learn)
- Model Inference (Intel Extension for Scikit‑learn)
- Composite “Full Pipeline” View
Data Manipulation (Modin)
The improvements for Modin across the three dataset sizes are shown in the following table:
- 6th vs 5th Gen: 1.88×–2.64×
- 6th vs 3rd Gen: 3.00×–6.13×
In the table headings, (s) denotes wall‑clock time in seconds; for example, "3rd Gen (s)" is the runtime on the 3rd Gen platform in seconds.
Model Training (Intel Extension for Scikit‑learn)
The following table lists per‑algorithm training times. Across algorithms, the training uplifts are:
- 6th vs 5th Gen: 1.11×–3.00×
- 6th vs 3rd Gen: 1.44×–7.23×
Model Inference (Intel Extension for Scikit‑learn)
The following table lists per‑algorithm inference times. Across algorithms, the inference uplifts are:
- 6th vs 5th Gen: 1.19×–2.38×
- 6th vs 3rd Gen: 1.63×–4.37×
Composite “Full Pipeline” View
To avoid over‑weighting any single stage, we compute a balanced median uplift across the three segments (Modin data manipulation, training, inference):
- 6th vs 5th Gen: ≈1.6×–2.6×
- 6th vs 3rd Gen: ≈2.5×–6.1×
Note: If your workload is training‑heavy or inference‑heavy, scale each segment by the appropriate share of wall‑clock time to obtain the scenario‑specific uplift.
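As a worked example of this scaling rule, the sketch below combines an assumed time profile with hypothetical mid‑range per‑stage uplifts; substitute your own measured shares and speed‑ups.

    # Hypothetical stage weights and mid-range 6th-vs-5th Gen uplifts.
    weights = {"prep": 0.40, "train": 0.35, "infer": 0.25}  # share of wall-clock time
    uplift = {"prep": 2.0, "train": 1.5, "infer": 1.8}      # per-stage speed-up factors

    # Each stage's time shrinks by 1/uplift, so the pipeline-level speed-up
    # is the time-weighted harmonic combination of the per-stage factors.
    new_time = sum(w / uplift[s] for s, w in weights.items())
    print(f"Scenario-specific uplift: {1 / new_time:.2f}x")  # ~1.75x for this profile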
Conclusions
6th Gen Intel Xeon consistently advances end‑to‑end, CPU‑only analytics versus 5th and 3rd Gen baselines using the same, familiar software stack (Modin + Intel Extension for Scikit‑learn). In our tests, 6th/5th uplifts typically land in the 1.88×–2.64× range for data manipulation, 1.11×–3.00× for model training, and 1.19×–2.38× for inference; against 3rd Gen, ranges widen to 3.00×–6.13×, 1.44×–7.23×, and 1.63×–4.37×, respectively. The full‑pipeline gain is workload‑dependent but commonly falls around ≈1.6×–2.6× vs 5th Gen (and ≈2.5×–6.1× vs 3rd Gen) when combining prep, fit, and predict.
Practical takeaways:
- Prioritize CPU upgrades when your pipelines are I/O‑heavy (large CSV/Parquet ingestion, wide group‑bys) or rely on tree ensembles, clustering, or distance‑based methods. These show the largest improvements from gen‑to‑gen and from the Intel‑optimized kernels.
- Mind training vs inference trade‑offs. Some algorithms may train modestly faster but infer dramatically faster (or vice versa). Choose CPU generation and algorithmic settings based on where your SLA or cost is constrained (e.g., batch‑training windows vs. online latency).
- Adoption is low‑friction. The improvements arrive with minimal code change: an import swap for Modin and a patch_sklearn() call for Intel Extension for Scikit‑learn, preserving APIs and model semantics.
- Right‑size with pipeline weights. Apply the per‑stage ranges to your own time profile (e.g., 40% prep / 35% train / 25% infer) to estimate business impact. Where inference dominates, favor gains in predict‑time algorithms; where training windows dominate, weight fit‑time uplifts more heavily.
Overall, upgrading to 6th Gen Intel Xeon turns many formerly multi‑second steps into sub‑second operations and materially compresses end‑to‑end latency, without abandoning the mainstream pandas/scikit‑learn ecosystem.
Lab Configurations
Our test server had the hardware and software configuration listed in the following table.
Note that the installed memory differs across the three servers, reflecting each generation's supported DIMM speeds and channel counts. The reported uplifts therefore characterize each platform as a whole (CPU plus memory subsystem) rather than the CPU in isolation; memory‑bandwidth‑sensitive stages such as large CSV ingest likely benefit in part from the newer platforms' faster memory.
References
For more information, see these web resources:
- Modin documentation
https://modin.readthedocs.io/
- Intel Extension for Scikit‑learn Overview
https://www.intel.com/content/www/us/en/developer/tools/oneapi/scikit-learn.html
- Intel Extension for Scikit‑learn Getting Started
https://www.intel.com/content/www/us/en/developer/articles/guide/intel-extension-for-scikit-learn-getting-started.html
- 3rd Gen Intel Xeon Scalable
https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/3rd-gen-xeon-scalable-processors.html
- 5th Gen Intel Xeon Scalable
https://www.intel.com/content/www/us/en/products/docs/processors/xeon/5th-gen-xeon-scalable-processors.html
- Intel Xeon 6
https://www.intel.com/content/www/us/en/products/details/processors/xeon.html
Authors
Kelvin He is an AI Data Scientist at Lenovo. He is a seasoned AI and data science professional specializing in building machine learning frameworks and AI-driven solutions. Kelvin is experienced in leading end-to-end model development, with a focus on turning business challenges into data-driven strategies. He is passionate about AI benchmarks, optimization techniques, and LLM applications, enabling businesses to make informed technology decisions.
David Ellison is the Chief Data Scientist for Lenovo ISG. Through Lenovo's US and European AI Discover Centers, he leads a team that uses cutting-edge AI techniques to deliver solutions for external customers while internally supporting the overall AI strategy for the Worldwide Infrastructure Solutions Group. Before joining Lenovo, he ran an international scientific analysis and equipment company and worked as a Data Scientist for the US Postal Service. Prior to that, he received a PhD in Biomedical Engineering from Johns Hopkins University. He has numerous publications in top-tier journals, including two in the Proceedings of the National Academy of Sciences.
Trademarks
Lenovo and the Lenovo logo are trademarks or registered trademarks of Lenovo in the United States, other countries, or both. A current list of Lenovo trademarks is available on the Web at https://www.lenovo.com/us/en/legal/copytrade/.
The following terms are trademarks of Lenovo in the United States, other countries, or both:
Lenovo®
ThinkSystem®
The following terms are trademarks of other companies:
Intel® and Xeon® are trademarks of Intel Corporation or its subsidiaries.
Linux® is the trademark of Linus Torvalds in the U.S. and other countries.
Other company, product, or service names may be trademarks or service marks of others.