Build Intel® Extension for PyTorch from source
This is a record of how to build Intel® Extension for PyTorch (IPEX) from source. After three days of torture, it is finally done. To commemorate this, I am writing down the steps here. Most steps are the same as in the official document, but there are some hidden pitfalls it does not mention. I recommend reading the official document thoroughly first, since some issues are only warned about in later steps. You can find a TL;DR version in the final section, as a supplement to the official document.
My compilation environment is listed as follows:
| Item | Environment |
| --- | --- |
| OS | Windows 11 23H2 64-bit |
| CPU | 13th Gen Intel(R) Core(TM) i7-1360P @ 2.611GHz |
| GPU | Intel(R) Iris(R) Xe Graphics |
| Python | 3.11.12 (Intel® Distribution for Python) in a conda environment |
| GCC | 11.2.0 (x86_64-posix-sjlj-rev1, built by the MinGW-W64 project) |
System Requirements
First, follow the instructions to install GPU drivers, Intel® Deep Learning Essentials/Intel® oneAPI Base Toolkit, Python and the compiler.
- Make sure the versions of your oneAPI tools match those given in the document, to avoid version mismatches or dependency issues (a quick way to check is shown after this list). For example, if the document uses `2025.0`, the first two components of your oneAPI tool versions should also be `2025.0`; the patch version may differ, so `2025.0.2` and `2025.0.4` are both acceptable.
- Note that the Python version you build with is exactly the one you are building for, so plan ahead, or you will need to rebuild the whole thing again.
- Use Python in a conda environment.
- `GCC 11` is recommended, since higher versions such as `GCC 14` may fail to compile.
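Before starting, it is worth sanity-checking the toolchain. Below is a minimal sketch, assuming setvars.bat has already been run and the conda environment is active; the commands are the standard version/listing tools shipped with oneAPI, MinGW-W64 and Python, nothing specific to my setup.

rem Quick toolchain check (run after setvars.bat and conda activate).
rem icx is the oneAPI DPC++/C++ compiler driver on Windows.
icx --version
rem Must be the exact Python version you are building for.
python --version
rem GCC 11.x is recommended.
gcc --version
rem Lists the SYCL devices visible to the runtime (also useful for choosing AOT targets later).
sycl-ls --verbose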
Build Process
Open a cmd terminal and set up the oneAPI, conda, and proxy environment variables (the proxy is needed because there are git operations during setup). For example, I use the `init.bat` file below to do this automatically.
@echo off
rem Replace the path with your own oneAPI installation path
call "D:\Intel\oneAPI\setvars.bat"
rem Replace the env name with your own conda env name
rem ("call" is needed because conda itself is a batch script on Windows)
call conda activate ipex
rem Proxy settings copied from Clash for Windows
rem (set them on separate lines so no trailing space sneaks into the value)
set http_proxy=http://127.0.0.1:1169
set https_proxy=http://127.0.0.1:1169
echo setup oneAPI vars, conda env ipex, and proxy
rem [OPTIONAL SETTINGS]
rem If your machine is not powerful, a stack overflow may happen while compiling PyTorch,
rem because the default stack size is small and the default parallel build uses too many threads.
rem The following env vars limit the number of parallel jobs and the compiler's memory usage.
set MAX_JOBS=2
set CL=/Zm800
Remember to run the following command to install the required dependencies before building (they are not handled properly in the official `compile_bundle.bat`).
conda install -y -c conda-forge libjpeg-turbo libpng json5
Then call `compile_bundle.bat` (download it from GitHub; see the official document for reference) with the following command, and just wait.
rem compile_bundle.bat <DPCPP_ROOT> <ONEMKL_ROOT> <AOT>
compile_bundle.bat "D:\Intel\oneAPI\compiler\latest" "D:\Intel\oneAPI\mkl\latest" "xe-lpg,mtl-u"
- Here I specify `DPCPP_ROOT` and `ONEMKL_ROOT` explicitly to avoid unnecessary errors.
- The legal AOT targets can be found in the official document and in the `csrc/gpu/aten/operators/xetla/kernels/CMakeLists.txt` file in IPEX's GitHub repository. Please check your hardware type carefully; you can run `sycl-ls --verbose` in a oneAPI-enabled terminal to check it.
- The AOT targets listed in the `CMakeLists.txt` file mentioned above (as of the latest commit on 2025-04-23) are as follows:
set(AOT_REGEX_XE_HPC "(xe-hpc.*|pvc|bmg|bmg-.*|lnl-.*|xe2-.*)")
set(AOT_REGEX_XE_HPG "(xe-hpg.*|ats-m.*|acm-.*|dg2|dg2-.*|arl-h)")
set(AOT_REGEX_XE_LPG "(xe-lpg.*|mtl|mtl-.*|arl-u|arl-s|0x7d55|0x7dd5|0x7d57|0x7dd7)")
The build process is quite time- and storage-consuming. The main parts are PyTorch and IPEX: compiling PyTorch requires about 30~40 GB of storage (build cache included) and about 6~7 hours, while compiling IPEX requires about 10 GB of storage and about 4 hours. So prepare yourself fully in advance.
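Once the build finishes, it is worth verifying the result before moving on. Here is a minimal sanity check, adapted from the style of snippet the official IPEX documentation uses; the exact attributes may vary between IPEX versions, so treat it as a sketch rather than a definitive test.

rem Run in the same oneAPI/conda environment used for the build.
rem Prints the PyTorch and IPEX versions and whether the XPU device is usable.
python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__, ipex.__version__); print(torch.xpu.is_available())"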
Summary of Encountered Problems
Here are the major problems I encountered during the build process. The fixes are already integrated into the steps above, but I list them here for reference.
- Used `GCC 14` instead of `GCC 11` at first, and failed every time (also mentioned in IPEX issue #790).
- Used oneAPI `2025.1` instead of `2025.0`.
- Used the oneAPI command prompt, only to later find the instruction "Use individual component-specific activation scripts to activate required components listed below one-by-one" in the document.
- Forgot to set up the proxy or install dependencies after creating a brand-new conda environment. (That is why I wrote `init.bat` and modified `compile_bundle.bat` to do this automatically.)
- Did not set `DPCPP_ROOT` and `ONEMKL_ROOT` explicitly but used `ONEAPIROOT` instead, which may cause errors.
- Set unsupported AOT targets, such as `rpl-p` (listed in the official document, yet absent from the `CMakeLists.txt`) or `xe_lpg` (a spelling error).
- Not enough disk space.
- Stack overflow during compilation. (Set `MAX_JOBS` to limit the parallel jobs of ninja, and `CL` to expand the stack size. The former is more important, since I only set `CL` to `800` at first and it still overflowed. Even worse, I tried twice without realizing this issue.)
- Ran part of `compile_bundle.bat` manually, but commented out the wrong lines (to be more specific, I commented out some necessary lines while forgetting to comment out some useless lines).