Apple Silicon Machines
With their Apple Silicon chips, Apple machines have moved to an arm64 architecture (from the x86-64 of the Intel ones).
While the new chips are a great upgrade over their Intel counterparts, the new architecture involves some caveats and requires most pieces of software to be ported before they can run natively.
Thankfully, all programs built for the retired Intel x86-64 chips can be run through Rosetta 2, a translation layer provided by Apple.
The following link is a useful resource to see which apps and pieces of software are optimized for Apple Silicon devices: https://isapplesiliconready.com/.
Rosetta 2
Using the Rosetta 2 translation layer is straightforward: when running a program requiring translation for the first time, the user is prompted to install Rosetta 2, and no further action is ever required.
Alternatively, it is possible to install it manually from the command line with:
softwareupdate --install-rosetta
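To check whether the current process is being translated, one can query the `sysctl.proc_translated` key that Apple exposes for this purpose. A minimal sketch, written so it also runs harmlessly on non-macOS systems (where the key does not exist):

```python
import platform
import subprocess

def running_under_rosetta() -> bool:
    """Return True when the current process is translated by Rosetta 2."""
    if platform.system() != "Darwin":
        return False  # Rosetta 2 only exists on macOS
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        )
        # "1" means translated, "0" means native
        return result.stdout.strip() == "1"
    except subprocess.CalledProcessError:
        return False  # the key does not exist on Intel Macs

print(running_under_rosetta())
```

Running this from an x86-64 Python interpreter on an Apple Silicon machine should print `True`, while a native arm64 interpreter prints `False`.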
Python Setup
The conda-forge people have put a lot of work into making cross-compilation of packages to various target architectures (as) automatic (as possible).
The easiest way to handle your Python virtual environments is to use miniforge, which is essentially a miniconda-like distribution pre-configured to use the conda-forge channel, with support for various CPU architectures.
Python Virtual Environments
Don't know what a virtual environment is or what purpose it serves? Here is a good (but lengthy) primer on virtual environments by RealPython.
The GitHub page of Miniforge contains download links to their installers for various configurations (conda/mamba, CPython/PyPy, amd64/arm64/macos-arm64, etc.).
I personally recommend the mambaforge distributions, as the mamba tool is an amazing alternative to conda that one would otherwise be missing out on.
Intel-Type Python Environments
In order to install Python packages that do not run natively under Apple Silicon, one might need to create a virtual environment with an x86-64 architecture.
To do so, set the CONDA_SUBDIR environment variable to osx-64 before running the environment creation command.
For instance, to create a new Python 3.9 virtual environment named example with the x86-64 architecture, do:
CONDA_SUBDIR=osx-64 conda create -n example python=3.9
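One can confirm which architecture a given interpreter reports (and hence whether it will run through Rosetta 2) from within Python itself:

```python
import platform

# In an osx-64 environment running through Rosetta 2 this reports "x86_64";
# in a native Apple Silicon environment it reports "arm64".
print(platform.machine())
```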
As the whole environment runs through Rosetta 2, do not be surprised if its first startup (call to Python) is a little slow. Subsequent startups will be much faster, as the translations are cached.
MAD-X & cpymad
MAD-X does not provide native builds for Apple Silicon; however, one can perfectly well install the madx-macosx64-gnu version and have it run through Rosetta 2.
The performance difference is negligible in practice.
As MAD-X does not run natively on Apple Silicon chips, neither does cpymad.
As a consequence, in order to install cpymad on Apple Silicon, one needs to create an osx-64-type architecture environment in which to install the package.
For instructions on how to do so, see the Intel-Type Python Environments tooltip in the Python Setup section.
TensorFlow on Silicon GPU
In version 2.5, TensorFlow introduced the PluggableDevice plugin API, which Apple has used to provide a plugin making tensorflow aware of the GPU available on Apple Silicon chips.
If one has a miniforge or mambaforge setup as instructed in the Python Setup section above, creating a Python 3.9 environment to make use of tensorflow natively, running on the Apple Silicon GPU, is as simple as:
conda create -n tensorflow python=3.9 --yes
conda activate tensorflow
conda install -c apple tensorflow-deps tensorflow --yes
python -m pip install --upgrade tensorflow-metal # PluggableDevice plugin
That's it! There is nothing more to do: TensorFlow will automatically detect the GPU and use it for computations.
Verifying the Install
One can test that the installation has made tensorflow aware of the Apple Silicon GPU by running the following script:
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Build a small convolutional network
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
model.summary()

# Load the MNIST data and normalize pixel values to [0, 1]
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype("float32") / 255
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Train for a few epochs, then evaluate on the held-out test set
model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5, batch_size=64)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
When running the script, one should see the following line logged, confirming that tensorflow is indeed using the Apple Silicon GPU through the Metal backend (the device is named METAL in the log):
tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Note that the device is detected with 0 MB memory, which is normal: the Apple Silicon GPU shares unified memory with the CPU and does not have dedicated device memory.
Additionally, one can open the Activity Monitor during model training and confirm that the Python process shows a high percentage value in the GPU column.
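Alternatively, one can query the devices tensorflow sees directly. The snippet below guards the import so it degrades gracefully on a machine where tensorflow (or the Metal plugin) is not installed:

```python
# List the physical GPU devices tensorflow can see. On an Apple Silicon
# machine with tensorflow-metal installed, this should contain one entry
# (a PluggableDevice backed by Metal).
try:
    import tensorflow as tf
    gpus = tf.config.list_physical_devices("GPU")
except ImportError:
    gpus = []  # tensorflow is not installed in this interpreter
print(gpus)
```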
PyTorch on Silicon GPU
Since version 1.12, PyTorch includes native support for Apple Silicon GPUs through Apple's Metal Performance Shaders (MPS) framework.
If one has a miniforge or mambaforge setup as instructed in the Python Setup section above, creating a Python 3.9 environment to make use of pytorch natively, running on the Apple Silicon GPU, is as simple as:
conda create -n pytorch python=3.9 --yes
conda activate pytorch
python -m pip install torch torchvision torchaudio
Making PyTorch use the GPU
The PyTorch integration with Metal is not as seamless as the TensorFlow one: just like with any other accelerator, PyTorch requires you to explicitly set the device for calculations.
To use the Apple Silicon GPU, one has to specify the device as mps (Metal Performance Shaders), either in the torch.device constructor or when creating tensors:
import torch
gpu = torch.device("mps")
x = torch.ones(5, device=gpu)
# or alternatively: x = torch.ones(5, device="mps")
All following calculations will be done on the GPU:
y = x ** 2
y # >> will print tensor([1., 1., 1., 1., 1.], device='mps:0')
One can easily load pre-existing models and transfer them to the GPU for inference:
model = YourFavoriteNet()
model.to(gpu) # the torch.device("mps") defined above
# Now every call runs on the GPU
predictions = model(inputs)
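Since the mps device only exists on Apple Silicon builds of PyTorch, a portable script can select it conditionally. A minimal sketch, guarding the import so the snippet also runs where torch is absent:

```python
# Select the Apple Silicon GPU when available, falling back to the CPU,
# so the same script runs unchanged on any machine.
try:
    import torch
    use_mps = torch.backends.mps.is_available()
    device = torch.device("mps" if use_mps else "cpu")
except ImportError:
    device = "cpu"  # torch is not installed in this interpreter
print(device)
```

Tensors and models created with `device=device` will then transparently land on the GPU on Apple Silicon and on the CPU everywhere else.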
Docker
Starting with Docker Desktop 4.3.0, the application can run natively on Apple Silicon chips.
Image Platform Warning
One might encounter the following warning when running images:
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
To circumvent this, make sure to provide the --platform flag when running the image, with the appropriate value linux/amd64.
For instance, running the quickstart example from the Docker documentation would then be:
docker run --platform linux/amd64 -dp 80:80 docker/getting-started
Click here to get to the BE-ABP Docker image made by Guido.
Another good starting point is to have a look at the AccPy images.
Here is a guide provided by the AccPy maintainers.
AFS
The Auristor file system works on Apple Silicon and enables one to access AFS.
A step-by-step guide is available at the following link.
Here Be Demons!
The installation procedure for Auristor requires you to lower the security protections of macOS.
While this is in practice an acceptable thing to do, please DO NOT DO SO unless you know your system well and know exactly what you are doing.