GPU?

# Check whether a GPU is available
import tensorflow as tf
tf.config.list_physical_devices('GPU')

# Prevent TensorFlow from using the GPU:
# add the lines below before any TensorFlow import
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
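If TensorFlow has already been imported, the environment variable comes too late. A possible alternative is to hide the GPUs through the `tf.config` API before any operation has run. The sketch below assumes TensorFlow 2.1 or later; the helper name `hide_gpus_after_import` is made up for illustration, and the `ImportError` guard is only there so the snippet also runs on a machine without TensorFlow.

```python
def hide_gpus_after_import():
    """Hide all GPUs from TensorFlow.

    Returns the list of GPUs that are still visible (expected to be empty),
    or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Must be called before TensorFlow initializes its devices,
    # i.e. before the first operation or model is created.
    tf.config.set_visible_devices([], "GPU")
    return tf.config.get_visible_devices("GPU")


visible = hide_gpus_after_import()
print(visible)
```

Unlike `CUDA_VISIBLE_DEVICES`, this only affects the current Python process, which can be handy in notebooks.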

Installation with docker

👉 Official guide.
👉 Note: Docker & GPU.
The advantage of this method is that you only have to install the GPU driver on the host machine.

Without docker-compose

# pull the image
docker pull tensorflow/tensorflow:latest-gpu-jupyter

# run a container
# (--gpus all exposes the host GPUs; it needs Docker 19.03+ and the NVIDIA Container Toolkit)
mkdir -p ~/Downloads/test/notebooks
docker run --name docker_thi_test --gpus all -it --rm -v $(realpath ~/Downloads/test/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-gpu-jupyter

# check whether the GPU is available
nvidia-smi

# check whether TensorFlow 2 works (run from another terminal)
docker exec -it docker_thi_test bash
python

# inside the Python shell
import tensorflow as tf
tf.config.list_physical_devices('GPU')

With docker-compose?

👉 Read Docker & GPU instead.

On Windows WSL2

Update later…

Install directly on Linux (without docker)

On my computer: Dell XPS 15 7590 with an NVIDIA® GeForce® GTX 1650 Mobile.
🚨 This section is not complete; the guide is still not working!

Installation

This guide is specific to:
pip show tensorflow # 2.3.1
pip show tensorflow-gpu # 2.3.1
nvidia-smi # NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0
👉 Note: PyTorch.
👉 Note: Fresh Ubuntu / Pop!_OS Installation.
👉 Note: Linux.

Errors?

๐Ÿžย Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
Need to install new cuda & CUDNN libraries and tensorflow. (This note is for tensorflow==2.3.1 and CUDA 11.1) (ref).
# update the paths (lib64 on 64-bit systems, lib on 32-bit ones)
export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# quickly test the CUDA version
nvcc --version
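To see whether the dynamic loader can actually find a given CUDA runtime after updating the paths, a quick check from Python may help. This is only a sketch using the standard `ctypes` module; the helper name `can_load` is hypothetical, and the second library name in the loop is just an example of another version you might have installed.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing or not on the loader path.
        return False
    return True


# The first name is the one from the error message above.
for name in ("libcudart.so.10.1", "libcudart.so.11.0"):
    status = "loadable" if can_load(name) else "NOT found on the loader path"
    print(f"{name}: {status}")
```

If the library TensorFlow asks for is reported as not found, `LD_LIBRARY_PATH` does not point to a directory containing that exact version.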

๐Ÿžย WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 2000 batches). You may need to use the repeat() function when building your dataset.
Problem come from you don't have enough images!
train_generator = train_datagen.flow_from_directory(..., batch_size=20)
validation_generator = test_datagen.flow_from_directory(..., batch_size=20)

# Found 1027 images belonging to 2 classes.
# Found 256 images belonging to 2 classes.

model.fit(
    train_generator,
    validation_data=validation_generator,
    steps_per_epoch=100,
    epochs=20,
    validation_steps=50,
    verbose=2)
We must have steps_per_epoch * batch_size <= number of images; in this case 100 * 20 = 2000 > 1027. Check this answer for more information.
# correct
model.fit(
    ...
    steps_per_epoch=1027 // 20,  # = 51: with batches of 20, 1027 images give 51 full batches
    ...
    validation_steps=256 // 20,  # = 12: with batches of 20, 256 images give 12 full batches
    ...)
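The rule steps * batch_size <= number of images can be turned into a tiny helper so the step counts follow the dataset size automatically. This is a minimal sketch; `max_steps` is a made-up name, not a Keras function.

```python
def max_steps(num_images: int, batch_size: int) -> int:
    """Largest step count such that steps * batch_size <= num_images."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_images // batch_size


batch_size = 20
steps_per_epoch = max_steps(1027, batch_size)   # training set: 1027 images
validation_steps = max_steps(256, batch_size)   # validation set: 256 images
print(steps_per_epoch, validation_steps)        # prints: 51 12
```

Any value up to these limits satisfies the rule, so a smaller number of steps is fine as well.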

๐Ÿžย Not found: No algorithm worked! OR This is probably because cuDNN failed to initialize
1nvidia-smi
2# check and kill the process that uses GPU much
3# restart the task
1# OR: add the following to your code
2from tensorflow.compat.v1 import ConfigProto
3from tensorflow.compat.v1 import InteractiveSession
4
5config = ConfigProto()
6config.gpu_options.allow_growth = True
7session = InteractiveSession(config=config)
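In TensorFlow 2 the same "allocate on demand" behaviour can probably be reached without the `compat.v1` session, through `tf.config.experimental.set_memory_growth`. The sketch below is an assumption about what fits this setup rather than a tested fix for the error above; `enable_gpu_memory_growth` is a hypothetical helper name, and the `ImportError` guard only keeps the snippet runnable where TensorFlow is missing.

```python
def enable_gpu_memory_growth():
    """Turn on memory growth for every physical GPU.

    Returns the number of GPUs configured, or None if TensorFlow
    is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must run before the GPUs are initialized, so call this
        # right after importing TensorFlow.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


configured = enable_gpu_memory_growth()
print(configured)
```

Setting the environment variable `TF_FORCE_GPU_ALLOW_GROWTH=true` before starting Python is another documented way to get the same effect.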