2021/07/03
Introduction
The tensorflow-gpu environment I set up previously was still on version 2.3, which was old enough to bother me, so I updated it.
Environment
~ $ neofetch
.-/+oossssoo+/-. ***@***********
`:+ssssssssssssssssss+:` ------------------
-+ssssssssssssssssssyyssss+- OS: Ubuntu 20.04.* LTS x86_64
.ossssssssssssssssssdMMMNysssso. Host: ******
/ssssssssssshdmmNNmmyNMMMMhssssss/ Kernel: 5.8.0-59-generic
+ssssssssshmydMMMMMMMNddddyssssssss+ Uptime: 21 mins
/sssssssshNMMMyhhyyyyhmNMMMNhssssssss/ Packages: ***** (dpkg), *** (snap)
.ssssssssdMMMNhsssssssssshNMMMdssssssss. Shell: ************
+sssshhhyNMMNyssssssssssssyNMMMysssssss+ Resolution: 2560x1440
ossyNMMMNyMMhsssssssssssssshmmmhssssssso DE: GNOME
ossyNMMMNyMMhsssssssssssssshmmmhssssssso WM: Mutter
+sssshhhyNMMNyssssssssssssyNMMMysssssss+ WM Theme: Adwaita
.ssssssssdMMMNhsssssssssshNMMMdssssssss. Theme: Yaru-dark [GTK2/3]
/sssssssshNMMMyhhyyyyhdNMMMNhssssssss/ Icons: Yaru [GTK2/3]
+sssssssssdmydMMMMMMMMddddyssssssss+ Terminal: gnome-terminal
/ssssssssssshdmNNNNmyNMMMMhssssss/ CPU: AMD Ryzen 5 3600X (12) @ 3.800GHz
.ossssssssssssssssssdMMMNysssso. GPU: NVIDIA GeForce RTX 2070 SUPER
-+sssssssssssssssssyyyssss+- Memory: 5275MiB / 32041MiB
`:+ssssssssssssssssss+:`
.-/+oossssoo+/-.
~ $ nvidia-smi
Sat Jul 3 00:54:15 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.42.01 Driver Version: 470.42.01 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:0A:00.0 On | N/A |
| 52% 36C P8 23W / 215W | 535MiB / 7959MiB | 13% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1133 G /usr/lib/xorg/Xorg 53MiB |
| 0 N/A N/A 1878 G /usr/lib/xorg/Xorg 185MiB |
| 0 N/A N/A 2014 G /usr/bin/gnome-shell 50MiB |
| 0 N/A N/A 2472 G ...AAAAAAAAA= --shared-files 139MiB |
| 0 N/A N/A 6843 G ...AAAAAAAAA= --shared-files 92MiB |
+-----------------------------------------------------------------------------+
~ $ python --version
Python 3.8.5
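Check this page first
Everything you need is listed on this page (TensorFlow's tested build configurations).
In my case I decided to just go with the latest release, so I chose the configuration in the top row. The relevant excerpt is below.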
Version | Python version | Compiler | Build tools | cuDNN | CUDA |
---|---|---|---|---|---|
tensorflow-2.5.0 | 3.6-3.9 | GCC 7.3.1 | Bazel 3.7.2 | 8.1 | 11.2 |
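The row above can also be checked against the local interpreter mechanically. The following is only a minimal sketch: the `TESTED_CONFIG` dictionary and the `python_is_supported` helper are hypothetical names introduced here for illustration, and the values are copied by hand from the row above.

```python
import sys

# Values copied from the tensorflow-2.5.0 row of the table above.
# The dictionary layout and the helper name are illustrative assumptions.
TESTED_CONFIG = {
    "tensorflow": "2.5.0",
    "python_min": (3, 6),
    "python_max": (3, 9),
    "cudnn": "8.1",
    "cuda": "11.2",
}


def python_is_supported(version_info=sys.version_info, config=TESTED_CONFIG):
    """Return True if the (major, minor) version lies in the tested range."""
    major_minor = (version_info[0], version_info[1])
    return config["python_min"] <= major_minor <= config["python_max"]


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", python_is_supported())
```

Passing a tuple such as `(3, 8, 5)` instead of `sys.version_info` lets you check a version other than the one currently running.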
Start with pip
As of 2021/07/03, the latest tensorflow-gpu release is 2.5.0, so I installed it without pinning a version.
install_tensorflow_gpu
$ pip install tensorflow-gpu
check_tensorflow_version
~ $ pip list | grep tensorflow-gpu
tensorflow-gpu 2.5.0
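The same check can be done from Python instead of `pip list | grep`. This is a small sketch assuming Python 3.8 or later (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name, not something from the original setup.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "tensorflow-gpu" is the package name used in this article.
    print("tensorflow-gpu:", installed_version("tensorflow-gpu"))
```

Returning `None` for a missing package, rather than raising, keeps the check easy to reuse in a setup script.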
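Install CUDA
On this page, click the buttons that match your environment (here is the page with my environment already selected).
For Installer Type, pick whichever you prefer.
Run the commands shown below the buttons in order, and the installation is complete.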
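Install cuDNN
You cannot download cuDNN unless you are registered with the NVIDIA Developer Program.
It can be downloaded from here.
First, sign in and accept the license by ticking I Agree To the Terms of the cuDNN Software License Agreement.
Next, because this setup needs a cuDNN version that is not the latest, click Archived cuDNN Releases to see the archived versions (here is the page after clicking).
Find and click the entry
Download cuDNN v8.1.1 (Feburary 26th, 2021), for CUDA 11.0,11.1 and 11.2
and download the following two packages.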
cuDNN Runtime Library for Ubuntu20.04 x86_64 (Deb)
cuDNN Developer Library for Ubuntu20.04 x86_64 (Deb)
Then run the following in a terminal.
install_cudnn
~ $ cd Download
~/Download $ sudo dpkg --install libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
~/Download $ sudo dpkg --install libcudnn8-dev_8.1.1.33-1+cuda11.2_amd64.deb
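Before starting TensorFlow, you can confirm that the dynamic loader actually finds the newly installed libraries. The sketch below is an assumption-laden illustration, not part of the original procedure: `can_load` is a hypothetical helper, and the library names are the sonames that TensorFlow 2.5.0 reports opening in its log.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Sonames taken from the TensorFlow log in this article.
    for name in ("libcudnn.so.8", "libcudart.so.11.0"):
        print(name, "loadable:", can_load(name))
```

If a library is reported as not loadable, the package is probably missing or its directory is not on the loader's search path.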
GPU check
check_tensorflow_gpu
~ $ python
Python 3.8.5 (default, May 27 2021, 13:30:53)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2021-07-03 01:22:51.436038: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
>>> tf.config.list_physical_devices('GPU')
2021-07-03 01:22:57.140152: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-03 01:22:57.178630: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-03 01:22:57.179008: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:0a:00.0 name: NVIDIA GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.77GHz coreCount: 40 deviceMemorySize: 7.77GiB deviceMemoryBandwidth: 417.29GiB/s
2021-07-03 01:22:57.179028: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-03 01:22:57.181244: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-03 01:22:57.181271: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-03 01:22:57.182348: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-03 01:22:57.182474: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-03 01:22:57.182747: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-03 01:22:57.183209: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-03 01:22:57.183292: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-03 01:22:57.183367: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-03 01:22:57.183750: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-03 01:22:57.184083: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
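The interactive check above can be wrapped in a small script so it is easy to rerun after a driver or library update. This is a hedged sketch: `gpu_report` is a hypothetical helper, and it assumes a TensorFlow 2.x release that provides `tf.sysconfig.get_build_info()`; the CUDA and cuDNN entries are the versions TensorFlow was built against, not necessarily the ones installed on the machine.

```python
def gpu_report():
    """Return a dict describing the TensorFlow build and visible GPUs.

    Returns None when TensorFlow cannot be imported.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Build-time information; keys may be absent on CPU-only builds,
    # so .get() is used instead of indexing.
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "cuda_build": build.get("cuda_version"),
        "cudnn_build": build.get("cudnn_version"),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    report = gpu_report()
    if report is None:
        print("TensorFlow is not installed in this environment.")
    else:
        for key, value in report.items():
            print(f"{key}: {value}")
```

An empty `gpus` list means TensorFlow imported correctly but did not detect a usable GPU, which usually points back to the CUDA or cuDNN installation steps.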
If you are stuck on errors
Before trying anything else, I recommend running the following.
fix_bugs
$ reboot
Conclusion
Setting up an environment really is hard.