Correctly configured NVIDIA drivers
Please skip this section if your system doesn’t have an NVIDIA GPU.
This section’s main purpose is to help diagnose and solve the problem of getting
False when running:
import torch print(torch.cuda.is_available())
nvidia-smi working just fine. Which means that
pytorch can’t find the NVIDIA drivers that match the currently active kernel modules.
pytorch installs itself as
torch. So we refer to the project and its packages as
pytorch, but inside python we use it as
First, starting with
pytorch-1.0.x it doesn’t matter which CUDA version you have installed on your system, always try first to install the latest
cuda92 - it has all the required libraries built into the package. However, note, that you most likely will need 396.xx+ driver for
pytorch built with
cuda92. For older drivers you will probably need to install
cuda90 or ever earlier.
The only thing you to need to ensure is that you have a correctly configured NVIDIA driver, which usually you can test by running:
nvidia-smi in your console.
If you have
nvidia-smi working and
pytorch still can’t recognize your NVIDIA CPU, most likely your system has more than one version of NVIDIA driver installed (e.g. one via
apt and another from source). If that’s the case, wipe any remnants of NVIDIA drivers on your system, and install one NVIDIA driver of your choice. The following are just the guidelines, you will need to find your platform-specific commands to do it right.
sudo apt-get purge nvidia-*
Note, not remove, but purge! purge in addition to removing the package, also removes package-specific configuration and any other files that were created by it.
Once you uninstalled the old drivers, make sure you don’t have any orphaned NVIDIA drivers on your system remaining from manual installs. Usually it’s enough to run:
find /usr/ | grep libcuda.so
or a more sweeping one (if your system updates the
locate -e libcuda.so
If you get any listing as a result of running of these commands, that means you still have some orphaned nvidia driver on your system.
Next, make sure you clear out any remaining dkms nvidia modules:
dkms status | grep nvidia
If after uninstalling all NVIDIA drivers, you still get the command listing dkms nvidia modules, uninstall those too.
For example if the output is:
nvidia-396, 396.44, 4.15.0-34-generic, x86_64: installed nvidia-396, 396.44, 4.15.0-36-generic, x86_64: installed
dkms remove nvidia/396.44 --all
dkmsmanual for the complete details if need be.
Install NVIDIA drivers according to your platform’s way (need 396.xx+ driver for
Check that your
nvidia-smiworks, and then check that
import torch print(torch.cuda.is_available())
Also note that
pytorch will silently fallback to CPU if it reports
False, so the only indicator of something being wrong will be that your notebooks will be running very slowly and you will hear your CPU revving up (if you are using a local system). Run:
python -c 'import fastai; fastai.show_install(1)'
to detect such issues. If you have this problem it’ll say that your torch cuda is not available.
If you’re not sure which nvidia driver to install here is a reference table:
|CUDA Toolkit||Linux x86_64||Windows x86_64|
|CUDA 10.0.130||>= 410.48||>= 411.31|
|CUDA 9.2||>= 396.26||>= 397.44|
|CUDA 9.0||>= 384.81||>= 385.54|
|CUDA 8.0||>= 367.48||>= 369.30|
You can find a complete table with extra variations here.
libcuda.so.1: cannot open shared object file
This section is only relevant if you build
pytorch from source -
pytorch conda and pip packages link statically to
libcuda and therefore
libcuda.so.1 is not required to be installed.
If you get an error:
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
that means you’re missing
libcuda.so.1 from your system.
On most Linux systems the
libcuda package is optional/recommended and doesn’t get installed unless explicitly instructed to do so. You need to install the specific version matching the
nvidia driver. For the sake of this example, let’s assume that you installed
nvidia-396 debian package.
find the apt package the file belongs to -
$ dpkg -S libcuda.so.1 libcuda1-396: /usr/lib/i386-linux-gnu/libcuda.so.1 libcuda1-396: /usr/lib/x86_64-linux-gnu/libcuda.so.1
- check whether that package is installed - no, it is not installed:
$ apt list libcuda1-396 Listing... Done
- let’s install it:
$ apt install libcuda1-396
- check whether that package is installed - yes, it is installed:
$ apt list libcuda1-396 Listing... Done libcuda1-396/unknown,now 396.44-0ubuntu1 amd64 [installed]
Now, you shouldn’t have this error anymore when you load
To check which
nvidia driver the installed package depends on, run:
$ apt-cache rdepends libcuda1-396 | grep nvidia nvidia-396
To check that
nvidia-396 is installed:
$ apt list nvidia-396 nvidia-396/unknown,now 396.44-0ubuntu1 amd64 [installed,automatic]
In your situation, change
396 to whatever version of the driver you’re using.
This should be more or less similar on the recent Ubuntu Linux versions, but could vary on other systems. If it’s different on your system, find out which package contains
libcuda.so.1 and install that package.
Do not mix conda-forge packages
fastai depends on a few packages that have a complex dependency tree, and
fastai has to manage those very carefully, so in the conda-land we rely on the anaconda main channel and test everything against that.
If you install some packages from the
conda-forge channel you may have problems with
fastai installation since you may have wrong versions of packages installed.
So see which packages need to come from the main conda channel do:
conda search --info -c fastai "fastai=1.0.7"
1.0.7 with the version you’re trying to install. (Without version it’ll show you all versions of
fastai and its dependencies)
Can’t install the latest fastai conda package
Say there is
fastai-1.0.12 out there, but when you run:
conda install -c fastai fastai
It installs only
This means that you previously installed some conda package that for example requires
spacy <=2.0.15, and now you’re trying to install
fastai==1.0.12 which requires
spacy==2.0.16 there is a conflict, and conda will search older
fastai packages until it finds one where the dependencies don’t conflict. And
fastai-1.0.6 happens to be that.
One solution is to find the package that causes a dependency conflict and update/remove it. Unfortunately the
conda client doesn’t tell you where the conflict lies and silently downgrades or tells you it can’t install it. So there is no easy way to find out which package causes the conflict.
A much simpler solution is to create a fresh conda environment and not install anything there, other than
fastai and its requirements, and keep it that way. If you do that, you will not have any such conflicts in the future. Of course, you can install other packages into that environment, but keep track of what you install so that you could revert them in the future if such conflict arises again.
And of course it’d be nice if
conda could just tell the user which package causes the conflict.
fastai has a relatively complex set of python dependencies, and it’s the best not to install those system-wide, but to use a virtual environment instead (
[conda](https://conda.io/docs/user-guide/tasks/manage-environments.html) or others). A lot of problems disappear when a fresh dedicated to
fastai virtual environment is created.
The following example is for using a conda environment.
First you need to install miniconda or anaconda. The former comes with bare minimum of packages preinstalled, the latter has hundreds more. If you haven’t changed the default configuration, miniconda usually ends up under
~/miniconda3/, and anaconda under
Once you have the software installed, here is a quick way to set up a dedicated environment for just
python-3.6 (of course feel free to name it the way you want it to):
conda update conda conda create -y python=3.6 --name fastai-3.6 conda activate fastai-3.6 conda install -y conda conda install -y pip setuptools
Now you can install
fastai prerequisites and itself using
The only thing you need to remember when you start using a virtual environment is that you must activate it before using it. So for example when you open a new console and want to start
jupyter, instead of doing:
you’d change your script to:
conda activate fastai-3.6 jupyter notebook
sometimes when you’re outside of conda the above doesn’t work and you need to do:
source ~/anaconda3/bin/activate fastai-3.6 jupyter notebook
(of course adjust the path to your conda installation if need to).
Virtual environments provide a lot of conveniences - for example if you want to have a stable env and an experimental one you can clone them in one command:
conda create --name fastai-3.6-experimental --clone fastai-3.6
or say you want to see how the well-working python-3.6 env will work with python-3.7:
conda create --name fastai-3.7 --clone fastai-3.6 conda install -n fastai-3.7 python=3.7 conda update -n fastai-3.7 --all
If you use advanced bash prompt functionality, like with git-prompt, it’ll now tell you automatically which environment has been activated, no matter where you’re on your system. e.g. on my setup it shows:
/fastai:[master|✚1…4⚑3] > conda activate (base) /fastai:[master|✚1…4⚑3] > conda activate fastai-3.6 (fastai-3.6) /fastai:[master|✚1…4⚑3] > conda deactivate /fastai:[master|✚1…4⚑3] >
I tweaked the prompt output for this example by adding whitespace to align the entries to make it easy to see the differences. That leading white space is not there normally. Besides the virtual env, it also shows me which git branch I’m on, and various git status information.
So now you don’t need to guess and you know exactly which environment has been activated if any before you execute any code.
Am I using my GPU(s)?
It’s possible that your system is misconfigured and while you think you’re using your GPU you could be running on your CPU only.
You can check that by checking the output of
import torch; print(torch.cuda.is_available()) - it should return
pytorch sees your GPU(s). You can also see the state of your setup with:
python -c 'import fastai; fastai.show_install(1)'
which will include that check in its report.
But the simplest direct check is to observe the output of
nvidia-smi while you run your code. If you don’t see the process show up when you run the code, then you aren’t using your GPU. The easiest way to watch the updated status is through:
watch -n 1 nvidia-smi
If you’re on a local system, another way to tell you’re not using your GPU would be an increased noise from your CPU fan.
Please skip this section unless you have post- successful install update issues.
Normally you’d update
fastai by running
pip install -U fastai or
conda update fastai, using the same package manager you used to install it in first place (but in reality, either will work).
If you use the developer setup, then you need to do a simple:
cd path/to/your/fastai/clone git pull
Sometimes jupyter notebooks get messed up, and
git pull might fail with an error like:
error: Your local changes to the following files would be overwritten by merge: examples/cifar.ipynb Please, commit your changes or stash them before you can merge. Aborting
then either make a new
fastai clone (the simplest), or resolve it: disable the nbstripout filter, clean up your checkout and re-enable the filter.
tools/trust-origin-git-config -d git stash git pull tools/trust-origin-git-config
git stash pop if you made some local changes and you want them back.
You can also overwrite the folder if you have no changes you want to keep with
git checkout examples in this case, or do a reset - but you have to do it after you disabled the filters, and then remember to re-enable them back. The instructions above do it in a non-destructive way.
The stripout filter allows us to collaborate on the notebooks w/o having conflicts with different execution counts, locally installed extensions, etc., keeping under git only the essentials. Ideally, even the
outputs should be stripped, but that’s a problem if one uses the notebooks for demo, as it is the case with
Managing Multiple Installations
It’s possible to have multiple
fastai installs - usually in different conda environments. And when you do that it’s easy to get lost in which environment of
fastai you currently use.
conda activate wanted-env here are some tips to find your way around:
Tell me which environment modules are imported from:
import sys print(sys.path)
Tell me which
fastai library got loaded (we want to know the exact location)
import sys, fastai print(sys.modules['fastai'])
At times a quick hack can be used to get your first notebook working and then sorting out the setup. Say you checked out
cd /tmp/ git clone https://github.com/fastai/fastai cd fastai
So now you know that your uninstalled
fastai is located under
/tmp/fastai/. Next, put the following on the very top of your notebook:
import sys sys.path.append("/tmp/fastai") import fastai
and it should just work. Now, go and sort out the rest of the installation, so that you don’t need to do it for every notebook.
Conda environments not showing up in Jupyter Notebook
While normally you shouldn’t have this problem, and all the required things should get installed automatically, some users report that their jupyter notebook does not recognize newly created environments at times. They reported the following to work:
conda activate fastai-3.6 conda install jupyter conda install nb_conda conda install nb_conda_kernels conda install ipykernel python -m ipykernel install --user --name fastai-3.6 --display-name "Python (fastai-3.6)"
See also Kernels for different environments.
Before making a new issue report, please:
- Make sure you have the latest
pip, depending on the package manager you use:
pip install pip -U conda install conda
and then repeat the steps and see whether the problem you wanted to report still exists.
Make sure your platform is supported by the preview build of
pytorch-1.0.0. You may have to build
pytorchfrom source if it isn’t.
- Make sure you follow the exact installation instructions. If you improvise and it works that’s great, if it fails please RTFM ;)
If you followed the steps in this document and couldn’t find a resolution, please post a comment in this thread.
If the issue is still relevant, make sure to include in your post:
- the output of the following script (including the ```text opening and closing ``` so that it’s formatted properly in your post):
git clone https://github.com/fastai/fastai cd fastai python -c 'import fastai; fastai.show_install(1)'
If you already have a
fastaicheckout, then just update it first:
cd fastai git pull python -c 'import fastai; fastai.show_install(1)'
The reporting script won’t work if
pytorchwasn’t installed, so if that’s the case, then send in the following details:
- output of
- your OS: linux/osx/windows / and linux distro+version if relevant
- output of
nvidia-smi(or say CPU if none)
- output of
- a brief summary of the problem
- the exact installation steps you followed
If the resulting output is very long, please paste it to https://pastebin.com/ and include a link to your paste
Do’s and Don’ts:
please do not send screenshots with trace/error messages - we can’t copy-n-paste from the images, instead paste them verbatim into your post and use the markdown gui menu so that it’s code-formatted.
If your system is configured to use a non-English locale, if possible, re-run the problematic code after running:
So that the error messages will be in English. You can run
localeto see which locales you have installed.
Bug Reports and PRs
If you found a bug and know how to fix it please submit a PR with the fix here. Thank you.