If you are behind a proxy and need to reach the Docker registry through it, or you have multiple machines pulling the same images over and over (CI/CD, ML/DL, etc.) and just want to cache them locally, running a local registry as a proxy cache is a good choice.

Create a folder named docker-registry-local-cache, create a docker-compose.yml file in it as follows, and customize it with your own environment variables.

vi docker-compose.yml


version: "2"
services:
  registry2:
    image: registry:2
    ports:
      - 5000:5000
    environment:
      - REGISTRY_PROXY_REMOTEURL="https://registry-1.docker.io"
      - HTTP_PROXY=example.com:80
      - HTTPS_PROXY=example.com:80
      - NO_PROXY="localhost,127.0.0.1,10.0.0.7"
      - no_proxy="localhost,127.0.0.1,10.0.0.7"
    volumes:
      - /opt/registry:/var/lib/registry

Run the container with

docker-compose up -d

Run
docker logs dockerregistrylocalcache_registry2_1
and you should see the following
time="2018-04-04T23:18:28Z" level=info msg="Registry configured as a proxy cache to https://registry-1.docker.io" go.version=go1.7.6 instance.id=76e861f3-cd4c-463e-880f-847d152cb565 version=v2.6.2
time="2018-04-04T23:18:28Z" level=info msg="listening on [::]:5000" go.version=go1.7.6 instance.id=76e861f3-cd4c-463e-880f-847d152cb565 version=v2.6.2

Run
curl http://10.0.0.7:5000/v2/_catalog
which should output something similar to this
{"repositories":[]}

Next configure your docker client to use this mirror. See this previous post on how to do that.

Once the client side is configured, you can pull an image from Docker Hub via your local mirror. For example, run
docker pull ubuntu:17.10
like you normally would, then run
curl http://10.0.0.7:5000/v2/_catalog
again to see the following
{"repositories":["library/ubuntu"]}
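To script this check, for example from a CI job, something like the following sketch could work. It assumes the compact one-line JSON shown above and does a plain text match rather than real JSON parsing; catalog_has_repo is a hypothetical helper name, not part of the registry or Docker tooling.

```shell
# catalog_has_repo REPO: read a /v2/_catalog response on stdin and succeed
# only if REPO appears as a complete quoted entry (hypothetical helper).
# This is a fixed-string match, so "library/ubuntu" does not match
# "library/ubuntu-something".
catalog_has_repo() {
    local repo="$1"
    grep -qF "\"${repo}\""
}
```

Usage would look like: curl -s http://10.0.0.7:5000/v2/_catalog | catalog_has_repo library/ubuntu && echo "cached"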

This should significantly improve the speed of any subsequent pull from the local clients. Hope someone finds this useful.

Create the folder /etc/systemd/system/docker.service.d/ if it does not exist yet.

To add a local registry mirror, add a file named override.conf to the folder with the following config


sudo mkdir -p /etc/systemd/system/docker.service.d/
sudo vi /etc/systemd/system/docker.service.d/override.conf


[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --registry-mirror=http://10.0.0.7:5000
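As an alternative to overriding ExecStart, the Docker daemon can also read the mirror from the "registry-mirrors" key in /etc/docker/daemon.json. The sketch below is only an illustration: write_mirror_config is a hypothetical helper, it overwrites the target file, and it is shown writing to a scratch file so nothing on the system changes. Use one mechanism or the other, since dockerd refuses to start when the same option is set both as a flag and in daemon.json.

```shell
# write_mirror_config FILE URL: write a minimal daemon.json that sets a single
# registry mirror. Hypothetical helper; it overwrites FILE, so merge by hand
# if you already have a daemon.json with other settings.
write_mirror_config() {
    local file="$1" url="$2"
    cat > "$file" <<EOF
{
  "registry-mirrors": ["${url}"]
}
EOF
}

# Try it on a scratch file first and inspect the result.
scratch="$(mktemp)"
write_mirror_config "$scratch" "http://10.0.0.7:5000"
cat "$scratch"
```

Once the content looks right, the same call can target /etc/docker/daemon.json (as root), followed by a restart of the Docker service.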

To add a proxy to Docker, add http-proxy.conf with the following config


sudo vi /etc/systemd/system/docker.service.d/http-proxy.conf


[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80/" "HTTPS_PROXY=http://proxy.example.com:80/" "NO_PROXY=localhost,127.0.0.0,10.0.0.7"

I am excluding 10.0.0.7 because that is my local registry mirror 😉. In both cases you need to reload systemd and restart the Docker daemon for the changes to take effect.


sudo systemctl daemon-reload
sudo service docker restart

After the restart, run
sudo docker system info
and check the output to confirm the changes have taken effect


..
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://proxy.example.com:80/
Https Proxy: http://proxy.example.com:80/
No Proxy: localhost,127.0.0.0,10.0.0.7
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
http://10.0.0.7:5000/
Live Restore Enabled: false
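If you want to check for the mirror from a script instead of reading the output by eye, a rough sketch follows. It assumes the text layout shown above, where the mirror URLs are listed on the lines right after "Registry Mirrors:"; info_lists_mirror is a hypothetical name, and the layout may differ between Docker versions.

```shell
# info_lists_mirror URL: read `docker system info` text on stdin and succeed
# if a line in the "Registry Mirrors:" section starts with URL.
info_lists_mirror() {
    local url="$1"
    awk -v url="$url" '
        # The section ends at the first line that is not a URL.
        in_section && $1 !~ /^https?:\/\// { in_section = 0 }
        in_section && index($1, url) == 1  { found = 1 }
        /Registry Mirrors:/                { in_section = 1 }
        END { exit !found }
    '
}
```

Usage would look like: sudo docker system info | info_lists_mirror http://10.0.0.7:5000 && echo "mirror active"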

Since it's been a while, I decided to upgrade my ML box to CUDA 9.0. Man, that was fun: lots of googling, multiple visits to the Ubuntu and NVIDIA forums, and reading up on several blog posts and Stack Overflow answers. Almost at the end of a long day, I am running CUDA 9.0, cuDNN 7, and GPU-enabled TensorFlow 1.5, with models in Keras 2.1.x.

The short version is that almost 80% of the problems came from lingering packages and changes made to the machine during the last install. So the key is to make sure you roll back and remove the old packages cleanly before proceeding. The final step is actually very simple. Good job, NVIDIA!

First we need to remove all the old packages installed


sudo apt-get purge 'nvidia-*' -y
sudo apt-get purge 'cuda-*' -y
sudo apt-get purge 'libcuda*' -y
sudo apt-get purge 'libcudnn*' -y
sudo apt-get autoremove -y
sudo apt-get autoclean -y
sudo apt-get update

Then remove any repos that you have added


sudo rm /etc/apt/sources.list.d/nvidia-diag-driver-local-384.66.list
sudo rm /etc/apt/sources.list.d/graphics-drivers-ubuntu-ppa-xenial.list

Then make sure there is nothing left over.


sudo dpkg --list | grep nvidia
sudo dpkg --list | grep cuda
sudo dpkg --list | grep libcudnn

If you find any leftover packages, use dpkg to remove them, for example:


sudo dpkg --purge libcudnn5
sudo dpkg --purge cuda-repo-ubuntu1604
sudo dpkg --purge cuda-cudart-8-0 cuda-cudart-dev-8-0 cuda-cufft-8-0 cuda-curand-8-0 cuda-cusolver-8-0 cuda-cusparse-8-0 cuda-npp-8-0 cuda-nvgraph-8-0 cuda-nvrtc-8-0 cuda-toolkit-9-0
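The three grep checks can be folded into one filter that prints just the package names. This is only a sketch: leftover_gpu_pkgs is a hypothetical helper, and unlike the plain grep it matches on the package name column only, not on descriptions. Rows whose status is rc are packages that were removed but still have config files around, which a purge clears.

```shell
# leftover_gpu_pkgs: read `dpkg --list` output on stdin and print the names of
# packages whose name contains nvidia, cuda, or libcudnn.
# Package rows look like "<status> <name> <version> <arch> <description>",
# so the name is the second whitespace-separated field.
leftover_gpu_pkgs() {
    awk '$2 ~ /nvidia|cuda|libcudnn/ { print $2 }'
}
```

Usage would look like: dpkg --list | leftover_gpu_pkgs, then review the list before passing the names to sudo dpkg --purge.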

Revert gcc and g++ to version 5, as the latest Theano and TensorFlow releases have been updated.


sudo ln -sf /usr/bin/gcc-5 /usr/bin/gcc
sudo ln -sf /usr/bin/g++-5 /usr/bin/g++

Now reboot the machine and, once it is back up, make sure there are no old packages left and the nvidia kernel module is not loaded


lsmod | grep nvidia

Now install the CUDA repo package and add the CUDA GPG keys before installing the CUDA meta package.


sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-9-0 -y

This option seems to have been significantly improved: it automatically installed the correct NVIDIA drivers (390.30) via the cuda-drivers package, as well as the BLAS package (cuda-cublas-9-0), without any mucking around by the user. It does take a while, though.

Once it's complete, go ahead and reboot the machine. Once it's back up, you should have the nvidia module loaded


lsmod | grep nvidia
nvidia_uvm            761856  4
nvidia_drm             40960  0
nvidia_modeset       1093632  1 nvidia_drm
drm_kms_helper        155648  1 nvidia_drm
drm                   364544  3 drm_kms_helper,nvidia_drm
nvidia              14327808  494 nvidia_modeset,nvidia_uvm
ipmi_msghandler        49152  3 ipmi_ssif,nvidia,ipmi_si

Also run nvidia-smi and nvcc -V


 nvidia-smi
Fri Mar  2 00:20:04 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30                 Driver Version: 390.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:03:00.0 Off |                  N/A |
|  0%   37C    P0    33W / 166W |      0MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1070    Off  | 00000000:04:00.0 Off |                  N/A |
|  0%   39C    P5    15W / 166W |      0MiB /  8119MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
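To confirm from a script that the toolkit on the PATH is the release you expect, the release number can be pulled out of the nvcc -V output. A minimal sketch, assuming the "Cuda compilation tools, release X.Y," line format shown above; nvcc_release is a hypothetical helper name.

```shell
# nvcc_release: read `nvcc -V` output on stdin and print the release number
# taken from the "Cuda compilation tools" line. Prints nothing if that line
# is missing.
nvcc_release() {
    sed -n 's/^Cuda compilation tools, release \([0-9][0-9.]*\),.*/\1/p'
}
```

Usage would look like: [ "$(nvcc -V | nvcc_release)" = "9.0" ] || echo "unexpected CUDA release"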

If any of the above does not work, remember to update the PATH variables in .bashrc to point to the CUDA 9.0 folder


export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/include${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME="/usr/local/cuda"
export MKL_THREADING_LAYER=GNU
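The ${PATH:+:${PATH}} part of those exports adds a colon followed by the old value only when the variable is already set and non-empty, so you never end up with a stray trailing colon (an empty entry is treated as the current directory). The sketch below wraps the same expansion in a hypothetical prepend_dir helper that also skips directories already listed, which keeps PATH from growing each time .bashrc is sourced again.

```shell
# prepend_dir DIR LIST: print the colon-separated LIST with DIR prepended,
# unless DIR is already one of its entries (hypothetical helper).
prepend_dir() {
    local dir="$1" list="$2"
    case ":${list}:" in
        *":${dir}:"*) printf '%s\n' "$list" ;;                 # already present
        *) printf '%s\n' "${dir}${list:+:${list}}" ;;          # same idiom as above
    esac
}
```

Usage would look like: export PATH="$(prepend_dir /usr/local/cuda-9.0/bin "$PATH")"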

I will talk about installing theano/tensorflow and keras in another post.

I finally upgraded from my previous GTX 980 Ti to a GTX 1070 last week. Unfortunately, that meant revisiting some of my previous issues with Ubuntu and various incompatibilities among the graphics drivers and CUDA components. In any case, I decided that this time I would document this stuff more cleanly so I can refer to it later.

My Setup:

My previous primary desktop, currently re-purposed for machine learning, Docker experiments, etc.

Hardware:

  • Intel(R) Core(TM) i7-3770K CPU
  • ASUS MAXIMUS IV GENE-Z Motherboard
  • Nvidia GTX 1070

Software:

  • Ubuntu 16.04
  • Nvidia driver 367.35
  • CUDA 8.0 RC
  • Anaconda/Theano/Keras native and as docker containers

Steps:

After replacing the 980 Ti card with the 1070, I rebooted the machine and it just went into a crash and backtrace loop, as the previous NVIDIA driver, nvidia-352, did not support the GTX 1070.

Step one was recovering the system: boot into recovery mode, enable networking (optional), and drop into a root shell.

sudo apt-get purge 'nvidia-*'
sudo apt-get autoremove
sudo reboot

This will remove the previous nvidia drivers and dependencies and allow you to do a fresh install of the drivers.

If you haven’t done so already, make sure you are running gcc 4.9 to avoid compile errors with Theano

sudo apt-get install gcc-4.9 g++-4.9
sudo ln -sf /usr/bin/gcc-4.9 /usr/bin/gcc
sudo ln -sf /usr/bin/g++-4.9 /usr/bin/g++

Download CUDA 8.0 RC, choosing the runfile (local) installer. When installing CUDA 8.0, decline the option to install the NVIDIA drivers, then reboot.

Install the NVIDIA 367.35 drivers

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-367
sudo reboot

At this point you should be able to run nvidia-smi and get some results like this

Thu Aug  4 01:19:40 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.35                 Driver Version: 367.35                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:01:00.0     Off |                  N/A |
|  0%   40C    P8    11W / 166W |    103MiB /  8112MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      4921    C   ...y/anaconda3/envs/keras104_py27/bin/python   101MiB |
+-----------------------------------------------------------------------------+
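Since the original crash came from a driver that was too old for the card, it can be handy to check the installed driver version from a script. The sketch below is an illustration only: driver_at_least is a hypothetical helper, the minimum you pass in is your own choice, and it relies on the version ordering (-V) of GNU sort. The version string itself can be obtained with nvidia-smi --query-gpu=driver_version --format=csv,noheader, which prints one line per GPU; only the first line is used here.

```shell
# driver_at_least MIN: read a driver version such as 367.35 from the first
# line of stdin and succeed if it is greater than or equal to MIN.
driver_at_least() {
    local min="$1" have=""
    read -r have
    [ -n "$have" ] || return 1
    # With version sort, MIN comes first exactly when MIN <= installed version.
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}
```

Usage would look like: nvidia-smi --query-gpu=driver_version --format=csv,noheader | driver_at_least 367.35 || echo "driver too old"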

At this point make sure you have the binary and library paths set up correctly and that nvcc is working fine. Adding the following to your .bashrc should do the trick.

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

nvcc -V should give you

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26

and you should be able to run the example here and get “Used the gpu” as output.

My several hours of research in a 10min post 🙂

I was re-purposing my old desktop for machine learning with GPU support for Theano and Keras. I ran into several issues and ended up writing some code to work around some of them and make others easier and more manageable. Someday I will write a more detailed series of articles on how I did that and what I learnt in the process, but today I just wanted to document the last steps I ran into when trying to convert my Anaconda-based ad hoc IPython notebook server into a persistent service.

The initial logic came from this blog post. However, I ran into several issues: first, the config there is for native IPython, not Anaconda-based IPython, and second, it did not pull in the environment variables Theano needs to pull in the nvcc compiler optimizations required for GPU support.

Here is the final config of the file /etc/systemd/system/ipython-nb-srv.service

[Unit]
Description=Jupyter Notebook Server

[Service]
Type=simple
Environment="PATH=/home/ipynbusr/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
ExecStart=/home/ipynbusr/anaconda3/bin/jupyter-notebook
User=ipynbusr
Group=ipynbusr
WorkingDirectory=/home/ipynbusr

[Install]
WantedBy=multi-user.target

After this, reload systemd, then enable and start the service

systemctl daemon-reload
systemctl enable ipython-nb-srv
systemctl start ipython-nb-srv
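If you need the same service for a different account or Anaconda location, the unit file can be generated rather than edited by hand. This is a sketch under assumptions: render_notebook_unit is a hypothetical helper, the home directory is assumed to be /home/USER, and the PATH is trimmed to the Anaconda bin directory plus the usual system directories.

```shell
# render_notebook_unit USER ANACONDA_DIR: print a unit file like the one above
# for the given account and Anaconda install directory (hypothetical helper).
render_notebook_unit() {
    local user="$1" conda="$2"
    cat <<EOF
[Unit]
Description=Jupyter Notebook Server

[Service]
Type=simple
Environment="PATH=${conda}/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
ExecStart=${conda}/bin/jupyter-notebook
User=${user}
Group=${user}
WorkingDirectory=/home/${user}

[Install]
WantedBy=multi-user.target
EOF
}
```

Usage would look like: render_notebook_unit ipynbusr /home/ipynbusr/anaconda3 | sudo tee /etc/systemd/system/ipython-nb-srv.service, followed by the daemon-reload step.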

Some of the previous high-level steps are

  1. Install cuda packages for ubuntu
  2. Install Anaconda
  3. Create a py env
  4. Install the required packages (Theano, keras, numpy, scipy, ipython-notebook etc)
  5. Create .theanorc to make sure theano uses gpu
  6. Create .ipython notebook profile to run as server
  7. Create ipython notebook server service
  8. Enjoy your IPython notebooks from a Chromebook or Windows machine 🙂
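Step 5 can be sketched roughly as follows. Treat the contents as an assumption to adapt: older Theano releases select the GPU with device = gpu, while releases using the newer gpuarray backend expect device = cuda, so the right value depends on the version you installed. write_theanorc is a hypothetical helper, and it is shown writing to a scratch file rather than to ~/.theanorc.

```shell
# write_theanorc FILE DEVICE: write a minimal Theano config that selects
# DEVICE (for example gpu or cuda) and 32-bit floats. Hypothetical helper;
# it overwrites FILE.
write_theanorc() {
    local file="$1" device="$2"
    cat > "$file" <<EOF
[global]
device = ${device}
floatX = float32
EOF
}

# Write to a scratch file first and inspect it before touching ~/.theanorc.
theanorc_scratch="$(mktemp)"
write_theanorc "$theanorc_scratch" gpu
cat "$theanorc_scratch"
```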