Glamdring
Glamdring (also called the Foe-Hammer) was a hand-and-a-half sword, forged for Turgon, the King of Gondolin during the First Age, and later owned by the wizard Gandalf. 1
It is also the hostname of the 4-GPU workstation that the lab acquired for deep learning and other applications that need more computational resources than an individual workstation provides, but do not require the Fulton Supercomputing Lab. Glamdring resides in the basement of the Talmage Building, in room 1059 TMCB. If you require physical access to the machine, you will need to be escorted by one of the CSRs from the BYU CS department who has access.
Specifications
Resource Type | Description | Quantity |
---|---|---|
Motherboard | MSI X99S XPower Gaming Titanium | 1 |
CPU | Intel Core i7 6950X - 10 Cores / 20 Threads @ 3.0 GHz | 1 |
RAM | G.SKILL TridentZ 16GB 288-Pin DDR4 3200MHz | 8: 128GB |
Storage | Samsung 850 EVO 4TB 2.5-Inch SATA III | 1: 4 TB |
GPU | NVIDIA/MSI GTX 1080 SEA HAWK X w/ 200CFM Fans | 4 |
PSU | SilverStone SST-ST1500 1500W 80 Plus Silver | 1 |
Case | Thermaltake Core X5 | 1 |
Software
Type | Name | Version |
---|---|---|
OS | Ubuntu Server | 16.04 LTS |
GPU drivers | Nvidia | 390.30 |
GPU development environment | Nvidia Cuda Toolkit | 10.2 |
GPU-accelerated library | Nvidia CuDNN | 7.6.5 for CUDA 10.2 |
DL framework | TensorFlow | 1.5.0 |
DL framework | Keras | 2.1.3 |
DL framework | Theano | 1.0.1 |
DL framework | PyTorch | 0.3.1 |
The versions of CUDA, CuDNN, the Nvidia drivers, and TensorFlow all need to be compatible with one another, and I found the combination above to be the most compatible at the time of installation. The other frameworks (Keras, Theano, PyTorch, et al.) can probably be updated as needed, but the Nvidia components need to be updated carefully to avoid poisoning the environment.
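Before touching any of the Nvidia pieces, it can help to compare what is actually installed against the versions pinned in the table above. The following is a minimal sketch, not the procedure used during the original installation. The helper name `check_pin` is made up for illustration, and the `nvidia-smi` query assumes the driver is already installed and working.

```shell
#!/usr/bin/env bash
# Sketch: compare an installed version against a pinned one.
# check_pin is a hypothetical helper, not part of any Nvidia tooling.

# Usage: check_pin <name> <expected-version> <actual-version>
check_pin() {
    local name="$1" expected="$2" actual="$3"
    if [ "$expected" = "$actual" ]; then
        echo "OK $name $actual"
    else
        echo "MISMATCH $name expected=$expected actual=${actual:-unknown}"
        return 1
    fi
}

# Only query the driver if nvidia-smi is on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    # The query prints one line per GPU; the first line is enough here.
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    check_pin "nvidia-driver" "390.30" "$driver" || true
fi
```

The same helper could be fed the CUDA or TensorFlow version once you have a way of reading it, so a single script reports every mismatch before an upgrade is attempted.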
Installation of everything
Getting the machine up and running was a bit of a beast. I generally followed the advice of the following list of internet articles/blog posts, in addition to my own internet sleuthing:
- Build and Setup Your Own Deep Learning Server From Scratch
- GitHub: floydhub/dl-setup
- The $1700 great Deep Learning box: Assembly, setup and benchmarks
- Deep learning setup for Ubuntu 16.04: Tensorflow 1.2, keras, opencv3, python3, cuda8 and cudnn5.1
- Installing Tensorflow
Note that most of these were already a bit out of date when this machine was first set up in February 2018, and they will only get more so, but they give a general process to follow once you have the version numbers figured out.
- Install Ubuntu Server LTS.
  - I originally tried Lubuntu, but its graphical desktop environment (DE) seemed to cause problems when installing the Nvidia drivers. Just use Ubuntu Server, which is lighter than even Lubuntu because it has no DE. I also remember that a DE started up after the Nvidia driver was installed, so you might need to disable it by following the instructions here: https://askubuntu.com/questions/16371/how-do-i-disable-x-at-boot-time-so-that-the-system-boots-in-text-mode/79682#79682
    You need to tell systemd to not load the graphical login manager:
        sudo systemctl enable multi-user.target --force
        sudo systemctl set-default multi-user.target
    You will still be able to use X by typing startx after you have logged in.
- Install dependencies like cmake, gcc, etc. (check the articles above for details)
- Install the Nvidia drivers, something like:
        sudo add-apt-repository ppa:graphics-drivers/ppa
        sudo apt-get update
        sudo apt-get install nvidia-390  # Change 390 as needed
- Install the needed version of CUDA (check the TensorFlow installation instructions for the version you want to work with)
- Install CuDNN in accordance with the version of CUDA and TensorFlow to be used
- Install any further dependencies (OpenBLAS, Numpy, etc.)
- Install TensorFlow (tensorflow-gpu)
- Install the rest of the deep learning frameworks
- Do cool stuff.
If you need to install this stuff for the first time, I'd highly recommend reading through all of the above articles (and others) to get a feel for the process, and then going through it carefully.
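Once the frameworks are installed, a quick sanity check is to ask each one whether it can see the GPUs. The sketch below is one way to do that and is not part of the original setup. `gpu_report` is a hypothetical helper name, and it assumes the frameworks were installed for the Python interpreter you pass in (`python3` by default).

```shell
#!/usr/bin/env bash
# Sketch: ask each framework whether it can see the GPUs.
# gpu_report is a hypothetical helper name; the Python calls are the
# frameworks' own (TensorFlow's device_lib, PyTorch's torch.cuda).

# Usage: gpu_report [python-interpreter]
gpu_report() {
    local py="${1:-python3}"
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "$py: not found"
        return 0
    fi
    # TensorFlow: list local devices and keep those of type "GPU".
    "$py" -c 'from tensorflow.python.client import device_lib
gpus = [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]
print("tensorflow GPUs:", gpus)' 2>/dev/null || echo "tensorflow: import or device query failed"
    # PyTorch: report whether CUDA is usable and how many devices it sees.
    "$py" -c 'import torch
print("pytorch CUDA available:", torch.cuda.is_available(), "count:", torch.cuda.device_count())' 2>/dev/null || echo "pytorch: import or device query failed"
}
```

Running `gpu_report` with no arguments on Glamdring should, if everything is compatible, show each framework reporting the machine's four GPUs. A failure message usually points back at a driver, CUDA, or CuDNN mismatch.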
Accessing Glamdring
Note
This currently only works from within the CS department network, so check out the page on using the VPN in the Getting Started section.
Password authentication is disabled, so you will need an SSH private/public key pair, which can be generated on your local machine as described on the Getting Started - SSH page.
Talk to Sean Lane about having an account created for you, and send him your public key (probably at ~/.ssh/id_rsa.pub), which will be used to authenticate your SSH session. Finally, connect with the following command:
    $ ssh <your-username-here>@192.168.17.19
Danger
Never send your private key to anyone unless you know what you're doing. Anyone who has access to your private key can authenticate with it. Consider also encrypting it with a password, as described in the instructions, if using a shared computer.
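As a rough illustration of the advice above, the following sketch tightens the permissions on a private key and notes the commands for adding a passphrase and printing the public half. `lock_down_key` is a hypothetical helper name, and the key path `~/.ssh/id_rsa` is only the likely default mentioned earlier; adjust it to wherever your key actually lives.

```shell
#!/usr/bin/env bash
# Sketch: basic hygiene for an SSH key pair.
# lock_down_key is a hypothetical helper name.

# Restrict a private key so only its owner can read or write it.
lock_down_key() {
    local key="$1"
    [ -f "$key" ] || { echo "no such key: $key" >&2; return 1; }
    chmod 600 "$key"
}

# Typical usage (run by hand; the second command prompts interactively):
#   lock_down_key ~/.ssh/id_rsa
#   ssh-keygen -p -f ~/.ssh/id_rsa    # add or change the passphrase
#   cat ~/.ssh/id_rsa.pub             # the public half is the only part to share
```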
Useful commands
Check the state of the GPUs (temperatures, workloads) with the command:
    $ nvidia-smi
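When only a few numbers matter, `nvidia-smi` can also emit CSV that is easy to filter. The sketch below flags GPUs at or above a temperature threshold. `hot_gpus` is a hypothetical helper, and the threshold of 80 is an arbitrary example value, not a limit taken from the hardware documentation.

```shell
#!/usr/bin/env bash
# Sketch: flag GPUs running at or above a temperature threshold.
# hot_gpus is a hypothetical helper; 80 is an arbitrary example threshold.

# Reads "index, temperature, utilization" CSV lines on stdin and prints
# the ones whose temperature (deg C) is at or above the threshold in $1.
hot_gpus() {
    awk -F', *' -v limit="${1:-80}" \
        '$2 + 0 >= limit + 0 { print "GPU " $1 ": " $2 " C, " $3 "% util" }'
}

# Only query the GPUs if nvidia-smi is on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,temperature.gpu,utilization.gpu \
               --format=csv,noheader,nounits | hot_gpus 80
fi
```

For a continuously refreshing view of the full table, `nvidia-smi -l 5` repeats the report every five seconds.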
Check the version of the CUDA Toolkit with:
    $ nvcc --version
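If a script needs just the release number, it can be extracted from that output. This is a small sketch that assumes `nvcc --version` prints a line containing `release X.Y` (as in `Cuda compilation tools, release 10.2, V10.2.89`). `cuda_release` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: pull the release number out of `nvcc --version` output.
# cuda_release is a hypothetical helper name.

# Reads nvcc's version text on stdin and prints the release, e.g. "10.2".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only call nvcc if it is on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | cuda_release
fi
```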