<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<meta name="author" content="November 1, 2017" />
<title>Savio SL7/installation training</title>
<style type="text/css">code{white-space: pre;}</style>
</head>
<body>
<div id="header">
<h1 class="title">Savio SL7/installation training</h1>
<h2 class="author">November 1, 2017</h2>
</div>
<h1 id="introduction">Introduction</h1>
<p>We'll spend about 30 minutes going through this material as a demonstration. I encourage you to log in to your account and try out the various examples yourself as we go through them. The remainder of the scheduled time will be open time for you to work on your installations with each other and with help from us.</p>
<p>Some of this material is based on the extensive Savio documentation we have prepared and continue to prepare, available at <a href="http://research-it.berkeley.edu/services/high-performance-computing/user-guide" class="uri">http://research-it.berkeley.edu/services/high-performance-computing/user-guide</a>. In particular we have some detailed installation instructions under the "Installing your own" tab at <a href="http://research-it.berkeley.edu/services/high-performance-computing/accessing-and-installing-software" class="uri">http://research-it.berkeley.edu/services/high-performance-computing/accessing-and-installing-software</a>.</p>
<p>The materials for this tutorial are available using git at <a href="https://github.com/ucberkeley/savio-training-sl7-2017" class="uri">https://github.com/ucberkeley/savio-training-sl7-2017</a> or simply as a <a href="https://github.com/ucberkeley/savio-training-sl7-2017/archive/master.zip">zip file</a>.</p>
<p>These <em>install.html</em> and <em>install_slides.html</em> files were created from <em>install.md</em> by running 'make' within the repository directory (see <em>Makefile</em> for details on how that creates the html files).</p>
<p>Please see this <a href="https://github.com/ucberkeley/savio-training-intro-2017/archive/master.zip">zip file</a> for materials from our introductory training on September 19, including accessing Savio, data transfer, and basic job submission.</p>
<h1 id="outline">Outline</h1>
<p>This training session will cover the following topics:</p>
<ul>
<li>Differences between SL6 and SL7 on Savio</li>
<li>Software installation
<ul>
<li>Installing third-party software</li>
<li>Installing Python and R packages that rely on third-party software
<ul>
<li>Python example</li>
<li>R example</li>
</ul></li>
</ul></li>
<li>Hands-on software installation help</li>
</ul>
<h1 id="whats-new-with-sl7-on-savio">What's new with SL7 on Savio?</h1>
<ul>
<li>Updated Intel and GNU (gcc) compilers, and updated openMPI</li>
<li>All Python and R packages will be available as soon as you load the main <code>python</code> or <code>r</code> modules</li>
<li>Updated Python (Python 3.6) and R (R 3.4.x) will be available.</li>
<li>The Python installation now uses the Anaconda distribution instead of being built from source.</li>
<li>An extensive set of machine learning/deep learning and related packages is available: tensorflow, h2o, mxnet, theano, superlearner.</li>
</ul>
<p>You can ssh to <code>sl7.brc.berkeley.edu</code> and run <code>module avail</code> to see the list. Do note that we'll update the versions in some cases before final release.</p>
<p>Please see the <a href="http://research-it.berkeley.edu/services/high-performance-computing/sl7-upgrade">SL7 migration page</a> for more details.</p>
<h1 id="modules">Modules</h1>
<p>Recall that very little software is available except via the modules system.</p>
<p>Even simple tools like gnuplot, newer versions of git, cmake, and gcc are provided through the software module farm and are not loaded by default in user environments.</p>
<h1 id="third-party-software-installation---overview">Third-party software installation - overview</h1>
<p>In general, third-party software will provide installation instructions on a webpage, GitHub README, or install file inside the package source code.</p>
<p>The key to installing on Savio is making sure everything gets installed in your own home, project, or scratch directory, and making sure that the packages on which the software depends are either installed by you or loaded from the Savio modules.</p>
<p>A common installation approach is the GNU build system (Autotools), which involves three steps: configure, make, and make install.</p>
<ul>
<li><em>configure</em>: this queries your system to find out what tools (e.g., compilers and other packages) you have available to use in building and using the software</li>
<li><em>make</em>: this compiles the source code in the software package</li>
<li><em>make install</em>: this moves the compiled code (library files and executables) and header files and the like to their permanent home</li>
</ul>
<h1 id="installation-logistics">Installation logistics</h1>
<p>We recommend you set up a directory structure that mimics how we install software on Savio. We'll assume you are installing in a group directory, but you could equally well do this in your home directory or even on scratch (albeit with the danger the installation might be purged).</p>
<p>If you haven't already set up the directory structure for your group, here are the steps:</p>
<pre><code>cd /global/home/groups/my_group
mkdir sl7
mkdir sl7/sources
mkdir sl7/modules
mkdir sl7/scripts
mkdir sl7/modfiles
chmod -R g+rwX sl7</code></pre>
<p>However, you can manage your directories as you like; the above is just a suggestion.</p>
<h1 id="third-party-software-installation---examples">Third-party software installation - examples</h1>
<p>Here are a couple of examples of installing a piece of software in a group directory.</p>
<p>I'm actually going to do the installations under SL6 because the demos were developed for SL6 and don't readily port over to SL7 for reasons that are not germane to the topic of demonstrating installation. In general what we'll do here is how you'll install things on SL7 as well.</p>
<h3 id="yaml-package-example">yaml package example</h3>
<pre><code># install yaml, an optional dependency for Python yaml package
PKG=yaml
V=0.1.7
my_group=co_stat
os=sl6
DIR=/global/home/groups/${my_group}/${os}
SRCDIR=${DIR}/sources/${PKG}/${V}
INSTALLDIR=${DIR}/modules/${PKG}/${V}
mkdir -p ${INSTALLDIR}
mkdir -p ${SRCDIR}
cd ${SRCDIR}
wget http://pyyaml.org/download/libyaml/${PKG}-${V}.tar.gz
tar -xvzf ${PKG}-${V}.tar.gz
cd ${PKG}-${V}
# --prefix is key to install in directory you have access to
./configure --prefix=$INSTALLDIR | tee ../configure-${V}.log
make | tee ../make-${V}.log
make install | tee ../install-${V}.log</code></pre>
<p>To save the information on how you installed the software, we recommend you save the code above. In this case you would save it in <code>${os}/scripts/yaml/install.sh</code>.</p>
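<p>One caveat when saving these commands as a script: a pipeline's exit status is normally that of its last command, so piping through <code>tee</code> can hide a failed <code>configure</code> or <code>make</code>. The following is a minimal sketch of the difference, assuming bash is available; <code>false</code> stands in for a failing build step, and the variable names are illustrative only.</p>

```shell
# 'false' stands in for a build step that fails.

# Default behaviour: the pipeline reports tee's exit status, hiding the failure.
status_default=0
bash -c 'false | tee /dev/null' || status_default=$?

# With pipefail, the pipeline reports the failure of the first command.
status_pipefail=0
bash -c 'set -o pipefail; false | tee /dev/null' || status_pipefail=$?

echo "default: ${status_default}, pipefail: ${status_pipefail}"
```

<p>If you run the installation as a bash script, adding <code>set -o pipefail</code> (and possibly <code>set -e</code>) near the top of <code>install.sh</code> is one way to make it stop at the first failing step while still writing the log files.</p>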
<h1 id="third-party-software-installation---examples-1">Third-party software installation - examples</h1>
<p>Here is a second example. It reuses the <code>my_group</code> and <code>os</code> variables set in the yaml example.</p>
<h3 id="geos-package-example">geos package example</h3>
<pre><code>PKG=geos
V=3.6.2
DIR=/global/home/groups/${my_group}/${os}
SRCDIR=${DIR}/sources/${PKG}/${V}
INSTALLDIR=${DIR}/modules/${PKG}/${V}
mkdir -p ${INSTALLDIR}
mkdir -p ${SRCDIR}
cd ${SRCDIR}
wget http://download.osgeo.org/${PKG}/${PKG}-${V}.tar.bz2
tar -xvjf ${PKG}-${V}.tar.bz2
cd ${PKG}-${V}
# --prefix is key to install in directory you have access to
./configure --prefix=$INSTALLDIR | tee ../configure-${V}.log
make | tee ../make-${V}.log
make install | tee ../install-${V}.log</code></pre>
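<p>For CMake, the following may work (this is not a worked example, just some template code):</p>
<pre><code>PKG=foo
INSTALLDIR=~/software/${PKG}
# run from the top of the source tree; CMAKE_INSTALL_PREFIX plays the role of --prefix
cmake -DCMAKE_INSTALL_PREFIX=${INSTALLDIR} . | tee ../cmake.log
make | tee ../make.log
make install | tee ../install.log</code></pre>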
<h1 id="third-party-software-installation---final-steps">Third-party software installation - final steps</h1>
<p>To use the software you just installed as a dependency of other software you want to install (e.g., Python or R packages), you often need to make Linux aware of the location of the following in your newly-installed software:</p>
<ul>
<li>library files (for other software to link against)</li>
<li>header files (for other software to compile against).</li>
</ul>
<p>If a library file can't be found, you'll likely see an error message such as <em><code>libfoo.so</code> not found</em>. In this case make sure the linker can find the <em>lib</em> or <em>lib64</em> directory that contains the file(s).</p>
<p>If a header file can't be found, you'll likely see messages about <em>.h</em> files not being found. In this case make sure the compiler can find the <em>include</em> directory that contains the file(s).</p>
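<p>For builds that use gcc, one way to point the compiler and linker at these directories is through environment variables. The sketch below assumes gcc's <code>CPATH</code> and <code>LIBRARY_PATH</code> variables plus the loader's <code>LD_LIBRARY_PATH</code>; the <code>PKGDIR</code> value is a hypothetical install location, and other build systems may need explicit <code>-I</code> and <code>-L</code> flags instead.</p>

```shell
# Hypothetical install location, following the directory layout suggested earlier.
PKGDIR=/global/home/groups/my_group/sl7/modules/yaml/0.1.7

# ${VAR:+:${VAR}} appends the old value only if it was non-empty, which avoids
# a trailing colon (an empty entry is treated as the current directory).
export CPATH="${PKGDIR}/include${CPATH:+:${CPATH}}"                        # header search path (gcc)
export LIBRARY_PATH="${PKGDIR}/lib${LIBRARY_PATH:+:${LIBRARY_PATH}}"       # link-time library search path (gcc)
export LD_LIBRARY_PATH="${PKGDIR}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"  # run-time library search path

echo "CPATH=${CPATH}"
echo "LIBRARY_PATH=${LIBRARY_PATH}"
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"
```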
<h1 id="installing-python-and-r-packages">Installing Python and R packages</h1>
<p>Note: in the following we'll install in our user directory; it would take some additional wrangling to install in a location available to other group members.</p>
<pre><code>module load python/2.7.8 pip
PYPKG=pyyaml
V=0.1.7
PKGDIR=/global/home/groups/${my_group}/${os}/modules/yaml/${V}
# This fails because the compiler can't find the yaml.h header file:
pip install --user ${PYPKG}
ls ~/.local/lib/python2.7/site-packages
# This fails too: the header is found, but the linker can't find libyaml.so:
pip install --user --ignore-installed --global-option=build_ext \
    --global-option="-I${PKGDIR}/include" ${PYPKG}
# This succeeds, since both the include and lib directories are given:
pip install --user --ignore-installed --global-option=build_ext \
    --global-option="-I${PKGDIR}/include" \
    --global-option="-L${PKGDIR}/lib" ${PYPKG}</code></pre>
<p>Sidenote: pyyaml is already installed for Python on SL7, which is why this demo wouldn't work on SL7.</p>
<h1 id="installing-python-and-r-packages-1">Installing Python and R packages</h1>
<p>Here's an R example. In this case compilation would go OK, except that R needs access to a geos-related binary during configuration and to the libgeos library file when it test-loads the installed rgeos package at the end of the process.</p>
<pre><code>V=3.6.2
PKGDIR=/global/home/groups/${my_group}/${os}/modules/geos/${V}
module load r/3.4.2
Rscript -e "install.packages('rgeos', repos = 'http://cran.cnr.berkeley.edu')"
# that fails as it can't find geos-config,
# an executable installed as part of geos
export PATH=${PKGDIR}/bin:${PATH}
Rscript -e "install.packages('rgeos', repos = 'http://cran.cnr.berkeley.edu')"
# that installs (almost), but fails to run when tested because
# libgeos-3.6.2.so can't be found
export LD_LIBRARY_PATH=${PKGDIR}/lib:${LD_LIBRARY_PATH}
Rscript -e "install.packages('rgeos', repos = 'http://cran.cnr.berkeley.edu')"</code></pre>
<p>You may sometimes need to use the <code>configure.args</code> or <code>configure.vars</code> arguments to <code>install.packages()</code> to provide information on the location of <code>include</code> and <code>lib</code> directories of dependencies (similar to what we did for pyyaml).</p>
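<p>As a rough illustration of <code>configure.args</code>, the sketch below assembles an <code>install.packages()</code> call that passes a flag to the package's configure script. The <code>--with-geos-config</code> flag is an assumption about what the rgeos configure script accepts, and <code>PKGDIR</code> is a hypothetical path; check the package's own installation notes for the flags it really supports. The command is printed rather than run, so the sketch works without R loaded.</p>

```shell
# Hypothetical location of a geos installation.
PKGDIR=/global/home/groups/my_group/sl7/modules/geos/3.6.2

# Build the R expression. configure.args is passed through to the package's
# configure script; --with-geos-config is an assumed flag name (see lead-in).
R_EXPR="install.packages('rgeos', repos = 'http://cran.cnr.berkeley.edu', configure.args = '--with-geos-config=${PKGDIR}/bin/geos-config')"

# Dry run: print the command instead of executing it.
echo "Rscript -e \"${R_EXPR}\""
```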
<h1 id="accessing-executables-and-libraries-at-run-time">Accessing executables and libraries at run-time</h1>
<p>The issues we saw installing rgeos were not really compilation issues, but rather issues in finding binaries and libraries at run time.</p>
<p>For binaries, Linux only looks in certain directories for executables. These directories are in your PATH variable. So in the previous example we added the directory containing geos-config to our PATH.</p>
<p>Similarly, Linux only looks in certain directories for shared library (.so) files. To tell Linux to look in additional directories, you can add them to your LD_LIBRARY_PATH variable.</p>
<p>In the rgeos case, any time we want to use the rgeos package, we need to make sure LD_LIBRARY_PATH is set so that rgeos can find <code>libgeos-3.6.2.so</code>.</p>
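<p>The PATH lookup can be checked directly from the shell. The following is a small self-contained sketch: it creates a dummy executable in a temporary directory (a stand-in for the <em>bin</em> directory of an installed package), adds that directory to PATH, and confirms the shell now finds it. The name <code>my-geos-config</code> and the temporary directory are hypothetical and used only for illustration.</p>

```shell
# Stand-in for an installed package's bin directory.
demo_bin=$(mktemp -d)

# A dummy executable; it only prints a version string.
printf '#!/bin/sh\necho 3.6.2\n' > "${demo_bin}/my-geos-config"
chmod +x "${demo_bin}/my-geos-config"

# Prepend the directory to PATH so the shell searches it first.
export PATH="${demo_bin}:${PATH}"

# 'command -v' reports where the shell finds the executable.
found=$(command -v my-geos-config)
version=$(my-geos-config)
echo "found ${found}, which reports version ${version}"
```

<p>For shared libraries, running <code>ldd</code> on a compiled <em>.so</em> file or executable lists its dependencies and flags any that the loader cannot locate as "not found", which is a quick way to see whether LD_LIBRARY_PATH is set correctly.</p>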
<h1 id="installation-for-an-entire-group">Installation for an entire group</h1>
<p>Optionally, if you'd like your group members to have write access to the directories for the software you installed, you would do this:</p>
<pre><code>PKG=geos
cd /global/home/groups/${my_group}/${os}
chmod -R g+rwX {sources,modfiles,modules,scripts}/${PKG}</code></pre>
<h1 id="setting-up-a-module-for-your-software">Setting up a module for your software</h1>
<p>You may also want to set up a module for the software you just installed. This allows you and your group members to easily set your environment so that the software is accessible for you. To do this you need to do the following.</p>
<p>First we'll need a directory in which to store our module files:</p>
<pre><code>V=3.6.2
MYMODULEPATH=/global/home/groups/${my_group}/${os}/modfiles
mkdir -p ${MYMODULEPATH}/geos
export MODULEPATH=${MODULEPATH}:${MYMODULEPATH} # good to put this in your .bashrc</code></pre>
<p>Now we create a module file for the version (or one each for multiple versions) of the software we have installed. E.g., for our geos installation we would edit <code>${MYMODULEPATH}/geos/3.6.2</code> based on looking at examples of other module files. An example module file for our geos example is <em>example-modulefile</em>.</p>
<pre><code>cp example-modulefile ${MYMODULEPATH}/geos/3.6.2</code></pre>
<p>Or see some of the Savio system-level modules in <code>/global/software/sl-7.x86_64/modfiles</code>.</p>
<pre><code>cat /global/software/sl-7.x86_64/modfiles/apps/ml/tensorflow/1.0.0-py36</code></pre>
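<p>For orientation, a module file for a group-installed package might look roughly like the one written by the sketch below. This is a hypothetical example in the Tcl-based Environment Modules format, not the contents of <em>example-modulefile</em>; the temporary directory stands in for <code>${MYMODULEPATH}</code>, and the <code>root</code> path is an assumed install location.</p>

```shell
# Scratch location standing in for ${MYMODULEPATH}.
demo_modfiles=$(mktemp -d)
mkdir -p "${demo_modfiles}/geos"

# Quoted 'EOF' so the shell leaves the Tcl text (including $root) untouched.
cat > "${demo_modfiles}/geos/3.6.2" << 'EOF'
#%Module1.0
## Hypothetical module file for a group-installed geos 3.6.2
proc ModulesHelp { } {
    puts stderr "Sets up the environment for geos 3.6.2"
}
module-whatis "geos 3.6.2 (group installation)"

set root /global/home/groups/my_group/sl7/modules/geos/3.6.2

prepend-path PATH $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
prepend-path CPATH $root/include
EOF

# Show the result.
cat "${demo_modfiles}/geos/3.6.2"
```

<p>With a file like this in <code>${MYMODULEPATH}/geos/3.6.2</code> and MODULEPATH set as above, <code>module load geos/3.6.2</code> should take care of the PATH and LD_LIBRARY_PATH settings that we made by hand in the rgeos example.</p>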
<h1 id="how-to-get-additional-help">How to get additional help</h1>
<ul>
<li>For technical issues and questions about using Savio:
<ul>
<li>brc-hpc-help@berkeley.edu</li>
</ul></li>
<li>For questions about computing resources in general, including cloud computing:
<ul>
<li>brc@berkeley.edu</li>
</ul></li>
<li>For questions about data management (including HIPAA-protected data):
<ul>
<li>researchdata@berkeley.edu</li>
</ul></li>
</ul>
</body>
</html>