The directory structure is designed so that each benchmark can be used
on its own, without any of the rest of the benchmarks. Since a lot of
code is shared between benchmarks, the main copy of the shared code is
kept in "common" directories. This code is then either copied or
linked into the specific benchmark when needed; the Makefiles handle
this automatically. Note that the infrastructure was developed under
Linux; it will require additional work to run under Windows.
Whether you have downloaded the whole suite or just one benchmark,
you should be able to build and run an implementation of a benchmark
by going to its directory, running make, and then running
"./testInputs". For example, to run sampleSort:
cd comparisonSort/sampleSort
make
./testInputs
To compile with Cilk++, the environment variable CILK needs to be set;
to compile with Cilk Plus, the environment variable CILKP needs to be
set. Otherwise the code will just compile with g++ and run
sequentially. The code seems to run faster with Cilk++.
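As an illustration, the compiler selection described above might look like the following shell logic. The CILK and CILKP variable names come from this README; the exact selection logic and compiler command lines here are assumptions, not the actual Makefile contents:

```shell
# Illustrative sketch only: pick a compiler the way the Makefiles are
# described to behave. The cilk++ / -fcilkplus invocations are assumptions.
if [ -n "$CILK" ]; then
  CXX="cilk++"               # Cilk++ build
elif [ -n "$CILKP" ]; then
  CXX="g++ -fcilkplus"       # Cilk Plus build (flag name is an assumption)
else
  CXX="g++"                  # default: sequential g++ build
fi
echo "compiling with: $CXX"
```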
Below is an outline of how the directories are structured if you
have ALL the benchmarks.
If you have just downloaded one of the benchmarks then you won't have
the TOPLEVEL directory but instead will have one of the <benchmark>
directories. All common files should already be copied into
this benchmark directory.
***********************************
TOPLEVEL
***********************************
common/
This subdirectory includes code and other files that are common across
multiple benchmarks. The code in this directory is linked into each
of the benchmarks in which it is used (or copied when making a
standalone benchmark).
./runall
Runs all the benchmarks
./clean
Removes all temporary or generated files from all subdirectories.
This can significantly reduce diskspace since the generated data files
can be large.
***********************************
TEST DATA
***********************************
testData/
This directory includes all the data and data generation code used to
generate inputs for the benchmarks.
testData/<datatype>
Includes generators for the particular type of data. Currently this
includes sequenceData, graphData, and geometryData.
testData/<datatype>/data
Includes the actual data files generated by the generators or included
as part of the benchmark suite. Run "make clean" to remove all the
generated files; these can take a lot of space.
testData/<datatype>/data/Makefile
Includes rules for making various instances of this data type.
***********************************
BENCHMARKS
***********************************
<benchmark>/
These directories contain all the files for the benchmarks. Some of
these files are linked from the COMMON or TEST DATA areas. The idea
is that, by including the linked files (or copies of them), each of
these directories is standalone (i.e. the directory can be
distributed and all its files compiled on their own).
<benchmark>/<datatype>
This is a link to testData/<datatype>. The datatype depends on the
benchmark (e.g. graphs, sequences, or geometric data). So far no
benchmark includes more than one datatype, but there is no reason one
could not.
<benchmark>/common
Code and other files that are common across implementations of the
benchmark, e.g. the check code.
<benchmark>/common/<bnchmrk>Check.C
Code for checking the correctness of the output for the benchmark.
<bnchmrk> is typically an abbreviation for <benchmark>, e.g. "isort" is short
for "integerSort", and hence the file integerSort/common/isortCheck.C.
Running "make" will typically make the check file.
It is then used in the form
"<bnchmrk>Check <inputFile> <outputFile>".
<benchmark>/common/testInputs
A script that runs the benchmark on all the inputs. This file
includes the list of input files that are part of this benchmark.
It is typically copied over to the directory for each implementation.
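As an illustration, the core of such a script might look like the loop below. The input file names and the commented-out commands are hypothetical; the real testInputs contains the benchmark's actual input list:

```shell
# Hypothetical sketch of a testInputs-style loop. The file names below are
# examples only; the real script lists the benchmark's actual inputs.
INPUTS="randomSeq_10000000_double exptSeq_10000000_double"
for f in $INPUTS; do
  echo "running on $f"
  # ./sort -o /tmp/out ../sequenceData/data/$f            # run the benchmark
  # ../common/sortCheck ../sequenceData/data/$f /tmp/out  # check the result
done
```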
<benchmark>/common/<bnchmrk>Time.C
This is a driver for running the benchmark. This can be used if the
benchmark implementation code is written in C or can be linked with C.
Otherwise the benchmark implementation might require its own driver.
***********************************
BENCHMARK IMPLEMENTATIONS
***********************************
<benchmark>/<implementation>/
These directories contain a particular implementation of the benchmark,
for example "comparisonSort/sampleSort/".
<benchmark>/<implementation>/Makefile
Running "make" should build the benchmark code and generate an
executable called <bnchmrk>. This includes linking in files from
"common" and "<benchmark>/common" if needed.
<benchmark>/<implementation>/testInputs
This file might not be in the directory before running "make", but
should be copied over by the make. It is used to run the benchmark on
all the test inputs and check the results for correctness. The
benchmark can also be run on its own on a single input, without
checking, by using <benchmark>/<implementation>/<bnchmrk> <infile>.
Here <bnchmrk> is the same abbreviation as used in the common
directory (e.g. "isort").
For example, in the directory "comparisonSort/sampleSort", run:
./sort ../sequenceData/data/randomSeq_10000000_double
which will just print out the runtime and perhaps some statistics.
Using the -o option:
./sort -o <fname> ../sequenceData/data/randomSeq_10000000_double
will output the result to the file <fname>, which can then be tested
with the check program:
../common/<bnchmrk>Check ../sequenceData/data/randomSeq_10000000_double <fname>
Note, however, that the input file might not exist yet; in that case,
go to the "data" directory and run:
make randomSeq_10000000_double
***********************************
ADDING AN IMPLEMENTATION
***********************************
Within the <benchmark> directory create a new directory for the
implementation and copy over the "testInputs" file from "common".
If your code is linkable with C++ then you should use the benchmark
driver. To do this you need to implement a function with the
interface given in "common/<bnchmrk>.h". You then need to copy
"common/<bnchmrk>Time.C" and any files it needs to your implementation
directory. The files it needs should be listed in
"common/timeRequiredFiles". Then you can compile
"<bnchmrk>Time.o" and link it with your implementation.
You should now be able to run:
./<bnchmrk> -r <nrounds> -o <outfile> <infile>
or "./testInputs".
If your code is NOT linkable with C++ then you need to create a
standalone executable named <bnchmrk> (the short name for the
benchmark). This needs to run with the command line
<bnchmrk> -r <nrounds> -o <outfile> <infile>
where <outfile> and <infile> are in the appropriate formats. The
-r <nrounds> argument specifies the number of times to run the
program. The program must write to stdout <nrounds> lines each of
the form:
PBBS-time: <time>
giving the time in seconds for one execution of the program on the
input. Other output can also be printed as long as it does not start
with "PBBS-time".
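For a concrete sense of this contract, a minimal standalone driver could be sketched in shell as follows. Only the -r/-o command line and the "PBBS-time: <time>" output format are taken from the text above; the option parsing, the coarse integer-seconds timing, and everything else are illustrative assumptions:

```shell
#!/bin/sh
# Hypothetical standalone driver obeying:
#   <bnchmrk> -r <nrounds> -o <outfile> <infile>
# A real driver would use a finer-grained clock than whole seconds.
rounds=1
outfile=""
while getopts "r:o:" opt; do
  case $opt in
    r) rounds=$OPTARG ;;
    o) outfile=$OPTARG ;;
  esac
done
shift $((OPTIND - 1))
infile=$1

i=0
while [ "$i" -lt "$rounds" ]; do
  start=$(date +%s)
  # ... run the actual (non-C++) benchmark on "$infile" here,
  #     writing its result to "$outfile" ...
  end=$(date +%s)
  echo "PBBS-time: $((end - start))"
  i=$((i + 1))
done
```

Any language works, as long as the resulting executable honors this command line and prints one PBBS-time line per round.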