Christian Baun edited this page Oct 9, 2016 · 3 revisions

This page describes the measurements that were carried out with the Task-Distributor script and presents some of the results.

The measurements were carried out on a cluster of 8 Raspberry Pi Model B single-board computers. Each node was overclocked to 800 MHz, connected to a 100 Mbps Ethernet switch and equipped with a 16 GB SanDisk Ultra Class 10 SDHC card. The operating system was Raspbian (image date: 2014-06-20) with Linux kernel version 3.12.

Procedure

The measurements were started automatically with the script benchmark_start.sh.

#!/bin/bash
#
# title:        benchmark_start.sh
# description:  This script starts the benchmark runs of Task-Distributor.
# author:       Dr. Christian Baun --- http://www.christianbaun.de
# url:          https://github.com/christianbaun/task-distributor/wiki
# license:      GPLv2
# date:         October 9th 2016
# version:      1.6.1
# bash_version: 4.2.37(1)-release
# requires:     
# notes: 
# ----------------------------------------------------------------------------

RAW_DATA_PATH="Measurements_Raspberry_Pi2_900MHz_POV-Ray_2015" 

# Create the directory for the results if it does not exist yet
if [ ! -d "${RAW_DATA_PATH}" ]; then
  mkdir "${RAW_DATA_PATH}"
fi

# Path of the folder (on a distributed file system) where the workers 
# store the povray files
PATH_FS="/glusterfs8repl/povray" 

# Path of the lockfile on a file system, which can be accessed by all nodes
LOCKFILE="/glusterfs8repl/povray/lockfile"

# Check if the lockfile already exists
if [ -e ${LOCKFILE} ] ; then
  # Terminate the script, in case the lockfile already exists
  echo "File ${LOCKFILE} already exists!" && exit 1
fi

# Each width in this list needs a matching height (y) in the checks below
for x in 400 800 1024 1280 1600 3200 4800 6400 9600
do
  if [ $x -eq 400 ]  ; then y=300  ; fi
  if [ $x -eq 800 ]  ; then y=600  ; fi
  if [ $x -eq 1024 ] ; then y=768  ; fi
  if [ $x -eq 1280 ] ; then y=960  ; fi
  if [ $x -eq 1600 ] ; then y=1200 ; fi  
  if [ $x -eq 3200 ] ; then y=2400 ; fi
  if [ $x -eq 4800 ] ; then y=3600 ; fi
  if [ $x -eq 6400 ] ; then y=4800 ; fi  
  if [ $x -eq 9600 ] ; then y=7200 ; fi
  for i in 1 2 4 8
  do
    ./task-distributor-master.sh -n ${i} -x ${x} -y ${y} -p ${PATH_FS} -c > ${RAW_DATA_PATH}/${x}x${y}_${i}_Nodes_`date +%Y_%m_%d_%H:%M:%S`.txt 2>&1
    sleep 10
  done
done
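
All resolutions used in the measurements have a 4:3 aspect ratio, so the height could also be derived from the width instead of being looked up in a chain of if statements. The following is only a minimal sketch of that idea, assuming that every width is divisible by 4. The function name height_for_width is hypothetical and not part of Task-Distributor.

```shell
#!/bin/bash
# Hypothetical helper (not part of Task-Distributor): derive the image
# height from the width for a 4:3 aspect ratio using integer arithmetic.
height_for_width() {
  local width=$1
  echo $(( width * 3 / 4 ))
}

# Print the resolution for a few of the widths used in the measurements.
for x in 800 1024 1600
do
  y=$(height_for_width "${x}")
  echo "${x}x${y}"
done
```

Because Bash only does integer arithmetic, a width that is not divisible by 4 would be truncated silently, which is why the explicit lookup in the script above is the safer choice for arbitrary resolutions.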

The script benchmark_analyze.sh collects the results from the measurement files, averages them for each combination of resolution and number of nodes, and prints them in a form which is easy to read.

#!/bin/bash
#
# title:        benchmark_analyze.sh
# description:  This script analyzes the results of Task-Distributor runs.
#               The script searches for result files in the path which is 
#               stored in the variable $RAW_DATA_PATH.
#               This script prints the results in the shell and it also
#               creates a CSV file (see $RESULTS_FILE) which contains the
#               results for further analysis with gnuplot or any other tool.
# author:       Dr. Christian Baun --- http://www.christianbaun.de
# url:          https://github.com/christianbaun/task-distributor
# license:      GPLv2
# date:         October 9th 2016
# version:      1.4.1
# bash_version: 4.2.37(1)-release
# requires:     bc 1.06.95
# notes: 
# ----------------------------------------------------------------------------

RAW_DATA_PATH="Measurements_Raspberry_Pi_800MHz_POV-Ray"
RESULTS_FILE="results2015_V3.csv"

# If a CSV file with the results already exists => erase it
if [ -e ${RESULTS_FILE} ] ; then
  rm ${RESULTS_FILE}
fi

# Print out the headline of the CSV file
echo "X-Resolution Y-Resolution Nodes DurSeqPart1 DurSeqPart2 DurParPart EntireDurSum ParPort SeqPort Speedup" >> "${RESULTS_FILE}"

for X in 200 400 800 1024 1280 1600 3200 4800 6400 9600
do
  if [ $X -eq 200 ]  ; then Y=150  ; fi
  if [ $X -eq 400 ]  ; then Y=300  ; fi
  if [ $X -eq 800 ]  ; then Y=600  ; fi
  if [ $X -eq 1024 ] ; then Y=768  ; fi
  if [ $X -eq 1280 ] ; then Y=960  ; fi
  if [ $X -eq 1600 ] ; then Y=1200 ; fi
  if [ $X -eq 3200 ] ; then Y=2400 ; fi
  if [ $X -eq 4800 ] ; then Y=3600 ; fi
  if [ $X -eq 6400 ] ; then Y=4800 ; fi
  if [ $X -eq 9600 ] ; then Y=7200 ; fi
  for N in 1 2 4 8
  do
    # It is important here to not check the existence of files via
    # [ -e <filesname_with_wildcard> ] 
    # Otherwise we get an error "binary operator expected" when more than
    # just a single file meets the test criteria.
    ls ${RAW_DATA_PATH}/${X}x${Y}_${N}_Nodes_*.txt > /dev/null 2>&1
    # "$?" contains the return code of the last command executed.
    if [ "$?" = "0" ] ; then
      echo "Resolution:                         ${X}x${Y}"
      echo "Nodes = CPUs:                       ${N}"
      echo "Number of raw data files found:     `ls -l ${RAW_DATA_PATH}/${X}x${Y}_${N}_Nodes_*.txt | wc -l`"
      SEQ1=`tail --lines=3 ${RAW_DATA_PATH}/${X}x${Y}_${N}_Nodes_*.txt | grep 1st | awk '{ SUM += $9} END { print SUM/NR }'`
      echo "Duration 1st sequential part:       ${SEQ1} s"
      SEQ2=`tail --lines=3 ${RAW_DATA_PATH}/${X}x${Y}_${N}_Nodes_*.txt | grep 2nd | awk '{ SUM += $9} END { print SUM/NR }'`
      echo "Duration 2nd sequential part:       ${SEQ2} s"
      PAR=`tail --lines=3 ${RAW_DATA_PATH}/${X}x${Y}_${N}_Nodes_*.txt | grep parallel | awk '{ SUM += $8} END { print SUM/NR }'`
      echo "Duration parallel Part:             ${PAR} s"
      SUM=`echo "${SEQ1} + ${SEQ2} + ${PAR}" | bc`    
      echo "Entire duration (sum):              ${SUM} s"
      PARPOR=`echo "scale = 4 ; (${PAR} / ${SUM})/1" | bc`
      PARPORFINAL=`echo "scale = 2 ; (${PARPOR} * 100)/1" | bc`
      echo "Parallel portion:                   ${PARPORFINAL} %"
      # The sed command ensures that results < 1 have a leading 0 before the "."
      SEQPOR=`echo "scale = 2 ; ((1 - ${PARPOR}) * 100)/1" | bc | sed 's/^\./0./'`
      echo "Sequential portion:                 ${SEQPOR} %"  
   
      # If the number of nodes = 1, we need to save the entire duration (SUM) to calculate the speedup.
      if [ "${N}" = "1" ] ; then
         SUM_ONE_NODE=${SUM}
      fi
      # Calculate the speedup
      SPEEDUP=`echo "scale = 2 ;(${SUM_ONE_NODE} / ${SUM})/1" | bc`
      echo "Speedup = (time_1_CPU/time_N_CPUs): ${SPEEDUP}"
      
      echo "${X} ${Y} ${N} ${SEQ1} ${SEQ2} ${PAR} ${SUM} ${PARPORFINAL} ${SEQPOR} ${SPEEDUP}" >> ${RESULTS_FILE}

      # This is just an empty line at the end of each block.
      echo ""
    else
      echo "${RAW_DATA_PATH}/${X}x${Y}_${N}_Nodes_*.txt does not exist."
      # This is just an empty line at the end of each block.
      echo ""
    fi
  done
done
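
The comment in the script explains why a test like [ -e <pattern> ] fails as soon as the wildcard matches more than one file. One possible alternative to calling ls and checking "$?" is to expand the pattern into a Bash array and count its elements. This is a sketch only, not the approach used by benchmark_analyze.sh. The function name count_matches and the temporary demo directory are assumptions made for the illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the number of files that match a glob pattern.
# With nullglob set, a pattern without matches expands to nothing instead
# of remaining in the list as a literal string.
# Limitation: paths which contain spaces are not handled by this sketch.
count_matches() {
  local pattern=$1
  local files
  shopt -s nullglob
  files=( ${pattern} )   # intentionally unquoted so that the glob expands
  shopt -u nullglob
  echo "${#files[@]}"
}

# Demonstration in a temporary directory with two empty dummy files.
DEMO_DIR=$(mktemp -d)
touch "${DEMO_DIR}/800x600_2_Nodes_a.txt" "${DEMO_DIR}/800x600_2_Nodes_b.txt"

if [ "$(count_matches "${DEMO_DIR}/800x600_2_Nodes_*.txt")" -gt 0 ] ; then
  echo "Result files found"
else
  echo "No result files found"
fi
rm -r "${DEMO_DIR}"
```

The same count could also replace the `ls -l ... | wc -l` call that prints the number of raw data files.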

Results

$ ./benchmark_analyze.sh
Resolution:                         200x150
Nodes = CPUs:                       1
Number of raw data files fount:     12
Duration 1st sequential part:       0.209917 s
Duration 2nd sequential part:       0.203583 s
Duration parallel Part:             4.47133 s
Entire duration (sum):              4.884830 s
Parallel portion:                   91.53 %
Sequential portion:                 8.47 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         200x150
Nodes = CPUs:                       2
Number of raw data files fount:     12
Duration 1st sequential part:       0.19975 s
Duration 2nd sequential part:       0.448167 s
Duration parallel Part:             4.62708 s
Entire duration (sum):              5.274997 s
Parallel portion:                   87.71 %
Sequential portion:                 12.29 %
Speedup = (time_1_CPU/time_N_CPUs): .92

Resolution:                         200x150
Nodes = CPUs:                       4
Number of raw data files fount:     12
Duration 1st sequential part:       0.197167 s
Duration 2nd sequential part:       0.544667 s
Duration parallel Part:             4.98383 s
Entire duration (sum):              5.725664 s
Parallel portion:                   87.04 %
Sequential portion:                 12.96 %
Speedup = (time_1_CPU/time_N_CPUs): .85

Resolution:                         200x150
Nodes = CPUs:                       8
Number of raw data files fount:     12
Duration 1st sequential part:       0.215333 s
Duration 2nd sequential part:       0.7405 s
Duration parallel Part:             6.11467 s
Entire duration (sum):              7.070503 s
Parallel portion:                   86.48 %
Sequential portion:                 13.52 %
Speedup = (time_1_CPU/time_N_CPUs): .69

Resolution:                         400x300
Nodes = CPUs:                       1
Number of raw data files fount:     14
Duration 1st sequential part:       0.214857 s
Duration 2nd sequential part:       0.201929 s
Duration parallel Part:             10.5046 s
Entire duration (sum):              10.921386 s
Parallel portion:                   96.18 %
Sequential portion:                 3.82 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         400x300
Nodes = CPUs:                       2
Number of raw data files fount:     14
Duration 1st sequential part:       0.198857 s
Duration 2nd sequential part:       0.679214 s
Duration parallel Part:             7.60257 s
Entire duration (sum):              8.480641 s
Parallel portion:                   89.64 %
Sequential portion:                 10.36 %
Speedup = (time_1_CPU/time_N_CPUs): 1.28

Resolution:                         400x300
Nodes = CPUs:                       4
Number of raw data files fount:     14
Duration 1st sequential part:       0.207 s
Duration 2nd sequential part:       0.776714 s
Duration parallel Part:             7.04214 s
Entire duration (sum):              8.025854 s
Parallel portion:                   87.74 %
Sequential portion:                 12.26 %
Speedup = (time_1_CPU/time_N_CPUs): 1.36

Resolution:                         400x300
Nodes = CPUs:                       8
Number of raw data files fount:     14
Duration 1st sequential part:       0.200214 s
Duration 2nd sequential part:       1.00193 s
Duration parallel Part:             7.42157 s
Entire duration (sum):              8.623714 s
Parallel portion:                   86.06 %
Sequential portion:                 13.94 %
Speedup = (time_1_CPU/time_N_CPUs): 1.26

Resolution:                         800x600
Nodes = CPUs:                       1
Number of raw data files fount:     10
Duration 1st sequential part:       0.2055 s
Duration 2nd sequential part:       0.1938 s
Duration parallel Part:             34.9172 s
Entire duration (sum):              35.3165 s
Parallel portion:                   98.86 %
Sequential portion:                 1.14 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         800x600
Nodes = CPUs:                       2
Number of raw data files fount:     10
Duration 1st sequential part:       0.2066 s
Duration 2nd sequential part:       1.5514 s
Duration parallel Part:             21.1664 s
Entire duration (sum):              22.9244 s
Parallel portion:                   92.33 %
Sequential portion:                 7.67 %
Speedup = (time_1_CPU/time_N_CPUs): 1.54

Resolution:                         800x600
Nodes = CPUs:                       4
Number of raw data files fount:     10
Duration 1st sequential part:       0.2027 s
Duration 2nd sequential part:       1.6316 s
Duration parallel Part:             14.9445 s
Entire duration (sum):              16.7788 s
Parallel portion:                   89.06 %
Sequential portion:                 10.94 %
Speedup = (time_1_CPU/time_N_CPUs): 2.10

Resolution:                         800x600
Nodes = CPUs:                       8
Number of raw data files fount:     10
Duration 1st sequential part:       0.242 s
Duration 2nd sequential part:       1.8385 s
Duration parallel Part:             12.7261 s
Entire duration (sum):              14.8066 s
Parallel portion:                   85.94 %
Sequential portion:                 14.06 %
Speedup = (time_1_CPU/time_N_CPUs): 2.38

Resolution:                         1024x768
Nodes = CPUs:                       1
Number of raw data files fount:     10
Duration 1st sequential part:       0.2212 s
Duration 2nd sequential part:       0.2217 s
Duration parallel Part:             55.1782 s
Entire duration (sum):              55.6211 s
Parallel portion:                   99.20 %
Sequential portion:                 0.80 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         1024x768
Nodes = CPUs:                       2
Number of raw data files fount:     10
Duration 1st sequential part:       0.2309 s
Duration 2nd sequential part:       2.22 s
Duration parallel Part:             31.676 s
Entire duration (sum):              34.1269 s
Parallel portion:                   92.81 %
Sequential portion:                 7.19 %
Speedup = (time_1_CPU/time_N_CPUs): 1.62

Resolution:                         1024x768
Nodes = CPUs:                       4
Number of raw data files fount:     10
Duration 1st sequential part:       0.2012 s
Duration 2nd sequential part:       2.3647 s
Duration parallel Part:             22.5345 s
Entire duration (sum):              25.1004 s
Parallel portion:                   89.77 %
Sequential portion:                 10.23 %
Speedup = (time_1_CPU/time_N_CPUs): 2.21

Resolution:                         1024x768
Nodes = CPUs:                       8
Number of raw data files fount:     10
Duration 1st sequential part:       0.1975 s
Duration 2nd sequential part:       2.7001 s
Duration parallel Part:             16.8449 s
Entire duration (sum):              19.7425 s
Parallel portion:                   85.32 %
Sequential portion:                 14.68 %
Speedup = (time_1_CPU/time_N_CPUs): 2.81

Resolution:                         1280x960
Nodes = CPUs:                       1
Number of raw data files fount:     10
Duration 1st sequential part:       0.2066 s
Duration 2nd sequential part:       0.1963 s
Duration parallel Part:             84.6913 s
Entire duration (sum):              85.0942 s
Parallel portion:                   99.52 %
Sequential portion:                 0.48 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         1280x960
Nodes = CPUs:                       2
Number of raw data files fount:     10
Duration 1st sequential part:       0.2059 s
Duration 2nd sequential part:       3.1389 s
Duration parallel Part:             47.5662 s
Entire duration (sum):              50.9110 s
Parallel portion:                   93.43 %
Sequential portion:                 6.57 %
Speedup = (time_1_CPU/time_N_CPUs): 1.67

Resolution:                         1280x960
Nodes = CPUs:                       4
Number of raw data files fount:     10
Duration 1st sequential part:       0.1972 s
Duration 2nd sequential part:       3.1778 s
Duration parallel Part:             32.5017 s
Entire duration (sum):              35.8767 s
Parallel portion:                   90.59 %
Sequential portion:                 9.41 %
Speedup = (time_1_CPU/time_N_CPUs): 2.37

Resolution:                         1280x960
Nodes = CPUs:                       8
Number of raw data files fount:     10
Duration 1st sequential part:       0.2001 s
Duration 2nd sequential part:       3.3895 s
Duration parallel Part:             23.008 s
Entire duration (sum):              26.5976 s
Parallel portion:                   86.50 %
Sequential portion:                 13.50 %
Speedup = (time_1_CPU/time_N_CPUs): 3.19

Resolution:                         1600x1200
Nodes = CPUs:                       1
Number of raw data files fount:     10
Duration 1st sequential part:       0.2051 s
Duration 2nd sequential part:       0.2044 s
Duration parallel Part:             132.84 s
Entire duration (sum):              133.2495 s
Parallel portion:                   99.69 %
Sequential portion:                 0.31 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         1600x1200
Nodes = CPUs:                       2
Number of raw data files fount:     10
Duration 1st sequential part:       0.2165 s
Duration 2nd sequential part:       4.6648 s
Duration parallel Part:             73.4805 s
Entire duration (sum):              78.3618 s
Parallel portion:                   93.77 %
Sequential portion:                 6.23 %
Speedup = (time_1_CPU/time_N_CPUs): 1.70

Resolution:                         1600x1200
Nodes = CPUs:                       4
Number of raw data files fount:     10
Duration 1st sequential part:       0.2421 s
Duration 2nd sequential part:       4.6244 s
Duration parallel Part:             48.7796 s
Entire duration (sum):              53.6461 s
Parallel portion:                   90.92 %
Sequential portion:                 9.08 %
Speedup = (time_1_CPU/time_N_CPUs): 2.48

Resolution:                         1600x1200
Nodes = CPUs:                       8
Number of raw data files fount:     10
Duration 1st sequential part:       0.2034 s
Duration 2nd sequential part:       4.8014 s
Duration parallel Part:             32.4577 s
Entire duration (sum):              37.4625 s
Parallel portion:                   86.64 %
Sequential portion:                 13.36 %
Speedup = (time_1_CPU/time_N_CPUs): 3.55

Resolution:                         3200x2400
Nodes = CPUs:                       1
Number of raw data files fount:     10
Duration 1st sequential part:       0.2411 s
Duration 2nd sequential part:       0.233 s
Duration parallel Part:             609.427 s
Entire duration (sum):              609.9011 s
Parallel portion:                   99.92 %
Sequential portion:                 0.08 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         3200x2400
Nodes = CPUs:                       2
Number of raw data files fount:     10
Duration 1st sequential part:       0.2026 s
Duration 2nd sequential part:       15.2428 s
Duration parallel Part:             331.059 s
Entire duration (sum):              346.5044 s
Parallel portion:                   95.54 %
Sequential portion:                 4.46 %
Speedup = (time_1_CPU/time_N_CPUs): 1.76

Resolution:                         3200x2400
Nodes = CPUs:                       4
Number of raw data files fount:     10
Duration 1st sequential part:       0.2297 s
Duration 2nd sequential part:       15.1452 s
Duration parallel Part:             209.375 s
Entire duration (sum):              224.7499 s
Parallel portion:                   93.15 %
Sequential portion:                 6.85 %
Speedup = (time_1_CPU/time_N_CPUs): 2.71

Resolution:                         3200x2400
Nodes = CPUs:                       8
Number of raw data files fount:     10
Duration 1st sequential part:       0.2171 s
Duration 2nd sequential part:       15.5424 s
Duration parallel Part:             131.78 s
Entire duration (sum):              147.5395 s
Parallel portion:                   89.31 %
Sequential portion:                 10.69 %
Speedup = (time_1_CPU/time_N_CPUs): 4.13

Resolution:                         4800x3600
Nodes = CPUs:                       1
Number of raw data files fount:     12
Duration 1st sequential part:       0.225083 s
Duration 2nd sequential part:       0.3545 s
Duration parallel Part:             1361.12 s
Entire duration (sum):              1361.699583 s
Parallel portion:                   99.95 %
Sequential portion:                 0.05 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         4800x3600
Nodes = CPUs:                       2
Number of raw data files fount:     12
Duration 1st sequential part:       0.211667 s
Duration 2nd sequential part:       32.6508 s
Duration parallel Part:             744.946 s
Entire duration (sum):              777.808467 s
Parallel portion:                   95.77 %
Sequential portion:                 4.23 %
Speedup = (time_1_CPU/time_N_CPUs): 1.75

Resolution:                         4800x3600
Nodes = CPUs:                       4
Number of raw data files fount:     12
Duration 1st sequential part:       0.206917 s
Duration 2nd sequential part:       32.0748 s
Duration parallel Part:             464.963 s
Entire duration (sum):              497.244717 s
Parallel portion:                   93.50 %
Sequential portion:                 6.50 %
Speedup = (time_1_CPU/time_N_CPUs): 2.73

Resolution:                         4800x3600
Nodes = CPUs:                       8
Number of raw data files fount:     12
Duration 1st sequential part:       0.213417 s
Duration 2nd sequential part:       31.9905 s
Duration parallel Part:             289.651 s
Entire duration (sum):              321.854917 s
Parallel portion:                   89.99 %
Sequential portion:                 10.01 %
Speedup = (time_1_CPU/time_N_CPUs): 4.23

Resolution:                         6400x4800
Nodes = CPUs:                       1
Number of raw data files fount:     18
Duration 1st sequential part:       0.218222 s
Duration 2nd sequential part:       0.467056 s
Duration parallel Part:             2434.63 s
Entire duration (sum):              2435.315278 s
Parallel portion:                   99.97 %
Sequential portion:                 0.03 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         6400x4800
Nodes = CPUs:                       2
Number of raw data files fount:     16
Duration 1st sequential part:       0.210875 s
Duration 2nd sequential part:       152.037 s
Duration parallel Part:             1321.64 s
Entire duration (sum):              1473.887875 s
Parallel portion:                   89.67 %
Sequential portion:                 10.33 %
Speedup = (time_1_CPU/time_N_CPUs): 1.65

Resolution:                         6400x4800
Nodes = CPUs:                       4
Number of raw data files fount:     14
Duration 1st sequential part:       0.219 s
Duration 2nd sequential part:       120.324 s
Duration parallel Part:             821.244 s
Entire duration (sum):              941.787 s
Parallel portion:                   87.20 %
Sequential portion:                 12.80 %
Speedup = (time_1_CPU/time_N_CPUs): 2.58

Resolution:                         6400x4800
Nodes = CPUs:                       8
Number of raw data files fount:     14
Duration 1st sequential part:       0.2085 s
Duration 2nd sequential part:       120.471 s
Duration parallel Part:             509.707 s
Entire duration (sum):              630.3865 s
Parallel portion:                   80.85 %
Sequential portion:                 19.15 %
Speedup = (time_1_CPU/time_N_CPUs): 3.86

Resolution:                         9600x7200
Nodes = CPUs:                       1
Number of raw data files fount:     4
Duration 1st sequential part:       0.211 s
Duration 2nd sequential part:       0.78525 s
Duration parallel Part:             11084.6 s
Entire duration (sum):              11085.59625 s
Parallel portion:                   99.99 %
Sequential portion:                 0.01 %
Speedup = (time_1_CPU/time_N_CPUs): 1.00

Resolution:                         9600x7200
Nodes = CPUs:                       2
Number of raw data files fount:     10
Duration 1st sequential part:       0.2162 s
Duration 2nd sequential part:       379.675 s
Duration parallel Part:             3058.45 s
Entire duration (sum):              3438.3412 s
Parallel portion:                   88.95 %
Sequential portion:                 11.05 %
Speedup = (time_1_CPU/time_N_CPUs): 3.22

Resolution:                         9600x7200
Nodes = CPUs:                       4
Number of raw data files fount:     10
Duration 1st sequential part:       0.2316 s
Duration 2nd sequential part:       347.795 s
Duration parallel Part:             1892.8 s
Entire duration (sum):              2240.8266 s
Parallel portion:                   84.46 %
Sequential portion:                 15.54 %
Speedup = (time_1_CPU/time_N_CPUs): 4.94

Resolution:                         9600x7200
Nodes = CPUs:                       8
Number of raw data files fount:     10
Duration 1st sequential part:       0.2267 s
Duration 2nd sequential part:       349.833 s
Duration parallel Part:             1170.18 s
Entire duration (sum):              1520.2397 s
Parallel portion:                   76.97 %
Sequential portion:                 23.03 %
Speedup = (time_1_CPU/time_N_CPUs): 7.29
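
The derived values in each block can be checked by hand from the three measured durations. The sketch below repeats the calculation for the 800x600 run with 8 nodes, using the values printed above. It uses awk instead of bc so that it has no extra dependency, and int() mimics the truncation that bc applies with "scale". The function name truncated_ratio is hypothetical and does not appear in benchmark_analyze.sh.

```shell
#!/bin/bash
# Hypothetical helper: divide $1 by $2 and truncate (not round) the result
# to $3 decimal places, similar to what bc does with "scale".
truncated_ratio() {
  awk -v a="$1" -v b="$2" -v places="$3" 'BEGIN {
    factor = 10 ^ places
    fmt = "%." places "f\n"
    printf fmt, int(a / b * factor) / factor
  }'
}

# Values copied from the 800x600 blocks of the output above.
SUM_ONE_NODE=35.3165   # entire duration with 1 node
PAR=12.7261            # duration of the parallel part with 8 nodes
SUM=14.8066            # entire duration with 8 nodes

PARPOR=$(truncated_ratio "${PAR}" "${SUM}" 4)
SPEEDUP=$(truncated_ratio "${SUM_ONE_NODE}" "${SUM}" 2)
PARPORFINAL=$(awk -v p="${PARPOR}" 'BEGIN { printf "%.2f\n", p * 100 }')
SEQPOR=$(awk -v p="${PARPOR}" 'BEGIN { printf "%.2f\n", (1 - p) * 100 }')

echo "Parallel portion:   ${PARPORFINAL} %"
echo "Sequential portion: ${SEQPOR} %"
echo "Speedup:            ${SPEEDUP}"
```

The printed values (85.94 %, 14.06 % and 2.38) agree with the 800x600 block for 8 nodes.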