Test Cases

From FarmShare


Revision as of 23:57, 14 December 2011

This page lists test cases that users or sysadmins can run to verify the functionality of the barleys.


TC1: submit a job from a corn via /mnt/glusterfs

  1. cd /mnt/glusterfs/your_sunetid
  2. echo "hostname" | qsub -cwd
  3. qstat # check job status
  4. Check that the stderr output file is empty and the stdout output file contains the hostname of the machine that the job ran on

This test verifies that the shared filesystem is available and the job submission process works as expected.
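Step 4 can be scripted. The following is a minimal sketch, not part of Grid Engine: check_job_output is a hypothetical helper that takes the paths of the job's stdout and stderr files. For a job piped to qsub these are typically named STDIN.o<jobid> and STDIN.e<jobid> in the submit directory, but treat that naming as an assumption and pass whatever files your job actually produced.

```shell
#!/bin/sh
# check_job_output OUT_FILE ERR_FILE   (hypothetical helper)
# Succeeds only if the stderr file exists and is empty and the stdout
# file contains exactly one non-blank line (expected to be the hostname).
check_job_output() {
    out_file=$1
    err_file=$2

    [ -f "$err_file" ] || { echo "FAIL: missing $err_file"; return 1; }
    if [ -s "$err_file" ]; then
        echo "FAIL: $err_file is not empty"
        return 1
    fi

    [ -f "$out_file" ] || { echo "FAIL: missing $out_file"; return 1; }
    # grep -c . counts the lines that contain at least one character
    lines=$(grep -c . "$out_file")
    if [ "$lines" -ne 1 ]; then
        echo "FAIL: expected 1 line in $out_file, found $lines"
        return 1
    fi

    echo "OK: job ran on $(grep . "$out_file")"
}
```

Example use, with a made-up job id: check_job_output STDIN.o24714 STDIN.e24714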

TC2: submit a job from a corn with AUKS support

  1. kinit / klist -f # check that your ticket is forwardable
  2. aklog / tokens # check that you have an AFS token
  3. echo "hostname" | qsub
  4. qstat # check job status
  5. Check that the stderr output file is empty and the stdout output file contains the hostname of the machine that the job ran on

This test verifies that AUKS handles the Kerberos/AFS tickets/tokens correctly.
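The forwardable check in step 1 can also be done mechanically. This sketch assumes the MIT Kerberos output format, where klist -f prints a "Flags:" field per ticket and an uppercase F means forwardable (lowercase f means forwarded). is_forwardable is a hypothetical helper name; other Kerberos implementations may format the flags differently.

```shell
#!/bin/sh
# is_forwardable   (hypothetical helper)
# Reads `klist -f` output on stdin and succeeds if any ticket's Flags
# field contains an uppercase F. The match is case-sensitive on purpose:
# F = forwardable, f = forwarded.
is_forwardable() {
    grep -q 'Flags: [A-Za-z]*F'
}

# Intended use on a corn (not executed here):
#   klist -f | is_forwardable && echo "ticket is forwardable"
```

The helper accepts a match on any ticket in the cache; it does not single out the ticket-granting ticket.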

TC3: check memory tracking

R script:

$ cat R8GB.R 
x <- array(1:1073741824, dim=c(1024,1024,1024)) 
x <- gaussian(); 

submit script:

$ cat r_test.script

# use the current directory
#$ -cwd
# mail this address
#$ -M chekh@stanford.edu
# send mail on begin, end, suspend
#$ -m bes
# get rid of spurious messages about tty/terminal types
#$ -S /bin/sh

R --vanilla --no-save < R8GB.R 

  1. submit this job with 'qsub r_test.script' (with AUKS or not)
  2. check that you get an e-mail
  3. check that the e-mail correctly reports ~8GB maxvmem (our current version does not; this is a bug, and we need to upgrade to fix it)
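For reference, the ~8GB figure can be reproduced by hand. This is a rough sketch under two assumptions that are not stated on this page: that R stores each element of 1:1073741824 as a 4-byte integer, and that building the array leaves two copies of the data in memory at peak.

```shell
#!/bin/sh
# Back-of-the-envelope estimate of the R8GB.R memory footprint.
elements=$((1024 * 1024 * 1024))   # dim = c(1024, 1024, 1024)
bytes_per_int=4                    # assumption: 4-byte R integers
gib=$((1024 * 1024 * 1024))

one_copy=$((elements * bytes_per_int))
two_copies=$((one_copy * 2))       # assumption: source vector plus array

echo "one copy:   $((one_copy / gib)) GiB"     # prints "one copy:   4 GiB"
echo "two copies: $((two_copies / gib)) GiB"   # prints "two copies: 8 GiB"
```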

TC4: check time tracking

Submit a job like

 echo "sleep 3600" | qsub -cwd -m bes -M chekh@stanford.edu

Check that the job completion mail reports about 1 hour of elapsed wallclock time.

Then submit a job like

 echo "sleep 3600" | qsub -cwd -m bes -M chekh@stanford.edu -l h_rt=72:00:00

Check that the job went into long.q
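When comparing the reported wallclock time with the requested limit, it helps to have both in seconds. A small sketch, assuming times are written as HH:MM:SS as in the h_rt request above; hms_to_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# hms_to_seconds HH:MM:SS   (hypothetical helper)
# Converts a time such as an h_rt value to seconds. awk treats the
# fields as decimal numbers, so leading zeros are harmless.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

hms_to_seconds 72:00:00   # prints 259200
hms_to_seconds 01:00:00   # prints 3600
```

So the "sleep 3600" job should report roughly 3600 seconds, far below the 259200-second limit requested with -l h_rt=72:00:00.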

TC5: check maximum job numbers

We currently have:

max_u_jobs                   100
max_jobs                     3000

Run:

 for i in `seq -w 01 200`; do echo "sleep 600" | qsub -cwd ; done

That will attempt to submit 200 jobs. Only 100 of them should be accepted; the rest should be rejected with an error like

 Unable to run job: job rejected: Only 100 jobs are allowed per user (current job count: 100).
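Rather than counting messages by eye, the loop's output can be tallied. A sketch, assuming that each accepted job produces a line containing "has been submitted" and each rejected job a line containing "job rejected" as in the error above; count_submissions is a hypothetical helper name.

```shell
#!/bin/sh
# count_submissions   (hypothetical helper)
# Reads the combined qsub output on stdin and tallies accepted and
# rejected submissions by matching the message text.
count_submissions() {
    awk '
        /has been submitted/ { ok++ }
        /job rejected/       { rejected++ }
        END { printf "accepted=%d rejected=%d\n", ok, rejected }
    '
}

# Intended use (not executed here):
#   for i in `seq -w 01 200`; do echo "sleep 600" | qsub -cwd; done 2>&1 | count_submissions
```

With max_u_jobs set to 100 and no other jobs of yours in the queue, the expected tally is accepted=100 rejected=100.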

TC6: check disk throughput performance - local

Use the bonnie++ executable and run it on local disk with a submit script like this:


#$ -m bes
#$ -M chekh@stanford.edu
#$ -cwd


echo $TMPDIR


Check that the performance numbers in the output roughly match these (my job ran for 2h40m):

$ cat bonnie.script.o24714 
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
barley06.st 193760M   583  98 76227  22 40798  14  3514  97 96284  15 103.7  41
Latency             30420us   18681ms    1842ms   18715us     480ms     489ms
Version  1.96       ------Sequential Create------ --------Random Create--------
barley06.stanford.e -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3487   4 +++++ +++ 22471  23 29117  37 +++++ +++ +++++ +++
Latency             13181us    1138us     552us    1048us      72us     483us
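To make "roughly match" concrete, the block throughput figures can be pulled out of the text report and compared with the reference numbers. A sketch under stated assumptions: the results row has the 14-column layout shown above with the size written like 193760M, and the tolerance is your own choice (this page does not define one). bonnie_block_rates and within_pct are hypothetical helper names.

```shell
#!/bin/sh
# bonnie_block_rates   (hypothetical helper)
# Reads bonnie++ text output on stdin and prints the sequential block
# write and block read rates (K/sec) from the first results row.
# Columns: 1 machine, 2 size, 5 block write K/sec, 11 block read K/sec.
bonnie_block_rates() {
    awk '$2 ~ /^[0-9]+M$/ && NF >= 14 { print "write=" $5, "read=" $11; exit }'
}

# within_pct MEASURED BASELINE PCT   (hypothetical helper)
# Succeeds if MEASURED is within PCT percent of BASELINE.
within_pct() {
    awk -v m="$1" -v b="$2" -v p="$3" 'BEGIN {
        d = m - b
        if (d < 0) d = -d
        exit !(d * 100 <= b * p)
    }'
}
```

For example, bonnie_block_rates < bonnie.script.o24714 prints write=76227 read=96284 for the report above, and within_pct 80000 76227 20 succeeds. The same helpers apply to the shared filesystem results in TC7.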

TC7: check disk throughput performance - shared fs


#$ -m bes
#$ -M chekh@stanford.edu
#$ -cwd



echo $DIR


Your results should be something like this (mine ran for 3h44m):

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
barley18.st 193760M    10  30 40498  21 27087  20  3481  96 291925  49 244.9  11
Latency              1131ms     925ms    2834ms   27065us     976ms     101ms
Version  1.96       ------Sequential Create------ --------Random Create--------
barley18.stanford.e -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   432   8  2945  16   554   7   450   7  1544   9   500   7
Latency               814ms    4220us   13577us   12176us    7681us     654ms