If your batch job crashed and you can't find the core file
Your batch job on one of the Indiana University research supercomputers may have crashed because the stack size limit is set incorrectly.
If you're using bash or ksh, use the ulimit command to check the stack size:
[dartmaul@h2 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 258007
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
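If you only care about the stack size, a shorter check is possible. The following is a minimal sketch, assuming a bash or ksh session; it uses the shell's built-in -S (soft limit) and -H (hard limit) options, and the variable names are purely illustrative:

```shell
# Print only the stack size limits instead of the full `ulimit -a` listing.
soft_stack=$(ulimit -S -s)   # soft limit: the value your processes actually get
hard_stack=$(ulimit -H -s)   # hard limit: the ceiling the soft limit can be raised to
echo "stack size (kbytes): soft=${soft_stack} hard=${hard_stack}"
```

Both values are reported in kilobytes, or as the word unlimited.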
If you're using tcsh or csh, use limit:
[palpatin@h1 ~]$ limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize 0 kbytes
memoryuse unlimited
vmemoryuse 4194304 kbytes
descriptors 4096
memorylocked unlimited
maxproc 1024
In the bash example above, note that the stack size is set to 10240 kilobytes. If your stack size is limited like this, edit the initialization file for your shell to set the stack size to unlimited, and then try running your job again:
Shell | Initialization file | Command |
---|---|---|
bash | .bashrc | ulimit -s unlimited |
ksh | .profile | ulimit -s unlimited |
csh | .cshrc | limit stacksize unlimited |
tcsh | .cshrc | limit stacksize unlimited |
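For bash, the entry in .bashrc might look something like the sketch below. The guard and the fallback are assumptions added for illustration, not part of the documented command: they keep the shell from printing an error at login if the hard limit does not allow unlimited.

```shell
# Sketch for ~/.bashrc: try to lift the stack size limit.
if ! ulimit -s unlimited 2>/dev/null; then
    # Fall back to the highest value the current hard limit permits.
    ulimit -s "$(ulimit -H -s)"
fi
stack_now=$(ulimit -s)   # illustrative variable, handy for a quick check
echo "stack size is now: ${stack_now}"
```

After logging in again, ulimit -s should report the new value.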
Once you've set your stack size to any value other than unlimited, you cannot raise it above that value in your current process. You must log out and log in again to reset your stack size to a higher value. If you try to raise it in the same session, ulimit will return an error that looks like:
-bash: ulimit: stack size: cannot modify limit: Operation not permitted
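If you want to experiment with smaller values without locking yourself out, one option is to change only the soft limit. The sketch below assumes bash or ksh and uses the built-in -S and -H options; the value 4096 is an arbitrary example. It runs inside a command substitution, which is a subshell, so the limits of your login shell are not touched:

```shell
# Lower only the soft stack limit, then raise it back up to the hard limit.
restored=$(
    hard=$(ulimit -H -s)     # remember the ceiling (hard limit)
    ulimit -S -s 4096        # lower the soft limit only (example value, kbytes)
    ulimit -S -s "$hard"     # raising the soft limit up to the hard limit is allowed
    ulimit -S -s             # print the resulting soft limit
)
echo "soft stack limit after restoring: ${restored}"
```

Because the hard limit was never lowered, the soft limit can return to it without a new login.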
By default, most versions of Linux set the core file size to zero, which is why no core file is written when a job crashes. To generate a core file when you run a job on a research supercomputer at IU, use ulimit or limit to set the core file size to unlimited:
Shell | Initialization file | Command |
---|---|---|
bash | .bashrc | ulimit -c unlimited |
ksh | .profile | ulimit -c unlimited |
csh | .cshrc | limit coredumpsize unlimited |
tcsh | .cshrc | limit coredumpsize unlimited |
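To confirm whether core files are currently enabled in a bash or ksh session, a quick check along these lines may help. This is only a sketch; the variable name and messages are illustrative:

```shell
# Report whether the current shell allows core files to be written.
core_limit=$(ulimit -S -c)   # soft core file size limit, in blocks or "unlimited"
if [ "$core_limit" = "0" ]; then
    echo "core files are disabled (core file size is 0)"
else
    echo "core files are enabled (core file size: ${core_limit})"
fi
```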
To make sure you have enough space, create the core file in your Slate-Scratch space, and then link to it; for example (replace username with your IU username):
ln -s /N/scratch/username/core ./core
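To see what the link looks like before using it in a real job, the same command can be tried in throwaway directories. In this sketch, the two temporary directories are stand-ins (assumptions for illustration) for your Slate-Scratch directory and your job's working directory:

```shell
# Try the symbolic link in temporary directories before using real paths.
scratch_dir=$(mktemp -d)   # stand-in for /N/scratch/username
work_dir=$(mktemp -d)      # stand-in for the job's working directory
ln -s "$scratch_dir/core" "$work_dir/core"
link_target=$(readlink "$work_dir/core")   # where the link points
echo "$work_dir/core -> $link_target"
```

Running ls -l in the working directory shows the same link and its target.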
This is document awdh in the Knowledge Base.
Last modified on 2023-07-24 13:01:22.