If your batch job crashed and you can't find the core file

Your batch job on one of the Indiana University research supercomputers may have crashed because the stack size limit is set incorrectly.

If you're using bash or ksh, use the ulimit command to check the stack size:

[dartmaul@h2 ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 258007
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

If you're using tcsh or csh, use limit:

[palpatin@h1 ~]$ limit
cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    unlimited
coredumpsize 0 kbytes
memoryuse    unlimited
vmemoryuse   4194304 kbytes
descriptors  4096
memorylocked unlimited
maxproc      1024

In the bash example above, note that the stack size is set to 10240 kbytes (in the tcsh example, it is already unlimited). If your stack size is limited, edit the initialization file for your shell to set the stack size to unlimited, and then try running your job again:

Shell   Initialization file   Command
bash    .bashrc               ulimit -s unlimited
ksh     .profile              ulimit -s unlimited
csh     .cshrc                limit stacksize unlimited
tcsh    .cshrc                limit stacksize unlimited
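
As a minimal sketch of the bash and ksh rows in the table above, the lines below try to raise the limit and then report the value in effect. The `2>/dev/null || true` guard and the `stack_now` variable are illustrative assumptions, not part of the documented procedure; the guard only keeps the shell quiet if the limit can no longer be raised in the current session.

```shell
# Sketch for ~/.bashrc (bash) or ~/.profile (ksh).
# Try to raise the stack size limit; stay quiet if it cannot be raised.
ulimit -s unlimited 2>/dev/null || true

# Run at the prompt to confirm the value now in effect
# (these two lines are a check, not something the init file needs).
stack_now=$(ulimit -s)   # stack_now is a hypothetical variable name
echo "stack size: ${stack_now}"
```

In a fresh login session the reported value would normally be unlimited; if it is still a number, the limit was probably lowered earlier in the session or is capped by the system.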

Once you've set your stack size to any value other than unlimited, you cannot raise it above that value in your current shell session; you must log out and log in again to reset your stack size to a higher value. If you try to raise it anyway, ulimit will return an error that looks like:

-bash: ulimit: stack size: cannot modify limit: Operation not permitted
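
The error comes from the difference between soft and hard limits: in bash, ulimit without -S or -H sets both, and an unprivileged session cannot raise a hard limit once it has been lowered. The following sketch, which assumes bash or ksh and uses the hypothetical variable names `soft_stack` and `hard_stack`, prints both values so you can see whether the hard limit is the obstacle.

```shell
# Show the soft and hard stack size limits for the current session.
soft_stack=$(ulimit -S -s)
hard_stack=$(ulimit -H -s)
echo "stack size: soft=${soft_stack} hard=${hard_stack}"

# A hard limit other than "unlimited" is the ceiling for this session.
if [ "$hard_stack" != "unlimited" ]; then
    echo "stack size cannot be raised above ${hard_stack} in this session" >&2
fi
```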

By default, most versions of Linux set the core file size to zero, which prevents core files from being written. To generate a core file when you run a job on a research supercomputer at IU, use ulimit or limit to set the core file size to unlimited:

Shell   Initialization file   Command
bash    .bashrc               ulimit -c unlimited
ksh     .profile              ulimit -c unlimited
csh     .cshrc                limit coredumpsize unlimited
tcsh    .cshrc                limit coredumpsize unlimited
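
Resource limits are inherited by child processes, so a program started from the shell picks up the core file size set there. The sketch below assumes bash is available as `bash`; the variable names `core_now` and `child_core` are hypothetical. It raises the limit if possible and compares the value seen by the shell with the value seen by a child process.

```shell
# Try to enable core files; stay quiet if the limit cannot be raised.
ulimit -c unlimited 2>/dev/null || true

# Value seen by this shell.
core_now=$(ulimit -c)

# Value seen by a child process, which inherits the shell's limits.
child_core=$(bash -c 'ulimit -c')

echo "core file size: this shell=${core_now}, child process=${child_core}"
```

If either value is still 0, core files will not be written for programs started from that shell.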

To make sure you have enough space, create the core file in your Slate-Scratch space, and then link to it; for example (replace username with your IU username):

ln -s /N/scratch/username/core ./core

This is document awdh in the Knowledge Base.
Last modified on 2023-07-24 13:01:22.