I am trying to run a shell script that creates processes by invoking another shell script. I get a "Resource temporarily unavailable" error. How can I identify which limit (memory, process count, or file count) is causing this problem? Below are my ulimit -a results.

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 563959
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 10000000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Pablo Bianchi
  • You will have to give us more information. Perhaps post your bash script. I've never had trouble with this. Example: Before: ps aux | wc -l gave 183 and after running a script to create 4000 processes it gave 4183. – Doug Smythies Nov 04 '16 at 22:01
  • It's a simple script that starts a background process on each iteration: #!/bin/sh while read zi; do (source example.sh &) done <$1 After a few more runs I understood that it is not the process count that limits the execution, but memory. When I run another simple script I can reach about 1900 processes, after which I see this issue again. Which memory exactly plays a role here: the stack size in ulimit, or some memory limit specific to the bash process itself? – Viswanath Nov 08 '16 at 03:38
  • watch cat /proc/meminfo – Elder Geek Nov 09 '16 at 03:58
  • Please look at the screen shot and let me know if you notice any thing abnormal. I am using 140 GB RAM with Ubuntu OS 16.04 and a Cloud machine. – Viswanath Nov 09 '16 at 08:49
  • with quick calculation of memory used by application I could notice it is around 3.068756 GB – Viswanath Nov 09 '16 at 08:57
  • According to your ulimits memory is not limited. A process may still fail because a memory allocation fails but that would probably look different (unless the developer chose to give that misleading error message) since malloc sets errno to ENOMEM and not EAGAIN. Could you maybe run strace on your processes and watch for the last system call(s) before termination? – David Foerster Nov 10 '16 at 00:47
  • htop shows heavy CPU consumption: 20 CPU cores are exhausted by 370 program instances, whereas RAM consumption is just ~3 GB out of 140 GB (htop image). – Viswanath Nov 10 '16 at 06:23
  • strace_img1 strace_img2 do you see any thing suspicious in this log – Viswanath Nov 10 '16 at 09:12
  • Did you figure this out? From your htop image it looks to me as though you are running out of processes or threads, not memory. It seems that your process spins out many threads per process, hitting a limit at about 12480 (a little less actually for your case). I can only get a maximum of 12480 tasks or threads, and that seems independent of how much stack is allocated per thread. I have not been able to figure out a way to get more. – Doug Smythies Nov 20 '16 at 20:34
  • Yes, I have observed the same on my server, where a maximum of 12320 tasks could be created, whereas I am invoking many more than that with a simple script. I am looking for a way to increase this limit if possible; any directions would really help. My ulimits, however, show max user processes as 10000000. – Viswanath Dec 02 '16 at 13:54
  • Is there anything related to memory? I see that about 4 GB of memory is generally consumed (htop image). – Viswanath Dec 05 '16 at 07:51
  • Do you see a need to install PAE on Ubuntu 16.04 on a server machine? – Viswanath Dec 05 '16 at 08:01

1 Answer

For the case in the comments, where you were not using much memory per thread, you were hitting the cgroup limits. You will find the default to be around 12288, but the value is writable:

$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
12288
$ echo 15000 | sudo tee /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
15000
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
15000

EDIT: For more recent versions of Ubuntu (e.g. 24.04) the location has changed:

$ cat /sys/fs/cgroup/user.slice/user-1000.slice/pids.max
20668 

And if I use my "what is the thread limit" program (found here) to check, before:

$ ./thread-limit
Creating threads ...
100 threads so far ...
200 threads so far ...
...
12100 threads so far ...
12200 threads so far ...
Failed with return code 11 creating thread 12281 (Resource temporarily unavailable).
Malloc worked, hmmm

and after:

$ ./thread-limit
Creating threads ...
100 threads so far ...
200 threads so far ...
300 threads so far ...
...
14700 threads so far ...
14800 threads so far ...
14900 threads so far ...
Failed with return code 11 creating thread 14993 (Resource temporarily unavailable).
Malloc worked, hmmm

Of course, the numbers above are not exact because the "doug" user has a few other threads running, such as my SSH sessions to my server. Check with:

$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
8

EDIT: For more recent versions of Ubuntu (e.g. 24.04) the location has changed:

$ cat /sys/fs/cgroup/user.slice/user-1000.slice/pids.current
12
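The two values above can be combined into a quick headroom check. The following is only a minimal sketch, not part of the original answer: the function names find_pids_dir and pids_headroom are made up for illustration, and the optional second argument (an alternative root directory) exists only so the functions can be tried against a copy of the tree. It tries the newer path first and then the older one that contains the extra "pids" component.

```shell
#!/bin/sh
# Locate the pids controller directory for a user's slice. Tries the newer
# layout first, then the older one with the extra "pids" path component.
# $1 = numeric user id, $2 = cgroup root (assumed default: /sys/fs/cgroup).
find_pids_dir() {
    uid=$1
    root=${2:-/sys/fs/cgroup}
    for dir in "$root/user.slice/user-$uid.slice" \
               "$root/pids/user.slice/user-$uid.slice"; do
        if [ -r "$dir/pids.max" ] && [ -r "$dir/pids.current" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Print the current task count, the limit, and how many more tasks fit.
pids_headroom() {
    dir=$(find_pids_dir "$@") || {
        echo "pids controller not found" >&2
        return 1
    }
    max=$(cat "$dir/pids.max")
    current=$(cat "$dir/pids.current")
    if [ "$max" = "max" ]; then
        # The kernel reports the literal word "max" when no limit is set.
        echo "current=$current max=unlimited"
    else
        echo "current=$current max=$max headroom=$(($max - $current))"
    fi
}

# Report for the invoking user; tolerate systems without the controller.
pids_headroom "$(id -u)" || true
```

With the figures shown above (pids.max of 15000 and pids.current of 8) this would report a headroom of 14992, which is in line with the thread-limit run failing while creating thread 14993.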

Program used:

/* compile with:   gcc -pthread -o thread-limit thread-limit.c */
/* originally from: http://www.volano.com/linuxnotes.html */

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>

#define MAX_THREADS 100000
/* Stack size requested for each thread. Named THREAD_STACK_SIZE so that it
   does not redefine the system's own PTHREAD_STACK_MIN macro. */
#define THREAD_STACK_SIZE (1*1024*1024*1024)

int i;

/* Thread body: just sleep, so the thread stays alive and keeps counting
   against the limits. Uses the signature pthread_create() expects. */
void *run(void *arg) {
    (void) arg;
    sleep(60 * 60);
    return NULL;
}

int main(int argc, char *argv[]) {
    int rc = 0;
    pthread_t thread[MAX_THREADS];
    pthread_attr_t thread_attr;

    pthread_attr_init(&thread_attr);
    pthread_attr_setstacksize(&thread_attr, THREAD_STACK_SIZE);

    printf("Creating threads ...\n");
    for (i = 0; i < MAX_THREADS && rc == 0; i++) {
        rc = pthread_create(&(thread[i]), &thread_attr, run, NULL);
        if (rc == 0) {
            pthread_detach(thread[i]);
            if ((i + 1) % 100 == 0)
                printf("%i threads so far ...\n", i + 1);
        } else {
            printf("Failed with return code %i creating thread %i (%s).\n",
                   rc, i + 1, strerror(rc));

            // can we allocate memory?
            char *block = NULL;
            block = malloc(65545);
            if (block == NULL)
                printf("Malloc failed too :( \n");
            else
                printf("Malloc worked, hmmm\n");
        }
    }
    sleep(60 * 60); // ctrl+c to exit; makes it easier to see mem use
    exit(0);
}

See also here

EDIT May, 2020: For newer versions of Ubuntu, the default maximum PID number is now 4194304, and therefore adjusting it is not needed.

Now, if you have enough memory, the next limit will be defined by the default maximum PID number, which is 32768, but is also writable. Obviously, in order to have more than 32768 simultaneous processes, tasks, or threads, their PIDs have to be allowed to go higher:

$ cat /proc/sys/kernel/pid_max
32768
$ echo 80000 | sudo tee /proc/sys/kernel/pid_max
80000
$ cat /proc/sys/kernel/pid_max
80000

Note that a number bigger than 2**16 was chosen quite on purpose, to see if it was actually allowed. And so now, set the cgroup max to, say, 70000:

$ echo 70000 | sudo tee /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
70000
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
70000

At this point, note that the program listed above seems to have a limit of about 32768 threads, even if resources are still available, so I used another method. My test server with 16 gigabytes of memory seems to exhaust some other resource at about 62344 tasks, even though there does still seem to be memory available.

$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
62344

top:

top - 13:48:26 up 21:08,  4 users,  load average: 281.52, 134.90, 70.93
Tasks: 62535 total, 201 running, 62334 sleeping,   0 stopped,   0 zombie
%Cpu0  : 96.6 us,  2.4 sy,  0.0 ni,  1.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 95.7 us,  2.4 sy,  0.0 ni,  1.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 95.1 us,  3.1 sy,  0.0 ni,  1.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 93.5 us,  4.6 sy,  0.0 ni,  1.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  : 94.8 us,  3.4 sy,  0.0 ni,  1.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  : 95.5 us,  2.6 sy,  0.0 ni,  1.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  : 94.7 us,  3.5 sy,  0.0 ni,  1.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  : 93.8 us,  4.5 sy,  0.0 ni,  1.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 15999116 total,   758684 free, 10344908 used,  4895524 buff/cache
KiB Swap: 16472060 total, 16470396 free,     1664 used.  4031160 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
37884 doug      20   0  108052  68920   3104 R   5.7  0.4   1:23.08 top
24075 doug      20   0    4360    652    576 S   0.4  0.0   0:00.31 consume
26006 doug      20   0    4360    796    720 S   0.4  0.0   0:00.09 consume
30062 doug      20   0    4360    732    656 S   0.4  0.0   0:00.17 consume
21009 doug      20   0    4360    748    672 S   0.3  0.0   0:00.26 consume

Seems I finally hit my default ulimit settings for both user processes and number of timers (signals):

$ ulimit -i
62340
doug@s15:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 62340
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 32768
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 62340
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
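The ulimit output above describes the shell it is run in; a process that is already running may have been started with different limits. On Linux those can be read from /proc/<pid>/limits. The following is a minimal sketch for pulling out the two limits mentioned above; the function name proc_limit is made up for illustration, and the second argument defaults to the limits file of the current shell.

```shell
#!/bin/sh
# Print the soft and hard values of one limit from a /proc/<pid>/limits file.
# $1 = limit name exactly as it appears at the start of the line
# $2 = limits file to read (assumed default: the current shell's own file)
proc_limit() {
    name=$1
    file=${2:-/proc/$$/limits}
    awk -v name="$name" '
        index($0, name) == 1 {
            # Everything after the name is: soft limit, hard limit, units.
            rest = substr($0, length(name) + 1)
            split(rest, f, " ")
            print "soft=" f[1], "hard=" f[2]
        }' "$file"
}

# Show the two limits that were hit in the test above, if /proc is available.
if [ -r "/proc/$$/limits" ]; then
    echo "Max processes:       $(proc_limit 'Max processes')"
    echo "Max pending signals: $(proc_limit 'Max pending signals')"
fi
```

To inspect another process, pass its file explicitly, for example proc_limit 'Max processes' /proc/1234/limits, where 1234 is a placeholder PID.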

So I raised those limits; in my case, I did it via /etc/security/limits.conf:

# /etc/security/limits.conf
#
# S18 specific edits. 2019.12.24
#       also for a ridiculous number of threads test.
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open file descriptors
* - nofile 32768
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
* - nproc 200000
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
* - sigpending 200000
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4

# End of file

With those limits raised, I am able to get to 126020 threads before fork fails again. This time the limit was the following (keep in mind that there are about 150 root-owned threads on this server before the test starts):

$ cat /proc/sys/kernel/threads-max
126189

O.K. so now adjusting that parameter:

$ echo 99999999 | sudo tee /proc/sys/kernel/threads-max
99999999

I can get to about 132,000 threads before my 16 gigabyte server starts to swap memory, and trouble erupts.

$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
132016

Note: running top places a significant additional load on the system under these conditions, so I didn't run it. However, here is the memory usage:

doug@s18:~/config/etc/security$ free -m
              total        used        free      shared  buff/cache   available
Mem:          15859       15509         270           1          79         137
Swap:          2047           4        2043

At some point you will get into trouble, but it is absolutely amazing how gracefully the system bogs down. Once my system started to swap, it totally bogged down and I had many of these errors:

Feb 17 16:13:02 s15 kernel: [  967.907305] INFO: task waiter:119371 blocked for more than 120 seconds.
Feb 17 16:13:02 s15 kernel: [  967.907335]       Not tainted 4.10.0-rc8-stock #194
Feb 17 16:13:02 s15 kernel: [  967.907357] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

And my load average ballooned to ~29000. But I just left the computer for an hour and it sorted itself out. I staggered the spin out of the threads by 200 microseconds per spin out, and that also seemed to help.
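The limits encountered along the way were the cgroup pids.max, pid_max, the per-user process limit, and threads-max. The following is a minimal sketch, not something from the test runs above, that prints them side by side and picks the smallest one; the function names min_limit, lowest_limit and read_limit are made up for illustration, and a file that cannot be read is simply treated as "unlimited".

```shell
#!/bin/sh
# Print the smaller of two limits. The words "max", "infinity" and
# "unlimited" (and an empty value) are treated as "no limit".
min_limit() {
    a=$1
    b=$2
    case $a in max|infinity|unlimited|'') echo "$b"; return ;; esac
    case $b in max|infinity|unlimited|'') echo "$a"; return ;; esac
    if [ "$a" -le "$b" ]; then echo "$a"; else echo "$b"; fi
}

# Reduce any number of limits to the smallest numeric one.
lowest_limit() {
    lowest=unlimited
    for value in "$@"; do
        lowest=$(min_limit "$lowest" "$value")
    done
    echo "$lowest"
}

# Read a single value from a file, or report "unlimited" if it is missing.
read_limit() {
    if [ -r "$1" ]; then cat "$1"; else echo unlimited; fi
}

pid_max=$(read_limit /proc/sys/kernel/pid_max)
threads_max=$(read_limit /proc/sys/kernel/threads-max)

# Soft "Max processes" limit of this process, as listed in /proc.
nproc=unlimited
if [ -r /proc/self/limits ]; then
    nproc=$(awk '/^Max processes/ {print $3}' /proc/self/limits)
fi

# cgroup limit for the current user; both path layouts are tried.
cgroup_max=unlimited
for f in "/sys/fs/cgroup/user.slice/user-$(id -u).slice/pids.max" \
         "/sys/fs/cgroup/pids/user.slice/user-$(id -u).slice/pids.max"; do
    if [ -r "$f" ]; then
        cgroup_max=$(cat "$f")
        break
    fi
done

echo "pid_max=$pid_max threads-max=$threads_max nproc=$nproc pids.max=$cgroup_max"
echo "lowest: $(lowest_limit "$pid_max" "$threads_max" "$nproc" "$cgroup_max")"
```

The lowest value is only a first guess at which limit will be hit next; memory and the pending-signals limit are not covered by this sketch.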

Doug Smythies
  • On Ubuntu with systemd, the resource limits can be configured in /etc/systemd/system.conf. https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html explains the relevant configuration options. Setting "DefaultTasksAccounting=no" in /etc/systemd/system.conf removes the default of setting limits for pids.max. – Lari Hotari Nov 09 '17 at 13:14
  • Seems that adding "UserTasksMax=infinity" to /etc/systemd/logind.conf would remove the limit. – Lari Hotari Nov 09 '17 at 13:29
  • @LariHotari : Thanks for your information. Indeed your method works fine. I still have to raise the pid_max value and the ulimits. I might edit my answer to include your method. – Doug Smythies Nov 09 '17 at 17:03
  • This answer helped me figure out errors on Ubuntu 18.04 related to "java.lang.outofmemoryerror unable to create native thread". My research had shown me a lot of "ulimit" related thread limitation problems. But his answer led me to understand new systemd limitations. Thanks. – Josh Aug 22 '19 at 22:03