I'm running into an issue where multiple instances of the same Python program running on my server cause some of the jobs to throw the following error:
  File "/home/joe/workspace/20211016_235943_532300_ltr/venv38/lib/python3.8/site-packages/paramiko/client.py", line 406, in connect
    t.start_client(timeout=timeout)
  File "/home/joe/workspace/20211016_235943_532300_ltr/venv38/lib/python3.8/site-packages/paramiko/transport.py", line 653, in start_client
    self.start()
  File "/usr/local/lib/python3.8/threading.py", line 852, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

The error message is clear: the Paramiko library, which makes the SSH connections, is unable to create new threads. I've read up on this issue on Stack Overflow and even tried implementing this solution -
$ ps -fLu exec | wc -l
3956
$ ulimit -u
16384
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 513699
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 16384
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

This has not solved the issue, and I haven't been able to find a better solution. This server's sole purpose is to run these jobs, so I can dedicate whatever maximums the system can support.
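For reference, here is a small probe script I can run on the box (my own sketch, not code from the failing jobs; `probe_thread_limit` is a name I made up) to see how many threads a single process can actually start before hitting the same RuntimeError:

```python
import threading

def probe_thread_limit(cap: int) -> int:
    """Start idle daemon threads until creation fails or `cap` is reached,
    then release and join them. Returns how many threads actually started."""
    release = threading.Event()
    started = []
    try:
        for _ in range(cap):
            t = threading.Thread(target=release.wait, daemon=True)
            t.start()  # raises RuntimeError when the OS refuses a new thread
            started.append(t)
    except RuntimeError:
        pass           # the same "can't start new thread" as in the traceback
    finally:
        release.set()  # let every probe thread exit
        for t in started:
            t.join()
    return len(started)

if __name__ == "__main__":
    print("threads started:", probe_thread_limit(cap=2000))
```

Running it alongside the jobs should show whether the per-process ceiling or some system-wide limit is what's being exhausted.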
Any advice on how to resolve this issue would be really appreciated.
Some relevant system info:
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           125G        5.0G        1.3G         93M        119G        120G
Swap:           62G         36K         62G
$ cat /proc/sys/kernel/threads-max
1027399
$ cat /proc/sys/kernel/pid_max
32768
$ cat /proc/sys/vm/max_map_count
65530
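To compare these kernel knobs across runs, I use a small helper like the following (again my own sketch, Linux-only; `kernel_thread_limits` is a hypothetical name) that reads the same /proc files as the commands above:

```python
from pathlib import Path

# The /proc/sys knobs queried above, relative to /proc/sys.
KNOBS = ("kernel/threads-max", "kernel/pid_max", "vm/max_map_count")

def kernel_thread_limits() -> dict:
    """Return the kernel-wide limits that can cap thread creation on Linux."""
    return {knob: int(Path("/proc/sys", knob).read_text()) for knob in KNOBS}

if __name__ == "__main__":
    for knob, value in kernel_thread_limits().items():
        print(f"{knob}: {value}")
```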