Setting limits in sub-process #176

@orenbenkiki

Calling threadpool_limits in a sub-process works on some of my servers but hangs on ones running a specific OS version:

$ hostnamectl
   Static hostname: n86.my.domain
         Icon name: computer-server
           Chassis: server
        Machine ID: 196b497eccff4526a8e34834c95e3de5
           Boot ID: b8318d26cd394a85b706beb2d7324f73
  Operating System: AlmaLinux 8.9 (Midnight Oncilla)
       CPE OS Name: cpe:/o:almalinux:almalinux:8::baseos
            Kernel: Linux 4.18.0-513.18.2.el8_9.x86_64
      Architecture: x86-64

The code is:

import os
import sys
from threadpoolctl import threadpool_limits
from multiprocessing import get_context

def eprintln(text):
    print(text, file=sys.stderr, flush=True)

DID_THREADCTL_FOR_PID = None

def invocation(index: int) -> int:
    global DID_THREADCTL_FOR_PID
    if os.getpid() != DID_THREADCTL_FOR_PID:
        DID_THREADCTL_FOR_PID = os.getpid()
        eprintln(f"PID: {os.getpid()} invocation index: {index} Do threadpool_limits...")
        threadpool_limits(limits=1)
        eprintln(f"PID: {os.getpid()} invocation index: {index} Did threadpool_limits.")
    else:
        eprintln(f"PID: {os.getpid()} invocation index: {index} Old threadpool_limits.")
    return index

invocations = 4
processes = 2
threadpool_limits(limits=processes)

results = [None] * invocations
eprintln(f"PID: {os.getpid()} Do imap...")
with get_context("fork").Pool(processes) as pool:
    for index in pool.imap_unordered(invocation, range(invocations)):
        results[index] = index
        eprintln(f"PID: {os.getpid()} - Did imap index: {index}")

eprintln(f"PID: {os.getpid()} Did imap results: {results}")
assert results == list(range(len(results)))

When I run it on the above OS with Python 3.12.3 and threadpoolctl 3.4.0, I get:

$ python3 bug.py 
PID: 1576849 Do imap...
PID: 1576852 invocation index: 0 Do threadpool_limits...
PID: 1576853 invocation index: 1 Do threadpool_limits...
PID: 1576853 invocation index: 1 Did threadpool_limits.
PID: 1576853 invocation index: 2 Old threadpool_limits.
PID: 1576853 invocation index: 3 Old threadpool_limits.
PID: 1576849 - Did imap index: 1
PID: 1576849 - Did imap index: 2
PID: 1576849 - Did imap index: 3

And the process hangs: PID 1576852 never prints "Did threadpool_limits." for index 0. Poking around, it seems that the call to libc.dl_iterate_phdr never returns, even though each match_library_callback call does return. I am using Python 3.12.3 compiled from source on this OS, with numpy, pandas, scipy, etc. installed via pip.
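For anyone trying to reproduce this, here is a minimal sketch of how the stuck call could be located from inside the worker, using only the standard library's faulthandler module. The helper name call_with_hang_watchdog and the 10 second timeout are hypothetical choices for illustration, not part of threadpoolctl. The dump shows Python-level frames only, so it can point at the ctypes call but not at what libc is doing underneath.

```python
import faulthandler
import sys
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_hang_watchdog(func: Callable[[], T], timeout: float = 10.0) -> T:
    """Run func(); if it is still running after `timeout` seconds,
    dump the tracebacks of all threads to the original stderr.

    Hypothetical helper for illustration only.
    """
    # Arm a watchdog that writes the tracebacks after `timeout` seconds.
    # exit=False keeps the (possibly hung) process alive for further inspection.
    faulthandler.dump_traceback_later(timeout, exit=False, file=sys.__stderr__)
    try:
        return func()
    finally:
        # Disarm the watchdog if func() returned (or raised) in time.
        faulthandler.cancel_dump_traceback_later()
```

In the script above, the call inside invocation could be wrapped as call_with_hang_watchdog(lambda: threadpool_limits(limits=1)), so that a hung worker reports its Python stack instead of staying silent.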

The same script works fine on older versions of the OS, e.g.:

$ hostnamectl
   Static hostname: n97.my.domain
         Icon name: computer-server
           Chassis: server
        Machine ID: 5e543d50691943628e8e20441f502406
           Boot ID: 0d876250c0ec4a149e8bdb12c99c20eb
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-1160.15.2.el7.x86_64
      Architecture: x86-64

With Python version 3.12.2, again with threadpoolctl version 3.4.0, I get the expected output:

$ python3 bug.py 
PID: 32872 Do imap...
PID: 32874 invocation index: 0 Do threadpool_limits...
PID: 32875 invocation index: 1 Do threadpool_limits...
PID: 32874 invocation index: 0 Did threadpool_limits.
PID: 32875 invocation index: 1 Did threadpool_limits.
PID: 32874 invocation index: 2 Old threadpool_limits.
PID: 32872 - Did imap index: 0
PID: 32875 invocation index: 3 Old threadpool_limits.
PID: 32872 - Did imap index: 1
PID: 32872 - Did imap index: 2
PID: 32872 - Did imap index: 3
PID: 32872 Did imap results: [0, 1, 2, 3]

Any ideas on what I can do to fix this?
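One workaround I am considering, which I have not verified on the affected machine, is to avoid calling threadpool_limits in the workers at all and to cap the native thread pools through environment variables instead. The sketch below assumes the numeric libraries in use honor OMP_NUM_THREADS, OPENBLAS_NUM_THREADS and MKL_NUM_THREADS, and that it runs before numpy, scipy, etc. are imported; the helper name limit_threads_via_env is hypothetical. Unlike the script above, it also limits the parent process, since forked children inherit the already-loaded libraries.

```python
import os

# Environment variables that common native thread pools read when the
# library is first loaded (assumption: the installed BLAS/OpenMP builds
# honor them; other runtimes may use different variables).
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_threads_via_env(limit: int) -> dict:
    """Set the thread-count environment variables to `limit`.

    Hypothetical helper: must be called before importing numpy, scipy,
    etc., because the values are typically read only at library load time.
    Returns the variables that were set, for logging.
    """
    value = str(limit)
    for name in THREAD_ENV_VARS:
        os.environ[name] = value
    return {name: os.environ[name] for name in THREAD_ENV_VARS}
```

Calling limit_threads_via_env(1) at the very top of bug.py, before the other imports, would replace both threadpool_limits calls, at the cost of the parent no longer getting its own larger limit.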
