Error 439: CANNOT_SCHEDULE_TASK

подсказка

This error occurs when ClickHouse cannot allocate a new thread from the thread pool to execute a task. It indicates that either the thread pool is exhausted (all threads busy), system thread limit is reached, or the OS cannot create new threads.

Most common causes

Thread pool exhausted
- All threads in pool busy with active tasks
- Too many concurrent queries requesting threads
- Thread pool size limit reached (threads = max pool size)
- Jobs queued waiting for available threads
System thread limit reached
- OS kernel thread limit exceeded
- ulimit -u (max user processes) reached
- System-wide thread limit hit
- Container or cgroup thread limit reached
High query concurrency with max_threads settings
- Many queries each requesting max_threads threads
- max_insert_threads setting too high with many concurrent inserts
- Thread demand exceeds available thread pool capacity
- Spike in concurrent query workload
Resource exhaustion
- System cannot allocate memory for new threads
- Out of memory for thread stack allocation
- System resource limits preventing thread creation
- Container memory limits affecting thread creation
Misconfigured thread pool settings
- max_thread_pool_size set too low for workload
- Thread pool not properly sized for concurrent queries
- Imbalance between query concurrency and thread availability

Common solutions

1. Check current thread usage

-- View current thread pool status
SELECT 
    metric,
    value
FROM system.metrics
WHERE metric LIKE '%Thread%'
ORDER BY metric;

-- Key metrics to check:
-- QueryPipelineExecutorThreads - active query execution threads
-- QueryPipelineExecutorThreadsActive - threads currently executing
-- GlobalThread - total threads in global pool

2. Check thread pool configuration

-- View thread pool settings
SELECT 
    name,
    value,
    description
FROM system.server_settings
WHERE name LIKE '%thread%'
ORDER BY name;

-- Key settings:
-- max_thread_pool_size - maximum threads in global pool
-- max_thread_pool_free_size - idle threads kept in pool
-- thread_pool_queue_size - max tasks waiting in queue

3. Reduce per-query thread usage

-- Limit threads for specific query
SELECT * FROM large_table
SETTINGS max_threads = 4;

-- Reduce insert threads
INSERT INTO table
SETTINGS max_insert_threads = 4;

-- Set user-level defaults
ALTER USER your_user SETTINGS max_threads = 8;

4. Check system thread limits

# Check current thread limits
ulimit -u

# Check system-wide limits
cat /proc/sys/kernel/threads-max
cat /proc/sys/kernel/pid_max

# Check current thread count
ps -eLf | wc -l

# For containers, check cgroup limits
cat /sys/fs/cgroup/pids/pids.max

5. Enable concurrency control (if available)

-- Check concurrency control settings
SELECT 
    name,
    value
FROM system.server_settings
WHERE name LIKE '%concurrent_threads_soft_limit%';

-- concurrent_threads_soft_limit_ratio_to_cores - limits threads per core
-- concurrent_threads_soft_limit_num - absolute thread limit

примечание

Concurrency control was broken in versions before October 2024 fix. Fixed properly in 24.10+.

6. Monitor and limit concurrent queries

-- Check concurrent query count
SELECT count() AS concurrent_queries
FROM system.processes;

-- Limit concurrent queries per user
ALTER USER your_user SETTINGS max_concurrent_queries_for_user = 10;

-- Check thread usage per query
SELECT 
    query_id,
    user,
    ProfileEvents['QueryPipelineExecutorThreads'] AS threads_used,
    query
FROM system.processes
ORDER BY threads_used DESC;

Common scenarios

Scenario 1: No free threads in pool

Error: Cannot schedule a task: no free thread (timeout=0) 
(threads=15000, jobs=15000)

Cause: Thread pool completely saturated; all 15000 threads busy.

Solution:

Reduce concurrent query load
Lower max_threads and max_insert_threads settings
Increase `max_thread_pool_size if system can handle it
Wait for queries to complete and retry

Scenario 2: Failed to start thread

Error: Cannot schedule a task: failed to start the thread 
(threads=14755, jobs=14754)

Cause: System unable to create new thread (OS or resource limit).

Solution:

# Increase system limits
ulimit -u 65535

# Or in /etc/security/limits.conf
* soft nproc 65535
* hard nproc 65535

# Increase kernel limits
sysctl -w kernel.threads-max=100000
sysctl -w kernel.pid_max=100000

Scenario 3: Cannot allocate thread

Error: Cannot schedule a task: cannot allocate thread

Cause: Memory or system resources insufficient for thread creation.

Solution:

Check available memory: free -h
Check if system is swapping: vmstat 1
Reduce concurrent query load
Increase system memory or reduce thread usage

Scenario 4: Insert spike with max_insert_threads

Error: CANNOT_SCHEDULE_TASK during high insert load

Cause: Many concurrent inserts each using high max_insert_threads.

Solution:

-- Reduce insert threads globally
SET max_insert_threads = 4;

-- For specific insert
INSERT INTO table
SETTINGS max_insert_threads = 2;

-- Use async inserts to batch operations
SET async_insert = 1;

Scenario 5: Query spike exhausting thread pool

Error appears during traffic spike
Multiple queries failing simultaneously

Cause: Sudden increase in concurrent queries.

Solution:

Implement query queuing or rate limiting on client side
Reduce max_threads per query
Increase max_thread_pool_size (if system allows)
Scale horizontally (add more replicas)

Prevention tips

Set reasonable thread limits: Don't use excessively high max_threads values
Monitor thread usage: Track thread pool metrics regularly
Configure system limits: Ensure OS limits are appropriate for workload
Use async inserts: Reduce thread usage for insert workloads
Implement rate limiting: Control concurrent query load
Scale horizontally: Add replicas to distribute thread demand
Optimize queries: Efficient queries need fewer threads and complete faster

Debugging steps

Check recent CANNOT_SCHEDULE_TASK errors:

SELECT 
    event_time,
    query_id,
    user,
    exception,
    query
FROM system.query_log
WHERE exception_code = 439
    AND event_date >= today() - 1
ORDER BY event_time DESC
LIMIT 20;

Monitor thread pool metrics:

SELECT 
    event_time,
    CurrentMetric_GlobalThread AS global_threads,
    CurrentMetric_QueryPipelineExecutorThreads AS executor_threads,
    CurrentMetric_QueryPipelineExecutorThreadsActive AS active_threads,
    CurrentMetric_Query AS concurrent_queries
FROM system.metric_log
WHERE event_time >= now() - INTERVAL 1 HOUR
ORDER BY event_time DESC
LIMIT 100;

Check concurrent query patterns:

SELECT 
    toStartOfMinute(event_time) AS minute,
    count() AS query_count,
    countIf(exception_code = 439) AS thread_errors,
    avg(ProfileEvents['QueryPipelineExecutorThreads']) AS avg_threads
FROM system.query_log
WHERE event_time >= now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute DESC;

Identify high thread-consuming queries:

SELECT 
    query_id,
    user,
    ProfileEvents['QueryPipelineExecutorThreads'] AS threads,
    ProfileEvents['QueryPipelineExecutorThreadsActive'] AS active_threads,
    normalizeQuery(query) AS query_pattern
FROM system.query_log
WHERE event_time >= now() - INTERVAL 1 HOUR
    AND type = 'QueryFinish'
ORDER BY threads DESC
LIMIT 20;

Check system thread limits:

# Check user process limit
ulimit -u

# Check current thread count
ps -eLf | wc -l

# Check system limits
cat /proc/sys/kernel/threads-max
cat /proc/sys/kernel/pid_max

# For containers
cat /sys/fs/cgroup/pids/pids.current
cat /sys/fs/cgroup/pids/pids.max

Review thread pool configuration:

SELECT 
    name,
    value,
    default
FROM system.server_settings
WHERE name IN (
    'max_thread_pool_size',
    'max_thread_pool_free_size',
    'thread_pool_queue_size',
    'concurrent_threads_soft_limit_num',
    'concurrent_threads_soft_limit_ratio_to_cores'
);

Special considerations

For ClickHouse Cloud:

Thread pool sized based on instance tier
Cannot directly configure max_thread_pool_size
Errors may indicate need to scale up instance
Temporary spikes should be tolerated with retry logic

Thread pool types:

Global thread pool: General query execution threads
Background pool: Merges and mutations
IO pool: Disk and network I/O operations
Schedule pool: Background scheduled tasks

Concurrency control:

Feature to limit threads based on CPU cores
Was broken in versions before ~October 2024
Fixed properly in 24.10+
Settings: concurrent_threads_soft_limit_ratio_to_cores

Thread vs query limits:

max_concurrent_queries limits number of queries
max_threads limits threads per query
Total threads = queries × threads_per_query
Thread pool must accommodate total demand

Thread-related settings

Server-level (config.xml):

<clickhouse>
    <!-- Maximum threads in global pool -->
    <max_thread_pool_size>10000</max_thread_pool_size>
    
    <!-- Idle threads kept in pool -->
    <max_thread_pool_free_size>1000</max_thread_pool_free_size>
    
    <!-- Max tasks waiting for threads -->
    <thread_pool_queue_size>10000</thread_pool_queue_size>
    
    <!-- Soft limit based on cores (if > 0) -->
    <concurrent_threads_soft_limit_ratio_to_cores>2</concurrent_threads_soft_limit_ratio_to_cores>
</clickhouse>

Query-level:

-- Threads for reading/processing
SET max_threads = 8;

-- Threads for parallel inserts
SET max_insert_threads = 4;

-- Threads for distributed queries
SET max_distributed_connections = 1024;

-- Background operations
SET background_pool_size = 16;
SET background_merges_mutations_concurrency_ratio = 2;

System limit configuration

Linux ulimits:

# Temporary increase
ulimit -u 65535

# Permanent configuration in /etc/security/limits.conf
clickhouse soft nproc 65535
clickhouse hard nproc 65535

# Or for all users
* soft nproc 65535
* hard nproc 65535

Kernel parameters:

# Increase thread limits
sysctl -w kernel.threads-max=200000
sysctl -w kernel.pid_max=200000

# Make permanent in /etc/sysctl.conf
kernel.threads-max = 200000
kernel.pid_max = 200000

Container limits (Kubernetes):

# Pod spec - adjust if needed
spec:
  containers:
  - name: clickhouse
    resources:
      limits:
        # Memory affects thread creation
        memory: 32Gi

Error message variations

"no free thread":

Thread pool at capacity
All threads busy with tasks
More common, usually temporary

"failed to start the thread":

System failed to create new thread
OS or resource limit reached
More serious, indicates system issue

"cannot allocate thread":

Memory allocation failed for thread
System resource exhaustion
May indicate memory pressure

Monitoring thread health

-- Real-time thread usage
SELECT 
    metric,
    value,
    description
FROM system.metrics
WHERE metric IN (
    'GlobalThread',
    'GlobalThreadActive',
    'LocalThread',
    'LocalThreadActive',
    'QueryPipelineExecutorThreads',
    'QueryPipelineExecutorThreadsActive'
);

-- Thread usage over time
SELECT 
    toStartOfMinute(event_time) AS minute,
    max(CurrentMetric_GlobalThread) AS max_threads,
    max(CurrentMetric_GlobalThreadActive) AS max_active,
    max(CurrentMetric_Query) AS max_queries
FROM system.metric_log
WHERE event_time >= now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute DESC;

-- Queries that failed due to thread exhaustion
SELECT 
    toStartOfMinute(event_time) AS minute,
    count() AS error_count,
    count(DISTINCT user) AS affected_users
FROM system.query_log
WHERE exception_code = 439
    AND event_date >= today() - 7
GROUP BY minute
HAVING error_count > 0
ORDER BY minute DESC;

Recovery and mitigation

Immediate actions:

Wait and retry - Thread pool may free up quickly

Kill long-running queries - Free up threads

-- Find long-running queries
SELECT query_id, user, elapsed, query
FROM system.processes
WHERE elapsed > 300
ORDER BY elapsed DESC;

-- Kill if appropriate
KILL QUERY WHERE query_id = 'long_running_query';

Reduce query load - Temporarily throttle queries on client side
Restart ClickHouse - Clears thread pool (last resort)

Long-term fixes:

Optimize query thread usage:

-- Set sensible defaults
ALTER USER default SETTINGS max_threads = 8;
ALTER USER default SETTINGS max_insert_threads = 4;

Increase thread pool size (if system can handle it):

<max_thread_pool_size>20000</max_thread_pool_size>

Configure concurrency control:

<!-- Limit threads based on CPU cores -->
<concurrent_threads_soft_limit_ratio_to_cores>2</concurrent_threads_soft_limit_ratio_to_cores>

Increase system limits:

# Increase user process limit
ulimit -u 100000

# Increase kernel limits
sysctl -w kernel.threads-max=200000

Prevention tips

Set appropriate max_threads: Don't use default if you have high concurrency
Monitor thread metrics: Track thread pool usage trends
Configure system limits properly: Ensure OS limits match workload
Use async inserts: Reduce thread consumption for insert operations
Implement rate limiting: Control concurrent query load
Test under load: Verify thread pool sizing for peak loads
Keep ClickHouse updated: Concurrency control improvements in newer versions

Known issues and fixes

Issue: Concurrency control broken before October 2024

Affected: Versions before ~24.10
Symptom: concurrent_threads_soft_limit_ratio_to_cores not working
Fix: Merged in October 2024, available in 24.10+
Impact: Thread pool could be exhausted more easily

Issue: High insert threads with concurrent inserts

Symptom: Many inserts with max_insert_threads exhausting pool
Cause: Each insert requesting many threads simultaneously
Solution: Reduce max_insert_threads or use async inserts

Issue: Query pipeline executor threads

Symptom: QueryPipelineExecutorThreadsActive reaching pool limit
Context: Modern query execution uses pipeline executor threads
Solution: Proper concurrency control (fixed in 24.10+)

Diagnosing thread pool exhaustion

-- Snapshot of thread usage at error time
WITH error_times AS (
    SELECT DISTINCT toStartOfMinute(event_time) AS error_minute
    FROM system.query_log
    WHERE exception_code = 439
        AND event_time >= now() - INTERVAL 6 HOUR
)
SELECT 
    m.event_time,
    m.CurrentMetric_GlobalThread AS total_threads,
    m.CurrentMetric_GlobalThreadActive AS active_threads,
    m.CurrentMetric_Query AS concurrent_queries,
    m.CurrentMetric_QueryPipelineExecutorThreads AS executor_threads
FROM system.metric_log m
INNER JOIN error_times e ON toStartOfMinute(m.event_time) = e.error_minute
ORDER BY m.event_time;

-- What was running when error occurred
SELECT 
    user,
    count() AS query_count,
    sum(ProfileEvents['QueryPipelineExecutorThreads']) AS total_threads_requested
FROM system.processes
WHERE query_start_time >= 'time_of_error' - INTERVAL 1 MINUTE
    AND query_start_time <= 'time_of_error' + INTERVAL 1 MINUTE
GROUP BY user
ORDER BY total_threads_requested DESC;

Recommended thread settings

For high-concurrency workloads:

-- Per-query thread limits
SET max_threads = 4;  -- Instead of default (CPU cores)
SET max_insert_threads = 4;

-- Enable concurrency control
-- (server config for 24.10+)
concurrent_threads_soft_limit_ratio_to_cores = 2

For analytical workloads:

-- Can use more threads per query
SET max_threads = 16;

-- But limit concurrent queries
SET max_concurrent_queries_for_user = 5;

For mixed workloads:

-- Balance between parallelism and concurrency
SET max_threads = 8;
SET max_insert_threads = 4;
SET max_concurrent_queries_for_user = 20;

When to `increase max_thread_pool_size`

Consider increasing if:

Consistently hitting thread pool limit
High concurrency is expected workload pattern
System has sufficient resources (CPU, memory)
Errors correlate with legitimate traffic spikes

Don't increase if:

System already at resource limits
Better to reduce per-query thread usage
Horizontal scaling is an option
Queries can be optimized to use fewer threads

Thread pool sizing guidelines

Recommended max_thread_pool_size calculation:
= (concurrent_queries × max_threads_per_query) × 1.5 safety margin

Example:
- Expected concurrent queries: 50
- Average max_threads: 8
- Calculation: 50 × 8 × 1.5 = 600 threads

But also consider:
- System CPU cores (more threads than cores causes context switching)
- Available memory (each thread has stack, typically 8-10 MB)
- Background operations (merges, mutations need threads too)

Temporary workarounds

While waiting for long-term fixes:

-- Reduce thread usage across all queries
ALTER SETTINGS PROFILE default SET max_threads = 4;

-- Prioritize critical queries
SELECT * FROM important_table
SETTINGS priority = 1;  -- Higher priority

-- For non-critical queries
SELECT * FROM less_important_table
SETTINGS priority = 10,  -- Lower priority
         max_threads = 2;  -- Fewer threads

For ClickHouse Cloud users

Limitations:

Cannot directly configure max_thread_pool_size
Thread pool sized by instance tier
Need to upgrade tier if consistently hitting limits

Recommendations:

Set appropriate max_threads and max_insert_threads
Monitor thread usage metrics
Scale up tier if thread exhaustion is frequent
Implement retry logic for transient errors
Consider horizontal scaling (more replicas)

Escalation:

If errors persist after optimization
If thread pool appears undersized for tier
Contact support with thread usage metrics

If you're experiencing this error:

Check if this is a transient spike (retry may succeed)
Review current thread pool usage in system.metrics
Check for traffic spike or abnormal query patterns
Verify system thread limits are adequate
Reduce max_threads and max_insert_threads if set too high
Monitor for queries using excessive threads
For persistent issues, increase max_thread_pool_size (self-managed) or scale up (Cloud)
Ensure concurrency control is working (upgrade to 24.10+ if needed)
Implement client-side retry with exponential backoff

Related documentation:

Most common causes​

Common solutions​

Common scenarios​

Prevention tips​

Debugging steps​

Special considerations​

Thread-related settings​

System limit configuration​

Error message variations​

Monitoring thread health​

Recovery and mitigation​

Prevention tips​

Known issues and fixes​

Diagnosing thread pool exhaustion​

Recommended thread settings​

When to increase max_thread_pool_size​

Thread pool sizing guidelines​

Temporary workarounds​

For ClickHouse Cloud users​

Most common causes

Common solutions

Common scenarios

Prevention tips

Debugging steps

Special considerations

Thread-related settings

System limit configuration

Error message variations

Monitoring thread health

Recovery and mitigation

Prevention tips

Known issues and fixes

Diagnosing thread pool exhaustion

Recommended thread settings

When to `increase max_thread_pool_size`

Thread pool sizing guidelines

Temporary workarounds

For ClickHouse Cloud users