Error 209: SOCKET_TIMEOUT
This error occurs when a network socket operation (reading or writing) exceeds the configured timeout period. It indicates that data could not be sent to or received from a network connection within the allowed time, typically due to network issues, slow client response, or overloaded connections.
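To confirm that the error is actually being raised on the server, and how often, you can query the system.errors table. A minimal sketch; the table and columns are standard ClickHouse system objects:

```sql
-- How many times SOCKET_TIMEOUT has been raised since server start,
-- and the most recent error message
SELECT name, code, value, last_error_time, last_error_message
FROM system.errors
WHERE name = 'SOCKET_TIMEOUT';
```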
Most common causes
- Network connectivity issues
  - Network latency or packet loss
  - Unstable network connection
  - Firewall or network security appliance delays
  - Network congestion between client and server
- Client not reading data fast enough
  - Client application blocked or frozen
  - Client receive buffer full
  - Client processing slower than the server is sending
  - Client-side timeout shorter than the data transfer time
- Large result sets
  - Query returning a huge amount of data
  - Client unable to consume data at the required rate
  - Network bandwidth insufficient for the data volume
  - No LIMIT clause on queries returning millions of rows
- Slow or overloaded client
  - Client CPU or memory exhausted
  - Client garbage collection pauses
  - Client application not responsive
  - Too many concurrent connections on the client
- TCP window exhaustion
  - Client TCP receive window fills up (window size = 1)
  - Client not acknowledging packets fast enough
  - TCP backpressure from a slow consumer
- Load balancer or proxy timeout
  - Intermediate proxy timing out the connection
  - Load balancer idle timeout
  - Service mesh timeout configuration
Common solutions
1. Increase timeout settings (send_timeout and receive_timeout; see the sketch after this list)
2. Reduce the result set size with LIMIT or tighter filters
3. Use compression for client connections to lower bandwidth requirements
4. Optimize client data consumption so results are read as they arrive
5. Check network connectivity between client and server
6. Configure TCP keep-alive on the client side (on Linux, via the net.ipv4.tcp_keepalive_* sysctl parameters)
7. Increase client buffer sizes
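A sketch of what solutions 1-3 can look like at the SQL level. The settings shown (send_timeout, receive_timeout, network_compression_method) are standard ClickHouse settings; the events table, its columns, and the chosen values are placeholders:

```sql
-- 1. Increase socket timeouts for the current session (seconds)
SET send_timeout = 600;
SET receive_timeout = 600;

-- 2. Reduce the result set size instead of streaming everything
SELECT event_time, user_id, message      -- placeholder columns
FROM events                              -- placeholder table
ORDER BY event_time DESC
LIMIT 10000;

-- 3. Compress data sent over the native protocol
-- (most drivers also need compression enabled on the client side)
SET network_compression_method = 'ZSTD';
```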
Common scenarios
Scenario 1: Timeout writing to socket
Cause: Client not consuming data fast enough, TCP window exhausted.
Solution:
- Add LIMIT to reduce the result size
- Enable compression
- Increase the send_timeout setting
- Optimize the client to consume data faster
- Check that the client isn't blocked or frozen
Scenario 2: Distributed query socket timeout
Cause: Remote shard not responding or network issue between nodes.
Solution:
- Verify network connectivity between the nodes in the cluster
- Check that the remote shard is healthy and not overloaded
- Increase send_timeout and receive_timeout for inter-node communication
- Review error counts per replica in system.clusters (see the sketch below)
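For the distributed case, the server's own system tables can show whether a shard is unreachable or accumulating errors. A sketch; 'my_cluster' is a placeholder cluster name:

```sql
-- Per-replica error and slowdown counters
SELECT cluster, shard_num, replica_num, host_name, port, errors_count, slowdowns_count
FROM system.clusters
WHERE cluster = 'my_cluster';

-- Run a trivial query on every replica to verify inter-node connectivity
SELECT hostName() AS node, version() AS server_version
FROM clusterAllReplicas('my_cluster', system.one);
```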
Scenario 3: Client receive window = 1
Cause: Client application stopped reading from socket.
Solution:
- Check client application health
- Ensure client is actively consuming results
- Verify client has sufficient resources (CPU, memory)
- Add rate limiting on server side
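If the client genuinely cannot keep up, one option is to throttle how fast the server pushes data rather than letting the socket time out. A sketch using the max_network_bandwidth settings (bytes per second); the values are examples only:

```sql
-- Limit data sent over the network for queries in this session to ~10 MB/s
SET max_network_bandwidth = 10000000;

-- Or cap the total bandwidth used by a single user's queries
SET max_network_bandwidth_for_user = 50000000;
```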
Scenario 4: Network problems
Cause: Network connectivity issues, packet loss, or routing problems.
Solution:
- Diagnose the network with ping, traceroute, or mtr
- Check firewall rules and network ACLs
- Verify network bandwidth is sufficient
- Check for network security appliances causing delays
Scenario 5: External network problems
Cause: Issues with internet connectivity or cloud provider network.
Solution:
- Check cloud provider status page
- Verify VPC/network configuration
- Test connectivity from multiple locations
- Contact network or cloud support
Prevention tips
- Set appropriate timeouts: Match client and server timeout settings
- Use LIMIT clauses: Prevent queries from returning too much data
- Enable compression: Reduce network bandwidth requirements
- Monitor network health: Track latency and packet loss
- Optimize queries: Return only needed data
- Stream results: Process data as it arrives instead of buffering the entire result set
- Configure TCP properly: Set appropriate keep-alive and buffer sizes
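One way to enforce the "use LIMIT clauses" tip automatically is to cap result sizes with settings instead of relying on every query carrying an explicit LIMIT. A sketch using max_result_rows and result_overflow_mode; the threshold is an example:

```sql
-- Stop any query whose result would exceed one million rows
SET max_result_rows = 1000000;
SET result_overflow_mode = 'break';  -- 'break' returns a partial result; 'throw' raises an error instead
```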
Debugging steps
1. Check recent socket timeout errors (see the sketch after this list)
2. Check current timeout settings
3. Monitor active connections
4. Check network statistics (for example with netstat or ss)
5. Capture network traffic if needed (for example with tcpdump)
6. Check the query result size
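Steps 1, 2, 3, and 6 can be performed from ClickHouse itself; steps 4 and 5 use operating-system tools. A sketch, where my_database.my_table is a placeholder for the table behind the problematic query:

```sql
-- 1. Recent queries that failed with SOCKET_TIMEOUT (error code 209)
SELECT event_time, query_id, exception
FROM system.query_log
WHERE exception_code = 209 AND event_date >= today() - 1
ORDER BY event_time DESC
LIMIT 10;

-- 2. Current timeout-related settings for this session
SELECT name, value, changed
FROM system.settings
WHERE name LIKE '%timeout%';

-- 3. Active queries and how much data they have read so far
SELECT query_id, user, elapsed, read_rows,
       formatReadableSize(read_bytes) AS read,
       formatReadableSize(memory_usage) AS memory
FROM system.processes;

-- 6. Approximate size of the data a full-table query would have to ship
SELECT sum(rows) AS rows, formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE database = 'my_database' AND table = 'my_table' AND active;
```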
Special considerations
For HTTP interface:
- HTTP connections can be affected by load balancer timeouts
- Check the http_send_timeout and http_receive_timeout settings
- Load balancers may have their own timeout configurations
For distributed queries:
- Timeout can occur when sending results between nodes
- Each hop adds latency
- Use send_timeout and receive_timeout for inter-node communication
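The timeouts can also be raised only for the distributed query itself rather than globally; a sketch, where distributed_table is a placeholder Distributed table:

```sql
SELECT count()
FROM distributed_table
SETTINGS send_timeout = 600, receive_timeout = 600;
```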
For large result sets:
- Consider using LIMIT and pagination
- SELECT only the needed columns instead of SELECT *
- Apply filters to reduce data volume
- Consider materialized views for aggregations
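A sketch of the pagination and column-pruning points above; events is a placeholder table and the filter is illustrative:

```sql
-- Select only the needed columns, filter, and fetch one page at a time
SELECT event_time, user_id, message      -- not SELECT *
FROM events
WHERE event_date = today()
ORDER BY event_time
LIMIT 10000 OFFSET 0;   -- next page: OFFSET 10000, then 20000, ...
-- For deep pages, filtering on the last key seen is cheaper than a large OFFSET
```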
TCP window size = 1:
- This is a strong indicator that the client stopped reading
- The server has data to send, but the client buffer is full
- Usually client-side issue, not ClickHouse issue
Timeout-related settings
Query-level settings:
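The main timeout-related settings can be changed per session or per query; a sketch (values are examples, in seconds):

```sql
-- Session level
SET connect_timeout = 10;         -- establishing a connection
SET send_timeout = 600;           -- writing data to a socket
SET receive_timeout = 600;        -- reading data from a socket
SET http_send_timeout = 600;      -- HTTP interface send timeout
SET http_receive_timeout = 600;   -- HTTP interface receive timeout

-- Per query, using the SETTINGS clause (my_table is a placeholder)
SELECT *
FROM my_table
SETTINGS send_timeout = 600, receive_timeout = 600;
```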
Client-side configuration
- Python (clickhouse-connect): increase the client's send/receive timeout when it is created (for example, via the send_receive_timeout argument of clickhouse_connect.get_client)
- JDBC: raise the driver's socket timeout (for example, the socket_timeout connection property, in milliseconds)
- HTTP: increase the read timeout of the HTTP client or proxy, and align it with the server's http_send_timeout and http_receive_timeout settings
Distinguishing from TIMEOUT_EXCEEDED (159)
- SOCKET_TIMEOUT (209): Network-level timeout during data transfer
- TIMEOUT_EXCEEDED (159): Query execution time limit exceeded
SOCKET_TIMEOUT is about network I/O, while TIMEOUT_EXCEEDED is about query execution time.
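The two errors are also controlled by different settings, which is the easiest way to keep them apart; a sketch:

```sql
-- TIMEOUT_EXCEEDED (159): total query execution time
SET max_execution_time = 60;

-- SOCKET_TIMEOUT (209): how long a single network read or write may block
SET send_timeout = 300;
SET receive_timeout = 300;
```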
If you're experiencing this error:
- Check if client is actively consuming results
- Verify network connectivity and latency
- Add LIMIT to queries returning large results
- Enable compression to reduce bandwidth usage
- Increase send_timeout and receive_timeout if appropriate
- Monitor client application health and resource usage
- Check for TCP window size dropping to 1 (indicates client not reading)
- Verify no intermediate proxies or load balancers timing out
- Test with simpler/smaller queries to isolate the issue
Related documentation: