conn-pool: fix use-after-free crash during connectivity grid teardown (#45324)
### Description This PR resolves a use-after-free (UAF) / pure virtual method call crash that occurs during `ConnectivityGrid` teardown when connection pools (HTTP/3, HTTP/2, or HTTP/1) are destroyed while active connection attempts are still ongoing in the background. --- ### 💥 The Bug & Crash Callstack When the `ConnectivityGrid` is deleted, its destructor destroys its underlying connection pools in reverse order. If a pool has active connection attempts, deleting the pool synchronously cancels those attempts. In C++, once a derived class destructor finishes, the object's vtable pointer transitions to point to the base class vtable. Because `destructAllConnections()` is invoked in `~HttpConnPoolImplBase()` (the base class destructor), the connection cancellation callback wrappers execute `onConnectionAttemptFailed()` after the derived pool object has already been destructed. When `onConnectionAttemptFailed()` attempts to write a log trace calling `describePool(attempt->pool())` (which executes `attempt->pool().protocolDescription()`), virtual dispatch invokes a **pure virtual function call** on the base class `HttpConnPoolImplBase`, crashing Envoy instantly: ```log Program received signal SIGSEGV, Segmentation fault. 0x000056453663656e in Envoy::Http::ConnectivityGrid::WrapperCallbacks::onConnectionAttemptFailed() Backtrace: #0 0x000056453663656e in Envoy::Http::ConnectivityGrid::WrapperCallbacks::onConnectionAttemptFailed() #1 0x000056453663606d in Envoy::Http::ConnectivityGrid::WrapperCallbacks::ConnectionAttemptCallbacks::onPoolFailure() #2 0x000056453664e7f1 in Envoy::Http::HttpConnPoolImplBase::onPoolFailure() #3 0x00005645366cb447 in Envoy::ConnectionPool::ConnPoolImplBase::purgePendingStreams() #4 0x00005645366c9d6e in Envoy::ConnectionPool::ConnPoolImplBase::onConnectionEvent() #5 0x000056453666394d in Envoy::Tcp::ActiveTcpClient::onEvent() #6 0x000056453709b0e9 in Envoy::Network::MockConnectionBase::raiseEvent() ... #24 0x00005645366be287 in Envoy::ConnectionPool::ConnPoolImplBase::destructAllConnections() #25 0x00005645366766b8 in Envoy::Http::HttpConnPoolImplBase::~HttpConnPoolImplBase() #26 0x000056453664e70f in Envoy::Http::HttpConnPoolImplMixed::~HttpConnPoolImplMixed() #27 0x000056453664e739 in Envoy::Http::HttpConnPoolImplMixed::~HttpConnPoolImplMixed() #28 0x000056453647099c in std::__1::default_delete<>::operator()() #29 0x000056453647091c in std::__1::unique_ptr<>::reset() #30 0x000056453663a3ad in Envoy::Http::ConnectivityGrid::~ConnectivityGrid() #31 0x000056453663a4f9 in Envoy::Http::ConnectivityGrid::~ConnectivityGrid() #32 0x00005645365de65c in std::__1::default_delete<>::operator()() #33 0x000056453656b55c in std::__1::unique_ptr<>::reset() #34 0x00005645363174fa in ConnectivityGridTest_DestroyGridWithActiveH2Attempts_Test::TestBody() ``` --- ### 🛠️ The Fix Introduced a reloadable feature flag guard `envoy.reloadable_features.conn_pool_grid_early_return_on_teardown` (enabled by default) to shield both H3 and H2/H1 pools during destruction: * **Early Return**: Inside `WrapperCallbacks::onConnectionAttemptFailed()`, we check if `delete_started_` is `true`. If the flag is enabled and the wrapper is already being deleted/torn down, we return immediately, completely skipping the unsafe virtual method lookups on the half-destructed pool object. * **ALPN & Protocol Safety**: This safely protects the system regardless of which pool is being torn down (H3 or H2/H1 mixed pools). --- ### 🧪 Testing & Verification * Added the `DestroyGridWithActiveH2Attempts` unit test to verify that tearing down the `ConnectivityGrid` while an active H2 connection attempt is running does crash without the fix and not with the fix. Risk Level: low, only exposed with log level trace Testing: new unit test Docs Changes: N/A Release Notes: Y Platform Specific Features: N/A Runtime guard: envoy.reloadable_features.conn_pool_grid_early_return_on_teardown Signed-off-by: Dan Zhang <danzh@google.com> Co-authored-by: Dan Zhang <danzh@google.com>
D
danzh committed
2729a01ed090585f6007259986abfa0d260fcb63
Parent: 86dd82f
Committed by GitHub <noreply@github.com>
on 6/1/2026, 5:55:16 PM