Idle tcp connections

Bug #1947552 reported by Johan Charpentier
This bug affects 1 person
Affects: unbound (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: none listed

Bug Description

Description: Ubuntu 20.04.3 LTS
Release: 20.04
unbound/focal,focal-updates,focal-security,now 1.9.4-2ubuntu1.2 amd64 [installed]

Hello,

We have an issue with the unbound package.
Under some usage patterns, a number of idle TCP sessions accumulate and block any new TCP session to this server.

1. One of our users initiates a very large number of TCP sessions and stops 30 minutes later.
2. Those sessions max out `thread0.tcpusage`, up to the limit set by our `incoming-num-tcp:` setting.
3. No more TCP connections are possible. UDP is still responding.
4. We log the established TCP sessions with `lsof -i :53`.
5. *12 hours later* the *same* TCP sessions are still ESTABLISHED (same client ports, same host).
6. A tcpdump on this interface shows no TCP packets at all for more than 15 minutes, yet `net.ipv4.tcp_keepalive_time = 7200` or `tcp-idle-timeout` (currently 30 sec) should have kicked in.
7. TCP connections are still not possible.
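For anyone wanting to check the same symptoms on their own server, here is a minimal Python sketch of steps 2, 4 and 5. It assumes the statistics text comes from something like `unbound-control stats_noreset` and contains `threadN.tcpusage=<count>` lines, and that the two snapshots are saved outputs of `lsof -i :53`. The function names are hypothetical and the parsing is deliberately loose, so treat it as a starting point rather than a finished tool.

```python
import re


def tcp_usage(stats_text):
    """Return {thread_name: tcpusage} from 'threadN.tcpusage=<count>' lines."""
    usage = {}
    for line in stats_text.splitlines():
        match = re.fullmatch(r"(thread\d+)\.tcpusage=(\d+)", line.strip())
        if match:
            usage[match.group(1)] = int(match.group(2))
    return usage


def saturated_threads(stats_text, incoming_num_tcp):
    """Threads whose tcpusage has reached the configured incoming-num-tcp."""
    usage = tcp_usage(stats_text)
    return sorted(name for name, count in usage.items()
                  if count >= incoming_num_tcp)


def established_sessions(lsof_text):
    """Set of 'local->remote' names on lines that end with (ESTABLISHED)."""
    sessions = set()
    for line in lsof_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[-1] == "(ESTABLISHED)":
            sessions.add(fields[-2])
    return sessions


def stuck_sessions(earlier_lsof_text, later_lsof_text):
    """Sessions present, with the same endpoints, in both snapshots."""
    return (established_sessions(earlier_lsof_text)
            & established_sessions(later_lsof_text))
```

Calling `saturated_threads(stats, N)` with `N` set to your own `incoming-num-tcp` value shows whether a thread has run out of TCP slots, and `stuck_sessions(snapshot_a, snapshot_b)` on two snapshots taken hours apart lists the sessions that never went away.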

A restart of this service resolves the bug.

This bug is hard to reproduce: we have not found a client or usage pattern that triggers it every time, and we have no further information about our client's actual infrastructure, clients, or libraries. But under the right conditions, it can happen a lot.
We also think we are not the only ones experiencing this kind of bug:

It appears to be the same as two bugs reported on the unbound mailing list:
 * https://lists.nlnetlabs.nl/pipermail/unbound-users/2019-August/006361.html
 * https://lists.nlnetlabs.nl/pipermail/unbound-users/2019-October/006487.html

And this seems to be fixed by this MR in the next version of unbound.

From the Unbound 1.9.6 changelog ( https://www.nlnetlabs.nl/projects/unbound/download/#unbound-1-9-6 ):

```
Merge pull request #122 from he32: In tcp_callback_writer(), don't disable time-out when changing to read.
```

Referring to: https://github.com/NLnetLabs/unbound/pull/122
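To make the changelog sentence concrete, here is a toy Python model of a connection whose idle timer is either dropped or re-armed when it switches from writing a response back to reading. This is only an illustration of the wording "don't disable time-out when changing to read". It is not unbound's code, and the class and method names are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConnection:
    idle_timeout: float               # seconds, stands in for tcp-idle-timeout
    keep_timeout_on_read: bool        # True models the behaviour after the fix
    deadline: Optional[float] = None  # None means no timer is armed

    def start_reading(self, now: float) -> None:
        """A new connection waits for a query with the idle timer armed."""
        self.deadline = now + self.idle_timeout

    def finish_write(self, now: float) -> None:
        """The response has been written; switch back to reading."""
        if self.keep_timeout_on_read:
            self.deadline = now + self.idle_timeout
        else:
            # Timer disabled: an idle client is never reaped.
            self.deadline = None

    def is_expired(self, now: float) -> bool:
        """True once an armed timer has run out."""
        return self.deadline is not None and now >= self.deadline
```

In this model, a connection created with `keep_timeout_on_read=False` never expires once `finish_write()` has run, however long the client stays silent, which matches the sessions that were still ESTABLISHED 12 hours later. With `keep_timeout_on_read=True` it expires `idle_timeout` seconds after the last write.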

This MR/fix is quite simple, so I'm asking whether it can be cherry-picked into this version of unbound to avoid a potential DoS on this service and fix the issue.

Thanks in advance :)

Lucas Kanashiro (lucaskanashiro) wrote :

Thanks for taking the time to file this bug and trying to make Ubuntu better.

Good investigation work. It seems your analysis is correct, and the fix was acknowledged by upstream. However, to incorporate this in Focal we would need a set of steps to reproduce the bug and show the SRU team that we will be able to validate whether the bug is present or fixed in a given unbound package.

Did you try the newer unbound version to make sure it fixes your problem? Or at least apply the one-liner patch and see if it does what we expect? If not, I believe this is a good next step to make us confident that the patch is sufficient to fix the issue.
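As a possible starting point for such reproduction steps, here is a Python sketch of a probe client. It sends one DNS query over TCP, reads the answer, then stays idle and measures how long the server takes to close the connection. The report does not establish that this single-connection pattern triggers the problem, so that part is an assumption. The function names, the `example.com` query name and the default limits are all hypothetical.

```python
import socket
import struct
import time


def build_tcp_query(name, qtype=1, query_id=0x1234):
    """Build a DNS query with the two-byte length prefix used over TCP."""
    # Header: id, flags (RD set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in labels) + b"\x00"
    message = header + qname + struct.pack("!HH", qtype, 1)  # class IN
    return struct.pack("!H", len(message)) + message


def seconds_until_peer_closes(sock, limit):
    """Wait up to `limit` seconds; return elapsed time at close, else None."""
    start = time.monotonic()
    try:
        while True:
            remaining = limit - (time.monotonic() - start)
            if remaining <= 0:
                return None
            sock.settimeout(remaining)
            if sock.recv(4096) == b"":   # orderly close by the peer
                return time.monotonic() - start
    except socket.timeout:
        return None                      # still open when the limit ran out
    except ConnectionResetError:
        return time.monotonic() - start  # peer reset the connection


def probe(server, port=53, name="example.com", limit=120.0):
    """Send one query, read some of the answer, then idle until closed."""
    with socket.create_connection((server, port), timeout=5) as sock:
        sock.sendall(build_tcp_query(name))
        sock.recv(4096)
        return seconds_until_peer_closes(sock, limit)
```

If `probe()` returns `None` with a `limit` well above the configured `tcp-idle-timeout`, the server left the idle connection open. Running it against both the current package and a patched build would give a before/after comparison.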

I am setting the status of this bug to Incomplete, and once you have more information please set it back to New and we will revisit the bug.

Changed in unbound (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for unbound (Ubuntu) because there has been no activity for 60 days.]

Changed in unbound (Ubuntu):
status: Incomplete → Expired