sssd not using offline credentials even no network available

Bug #1928954 reported by Walter Kovacs
This bug affects 1 person
Affects: sssd (Ubuntu)
Status: In Progress
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I installed a new Ubuntu 20.04 with a new image from ubuntu.org

After the installation I joined our domain and logged in with a domain user's credentials. Then I logged out and disconnected the LAN cable. I could not log in with my offline credentials.

If you wait long enough (sometimes 15 minutes, sometimes hours), sssd will finally enter offline mode while not connected to the LAN.

The only workaround so far was to add our domain controller to /etc/hosts, but this is not a permanent solution.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: sssd 2.2.3-3ubuntu0.4
ProcVersionSignature: Ubuntu 5.4.0-73.82-generic 5.4.106
Uname: Linux 5.4.0-73-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.17
Architecture: amd64
CasperMD5CheckResult: skip
Date: Wed May 19 16:38:47 2021
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: sssd
UpgradeStatus: No upgrade log present (probably fresh install)

Walter Kovacs (w-kovacs) wrote :
Sergio Durigan Junior (sergiodj) wrote :

Hello Walter,

Thank you for reporting this bug, and apologies for the delay in getting back to it. Unfortunately the bug fell through the cracks and our team is somewhat busy with other stuff.

Anyway, I have finally had the time to try to reproduce this. I set up a VM with a Samba AD DC + Kerberos auth (server), and an LXD container acting as a client. Then, after creating a user/principal on the server, I was able to successfully login with it inside the client (as expected). With that in place, I brought the network connectivity down on the client and tried logging in again with the same user. Everything worked. I also tried doing some research online to see if I could find similar issues reported against sssd, but came up with nothing.

Given that I could not reproduce the issue, I would like to ask you for more information about your setup. If you can provide configuration files for SSSD and your AD DC, that would be great. If you can provide detailed reproduction steps, that would be even better.

For now, I am going to set this bug's status to Incomplete. When you provide the requested information, feel free to set it back to New.

Thank you in advance.

Changed in sssd (Ubuntu):
status: New → Incomplete
Walter Kovacs (w-kovacs) wrote :

Hello,

today I installed 7 laptops with Ubuntu 20.04 LTS directly from the Ubuntu repo. Afterwards I set up our domain, and all of these laptops have the above problem.

As far as I can tell, the sssd daemon does not enter the offline state even when the LAN cable is disconnected.

We have a "normal" Windows Server as domain controller. I am not sure which information will help you there.

Here is our sssd.conf:

[sssd]
services = nss, pam, ssh
config_file_version = 2
domains = REALM.LO
debug_level = 1

[domain/REALM.LO]
debug_level = 1
id_provider = ad
access_provider = ad
auth_provider = ad
krb5_store_password_if_offline = True
enumerate = False
ignore_group_members = True
use_fully_qualified_names = False

auto_private_groups = True
cache_credentials = True

ad_gpo_access_control = permissive

krb5_server = realm.lo
krb5_realm = realm.lo

# SSH Key Login
ldap_user_extra_attrs = altSecurityIdentities:altSecurityIdentities
ldap_user_ssh_public_key = altSecurityIdentities
ldap_use_tokengroups = True

# home directory
override_homedir = /home/%d/%u
default_shell = /bin/bash

-------------------------------------------------
and our krb5.conf

[libdefaults]
    default_realm = REALM.LO
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    dns_lookup_realm = true
    dns_lookup_kdc = true
    rdn = true

[realms]
    REALM.LO = {
        kdc = realm.lo
        admin_server = realm.lo
    }

[domain_realms]
    .realm.lo = REALM.LO

------------------------------------------------

I would gladly provide more information if needed.

Changed in sssd (Ubuntu):
status: Incomplete → New
Walter Kovacs (w-kovacs) wrote :

Today I installed another laptop. I logged in with one user and entered a few commands (mokutil). After that I logged out, disconnected the LAN cable and was able to log in.

Then I connected the LAN cable again, logged in with a different user, immediately logged out, and disconnected the LAN cable while doing so. With the new user I was not able to log in while disconnected.

Is there a delay between login and creating the offline credentials?

Sergio Durigan Junior (sergiodj) wrote :

Hi, sorry about the delay in replying; we're busy preparing the new release.

Thanks for providing more information about the bug. I compared the configuration you provided with the one I have installed in my test environment, and it seems like they're pretty much the same. At least I don't see anything that might be a problem in your configuration, and I do see the right setting that is necessary to make offline logins work (cache_credentials = True).
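The check described above (whether cache_credentials is enabled in the domain section) can be scripted. The following is a minimal sketch, not part of the original report: the function name `credentials_cached` and the `SAMPLE` text are illustrative, and an absent option is treated as disabled.

```python
import configparser


def credentials_cached(conf_text: str, domain: str) -> bool:
    """Return True if cache_credentials is enabled for the given SSSD domain."""
    # interpolation=None: values such as "/home/%d/%u" contain literal '%'.
    # delimiters=("=",): values may contain ':' (e.g. ldap_user_extra_attrs).
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(conf_text)
    section = f"domain/{domain}"
    if not parser.has_section(section):
        return False
    # Assumption of this sketch: a missing option counts as "not enabled".
    return parser.getboolean(section, "cache_credentials", fallback=False)


# Shortened excerpt of the sssd.conf posted earlier in this report.
SAMPLE = """\
[sssd]
services = nss, pam, ssh
domains = REALM.LO

[domain/REALM.LO]
id_provider = ad
cache_credentials = True
override_homedir = /home/%d/%u
"""

if __name__ == "__main__":
    print(credentials_cached(SAMPLE, "REALM.LO"))
```

On a real client the text would come from /etc/sssd/sssd.conf, which is normally readable only by root.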

I did another test here and created a new user ("samba-tool user create blabla password"), logged in with it, logged out, powered off the AD DC VM, and then tried to log in again. Although the login process takes a bit more time (i.e., a few more seconds) than what is normally expected (due to the DC being offline), it eventually succeeds and I can successfully login using my offline credentials.

I noticed that you have a debug level set to 1 in your sssd.conf file. Could you set it to 6 instead (you can also use the sss_debuglevel tool to do that) and then attach the log files that live inside /var/log/sssd/ to this bug, please? I'm interested in the files that end with ".log" (i.e., I'm not interested in the files named .1, .2.gz, etc.). Also, make sure to promptly try to login using your offline credentials after setting the debug level, because then we have a better chance at catching the problem.
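Picking out only the files that end in ".log", while skipping rotated ones such as .1 or .2.gz, can be done with a short helper. This is a sketch under the assumption that the logs live directly in /var/log/sssd as stated above; the function name `current_sssd_logs` is made up for illustration.

```python
from pathlib import Path
from typing import List


def current_sssd_logs(log_dir: str = "/var/log/sssd") -> List[Path]:
    """Return the live *.log files, skipping rotated ones (.1, .2.gz, ...)."""
    # glob("*.log") only matches names that END in ".log", so
    # "sssd.log.1" and "sssd_nss.log.2.gz" are excluded.
    return sorted(p for p in Path(log_dir).glob("*.log") if p.is_file())


if __name__ == "__main__":
    # Reading /var/log/sssd usually requires root.
    for path in current_sssd_logs():
        print(path)
```

The resulting list can then be attached to the bug one file at a time.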

Lastly, I'd like to ask if it's possible for you to create an Ubuntu Impish LXD and configure it as a client in your environment so that you can try to reproduce the problem there.

Thank you in advance.

Changed in sssd (Ubuntu):
status: New → Incomplete
Walter Kovacs (w-kovacs) wrote :

Hello,

today I was able to play around some more. I installed a new VM with Ubuntu 20.04. The domain join is done by our install script. I changed it to set log level to 6.

I logged in with a user, disconnected the LAN connection and logged out.

Then I was not able to log in for a few minutes. After doing nothing but waiting, login became possible. I guess it took a while for sssd to discover that the offline credentials needed to be used.

Launchpad Janitor (janitor) wrote :

[Expired for sssd (Ubuntu) because there has been no activity for 60 days.]

Changed in sssd (Ubuntu):
status: Incomplete → Expired
Walter Kovacs (w-kovacs) wrote :

This problem still exists. Why is it closed?

Changed in sssd (Ubuntu):
status: Expired → In Progress
Christian Ehrhardt (paelzer) wrote :

Hi Walter,
that is the default behavior of Launchpad: the bug sat in "Incomplete" for too long, so it was automatically expired.

The last few weeks were the Christmas shutdown period, and even before that things were busy.
I'm sure Sergio or someone else will take a deeper look at the new logs you provided in comment #6 once they are back. Setting it back to In Progress (or New) with a ping on the bug was just the right thing to do.

Sergio Durigan Junior (sergiodj) wrote :

Hi Walter,

Sorry that it took so long for me to get back to this bug.

Anyway, I would like to touch base with you and check if you are still experiencing this issue. I did look into the log files you attached and there doesn't seem to be anything problematic in them. Last I checked I was still unable to reproduce the problem, but I will give it another try with a fresh setup here.

Thanks.

Walter Kovacs (w-kovacs) wrote :

Hi,

thanks for your reply. Because we set up our laptops with a "bugfix" that adds the IP of our DC to /etc/hosts, I have not seen the problem lately.

Because this fix is not a real solution, I would like to get to the bottom of this. Therefore I will set up a new laptop without this "fix" and check whether the problem still exists.

Regards

Sergio Durigan Junior (sergiodj) wrote : Re: [Bug 1928954] Re: sssd not using offline credentials even no network available

On Friday, June 24 2022, Walter Kovacs wrote:

> Hi,
>
> thanks for your reply. Because we set up our laptops with a "bugfix"
> adding an IP to our DC to /etc/hosts I did not see the problem lately.
>
> Because this fix is not a real solution I would like to get to the
> bottom of this. Therefore I will set up a new laptop without this "fix"
> and check if this problem still exists

Thank you, Walter.

Looking forward to seeing your test results.

--
Sergio
GPG key ID: E92F D0B3 6B14 F1F4 D8E0 EB2F 106D A1C8 C3CB BF14

Walter Kovacs (w-kovacs) wrote :

Hello,

I installed a new Ubuntu 20.04 and then SSSD. I logged into the newly installed OS with a domain user and disconnected the LAN cable.

I had to wait a long time (it felt like around 10 minutes) before I could finally log in.

This is the same behavior as before, but the time seems to differ every time. I remember once waiting over 2 hours in the hope that I could finally log in.

I attached the logs in the hope that they will give you more insight.

Regards

wKovacs

Andreas Hasenack (ahasenack) wrote :

> thanks for your reply. Because we set up our laptops with a "bugfix" adding an IP to our DC to
> /etc/hosts I did not see the problem lately.

If you add the IP of the domain controller to /etc/hosts, then offline logins work when you unplug the network?

Walter Kovacs (w-kovacs) wrote :

Yes, this is the "fix" to avoid the problem. Especially with a lot of home-office users, we have to ensure that login works while they are not in the office (not connected to the domain).

But this fix is not a solution. We have 4 DCs that could be used to connect, but only one IP in /etc/hosts. If this DC fails, our Linux users are not able to work. Therefore I would like to get rid of this entry in /etc/hosts.
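Since the workaround involves /etc/hosts, one thing that could be measured is how long name resolution for a DC takes with and without the entry while the cable is unplugged. The following is only a diagnostic sketch: it uses the system resolver via Python's `socket.getaddrinfo`, which is not necessarily the code path SSSD itself takes, and the function name `time_resolution` is hypothetical.

```python
import socket
import time
from typing import Tuple


def time_resolution(host: str, port: int = 389) -> Tuple[bool, float]:
    """Resolve host with the system resolver; return (success, seconds taken)."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(host, port)
        ok = True
    except socket.gaierror:
        # Resolution failed (e.g. no DNS server reachable, unknown name).
        ok = False
    return ok, time.monotonic() - start
```

Calling `time_resolution("<dc-hostname>")` once with the network connected and once with it disconnected would show whether lookups fail quickly or hang for a long time before failing.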

Andreas Hasenack (ahasenack) wrote :

Are you testing with a desktop ubuntu install, graphical login, or a console (text) login?

Andreas Hasenack (ahasenack) wrote :

The logs show a mixture of sssd restarts, network restarts, and going offline and online, so it's difficult to pinpoint exactly when the login failed.

Right after you pull the network, I assume you have a local console still open. Have you checked what sssd thinks the online/offline status is? That can be done with the sssctl command. Here is what I do in my testing rig:

# sssctl domain-status internal.example.fake
Online status: Offline

Active servers:
AD Global Catalog: not connected
AD Domain Controller: win-kriet1e5elo.internal.example.fake

Discovered AD Global Catalog servers:
None so far.
Discovered AD Domain Controller servers:
- win-kriet1e5elo.internal.example.fake

If it still thinks it's online, when in fact it isn't, and logins are failing, see if by forcing the offline mode you can make the logins work:

kill -SIGUSR1 $(pidof sssd)
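The check-then-force sequence above could be scripted roughly as follows. This is a minimal sketch: it assumes that `sssctl domain-status` prints an "Online status:" line as in the sample output and that the online value is the literal string "Online". The function names are made up for illustration, and `force_offline_if_online` must run as root.

```python
import os
import signal
import subprocess
from typing import Optional


def parse_online_status(sssctl_output: str) -> Optional[str]:
    """Extract the value of the 'Online status:' line, or None if absent."""
    for line in sssctl_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Online status":
            return value.strip()
    return None


def force_offline_if_online(domain: str) -> bool:
    """Send SIGUSR1 to sssd if it still reports Online. Returns True if sent."""
    out = subprocess.run(
        ["sssctl", "domain-status", domain],
        capture_output=True, text=True, check=True,
    ).stdout
    if parse_online_status(out) != "Online":  # "Online" literal is an assumption
        return False
    # Same effect as: kill -SIGUSR1 $(pidof sssd)
    pids = subprocess.run(
        ["pidof", "sssd"], capture_output=True, text=True, check=True
    ).stdout.split()
    for pid in pids:
        os.kill(int(pid), signal.SIGUSR1)
    return True


# Beginning of the sample output shown above.
SAMPLE_OUTPUT = """\
Online status: Offline

Active servers:
AD Global Catalog: not connected
AD Domain Controller: win-kriet1e5elo.internal.example.fake
"""

if __name__ == "__main__":
    print(parse_online_status(SAMPLE_OUTPUT))
```

Only the parsing step runs when this is executed as a script; the function that sends the signal has to be called explicitly.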

You seem to have 4 DCs:
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc01.ibeo.as' is 'not working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc03.ibeo.as' is 'not working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc04.ibeo.as' is 'working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_port_status] (0x1000): Port status of port 389 for server 'dc04.ibeo.as' is 'not working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc02.ibeo.as' is 'working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_port_status] (0x1000): Port status of port 389 for server 'dc02.ibeo.as' is 'not working'
(Mon Jun 27 13:25:23 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc02.ibeo.as' is 'working'

I see a mix of them either "working" or "not working", or even "neutral". When you adjust /etc/hosts, do you pick one of them, or all 4?
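To get an overview of which state each DC ended up in, the get_server_status and get_port_status lines can be reduced to the last reported status per server. The sketch below assumes the log line format shown in the excerpt above; the regular expression and the name `latest_status` are illustrative and may need adjusting for other SSSD versions.

```python
import re
from typing import Dict

# Matches both "Status of server 'X' is 'Y'" and
# "Port status of port N for server 'X' is 'Y'" lines.
STATUS_RE = re.compile(
    r"\[get_(?:server|port)_status\].*?"
    r"(?:Status of server|Port status of port (?P<port>\d+) for server) "
    r"'(?P<server>[^']+)' is '(?P<status>[^']+)'"
)


def latest_status(log_text: str) -> Dict[str, Dict[str, str]]:
    """Map each server to its most recently logged server/port status."""
    result: Dict[str, Dict[str, str]] = {}
    for line in log_text.splitlines():
        m = STATUS_RE.search(line)
        if not m:
            continue
        key = f"port {m.group('port')}" if m.group("port") else "server"
        # Later lines overwrite earlier ones, so the last status wins.
        result.setdefault(m.group("server"), {})[key] = m.group("status")
    return result


# Lines copied from the log excerpt quoted above.
SAMPLE_LOG = """\
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc01.ibeo.as' is 'not working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc04.ibeo.as' is 'working'
(Mon Jun 27 13:25:05 2022) [be[IBEO.AS]] [get_port_status] (0x1000): Port status of port 389 for server 'dc04.ibeo.as' is 'not working'
(Mon Jun 27 13:25:23 2022) [be[IBEO.AS]] [get_server_status] (0x1000): Status of server 'dc02.ibeo.as' is 'working'
"""

if __name__ == "__main__":
    for server, states in sorted(latest_status(SAMPLE_LOG).items()):
        print(server, states)
```

Running this over the full domain log right after unplugging the network would show whether any DC is still marked as working when the login fails.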

Finally, the other difference between your setup and our testing rig (besides your 4 controllers versus our 1) is probably group policies, although I see you have it set to "permissive", so it shouldn't block logins. But we don't have any GPO set here, I believe (just the defaults from a 2016 AD DC server).
