cache responding server in case of primary one is unavailable #6

vchlum · 2024-08-07T07:29:33Z

Problem Description:

When the primary server is unavailable, the client still tries to connect to it first on every request. We could save time by treating the responding server (e.g. the secondary one) as the primary server for a short period.

Change Description:

This change uses a cache file to store the responding server. The server is stored only if it is not the primary one. The server list is then reordered so that the responding server comes first. The original order is restored after a configurable cache timeout, which defaults to 30 seconds. The timeout can be changed in /etc/krb5.conf in the [appdefaults] section via the variable krb525_cache_timeout; the value is in seconds.
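
The reordering step described above could look roughly like the following. This is a minimal sketch, not the code from the patch: the function name prefer_cached_server and its parameters are hypothetical, and it assumes the cached server name and its timestamp have already been read from the cache file.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not taken from the patch: moves `cached` to the
 * front of `servers` when the cache entry is younger than `timeout`
 * seconds. Returns 1 if the list was reordered, 0 otherwise. */
static int prefer_cached_server(const char **servers, size_t n,
                                const char *cached, time_t cached_at,
                                time_t now, long timeout)
{
    size_t i;

    if (cached == NULL || now < cached_at || now - cached_at >= timeout)
        return 0;   /* no entry, clock went backwards, or entry expired */

    for (i = 0; i < n; i++) {
        if (strcmp(servers[i], cached) == 0) {
            const char *hit = servers[i];
            /* Shift the preceding entries down by one slot so their
             * relative order is preserved. */
            memmove(&servers[1], &servers[0], i * sizeof(servers[0]));
            servers[0] = hit;
            return i != 0;
        }
    }
    return 0;       /* cached name is not in the configured list */
}
```

With the two servers used in the tests below, a fresh entry for kdc2.anonym turns the order kdc1.anonym, kdc2.anonym into kdc2.anonym, kdc1.anonym. Once the timeout has passed, the function leaves the configured order untouched.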

Testing:

Testing that the cache works:

(BOOKWORM)root@torque1:~# cat /etc/krb5.conf | grep krb525_server
        krb525_server = kdc1.anonym
        krb525_server = kdc2.anonym
(BOOKWORM)root@torque1:~# iptables -A INPUT --src kdc1.anonym -p tcp --sport 6565 -j DROP
(BOOKWORM)root@torque1:~# time /usr/bin/krb525_renew vchlum@META && time /usr/bin/krb525_renew vchlum@META
Type: Kerberos
Valid until: 1723100309
doICPDC<anonymized>qAjAA

real	0m5.532s
user	0m0.000s
sys	0m0.010s
Type: Kerberos
Valid until: 1723100314
doICPDC<anonymized>qAjAA

real	0m0.295s
user	0m0.007s
sys	0m0.000s

Testing that the cache is deleted and the original server order is restored after the cache timeout:

(BOOKWORM)root@torque1:~# cat /etc/krb5.conf | grep krb525
        krb525_server = kdc1.anonym
        krb525_server = kdc2.anonym
    krb525_cache_timeout = 10
(BOOKWORM)root@torque1:~# time /usr/bin/krb525_renew vchlum@META && sleep 5 && time /usr/bin/krb525_renew vchlum@META && sleep 5  && time /usr/bin/krb525_renew vchlum@META
Type: Kerberos
Valid until: 1723100766
doICPDC<anonymized>qAjAA

real	0m5.500s
user	0m0.005s
sys	0m0.005s
Type: Kerberos
Valid until: 1723100776
doICPDC<anonymized>qAjAA

real	0m0.293s
user	0m0.004s
sys	0m0.004s
Type: Kerberos
Valid until: 1723100782
doICPDC<anonymized>qAjAA

real	0m5.576s
user	0m0.005s
sys	0m0.005s
(BOOKWORM)root@torque1:~#

kouril · 2024-11-19T15:16:07Z

Thanks for the contribution. A couple of observations are below; I'm not sure to what extent they pose a problem in our deployment. Anyway, here we go:

A static cache filename is used (/tmp/krb525_endpoint.cache). It would break the caching for processes running under different UIDs.
The access() and remove()/fopen() pairs open a race condition in which the process may delete or open a different file than the one it checked (it should be harmless but could have some operational implications, especially the fopen() part and the related check).
Access to the file isn't synchronized or locked. I'm not sure how much we can rely on the data written to the cache being small (i.e. whether we can assume the write will be almost atomic).
The cache file content can be crafted by another (potentially malicious) process or user and then loaded and used by the library. I don't see an immediate bad effect or exploit, but it's potentially dangerous.

Have you considered keeping a separate cache per UID (stored in something like /var/run/user/UID)?

cache responding server in case of primary one is unavailable

c26461f
