[__letscrypt_cert] Hooks are wrongly implemented #13

Closed
opened 2021-11-20 11:24:47 +00:00 by ungleich-gitea · 8 comments

Tangentially related to: !974 and !873

Fixing this would also fix ungleich-public/cdist-contrib#2

Description of the issues

I detected a renewal hook did not run, even though the cert itself did get renewed, upon inspection I see:

# cat /etc/cron.d/certbot  | grep -v '^#'
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew

# crontab -l | grep certb
47 0 * * * /usr/bin/certbot renew -q  --renew-hook "service nginx reload" # __cron/letsencrypt-certbot

Race condition due to two renewal jobs

So there is a bit of a race condition: on the /etc/cron.d/certbot entry, there is a random sleep of anything between 0 to 12h, if that happens to be less than 47 minutes (about 6.53% chances), that job will renew the certificate but not apply the hook as expected.

Misleading expectation of per-certificate hook

__letsencrypt_cert example.com \
    --admin-email root@example.com \
    --automatic-renewal \
    --renew-hook "service nginx reload" \
    --webroot /data/letsencrypt/root

__letsencrypt_cert smtp.example.com \
    --admin-email root@example.com \
    --automatic-renewal \
    --renew-hook "service postfix reload" \
    --webroot /data/letsencrypt/root

I would expect this to:

  1. Work
  2. Apply the different hooks to each certificate

Instead (not tested) this execution would fail due to discordant arguments for the __cron/letsencrypt-certbot object.

Conclusion

!974 and !873 are not sufficient to fix these issues, we should instead rely on per-realm/certificate configuration, using following.

I intend to implement this with a clear migration path for current installations rendering those MRs redundant by:

Using:
https://certbot.eff.org/docs/using.html?highlight=hook#pre-and-post-validation-hooks

  1. Deprecating --renew-hook, having it work as an alias for --deploy-hook (see below).
  2. Adding optional arguments: --deploy-hook, --pre-hook, --post-hook
  3. These would create a ${ETC_DIR}/letsencrypt/renewal-hooks/{deploy,pre,post}/${__object_id}.cdist.sh script with the proper permissions.
    And contents similar to:
#!/bin/sh -e

if echo "${CERTBOT_ALL_DOMAINS}" | tr ',' '\n' | grep "^${__object_id}$"; then
    ${HOOK_STRING_CONTENTS}
fi
Tangentially related to: !974 and !873 Fixing this would also fix ungleich-public/cdist-contrib#2 # Description of the issues I detected a renewal hook did not run, even though the cert itself did get renewed, upon inspection I see: ``` # cat /etc/cron.d/certbot | grep -v '^#' SHELL=/bin/sh PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin 0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew # crontab -l | grep certb 47 0 * * * /usr/bin/certbot renew -q --renew-hook "service nginx reload" # __cron/letsencrypt-certbot ``` ## Race condition due to two renewal jobs So there is a bit of a race condition: on the `/etc/cron.d/certbot` entry, there is a random sleep of anything between 0 to 12h, if that happens to be less than 47 minutes (about 6.53% chances), that job will renew the certificate but not apply the hook as expected. ## Misleading expectation of per-certificate hook ``` __letsencrypt_cert example.com \ --admin-email root@example.com \ --automatic-renewal \ --renew-hook "service nginx reload" \ --webroot /data/letsencrypt/root __letsencrypt_cert smtp.example.com \ --admin-email root@example.com \ --automatic-renewal \ --renew-hook "service postfix reload" \ --webroot /data/letsencrypt/root ``` I would expect this to: 1. Work 2. Apply the different hooks to each certificate Instead (not tested) this execution would fail due to discordant arguments for the `__cron/letsencrypt-certbot` object. # Conclusion !974 and !873 are not sufficient to fix these issues, we should instead rely on per-realm/certificate configuration, using following. I intend to implement this with a clear migration path for current installations rendering those MRs redundant by: Using: https://certbot.eff.org/docs/using.html?highlight=hook#pre-and-post-validation-hooks 1. Deprecating `--renew-hook`, having it work as an alias for `--deploy-hook` (see below). 2. Adding `optional` arguments: `--deploy-hook`, `--pre-hook`, `--post-hook` 3. These would create a `${ETC_DIR}/letsencrypt/renewal-hooks/{deploy,pre,post}/${__object_id}.cdist.sh` script with the proper permissions. And contents similar to: ``` #!/bin/sh -e if echo "${CERTBOT_ALL_DOMAINS}" | tr ',' '\n' | grep "^${__object_id}$"; then ${HOOK_STRING_CONTENTS} fi ```
evilham was assigned by ungleich-gitea 2021-11-20 11:24:47 +00:00
Author
Owner

mentioned in issue cdist-contrib#2

mentioned in issue cdist-contrib#2
Author
Owner

mentioned in commit b3a9c907ad

mentioned in commit b3a9c907ad3811676e75145e8678a55b8bbf41d3
Author
Owner

mentioned in merge request !977

mentioned in merge request !977
Author
Owner

mentioned in commit bc145bbc27

mentioned in commit bc145bbc27a45e382adb744e6774fb803f4fda0a
Author
Owner

I also propose deprecating --automatic-renewals, it makes no sense and is a maintenance nightmare. If an OS needs it (like Alpine and, from what I can tell Arch), we can enable autorenewal system-wide, which is what was happening now anyway.

I also propose deprecating `--automatic-renewals`, it makes no sense and is a maintenance nightmare. If an OS needs it (like Alpine and, from what I can tell Arch), we can enable autorenewal system-wide, which is what was happening now anyway.
Author
Owner

I don't have much experience with Alpine, can someone confirm that it does not have an in-distro mechanism for automatic renewals?

I don't have much experience with Alpine, can someone confirm that it does not have an in-distro mechanism for automatic renewals?
Author
Owner

Sounds like the proper solution to me. We can even deprecate --renew-hook and remove it in the next version change of cdist.

Sounds like the proper solution to me. We can even deprecate --renew-hook and remove it in the next version change of cdist.
Author
Owner

@nico / @ander / @steven at your scale you have probably hit this, does the proposal sound good? Should I work on it?

@nico / @ander / @steven at your scale you have probably hit this, does the proposal sound good? Should I work on it?
Sign in to join this conversation.
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: ungleich-public/cdist#13
No description provided.