[PROPOSAL] Allow wildcard dependency #15

Open
opened 2021-11-20 11:24:49 +00:00 by ungleich-gitea · 6 comments

Something just occurred to me, and I think it would be a great addition. It could also (partially) solve #843:

Long story short, I wrote a couple of types to help me set up my hosting infrastructure and a handful of vhosts on Apache. As a result, I can either reload/restart the apache2 service quite often, or I need one long statement to restart at the end:

require="__module/x __module/y __vhost/foo __vhost/bar" \
    __systemctl_service apache2 --action restart --if-needed

Obviously, in practice it's much, much longer, which makes it rather clumsy, ugly, and error-prone.

So, I am proposing to extend the dependency resolver to allow wildcards, like this:

require="__module/* __vhost/*" __systemctl_service apache2 --action reload --if-required
Author
Owner

Supporting regular expressions or globs in __require has been suggested multiple times already (last I remember was on 2020-11-06 on Matrix), but adding that would make the already complex dependency resolver even more complex.

Matrix log (reformatted):

ungleich-bridge  11/6 1:03 AM
  <woky> it'd be nice if require="" could match multiple objects with regex :3
  <woky> (of course I know that MRs are welcome :))

Manis  11/6 9:09 AM
  woky: I would prefer shell globs for `require` matching. They fit the
  shell-"style" a bit better and have less characters with special meaning to
  worry about.

  I have also proposed this some time ago, but it gets a bit hairy once you go
  into the details.  Biggest question is: when should the `require`s be
  processed?  Depending on when you do that, you might get a different result.

nico  11/6 9:35 AM
  Manis: supporting globs in `require`: that one is nicely tricky - if you want
  to, start a docs/dev/logs entry for this. I think even handling the affected
  objects "at the end" might be tricky, as there might be multiple of them and
  they might create objects that the other "last ones" depend on - but it's
  certainly an interesting line of thought

Manis  11/6 9:45 AM
  nico: AFAIK `require` only applies to code execution, so the creation of new
  objects should work (fingers crossed). But I agree that there are details that
  need to be carefully thought through before such a change is "attacked".

Steven Armstrong  11/8 11:10 AM
  > <@nico:ungleich.ch> Manis: supporting globs in `require`: that one is nicely
  tricky - if you want to, start a docs/dev/logs entry for this. I think even
  handling the affected objects "at the end" might be tricky, as there might be
  multiple of them and they might create objects that the other "last ones"
  depend on - but it's certainly an interesting line of thought

  @nico don't you remember the days ;-) My initial implementation used shell
  globs. I used the [same
  code](https://github.com/asteven/cdist-ng/blob/master/cdist/manager.py#L83) in
  my cdist-ng experiment. Using globs instead of a exact match is trivial.
  What's non trivial is that this changes the meaning. It is suddenly not
  deterministic anymore what your require="" means. What require is evaluated to
  suddenly is dependent on which objects have already been created. So it may
  actually change during a execution. In other words, an object could be applied
  because it's requirements are fulfilled. Then the next object creates 2 new
  objects that would again match the first objects require="". This get's tricky
  very quickly.

Steven Armstrong  11/8 11:20 AM
  > <@manis:matrix.org> nico: AFAIK `require` only applies to code execution, so
  the creation of new objects should work (fingers crossed). But I agree that
  there are details that need to be carefully thought through before such a
  change is "attacked".

  [this](https://github.com/asteven/cdist-ng/blob/master/docs/dev/dependencies)
  basically describes the different kinds of dependencies we have to deal
  with. These notes are also from the very very early days of cdist.

nico  11/8 11:21 AM
  > <@asteven:matrix.org> @nico don't you remember the days ;-) My initial
  implementation used shell globs. I used the [same
  code](https://github.com/asteven/cdist-ng/blob/master/cdist/manager.py#L83) in
  my cdist-ng experiment. Using globs instead of a exact match is
  trivial. What's non trivial is that this changes the meaning. It is suddenly
  not deterministic anymore what your require="" means. What require is
  evaluated to suddenly is dependent on which objects have already been
  created. So it may actually change during a execution. In other words, an
  object could be applied because it's requirements are fulfilled. Then the next
  object creates 2 new objects that would again match the first objects
  require="". This get's tricky very quickly.

  That's exactly my point

Manis  11/8 11:23 AM
  I think we're having a violent agreement here :-)

Steven Armstrong  11/8 11:26 AM
  You would basically need a way to mark an object as not-applied during
  execution so that it is re-applied if deps change at runtime. But that would
  mean that ALL types have to be idempotent.

Ander  11/8 11:34 AM
  something something `onchange='__object/which-changed' __another_object to-be-executed-if-first-changes --foo bar`
  how difficult this would be?
  also, what a nice and sunny day for father's day (at least in Estonia)

ungleich-bridge  11/8 1:02 PM
  <matze> ander: You mean the `__another_object/to-be-executed-if-first-changes`
  will **only** be executed if the first one emits a message? Sounds nice.

Ander  11/8 1:03 PM
  not a message, but generates code

ungleich-bridge  11/8 1:03 PM
  <matze> Yes, this makes more sense
  <matze> I wrote something for `__systemd_service` to only apply the action
  (reload, restart) if a required object wrote a message.

Manis  11/8 4:06 PM
  > <@ander:kvlt.ee> something something `onchange='__object/which-changed'
  __another_object to-be-executed-if-first-changes --foo bar`

  When I was looking into how to fix the cdist dependency resolver for !886 I
  came to the conclusion that checking for messages in manifests is evil and
  that something like an `onmessage` environment variable should be added.

  I think adding `onchange` or `onmessage` "filters" should not be too difficult
  to add.  It should essentially be a `require` of that object with some lambda
  that has to be evaluated just before the object would be executed and then
  either execute it or not.

  (In case the object would not be executed, this would create some "dry-run"
  shells of these objects.)

Ander  11/8 6:10 PM
  `onchange=""` var would be awesome
  it's basically same as `require=""`, except it allows execution when
  dependency generated code and with this we wouldn't need `--onchange` on types
  anymore but we would need generic non-idempotent `__execute` instead 😀 tho,
  main win is that you can run types 'on change' and not just plain onliner i'll
  create an issue

GitLab  11/8 6:31 PM
  [ungleich-public / cdist] Ander Punnar opened issue [onchange variable (#843)](https://code.ungleich.ch/ungleich-public/cdist/-/issues/843)
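
The timing problem raised in the log can be illustrated with a small sketch. `expand_require` is a hypothetical helper and not part of cdist; it only shows that a glob in `require` expands to whatever object names happen to exist at the moment it is evaluated, so the same pattern can mean different things at different points of a run.

```sh
# Hypothetical helper, not part of cdist: expand the glob patterns of a
# require string against the object names known at call time.
# The body runs in a subshell, so its variables and `set -f` stay local.
expand_require() (
    set -f  # keep the shell from expanding the patterns against local files
    patterns=$1
    shift
    out=
    for obj in "$@"; do
        for pat in $patterns; do
            case $obj in
                $pat) out="${out:+$out }$obj"; break ;;
            esac
        done
    done
    printf '%s\n' "$out"
)

# Early in the run only one vhost object exists.
early=$(expand_require '__vhost/*' __module/dav __vhost/one)
# Later, some other object has created a second vhost.
late=$(expand_require '__vhost/*' __module/dav __vhost/one __vhost/two)
printf 'early: %s\nlate:  %s\n' "$early" "$late"
```

An object whose requirements were considered fulfilled after the "early" expansion would no longer be fulfilled after the "late" one, which is the re-apply problem Steven describes.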
Author
Owner

Actually, this proposal was discussed a while ago. The current approach is to list all dependencies in the require before calling the type, or to collect them in a variable like this:

__foo bar --test
foo_require="$foo_require __foo/bar"

# ...

require="$foo_require" __systemd_service apache2 --action reload --if-required

This isn't very handy: it is a lot to write, and easy to get wrong because you repeat parts of every call. You can automate building the require string with a function, but that won't be easy either.
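
For illustration, one way such a function might look. This is only a sketch: `track` is a made-up name, not something cdist provides, and it assumes every tracked type takes its object ID as the first argument (so it would not work for singleton types).

```sh
# Hypothetical wrapper, not part of cdist: run a type and append the
# resulting object name (type/object_id) to the named variable.
# Usage: track VARIABLE __type object_id [type arguments...]
track() {
    _track_var=$1
    shift
    "$@" || return
    # Read the current value of the accumulator variable, if any.
    eval "_track_old=\${$_track_var-}"
    # Append "type/object_id", separated by a space when non-empty.
    eval "$_track_var=\"\${_track_old:+\$_track_old }\$1/\$2\""
}

# In a manifest this could be used as:
#   track foo_require __foo bar --test
#   track foo_require __foo baz
#   require="$foo_require" __systemd_service apache2 --action reload --if-required
```

This removes the repetition, but every call site still has to go through the wrapper, and objects created inside other types are not picked up.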

So the best option would be to wildcard-match previously declared objects in the require field. The proposed wildcard match on a per-type basis is fine, but can be annoying if you have many different types to match. With broader wildcards, scope becomes a more important topic: you could just write require="*" to reload the service last.

About scope: what if we introduce scopes explicitly? A single wildcard character would match only the current scope/manifest, and a double wildcard character would match all scopes, in case we need to cover both.

Author
Owner

Indeed. That's what I was referring to with the scoping issues...

The whole idea came about as I was toying with the following concept:

Types must be reusable and generic, whereas machine-specific things should be configured in a manifest (called from the initial manifest).

So, I either write a custom type for every single machine, and then have the luxury of a gencode-remote script, or go with the above rule of thumb, and put the actual config of the machines in (the) manifest(s).

The wildcard feature would come in handy in the latter case.

Author
Owner

Works in your 'top-level' type, but not if your type is a dependency.

Author
Owner

Well, indeed, that's a risk here. However, in practice one would rather do something like this:

__module dav
__module dav_fs
__module cgi --state disabled

__vhost one ...
__vhost two ...
...
__vhost fortytwo

require="__module/dav __module/dav_fs __module/cgi __vhost/*" __systemctl_service apache2 --action restart --if-required

... at least in my mind.

But I do agree, it introduces the need to be careful; and, on second thought, there could be a scoping issue here too...

Author
Owner

I'm worried about the side effects here: it means you can't have __module/anything depend on __systemctl_service/apache2. Won't we have unsolvable dependency loops everywhere if we introduce this?

Reference: ungleich-public/cdist#15