Make cdist truely language-agnostic (polyglot) #365

Open
tabulon wants to merge 6 commits from tabulon/cdist:dev-tabulon-feat-polyglot-PR1 into master
First-time contributor

The changes in this PR allow Cdist to become essentially language-agnostic (polyglot).

Please refer to #249 for earlier discussion, including : rationale, the suggested solution, and possible caveats...

Prior to this PR, cdist did have some support for the shebang for scripts running locally, without a need to set the CDIST_LOCAL_SHELL environment variable, as long the script file has EXECUTE permissions set,

In practice, this meant that we could write manifest and gencode in any scripting language.

The missing pieces were:

  • scripts destined for remote target hosts (e.g. explorers)
  • generated code (output from gencode-*, be it local or remote)

This PR extends the polyglot support to include all possible scripts currently used for configuring or extending cdist.
This is done by:

  • running remote executable scripts directly (instead of feeding them to a remote shell), which by itself enables transparent shebang support for those;
    (this was already the case when running executable scripts locally)

  • automatically setting execute permissions on scripts that correspond to code generated via gencode-* scripts (local or remote), upon detecting a shebang.

As a reminder, the shebang mecanism is directly supported by Unix/Linux kernels for srcripts with execute permissions. Therefore, in order to support it, cdist doesn't really need to parse or do anything clever with it.

Note that the actual code changes are quite minimal and the bulk of the changes contained herein pertain to documentation and tests.

Here are some questions and points to watch out for.

Code

  1. Spotted (but didn't fix) a potential bug which is present in both remote.py and local.py
    (the "-e" switch appears to be systematically passed to $CDIST_LOCAL_SHELL / $CDIST_REMOTE_SHELL as if it were /bin/sh.
    Actually, there are no guarantees for the -e switch to be undrestood universally.)

  2. When running a script, we only check for execute permissions, without even looking for a shebang.
    This is the expected -usual- behaviour from a runner...

    But could this pose issues for backwards compatibility ?

    Quite unlikely, I would think: The only scripts that could perhaps misbehave are those that have execute permissions but no shebang (see the polyglot documentation)

Documentation:

I have taken the liberty to add :

  1. a dedicated page for "polyglot" scripting
  2. a "tip" and a reference to that new page on several relevant places (explorer, manifest, type, ...)
  3. some more text to the two overview pages: "why?" and "features"

It is possible that I have taken too much liberty with regards to documentation (especially the 3rd item above).
Please let me know.

Tests:

  1. For one of the test modules (/cdist/test/code/__init__.py), the change appears to be much more than what it really is, mainly because
    I had to resort to burrying the original test class (CodeTestCase) so as to be able to subclass it safely.

Let me know if you have any questions / remarks / suggestions.

Cheers,
Tabulon

The changes in this PR allow **Cdist** to become essentially **language-agnostic (polyglot)**. Please refer to #249 for earlier discussion, including : rationale, the suggested solution, and possible caveats... Prior to this PR, `cdist` did have some support for the **shebang** for scripts running **locally**, without a need to set the CDIST_LOCAL_SHELL environment variable, as long the script file has **EXECUTE** permissions set, In practice, this meant that we could write `manifest` and `gencode` in any scripting language. The missing pieces were: - scripts destined for remote target hosts (e.g. explorers) - generated code (output from `gencode-*`, be it local or remote) This PR extends the polyglot support to include all possible scripts currently used for configuring or extending `cdist`. This is done by: - running remote **executable** scripts directly (instead of feeding them to a remote shell), which by itself enables transparent shebang support for those; (this was already the case when running executable scripts locally) - automatically setting *execute* permissions on scripts that correspond to code generated via gencode-* scripts (local or remote), upon detecting a shebang. As a reminder, the **shebang** mecanism is directly supported by Unix/Linux kernels for srcripts with *execute* permissions. Therefore, in order to support it, `cdist` doesn't really need to parse or do anything clever with it. Note that the actual code changes are quite minimal and the bulk of the changes contained herein pertain to documentation and tests. Here are some questions and points to watch out for. ### Code 1) Spotted (but didn't fix) a potential bug which is present in both remote.py and local.py (the "-e" switch appears to be systematically passed to `$CDIST_LOCAL_SHELL` / `$CDIST_REMOTE_SHELL` as if it were `/bin/sh`. Actually, there are no guarantees for the `-e` switch to be undrestood universally.) 2) When running a script, we only check for execute permissions, without even looking for a shebang. This is the expected -usual- behaviour from a runner... But could this pose issues for backwards compatibility ? Quite unlikely, I would think: The only scripts that could perhaps misbehave are those that have execute permissions but no shebang (see the polyglot documentation) ### Documentation: I have taken the liberty to add : 1) a dedicated page for "polyglot" scripting 2) a "tip" and a reference to that new page on several relevant places (explorer, manifest, type, ...) 3) some more text to the two overview pages: "why?" and "features" It is possible that I have taken too much liberty with regards to documentation (especially the 3rd item above). Please let me know. ### Tests: 1) For one of the test modules (`/cdist/test/code/__init__.py`), the change appears to be much more than what it really is, mainly because I had to resort to burrying the original test class (`CodeTestCase`) so as to be able to subclass it safely. _______ Let me know if you have any questions / remarks / suggestions. Cheers, Tabulon
tabulon added 6 commits 2023-04-15 03:36:30 +00:00
Author
First-time contributor

Hi @nico,

Any idea when you could get a chance to review this?
(I hope it can happen when things are still sort of fresh in my mind)

Ayhan

Hi @nico, Any idea when you could get a chance to review this? *(I hope it can happen when things are still sort of fresh in my mind)* Ayhan
Author
First-time contributor

Hi @nico,

Is everything allright ?

Hi @nico, Is everything allright ?
tabulon changed title from WIP: Cdist becomes truely language-agnostic (polyglot) to Cdist becomes truely language-agnostic (polyglot) 2024-03-24 12:08:19 +00:00
tabulon changed title from Cdist becomes truely language-agnostic (polyglot) to Make cdist truely language-agnostic (polyglot) 2024-03-24 12:24:14 +00:00
Author
First-time contributor

Hi @nico (or other maintainers of cdist),

Is there anything I can do expedite the review process around this PR which I had submitted about a year ago, following your initial encouragement on the fringes of #249 ?

FYI: Just in case the WIP prefix causes some notifications to be filtered out on your end, I have just removed that prefix and renamed the PR, which has anyhow been ready for review from the begining.

At this point in time, it will probably be necessary to rebase the PR on current master before it can be merged. I can take care of that once you complete the review -- or beforehand, if you really prefer.

Or else, if you see any issues with the feature or its implementation in this PR, please let me know.

(I am cross-posting this comment on #249 in order to increase its chances to get noticed).

Hi @nico (or other maintainers of cdist), Is there anything I can do expedite the review process around this PR which I had submitted about a year ago, following your initial encouragement on the fringes of #249 ? _FYI: Just in case the WIP prefix causes some notifications to be filtered out on your end, I have just removed that prefix and renamed the PR, which has anyhow been ready for review from the begining._ At this point in time, it will probably be necessary to rebase the PR on current master before it can be merged. I can take care of that once you complete the review -- or beforehand, if you really prefer. Or else, if you see any issues with the feature or its implementation in this PR, please let me know. (I am cross-posting this comment on #249 in order to increase its chances to get noticed).
fnux self-assigned this 2024-04-24 15:30:40 +00:00
Collaborator

I'll review this on Friday.

I'll review this on Friday.
fnux requested changes 2024-04-26 08:26:06 +00:00
fnux left a comment
Collaborator

Hei Tabulon,

I quickly went through this. The implementation itself looks good.. but, as you mentioned, you did take too much liberty with the documentation.

As far as cdist upstream is concerned, you added logic so that advanced users can executed manifests & friends with arbitrary interpreters... as in any shell environment. That's nice, but it doesn't give cdist a "language-agnostic extension system" - and it's not because you can that you should do it. cdist users are still expected and will continue to use /bin/sh, and upstream cdist won't accept or encourage anything else anytime soon.

I like the idea to be able to use arbitrary shebangs (= we're closer to a standard shell environment), but I consider using non-sh manifests/scripts to usually be bad manner:

  • It increases the maintenance burden.
    • Both technical and human-wise.
  • It makes distribution much harder.
  • There's a risk to fragment the ecosystem.

The use-case I see is for advanced users with very specific needs to develop their own integrations, tied to their environment. 10 lines of documentation saying "you actually can use arbitrary shebangs" would have been enough as far as I am concerned.

Can you update your changes by just mentioning that while we use sh, it is possible to use arbitrary shebangs (more infos on the polyglot page)? I'd also like the polyglot page to adapted so that it is clear that it is not something you should do (= stick to sh unless you have very specific needs, and think about it twice).

Note: I haven't checked out the tests yet - please explain why you had to nest the test classes first.

Best,
Timo

Hei Tabulon, I quickly went through this. The implementation itself looks good.. but, as you mentioned, you did take too much liberty with the documentation. As far as cdist upstream is concerned, you added logic so that advanced users can executed manifests & friends with arbitrary interpreters... as in any shell environment. That's nice, but it doesn't give cdist a "language-agnostic extension system" - and it's not because you can that you should do it. cdist users are still expected and will continue to use /bin/sh, and upstream cdist won't accept or encourage anything else anytime soon. I like the idea to be able to use arbitrary shebangs (= we're closer to a standard shell environment), but I consider using non-sh manifests/scripts to usually be bad manner: * It increases the maintenance burden. - Both technical and human-wise. * It makes distribution much harder. * There's a risk to fragment the ecosystem. The use-case I see is for advanced users with very specific needs to develop their own integrations, tied to their environment. 10 lines of documentation saying "you actually can use arbitrary shebangs" would have been enough as far as I am concerned. Can you update your changes by just mentioning that while we use sh, it is possible to use arbitrary shebangs (more infos on the polyglot page)? I'd also like the polyglot page to adapted so that it is clear that it is not something you should do (= stick to sh unless you have very specific needs, and think about it twice). Note: I haven't checked out the tests yet - please explain why you had to nest the test classes first. Best, Timo
@ -37,123 +38,191 @@ my_dir = op.abspath(op.dirname(__file__))
fixtures = op.join(my_dir, 'fixtures')
conf_dir = op.join(fixtures, 'conf')
class Burried:
Collaborator

I haven't read the details - why do we/you need this now?

I haven't read the details - why do we/you need this now?
@ -0,0 +62,4 @@
fd.close()
return value
def slurp_file(path, ignore=None):
Collaborator

Is this used anywhere?

Is this used anywhere?
@ -10,2 +11,4 @@
type explorers.
.. tip::
An :program:`explorer` can be written in **any scripting language**,
Collaborator

Same as for the manifest doc, "While explorer are typically written in POSIX shell, ..."

Same as for the manifest doc, "While explorer are typically written in POSIX shell, ..."
@ -45,3 +51,3 @@
UNIX, simplicity, familiar environment
cdist is configured in POSIX shell
The ubiquitious `POSIX shell <https://en.wikipedia.org/wiki/Unix_shell>`_ is the recommended language for configuring and extending cdist.
Collaborator

I would just mention here that it is possible to user languages via shebang, just as in your standard UNIX environment. CDIST users are expected to (and will) use sh... and advanced users can plug something else in specific places if they want to. Putting it along the core features makes the whole thing pretty confusing for new users.

I would just mention here that it is possible to user languages via shebang, just as in your standard UNIX environment. CDIST users are expected to (and will) use sh... and advanced users can plug something else in specific places if they want to. Putting it along the core features makes the whole thing pretty confusing for new users.
@ -4,3 +4,3 @@
Description
-----------
Manifests are used to define which objects to create.
Manifests are scripts that are executed *locally* (on master)
Collaborator

(on master)

master is confusing, please remove it.

... and we already mention a few lines below that the manifests are executed locally. I don't see much use to this change.

> (on master) `master` is confusing, please remove it. ... and we already mention a few lines below that the manifests are executed locally. I don't see much use to this change.
@ -28,3 +30,3 @@
configuration on the target host.
Manifests are executed locally as a shell script using **/bin/sh -e**.
Manifests are executed *locally* (on master).
Collaborator

Manifest are executed locally as a shell script using /bin/sh -e, unless another shebang as been set.
Please let the default / expected behavior in this sentence.

Manifest are executed locally as a shell script using /bin/sh -e, unless another shebang as been set. Please let the default / expected behavior in this sentence.
@ -38,1 +40,4 @@
.. tip::
A manifest can be written in **any scripting language**,
Collaborator

Again, don't take the default behavior away.

While manifest are usually written in shell, it is possible to ....

Again, don't take the default behavior away. While manifest are usually written in shell, it is possible to ....
@ -0,0 +5,4 @@
-----------
Although **cdist** itself is written in **Python**, it features a
*language-agnostic* (and hence *polyglot*) extension system.
Collaborator

cdist types are written in POSIX shell, and it happen that advanced users can call gencode-* and manifests with arbitrary interpreters. There's no language-agnostic extension system - and cdist upstream won't accept non-shell types anytime soon (much harder to maintain, much harder to ship).

Please add a big fat "ADVANCED USERS / AT YOUR OWN RISKS" warning at the top of this page.

cdist types are written in POSIX shell, and it happen that advanced users can call gencode-* and manifests with arbitrary interpreters. There's no language-agnostic extension system - and cdist upstream won't accept non-shell types anytime soon (much harder to maintain, much harder to ship). Please add a big fat "ADVANCED USERS / AT YOUR OWN RISKS" warning at the top of this page.
@ -0,0 +34,4 @@
<details>
<summary>
<a>You do not have to commit to any single language...</a>
Collaborator

Do not advertise the polyglot feature as something you should be doing. You should not, unless you know what you're doing.

Do not advertise the polyglot feature as something you should be doing. You should not, unless you know what you're doing.
@ -4,44 +4,65 @@ Why should I use cdist?
There are several motivations to use cdist, these
are probably the most popular ones.
Known language
Collaborator

I wouldn't make any change to this page. cdist stays what it is, even if you can execute manifests with arbitrary interpreters.

I wouldn't make any change to this page. cdist stays what it is, even if you can execute manifests with arbitrary interpreters.
Owner

I had a short look at the comments above and I think generally speaking we are talking about a "control host" or "source host" where we configure the target hosts from. I personally find introducing new references to "master" not exactly contemporary.

I had a short look at the comments above and I think generally speaking we are talking about a "control host" or "source host" where we configure the *target hosts* from. I personally find introducing new references to "master" not exactly contemporary.
This pull request can be merged automatically.
This branch is out-of-date with the base branch
You are not authorized to merge this pull request.
You can also view command line instructions.

Step 1:

From your project repository, check out a new branch and test the changes.
git checkout -b tabulon-dev-tabulon-feat-polyglot-PR1 master
git pull dev-tabulon-feat-polyglot-PR1

Step 2:

Merge the changes and update on Gitea.
git checkout master
git merge --no-ff tabulon-dev-tabulon-feat-polyglot-PR1
git push origin master
Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: ungleich-public/cdist#365
No description provided.