Polyglot
========

Description
-----------

Although **cdist** itself is written in **Python**, it features a
*language-agnostic* (and hence *polyglot*) extension system.

As such, **cdist** can be extended with a mix-and-match of
**any scripting language** in addition to the usual (and recommended)
**POSIX shell** (`sh`): `bash`, `perl`, `python`, `ruby`, `node`, ... whatever.

This is true for all extension mechanisms available for **cdist**, namely:

.. list-table::

   * - :doc:`manifests <cdist-manifest>`
     - (including :ref:`manifest/init <cdist-manifest#initial-and-type-manifests>`
       and :ref:`type manifests <cdist-type#manifest>`)

   * - :doc:`explorers <cdist-explorer>`
     - (both **global** and :ref:`type explorers <cdist-type#explorers>`)

   * - :ref:`gencode-* scripts <cdist-type#gencode-scripts>`
     - (both :program:`gencode-local` and :program:`gencode-remote`)

   * - and even :ref:`generated code <cdist-type#gencode-scripts>`
     - (i.e. the outputs from
       :ref:`gencode-* scripts <cdist-type#gencode-scripts>`)

.. raw:: html

   <details>
   <summary>
   <a>You do not have to commit to any single language...</a>
   </summary>

.. container::

   .. note::

      It's indeed possible (though not necessarily recommended)
      to **mix-and-match** different
      languages when extending **cdist**, for example:

      A **type** could, in principle, have a `manifest` and an **explorer** written
      in **POSIX shell**, a `gencode-remote` in **Python**
      (which could generate code in **POSIX shell**) and a `gencode-local`
      in **Perl** (which could generate code in **Perl**,
      or some other language), while you are at it...

      Just don't expect to submit such a hodge-podge as a candidate for being
      distributed with **cdist** itself, though... :-)
      especially if it turns out to be something that can be achieved with
      reasonable effort in **POSIX shell**.

      In practice, you would at least want to enforce some consistency, if only for
      code maintainability and your own sanity, in addition to
      the `CAVEATS`_ mentioned below.

.. raw:: html

   </details>
   <br/>

Needless to say, just because you *can* do something
doesn't mean you *should* be doing it, or that it's even a *good idea* to do so.

As a general rule of thumb, when extending **cdist**,
there are many good reasons in favor of sticking with the **POSIX shell**
wherever you can, and very few in favor of opting for some other
scripting language.

This is particularly true for any code that is meant to be run *remotely*
on **target hosts** (such as **explorers**),
where it is usually important to keep assumptions and requirements/dependencies
to a bare minimum. See the `CAVEATS`_ below.

That being said, the **polyglot** capabilities of **cdist** can come in
quite handy when you really need this sort of thing,
provided that you are ready to bear the consequences,
including the burden of extra dependencies,
which is usually not that hard to carry for code run *locally* on the **master**
(`manifests`, `gencode-*` scripts, and code generated by `gencode-local`).

In any case, the mere fact of knowing we *can* escape the POSIX hatch
if we really have to, can be quite comforting for those of us suffering
from POSIX claustrophobia... which *is* of course a real health hazard
associated with high anxiety levels and all,
in case you didn't already know... ;-)


Writing polyglot extensions for **cdist**
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Whatever the kind of script (`manifest`, explorer, ...) you are writing,
you need to ensure that all 3 conditions below are met:

1. your script starts with an appropriate **shebang** line, such as::

      #!/usr/bin/env bash

   .. comment: It would have been nice to make use of an extension
      (such as `"sphinx_design"`) which provides a `.. dropdown::`
      directive (for toggling visibility) which is the reason for
      the ugly `.. raw:: html` stuff below...

   .. raw:: html

      <details>
      <summary><a>It's usually preferable to rely on the <b>env</b> program...</a></summary>

   .. container::

      It's usually preferable to rely on the :program:`env` program,
      like in the example above, to find the interpreter by searching the PATH.

      The :program:`env` program is almost guaranteed to exist even on a rudimentary
      UNIX/Linux system at a pretty stable location: `/usr/bin/env`

      It is, of course, also possible to write down a **hard coded** path
      for the interpreter, if you are certain that it will always be
      found at that location, like so::

         #!/bin/bash

      This may sometimes be desirable, for example when you want to make sure
      that a specific version of an interpreter is used, or when you are unsure about
      what might get found through the PATH.

   .. raw:: html

      </details>

2. your script has "*execute*" permissions set (in the Unix/Linux sense),
   like so::

      chmod a+x /path/to/your/script

   This is essentially what matters to **cdist**, which it will take as a
   clue for invoking your script *directly* (instead of passing it
   to a shell as an argument).

   For **generated code**, `cdist` will automatically take care of setting
   *execute* permissions for you,
   based on the presence of a leading **shebang** within the generated code.

3. the **interpreter** referenced by the **shebang** is available on any host(s)
   where your code will run.

.. raw:: html

   <details>
   <summary>
   <a>
   Even for the <b>POSIX shell</b>,
   it is still recommended to <b>follow the same guidelines</b> outlined above.
   </a>
   </summary>

.. note::

   Even if you are just writing for the **POSIX shell**,
   it is still recommended to follow the same guidelines outlined above.

   At the very least, make sure your script has a proper **shebang**.

   - If you have been following the usual **cdist** advice:
     you probably already have a proper **shebang** at the very beginning
     of your POSIX shell scripts.

   - If (and *only* if) your POSIX shell script *does* contain a proper **shebang**:
     you are also encouraged to give it *"execute"* permissions,
     so that your **shebang** will actually get honored.

.. raw:: html

   </details>
   <br/>


That's pretty much it... except...

.. seealso:: The `CAVEATS`_ below.

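
The three conditions above can be checked mechanically before a script is
handed to **cdist**. The following is only a minimal sketch in Python and is
not part of **cdist**: the `check_script` helper and its messages are
hypothetical, and it assumes that an `env`-style shebang names the interpreter
as its first argument (options such as `env -S` are not handled). Note that it
looks for the interpreter on the host where the check runs, which says nothing
about any other host.

.. code-block:: python

   import os
   import shutil
   import stat


   def check_script(path):
       """Return a list of problems found with the script at path."""
       problems = []

       # Condition 1: the first line must be a shebang naming an interpreter.
       with open(path, "rb") as fd:
           first_line = fd.readline().decode("utf-8", errors="replace").strip()
       words = first_line[2:].split() if first_line.startswith("#!") else []
       if not words:
           problems.append("missing shebang")
           interpreter = None
       elif os.path.basename(words[0]) == "env" and len(words) > 1:
           # "#!/usr/bin/env bash" means: look up "bash" on the PATH.
           interpreter = words[1]
       else:
           interpreter = words[0]

       # Condition 2: at least one execute bit (user, group or other) is set.
       mode = os.stat(path).st_mode
       if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
           problems.append("not executable")

       # Condition 3: the interpreter can be found on the host running this check.
       if interpreter is not None and shutil.which(interpreter) is None:
           problems.append("interpreter not found: " + interpreter)

       return problems

An empty list means that all three conditions hold on the host where the check
was run; for code meant to run remotely, the third condition would still have
to be verified on the target hosts themselves.
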

CAVEATS
^^^^^^^^^^^^

Shebang and execute permissions
"""""""""""""""""""""""""""""""""

In general, the first two conditions above are trivial to satisfy:
just make sure you put in a **shebang** and mark your script as *executable*.

**Beware**, however, that:

.. attention::

   - If your script lacks `execute` permissions (regardless of any **shebang**):
     **cdist** will end up passing your script to `/bin/sh -e`
     (or to `local_shell` / `remote_shell`,
     if one is configured for the current context),
     which may or may not be what you want.

   - If your script *does* have `execute` permissions but *lacks* a **shebang**:
     you can no longer be sure which interpreter (if any) will end up running your script.

     What is certain, on the other hand, is that there is a wide range of
     different things that could happen in such a case, depending on the OS and the chain
     of execution up to that point...

     It is possible (but not certain) that, in such a case, your script may
     end up getting fed into `/bin/sh` or the default shell
     (whatever it happens to be for the current user).

     There's even a legend according to which `csh` may get a chance to feed
     on your script, and then proceed to burn your barn...

     So, don't do that.


Interpreter availability
"""""""""""""""""""""""""""""""""

For the last condition (interpreter availability),
your mileage may vary for languages other than the **POSIX shell**.

- For scripts meant to be run *locally* on the **master**, things remain relatively easy:
  all you may need, if anything,
  is a one-time installation of the interpreter and whatever it depends on.

  So, things should be relatively easy when it comes to
  :file:`manifest` and :file:`gencode-*` scripts themselves,
  as well as any code generated by :file:`gencode-local`.

- For scripts meant to be run *remotely* on **target hosts**, things might get quite tricky,
  depending on how likely it is
  for the desired **interpreter** to be installed by default
  on the **target system**.

  This is an important concern for :file:`explorer` scripts
  and any code generated by :file:`gencode-remote`.

  .. warning::

     Apart from the POSIX shell (`/bin/sh`), there aren't many interpreters out
     there that are likely to have a guaranteed presence on a pristine system.

     At the very least, you would have to make sure that the required interpreter
     (and any extra modules/libraries your script might depend on)
     is indeed available on those host(s)
     before your script is invoked...
     which kind of goes against the near-zero-dependency philosophy embraced
     by **cdist**.

     Depending on the target host OS, you might get lucky with
     `bash`, `perl`, or `python` being preinstalled.
     Even then, those may not necessarily be the version you expect
     or have the extra modules/libraries your script might require.

     **You have been warned.**


More details
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As mentioned earlier, **cdist** itself mostly cares about the script
being marked as an *executable*, which it will take as a clue for invoking
that script *directly* (instead of passing it to a shell as an argument).

The **shebang** magic is handled by the usual process `exec` mechanisms
of the host OS (where the script is invoked) that will take over from
that point on.


Here is a simplified summary:

+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
| executable? | shebang       | invocation resembles         | interpreter  | remarks                                                |
+=============+===============+==============================+==============+========================================================+
| yes         | `#!/bin/sh`   | `/path/to/script`            | `/bin/sh`    | shebang **honored** by OS                              |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
| yes         | `#!/bin/bash` | `/path/to/script`            | `/bin/bash`  | shebang **honored** by OS                              |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
| yes         |               | `/path/to/script`            | *uncertain*  | shebang **absent**                                     |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
| no          | `#!/bin/sh`   | `/bin/sh -e /path/to/script` | `/bin/sh -e` | shebang **irrelevant** (as script is not "executable") |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
| no          | `#!/bin/bash` | `/bin/sh -e /path/to/script` | `/bin/sh -e` | shebang **irrelevant** (as script is not "executable") |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
| no          |               | `/bin/sh -e /path/to/script` | `/bin/sh -e` | shebang **irrelevant** (as script is not "executable") |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+

In fact, it's a little bit more involved than the above. Remember:

- As a special case, for any **generated code** (output by `gencode-*` scripts),
  **cdist** will rely solely on the presence (or absence) of a leading **shebang**,
  and set the executable bits accordingly, since generated code is captured
  as output and has no permissions of its own.

- In the end, if a script is NOT marked as "executable",
  it will simply be passed as an argument to the configured shell
  that corresponds to the relevant context (i.e. `local_shell` or `remote_shell`),
  if one is defined within the **cdist** configuration,
  or else to `/bin/sh -e`, as a fallback in both cases.

Well, there are also some gory implementation details
(related to how environment variables get propagated),
but those should normally have no relevance to this discussion.
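
The table and remarks above boil down to a small decision procedure.
The sketch below restates it in Python for illustration only; it is not the
actual **cdist** implementation. The names `build_command` and
`save_generated_code` are hypothetical, the `configured_shell` parameter stands
in for whatever `local_shell` or `remote_shell` resolves to, and the `0o700` /
`0o600` permission modes are assumptions.

.. code-block:: python

   import os


   def build_command(script, configured_shell=None):
       """Return the argument list that would be used to run script.

       configured_shell is a list such as ["/bin/bash"], or None when
       no shell has been configured for the current context.
       """
       if os.access(script, os.X_OK):
           # Executable: invoke directly and let the OS honor the shebang.
           return [script]
       # Not executable: pass the script to a shell as an argument,
       # falling back to "/bin/sh -e" when no shell is configured.
       shell = list(configured_shell) if configured_shell else ["/bin/sh", "-e"]
       return shell + [script]


   def save_generated_code(code, path):
       """Write generated code to path, executable only if it has a shebang."""
       with open(path, "w") as fd:
           fd.write(code)
       # The modes used here are assumptions made for this sketch.
       os.chmod(path, 0o700 if code.startswith("#!") else 0o600)

With these helpers, generated code that starts with a shebang ends up being
invoked directly, while anything else is handed over to a shell, which mirrors
the "yes" and "no" rows of the table.
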


The API between **cdist** and any polyglot extensions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Conceptually, the API, based on well-known UNIX constructs,
remains exactly the same as it is for
any extension written for the **POSIX shell**.

Basically, you are all set as long as your scripting language is capable of:

- accessing **environment variables**;
- reading from and writing to the **filesystem** (files, directories, ...);
- reading from :file:`STDIN` and writing to :file:`STDOUT` (and possibly to :file:`STDERR`);
- **executing** other programs/commands;
- **exiting** with an appropriate **status code** (where 0=>success).

As far as we know, no serious scripting language out there
would be missing any such basics.
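
To make the list above concrete, here is a minimal Python sketch that exercises
each of the five capabilities in turn. It is not tied to any real **cdist**
extension point: the `DEMO_DIR` variable, the `input-copy` file name and the
`main` function are made up for illustration, and `uname -s` merely serves as
an example of an external command.

.. code-block:: python

   import os
   import subprocess
   import sys


   def main(environ, stdin, stdout, stderr):
       """Copy STDIN to a file, print the kernel name, return an exit status."""
       # Environment variables: DEMO_DIR is a hypothetical variable name.
       workdir = environ.get("DEMO_DIR")
       if workdir is None:
           # Diagnostics go to STDERR; a non-zero status signals failure.
           print("DEMO_DIR is not set", file=stderr)
           return 1

       # STDIN and the filesystem: save whatever was piped in.
       with open(os.path.join(workdir, "input-copy"), "w") as fd:
           fd.write(stdin.read())

       # Executing another program and capturing its output.
       result = subprocess.run(["uname", "-s"], stdout=subprocess.PIPE,
                               universal_newlines=True)
       if result.returncode != 0:
           print("uname failed", file=stderr)
           return result.returncode

       # STDOUT: emit the result; status 0 signals success.
       stdout.write(result.stdout)
       return 0

   # A real script would end with:
   #   sys.exit(main(os.environ, sys.stdin, sys.stdout, sys.stderr))

Passing the environment and the streams in as arguments is only a design
choice that keeps the sketch easy to try out; a real script could just as well
use `os.environ` and the `sys` streams directly.
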

The actual syntax and mechanisms will obviously be different,
the shell idioms usually being much more concise for this sort of thing,
as expected.

See the example below, entitled "`Interacting with the cdist API`_".


Examples
-------------------

Interacting with the cdist API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As an API example, here's an excerpt from a **cdist** `type manifest`,
written for the POSIX shell, showing how one would get at the name
of the kernel on the **target host**::

   kernel_name=$(cat "${__global}/explorer/kernel_name")

   # ... do something with kernel_name ...


In a nutshell, the above snippet gives the general idea about the cdist API:

Basically, we are stuffing a shell variable with the contents of a file...
which happens to contain the output from the `kernel_name` explorer...

Before invoking our `manifest` script, **cdist** would have, among other things,
run all **global explorers** on the **target host**,
collected and copied their outputs under a temporary directory on the **master**, and
set a specific environment variable (`$__global`)
to the path of a specific subdirectory of that temporary working area.

At this point, that file (which contains the kernel name) is sitting there,
ready to be slurped... which can obviously be done from any language
that can access environment variables and read files from the filesystem...

Here's how you could do the same thing in **Python**:

.. code-block:: python

   #!/usr/bin/env python

   import os

   def read_file(path):
       """Return the contents of path without trailing newlines ("" on error)."""
       content = ""
       try:
           with open(path, "r") as fd:
               content = fd.read().rstrip('\n')
       except EnvironmentError:
           # A missing or unreadable file simply yields an empty string.
           pass
       return content

   kernel_name = read_file(os.environ['__global'] + '/explorer/kernel_name')

   # ... do something with kernel_name ...


And in **Perl**, it could look like:

.. code-block:: perl

   #!/usr/bin/env perl

   sub read_file {
       my ($path) = @_;
       # Return nothing if the file cannot be opened.
       return unless open( my $fh, '<', $path );
       # Undefine the input record separator to slurp the whole file at once.
       local ($/);
       my $content = <$fh>;
       # Drop trailing newlines, as the shell's $(...) would.
       $content =~ s/\n+\z//;
       return $content;
   }

   my $kernel_name = read_file("$ENV{__global}/explorer/kernel_name");

   # ... do something with kernel_name ...


Incidentally, this example also helps appreciate some aspects of programming
for the shell... which was designed for this sort of thing in the first place...

A polyglot type explorer (in Perl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here's an imaginary type explorer written in **Perl**,
that outputs the version of the perl interpreter running on the target host:

.. code-block:: perl

   #!/usr/bin/env perl

   use English;

   print "${PERL_VERSION}\n";

If the path to the intended interpreter can be ascertained, you can
put that down directly on the **shebang**, like so::

   #!/usr/bin/perl

However, more often than not, you would want to rely
on the `env` program (`/usr/bin/env`) to
invoke the first interpreter with the given name (`perl`, in this case)
found on the current PATH, like in the above example.

Don't forget to set *execute* permissions on the script file::

   chmod a+x ...

Or else **cdist** will feed it to a shell instance...
which may burn your barn... :-)
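
For comparison, the same idea carries over to other languages. The following
hypothetical explorer, sketched in Python, prints the version of the Python
interpreter found on the target host. It assumes that a `python3` executable
can be found on the PATH there, and the `interpreter_version` name is made up
for illustration.

.. code-block:: python

   #!/usr/bin/env python3

   import platform


   def interpreter_version():
       """Return the version string of the running Python interpreter."""
       return platform.python_version()


   # An explorer reports its result by writing it to STDOUT.
   print(interpreter_version())

As with the Perl version, the script needs *execute* permissions for its
**shebang** to be honored.
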