Add documentation and mentions for polyglot scripting capabilities for configuring and extending cdist [POLYGLOT]

2023-04-15 04:50:00 +02:00 · 2023-04-15 04:50:00 +02:00 · ae10ff49dd
parent 9b3505e8a1
commit ae10ff49dd
7 changed files with 673 additions and 67 deletions
--- a/docs/src/cdist-explorer.rst
+++ b/docs/src/cdist-explorer.rst
@ -3,12 +3,29 @@ Explorer

 Description
 -----------
-Explorers are small shell scripts, which will be executed on the target
-host. The aim of each explorer is to give hints to types on how to act on the
+Explorers are small scripts, typically written in POSIX shell,
+which will be executed on the target host.
+The aim of each explorer is to give hints to types on how to act on the
 target system. An explorer outputs the result to stdout, which is usually
 a one liner, but may be empty or multi line especially in the case of
 type explorers.

+.. tip::
+    An :program:`explorer` can be written in **any scripting language**,
+    provided it is executable and has a proper **shebang**.
+
+    Nevertheless, for explorers, it is usually best to stick with the
+    **POSIX shell** in order to minimize
+    requirements on target hosts where they would need to be executed.
+
+    For executable shell code, the recommended shebang is :code:`#!/bin/sh -e`.
+
+    If an :program:`explorer` lacks `execute` permissions,
+    :program:`cdist` assumes it to be written in **shell** and executes it using
+    `$CDIST_REMOTE_SHELL`, which defaults to :code:`/bin/sh -e`.
+
+    For more details and examples, see :doc:`cdist-polyglot`.
+
 There are general explorers, which are run in an early stage, and
 type explorers. Both work almost exactly the same way, with the difference
 that the values of the general explorers are stored in a general location and
@ -32,9 +49,14 @@ error message on stderr, which will cause cdist to abort.
 You can also use stderr for debugging purposes while developing a new
 explorer.

+
 Examples
 --------
-A very simple explorer may look like this::
+A very simple explorer may look like this:
+
+.. code-block:: sh
+
+    #!/bin/sh -e

    hostname

@ -44,6 +66,8 @@ A type explorer, which could check for the status of a package may look like thi

 .. code-block:: sh

+    #!/bin/sh -e
+
    if [ -f "$__object/parameter/name" ]; then
       name="$(cat "$__object/parameter/name")"
    else
--- a/docs/src/cdist-features.rst
+++ b/docs/src/cdist-features.rst
@ -22,8 +22,14 @@ Fast development
    Focus on straightforwardness of type creation is a main development objective
    Batteries included: A lot of requirements can be solved using standard types

-Modern Programming Language
-    cdist is written in Python
+Modern Programming Language (for cdist itself)
+    cdist itself is written in Python
+
+Language-agnostic / Polyglot (for the rest)
+    Although cdist itself is written in Python, it can be configured
+    and extended with any scripting language available.
+
+    (The `POSIX shell <https://en.wikipedia.org/wiki/Unix_shell>`_ is recommended, especially for any code destined to run on target hosts)

 Requirements, Scalability
    No central server needed, cdist operates in push mode and can be run from any computer
@ -44,5 +50,6 @@ UNIX, familiar environment, documentation
    Is available as manpages and HTML

 UNIX, simplicity, familiar environment
-    cdist is configured in POSIX shell
+    The ubiquitous `POSIX shell <https://en.wikipedia.org/wiki/Unix_shell>`_ is the recommended language for configuring and extending cdist.

+    The :program:`Cdist API` is based on simple and familiar UNIX constructs: environment variables, standard I/O, and files/directories
--- a/docs/src/cdist-manifest.rst
+++ b/docs/src/cdist-manifest.rst
@ -3,7 +3,9 @@ Manifest

 Description
 -----------
-Manifests are used to define which objects to create.
+Manifests are scripts that are executed *locally* (on master)
+for the purpose of defining which objects to create.
+
 Objects are instances of **types**, like in object oriented programming languages.
 An object is represented by the combination of
 **type + slash + object name**: **\__file/etc/cdist-configured** is an
@ -24,10 +26,10 @@ at an example::
    # Same with the __directory type
    __directory /tmp/cdist --state present

-These two lines create objects, which will later be used to realise the 
+These two lines create objects, which will later be used to realise the
 configuration on the target host.

-Manifests are executed locally as a shell script using **/bin/sh -e**.
+Manifests are executed *locally* (on master).
 The resulting objects are stored in an internal database.

 The same object can be redefined in multiple different manifests as long as
@ -36,6 +38,20 @@ the parameters are exactly the same.
 In general, manifests are used to define which types are used depending
 on given conditions.

+.. tip::
+
+    A manifest can be written in **any scripting language**,
+    provided that the script is executable and has a proper **shebang**.
+
+    For executable shell code, the recommended shebang is :code:`#!/bin/sh -e`.
+
+    If a :program:`manifest` lacks `execute` permissions, :program:`cdist` assumes
+    it to be written in **shell** and executes it using
+    `$CDIST_LOCAL_SHELL`, which defaults to :code:`/bin/sh -e`.
+
+    For more details and examples, see :doc:`cdist-polyglot`.
+
+.. _cdist-manifest#initial-and-type-manifests:

 Initial and type manifests
 --------------------------
@ -64,14 +80,14 @@ environment variable **__target_host** and/or **__target_hostname** and/or
       ;;
    esac

-This manifest says: Independent of the host, always use the type 
+This manifest says: Independent of the host, always use the type
 **__cdistmarker**, which creates the file **/etc/cdist-configured**,
 with the timestamp as content.
-The directory **/home/services/kvm-vm**, including all parent directories, 
+The directory **/home/services/kvm-vm**, including all parent directories,
 is only created on the host **localhost**.

 As you can see, there is no magic involved, the manifest is simple shell code that
-utilises cdist types. Every available type can be executed like a normal 
+utilises cdist types. Every available type can be executed like a normal
 command.


@ -102,17 +118,17 @@ delimiters including space, tab and newline.

     1 # No dependency
     2 __file /etc/cdist-configured
-     3 
+     3
     4 # Require above object
     5 require="__file/etc/cdist-configured" __link /tmp/cdist-testfile \
     6    --source /etc/cdist-configured  --type symbolic
-     7 
+     7
     8 # Require two objects
     9 require="__file/etc/cdist-configured __link/tmp/cdist-testfile" \
    10    __file /tmp/cdist-another-testfile


-Above the "require" variable is only set for the command that is 
+Above the "require" variable is only set for the command that is
 immediately following it. Dependencies should always be declared that way.

 On line 4 you can see that the instantiation of a type "\__link" object needs
@ -156,7 +172,7 @@ in `cdist execution stages <cdist-stages.html>`_ and of how types work in `cdist

 Create dependencies from execution order
 -----------------------------------------
-You can tell cdist to execute all types in the order in which they are created 
+You can tell cdist to execute all types in the order in which they are created
 in the manifest by setting up the variable CDIST_ORDER_DEPENDENCY.
 When cdist sees that this variable is setup, the current created object
 automatically depends on the previously created object.
@ -288,14 +304,14 @@ and there are no other dependencies from this manifest.

 Overrides
 ---------
-In some special cases, you would like to create an already defined object 
+In some special cases, you would like to create an already defined object
 with different parameters. In normal situations this leads to an error in cdist.
 If you wish, you can setup the environment variable CDIST_OVERRIDE
-(any value or even empty is ok) to tell cdist, that this object override is 
+(any value or even empty is ok) to tell cdist, that this object override is
 wanted and should be accepted.
-ATTENTION: Only use this feature if you are 100% sure in which order 
+ATTENTION: Only use this feature if you are 100% sure in which order
 cdist encounters the affected objects, otherwise this results
-in an undefined situation. 
+in an undefined situation.

 If CDIST_OVERRIDE and CDIST_ORDER_DEPENDENCY are set for an object,
 CDIST_ORDER_DEPENDENCY will be ignored, because adding a dependency in case of
@ -348,11 +364,11 @@ How to override objects:
    # (e.g. for example only sourced if a special application is on the target host)

    # this leads to an error ...
-    __user foobar --password 'some_other_hash' 
+    __user foobar --password 'some_other_hash'

    # this tells cdist, that you know that this is an override and should be accepted
    CDIST_OVERRIDE=yes __user foobar --password 'some_other_hash'
-    # it's only an override, means the parameter --home is not touched 
+    # it's only an override, means the parameter --home is not touched
    # and stays at the original value of /home/foobarexample

 Dependencies defined by execution order work as following:
--- a/docs/src/cdist-polyglot.rst
+++ b/docs/src/cdist-polyglot.rst
@ -0,0 +1,443 @@
+Polyglot
+========
+
+Description
+-----------
+
+Although **cdist** itself is written in **Python**, it features a
+*language-agnostic* (and hence *polyglot*) extension system.
+
+As such, **cdist** can be extended with a mix-and-match of
+**any scripting language** in addition to the usual -and recommended-
+**POSIX shell** (`sh`): `bash`, `perl`, `python`, `ruby`, `node`, ... whatever.
+
+This is true for all extension mechanisms available for **cdist**, namely:
+
+.. list-table::
+
+    * - :doc:`manifests <cdist-manifest>`
+      - (including :ref:`manifest/init <cdist-manifest#initial-and-type-manifests>`
+        and :ref:`type manifests <cdist-type#manifest>`)
+
+    * - :doc:`explorers <cdist-explorer>`
+      - (both **global** and :ref:`type explorers <cdist-type#explorers>`)
+
+    * - :ref:`gencode-* scripts <cdist-type#gencode-scripts>`
+      - (both :program:`gencode-local` and :program:`gencode-remote`)
+
+    * - and even :ref:`generated code <cdist-type#gencode-scripts>`
+      - (i.e. the outputs from
+        :ref:`gencode-* scripts <cdist-type#gencode-scripts>`)
+
+
+.. raw:: html
+
+    <details>
+    <summary>
+        <a>You do not have to commit to any single language...</a>
+    </summary>
+
+.. container::
+
+    .. note::
+
+        It's indeed possible (though not necessarily recommended)
+        to **mix-and-match** different
+        languages when extending **cdist**, for example:
+
+        A **type** could, in principle, have a `manifest` and an **explorer** written
+        in **POSIX shell**, a `gencode-remote` in **Python**
+        (which could generate code in **POSIX shell**) and a `gencode-local`
+        in **Perl** (which could generate code in **Perl**,
+        or some other language), while you are at it...
+
+        Just don't expect to submit such a hodge-podge as a candidate for being
+        distributed with **cdist** itself, though... :-)
+        especially if it turns out to be something that can be achieved with
+        reasonable effort in **POSIX shell**.
+
+        In practice, you would at least want to enforce some consistency, if only for
+        code maintainability and your own sanity, in addition to
+        the `CAVEATS`_ mentioned down below.
+
+.. raw:: html
+
+    </details>
+    <br/>
+
+Needless to say, just because you *can* do something,
+doesn't mean you *should* be doing it, or it's even a *good idea* to do so.
+
+As a general rule of thumb, when extending **cdist**,
+there are many good reasons in favor of sticking with the **POSIX shell**
+wherever you can, and very few in favor of opting for some other
+scripting language.
+
+This is particularly true for any code that is meant to be run *remotely*
+on **target hosts** (such as **explorers**),
+where it is usually important to keep assumptions and requirements/dependencies
+to a bare minimum. See the `CAVEATS`_ down below.
+
+That being said, the **polyglot** capabilities of **cdist** can come
+in quite handy when you really need this sort of thing,
+provided that you are ready to bear the consequences,
+including the burden of extra dependencies
+--- which is usually not that hard for code run *locally* on **master**
+(`manifests`, `gencode-*` scripts, and code generated by `gencode-local`).
+
+In any case, the mere fact of knowing we *can* escape the POSIX hatch
+if we really have to, can be quite comforting for those of us suffering
+from POSIX claustrophobia... which *is* of course a real health hazard
+associated with high anxiety levels and all,
+in case you didn't already know... ;-)
+
+
+Writing polyglot extensions for **cdist**
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Whatever the kind of script (`manifest`, explorer, ...) you are writing,
+you need to ensure that all 3 conditions below are met:
+
+1.  your script starts with an appropriate **shebang** line, such as::
+
+      #!/usr/bin/env bash
+
+    .. comment: It would have been nice to make use of an extension
+        (such as `"sphinx_design"`) which provides a `.. dropdown::`
+        directive (for toggling visibility) which is the reason for
+        the ugly `.. raw:: html` stuff below...
+
+    .. raw:: html
+
+        <details>
+        <summary><a>It's usually preferable to rely on the <b>env</b> program...</a></summary>
+
+    .. container::
+
+        It's usually preferable to rely on the :program:`env` program,
+        like in the example above, to find the interpreter by searching the PATH.
+
+        The :program:`env` program is almost guaranteed to exist even on a rudimentary
+        UNIX/Linux system at a pretty stable location: `/usr/bin/env`
+
+        It is, of course, also possible to write down a **hard coded** path
+        for the interpreter, if you are certain that it will always be
+        located at that location, like so::
+
+            #!/bin/bash
+
+        This may sometimes be desirable, for example when you want to ensure
+        that a specific version of an interpreter is used, or when you are unsure about
+        what might get found through the PATH.
+
+    .. raw:: html
+
+        </details>
+
+2.  your script has "*execute*" permissions set (in the Unix/Linux sense),
+    like so::
+
+        chmod a+x /path/to/your/script
+
+    This is essentially what matters to **cdist**, which it will take as a
+    clue for invoking your script *directly* (instead of passing it
+    to a shell as an argument).
+
+    For **generated code**, `cdist` will automatically take care of setting
+    *execute* permissions for you,
+    based on the presence of a leading **shebang** within the generated code.
+
+3.  the **interpreter** referenced by the **shebang** is available on any host(s)
+    where your code will run.
+
+.. raw:: html
+
+    <details>
+    <summary>
+        <a>
+        Even for the <b>POSIX shell</b>,
+        it is still recommended to <b>follow the same guidelines</b> outlined above.
+        </a>
+    </summary>
+
+.. note::
+
+    Even if you are just writing for the **POSIX shell**,
+    it is still recommended to follow the same guidelines outlined above.
+
+    At the very least, make sure your script has a proper **shebang**.
+
+    -   If you have been following the usual **cdist** advice:
+            you probably already have a proper **shebang** at the very beginning
+            of your POSIX shell scripts.
+
+
+    -   If (and *only* if), your POSIX shell script *does* contain a proper **shebang**:
+            you are also encouraged to give it *"execute"* permissions,
+            so that your **shebang** will actually get honored.
+
+.. raw:: html
+
+    </details>
+    <br/>
+
+
+That's pretty much it... except...
+
+.. seealso:: The `CAVEATS`_ below.
+
+
+CAVEATS
+^^^^^^^^^^^^
+
+Shebang and execute permissions
+"""""""""""""""""""""""""""""""""
+In general, the first two conditions above are trivial to satisfy:
+Just make sure you put in a **shebang** and mark your script as *executable*.
+
+
+**Beware**, however, that:
+
+.. attention::
+
+    -   If your script lacks `execute` permissions (regardless of any **shebang**):
+            **cdist** will end up passing your script to `/bin/sh -e`
+            (or to `local_shell` / `remote_shell`,
+            if one is configured for the current context),
+            which may or may not be what you want.
+
+    -   If your script *does* have `execute` permissions but *lacks* a **shebang**:
+            you can no longer be sure which interpreter (if any) will end up running your script.
+
+            What is certain, on the other hand, is that there is a wide range of
+            different things that could happen in such a case, depending on the OS and the chain
+            of execution up to that point...
+
+            It is possible (but not certain) that, in such a case, your script may
+            end up getting fed into `/bin/sh` or the default shell
+            (whatever it happens to be for the current user).
+
+            There's even a legend according to which even `csh` may get a chance to feed
+            on your script, and then proceed to burning your barn...
+
+            So, don't do that.
+
+
+
+
+Interpreter availability
+"""""""""""""""""""""""""""""""""
+
+For the last condition (interpreter availability),
+your mileage may vary for languages other than the **POSIX shell**.
+
+- For scripts meant to be run *locally* on the **master**, things remain relatively easy:
+    All you may need, if anything,
+    is a one-time installation of the interpreter (and any modules/libraries it needs).
+
+    So, things should be relatively easy when it comes to: :file:`manifest` and :file:`gencode-*` scripts themselves, as well as any code generated by :file:`gencode-local`.
+
+
+- For scripts meant to be run *remotely* on **target hosts**, things might get quite tricky,
+    depending on how likely it is
+    for the desired **interpreter** to be installed by default
+    on the **target system**.
+
+    This is an important concern for :file:`explorer` scripts
+    and any code generated by :file:`gencode-remote`.
+
+    .. warning::
+
+        Apart from the POSIX shell (`/bin/sh`), there aren't many interpreters out
+        there that are likely to have a guaranteed presence on a pristine system.
+
+        At the very least, you would have to make sure that the required interpreter
+        (and any extra modules/libraries your script might depend on)
+        are indeed available on those host(s)
+        before your script is invoked...
+        which kind of goes against the near-zero-dependency philosophy embraced
+        by **cdist**.
+
+        Depending on the target host OS, you might get lucky with
+        `bash`, `perl`, or `python` being preinstalled.
+        Even then, those may not necessarily be the version you expect
+        or have the extra modules/libraries your script might require.
+
+        **You have been warned.**
+
+
+More details
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+As mentioned earlier, **cdist** itself mostly cares about the script
+being marked as an *executable*, which it will take as a clue for invoking
+that script *directly* (instead of passing it to a shell as an argument).
+
+The **shebang** magic is handled by the usual process `exec` mechanisms
+of the host OS (where the script is invoked) that will take over from
+that point on.
+
+
+Here is a simplified summary:
+
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+| executable? | shebang       | invocation resembles         | interpreter  | remarks                                                |
+=============+===============+==============================+==============+========================================================+
+| yes         | `#!/bin/sh`   | `/path/to/script`            | `/bin/sh`    | shebang **honored** by OS                              |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+| yes         | `#!/bin/bash` | `/path/to/script`            | `/bin/bash`  | shebang **honored** by OS                              |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+| yes         |               | `/path/to/script`            | *uncertain*  | shebang **absent**                                     |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+| no          | `#!/bin/sh`   | `/bin/sh -e /path/to/script` | `/bin/sh -e` | shebang **irrelevant** (as script is not "executable") |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+| no          | `#!/bin/bash` | `/bin/sh -e /path/to/script` | `/bin/sh -e` | shebang **irrelevant** (as script is not "executable") |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+| no          |               | `/bin/sh -e /path/to/script` | `/bin/sh -e` | shebang **irrelevant** (as script is not "executable") |
+-------------+---------------+------------------------------+--------------+--------------------------------------------------------+
+
+In fact, it's a little bit more involved than the above. Remember:
+
+- As a special case, for any **generated code** (output by `gencode-*` scripts),
+  **cdist** will solely rely on the presence (or absence) of a leading **shebang**,
+  and set the executable bits accordingly, for obvious reasons.
+
+- In the end, if a script is NOT marked as "executable",
+  it will simply be passed as an argument to the configured shell
+  that corresponds to the relevant context (i.e. `local_shell` or `remote_shell`),
+  if one is defined within the **cdist** configuration,
+  or else to `/bin/sh -e`, as a fallback in both cases.
+
+Well, there are also some gory implementation details
+(related to how environment variables get propagated),
+but those should normally have no relevance to this discussion.
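
The decision summarised in the table above can be modelled in a few lines of Python. This is only an illustrative sketch of the documented behaviour, not cdist's actual implementation: the function names (`is_executable`, `invocation`, `should_mark_executable`) are hypothetical, and the shell-splitting of the configured shell string is an assumption.

```python
import os
import shlex

DEFAULT_SHELL = "/bin/sh -e"  # fallback shell named in the text above


def is_executable(script_path):
    """Tell whether the script has execute permissions for the current user."""
    return os.access(script_path, os.X_OK)


def invocation(script_path, executable, configured_shell=None):
    """Return the argv list that the invocation would resemble (hypothetical model)."""
    if executable:
        # Invoked directly: the OS exec mechanism honours the shebang, if any.
        return [script_path]
    # Not executable: any shebang is irrelevant, the script is passed to a shell.
    shell = configured_shell or DEFAULT_SHELL
    return shlex.split(shell) + [script_path]


def should_mark_executable(generated_code):
    """For generated code, only a leading shebang decides the executable bits."""
    return generated_code.startswith("#!")
```

For a script on disk, one would call `invocation(path, is_executable(path))`, optionally passing the shell configured for the local or remote context as `configured_shell`.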
+
+
+The API between **cdist** and any polyglot extensions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Conceptually, the API, based on well-known UNIX constructs,
+remains exactly the same as it is for
+any extension written for the **POSIX shell**.
+
+Basically, you are all set as long as your scripting language is capable of:
+
+- accessing **environment variables**;
+- reading from and writing to the **filesystem** (files, directories, ...);
+- reading from :file:`STDIN` and writing to :file:`STDOUT` (and possibly to :file:`STDERR`);
+- **executing** other programs/commands;
+- **exiting** with an appropriate **status code** (where 0=>success).
+
+For all we know, no serious scripting language out there
+would be missing any such basics.
+
+The actual syntax and mechanisms will obviously be different,
+the shell idioms usually being much more concise for this sort of thing,
+as expected.
+
+See the below example entitled "`Interacting with the cdist API`_".
+
+
+Examples
+-------------------
+
+Interacting with the cdist API
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+As an API example, here's an excerpt from a **cdist** `type manifest`,
+written for the POSIX shell, showing how one would get at the name
+of the kernel on the **target host**::
+
+    kernel_name=$(cat "${__global}/explorer/kernel_name")
+
+    # ... do something with kernel_name ...
+
+
+In a nutshell, the above snippet gives the general idea about the cdist API:
+
+Basically, we are stuffing a shell variable with the contents of a file...
+which happens to contain the output from the `kernel_name` explorer...
+
+Before invoking our `manifest` script,  **cdist** would have, among other things,
+run all **global explorers** on the **target host**,
+collected and copied their outputs under a temporary directory on the **master**, and
+set a specific environment variable (`$__global`)
+to the path of a specific subdirectory of that temporary working area.
+
+At this point, that file (which contains the kernel name) is sitting there,
+ready to be slurped... which can obviously be done from any language
+that can access environment variables and read files from the filesystem...
+
+Here's how you could do the same thing in **Python**:
+
+.. code-block:: python
+
+    #!/usr/bin/env python
+
+    import os
+
+    def read_file(path):
+        content = ""
+        try:
+            with open(path, "r") as fd:
+                content = fd.read().rstrip('\n')
+        except EnvironmentError:
+            pass
+        return content
+
+    kernel_name = read_file(os.path.join(os.environ['__global'], 'explorer', 'kernel_name'))
+
+    # ... do something with kernel_name ...
+
+
+And in **Perl**, it could look like:
+
+.. code-block:: perl
+
+    #!/usr/bin/env perl
+
+    sub read_file {
+        my ($path) = @_;
+        return unless open( my $fh, '<', $path );
+        local ($/);
+        <$fh>
+    }
+
+    my $kernel_name = read_file("$ENV{__global}/explorer/kernel_name");
+
+    # ... do something with kernel_name ...
+
+
+Incidentally, this example also helps appreciate some aspects of programming
+for the shell... which was designed for this sort of thing in the first place...
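
Reading explorer output covers only part of the API. A manifest also has to *execute other programs*, since every type is invoked like a normal command, and a dependency is passed through the `require` environment variable. The following Python sketch shows one way this might look. The helper names (`type_command`, `create_object`, `main`) are hypothetical and not part of cdist; the type invocations mirror the `__file` / `__link` examples from the manifest documentation.

```python
#!/usr/bin/env python3

import os
import subprocess


def type_command(type_name, object_id, **params):
    """Build the argv list for invoking a cdist type as a command."""
    argv = [type_name, object_id]
    for key, value in params.items():
        argv += ["--" + key, str(value)]
    return argv


def create_object(argv, require=None, run=subprocess.run):
    """Invoke a type; `require` is set only for this one command."""
    env = dict(os.environ)
    if require:
        env["require"] = require
    # check=True raises CalledProcessError if the type exits non-zero,
    # which ends the script with a non-zero status when left uncaught.
    return run(argv, env=env, check=True)


def main():
    # No dependency
    create_object(type_command("__file", "/etc/cdist-configured"))
    # Require the object above
    create_object(
        type_command("__link", "/tmp/cdist-testfile",
                     source="/etc/cdist-configured", type="symbolic"),
        require="__file/etc/cdist-configured",
    )
```

A real manifest would end with a call to `main()`; it is left out here so that the helpers can be imported and tried without cdist types being on the PATH.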
+
+A polygot type explorer (in Perl)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Here's an imaginary type explorer written in **Perl**,
+that outputs the version of the perl interpreter running on the target host:
+
+.. code-block:: perl
+
+    #!/usr/bin/env perl
+
+    use English;
+
+    print "${PERL_VERSION}\n";
+
+If the path to the intended interpreter can be ascertained, you can
+put that down directly on the **shebang**, like so::
+
+     #!/usr/bin/perl
+
+However, more often than not, you would want to rely
+on the `env` program (`/usr/bin/env`) to
+invoke the first interpreter with the given name (`perl`, in this case)
+found on the current PATH, like in the above example.
+
+Don't forget to set *execute* permissions on the script file::
+
+    chmod a+x ...
+
+Or else **cdist** will feed it to a shell instance...
+which may burn your barn... :-)
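
The same approach applies to `gencode-*` scripts, whose output is itself code. As described earlier, generated code that begins with a shebang is marked executable by cdist, so the generated snippet can declare its own interpreter. The sketch below is a hypothetical `gencode-remote` written in Python that emits POSIX shell. The helper names and the emitted message are illustrative only, and it assumes that `$__object/parameter/name` is readable from a gencode script in the same way as in the type explorer example.

```python
#!/usr/bin/env python3

import os
import shlex
import sys


def read_parameter(object_dir, name, default=""):
    """Return the value of an object parameter, or `default` if it is not set."""
    path = os.path.join(object_dir, "parameter", name)
    try:
        with open(path, "r") as fd:
            return fd.read().rstrip("\n")
    except OSError:
        return default


def generate_code(name):
    """Return shell code; the leading shebang marks it as directly executable."""
    message = "object name parameter: " + name
    # shlex.quote keeps the value safe when embedded in the generated shell code.
    return "#!/bin/sh -e\n" + "printf '%s\\n' " + shlex.quote(message) + "\n"


def main():
    name = read_parameter(os.environ["__object"], "name")
    sys.stdout.write(generate_code(name))
```

As with any gencode script, whatever is written to stdout becomes the generated code, so a real script would end with a call to `main()`.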
--- a/docs/src/cdist-type.rst
+++ b/docs/src/cdist-type.rst
@ -79,9 +79,9 @@ then this content is printed as a deprecation messages, e.g.:

 .. code-block:: sh

-    $ ls -l deprecated 
+    $ ls -l deprecated
    -rw-r--r--  1 darko  darko  71 May 20 18:30 deprecated
-    $ cat deprecated 
+    $ cat deprecated
    This type is deprecated. It will be removed in the next minor release.
    $ echo '__foo foo' | ./bin/cdist config -i - 185.203.112.26
    WARNING: 185.203.112.26: Type __foo is deprecated: This type is deprecated. It will be removed in the next minor release.
@ -90,7 +90,7 @@ If 'deprecated' marker has no content then general message is printed, e.g.:

 .. code-block:: sh

-    $ ls -l deprecated 
+    $ ls -l deprecated
    -rw-r--r--  1 darko  darko  0 May 20 18:36 deprecated
    $ echo '__bar foo' | ./bin/cdist config -i - 185.203.112.26
    WARNING: 185.203.112.26: Type __bar is deprecated.
@ -98,41 +98,132 @@ If 'deprecated' marker has no content then general message is printed, e.g.:

 How to write a new type
 -----------------------
-A type consists of
-
- parameter    (optional)
- manifest     (optional)
- singleton    (optional)
- explorer     (optional)
- gencode      (optional)
- nonparallel  (optional)

 Types are stored below cdist/conf/type/. Their name should always be prefixed with
-two underscores (__) to prevent collisions with other executables in $PATH.
+two underscores (__) to prevent collisions with other executables in :code:`$PATH`.

-To implement a new type, create the directory **cdist/conf/type/__NAME**.
+To implement a new type, create the directory :file:`cdist/conf/type/{__NAME}`,
+either manually or using the helper script `cdist-new-type <man1/cdist-new-type.html>`_
+which will also create the basic skeleton for you.

-Type manifest and gencode can be written in any language. They just need to be
-executable and have a proper shebang. If they are not executable then cdist assumes
-they are written in shell so they are executed using '/bin/sh -e' or 'CDIST_LOCAL_SHELL'.
+A type consists of the following elements (all of which are currently *optional*):

-For executable shell code it is suggested that shebang is '#!/bin/sh -e'.
+* some **markers** in the form of **plain files** within the type's directory:

-For creating type skeleton you can use helper script
-`cdist-new-type <man1/cdist-new-type.html>`_.
+    .. list-table::

+        * - :file:`singleton`
+          - *(optional)*
+          - A type flagged as a :file:`singleton` may be used **only
+            once per host**, which is useful for types that can be used only once on a
+            system.
+
+            .. raw:: html
+
+                <br/>
+
+            Singleton types do not take an object name as argument.
+
+        * - :file:`nonparallel`
+          - *(optional)*
+          - Objects of a type flagged as :file:`nonparallel` cannot be run in parallel
+            when using :code:`-j` option.
+
+            .. raw:: html
+
+                <br/>
+
+
+            An example of such a type is :program:`__package_dpkg` type
+            where :program:`dpkg` itself cannot run as more than one instance at a time.
+
+        * - :file:`install`
+          - *(optional)*
+          - A type flagged as :file:`install` is used only with the :command:`install` command.
+            With other :program:`cdist` commands, e.g. :command:`config`, such types are skipped if used.
+
+        * - :file:`deprecated`
+          - *(optional)*
+          - A type flagged as :file:`deprecated` causes
+            :program:`cdist` to emit a **warning** whenever that type is used.
+
+            .. raw:: html
+
+                <br/>
+
+            If the file that corresponds to the `deprecated` marker has any content,
+            then this is used as a custom **deprecation message** for the warning.
+
+* some more **metadata**:
+
+    .. list-table::
+
+        * - :file:`parameter/\*`
+          - *(optional)*
+          - A type may have **parameters**. These must be declared following a simple convention described in `Defining parameters`_, which
+            permits specifying additional properties for each parameter:
+
+                * required or optional
+                * single-value or multi-value
+                * string or boolean
+
+            It is also possible to give a `default` value for each optional parameter.
+
+* and some **code** (scripts):
+
+    .. list-table::
+
+        * - :file:`manifest`
+          - *(optional)*
+          - :doc:`Type manifest <cdist-manifest>`
+
+        * - :file:`explorer/*`
+          - *(optional)*
+          - Any number of :doc:`type explorer <cdist-explorer>` scripts may exist under :file:`explorer` subdirectory.
+
+        * - :file:`gencode-local`
+          - *(optional)*
+          - A script that generates code to be executed *locally* (on master).
+
+        * - :file:`gencode-remote`
+          - *(optional)*
+          - A script that generates code to be executed *remotely* (on target host).
+
+
+.. tip::
+
+    Each of the above-mentioned scripts can be written in **any scripting language**,
+    provided that the script is executable and has a proper **shebang**.
+
+    For executable shell code, the recommended shebang is :code:`#!/bin/sh -e`.
+
+    If a script lacks `execute` permissions, :program:`cdist` assumes
+    it to be written in **shell** and executes it using
+    `$CDIST_LOCAL_SHELL` or `$CDIST_REMOTE_SHELL`, if one is defined
+    for the current execution context (*local* or *remote*),
+    or else falling back to :code:`/bin/sh -e`.
+
+
+    For any code that may run on remote target hosts
+    (i.e. **explorers** and any code generated by :code:`gencode-remote`),
+    it is recommended to stick to **POSIX shell**
+    in order to minimize requirements on the target hosts where it would need to be executed.
+
+    For more details and examples, see :doc:`cdist-polyglot`.
+
+.. seealso:: `cdist execution stages <cdist-stages.html>`_

 Defining parameters
 -------------------
 Every type consists of required, optional and boolean parameters, which must
 each be declared in a newline separated file in **parameter/required**,
-**parameter/required_multiple**, **parameter/optional**, 
+**parameter/required_multiple**, **parameter/optional**,
 **parameter/optional_multiple** and **parameter/boolean**.
 Parameters which are allowed multiple times should be listed in
 required_multiple or optional_multiple respectively. All other parameters
 follow the standard unix behaviour "the last given wins".
 If either is missing, the type will have no required, no optional, no boolean
-or no parameters at all. 
+or no parameters at all.

 Default values for optional parameters can be predefined in
 **parameter/default/<name>**.
@ -237,7 +328,7 @@ In the __file type, stdin is used as source for the file, if - is used for sourc
        source="$(cat "$__object/parameter/source")"
        if [ "$source" = "-" ]; then
            source="$__object/stdin"
-        fi  
+        fi
    ....


@ -307,6 +398,7 @@ stdin from */dev/null*:
        done < "$__object/parameter/foo"
    fi

+.. _cdist-type#manifest:

 Writing the manifest
 --------------------
@ -380,6 +472,7 @@ in your type directory:

 For example, package types are nonparallel types.

+.. _cdist-type#explorers:

 The type explorers
 ------------------
@ -402,6 +495,7 @@ client, like this (shortened version from the type __file):
       md5sum < "$destination"
    fi

+.. _cdist-type#gencode-scripts:

 Writing the gencode script
 --------------------------
--- a/docs/src/cdist-why.rst
+++ b/docs/src/cdist-why.rst
@ -4,44 +4,65 @@ Why should I use cdist?
 There are several motivations to use cdist, these
 are probably the most popular ones.

-Known language
--------------
+No need to learn a new language
+-------------------------------

-Cdist is being configured in
-`shell script <https://en.wikipedia.org/wiki/Shell_script>`_.
-Shell script is used by UNIX system engineers for decades.
-So when cdist is introduced, your staff does not need to learn a new
+When adopting cdist, your staff does not need to learn a new
 `DSL <https://en.wikipedia.org/wiki/Domain-specific_language>`_
-or programming language.
+or programming language, as cdist can be configured
+and extended in **any scripting language**, the recommended one
+being `shell scripts <https://en.wikipedia.org/wiki/Shell_script>`_.
+
+Shell scripts enjoy ubiquity: they have been widely used by UNIX system engineers
+for decades, and a suitable interpreter (:code:`/bin/sh`) is all but
+guaranteed to be available on target hosts.
+
+
+Easy idempotence -- without having to give up control
+-----------------------------------------------------------------------------
+
+For the sake of `idempotence <https://en.wikipedia.org/wiki/Idempotence>`_, many **contemporary SCMs** choose to ditch the power and versatility of general purpose programming languages, and adopt some form of
+declarative `DSL <https://en.wikipedia.org/wiki/Domain-specific_language>`_ for describing the desired end states on target systems.
+
+:program:`Cdist` takes a quite different approach, enabling *both* `idempotence <https://en.wikipedia.org/wiki/Idempotence>`_ *and* a decent level of programming power.
+
+Unlike other SCMs, :program:`cdist` allows you to use a general purpose scripting language (POSIX shell is recommended) for describing the desired end states on target systems, instead of some declarative `DSL <https://en.wikipedia.org/wiki/Domain-specific_language>`_.
+
+Unlike regular scripting, however, you are not left on your own to ensure `idempotence <https://en.wikipedia.org/wiki/Idempotence>`_. :program:`Cdist` makes this really easy.
+
+It does not matter how many times you "invoke" **cdist types**, or in which order: :program:`cdist` ensures that the actual code associated with each type is executed only once, in dependency order. That code may, in turn, end up being a no-op if the actual state already matches the desired one.
+
+.. TODO: It would be great if there were an "architectural overview" page which could be referenced from here.
+

 Powerful language
-----------------
+--------------------

-Not only is shell scripting widely known by system engineers,
-but it is also a very powerful language. Here are some features
-which make daily work easy:
+Compared to a typical `DSL <https://en.wikipedia.org/wiki/Domain-specific_language>`_,
+shell scripting offers a much more powerful language.
+Here are some features which make daily work easy:

- * Configuration can react dynamically on explored values
+ * Ability to dynamically adapt configuration based on information
+   *explored* from target hosts
 * High level string manipulation (using sed, awk, grep)
 * Conditional support (**if, case**)
 * Loop support (**for, while**)
- * Support for dependencies between cdist types
+ * Variable expansion
+ * Support for dependencies between cdist types and objects
+
+If and when needed, it's always possible to simply
+make use of **any other scripting language** at your disposal
+*(albeit at the expense of adding a dependency on the corresponding interpreter
+and libraries)*.

-More than shell scripting
-------------------------

-If you compare regular shell scripting with cdist, there is one major
-difference: When using cdist types,
-the results are
-`idempotent <https://en.wikipedia.org/wiki/Idempotence>`_.
-In practise that means it does not matter in which order you
-call cdist types, the result is always the same.

 Zero dependency configuration management
----------------------------------------
+-----------------------------------------

 Cdist requires very little on a target system. Even better,
-in almost all cases all dependencies are usually fulfilled.
+in almost all cases all dependencies are already
+fulfilled.
 Cdist does not require an agent or high level programming
 languages on the target host: it will run on any host that
 has a **ssh server running** and a POSIX compatible shell
--- a/docs/src/index.rst
+++ b/docs/src/index.rst
@ -30,6 +30,7 @@ It natively supports IPv6 since the first release.
   cdist-type
   cdist-types
   cdist-explorer
+   cdist-polyglot
   cdist-messaging
   cdist-parallelization
   cdist-inventory