ELEKS Labs: Deployment techniques in the QNX P2P network

    Which deployment tool is the best choice? The question comes up on almost every project, whether it is open-source, proprietary, personal or corporate. And the answer is... it depends! It depends on your needs, on the tools available to you and, of course, on your skills.

Deployment approaches overview

    The easiest approach is to use the packaging tools available for your operating system. These tools generate self-contained packages that hold the files of your application. This is convenient and user-friendly, because the packages can either be copied to each machine and installed manually, or served from your own repository server that the operating system is configured to use. This type of deployment saves a lot of time, but it is not always available, and sometimes an application is too complicated to be packed into a system package.
    Another approach is to use software that already provides the deployment facilities you need, so that your task is only to write a proper configuration for it. Such tools often also keep versions of the deployed software, which is quite nice: if there is a problem, you can always revert to the previous working installation. Capistrano and Vlad the Deployer are examples of such tools.
    But what if we face something more hardcore? No packages, no version control, no third-party tools. We cannot use git, python, ruby or even many of the GNU coreutils. Welcome to an almost bare QNX Neutrino 6.5 installation! All we have is a Korn shell, tar and a few other QNX built-ins. Our goal is to create a deployment system that deploys our packages to a set of other nodes in the QNX network. Those nodes are extremely bare: the binaries of the QNX core and the cp, mv, uname and date utilities are all that is installed on them.

Qnet magic

    This task looks next to impossible at first glance, at least until we learn about the Qnet protocol, one of the nicest core features of the QNX Neutrino operating system. Qnet lets tightly coupled, trusted machines share their resources, so standard utilities can manipulate files anywhere on the QNX network as if they were on your own machine. In addition, the Qnet protocol doesn't do any authentication of remote requests; files are protected by the normal permissions that apply to users and groups, which is also a cool feature.
    Qnet name resolution is handled by the Qnet protocol manager, which creates a /net directory and handles all network interactions through it. The pathnames of all remote nodes appear under this directory, and you can manage their files and processes as if they were on your local machine.
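    Because remote nodes are just directories under /net, ordinary shell constructs are enough to enumerate them and to address their files. The following is a minimal sketch under that assumption; the function names and the NET_ROOT override are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: treat the Qnet /net directory as a plain directory tree.
# NET_ROOT is a hypothetical override so the functions can be tried
# against any directory; on a QNX node it would stay at /net.
NET_ROOT="${NET_ROOT:-/net}"

# Print one node name per line: every directory entry under NET_ROOT.
list_nodes() {
    for entry in "$NET_ROOT"/*; do
        [ -d "$entry" ] && echo "${entry##*/}"
    done
    return 0
}

# Build the path under which a node's file is visible from other nodes.
remote_path() {
    # $1 = node name, $2 = absolute path on that node
    echo "$NET_ROOT/$1$2"
}
```

    With these helpers, something like cp app.conf "$(remote_path node2 /etc/app.conf)" would copy a file to a node called node2 (a made-up name), subject to the usual file permissions.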
    We can use this nice feature for our needs. For example, since we can run processes on the remote nodes as if they were our own, we can write a script that takes useful utilities from our node and runs them over Qnet on these not-so-functional remote nodes. This approach has one drawback, though: if we execute a script on the remote machine over Qnet, we need two separate scripts (client-server style), where the main script executes another script that must already be present on the remote nodes. That is not a good design, because it is not guaranteed that we can place any third-party script on those nodes. But we can go even further with Qnet magic and execute our own deployment script on another node using the “on” utility, which executes a command on another node or terminal. We’ll use it to execute the script on the network nodes.
    So the deployment process looks as if our script executes itself over the network on the remote node with different parameters: a kind of recursive network execution, or a boomerang. The “on” utility has a bunch of options, such as running the command as a specified user on the remote node, setting the command priority, opening a specified terminal and some others.
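    The boomerang idea can be sketched as a thin wrapper around “on”. This is only an illustration: the function names and the ON_CMD override are hypothetical, and the only “on” option used is -f, the same one the deployment script relies on.

```shell
#!/bin/sh
# Sketch: run a command on a remote node through the "on" utility.
# ON_CMD is a hypothetical override that lets the wrapper be dry-run on a
# machine without QNX; by default it is the real "on" utility.
ON_CMD="${ON_CMD:-on}"

# run_on_node NODE COMMAND [ARGS...]
run_on_node() {
    rn_node="$1"
    shift
    $ON_CMD -f "$rn_node" "$@"
}

# Relaunch a script that lives on this node on another node. The path goes
# through /net/<this host>, so nothing has to be copied beforehand.
relaunch_on_node() {
    # $1 = target node, $2 = this host's name, $3 = absolute script path
    rl_node="$1"; rl_host="$2"; rl_script="$3"
    shift 3
    run_on_node "$rl_node" sh "/net/$rl_host$rl_script" "$@"
}
```

    For example, relaunch_on_node node2 "$HOSTNAME" /opt/deploy.sh --slave would ask node2 to run the master's copy of a script stored at /opt/deploy.sh (both names are made up here).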

Deployment script

    The deployment script will have two modes: master mode for the node it is executed on and slave mode for the network nodes. A generic template for such a deployment script can look like this:

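# SLAVE_MODE, NODES_TO_UPDATE and PACKAGES are expected to be set by
# argument parsing that is not shown in this template.
if [ "$SLAVE_MODE" -eq "0" ]; then
    # MASTER MODE

    # Absolute path of this script as seen from other nodes through /net
    SCRIPT_PATH=$( cd -P -- "$(dirname -- "$(command -v -- "$0")")" && pwd -P )
    SCRIPT_NAME="${0##*/}"
    NETWORK_SCRIPT_PATH="/net/${HOSTNAME}${SCRIPT_PATH}/${SCRIPT_NAME}"

    if [ -z "$NODES_TO_UPDATE" ]; then
        # No nodes given: take every node visible under /net
        NODES_TO_UPDATE=$(ls /net/)
    else
        # split_string is a helper (not shown) that splits the node list
        NODES_TO_UPDATE=$( split_string "$NODES_TO_UPDATE" )
    fi

    for node in $NODES_TO_UPDATE; do
        REMOTE_COMMAND="$NETWORK_SCRIPT_PATH --slave -p $PACKAGES"

        # Start the slave in the background so the nodes update in parallel
        on -f "$node" sh -c "$REMOTE_COMMAND" &
    done

    echo "Waiting for all nodes..."
    wait
    echo "Deployment finished"

else
    # SLAVE MODE

    # PACKAGES_TO_UPDATE holds the list received through -p (parsing not
    # shown); install stands for the package installation routine.
    for package in $PACKAGES_TO_UPDATE; do
        install "$package"
    done
fi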

    The SLAVE_MODE variable is false (0) when we run the script on the master node and true when it is executed recursively over the network. One of the cornerstones of such a deployment system is a correct network path to the script itself, because the script is executed in quite an unusual way; NETWORK_SCRIPT_PATH holds that path. Then, I assume, the list of nodes we want to update has been passed to the script as a parameter in master mode. If not, we can always grab all available nodes from the /net directory. The next step is to run our script with the “on” utility on each node from this list. We append an ampersand to the “on” command so that the master script continues while each slave does its job in the background. After all slave processes have been spawned, we wait for them to terminate using the wait command. When the script is executed on a remote node in slave mode, it gets the list of packages to update as a parameter, loops over it and processes each package in turn.
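    The template leaves out how the parameters reach the variables. One possible sketch of that parsing follows. The --slave, -p and --additional-path= options come from the commands shown in this article; the -n option, the comma-separated node list and the rule that -p consumes all remaining arguments are assumptions made for this example.

```shell
#!/bin/sh
# Sketch of the option parsing the template assumes (see the lead-in for
# which parts are assumptions rather than the original script).

# Turn "a,b,c" into "a b c" so the result can drive a for loop.
# The body runs in a subshell, so the IFS change stays local.
split_string() (
    IFS=','
    set -f
    set -- $1
    IFS=' '
    echo "$*"
)

parse_args() {
    SLAVE_MODE=0
    PACKAGES_TO_UPDATE=""
    NODES_TO_UPDATE=""
    ADDITIONAL_PATH=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --slave)
                SLAVE_MODE=1
                ;;
            --additional-path=*)
                ADDITIONAL_PATH="${1#--additional-path=}"
                ;;
            -n)
                # Hypothetical option: comma-separated list of nodes
                [ $# -ge 2 ] || { echo "-n needs a node list" >&2; return 1; }
                NODES_TO_UPDATE=$(split_string "$2")
                shift
                ;;
            -p)
                # Assumed to be the last option: the rest are packages
                shift
                PACKAGES_TO_UPDATE="$*"
                break
                ;;
            *)
                echo "unknown option: $1" >&2
                return 1
                ;;
        esac
        shift
    done
}
```

    Calling parse_args "$@" at the top of the script would then fill the variables used by both modes.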
    We can also add a deployment verification procedure for consistency. We can generate a token, say the current date and time, and save it in the deployment directory or the user's home folder. In master mode we can loop over the nodes one more time to check whether the deployed version is correct. In slave mode we can skip the update if the deployed version is already the most recent one, and update the stored version after the package installation loop finishes successfully.
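    A minimal sketch of such a token check is shown here. The file name .deployed_version and the function names are assumptions made for this example, and "most recent" is simplified to "equal to the token being deployed". The helpers rely only on echo, read and redirection, on the assumption that these are provided by the shell itself on the bare nodes.

```shell
#!/bin/sh
# Sketch of the verification token (file and function names are
# hypothetical, not taken from the original script).
VERSION_FILE_NAME=".deployed_version"

# Print the token stored in a deployment directory (empty if absent).
read_version() {
    rv_token=""
    if [ -f "$1/$VERSION_FILE_NAME" ]; then
        read -r rv_token < "$1/$VERSION_FILE_NAME" || true
    fi
    echo "$rv_token"
}

# Record the token once the installation loop has finished successfully.
write_version() {
    # $1 = deployment directory, $2 = token
    echo "$2" > "$1/$VERSION_FILE_NAME"
}

# Succeed (exit status 0) when the node still has to be updated.
needs_update() {
    # $1 = deployment directory, $2 = token being deployed
    [ "$(read_version "$1")" != "$2" ]
}
```

    The master could create the token with TOKEN=$(date +%Y%m%d%H%M%S), pass it to the slaves, and afterwards compare it with the file it reads from each node's deployment directory under /net.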
    Also, if we want to run some utilities over the network, we can pass slave mode the path to the binaries on the node where the master script runs, like this:

MASTER_BIN_PATHES="/net/$HOSTNAME/bin:/net/$HOSTNAME/usr/bin"
...
REMOTE_COMMAND="$NETWORK_SCRIPT_PATH --slave --additional-path=$MASTER_BIN_PATHES -p $PACKAGES"

    And use it in the slave mode:

if [ -n "$ADDITIONAL_PATH" ]; then
    export PATH="$PATH:$ADDITIONAL_PATH"
fi

    Or use a direct network path to the needed utilities:

NET_TAR="$MASTER_BIN_PATH/tar"
$NET_TAR -xzf "$PACKAGE_PATH" -C "$INSTALL_DIR"
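
    When several master directories are passed, the slave may want to locate a utility explicitly instead of relying on PATH. The sketch below shows one way to do that; resolve_tool is a hypothetical helper and not part of the original script.

```shell
#!/bin/sh
# Sketch: find a utility in a colon-separated list of directories, such as
# the master's bin directories seen through /net.
# The body runs in a subshell, so the IFS change stays local.
resolve_tool() (
    # $1 = utility name, $2 = colon-separated directory list
    IFS=':'
    set -f
    for dir in $2; do
        if [ -x "$dir/$1" ]; then
            echo "$dir/$1"
            exit 0
        fi
    done
    exit 1
)
```

    In slave mode this could be used as NET_TAR=$(resolve_tool tar "$ADDITIONAL_PATH") || exit 1, so the script stops early when the utility is not reachable.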

    We can improve the existing solution in a number of ways:

use a helper script to retrieve the packages to update from our build server, if we have one
add a parameter to master mode that forces an update of all nodes in case of broken installations (a correct deployment file but broken binaries)
add pre-install and post-install hooks to slave mode
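
    The hook idea from the list above could look roughly like this. The package layout (a directory that may contain pre-install and post-install scripts) and the install_package placeholder are assumptions made for this sketch.

```shell
#!/bin/sh
# Sketch of pre-install and post-install hooks in slave mode (layout and
# names are hypothetical).

# Run a hook if the package provides it; a missing hook is not an error.
run_hook() {
    # $1 = package directory, $2 = hook name
    if [ -f "$1/$2" ]; then
        sh "$1/$2"
    fi
}

# Placeholder for the real installation step (for example unpacking tar).
install_package() {
    echo "installing $1"
}

# Install one package; stop at the first failing step.
install_with_hooks() {
    run_hook "$1" pre-install || return 1
    install_package "$1" || return 1
    run_hook "$1" post-install || return 1
}
```

    A failing pre-install hook then prevents the installation itself, which leaves the node in its previous state.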

Conclusion

    Writing the deployment system yourself is the last of the approaches. In some cases the restrictions make this a bit challenging, as we saw with the bare QNX installation. On the other hand, we can use unique QNX technologies to solve the problem efficiently.

9/30/2013
