The playbook in the provision-jenkins.yml file in this repository pulls in default values for many of the configuration parameters needed to deploy Jenkins from two places: the vars/jenkins.yml file and the default configuration file (the config.yml file). The parameters defined in these files provide a reasonable set of defaults for a fairly generic Jenkins deployment, including the ports that the Jenkins instances should listen on and the packages that must be installed on the node before the jenkins service can be started.
While you may never need to change most of these values from their defaults, there are a number of parameters that can be used to customize your deployment, so a brief summary of what each is and how it is used could be helpful. In this section, we summarize all of these options, breaking them out into:
- parameters used to control the Ansible playbook run
- parameters used to configure new nodes that are created in a cloud (AWS or OpenStack) environment
- parameters used during the deployment process itself, and
- parameters used to configure our Jenkins nodes once Jenkins has been installed locally.
Each of these sets of parameters is described in its own section, below.
The following parameters can be used to control the ansible-playbook run itself, defining things like how Ansible should connect to the nodes involved in the playbook run, which nodes should be targeted, where the Jenkins distribution should be downloaded from, which packages must be installed during the deployment process, and where those packages should be obtained from:
- cloud: this parameter is used to indicate the target cloud for the deployment (either aws or osp); this controls both the role that is used to create new nodes (when a matching set of nodes does not exist in the target environment) and how the build-app-host-groups role retrieves the list of target nodes for the deployment; if unspecified this parameter defaults to the aws value specified in the default configuration file
- region: this parameter is used to indicate the region that should be searched for matching nodes (and, if no matching nodes are found, the region in which a set of nodes should be created for use as Jenkins nodes); if unspecified the default value of us-west-2 specified in the config.yml file is used
- zone: this parameter is used to indicate the availability zone that should be used when creating new nodes in an OpenStack environment; since this parameter is not needed for AWS deployments, there is no default value for this parameter (and any value provided during an AWS deployment will be silently ignored)
- tenant: this parameter is used to indicate the tenant name to use, either when creating new nodes (when a matching set of nodes does not exist in the target environment) or when searching for a matching set of nodes in the build-app-host-groups role; if unspecified this parameter defaults to the datanexus value specified in the default configuration file
- project: this parameter is used to indicate the project name to use, either when creating new nodes (when a matching set of nodes does not exist in the target environment) or when searching for a matching set of nodes in the build-app-host-groups role; if unspecified this parameter defaults to the demo value specified in the default configuration file
- dataflow: this parameter is used to indicate the dataflow name to use, either when creating new nodes (when a matching set of nodes does not exist in the target environment) or when searching for a matching set of nodes in the build-app-host-groups role; the dataflow tag is used to link together the clusters/ensembles (Cassandra, NGINX, Kafka, Solr, etc.) that are involved in a given dataflow; if this value is not specified, it defaults to a value of none during the playbook run
- domain: this parameter is used to indicate the domain name to use (e.g. test, production, preprod), either when creating new nodes (when a matching set of nodes does not exist in the target environment) or when searching for a matching set of nodes in the build-app-host-groups role; if unspecified this parameter defaults to the production value specified in the default configuration file
- cluster: this parameter is used to indicate the cluster name to use, either when creating new nodes (when a matching set of nodes does not exist in the target environment) or when searching for a matching set of nodes in the build-app-host-groups role; this value is used to differentiate clusters of the same type from each other when multiple clusters are deployed for a given application for the same tenant, project, dataflow, and domain; if this value is not specified it defaults to a value of a during the playbook run
- user: the username that should be used when connecting to the target nodes via SSH; the value for this parameter will likely change from one target environment to the next; if unspecified a value of centos will be used
- config_file: used to define the location of a configuration file (see the discussion of this topic, below); this file is a YAML file containing definitions for any of the configuration parameters that are described in this section and is more than likely a file that will be created to manage the process of creating a specific ensemble. Storing the settings for a given ensemble in such a file makes it easy to guarantee that all of the nodes in that ensemble are configured consistently. If a value is not specified for this parameter then the default configuration file (the config.yml file) will be used; to override this behavior (and not load a configuration file of any kind), one can simply set the value of this parameter to /dev/null and specify all of the other, non-default parameters that are needed as extra variables during the playbook run
- private_key_path: used to define the directory where the private keys are maintained when the inventory for the playbook run is being managed dynamically; in these cases, the scripts used to retrieve the dynamic inventory information will return the names of the keys that should be used to access each node, and the playbook will search the directory specified by this parameter to find the corresponding key files. If this value is not specified then the current working directory will be searched for those keys by default
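
As a rough sketch of how these parameters might be collected in one place, the fragment below shows a hypothetical configuration file. The file name my-jenkins-config.yml is an assumption (it is not a file in this repository), and the values chosen for domain, cluster, and private_key_path are purely illustrative; the parameter names and the remaining values are taken from the descriptions above.

```yaml
# my-jenkins-config.yml (hypothetical file name, for illustration only)
cloud: aws                        # or osp for an OpenStack deployment
region: us-west-2                 # same as the documented default
tenant: datanexus                 # same as the documented default
project: demo                     # same as the documented default
domain: test                      # illustrative; the documented default is production
cluster: b                        # illustrative; the documented default is a
user: centos                      # SSH user for the target nodes
private_key_path: /path/to/keys   # illustrative; defaults to the current working directory
```

A file like this would then be named by the config_file parameter, which would typically be supplied as an extra variable (for example, with the -e option of ansible-playbook).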
When the inventory for the playbook run is being controlled dynamically (i.e. when the deployment is targeting nodes in an AWS or OpenStack environment) and no matching nodes are found, the playbook will create a new set of nodes (using the tags that were passed into the playbook run) and configure those nodes as standalone Jenkins instances. In that case, there are a number of parameters that must be provided to control the process of node creation:
- type: the type of node that should be created; if this value is unspecified then a default value of t2.large (suitable for use in the default, AWS deployment) specified in the config.yml file is used
- image: the image (AMI ID in the case of an AWS deployment or image UUID in the case of an OpenStack deployment) that should be used when creating new nodes; if this parameter is unspecified in an AWS deployment, then the playbook will search for a suitable image to use for the deployment; this parameter must be specified for an OpenStack deployment (and its value must be the UUID of a pre-existing image that is suitable for use in the playbook run)
- cidr_block: the CIDR block of the VPC where the nodes should be created in an AWS deployment (or the equivalent in an OpenStack deployment); it is assumed that this VPC (or OpenStack equivalent) already exists; if it is not specified, then the default value of 10.10.0.0/16 from the config.yml file is used
- node_map: a list of dictionary entries where each entry specifies the number of nodes to create (the count) for that application (or for each role in a given application deployment if deployment of the cluster involves the deployment of nodes with different roles, like the seed and non-seed nodes in a Cassandra cluster); for the playbook in this repository the default value for this parameter (which appears in the vars/jenkins.yml file) will result in the creation of a single, standalone Jenkins instance if no matching nodes were found based on the tags that were passed into the playbook run
- root_volume: the size (in GB) of the root volume that should be created when building new nodes in an AWS or OpenStack environment; this parameter has a default value that depends on whether or not there is a corresponding definition for the data_volume parameter (see below):
  - if there is no defined value for the data_volume parameter, then a root volume that is 40GB in size will be created if this parameter is not defined
  - if there is a defined value for the data_volume parameter, then a root volume that is 11GB in size will be created if this parameter is not defined
- data_volume: the size (in GB) of the data volume that will be created when building new nodes in an AWS or OpenStack environment; if a value is defined for this parameter, a data volume with the corresponding size will be created for each of the instances that are created by the playbook run and those data volumes will then be mounted under the /data directory for each of those instances; if a value is not defined for this parameter then no corresponding data volume will be created (and the nodes that are created by the playbook run will only have a single, root volume)
- application_sg_rules: a list of rules used to configure the firewall associated with the internal and external subnets; for the playbook in this repository the default rules (which should not need to be changed) will result in a single port being open on the internal and external subnets to support client connections to the Jenkins instances that are being deployed
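
To make the interaction between root_volume and data_volume concrete, here is a small, hedged sketch of node-creation settings as they might appear in a configuration file. The data_volume size of 40 is an arbitrary illustrative value; the type and cidr_block values simply repeat the documented defaults.

```yaml
# Illustrative node-creation settings (values are examples, not recommendations)
type: t2.large             # documented default, suited to an AWS deployment
cidr_block: 10.10.0.0/16   # documented default; the VPC (or equivalent) must already exist
data_volume: 40            # illustrative size in GB; the volume is mounted under /data
# root_volume is intentionally left undefined here. Per the description above,
# it should default to 11 GB because data_volume is defined; had data_volume
# also been omitted, the root volume would default to 40 GB instead.
# For an OpenStack deployment the image parameter (an image UUID) must also be set.
```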
These parameters are used to control the deployment process itself, defining things like which packages to install.
- jenkins_package_list: the list of packages that should be installed on the Jenkins nodes; typically this parameter is left unchanged from the default (which installs the epel-release, java-1.8.0-openjdk, and java-1.8.0-openjdk-devel packages), but if it is modified from the default, these packages must still be included as part of the new package list or an error will result when attempting to start the jenkins service.
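
As a sketch of an extended package list, the fragment below keeps the three default packages and appends one more. It assumes that jenkins_package_list is a plain YAML list of package names; the extra git entry is a hypothetical addition chosen only for illustration.

```yaml
# Illustrative override of jenkins_package_list
jenkins_package_list:
  - epel-release                # default
  - java-1.8.0-openjdk          # default
  - java-1.8.0-openjdk-devel    # default
  - git                         # hypothetical addition, not part of the default list
```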
These parameters are used to configure the Jenkins nodes themselves during a playbook run, defining things like the interfaces that Jenkins should be listening on and the directory where Jenkins should store its data.
- internal_subnet: the CIDR block describing the subnet that any nodes being created by the playbook run should attach as a private network (eth0); this network is used for internode communications between the nodes of the clusters/ensembles that make up the dataflow being deployed; if it is not specified, then the default value of 10.10.1.0/24 from the config.yml file is used; if the deployment is an OpenStack deployment then a value for the associated internal_uuid parameter must also be provided, and that value must be the UUID for an existing internal network in the targeted OpenStack environment
- external_subnet: the CIDR block describing the subnet that any nodes being created by the playbook run should attach as a "public" network (eth1); this network is used to support client connections to the various services that make up the dataflow being deployed; if it is not specified, then the default value of 10.10.2.0/24 from the config.yml file is used; if the deployment is an OpenStack deployment then a value for the associated external_uuid parameter must also be provided, and that value must be the UUID for an existing external network in the targeted OpenStack environment
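
The subnet settings might look something like the following sketch. The two CIDR values repeat the documented defaults; the UUIDs are obvious placeholders rather than real network identifiers and would only be needed for an OpenStack deployment.

```yaml
# Illustrative subnet settings
internal_subnet: 10.10.1.0/24   # documented default; private network for internode traffic
external_subnet: 10.10.2.0/24   # documented default; "public" network for client connections
# OpenStack deployments only (placeholder values; replace with the UUIDs of existing networks)
internal_uuid: 00000000-0000-0000-0000-000000000000
external_uuid: 11111111-1111-1111-1111-111111111111
```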
The playbook in this repository will dynamically determine the names of the interfaces that correspond to the defined internal_subnet and external_subnet CIDR block values and configure the members of the ensemble being deployed to listen on those interfaces, either for communication between the nodes that make up the ensemble or for client requests. This is accomplished by dynamically constructing an iface_description_array parameter within the playbook, then using that parameter to determine the names of the corresponding interfaces and their IP addresses.
Put simply, the iface_description_array lets you specify a description for each of the networks that you are interested in, then retrieve the names of those networks on each machine in a variable that can be used elsewhere in the playbook. To accomplish this, the iface_description_array is defined as an array of hashes (one per interface), each of which includes the following fields:
- type: the type of description being provided; currently only the cidr type is supported
- val: a value describing the network in question; since only cidr descriptions are currently supported, a CIDR value that looks something like 192.168.34.0/24 should be used for this field
- as_var: the name of the variable that you would like the interface name returned as
With these values in hand, the playbook will search the available networks on each machine and return a list of the interface names for each network that was described in the iface_description_array as the value of the fact named in the as_var field for that network's entry. For example, given this description:

    iface_description_array: [
        { as_var: 'data_iface', type: 'cidr', val: '192.168.34.0/24' },
        { as_var: 'api_iface', type: 'cidr', val: '192.168.44.0/24' },
    ]
In this example, the playbook will determine the names of the interfaces that match the CIDR blocks 192.168.34.0/24 and 192.168.44.0/24, returning those interface names as the values of the data_iface and api_iface facts, respectively (e.g. eth0 and eth1). These two facts are then used later in the playbook to correctly configure the nodes to talk to each other (over the data_iface network) and listen on the proper interfaces for user requests (on the api_iface network).
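
As a hedged illustration of how facts like these could be consumed, the task below prints the IPv4 address bound to each resolved interface. It is not taken from the playbook in this repository. It assumes that fact gathering is enabled, that data_iface and api_iface each hold a single interface name (such as eth0), and that the standard per-interface Ansible facts (ansible_eth0 and so on) are available on the host.

```yaml
# Illustrative task only; not part of the playbook in this repository
- name: Show the addresses resolved for the data and API interfaces
  debug:
    msg: >-
      data address: {{ hostvars[inventory_hostname]['ansible_' + data_iface]['ipv4']['address'] }},
      api address: {{ hostvars[inventory_hostname]['ansible_' + api_iface]['ipv4']['address'] }}
```

Note that Ansible replaces characters such as hyphens in interface names with underscores when it builds these per-interface fact names, so a lookup like this may need adjusting for interfaces with unusual names.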