Upgrading a TripleO deployment from Newton to Queens is done by first executing a minor update in both undercloud and overcloud, to ensure that the system is using the latest Newton release. After that, the undercloud is upgraded to the target version Queens. This will then be used to upgrade the overcloud.

Note

Before upgrading the undercloud to Queens, make sure you have created a valid backup of the current undercloud and overcloud. The complete backup procedure can be found on: undercloud backup

Note

Fast Forward Upgrade testing cannot cover all possible deployment configurations. Before performing the Fast Forward Upgrade of the undercloud in production, test it in a matching staging environment, and create a backup of the undercloud in the production environment. Please refer to undercloud backup for proper documentation on undercloud backups.

The undercloud FFU upgrade consists of 3 consecutive undercloud upgrades to Ocata, Pike and Queens.

## Install tripleo-repos
TRIPLEO_REPOS_RPM=$(curl -L --silent https://trunk.rdoproject.org/centos7/current/ | grep python2-tripleo-repos | awk -F "href" {'print$2'} | awk -F '"' {'print $2'}) sudo yum localinstall -y https://trunk.rdoproject.org/centos7/current/${TRIPLEO_REPOS_RPM}

## Deploy repos via tripleo-repos
sudo tripleo-repos -b current ocata ceph

## Pre-upgrade stop services and update specific packages
sudo systemctl stop openstack-* neutron-* httpd
sudo yum update -y instack-undercloud openstack-puppet-modules openstack-tripleo-common python-tripleoclient


## Deploy repos via tripleo-repos
sudo tripleo-repos -b current pike ceph

## Update tripleoclient and install ceph-ansible
sudo yum -y install ceph-ansible
sudo yum -y update python-tripleoclient


## Deploy repos via tripleo-repos
sudo tripleo-repos -b current queens ceph

## Update tripleoclient
sudo yum -y update python-tripleoclient


## Maintaining the system while the undercloud is on Queens and overcloud on Newton¶

After upgrading undercloud to Queens, the system is expected to be stable and allow normal management operations of the overcloud nodes that are still on Newton. However, to ensure that compatibility, several steps need to be performed.

1. You need to use the newer introspection images, because of incompatible changes in the newer versions of ironic client.

mkdir /home/stack/images
cd /home/stack/images
wget https://images.rdoproject.org/queens/delorean/current-tripleo/ironic-python-agent.tar
tar -xvf ironic-python-agent.tar

source /home/stack/stackrc
openstack overcloud image upload --image-path /home/stack/images/ \
--update-existing

2. Remember to keep the old Newton templates. When the undercloud is upgraded, the new Queens templates are installed. The Newton templates can be used to perform any needed configuration or management of the overcloud nodes. Be sure that you have copied your old templates. Or if you didn’t have a local copy, clone the Newton templates under a new directory:

git clone -b  stable/newton \
https://git.openstack.org/openstack/tripleo-heat-templates tripleo-heat-templates-newton

3. Use a new plan-environment.yaml file. As undercloud CLI calls have been upgraded, they will request that file. It needs to be on /home/stack/tripleo-heat-templates-newton, and have the following content:

version: 1.0

name: overcloud
description: >
Default Deployment plan
template: overcloud.yaml
environments:
- path: overcloud-resource-registry-puppet.yaml


Create a new docker-ha.yaml env file, based on the puppet-pacemaker one:

cp /home/stack/tripleo-heat-templates-newton/environments/puppet-pacemaker.yaml \
/home/stack/tripleo-heat-templates-newton/environments/docker-ha.yaml


Create an empty docker.yaml env file, replacing the one that is currently on newton:

: > /home/stack/tripleo-heat-templates-newton/environments/docker.yaml


After all these steps have been performed, the Queens undercloud can be used successfully to provide and manage a Newton overcloud.

## Upgrading the overcloud from Newton to Queens¶

Note

Generic Fast Forward Upgrade testing in the overcloud cannot cover all possible deployment configurations. Before performing Fast Foward Upgrade testing in the overcloud, test it in a matching staging environment, and create a backup of the production environment (your controller nodes and your workloads).

The Queens upgrade workflow essentially consists of the following steps:

1. Prepare your environment - get container images, backup. Generate any environment files you need for the upgrade such as the references to the latest container images or commands used to switch repos.
2. openstack overcloud ffwd-upgrade prepare $OPTS. Run a heat stack update to generate the upgrade playbooks. 3. openstack overcloud ffwd-upgrade run. Run the ffwd upgrade tasks on all nodes. 4. openstack overcloud upgrade run$OPTS. Run the upgrade on specific nodes or groups of nodes. Repeat until all nodes are successfully upgraded.
5. openstack overcloud ceph-upgrade run $OPTS. (optional) Not necessary unless a TripleO managed Ceph cluster is deployed in the overcloud; this step performs the upgrade of the Ceph cluster. 6. openstack overcloud ffwd-upgrade converge$OPTS. Finally run a heat stack update, unsetting any upgrade specific variables and leaving the heat stack in a healthy state for future updates.

### Prepare your environment - Get container images¶

When moving from Newton to Queens, the setup will be changing from baremetal to containers. So as a part of the upgrade the container images for the target release should be downloaded to the Undercloud. Please see the openstack overcloud container image prepare Containers based Overcloud Deployment for more information.

The output of this step will be a Heat environment file that contains references to the latest container images. You will need to pass this file into the upgrade prepare command using the -e flag to include the generated file.

You may want to populate a local docker registry in your undercloud, to make the deployment faster and more reliable. In that case you need to use the 8787 port, and the ip needs to be the local_ip parameter from the undercloud.conf file.

openstack overcloud container image prepare \
--namespace=192.0.2.1:8787/tripleoqueens --tag current-tripleo \
--output-env-file /home/stack/container-default-parameters.yaml \
--output-images-file overcloud_containers.yaml <OPTIONS> \
--push-destination 192.0.2.1:8787


In place of the <OPTIONS> token should go all parameters that you used with previous openstack overcloud deploy command.

openstack overcloud container image upload \
--config-file /home/stack/overcloud_containers.yaml \
-e /home/stack/container-default-parameters.yaml


### Prepare your environment - New templates¶

You will also need to create an environment file to override the FastForwardCustomRepoScriptContent and FastForwardRepoType tripleo-heat-templates parameters, that can be used to switch the yum repos in use by the nodes during the upgrade. This will likely be the same commands that were used to switch repositories on the undercloud.

cat <<EOF > init-repo.yaml
parameter_defaults:
FastForwardRepoType: custom-script
FastForwardCustomRepoScriptContent: |
set -e
case $1 in ocata) <code to install ocata repo here> ;; pike) <code to install pike repo here> ;; queens) <code to install queens repo here> ;; *) echo "unknown release$1" >&2
exit 1
esac
yum clean all
EOF


The resulting init-repo.yaml will then be passed into the upgrade prepare using the -e option.

You will also need to create a cli_opts_params.yaml file, that will contain the number of nodes for each role, and the flavor to be used. See that sample:

cat <<EOF > cli_opts_params.yaml
parameter_defaults:
ControllerCount: 3
ComputeCount: 1
CephStorageCount: 1
NtpServer: clock.redhat.com
EOF


Before running Fast Forward Upgrade, it is important that you ensure that the custom templates that you are using in your deploy (Newton version), are adapted to the syntax needed for the new stable release (Queens version). Please check the annex in this document, and the changelogs of all the different versions to get a detailed list of the templates that need to be changed.

Note

Before running the overcloud upgrade prepare ensure you have a valid backup of the current state, including the undercloud since there will be a Heat stack update performed here. The complete backup procedure can be found on: undercloud backup

Note

After running the ffwd-upgrade prepare and until successful completion of the ffwd-upgrade converge operation, stack updates to the deployment Heat stack are expected to fail. That is, operations such as scaling to add a new node or to apply any new TripleO configuration via Heat stack update must not be performed on a Heat stack that has been prepared for upgrade with the ‘prepare’ command. Only consider doing so after running the converge step. See the queens-upgrade-dev-docs for more.

Run overcloud ffwd-upgrade prepare. This command expects the full set of environment files that were passed into the deploy command, as well as the roles_data.yaml file used to deploy the overcloud you are about to upgrade. The environment file should point to the file that was output by the image prepare command you ran to get the latest container image references.

Note

It is especially important to remember that you must include all environment files that were used to deploy the overcloud that you are about to upgrade.

openstack overcloud ffwd-upgrade prepare --templates \
-e /home/stack/containers-default-parameters.yaml \
<OPTIONS> \
-e init-repo.yaml
-e cli_opts_params.yaml
-r /path/to/roles_data.yaml


In place of the <OPTIONS> token should go all parameters that you used with previous openstack overcloud deploy command.

This will begin an update on the overcloud Heat stack but without applying any of the TripleO configuration, as explained above. Once this ffwd-upgrade prepare operation has successfully completed the heat stack will be in the UPDATE_COMPLETE state. At that point you can use config download to download and inspect the configuration ansible playbooks that will be used to deliver the upgrade in the next step:

openstack overcloud config download --config-dir SOMEDIR


This will execute the ffwd-upgrade initial steps in all nodes.

openstack overcloud ffwd-upgrade run --yes


After this step, the upgrade commands can be executed in all nodes.

This will run the ansible playbooks to deliver the upgrade configuration. By default, 3 playbooks are executed: the upgrade_steps_playbook, then the deploy_steps_playbook and finally the post_upgrade_steps_playbook. These playbooks are invoked on those overcloud nodes specified by the –roles or –nodes parameters, which are mutually exclusive. You are expected to use –roles for controlplane nodes, since these need to be upgraded in the same step. For non controlplane nodes, such as Compute or Storage, you can use –nodes to specify a single node or list of nodes to upgrade. The controller nodes need to be the first upgraded, following by the compute and storage ones.

openstack overcloud upgrade run --roles Controller


Note

Optionally you can specify –playbook to manually step through the upgrade playbooks: You need to run all three in this order and as specified below (no path) for a full upgrade to Queens.

openstack overcloud upgrade run --roles Controller --playbook upgrade_steps_playbook.yaml
openstack overcloud upgrade run --roles Controller --playbook deploy_steps_playbook.yaml


After all three playbooks have been executed without error on all nodes of the controller role the controlplane will have been fully upgraded to Queens. At a minimum an operator should check the health of the pacemaker cluster

[root@overcloud-controller-0 ~]# pcs status | grep -C 10 -i "error\|fail\|unmanaged"


The operator may also want to confirm that openstack and related service containers are all in a good state and using the image references passed during upgrade prepare with the –container-registry-file parameter.

[root@overcloud-controller-0 ~]# docker ps -a


Warning

When the upgrade has been applied on the Controllers, but not on the other nodes, it is important to don’t execute any operation on the overcloud. The nova, neutron.. commands will be up at this point but users are not advised to use them, until all the steps of Fast Forward Upgrade have been completed, or it may drive unexpected results.

For non controlplane nodes, such as Compute or ObjectStorage, you can use –nodes overcloud-compute-0 to upgrade particular nodes, or even “compute0,compute1,compute3” for multiple nodes. Note these are again upgraded in parallel. Also note that you can still use the –roles parameter with non controlplane roles if that is preferred.

openstack overcloud upgrade run --nodes overcloud-compute-0


Use of –nodes allows the operator to upgrade some subset, perhaps just one, compute or other non controlplane node and verify that the upgrade is successful. One may even migrate workloads onto the newly upgraded node and confirm there are no problems, before deciding to proceed with upgrading the remaining nodes that are still on Newton.

Again you can optionally step through the upgrade playbooks if you prefer. Be sure to run upgrade_steps_playbook.yaml then deploy_steps_playbook.yaml and finally post_upgrade_steps_playbook.yaml in that order.

openstack overcloud upgrade run --nodes overcloud-compute-1 \
# etc for the other 2 as above example for controller


For re-run, you can specify –skip-tags validation to skip those step 0 ansible tasks that check if services are running, in case you can’t or don’t want to start them all.

openstack overcloud upgrade run --roles Controller --skip-tags validation


This step is only necessary if Ceph was deployed in the Overcloud. It triggers an upgrade of the Ceph cluster which will be performed without taking down the cluster.

Note

It is especially important to remember that you must include all environment files that were used to deploy the overcloud that you are about to upgrade.

openstack overcloud ceph-upgrade run --templates \
--container-registry-file /home/stack/containers-default-parameters.yaml \
<OPTIONS> -r /path/to/roles_data.yaml


In place of the <OPTIONS> token should go all parameters that you used with previous openstack overcloud deploy command.

At the end of the process, Ceph will be upgraded from Jewel to Luminous so there will be new containers for the ceph-mgr service running on the controlplane node.

Finally, run the converge heat stack update. This will re-apply all Queens configuration across all nodes and unset all variables that were used during the upgrade. Until you have successfully completed this step, heat stack updates against the overcloud stack are expected to fail. You can read more about why this is the case in the queens-upgrade-dev-docs.

Note

It is especially important to remember that you must include all environment files that were used to deploy the overcloud that you are about to upgrade converge, including the list of Queens container image references and the roles_data.yaml roles and services definition. You should omit any repo switch commands and ensure that none of the environment files you are about to use is specifying a value for UpgradeInitCommand.

Note

The Queens container image references that were passed into the openstack overcloud ffwd-upgrade prepare with the –container-registry-file parameter must be included as an environment file, with the -e option to the openstack overcloud ffwd-upgrade run command, together with all other environment files for your deployment.

openstack overcloud ffwd-upgrade converge --templates
-e /home/stack/containers-default-parameters.yaml \
-e cli_opts_params.yaml \
<OPTIONS> -r /path/to/roles_data.yaml


In place of the <OPTIONS> token should go all parameters that you used with previous openstack overcloud deploy command.

The Heat stack will be in the UPDATE_IN_PROGRESS state for the duration of the openstack overcloud upgrade converge. Once converge has completed successfully the Heat stack should also be in the UPDATE_COMPLETE state.

## Annex: Template changes needed from Newton to Queens¶

In order to reuse the Newton templates when the cloud has been upgraded to Queens, several changes are needed. Those changes need to be done before starting Fast Forward Upgrade on the overcloud.

Following there is a list of all the changes needed:

1. Remove those deprecated services from your custom roles_data.yaml file:
• OS::TripleO::Services::Core
• OS::TripleO::Services::GlanceRegistry
• OS::TripleO::Services::VipHosts
• OS::TripleO::Services::MySQLClient
• OS::TripleO::Services::NovaPlacement
• OS::TripleO::Services::PankoApi
• OS::TripleO::Services::Sshd
• OS::TripleO::Services::CertmongerUser
• OS::TripleO::Services::Docker
• OS::TripleO::Services::MySQLClient
• OS::TripleO::Services::ContainersLogrotateCrond
• OS::TripleO::Services::Securetty
• OS::TripleO::Services::Tuned
• OS::TripleO::Services::Clustercheck (just required on roles that also uses OS::TripleO::Services::MySQL)
• OS::TripleO::Services::Iscsid (to configure iscsid on Controller, Compute and BlockStorage roles)
• OS::TripleO::Services::NovaMigrationTarget (to configure migration on Compute roles)
1. Update any additional parts of the overcloud that might require these new services such as:
• Custom ServiceNetMap parameter - ensure to include the latest ServiceNetMap for the new services. You can locate in network/service_net_map.j2.yaml file
• External Load Balancer - if using an external load balancer, include these new services as a part of the external load balancer configuration
1. A new feature for composable networks was introduced on Pike. If using a custom roles_data file, edit the file to add the composable networks to each role. For example, for Controller nodes:

- name: Controller
networks:
- External
- InternalApi
- Storage
- StorageMgmt
- Tenant


Check the default networks on roles_data.yaml for further examples of syntax.

2. The following parameters are deprecated and have been replaced with role-specific parameters:

• from controllerExtraConfig to ControllerExtraConfig
• from OvercloudControlFlavor to OvercloudControllerFlavor
• from controllerImage to ControllerImage
• from NovaImage to ComputeImage
• from NovaComputeExtraConfig to ComputeExtraConfig
• from NovaComputeSchedulerHints to ComputeSchedulerHints
• from NovaComputeIPs to ComputeIPs
• from SwiftStorageIPs to ObjectStorageIPs
• from SwiftStorageImage to ObjectStorageImage
• from OvercloudSwiftStorageFlavor to OvercloudObjectStorageFlavor
1. Some composable services include new parameters that configure Puppet hieradata. If you used hieradata to configure these parameters in the past, the overcloud update might report a Duplicate declaration error. If this situation, use the composable service parameter. For example, instead of the following:

parameter_defaults:
controllerExtraConfig:
heat::config::heat_config:
DEFAULT/num_engine_workers:
value: 1


Use the following:

parameter_defaults:
HeatWorkers: 1

2. In your resource_registry, check that you are using the containerized services from the docker/services subdirectory of your core Heat template collection. For example:

resource_registry:
OS::TripleO::Services::CephMon: ../docker/services/ceph-ansible/ceph-mon.yaml
OS::TripleO::Services::CephOSD: ../docker/services/ceph-ansible/ceph-osd.yaml
OS::TripleO::Services::CephClient: ../docker/services/ceph-ansible/ceph-client.yaml

3. When upgrading to Queens, if Ceph has been deployed in the Overcloud, then use the ceph-ansible.yaml environment file instead of storage-environment.yaml. Make sure to move any customization into ceph-ansible.yaml (or a copy of ceph-ansible.yaml)

openstack overcloud deploy --templates \
-e <full environment> \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e overcloud-repos.yaml


Customizations for the Ceph deployment previously passed as hieradata via *ExtraConfig should be removed as they are ignored, specifically the deployment will stop if ceph::profile::params::osds is found to ensure the devices list has been migrated to the format expected by ceph-ansible. It is possible to use the CephAnsibleExtraConfig and CephAnsibleDisksConfig parameters to pass arbitrary variables to ceph-ansible, like devices and dedicated_devices. See the TripleO Ceph config guide

The other parameters (for example CinderRbdPoolName, CephClientUserName, …) will behave as they used to with puppet-ceph with the only exception of CephPools. This can be used to create additional pools in the Ceph cluster but the two tools expect the list to be in a different format. Specifically while puppet-ceph expected it in this format:

{
"mypool": {
"size": 1,
"pg_num": 32,
"pgp_num": 32
}
}


with ceph-ansible that would become:

[{"name": "mypool", "pg_num": 32, "rule_name": ""}]

4. If using custom nic-configs, the format has changed, and it is using an script to generate the entries now. So you will need to convert your old syntax from:

resources:
OsNetConfigImpl:
properties:
config:
os_net_config:
network_config:
- type: interface
name: nic1
mtu: 1350
use_dhcp: false
list_join:
- /
- - {get_param: ControlPlaneIp}
- {get_param: ControlPlaneSubnetCidr}
routes:
- type: ovs_bridge
name: br-ex
dns_servers: {get_param: DnsServers}
use_dhcp: false
routes:
next_hop: {get_param: ExternalInterfaceDefaultRoute}
members:
- type: interface
name: nic2
mtu: 1350
primary: true
- type: interface
name: nic3
mtu: 1350
use_dhcp: false
- type: interface
name: nic4
mtu: 1350
use_dhcp: false
- type: interface
name: nic5
mtu: 1350
use_dhcp: false
- type: ovs_bridge
name: br-tenant
dns_servers: {get_param: DnsServers}
use_dhcp: false
members:
- type: interface
name: nic6
mtu: 1350
primary: true
group: os-apply-config
type: OS::Heat::StructuredConfig


To

resources:
OsNetConfigImpl:
type: OS::Heat::SoftwareConfig
properties:
group: script
config:
str_replace:
template:
get_file: ../../../../../network/scripts/run-os-net-config.sh
params:
\$network_config:
network_config:
- type: interface
name: nic1
mtu: 1350
use_dhcp: false
list_join:
- /
- - {get_param: ControlPlaneIp}
- {get_param: ControlPlaneSubnetCidr}
routes:
- type: ovs_bridge
name: br-ex
dns_servers: {get_param: DnsServers}
use_dhcp: false
routes:
next_hop: {get_param: ExternalInterfaceDefaultRoute}
members:
- type: interface
name: nic2
mtu: 1350
primary: true
- type: interface
name: nic3
mtu: 1350
use_dhcp: false
- type: interface
name: nic4
mtu: 1350
use_dhcp: false
- type: interface
name: nic5
mtu: 1350
use_dhcp: false
- type: ovs_bridge
name: br-tenant
dns_servers: {get_param: DnsServers}
use_dhcp: false
members:
- type: interface
name: nic6
mtu: 1350
primary: true

5. If using a modified version of the core Heat template collection from Newton, you need to re-apply your customizations to a copy of the Queens version. To do this, use a git version control system or similar toolings to compare.

## Annex: NFV template changes needed from Newton to Queens¶

Following there is a list of general changes needed into NFV context:

1. Fixed VIP addresses for overcloud networks use new parameters as syntax:

parameter_defaults:
...
# Predictable VIPs


For DPDK environments:

1. Modify HostCpuList and NeutronDpdkCoreList to match your configuration. Ensure that you use only double quotation marks in the yaml file for these parameters:

HostCpusList: "0,16,8,24"
NeutronDpdkCoreList: "1,17,9,25"

2. Modify NeutronDpdkSocketMemory to match your configuration. Ensure that you use only double quotation marks in the yaml file for this parameter:

NeutronDpdkSocketMemory: "2048,2048"

3. Modify NeutronVhostuserSocketDir as follows:

NeutronVhostuserSocketDir: "/var/lib/vhost_sockets"

4. Modify VhostuserSocketGroup as follows, mapping to the right compute role:

parameter_defaults:
<name_of_your_compute_role>Parameters:
VhostuserSocketGroup: "hugetlbfs"

5. In the parameter_defaults section, add a network deployment parameter to run os-net-config during the upgrade process to associate OVS PCI address with DPDK ports:

parameter_defaults:
ComputeNetworkDeploymentActions: ['CREATE', 'UPDATE']


The parameter name must match the name of the role you use to deploy DPDK. In this example, the role name is Compute so the parameter name is ComputeNetworkDeploymentActions.

6. In the resource_registry section, override the ComputeNeutronOvsDpdk service to the neutron-ovs-dpdk-agent docker service:

resource_registry:
OS::TripleO::Services::ComputeNeutronOvsDpdk: ../docker/services/neutron-ovs-dpdk-agent.yaml


And remove the previous entry:

resource_registry:
OS::TripleO::Services::ComputeNeutronOvsAgent: ../puppet/services/neutron-ovs-dpdk-agent.yaml


Please notice the naming change between ComputeNeutronOvsAgent and ComputeNeutronOvsDpdk.

For SR-IOV environments:

1. In the resource registry section, override the NeutronSriovAgent service to the neutron-sriov-agent docker service:

resource_registry:
OS::TripleO::Services::NeutronSriovAgent: ../docker/services/neutron-sriov-agent.yaml


And remove the previous entry:

resource_registry:
OS::TripleO::Services::NeutronSriovAgent: ../puppet/services/neutron-sriov-agent.yaml
`