Applicable for director version 266.2.0+
Linux stemcells v3586.5+ (agent 2.83.0)
Windows 2016 stemcell v1709.9+, Windows 2012R2 v1200.20+ (agent 2.110.0)
Rotating the Blobstore CA
As of director version 266.2.0, TLS is enabled for the default DAV blobstore.
Due to limitations in stemcell support, the agent does not contact the blobstore over TLS by default. Refer to additional opsfiles in bosh-deployment to enable full end-to-end TLS.
Please take note of any ignored instances with bosh instances --details. These instances will not be considered when redeploying or recreating instances. Ignored instances will only receive the new certificates when recreated.
Introducing a New Certificate Authority (CA)
This procedure works by deploying both the old and the new CA on all the VMs in a transitional fashion. The old CA is purged eventually. To achieve this, the procedure requires multiple deploys and recreates of all the deployments. Note that this method is analogous to the one used for NATS CA rotation, which could be performed at the same time as this one.
Preconditions:
- The Director is in a healthy state.
- All the VMs are in the running state in all deployments.
- These instructions must be adapted if used with bosh-lite ops files, as they overwrite the variables used in this procedure.
Step 1: Redeploy the director with the new blobstore CA.
bosh create-env ~/workspace/bosh-deployment/bosh.yml \
  --state ./state.json \
  -o ~/workspace/bosh-deployment/[IAAS]/cpi.yml \
  -o add-new-blobstore-ca.yml \
  -o ... additional opsfiles \
  --vars-store ./creds.yml \
  -v ... additional vars
- This adds new variables for the new CA/certificates/private_key.
- The director is given a modified CA with the original CA and the new CA concatenated as ((blobstore_server_tls_2.ca))((blobstore_server_tls.ca)).
- The blobstore continues to use the old certificates and private key.
- Each VM/agent continues to use the old certificates to communicate with the blobstore.
# add-new-blobstore-ca.yml
---
- type: replace
  path: /instance_groups/name=bosh/properties/agent/env/bosh/blobstores?/provider=dav/options/tls/cert/ca
  value: ((blobstore_server_tls_2.ca))((blobstore_server_tls.ca))
- type: replace
  path: /instance_groups/name=bosh/properties/blobstore/tls?/cert/ca
  value: ((blobstore_server_tls_2.ca))((blobstore_server_tls.ca))
- type: replace
  path: /variables/-
  value:
    name: blobstore_ca_2
    type: certificate
    options:
      is_ca: true
      common_name: default.blobstore-ca.bosh-internal
- type: replace
  path: /variables/-
  value:
    name: blobstore_server_tls_2
    type: certificate
    options:
      ca: blobstore_ca_2
      common_name: ((internal_ip))
      alternative_names: [((internal_ip))]
Step 2: Recreate all the VMs, for each deployment.
The VMs need to be recreated in order to receive the new certificates generated from the new Blobstore CA being rotated in. If the VMs are not recreated, the agents they contain will not be able to communicate with the blobstore since they will not trust the new CA used to sign the blobstore's certificate.
bosh -d deployment-name recreate
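When there are many deployments, the recreate can be scripted. The following is a minimal sketch, not part of the official procedure: the function name recreate_deployments is hypothetical, the deployment names are passed in by the operator (for example, as listed by bosh deployments), and the loop stops at the first failure so it can be investigated before continuing.

```shell
# Hypothetical helper: recreate every deployment named on the command line,
# one at a time, stopping at the first failure.
recreate_deployments() {
  for deployment in "$@"; do
    echo "Recreating VMs in deployment: ${deployment}"
    # Same command as shown above, with the deployment name substituted.
    if ! bosh -d "${deployment}" recreate; then
      echo "Recreate failed for deployment: ${deployment}" >&2
      return 1
    fi
  done
}
```

For example, recreate_deployments zookeeper another-deployment would recreate those two deployments in order; both names are placeholders. Ignored instances are still skipped, as noted at the top of this page.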
Step 3: Redeploy the director to remove the old Blobstore CA.
bosh create-env ~/workspace/bosh-deployment/bosh.yml \
  --state ./state.json \
  -o ~/workspace/bosh-deployment/[IAAS]/cpi.yml \
  -o remove-old-blobstore-ca.yml \
  -o ... additional opsfiles \
  --vars-store ./creds.yml \
  -v ... additional vars
remove-old-blobstore-ca.yml below is used to remove the old CA from the concatenated CAs.
- The blobstore server is updated to use a new certificate and private key, generated by the new CA.
- All the components can now communicate using the new CA.
# remove-old-blobstore-ca.yml
---
- type: replace
  path: /instance_groups/name=bosh/properties/agent/env/bosh/blobstores?/provider=dav/options/tls/cert/ca
  value: ((blobstore_server_tls_2.ca))
- type: replace
  path: /instance_groups/name=bosh/properties/blobstore/tls?/cert
  value:
    ca: ((blobstore_server_tls_2.ca))
    certificate: ((blobstore_server_tls_2.certificate))
    private_key: ((blobstore_server_tls_2.private_key))
- type: replace
  path: /variables/-
  value:
    name: blobstore_ca_2
    type: certificate
    options:
      is_ca: true
      common_name: default.blobstore-ca.bosh-internal
- type: replace
  path: /variables/-
  value:
    name: blobstore_server_tls_2
    type: certificate
    options:
      ca: blobstore_ca_2
      common_name: ((internal_ip))
      alternative_names: [((internal_ip))]
Step 4: Recreate all VMs, for each deployment.
Recreating all the VMs will remove the old CA from them. The usual way to do this is:
bosh -d deployment-name recreate
Some other BOSH commands also recreate the VMs, while others only restart them without recreating them. Please take note of the remarks below.
Commands that will reset the blobstore configuration on the deployed VMs
- stop hard and start VMs
bosh -d deployment-name stop --hard
bosh -d deployment-name start
- recreate VMs
bosh -d deployment-name recreate
Commands that will NOT reset the blobstore configuration on the deployed VMs
- restart VMs
bosh -d deployment-name restart
- just stop and start VMs
bosh -d deployment-name stop
bosh -d deployment-name start
Clean-up
In order to continue using the new CA and certificates, operators would be required to include the above opsfiles for subsequent bosh create-env commands. This is not ideal and can lead to disruptions and downtime if the old CA is used again by mistake.
The following opsfile, in conjunction with the bosh interpolate command, will replace the old certificate values with the new ones and remove the second set of variables created for the above process.
# update_blobstore_var_values.yml
---
- type: replace
  path: /blobstore_server_tls
  value: ((blobstore_server_tls_2))
- type: replace
  path: /blobstore_ca
  value: ((blobstore_ca_2))
- type: remove
  path: /blobstore_ca_2
- type: remove
  path: /blobstore_server_tls_2
Create a backup of the credentials and apply the opsfile:
cp creds.yml creds.yml.backup
bosh interpolate creds.yml \
  -o update_blobstore_var_values.yml \
  --vars-file creds.yml > updated_creds.yml
mv updated_creds.yml creds.yml
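After applying the opsfile, it may be worth confirming that the vars store ended up in the expected shape. The sketch below is one possible check, not part of the documented procedure: the function name check_blobstore_cleanup is hypothetical, and it assumes that variables appear as top-level YAML keys starting in the first column, as they do in a creds.yml written through --vars-store.

```shell
# Hypothetical check: confirm that the transitional "_2" variables are gone
# and that the original variable names are still present.
# Assumption: variables are top-level keys at the start of a line.
check_blobstore_cleanup() {
  creds_file="$1"

  # The transitional variables should have been removed by the opsfile.
  for stale in blobstore_ca_2 blobstore_server_tls_2; do
    if grep -q "^${stale}:" "${creds_file}"; then
      echo "stale variable still present: ${stale}" >&2
      return 1
    fi
  done

  # The original names should remain and now hold the new values.
  for required in blobstore_ca blobstore_server_tls; do
    if ! grep -q "^${required}:" "${creds_file}"; then
      echo "expected variable missing: ${required}" >&2
      return 1
    fi
  done

  echo "clean-up looks complete"
}
```

Running check_blobstore_cleanup creds.yml only inspects key names. It does not prove that the values were swapped, so keeping creds.yml.backup until the next successful bosh create-env remains a sensible precaution.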
Do not include remove-old-blobstore-ca.yml in subsequent bosh create-env commands.
Warning: If you do not perform the clean-up procedure, you must ensure that the ops file (remove-old-blobstore-ca.yml) is used every time a create-env is executed going forward, which can be unsustainable. Removing the ops file would revert to the old CA, which will prevent blobstore fetching during deployments or other operations.
If your blobstore CA has already expired, then any actions that fetch from the blobstore will fail. It is also highly likely that the NATS CA has expired as the duration of both is usually the same. The procedure above ensures connectivity between the director and VMs while rotating, which requires a non-expired CA to perform.
1. Open the file used for --vars-store (creds.yml) and remove all blobstore-related variable keys and values: blobstore_ca and blobstore_server_tls. It is not necessary to remove the password values.
2. Update the director with new certs with bosh create-env. The CLI generates new values for the credentials removed in step 1.
3. Recreate all your deployments so they receive the new certificates with bosh recreate -d ... --fix. --fix is required to ignore the unresponsive state of the VM.
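To decide whether the recovery steps above are needed instead of the regular rotation, the CA's validity can be checked with openssl. This is a minimal sketch under stated assumptions: the function name ca_expiry_status and the file name blobstore_ca.pem are hypothetical, and the CA is assumed to have been extracted from the vars store first, for example with bosh interpolate creds.yml --path /blobstore_ca/ca > blobstore_ca.pem.

```shell
# Hypothetical helper: classify a PEM certificate file as "valid",
# "expired-or-expiring", or "unreadable".
#   $1: path to the PEM file
#   $2: look-ahead window in seconds (default 0, meaning "expired right now")
ca_expiry_status() {
  pem_file="$1"
  within_seconds="${2:-0}"

  # Fail early if openssl cannot parse the file as a certificate.
  if ! openssl x509 -noout -in "${pem_file}" >/dev/null 2>&1; then
    echo "unreadable"
    return 2
  fi

  # -checkend exits 0 only if the certificate is still valid after the window.
  if openssl x509 -checkend "${within_seconds}" -noout -in "${pem_file}" >/dev/null 2>&1; then
    echo "valid"
  else
    echo "expired-or-expiring"
    return 1
  fi
}
```

For example, ca_expiry_status blobstore_ca.pem reports whether the CA has already expired, while ca_expiry_status blobstore_ca.pem 2592000 reports whether it expires within the next 30 days, which leaves time to run the regular rotation.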
Any instances that have not been recreated with bosh recreate or through a redeploy causing a recreate will fail with errors like the one below. Perform a bosh recreate --fix on any instances impacting a redeploy.
Task 135 | 14:43:43 | Updating instance zookeeper: zookeeper/c7f03a6d-fcde-4d85-874f-8cb1503082f6 (0) (canary) (00:00:01)
                    L Error: Action Failed get_task: Task 968fad85-0f1a-494b-6040-5cc949555d17 result: Preparing apply spec: Preparing package openjdk-8: Fetching package blob: Getting blob from inner blobstore: Getting blob from inner blobstore: Shelling out to bosh-blobstore-dav cli: Running command: 'bosh-blobstore-dav -c /var/vcap/bosh/etc/blobstore-dav.json get d1bccd47-95ad-4516-49bc-0cf42a2782c3 /var/vcap/data/tmp/bosh-blobstore-externalBlobstore-Get731225442', stdout: 'Error running app - Getting dav blob d1bccd47-95ad-4516-49bc-0cf42a2782c3: Get https://10.0.1.6:25250/d1/d1bccd47-95ad-4516-49bc-0cf42a2782c3: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "default.blobstore-ca.bosh-internal")', stderr: '': exit status 1
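When going through the output of several failed tasks, a small filter can help tell this particular failure apart from unrelated ones. The sketch below is illustrative only: the function name has_blobstore_ca_error is hypothetical, and it simply looks for the x509 message on a line that also mentions the blobstore, as in the error above.

```shell
# Hypothetical helper: read task output on stdin and succeed (exit 0) when a
# line mentions the blobstore together with the x509 unknown-authority error.
has_blobstore_ca_error() {
  grep 'blobstore' | grep -q 'x509: certificate signed by unknown authority'
}
```

For example, bosh task 135 | has_blobstore_ca_error && echo "recreate with --fix" would print the hint for the task shown above. A match only indicates that the instance does not trust the blobstore's certificate; the operator still decides which instances to recreate.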