Cloudy Journey: Release: Enterprise Chef 11.2.1 [feedly]

----
Release: Enterprise Chef 11.2.1
// Chef Blog

Enterprise Chef 11.2.1 is a critical bug-fix release for customers who installed Enterprise Chef 11.2.0. It corrects a single defect experienced by customers who upgraded from earlier releases.

Bug Fixes:

Fixes an issue where private-chef was being changed to private_chef unexpectedly in upstart/runit configuration files

Notes:

If you upgraded from an earlier release of EC, your servers may now have two runit processes configured in upstart:

/etc/init/private-chef-runsvdir.conf
/etc/init/private_chef-runsvdir.conf

The second one is incorrect, introduced by the aforementioned issue in EC 11.2.0. In this condition, you will see two runsvdir processes running with many errors:

ps:

root       924     1  0 05:20 ?        00:00:00 runsvdir -P /opt/opscode/service log: /lock: temporary failure runsv ocid: fatal: unable to lock supervise/lock: temporary failure runsv couchdb: fatal: unable to lock supervise/lock: temporary failure runsv bookshelf: fatal: unable to lock supervise/lock: temporary failure runsv postgresql: fatal: unable to lock supervise/lock: temporary failure runsv opscode-certificate: fatal: unable to lock supervise/lock: temporary failure
root       926     1  0 05:20 ?        00:00:00 runsvdir -P /opt/opscode/service log: ry failure runsv opscode-expander: fatal: unable to lock supervise/lock: temporary failure runsv opscode-solr: fatal: unable to lock supervise/lock: temporary failure runsv rabbitmq: fatal: unable to lock supervise/lock: temporary failure runsv ocbifrost: fatal: unable to lock supervise/lock: temporary failure runsv opscode-chef-mover: fatal: unable to lock supervise/lock: temporary failure
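To spot this condition quickly, the supervisors can be counted rather than read off the listing. The following is only a minimal sketch and not part of the official procedure; the function name `count_runsvdir` is made up for illustration, and it reads `ps -ef` style text from stdin.

```shell
# Hypothetical helper: count runsvdir supervisors in `ps -ef` style input.
# The [r] bracket keeps a grep process from matching its own command line.
# `|| true` keeps the exit status at 0 when the count is 0.
count_runsvdir() {
    grep -c '[r]unsvdir -P /opt/opscode/service' || true
}

# Usage on a live server:
#   ps -ef | count_runsvdir
```

A result of 2 would correspond to the broken state shown above, and 1 to the expected state after the fix.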

pstree:

Correcting the error:

HA

On both the active/bootstrap and standby backends: remove the errant runsvdir config file

root@backend1# rm -f /etc/init/private_chef-runsvdir.conf
root@backend2# rm -f /etc/init/private_chef-runsvdir.conf
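If you prefer a guarded removal, the step can be wrapped in a small function. This is a sketch under assumptions: the function name is hypothetical, and the check that the correctly named file is still present is an extra precaution that the release notes do not ask for. The optional argument overrides the directory so the logic can be tried against a scratch directory first.

```shell
# Hypothetical helper: remove the errant upstart job file, but only when the
# correctly named job file is also present (an added safety assumption).
# $1 optionally overrides the directory; it defaults to /etc/init.
remove_errant_runsvdir_conf() {
    init_dir="${1:-/etc/init}"
    good="$init_dir/private-chef-runsvdir.conf"
    errant="$init_dir/private_chef-runsvdir.conf"

    if [ ! -e "$errant" ]; then
        echo "nothing to do: $errant not found"
        return 0
    fi
    if [ ! -e "$good" ]; then
        echo "refusing to remove $errant: $good is missing" >&2
        return 1
    fi
    rm -f "$errant" && echo "removed $errant"
}
```

Run as root with no argument, it performs the same `rm -f` as the command above.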

On the standby (non-bootstrap) backend: reboot your server to clear all remaining orphaned processes and to restart runsvdir to a working state
```
root@backend2# init 6
```

On the standby backend: Verify that there is only a single runsvdir process and it is error-free (all dots)

root@backend2# ps -ef |grep 'runsvdir -P /opt/opscode/service'  root       921     1  0 05:35 ?        00:00:00 runsvdir -P /opt/opscode/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................    root@backend2# private-chef-ctl ha-status  [OK] keepalived HA services enabled.  [OK] DRBD disk replication enabled.  [OK] DRBD partition /dev/opscode/drbd found.  [OK] DRBD device /dev/drbd0 found.  [OK] cluster status = backup  [OK] did not find VIP IP address and I am not master  [OK] found VRRP communications interface eth0  [OK] my DRBD status is Connected/Secondary/UpToDate and I am not master  [OK] my DRBD partition is not mounted and I am not master  [OK] DRBD primary IP address pings  [OK] DRBD secondary IP address pings  [OK] bookshelf is not running, and I am not master.  [OK] couchdb is not running, and I am not master.  [OK] keepalived is running.  [OK] nginx is not running, and I am not master.  [OK] oc_bifrost is not running, and I am not master.  [OK] oc_id is not running, and I am not master.  [OK] opscode-account is not running, and I am not master.  [OK] opscode-certificate is not running, and I am not master.  [OK] opscode-erchef is not running, and I am not master.  [OK] opscode-expander is not running, and I am not master.  [OK] opscode-expander-reindexer is not running, and I am not master.  [OK] opscode-org-creator is not running, and I am not master.  [OK] opscode-solr is not running, and I am not master.  [OK] opscode-webui is not running, and I am not master.  [OK] postgresql is not running, and I am not master.  [OK] rabbitmq is not running, and I am not master.  
[OK] redis_lb is not running, and I am not master.
    [OK] all checks passed.
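With this many status lines, it can help to filter for anything that is not marked [OK]. The sketch below is an illustration only: the function name is hypothetical, and it assumes that a failing check is reported with some other bracketed tag, which the output above does not show.

```shell
# Hypothetical helper: print every bracketed status line that is not [OK].
# Reads `private-chef-ctl ha-status` output from stdin. Prints nothing and
# returns 0 when every bracketed line is [OK]; otherwise returns 1.
ha_status_problems() {
    # Strip leading whitespace, keep bracketed lines, drop the [OK] ones.
    problems=$(sed 's/^[[:space:]]*//' | grep '^\[' | grep -v '^\[OK\]' || true)
    if [ -n "$problems" ]; then
        printf '%s\n' "$problems"
        return 1
    fi
    return 0
}

# Usage on a backend:
#   private-chef-ctl ha-status | ha_status_problems && echo "ha-status clean"
```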

On the active/bootstrap backend: trigger a failover and then reboot

root@backend1# private-chef-ctl stop keepalived
ok: down: keepalived: 1s, normally up
root@backend1# sleep 30
root@backend1# init 6

On the bootstrap (now standby) backend: Verify that there is only a single runsvdir process and it is error-free (all dots)

root@backend1# ps -ef |grep 'runsvdir -P /opt/opscode/service'  root       921     1  0 05:35 ?        00:00:00 runsvdir -P /opt/opscode/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................
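The "all dots" check can also be done mechanically instead of by eye. This is a rough sketch, not an official tool: the function name is hypothetical, and it treats any character other than a dot after `log: ` as a sign of errors.

```shell
# Hypothetical helper: succeed only if at least one runsvdir line is present
# in `ps -ef` style input and every such line has a log buffer made up
# entirely of dots. It does not check how many runsvdir processes exist.
runsvdir_logs_clean() {
    awk '
        /runsvdir -P \/opt\/opscode\/service/ {
            seen = 1
            i = index($0, "log: ")
            # Text after "log: "; a line without it counts as not clean.
            text = (i > 0) ? substr($0, i + 5) : $0
            if (text !~ /^[.]*[ \t]*$/) bad = 1
        }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Usage (pipe ps directly, without the extra grep):
#   ps -ef | runsvdir_logs_clean && echo "runsvdir log is clean"
```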

On the active (non-bootstrap) backend, trigger another failover back to the bootstrap backend
```
root@backend2# private-chef-ctl restart keepalived
```

Test your now-active bootstrap backend to ensure full functionality (note: you may need to point your api_fqdn address at localhost using the server's /etc/hosts file)

root@backend1# private-chef-ctl ha-status
[OK] keepalived HA services enabled.
[OK] DRBD disk replication enabled.
[OK] DRBD partition /dev/opscode/drbd found.
[OK] DRBD device /dev/drbd0 found.
[OK] cluster status = master
[OK] found VIP IP address and I am master
[OK] found VRRP communications interface eth0
[OK] my DRBD status is Connected/Primary/UpToDate and I am master
[OK] my DRBD partition is mounted and I am master
[OK] DRBD primary IP address pings
[OK] DRBD secondary IP address pings
[OK] bookshelf is running correctly, and I am master.
[OK] couchdb is running correctly, and I am master.
[OK] keepalived is running.
[OK] nginx is running correctly, and I am master.
[OK] oc_bifrost is running correctly, and I am master.
[OK] oc_id is running correctly, and I am master.
[OK] opscode-account is running correctly, and I am master.
[OK] opscode-certificate is running correctly, and I am master.
[OK] opscode-chef-mover is running.
[OK] opscode-erchef is running correctly, and I am master.
[OK] opscode-expander is running correctly, and I am master.
[OK] opscode-expander-reindexer is running correctly, and I am master.
[OK] opscode-org-creator is running correctly, and I am master.
[OK] opscode-solr is running correctly, and I am master.
[OK] opscode-webui is running correctly, and I am master.
[OK] postgresql is running correctly, and I am master.
[OK] rabbitmq is running correctly, and I am master.
[OK] redis_lb is running correctly, and I am master.

[OK] all checks passed.

root@backend1# private-chef-ctl test
...
Finished in 1 minute 23.67 seconds
116 examples, 0 failures, 3 pending

Note: pending examples are OK.
Note: This command may fail on the first attempt after a fail-over. Please contact support if it continues to fail.
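Since only the failure count matters here, the summary line can be reduced to that one number. The helper below is a sketch with a hypothetical name; it assumes the summary line has the "N examples, N failures" shape shown above.

```shell
# Hypothetical helper: print the failure count from the summary line of
# `private-chef-ctl test` output read on stdin, for example
# "116 examples, 0 failures, 3 pending". Prints nothing if no summary is found.
test_failure_count() {
    sed -n 's/.*[0-9][0-9]* examples\{0,1\}, \([0-9][0-9]*\) failures\{0,1\}.*/\1/p' | tail -n 1
}

# Usage:
#   private-chef-ctl test | test_failure_count
```

A printed value of 0 matches the passing run above, regardless of how many examples are pending.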

On your frontends, follow the Standalone procedure as detailed below.
Upgrade to Enterprise Chef 11.2.1 following the normal procedure.

Standalone

Stop the errant runsvdir process:

# initctl status private_chef-runsvdir
private_chef-runsvdir start/running, process 926

# initctl stop private_chef-runsvdir
private_chef-runsvdir stop/waiting
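The status check and the stop can be combined so that the stop is attempted only when the errant job is actually running. This is a minimal sketch; the function name is hypothetical and it simply looks for the `start/running` state in the status line.

```shell
# Hypothetical helper: succeed if an `initctl status` line read from stdin
# reports the job as running, for example
# "private_chef-runsvdir start/running, process 926".
upstart_job_running() {
    grep -q 'start/running'
}

# Usage sketch (run as root):
#   if initctl status private_chef-runsvdir 2>/dev/null | upstart_job_running; then
#       initctl stop private_chef-runsvdir
#   fi
```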

Remove the errant runsvdir config file:

# rm -f /etc/init/private_chef-runsvdir.conf

Stop all private-chef services:
```
# private-chef-ctl stop  
```
Reboot your server to clear all remaining orphaned processes and to restart runsvdir to a working state.

Verify that there is only a single runsvdir process and it is error-free (all dots)

# ps -ef |grep 'runsvdir -P /opt/opscode/service'  root       921     1  0 05:35 ?        00:00:00 runsvdir -P /opt/opscode/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................

Test your system to ensure full functionality (note: you may need to point your api_fqdn address at localhost using the server's /etc/hosts file)
```
# private-chef-ctl test
...
Finished in 1 minute 23.67 seconds
116 examples, 0 failures, 3 pending
```
Note: pending examples are OK.
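For the /etc/hosts adjustment mentioned with the test step, the entry is an ordinary hosts line mapping your api_fqdn to the loopback address. The helper below is just an illustration: its name is hypothetical and `chef.example.com` is a placeholder for your own api_fqdn.

```shell
# Hypothetical helper: print an /etc/hosts line that maps a name to localhost.
hosts_line_for() {
    printf '127.0.0.1 %s\n' "$1"
}

# Usage sketch (run as root; chef.example.com is a placeholder):
#   hosts_line_for chef.example.com >> /etc/hosts
```

Remember to remove the line again after testing if the name should normally resolve elsewhere.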

----

Friday, September 5, 2014
