Travails of upgrading my server from Ubuntu 16.04 LTS to 18.04 LTS

I just finished upgrading my server from Ubuntu 16.04 LTS to 18.04 LTS and wanted to record the fun I had doing it. My server is hosted on DigitalOcean and hosts a WordPress multi-site installation as well as my Nagios monitoring server.

So the first question was: do I run do-release-upgrade from the command line, or do I build a new server from scratch on 18.04 LTS? In this case, I decided to go with the command-line upgrade rather than a full install, for the following reasons:

  1. This was the first release upgrade on this system, so hopefully there were no accumulated side effects from earlier upgrades.
  2. It is rather a pain to do a full install, get everything working again, and transfer the WordPress sites. This included:
    1. Postfix with DKIM, SPF, DMARC, and TLS encryption
    2. CSF and LFD firewall
    3. Nagios
    4. WordPress
    5. Let's Encrypt auto-renewal of certificates
  3. I could always fall back on the build-a-new-server-from-scratch approach if this upgrade failed as I took a full server snapshot that I could easily restore in case of problems.

So I typed the following commands into a PuTTY terminal:

sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get autoremove
sudo do-release-upgrade

I answered yes when asked whether to open a fallback SSH port in case of communication issues during the upgrade. Whenever I was prompted to keep an existing configuration file or install the new one shipped with the package, I kept my existing one. Finally the server rebooted and I got the message that it now had 18.04 LTS installed.
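
A quick way to confirm what the upgrade actually installed is to read the release number back from /etc/os-release. The helper below is only a minimal sketch: the name release_version is made up for illustration and is not part of any Ubuntu tool.

```shell
#!/usr/bin/env bash
# Print VERSION_ID from an os-release style file (default: /etc/os-release).
# release_version is a hypothetical helper name, not a standard command.
release_version() {
    local file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into this shell.
    ( . "$file" && printf '%s\n' "$VERSION_ID" )
}

# Usage: run once before and once after do-release-upgrade and compare.
#   release_version
```

Running it before and after do-release-upgrade gives a simple record of what changed.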

Now came the crucial test: what was still working, and what was broken?

        1. The first thing I knew worked was my Postfix mail server, because it sent me the alert email that goes out whenever I log in via SSH. That also told me that my LFD was working. I examined the email using Gmail’s "Show original" command and verified that SPF, DKIM, and DMARC as well as TLS were working.
        2. CSF (Config Server Firewall) was not running. I restarted this with the command sudo csf -r and verified that CSF was running with the command sudo service csf status. I rebooted to check that CSF was running automatically on startup.
        3. WordPress was not working. The browser just showed raw PHP code when pointed at my website's admin login. After some searching on Google I learned that when php7.2 replaced php7.0, the new php7.2 Apache module was not enabled automatically. After enabling it by running sudo a2enmod php7.2 followed by sudo systemctl restart apache2 (two commands, executed in sequence), I had my WordPress system up and running!
        4. Nagios was not working: when I logged in as nagiosadmin from a browser, the web interface could not connect to the server and said that Nagios was probably not running. I verified that Nagios was not running with the command sudo service nagios status. I tried starting Nagios from the command line using sudo service nagios start and was pleasantly surprised to find that my Nagios was largely intact after all! However, on reboot Nagios did not start automatically. So I researched this and found that I had to create a systemd service for it. I did the following:
          1. If the file doesn’t exist, create the file /etc/systemd/system/nagios.service by typing sudo nano -c /etc/systemd/system/nagios.service
          2. Add the following to the file, then save it:

             [Unit]
             Description=Nagios

             [Install]

             [Service]
             Type=simple
             User=nagios
             Group=nagcmd
             ExecStart=/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

        5. Enable and start the Nagios service by typing these two commands in sequence:
          1. sudo systemctl enable /etc/systemd/system/nagios.service
          2. sudo systemctl start nagios
        6. I rebooted to test whether Nagios started automatically. It did!
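
The checks in the list above can be scripted so they are easy to repeat after the next upgrade. What follows is a sketch under stated assumptions, not a definitive setup. The function names are invented for illustration. The WantedBy=multi-user.target line is my assumption about what a complete [Install] section usually contains; systemctl enable needs some install target in order to hook a unit into boot. The csf and lfd unit names are taken from the service names used above and may differ on other installs.

```shell
#!/usr/bin/env bash
# Write a minimal Nagios unit file to the path given as $1.
# ASSUMPTION: WantedBy=multi-user.target is added here for illustration;
# the user, group and paths are the ones quoted in the post.
write_nagios_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Nagios

[Service]
Type=simple
User=nagios
Group=nagcmd
ExecStart=/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

[Install]
WantedBy=multi-user.target
EOF
}

# Print one line per unit: its name, whether it is running now,
# and whether it is set to start at boot.
report_units() {
    local unit active enabled
    for unit in "$@"; do
        active=$(systemctl is-active "$unit" 2>/dev/null)
        enabled=$(systemctl is-enabled "$unit" 2>/dev/null)
        printf '%s active=%s enabled=%s\n' "$unit" "${active:-unknown}" "${enabled:-unknown}"
    done
}

# Usage (as root):
#   write_nagios_unit /etc/systemd/system/nagios.service
#   systemctl daemon-reload
#   systemctl enable nagios && systemctl start nagios
#   report_units postfix apache2 csf lfd nagios
```

A unit that reports as active but not enabled is exactly the "works until the next reboot" situation I ran into with CSF and Nagios.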

So what was left? There was a final issue. I was using a WordPress LTI plugin for SSO into WordPress from Moodle. This SSO feature was not working. What's more, the WordPress site did not even launch successfully when called from Moodle. The issue turned out to be a subtle one: php7.2 treats it as an error when a function is called with fewer arguments than it defines. The WooCommerce plugin hooks a function into login that requires 2 arguments, and I was supplying only one. So I corrected the LTI plugin code by providing the 2nd argument, and that fixed the problem. I had to spend several hours chasing down this issue as it was completely new to me. So all in all, about 8 hours of work! I hope there are no more hidden gotchas awaiting me.
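
For anyone chasing the same kind of failure: PHP 7.1 and later throw an ArgumentCountError when a user-defined function gets too few arguments, where PHP 7.0 only raised a warning. The error text lands in the web server's error log, so a search like the sketch below can shorten the hunt. The helper name is made up, and the log path in the usage comment is an assumption; yours depends on the ErrorLog setting of the virtual host.

```shell
#!/usr/bin/env bash
# Print log lines where PHP reported a call with too few arguments.
# find_arg_count_errors is a hypothetical helper; $1 is the log file to search.
find_arg_count_errors() {
    grep -E 'ArgumentCountError|Too few arguments to function' "$1"
}

# Usage (log path is an assumption, adjust to your ErrorLog setting):
#   find_arg_count_errors /var/log/apache2/error.log
```

The matching line names the function and the file that called it, which points straight at the code that needs the extra argument.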

Update: Just when I thought everything was OK, something bit me on the backside: my letsencrypt-auto renewal cron job came up with errors. It turns out that after an OS upgrade, letsencrypt needs to be reinstalled. So I ran the /opt/letsencrypt/letsencrypt-auto certificates command, and it did show my certificates. Next I ran the /opt/letsencrypt/letsencrypt-auto renew command; it did some re-installation and then came up with an error stating that there were problems with authentication over HTTP. I was puzzled. I checked the DNS; all was correct. I checked the site and was surprised to find the Apache "It works" page. So basically, the 000-default site was enabled and was clashing with my site. I had to run sudo a2dissite 000-default and restart the Apache web server. After that, the /opt/letsencrypt/letsencrypt-auto renew command worked. I still need to check whether the auto-renewal cron job works correctly. But phew!
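
The default-site clash is easy to test for before running a renewal. This is a minimal sketch; default_site_enabled is an invented name, and the directory is the standard Ubuntu Apache layout, which I am assuming applies here. The --dry-run flag in the usage comment is how certbot rehearses a renewal without issuing certificates; I am assuming letsencrypt-auto passes it through unchanged.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) if the stock 000-default site is enabled.
# $1 is the sites-enabled directory; the default assumes the Ubuntu layout.
default_site_enabled() {
    local dir="${1:-/etc/apache2/sites-enabled}"
    # Enabled sites are symlinks, so accept a link even if its target is missing.
    [ -e "$dir/000-default.conf" ] || [ -L "$dir/000-default.conf" ]
}

# Usage (as root):
#   if default_site_enabled; then
#       a2dissite 000-default
#       systemctl reload apache2
#   fi
#   /opt/letsencrypt/letsencrypt-auto renew --dry-run
```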

Posted in Linux.