A 503 service unavailable error simply means that the server was temporarily unable to handle the request for the website. This was happening to a webhook sent by a payment gateway Cashfree.com (after any payment that was collected) that was being refused by my server and given this error.
I found out about this error in 2 ways: 1) The apache2/access.log reported this connection as a 503 and 2) The payment gateway webhook test site indicated problem generating the webhook.
So how to debug this problem?
- Clearly it was not a real 503 issue since my server was not down, not overloaded and this problem was repeatable and reproducible. That must mean that it was being blocked for some unknown reasons.
- I used Postman to send a POST exactly similar to the webhook except with bogus data and signature. This connection was accepted correctly. So evidently the firewall(s) were OK and Apache2 was OK.
- So if Postman sent webhook was OK but not the one sent by Cashfree.com, what was different between them?
- This was the difference as seen in access.log:
xx.xx.xxx.xxx - - [24/Oct/2019:09:04:40 +0530] "POST /payments/wp-admin/admin-post.php?action=somewebhook HTTP/1.1" 503 22966 "-" "-"
xx.xx.xxx.xxx - - [25/Oct/2019:06:48:07 +0530] "POST /payments/wp-admin/admin-post.php?action=somewebhook HTTP/1.1" 200 3713 "-" "PostmanRuntime/7.18.0"
- Clearly the only difference as far as the log entry showed was that Cashfree’s webhook did not have referrer and user agent header information, whereas Postman’s webhook had user agent information. Was this the real reason?
- I checked error.log, syslog, lfd.log, etc., to no avail, no information about anything except that shown in access.log shown above,
- I set the logLevel to debug ssl:warn in the virtual hosts configuration. After this change, the logs were more detailed ans showed that the connection was not being denied by Apache2 as shown here:
Sat Oct 26 05:11:08.926488 2019] [core:debug] [pid 17330] protocol.c(2257): [client xx.xx.xxx.xxx:40248] AH03155: select protocol from , choices=http/1.1 for server xxxxx.org
[Sat Oct 26 05:11:08.998763 2019] [socache_shmcb:debug] [pid 17330] mod_socache_shmcb.c(495): AH00831: socache_shmcb_store (0x42 -> subcache 2)
[Sat Oct 26 05:11:08.998814 2019] [socache_shmcb:debug] [pid 17330] mod_socache_shmcb.c(849): AH00847: insert happened at idx=0, data=(0:32)
[Sat Oct 26 05:11:08.998820 2019] [socache_shmcb:debug] [pid 17330] mod_socache_shmcb.c(854): AH00848: finished insert, subcache: idx_pos/idx_used=0/1, data_pos/data_used=0/207
[Sat Oct 26 05:11:08.998824 2019] [socache_shmcb:debug] [pid 17330] mod_socache_shmcb.c(516): AH00834: leaving socache_shmcb_store successfully
[Sat Oct 26 05:11:09.062052 2019] [authz_core:debug] [pid 17330] mod_authz_core.c(809): [client xx.xx.xxx.xxx:40248] AH01626: authorization result of Require all granted: granted
[Sat Oct 26 05:11:09.062106 2019] [authz_core:debug] [pid 17330] mod_authz_core.c(809): [client xx.xx.xxx.xxx:40248] AH01626: authorization result of <RequireAny>: granted
[Sat Oct 26 05:11:09.062761 2019] [authz_core:debug] [pid 17330] mod_authz_core.c(809): [client xx.xx.xxx.xxx:40248] AH01626: authorization result of Require all granted: granted
[Sat Oct 26 05:11:09.062773 2019] [authz_core:debug] [pid 17330] mod_authz_core.c(809): [client xx.xx.xxx.xxx:40248] AH01626: authorization result of <RequireAny>: granted
- I disabled the CSF firewall to make sure that this was not a firewall issue. No change!
- I increased php.ini memory limit and execution time just as a check: No Change!
- Finally I whitelisted Cashfree.com’s webhook source IP in the Wordfence plugin that also had a firewall and bingo! the problem was solved. It turned out that Wordfence has a rule to block IP’s that do not have a referrer AND a user agent specified in the header.
Now it all ties: Apache2 was not blocking but still showing some weird 503 error. Postman was not being blocked because it had a non-blank user agent header.
So most cases it is always some firewall or .htaccess or ill written apache2 configuration for access.