Another day, another VCSA issue. This morning when I tried to log into vCenter, it threw the infamous 503 Service Unavailable error.
And off I went to find out what it was this time. The usual disk space check looked okay at first glance.
df -h
Filesystem                                Size  Used Avail Use% Mounted on
devtmpfs                                  7.9G     0  7.9G   0% /dev
tmpfs                                     7.9G  664K  7.9G   1% /dev/shm
tmpfs                                     7.9G  684K  7.9G   1% /run
tmpfs                                     7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sda3                                  11G  6.3G  3.8G  63% /
tmpfs                                     7.9G  1.3M  7.9G   1% /tmp
/dev/mapper/autodeploy_vg-autodeploy      9.8G   23M  9.2G   1% /storage/autodeploy
/dev/mapper/archive_vg-archive             50G   47G   14M 100% /storage/archive
/dev/mapper/log_vg-log                    9.8G  3.4G  5.9G  37% /storage/log
/dev/mapper/seat_vg-seat                   25G   22G  1.3G  95% /storage/seat
/dev/mapper/imagebuilder_vg-imagebuilder  9.8G   23M  9.2G   1% /storage/imagebuilder
/dev/mapper/core_vg-core                   50G   15G   33G  31% /storage/core
/dev/mapper/db_vg-db                      9.8G  676M  8.6G   8% /storage/db
/dev/mapper/netdump_vg-netdump            985M  1.3M  916M   1% /storage/netdump
/dev/mapper/dblog_vg-dblog                 15G  470M   14G   4% /storage/dblog
/dev/mapper/updatemgr_vg-updatemgr         99G  2.3G   92G   3% /storage/updatemgr
/dev/sda1                                 120M   34M   78M  31% /boot
/storage/archive is supposed to sit at 100%, so no issue there. The other suspect was “/storage/seat” at 95%, which was odd: close to full, but not quite.
So I started searching vpxd.log for keywords like “Error” and “full” until I found this error message.
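The eyeball check above can be sketched as a small script. This is only an illustration, not something that ships with the VCSA: the function name `flag_full_filesystems`, the 90% default threshold, and the idea of whitelisting /storage/archive as "expected to be full" are all my own assumptions. The sample rows are copied from the df output above.

```python
def flag_full_filesystems(df_output, threshold=90, expected_full=("/storage/archive",)):
    """Return (mount, use%) pairs at or above threshold.

    threshold and expected_full are illustrative defaults, not VMware values.
    """
    flagged = []
    for line in df_output.strip().splitlines():
        fields = line.split()
        # A data row has exactly 6 columns; the header has 7 ("Mounted on").
        if len(fields) != 6:
            continue
        use = fields[4]
        if not (use.endswith("%") and use[:-1].isdigit()):
            continue
        percent = int(use[:-1])
        mount = fields[5]
        if percent >= threshold and mount not in expected_full:
            flagged.append((mount, percent))
    return flagged


SAMPLE = """\
Filesystem                      Size  Used Avail Use% Mounted on
/dev/sda3                        11G  6.3G  3.8G  63% /
/dev/mapper/archive_vg-archive   50G   47G   14M 100% /storage/archive
/dev/mapper/log_vg-log          9.8G  3.4G  5.9G  37% /storage/log
/dev/mapper/seat_vg-seat         25G   22G  1.3G  95% /storage/seat
"""

if __name__ == "__main__":
    for mount, percent in flag_full_filesystems(SAMPLE):
        print(f"{mount} is at {percent}%")
```

With the sample rows, only /storage/seat is reported, which matches the suspicion above.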
# fgrep -R -i "Error" vpxd.log
2019-10-03T15:40:18.849-07:00 error vpxd [[email protected] sub=vpxdVdb] Shutting down the VC as there is not enough free space for the Database(used: 95%; threshold: 95%).
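If you would rather not scan the log by eye, the used and threshold figures can be pulled out of that message with a regular expression. This is a rough sketch: the function name `parse_db_space_error` is made up, the pattern is based only on the single message quoted above, and the "used >= threshold" comparison is my reading of that one line, not documented vpxd behaviour.

```python
import re

# Pattern based solely on the message text quoted in this post.
SPACE_RE = re.compile(r"used:\s*(\d+)%;\s*threshold:\s*(\d+)%")


def parse_db_space_error(line):
    """Return (used, threshold, at_or_over) or None if the line does not match."""
    match = SPACE_RE.search(line)
    if match is None:
        return None
    used, threshold = int(match.group(1)), int(match.group(2))
    # Assumption: vpxd shuts down once usage reaches the threshold.
    return used, threshold, used >= threshold


MESSAGE = (
    "Shutting down the VC as there is not enough free space "
    "for the Database(used: 95%; threshold: 95%)."
)

if __name__ == "__main__":
    print(parse_db_space_error(MESSAGE))
```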
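That matched the 95% on /storage/seat, which holds the stats, events, alarms and tasks (SEAT) data. So the fix was to either increase the disk size or clean up the events database.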
I decided to clean up the tables. I found this VMware KB, but 6.7 does not have any vpx_event or vpx_event_arg tables; instead it has “vpx_event_1” to “vpx_event_48”, and likewise for the vpx_event_arg_xx tables.
After logging into the pgdb, I started running the following command:
truncate table vpx_event_1 cascade;
After running it for about 10 tables I could see usage drop to 90%, so I googled and found this community thread with a quick way to clean up all the events tables.
DO $$
DECLARE
    rec record;
BEGIN
    FOR rec IN
        SELECT * FROM pg_tables
        WHERE tablename ~ '^vpx_event_[0-9].*'
        ORDER BY tablename
    LOOP
        EXECUTE 'TRUNCATE TABLE '
            || quote_ident(rec.schemaname) || '.'
            || quote_ident(rec.tablename) || ' CASCADE';
    END LOOP;
END$$;
And one for vpx_event_arg:
DO $$
DECLARE
    rec record;
BEGIN
    FOR rec IN
        SELECT * FROM pg_tables
        WHERE tablename ~ '^vpx_event_arg_[0-9].*'
        ORDER BY tablename
    LOOP
        EXECUTE 'TRUNCATE TABLE '
            || quote_ident(rec.schemaname) || '.'
            || quote_ident(rec.tablename) || ' CASCADE';
    END LOOP;
END$$;
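Since TRUNCATE is not something you want to aim at the wrong table, it may be worth checking offline which names each pattern would hit. The sketch below reuses the two regular expressions from the blocks above against a made-up list of table names; the list and the helper name `matching_tables` are illustrative only, and Python's `re` stands in for PostgreSQL's `~` operator, which should behave the same for patterns this simple.

```python
import re

# Same patterns as in the two DO blocks above.
EVENT_RE = re.compile(r"^vpx_event_[0-9].*")
EVENT_ARG_RE = re.compile(r"^vpx_event_arg_[0-9].*")


def matching_tables(names, pattern):
    """Return the table names the pattern matches, sorted like ORDER BY tablename."""
    return sorted(name for name in names if pattern.search(name))


# Illustrative names only; a real VCDB has many more tables.
TABLES = [
    "vpx_event_1",
    "vpx_event_2",
    "vpx_event_48",
    "vpx_event_arg_1",
    "vpx_event_arg_48",
    "vpx_entity",
]

if __name__ == "__main__":
    print(matching_tables(TABLES, EVENT_RE))
    print(matching_tables(TABLES, EVENT_ARG_RE))
```

The first pattern requires a digit right after `vpx_event_`, so it does not pick up the `vpx_event_arg_` tables, which is why the second block is needed.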
Down to 11% usage. I rebooted the VCSA for good measure and took a backup snapshot. All good.