VMware Cloud Director and ALB(AVI): Adjust log retention for Events, Tasks and Audit Trail events

Working with high churn workloads within VCD instances, one can expect some growing pains. Depending on the integrations, workload and churn, these instances often require some fine tuning. Why does this blog post title contain ALB(AVI)? Great question: the high churn workload here leverages VCD with ALB and containers! That combination generates an amazing amount of tasks and events that all get stored in the VCD DB.

Before limiting what is saved, how does one track down which tables are taking up space? Running the following query against the VCD database, or something similar depending on your preference, will provide the sizing:

select schemaname as table_schema, relname as table_name, pg_size_pretty(pg_relation_size(relid)) as data_size from pg_catalog.pg_statio_user_tables order by pg_relation_size(relid) desc;
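One thing to keep in mind with the query above: pg_relation_size only counts the table's main data, not its indexes or TOAST storage. A rough variant that swaps in pg_total_relation_size (a standard PostgreSQL function) is sketched below. The psql invocation, the database name "vcloud" and the RUN and VCD_DB variables are assumptions for illustration, so adjust them to the environment.

```shell
#!/bin/sh
# Sketch: same sizing query, but counting indexes and TOAST as well.
SIZE_QUERY="select schemaname as table_schema, relname as table_name, pg_size_pretty(pg_total_relation_size(relid)) as total_size from pg_catalog.pg_statio_user_tables order by pg_total_relation_size(relid) desc limit 10;"

# Assumption: psql is on PATH and the database is named "vcloud".
# RUN=1 executes the query; anything else only prints it for review.
if [ "${RUN:-0}" = "1" ]; then
  psql -d "${VCD_DB:-vcloud}" -c "$SIZE_QUERY"
else
  printf '%s\n' "$SIZE_QUERY"
fi
```

Comparing the two outputs gives a feel for how much of the space is index overhead versus row data.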

Out of the box, the VCD appliance at XL sizing deploys with 79GB for the vpostgres mount. Clearly this instance has been expanded beyond that, but at some point expanding doesn’t solve the problem; it just does a good job of masking it.

How does one work around this issue?

Well, there are some choices and tradeoffs. In this case, the logs are all being shipped off to Log Insight (vRLI aka Aria Logs) instances with a very large amount of backing storage for retention. Because vRLI is getting all of the data and the customer is fine with not having it in VCD, the choice was made to turn the retention down in the VCD instance itself.

Enter the cell management tool (CMT)!

**The commands below depend on the use case, and don’t quote me on retention times or sizing requirements here, as each instance/workload/customer is different.

The following commands adjust the retention time for Events, Tasks and Audit Trail, where -v ## is the number of days to keep. The change does not take effect immediately. If, say, the event table was taking up 288GB and retention was lowered to 5 days, the records older than 5 days would not be deleted right away. One of the jobs related to the cleanup is ActivityLogCleanerJob, which by default runs once per day. If required, it can be adjusted to run more frequently, but that could have a negative performance impact. Again, adjust based on the environment and requirements.

#Command to update retention time for Events
/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n com.vmware.vcloud.events.history.days -v 10

#Command to update retention time for Audit Trail
/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n com.vmware.vcloud.audittrail.history.days -v 10

#Command to update retention time for Tasks
/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n com.vmware.vcloud.tasks.history.days -v 10

Each command should produce output similar to the following:

New property being stored: Property "com.vmware.vcloud.events.history.days" has value "##"
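Since the three commands above differ only in the property name, they can be wrapped in a small loop. The following is a minimal sketch, not an official script: the set_retention function and the CMT, DAYS and DRY_RUN variables are hypothetical names made up for this example, and it defaults to a dry run that only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: apply one retention value to Events, Audit Trail and Tasks.
# CMT, DAYS and DRY_RUN are illustrative variables, not VCD settings.
CMT="${CMT:-/opt/vmware/vcloud-director/bin/cell-management-tool}"
DAYS="${DAYS:-10}"
DRY_RUN="${DRY_RUN:-1}"

set_retention() {
  for prop in events audittrail tasks; do
    name="com.vmware.vcloud.${prop}.history.days"
    if [ "$DRY_RUN" = "1" ]; then
      # Dry run: print the command instead of running it.
      echo "$CMT manage-config -n $name -v $DAYS"
    else
      "$CMT" manage-config -n "$name" -v "$DAYS"
    fi
  done
}

set_retention
```

Running it with DRY_RUN=0 and DAYS=5 on a cell would issue the real commands with a 5 day retention; as with everything here, pick the value that fits the environment.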

Some extra info: the instances being tuned here are running VCD 10.4.2 and 10.5. These settings, and how logs are stored in future builds, are subject to change!