VMware ESXi VM "Redo Log Corrupted"? Don't Panic, Here’s How to Resurrect It! 🚑

Hey there 👋, back with another post! As some of you may know (or maybe I haven’t mentioned it yet), a friend and I run a small virtualization cluster in a remote rural area. It hosts most of our self-developed applications and testing environments. Everything was running smoothly except for one thing: the local power grid is unstable, and sudden outages regularly cause hard shutdowns across the cluster. After a recent improper power-off, our Reverse Proxy VM, which handles all external mapping, completely went on strike. On every boot attempt it threw the same error: “The redo log is corrupted!” 🤯

This cluster strictly follows a small-enterprise network architecture with clearly defined DMZ, Tunnel, and Trust zones, so only one or two zones are capable of external mapping. Once the reverse proxy went down, several of our external-facing services were effectively “cut off.” A timely fix was non-negotiable.

🙋 The Issue Looked Like This:

The redo log of ‘xxxxxx.vmdk’ is corrupted. If the problem persists, discard the redo log.

Seeing this prompt left me momentarily stunned. Only one thought crossed my mind: We’re in trouble. 🥶

Troubleshooting Approach

VMDK? It must be disk-related! 💾

Since the error explicitly mentioned a VMDK, the VM’s virtual disks were the most likely culprit. I recalled that this VM had several existing snapshots, so I immediately attempted a disk consolidation. The result? Well… it didn’t do much. 🤷 The VM still refused to boot and kept showing the same error.
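Before touching anything, it helps to know which files in the VM folder are actually snapshot (redo log) disks. Here is a minimal sketch, assuming ESXi's default naming where snapshot disks carry a six-digit suffix such as `-000001.vmdk`, with data files ending in `-delta.vmdk` or `-sesparse.vmdk`. The VM name `proxy` and the file listing are made up for illustration; they are not from our actual cluster.

```python
import re

# Naming patterns ESXi commonly uses for snapshot ("redo log") disks.
# Assumption: default file naming; renamed or hand-edited disks will not match.
SNAPSHOT_DESCRIPTOR = re.compile(r"-\d{6}\.vmdk$")
SNAPSHOT_DATA = re.compile(r"-\d{6}-(delta|sesparse)\.vmdk$")


def classify_vm_files(filenames):
    """Split a VM folder listing into snapshot disk files and everything else."""
    snapshot_files, other_files = [], []
    for name in filenames:
        if SNAPSHOT_DESCRIPTOR.search(name) or SNAPSHOT_DATA.search(name):
            snapshot_files.append(name)
        else:
            other_files.append(name)
    return sorted(snapshot_files), sorted(other_files)


# Hypothetical folder listing for a VM called "proxy".
listing = [
    "proxy.vmx",
    "proxy.vmdk",
    "proxy-flat.vmdk",
    "proxy-000001.vmdk",
    "proxy-000001-sesparse.vmdk",
    "proxy-Snapshot1.vmsn",
]
snapshots, others = classify_vm_files(listing)
print(snapshots)
```

If the disk named in the error message shows up in the snapshot list, the corruption is in a snapshot layer rather than in the base disk, which is the situation I was in.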

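When you’re faced with a mission-critical VM that won’t start and you don’t have a fresh backup, the rule of thumb is to try every possible method (ideally after securing whatever data you can). In my case, I decided to Delete All Snapshots! On ESXi, this commits the snapshot delta disks into the base disk and removes the snapshot chain, so there is no going back to an earlier snapshot afterwards.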

[Screenshot: Virtual Machine Snapshot Management, Delete All Snapshots]

After confirming the deletion and watching the progress bar reach 100%, a minor miracle happened: the “striking” reverse proxy VM successfully booted up! 🎉🎉🎉 Back to “business” as usual; the world suddenly felt right again! 🥳

[Screenshot: Normal Startup]
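I did all of this through the web UI, but the same operation is available from the ESXi shell via `vim-cmd`. The sketch below only builds the command lines and prints them; it does not run anything. The VM ID `12` is a placeholder (on a real host you would look it up with `vim-cmd vmsvc/getallvms`), and the helper name `snapshot_removeall_commands` is my own invention.

```python
def snapshot_removeall_commands(vm_id):
    """Return the ESXi shell commands to list, then delete, all snapshots of a VM."""
    if isinstance(vm_id, bool) or not isinstance(vm_id, int) or vm_id <= 0:
        raise ValueError("vm_id must be a positive integer")
    return [
        # Show the current snapshot tree first, so you know what will be removed.
        ["vim-cmd", "vmsvc/snapshot.get", str(vm_id)],
        # Equivalent of "Delete All Snapshots" in the UI.
        ["vim-cmd", "vmsvc/snapshot.removeall", str(vm_id)],
    ]


# Dry run: print the commands for a hypothetical VM ID instead of executing them.
for cmd in snapshot_removeall_commands(12):
    print(" ".join(cmd))
```

Printing instead of executing is deliberate: this is a destructive operation, so it is worth reading the commands once before pasting them into an SSH session.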

🚨 Pro-Tip (A Must-Read for Production Environments!! Critical!!!) 🚨

If you are managing vSphere VMs in a production environment, I strongly recommend cloning the VM to create a full backup before performing any destructive operations! This ensures you have a rollback path to avoid further data loss.
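If a proper clone isn't possible (for example, because the VM won't even power on), a plain file-level copy of the VM folder is a cruder fallback. Note that this is not the vSphere clone operation recommended above. The sketch assumes the VM is powered off and that its datastore folder is reachable from a machine with Python (say, over NFS). The function name and the `.bak` suffix are hypothetical, and a plain copy may inflate thin-provisioned disks to their full size.

```python
import shutil
from pathlib import Path


def backup_vm_folder(vm_dir, backup_root):
    """Copy every file in a powered-off VM's folder and verify the sizes match."""
    src = Path(vm_dir)
    dst = Path(backup_root) / (src.name + ".bak")
    # copytree refuses to overwrite: it raises FileExistsError if dst already exists.
    shutil.copytree(src, dst)
    for original in src.rglob("*"):
        if original.is_file():
            copy = dst / original.relative_to(src)
            if copy.stat().st_size != original.stat().st_size:
                raise IOError(f"size mismatch for {original.name}")
    return dst
```

The size check is only a basic sanity test, not a checksum, but it catches a copy that was cut short by, say, another power outage.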

As for our “home lab” setup—well, we can afford to be a bit more adventurous. 😅 At worst, I’d just have to redeploy the VM and reconfigure everything from scratch. It’s a tedious process, but certainly better than having no service at all!