It's like the opposite of Dr. House's "It's never Lupus."
"It's always DNS."
I feel like we really need to speed up the embrace of IPv6 to solve this kind of issue. DNS is helpful to humans, sure, but a lot of these outages are triggered by services not being able to reach one another because they're hard-coded to a DNS name, which is done to avoid dealing with IPs that shift due to things like NAT.
It feels like we could do an end-run around a lot of this by having a failover to an IPv6 address that is associated with the DNS entry, used if the DNS lookup fails. Kind of like how you generally have multiple DNS servers in sequence in case one is unresponsive: what if, at the service level, we stopped relying on DNS so much and instead used the benefits of IPv6 so that services don't fail when DNS does? DNS should be for humans, not for computers, especially not in a world where IPv6 exists.
(someone who is more familiar with the ins-and-outs of IPv6 is welcome to tell me if and why I am wrong in thinking this)
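To make the failover idea concrete, here's a rough Python sketch of what I mean. It's only an illustration under my own assumptions: `resolve_with_fallback` and the `fallback_ipv6` parameter are names I made up, the address in the usage note is from the 2001:db8::/32 documentation range, and it assumes the service really does have a stable IPv6 address you can pin ahead of time.

```python
import ipaddress
import socket


def resolve_with_fallback(host, port, fallback_ipv6, resolver=socket.getaddrinfo):
    """Return (family, sockaddr) candidates for host:port.

    Tries a normal DNS lookup first. If the lookup itself fails, returns
    a single candidate built from a pinned IPv6 literal instead.
    `fallback_ipv6` is a hypothetical, pre-configured address.
    """
    # Validate the pinned literal up front; raises ValueError if malformed.
    ipaddress.IPv6Address(fallback_ipv6)
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
        return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]
    except socket.gaierror:
        # DNS resolution failed: fall back to the pinned address.
        # IPv6 sockaddr tuples are (host, port, flowinfo, scope_id).
        return [(socket.AF_INET6, (fallback_ipv6, port, 0, 0))]
```

You'd call it as something like `resolve_with_fallback("api.example.com", 443, "2001:db8::10")` and then try connecting to each candidate in order. The obvious catch is that the pinned address goes stale if the service ever gets renumbered, so this only helps as far as that assumption holds.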