Over the last weekend, this blog was down for a few days before it finally came back up last Sunday. The cause was the recent CentOS 7 update to 7.1, specifically the initscripts package. Nothing here is important, since this is just a personal blog and not a mission-critical site, but seeing it inaccessible for a few days still irks me.
I found out the site was inaccessible on April 3. I couldn't connect at all, so I assumed the node I was on was down and sent a ticket to the provider. When they replied that nothing was wrong on their side, I started wondering whether there was a misconfiguration somewhere, but it had been working the day before. So the troubleshooting game began. Unfortunately, with the network down and no other console access from the provider, checking anything was hard: I had to send them each command in a ticket, and they would run it and report back. I also didn't like bothering them like this, because this is an unmanaged service.
After numerous messages and failed attempts, they offered to create a full backup of my container, rebuild the VPS with a fresh OS, and place the backup file on the freshly installed VPS so I could do what I wanted with it later. I was still hesitant, mainly because I didn't want to reconfigure the whole thing again, and I suspected the problem wasn't related to my configuration. So I asked them to run one more command: install packages using yum.
To my surprise, they reported that yum failed and that they couldn't even ping the machine from the host node. So, just as I thought, the problem was caused by something else. I asked them to try re-adding my IP address and temporarily moving to another IP, but every attempt failed, which pointed to broken network scripts. Then they asked again whether I wanted to proceed with the backup plan mentioned before, and with no other choice left I said yes. After the procedure completed, I was left with a freshly installed CentOS 7 and a tar backup file in the root directory.
At this point I remembered the last automated report that had been sent to me. It showed plenty of updates being applied before I lost contact with my VPS. So I applied the updates right away, rebooted the server, and voila: the network went down again, and now I was sure the updates were to blame. But which one? There were so many updates, and I was locked out of my VPS again. This time I didn't want to send the provider as many messages as before, so I did some research and found the source of the problem (redhat bugzilla link | openvz bugzilla link). Since I had no way to access my VPS, the only thing I could do was send the provider a message pointing to those two bug reports and wait. Due to the timezone difference, it felt like I was waiting forever.
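For next time, the "which update?" question could be narrowed down with a small script like the one below. It is only a sketch, not what I actually ran. It assumes the update history is available as text in the usual /var/log/yum.log style ("Mon DD HH:MM:SS Updated: package"). The watchlist of network-related package names is my own guess, and the sample lines are made up for illustration.

```python
import re

# Assumed line format of /var/log/yum.log, e.g.
# "Apr 02 03:14:20 Updated: initscripts-9.49.24-1.el7.x86_64"
LOG_LINE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) (Updated|Installed|Erased): (\S+)$"
)

# Hypothetical watchlist: package name prefixes that can affect networking.
NETWORK_RELATED = ("initscripts", "NetworkManager", "iproute", "systemd")


def updated_packages(lines):
    """Return (timestamp, package) pairs for every 'Updated' entry."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group(2) == "Updated":
            found.append((match.group(1), match.group(3)))
    return found


def network_suspects(lines):
    """Return updated packages whose name starts with a watchlist prefix."""
    suspects = []
    for _, package in updated_packages(lines):
        # Drop an optional "epoch:" prefix before comparing names.
        name = package.split(":", 1)[-1]
        if name.startswith(NETWORK_RELATED):
            suspects.append(package)
    return suspects


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    "Apr 02 03:14:15 Updated: glibc-2.17-78.el7.x86_64",
    "Apr 02 03:14:20 Updated: initscripts-9.49.24-1.el7.x86_64",
    "Apr 02 03:14:31 Installed: kernel-3.10.0-229.el7.x86_64",
    "Apr 02 03:14:40 Updated: 1:NetworkManager-1.0.0-14.el7.x86_64",
]

if __name__ == "__main__":
    for package in network_suspects(SAMPLE_LOG):
        print(package)
```

With a short list of suspects, it is much easier to search the bug trackers for one package at a time instead of guessing across the whole update.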
Finally, after they applied the patch, my VPS was accessible again. But I still had one more thing to do: restore the backup they had provided and make sure everything was working again. If you think everything went smoothly this time, you'd be wrong, because during the restore my container was wrongly suspended. It took about 22 minutes after I sent them an explanation to get it unsuspended, and all is well now.