
Thursday, August 6, 2020

Ok People, Learn from my Mistakes

Updated with new information on discovering electrical wiring mistakes.


This is a long post, so bear with me. I am putting this out there so that others can learn from my mistakes in networking. If you read through to the end you will pick up a lot of lessons, including some you will dismiss with “well, everyone knows that.”


I have a distributed HomeLab, built up over the years, that I had become quite proud of. It let me move equipment about the house, make changes as I needed, and add new equipment as I bought it. It included at least 8 different VLANs for different purposes, including separating media and IoT equipment. On the night of July 23rd we experienced a power outage. It announced itself with a loud bang, followed by the lights going out, and I saw a few sparks fly from the pole in front of my house (no transformer, just Verizon Fios gear). The power came back about 30 seconds later, but my internet, phone, and TV channels weren’t working (I have Verizon Fios). SWMBO went to reading while I went off to explore what had happened.


I went downstairs to look at the ONT interface (I have the older version). It was flashing and telling me that it was running on battery power, but no problem lights were lit. I checked the electric panel and no breakers had tripped. I went back and reset the ONT interface: unplug it and pull the battery, then reverse the order to bring it back. The flashing stopped for about 2 seconds, then went back to the same flashing and battery lights. So I got my meter and checked the outlet that the ONT interface was plugged into; the voltage was where I expected it to be.


I then went into my computer room, and the first thing I noticed was that my main router and its accompanying managed switch had no lights on the front panel. Sure enough, I tried to access the router from my Mac Mini and had no luck. I unplugged the power leads from the router and the switch, and the meter showed power reaching both; the UPS they were plugged into was also showing no signs of problems. The UPS has regulated power as well as surge protection, so I was surprised. I then pulled all of the patch cables out of one of my spare EdgeRouter X (ER-X) units, moved the ONT patch cord from my main router to the ER-X WAN port, and ran a cable from the Mac Mini to the ER-X. Looking at the configuration screen, I saw that the ONT link was alternating between connected and disconnected, with no IP address.


Lesson 1: make sure you have a backup plan for troubleshooting in case of a catastrophic network failure.


Lesson 2: a patch panel setup can save your bacon if you have to reconfigure your network in a hurry.


I tried a few different things, then concluded that the ONT wasn’t going to do its thing anytime soon. I went off to bed knowing that SWMBO would not be happy in the morning. The next morning I went searching for damage. I found that I had lost the ONT (apparently), my main Cisco router (on UPS), my main 26-port Netgear managed switch (also on UPS), and two 8-port managed switches (both on UPS). I then set about repatching to connect my media equipment to the ER-X I had set up the night before. Fortunately, that was rather quick because I had a diagram of the patch panel connections. Once that was done, I called Verizon to get a tech out to fix the Fios connection; the wait was at least a couple of days due to call volume. Having the media equipment on a flat network at least let me view the recordings SWMBO had made on our TiVo Bolt. Unfortunately, it also showed me that I couldn’t see any of the TV channels on the Bolt, which was odd.


The Verizon tech appeared the next day and, after some analysis, replaced the ONT and set me up. Now I was getting internet, and my phone (which runs over the internet connection) was working. Unfortunately, I still could not receive any channels on the Bolt, even after a cable card swap, although the Bolt could reach the internet. I reasoned that the Bolt was bad, so I contacted a family member who had recently upgraded from a Bolt to a TiVo Edge. I picked up their old Bolt, got it back to the house, connected it, and was now receiving TV channels, confirming that I had lost my own TiVo Bolt somewhere in the process. I went downstairs and attempted to connect the TiVo Mini to the replacement Bolt, but was unsuccessful because the two devices were on separate TiVo plans and the TiVo DRM kept me from connecting them together. I then went out and purchased a couple of unmanaged switches with decent throughput (2 Gbps per port) to use for setting up the media on the network.


Lesson 3: Don’t forget that there may be DRM issues when you try to bring your equipment back up.


Lesson 4: Having equipment on a UPS doesn’t guarantee that it is fully protected; a surge can arrive by other paths.


I swapped some more patch cables so that my UniFi AC-AP-Pro could at least give me wifi to the internet. Now that I had a semblance of media working for SWMBO, it was time to determine what else had happened and how. I went back to my Cisco router and discovered that I could reset it and at least get the front lights working. However, at least half of the 16 ports were fried, including the two WAN ports. This led me to realize that there must have been a surge through the Ethernet lines; again, odd. In fact, when I traced the path through the equipment that had failed, it was apparent that the surge had traveled along the Ethernet cables connecting each device. Since my TiVo Bolt was able to access the internet but not the TV channels, it must have taken the surge through the coax cable. Update: I later discovered that the electricians who wired my new kitchen appliances had run their wiring through the same main conduit that carried my longest Ethernet and coax runs. Evidently the power surge induced enough voltage in the Ethernet and coax lines to overwhelm the networking and media equipment. I didn't figure this out at first because of the odd failures I was seeing.


Lesson 5: it is important to invest in surge protectors for both the Ethernet lines and the coax cables in your network, because stuff happens. Oh, and it wouldn't hurt to have a whole-house surge protection system to limit collateral damage.


I have since purchased an EdgeRouter 12 and the same Netgear M4100-26G switch that I had before (I actually like this switch). I am still provisioning my network two weeks after the initial power outage. This morning I discovered another device that had bitten the dust: my Ubuntu Desktop server. It has all the appearance of a power supply problem; it boots, then powers off very quickly, even after reaching the login screen. The server was also plugged into the UPS, go figure. I am going to check out this one UPS to make sure that it is doing what it is supposed to do. I also don’t have the UPS connected to anything that could record its error messages and act on them. Update: I did indeed have to replace the power supply in the Ubuntu server.


Lesson 6: if you are using a UPS, make sure it is connected to a server that gathers its error messages, if only to tell you when the UPS itself is going bad.
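

As a rough illustration of Lesson 6, here is a minimal Python sketch of the kind of check a monitoring server could run. It assumes the UPS is read through something like Network UPS Tools (NUT), whose `upsc` command prints `key: value` lines and reports status flags such as OL (on line power), OB (on battery), LB (low battery), and RB (replace battery). The post does not say which UPS or software is involved, so the sample readings, the 50% threshold, and the function names are all made up for the example.

```python
# Sketch only: assumes a NUT-style "key: value" status dump, such as the
# text printed by `upsc <upsname>`. The sample readings below are made up.

# Status flags that deserve attention (OL, "on line power", is the normal state).
WARN_FLAGS = {
    "OB": "running on battery",
    "LB": "battery is low",
    "RB": "battery needs replacing",
}


def parse_ups_dump(text):
    """Turn 'key: value' lines into a dict, skipping lines without a colon."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def ups_warnings(values, min_charge=50.0):
    """Return a list of human-readable problems found in the readings.

    min_charge is an arbitrary example threshold, in percent.
    """
    problems = []
    flags = values.get("ups.status", "").split()
    if not flags:
        problems.append("no ups.status reported")
    for flag in flags:
        if flag in WARN_FLAGS:
            problems.append(WARN_FLAGS[flag])
    charge = values.get("battery.charge")
    if charge is not None:
        try:
            if float(charge) < min_charge:
                problems.append(f"battery charge at {charge}%")
        except ValueError:
            problems.append(f"unreadable battery charge: {charge!r}")
    return problems


# Hypothetical readings for a UPS that has switched to a weak battery.
SAMPLE = """\
battery.charge: 41
input.voltage: 121.0
ups.status: OB LB
"""

if __name__ == "__main__":
    for problem in ups_warnings(parse_ups_dump(SAMPLE)):
        print("UPS warning:", problem)
```

Run from a scheduled job, with the warnings written to a log or sent as a notification, something along these lines would at least leave a record of how the UPS behaved before and after an outage.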


Long story short, I am still finding things that have failed or partially failed due to this power outage. I haven’t even looked at my Home Assistant server and IoT equipment yet. I am almost afraid to do so.


Lesson 7: have at least a minimal setup in mind to get your network back up after failures.  It may involve using older equipment or less capable equipment but those devices can come in handy.  Don’t just throw out old equipment, but be reasonable as well.  Don’t keep everything you have ever had in the network, just keep enough to get you back up and running.


Lesson 8: decide ahead of time what is important and what can wait, should a catastrophe happen and you have to reconfigure quickly.
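

One way to make Lessons 7 and 8 concrete is to write the priorities down before anything fails. The Python sketch below is only an illustration: the service names, priorities, and port counts are invented, and the 5-port budget is an assumed figure for a small spare router. It walks the list in priority order and defers whatever no longer fits on the spare gear.

```python
# Hypothetical inventory: names, priorities, and port counts are made up
# for illustration and are not the actual network described in this post.
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    priority: int      # 1 = restore first
    ports_needed: int  # ports it needs on the spare router or switch


def plan_recovery(services, ports_available):
    """Walk services in priority order; defer any that do not fit
    in the ports still free on the spare equipment."""
    restore_now, restore_later = [], []
    for svc in sorted(services, key=lambda s: s.priority):
        if svc.ports_needed <= ports_available:
            restore_now.append(svc.name)
            ports_available -= svc.ports_needed
        else:
            restore_later.append(svc.name)
    return restore_now, restore_later


SERVICES = [
    Service("IoT devices", 4, 2),
    Service("Internet uplink (ONT)", 1, 1),
    Service("Media / DVR", 2, 2),
    Service("Wi-Fi access point", 3, 1),
    Service("Lab servers", 5, 3),
]

if __name__ == "__main__":
    # 5 is an assumed port count for the spare router.
    now, later = plan_recovery(SERVICES, ports_available=5)
    print("Restore now:  ", ", ".join(now))
    print("Restore later:", ", ".join(later))
```

With these made-up numbers the uplink, media, and wifi come back first, while the IoT devices and lab servers wait for replacement hardware. The real value is in having argued about the order before the lights go out.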


Postscript: I now have Ethernet surge protectors, coax surge protectors, and a whole-house surge protector setup. I have also removed or rerouted the central coax, and I have two fiber optic setups running through the original central channel for the network connection.


Hopefully these thoughts will prompt you to think through your own network and what could happen to it. Good luck to everyone!!


LW