eth.limo DNS Outage Retrospective
A postmortem of our recent DNS outage, which was caused by questionable registrar behavior
DNS Trouble
Beginning Saturday, March 25, 2023, at around 17:15 UTC, the eth.limo domain suddenly vanished from the internet. Our infrastructure was online and healthy, but DNS lookups against OpenDNS, Google, and a few other major providers returned inconsistent results: some lookups resolved correctly and others did not. My first instinct was to investigate AWS Route53, which we use to manage the eth.limo zone programmatically. Querying the Route53 nameservers directly, I confirmed that the zone was responding normally, yet the majority of other DNS servers disagreed.
At this point I logged into our account at Njalla, which we were using as our domain registrar. The eth.limo domain was present and displayed “Active” in the column next to the name. I clicked on “manage” and nothing happened. Sensing something was off, I opened the browser dev tools and inspected the “manage” button.
After seeing “disabled” in the button tag, I knew that something was going on at the registrar level. I quickly fired off back-to-back support tickets and emails to Njalla, asking why we were locked out of managing our own domain. Several hours elapsed without a reply. During this window, eth.limo began resolving again, but all traffic was being directed to a very sketchy domain parking service located at 46.8.8.100.
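The kind of check described above, asking several resolvers for the same name and flagging any that disagree with the addresses you expect, can be sketched with the Python standard library alone. This is a minimal illustration and not the tooling we used during the incident: the function names are hypothetical, the expected address set is something you would supply yourself, and the parser ignores truncated responses, does not verify the response ID, and only handles A records.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Flags 0x0100 = recursion desired; exactly one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = msg[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer occupies two bytes
            return offset + 2
        offset += 1 + length


def parse_a_records(msg: bytes) -> set[str]:
    """Extract the IPv4 addresses from the answer section of a DNS response."""
    _, _, qdcount, ancount, _, _ = struct.unpack("!HHHHHH", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4  # skip QTYPE and QCLASS
    addresses = set()
    for _ in range(ancount):
        offset = skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[offset:offset + 10])
        offset += 10
        if rtype == 1 and rdlength == 4:
            addresses.add(socket.inet_ntoa(msg[offset:offset + 4]))
        offset += rdlength
    return addresses


def resolve_a(name: str, server: str, timeout: float = 3.0) -> set[str]:
    """Ask one specific DNS server (by IP address) for the A records of a name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(512)
    return parse_a_records(data)


def find_disagreements(answers: dict[str, set[str]], expected: set[str]) -> dict[str, set[str]]:
    """Return the servers whose answer is empty or contains an unexpected address."""
    return {
        server: addrs
        for server, addrs in answers.items()
        if not addrs or not addrs <= expected
    }
```

In practice you would call `resolve_a("eth.limo", server)` for each resolver of interest, for example 8.8.8.8 (Google) and 208.67.222.222 (OpenDNS), and pass the collected results to `find_disagreements` together with the set of addresses your own infrastructure is supposed to answer with. A resolver returning 46.8.8.100 would then show up immediately.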
Luckily, HTTP Strict Transport Security kicked in and prevented the vast majority of users from ever actually landing on the parking page.
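For readers unfamiliar with the mechanism: a site sends a `Strict-Transport-Security` response header, and a browser that has seen it refuses to load the site over plain HTTP or to click through certificate errors until the policy expires. The sketch below parses such a header and checks whether a previously cached policy would still be enforced. The header value used in the usage note is a generic example, not necessarily the one eth.limo serves, and both function names are hypothetical.

```python
def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives (RFC 6797)."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def still_enforced(policy: dict, seconds_since_last_visit: int) -> bool:
    """Return True if a browser that cached this policy would still enforce HTTPS."""
    max_age = policy["max_age"]
    return max_age is not None and seconds_since_last_visit < max_age
```

For example, `parse_hsts("max-age=31536000; includeSubDomains; preload")` yields a policy lasting one year, so a visitor who last loaded the site a week before the outage would still have been protected. First-time visitors with no cached policy get no such protection unless the domain is on the browser preload list.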
Njalla remained silent. With no explanation from the registrar of what was happening, we began reaching out to our network. If there’s one silver lining in all of this, it’s that the web3 community is full of terrific people who are willing to help. We mobilized our army of friends and acquaintances with ICANN experience and began attempting to contact Njalla.
Recovery
Suddenly and without warning, on Sunday, March 26 at around 20:00 UTC, the eth.limo gateway came back online and was once again pointing to our infrastructure. I scrambled to check the administrative email account used for Njalla and, lo and behold, there was an email notifying me that Njalla had responded to my support ticket. I logged in and expanded the ticket:
Very strange. Relieved that the outage was over, we tweeted that eth.limo was back online and posted the screenshot of the Njalla support ticket.
New DNS Registrar
It became clear that Njalla was no longer a reliable registrar and that our team needed to transfer the eth.limo domain to a more trustworthy one as soon as possible. Fortunately, we had been in contact with Mark Jeftovic at EasyDNS during the outage. We exchanged messages and eventually scheduled a Zoom call to introduce ourselves and discuss transferring the eth.limo domain from Njalla to EasyDNS. We were assured that eth.limo would be in good hands and that EasyDNS would work with us openly and transparently. After reviewing everything I could find about EasyDNS, I felt confident that eth.limo would be safe there. Our team completed the transfer of eth.limo to EasyDNS on April 13, 2023, without any service disruption.
Lessons Learned
Try to establish a point of contact with your domain registrar that is outside of the normal support ticketing workflow.
Do not assume that your registrar understands the services you are providing.
Never keep more than one mission-critical domain in the same account with the same registrar.
I have been a longtime customer of Njalla, but I will no longer be using their services due to poor communication and the outright seizure of the eth.limo domain without warning or notice. To date, we are still in the dark about what truly happened and why our domain was taken from us and pointed at a pay-per-click parking page.
Future Risk Mitigation
We are taking active steps to ensure that the infrastructure that powers eth.limo can be deployed across a variety of clouds and compute platforms, providing us with additional resiliency and disaster recovery strategies.
Salute to Web3
Finally, the eth.limo team would like to thank the web3 community for its support. We strive to provide convenient access to dWeb content and we’re honored to be in such great company.