EDIT

I figured out what I was missing, and that made it clear how to fix my problem. It turns out that between 16.04 and 18.04, Ubuntu and its derivatives switched by default from a dnsmasq-based resolver to systemd-resolved, and systemd-resolved is much less willing than dnsmasq to tolerate hundreds of thousands of lines in /etc/hosts.

Below is the original question as I first described it. I'm noting here that there is now an answer because I was on the edit page anyway, fixing the tags to better reflect what turned out to be the relevant components.


Original Post

So, I recently moved from xenial (16.04) to bionic (18.04), and I've been mostly successful at resolving at least the kinks that I think have a resolution. I think I'm down to one last thing that isn't how I'd like it and that I expect has a fix.

In 16.04, I used to have a nice lengthy hostfile to just not even load known malicious sites. It worked well enough for me: if anything broke, I could check whether the relevant domain was there, check how it got there, and make my own decision about whether to un-blacklist it.

I tried to migrate over my hostfile from my old drive with 16.04 on it, but then suddenly nothing would load so I had to revert. After comparing a little bit, I found one tiny difference (my new hostfile included two versions of my computer's hostname, one ending in ".lan") and tried correcting that to see if it would fix things. No dice.

I can add items to the list: if I copy over my shortlist of domains I just don't want loading, those get blocked fine. But when I add the massive list that used to work, DNS seems to take forever and never resolves anything. This is the same list of hosts that I had added this same way just fine a few days ago.

My best thought on where to begin is to look further into whatever changed in networking between xenial and bionic (I had to wrestle control back over my modern-name-of-eth0 because something new was trying to auto-manage it). But it took me some time to even find a guide that successfully got me control back over my wired network, and I'm honestly not sure what the entire scope of the changes is.

Does anyone else use a hostfile to block a bunch of domains and happen to know what's going on here? Is there some setting whose default changed in a way that affects hostfile parsing, and that I need to set back to the old value? I'd be more suspicious of the file itself, but it was working just fine under xenial, so I'm fairly certain that a change in bionic is relevant and that the hostfile itself isn't simply wrong (or if it is, it's wrong in a way that has historically worked; I've been using the same source since precise).

  • Do you mean your 'hosts' file? – Paul Benson Sep 02 '19 at 18:14
  • Yes, I'm adding hosts to /etc/hosts so they don't load. Sorry, I feel like in some context before I've heard of that file just being generally referred to as the "hostfile" (or maybe "hostsfile"?), I probably should have spelled out that I do specifically mean /etc/hosts. – Turnip Wizard Sep 02 '19 at 18:25
  • What happens if you change the name temporarily - sudo mv /etc/hosts /etc/host? Are the sites still being blocked? – Paul Benson Sep 02 '19 at 19:59
  • Are which sites still being blocked? The problem is I want to block a variety of sites, but when I add them to the file then suddenly DNS stops working (or at least starts taking long enough that everything times out before it's willing to complete). I'm afraid I don't understand the question. – Turnip Wizard Sep 02 '19 at 21:03
  • You said when you moved to 18.04 using your old hosts file that no website would load (3rd para.). So I'm asking if you change its name or move it to eg your home folder, do websites now load? – Paul Benson Sep 02 '19 at 21:37
  • Oh, if I stop using that hosts file. Yeah, it's currently not in use because of this problem. Sorry, I had to stop using it to be able to load anything (like this page) so I didn't realize I sounded like I was still trying to use it. Yeah, no, if I use one that only has the default contents, or even the default contents plus just a few hosts, then things work fine. It's just when I add the whole comprehensive list of a bunch of other hosts that things stop working. – Turnip Wizard Sep 02 '19 at 23:37
  • OK. Please upload the first 100 lines of said 'hosts' file. Should be enough to see if anything immediate looks odd. – Paul Benson Sep 03 '19 at 14:56
  • Alright, assuming I did it right, the first 100 lines of the uncooperative hosts file should be at https://pastebin.com/z0p2xewy. Everything up through line 30 works fine, I don't know which line past that is the problem but if it's in the first hundred then it'll be in there. – Turnip Wizard Sep 03 '19 at 15:42
  • I see you have a number of 'googleapis' sites blocked. I've found that if you block those you may get problems, particularly one called 'ajax.googleapis.com'. If I block that one, even though many sites open OK, I get some error messages including this site. So I'd hash out all those googleapis sites to start with. – Paul Benson Sep 03 '19 at 17:22
  • Oh, no, those are there on purpose. I want the sites that include those to break (the parts that break are one of the things I want to block). Also, that part is in that first 30 lines that works fine; it's the other lines when I start appending known blacklists to the end of it where it starts breaking. I have been and still am blocking those with no problem, it's the part after "Here is where the rest of that other file gets appended" where I add the other provided domains that it changes to not resolving any DNS entries at all. – Turnip Wizard Sep 03 '19 at 18:38

1 Answer

Okay, I probably should have thought to do this before asking the question, but I realized I should watch my processes while names were failing to resolve with the full /etc/hosts file in place.

Turns out systemd-resolve was just chugging away on my CPU while nothing was resolving, so I looked into that. This led me to a few sites before I got to another Ask Ubuntu post with an answer posted by a user called sena that helped me resolve the issue (yes, I see what I did there).

It turns out that one of the major changes that happened between 2016 and 2018 for Ubuntu was the move from using dnsmasq for DNS stuff on the system to using systemd-resolved for this. Now, for a common user who doesn't futz with /etc/hosts, this is perfectly adequate.

The problem is that systemd-resolved uses a substantially slower method of parsing /etc/hosts. Apparently the reasoning is that the kind of filtering I'm doing belongs on a local DNS server rather than on each machine, so the expectation is that the hosts file is only a dozen lines or so, not the 700,000-ish lines I'm using. I can absolutely appreciate the rationale, and I might go to the trouble of that approach if I used desktops exclusively, but my laptop goes with me all over the place, so a home-only filter is dramatically less useful to me than I would like. Therefore, I needed a better fix.
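
To get a sense of the scale involved, it can help to count how many active entries a hosts file actually has. This is only a minimal sketch, assuming a POSIX shell with grep; `count_hosts_entries` is a hypothetical helper name I made up for illustration, not an existing tool:

```shell
#!/bin/sh
# count_hosts_entries FILE
# Hypothetical helper: prints the number of lines in FILE that are
# neither blank nor comments (first non-space character is '#').
count_hosts_entries() {
    grep -c -v -E '^[[:space:]]*(#|$)' "$1"
}

# Example call (not run here):
#   count_hosts_entries /etc/hosts
```

A stock hosts file should come out at around a dozen entries, which is roughly the size the paragraph above says is expected.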

The summary of the answer in that link (it was very helpful to me that the person who answered copied the information because I was mid-DNS-not-working when I scrolled down to find it) is as follows (thank you sena for a thing that fixed my laptop's DNS):

Here is solution for (X)Ubuntu 18.04 Bionic.

Install dnsmasq

sudo apt install dnsmasq

Disable systemd-resolved listener on port 53 (do not touch /etc/systemd/resolved.conf, because it may be overwritten on upgrade):

$ cat /etc/systemd/resolved.conf.d/noresolved.conf 
[Resolve]
DNSStubListener=no

and restart it

$ sudo systemctl restart systemd-resolved

(alternatively disable it completely by $ sudo systemctl disable systemd-resolved.service )

Delete /etc/resolv.conf and create again. This is important, because resolv.conf is a symbolic link to /run/systemd/resolve/stub-resolv.conf by default. If you will not delete symbolic link, the file will be overwritten by systemd on reboot (even though we disabled systemd-resolved!). Also NetworkManager (NM) checks if it is a symbolic link to detect systemd-resolved configuration.

$ sudo rm /etc/resolv.conf
$ sudo touch /etc/resolv.conf

Disable overwriting of /etc/resolv.conf by NM (there is also an option rc-manager, but it does not work, despite it is described in the NM manual):

$ cat /etc/NetworkManager/conf.d/disableresolv.conf 
[main]
dns=none

and restart it:

$ sudo systemctl restart NetworkManager

Tell dnsmasq to use resolv.conf from NM:

$ cat /etc/dnsmasq.d/nmresolv.conf 
resolv-file=/var/run/NetworkManager/resolv.conf

and restart it:

$ sudo systemctl restart dnsmasq

Use dnsmasq for resolving:

$ cat /etc/resolv.conf 
# Use local dnsmasq for resolving
nameserver 127.0.0.1
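
The quoted steps show each drop-in file with cat but not how to create it. One way to write them is sketched below, assuming a POSIX shell. `write_dns_dropins` and its ROOT argument are hypothetical names added here so the files can be generated in a scratch directory and inspected first; the file names and contents are the ones from the quoted steps.

```shell
#!/bin/sh
# write_dns_dropins ROOT
# Hypothetical helper: creates the three drop-in files from the quoted
# steps beneath ROOT. ROOT is an assumption for safe experimenting;
# passing "" while running as root would write to the real /etc.
write_dns_dropins() {
    root=$1
    mkdir -p "$root/etc/systemd/resolved.conf.d" \
             "$root/etc/NetworkManager/conf.d" \
             "$root/etc/dnsmasq.d" || return 1

    # Turn off the systemd-resolved stub listener on port 53.
    printf '[Resolve]\nDNSStubListener=no\n' \
        > "$root/etc/systemd/resolved.conf.d/noresolved.conf" || return 1

    # Stop NetworkManager from rewriting /etc/resolv.conf.
    printf '[main]\ndns=none\n' \
        > "$root/etc/NetworkManager/conf.d/disableresolv.conf" || return 1

    # Point dnsmasq at the upstream servers NetworkManager records.
    printf 'resolv-file=/var/run/NetworkManager/resolv.conf\n' \
        > "$root/etc/dnsmasq.d/nmresolv.conf" || return 1
}

# Example call (not run here): write into a scratch directory first.
#   write_dns_dropins /tmp/dns-test
```

The services still need restarting as described in the quoted steps; this sketch only writes the files.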

Doing this on my system seems to have moved all of the local DNS handling back to dnsmasq (and I've rebooted, so it appears to be persisting for now). dnsmasq appears to have absolutely no difficulty with hundreds of thousands of entries in /etc/hosts blacklisting known-bad domains, which is what I need while I'm out and about and can't rely on my home DNS solution to protect me.
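
Since the steps warn that /etc/resolv.conf can be turned back into a symbolic link or overwritten, a quick check after a reboot may be worthwhile. A minimal sketch, assuming a POSIX shell with grep; `check_resolv_conf` is a hypothetical helper name, not part of any package:

```shell
#!/bin/sh
# check_resolv_conf FILE
# Hypothetical helper: succeeds only if FILE is a regular file (not a
# symbolic link) and contains a "nameserver 127.0.0.1" line.
check_resolv_conf() {
    file=$1
    if [ -L "$file" ]; then
        echo "$file is a symbolic link" >&2
        return 1
    fi
    if [ ! -f "$file" ]; then
        echo "$file is missing or not a regular file" >&2
        return 1
    fi
    if ! grep -q -E '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.1[[:space:]]*$' "$file"; then
        echo "$file does not point at 127.0.0.1" >&2
        return 1
    fi
    echo "$file looks as expected"
}

# Example call (not run here):
#   check_resolv_conf /etc/resolv.conf
```

This only inspects the file. It does not prove that dnsmasq is the process answering on 127.0.0.1.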

(I don't know the "proper" way to credit someone in markdown-ish format, but again, want to point out that where I found this seems to be posted by sena, who deserves credit for coming up with this solution that I found.)