Blocking Ads & Monitoring External Drives with Raspberry Pi

I've written about how I set up my Raspberry Pi to host Time Machine backups. I took my Pi a bit further and set it up as a local DNS server to block ad-tracking systems and, as part of my digital minimalism kick/obsession, to block distracting websites network-wide on a schedule.

Pi-hole: block ads and trackers on your network

Pi-hole is a neat project: it hosts a local DNS server on your Pi which automatically pulls in a blacklist of domains used by advertisers. The interesting side effect is you can control the blacklist programmatically, enabling you to block distracting websites on a schedule. This is perfect for my digital minimalism toolkit.

  • Pi-hole has an active Discourse forum. I've come to love these project-specific forums instead of everything being centralized on StackOverflow.
  • Really impressed with how simple and well designed the install process is. Run curl -sSL https://install.pi-hole.net | bash and a nice CLI wizard walks you through the setup.
  • You'll need to point DNS resolution to your Pi on your router, but for testing you can manually override your router's settings in the network config on macOS.
  • After you have DNS resolution set up to point to the Pi, you can access the admin via http://pi.hole/admin
  • Upgrade your pi-hole via pihole -up
  • There's also an interesting project which bundles wireguard (vpn) into a docker image: https://github.com/IAmStoxe/wirehole

Automatically blocking distracting websites

Now to automatically block distracting websites! I have a system for aggressively blocking distracting sites on my local machine, but I wanted to extend this network-wide.

First, we'll need two scripts to block and allow websites. Let's call our blocking script block.sh:

#!/bin/bash
blockDomains=(facebook.com www.facebook.com pinterest.com www.pinterest.com amazon.com www.amazon.com smile.amazon.com)

for domain in "${blockDomains[@]}"; do
  pihole -b "$domain"
done

For allow.sh, just switch the pihole command in the above script to include the -d option, which deletes the domain from the blacklist:

pihole -b -d "$domain"
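Putting that together, a complete allow.sh might look like the following sketch. It mirrors block.sh exactly; the command -v guard is my addition so the script fails gracefully on a machine that doesn't have Pi-hole installed:

```shell
#!/bin/bash
# allow.sh: remove the previously blocked domains from the Pi-hole blacklist
allowDomains=(facebook.com www.facebook.com pinterest.com www.pinterest.com amazon.com www.amazon.com smile.amazon.com)

if command -v pihole >/dev/null 2>&1; then
  for domain in "${allowDomains[@]}"; do
    # -b manages the blacklist; adding -d deletes the domain from it
    pihole -b -d "$domain"
  done
else
  echo "pihole not found; run this on the Pi" >&2
fi
```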

You'll need to chmod +x both allow.sh and block.sh. Put the scripts in ~/Documents/. Test them locally via ./allow.sh.

Now we need to add them to cron. Run crontab -e and add these two entries:

0 21 * * * bash -l -c '/home/pi/Documents/block.sh' | logger -p cron.info
0 6 * * * bash -l -c '/home/pi/Documents/allow.sh' | logger -p cron.info

Next, make the following changes to enable a dedicated cron log file and more verbose cron logging:

# uncomment line with #cron
sudo nano /etc/rsyslog.conf

# add EXTRA_OPTS='-L 15'. 15 is the *sum* of the logging options you want to enable.
# I found this syntax very confusing; it wasn't until I read the manpage that I
# realized why my logging levels weren't taking effect.
sudo nano /etc/default/cron

# restart relevant services
sudo service rsyslog restart
sudo service cron restart

# follow the new log file
tail -f /var/log/cron.log
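If you're wondering where 15 comes from: Debian's cron treats the -L value as a bitmask, so you sum the flags you want. Per the cron manpage, the flags are 1 (log job starts), 2 (log job ends), 4 (log failed jobs), and 8 (log each job's process number):

```shell
# cron's -L log level is a bitmask; enabling everything means adding
# all four flags together:
echo $((1 + 2 + 4 + 8))   # prints 15, the value used in EXTRA_OPTS
```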

What's all this extra stuff around our script?

I wanted to see the stdout of my cron jobs in cron.log. Here's how the extra cruft around {block,allow}.sh enables this.

The bash -l -c is important: it ensures that the pi user's environment configuration is loaded, which lets the script find pihole and any other commands it uses. Sourcing a user's environment isn't recommended on a 'real' production system, but it's fine for our home-based Pi project.

By default, the stdout of the script run in your cron definition is not sent to the parent process's stdout. Instead, it's emailed to you (if you don't have email configured on your Pi, it lands in /var/mail/pi). To me, this is insane, but I imagine it's the result of a decision made long ago, one that any seasoned sysadmin has drilled into their memory.

As an aside, it's unfortunate that many ancient decisions made on a whim continue to cost newcomers hours of frustration. Think of all the lost time, and the people who give up learning entirely, because of unneeded barriers to entry in various technologies. Ok, back to the explanation I promised!

To avoid having your cron job's output sent to mail, you need to redirect it. | logger does this for us and sends the stdout to syslog. The -p cron.info argument sets the facility.level of the log message. 'Facility' is a weird word for 'process' or 'log category'; it matters because it maps the log entry to the cron.log file specified in the rsyslog.conf modification we made earlier. In other words, it sets the facility of the log message so syslog can run it through its internal ruleset engine and determine which file the message belongs in. man logger has the nitty-gritty details about how this works.
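If you'd rather skip syslog entirely, the classic alternative is plain shell redirection in the crontab entry itself. A sketch (the log path is arbitrary):

```
0 21 * * * /home/pi/Documents/block.sh >> /home/pi/cron-block.log 2>&1
```

I preferred the logger route, since rsyslog already manages the cron.log file for me.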

How long will it take for these block/allow changes to take effect?

Since Pi-hole uses DNS for the blacklist, the TTL on the DNS entry matters. Luckily, it's very short (2m) by default, which means it takes ~2m for websites to be blocked after the scripts above run on the Pi. You can check the local-ttl value via cat /etc/dnsmasq.d/01-pihole.conf. You can also see the TTL on a specific DNS entry via the first number under the ANSWER SECTION when running dig google.com.

If you want to compare query response times (and the response content!) between your previous DNS server and your Pi-hosted DNS server, you can tell dig which server to use: dig @raspberrypi.local facebook.com. However, something funky is going on with the query response times: @raspberrypi.local takes longer to execute but reports < 3ms query times, while @192.168.1.2 definitely executes more quickly but reports longer query times (~40ms). It would be interesting to understand how dig computes these numbers under the hood, but I'm not interested enough to keep digging.


CUPS: host USB printers on your network through your Pi

I have an old, but trusty, black-and-white Brother HL-L2340D printer. I rarely print stuff, but it's helpful to have a simple printer around when I need one.

The wireless connection on this printer never worked right. My router (Eero) doesn't host printers (which is very frustrating). The silver lining is this gave me an excuse to spend time learning printing on Linux. Here are some notes:

  • CUPS is still the standard Linux print-management software. I remember CUPS from over a decade ago; it's amazing how slowly technology can change sometimes.
  • Install it via: sudo apt-get install cups
  • You need to add pi to the lpadmin group: sudo usermod -a -G lpadmin pi
  • If CUPS is running properly, you can view it locally via https://localhost:631/. You'll need to ignore the invalid-certificate warning in Chrome. Log in with the pi user's credentials.
  • In my case, the default printer drivers included with CUPS didn't work for me. I needed to install a specific driver: sudo apt-get install printer-driver-brlaser
  • To broadcast the printer on the network, install avahi-utils (sudo apt-get install avahi-utils); avahi-browse -a lets you verify what's being broadcast.
    I already had avahi-daemon installed from the drive-hosting setup, which broadcast the printers across the network automatically for me.
  • Restart the relevant services: sudo service cups-browsed restart and sudo service cups restart
  • A couple weeks after I set this up, I tried to print something and it only printed every other page. I did some digging, and it turns out only v4 of the brlaser driver, not v6, is available via the Pi's apt packages.
  • The easiest solution looked to be building brlaser v6 from source. There was a great guide online that walked through this, and it was super easy!
  • However, this didn't fix the issue. After a bit of digging this ended up being a low-level driver issue with printing 'complex' (high resolution?) documents. https://github.com/pdewacht/brlaser/issues/40
  • There was a fix for this on master, so I built the driver from master: wget https://github.com/pdewacht/brlaser/archive/master.tar.gz && tar xf master.tar.gz && cd brlaser-master && cmake . && make && sudo make install
  • After installing the driver from master, everything worked really well.
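Once everything is installed, you can sanity-check CUPS from the shell without opening the web UI. A small sketch (the guard and fallback messages are my additions so it runs anywhere; the commented-out queue name is hypothetical, yours is whatever CUPS assigned):

```shell
# Print the list of printers CUPS knows about, plus the default destination.
check_printers() {
  if command -v lpstat >/dev/null 2>&1; then
    lpstat -p -d || echo "no printers configured yet"
  else
    echo "CUPS tools not installed"
  fi
}

check_printers
# To send a job to a specific queue once one shows up:
# lp -d Brother_HL-L2340D_series somefile.pdf
```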


SMART Drive Monitoring & HD Spindown [TODO move to another blog post]

In one of the blog posts or forum threads I read while setting this up, someone mentioned that hard drives won't spin down automatically on Linux. I wanted to dig into this a little bit; here's what I found:

  • As a quick refresher, run mount to get a list of all mounted devices on your machine. The /dev/s* entries at the end are your hard drives.
  • There's an interesting tool, hdparm, that lets you inspect and set various drive parameters: sudo apt-get install hdparm -y. The Pi OS seems to ship a recent version, which is great. However, I've seen a couple of references to hdparm being useless on newer drives, which manage more and more of these settings at the drive level and don't allow any configuration (which makes sense).
  • Get lots of info on a drive: sudo hdparm -I /dev/sda (replace sda with your device)
  • It sounds like really old drives don't spin down by themselves, but most drives have spin-down/power management support built in. We shouldn't need to worry about drive spindown.
  • The most supported toolset available seems to be smartmontools: sudo apt-get install smartmontools
  • However, the packaged smartmontools is severely out of date (a 2017 version!). You can at least update the drive database: sudo wget https://raw.githubusercontent.com/smartmontools/smartmontools/master/smartmontools/drivedb.h -O /var/lib/smartmontools/drivedb/drivedb.h. Running the latest ARM build isn't recommended because of nuanced differences in the ARM instruction set that could interact badly with the low-level drive commands being used.
  • Run a long test manually: sudo smartctl -t long /dev/sda
  • Inspect lots of info on the drive, including test progress: sudo smartctl --all /dev/sda
  • You can set it up to scan your drives periodically and email you.
    • Uncomment the smartd startup line in /etc/default/smartmontools
    • Set up a per-drive config in /etc/smartd.conf, or use DEVICESCAN to monitor all drives: DEVICESCAN -d removable -n standby -m mike@mikebian.co -M exec /usr/share/smartmontools/smartd-runner. In some cases DEVICESCAN may not pick up all of your drives, but I was able to verify through the logs that it found mine.
    • Restart service sudo service smartmontools restart
    • To test your config, add -M test to the DEVICESCAN line (after the -m option) and run sudo service smartd restart. If everything is working, you'll get a message in sudo cat /var/mail/mail (or via email, if you have that set up).
    • Set up mail via SMTP (covered in 'Setting up mail delivery' below)
    • You can customize the log destination by adding a facility to the smartd options in /etc/default/smartmontools. It's easier to simply tail -f /var/log/syslog | grep smartd
  • This looks like a neat tool to monitor hard drive temperature: sudo apt-get install hddtemp
  • Can't get your test to complete? The drive may be going to sleep. https://superuser.com/questions/766943/smart-test-never-finishes. If your drive still doesn't complete the test, it may be dying.
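For reference, here's roughly what the smartd.conf line from above looks like with the test flag added (the email address is a placeholder; remove -M test once you've confirmed a message arrives):

```
# /etc/smartd.conf
DEVICESCAN -d removable -n standby -m you@example.com -M test -M exec /usr/share/smartmontools/smartd-runner
```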

Setting up mail delivery

Here's how to set up mail delivery from your Pi so everything doesn't get stuck in /var/mail:

sudo apt-get install ssmtp -y

sudo nano /etc/ssmtp/ssmtp.conf

# I have an old dreamhost account with an smtp server setup
# here's the config I used to route mail through that smtp server

# it's critical that hostname matches the host in dreamhost
hostname=yourdreamhostdomain.com
mailhub=smtp.dreamhost.com:465
UseTLS=YES                             # Secure connection (SSL/TLS)
FromLineOverride=YES                   # Force the From: line
AuthUser=email@dreamhost.com
AuthPass=dreamhostpassword
Debug=YES

You can then send a test email with echo "Testing pi delivery" | mail email@domain.com (the mail command comes from a package like mailutils if you don't already have it). With ssmtp configured, mail from local processes is routed through this SMTP server instead of piling up in /var/mail.


Some nifty networking tips & tricks

A grab-bag of interesting networking tricks I ran into while working with the pi:

  • arp -e (or arp -a on macOS) lists the active IP addresses your machine has seen on the network. ARP (Address Resolution Protocol) maps IP addresses to MAC addresses.
  • avahi-browse --all --resolve --terminate provides a more detailed view of local network devices. If you are curious about how .local works (as I was), dig into mdns a bit more.
  • You can configure htop to include network IO meters; press F2 inside htop to customize the layout, and the result is saved as your default in ~/.config/htop/htoprc.
  • dns-sd is an interesting tool for exploring which services are being broadcast on your local machine.
  • Not networking related, but lsusb -t lists out everything connected to your USB ports

Upgrading Raspberry Pi

Your Pi won't upgrade to the latest system version automatically. Here's how to upgrade:

sudo apt update
sudo apt full-upgrade