Blocking Ads & Monitoring External Drives with Raspberry Pi

I've written about how I set up my Raspberry Pi to host Time Machine backups. I took my Pi a bit further and set it up as a local DNS server to block ad tracking systems and, as part of my digital minimalism kick/obsession, to block distracting websites network-wide on a schedule.

Pi-hole: block ads and trackers on your network

Pi-hole is a neat project: it hosts a local DNS server on your Pi which automatically pulls in a blacklist of domains used by advertisers. The interesting side effect is you can control the blacklist programmatically, enabling you to block distracting websites on a schedule. This is perfect for my digital minimalism toolkit.

- Pi-hole has an active Discourse forum. I've come to love these project-specific forums instead of everything being centralized on StackOverflow.
- Really impressed with how simple and well designed the install process is. Run curl -sSL https://install.pi-hole.net | bash and there's a nice CLI wizard that walks you through the process.
- You'll need to point your DNS resolution to your Pi on your router, but you can manually override your router settings in your network config in macOS for testing.
- After you have DNS resolution set up to point to the Pi, you can access the admin via http://pi.hole/admin
- Upgrade your Pi-hole via pihole -up
- There's also an interesting project which bundles WireGuard (VPN) into a Docker image: https://github.com/IAmStoxe/wirehole

Automatically blocking distracting websites

Now to automatically block distracting websites! I have a system for aggressively blocking distracting sites on my local machine, but I wanted to extend this network-wide.

First, we'll need two scripts to block and allow websites. Let's call our blocking script block.sh:

    #!/bin/bash

    blockDomains=(facebook.com www.facebook.com pinterest.com www.pinterest.com amazon.com www.amazon.com smile.amazon.com)

    for domain in "${blockDomains[@]}"; do
      pihole -b "$domain"
    done

For the allow.sh just switch the pihole command in the above script to include the -d option:

pihole -b -d $domain

You'll need to chmod +x both allow.sh and block.sh. Put the scripts in ~/Documents/. Test them locally via ./allow.sh.
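Since the two scripts differ only by the -d flag, one way to keep the domain list in a single place is a small helper that prints the pihole commands for either mode. This is a minimal sketch rather than anything from the Pi-hole docs: the pihole_commands function and the DOMAINS variable are names made up for illustration, and the helper only prints the commands so you can inspect them first.

```shell
# Domains to toggle (shortened version of the list in block.sh).
DOMAINS="facebook.com www.facebook.com pinterest.com www.pinterest.com"

# Print one pihole command per domain.
# "block" adds the domain to the blacklist (-b);
# "allow" removes it again (-b -d), as described above.
pihole_commands() {
  case "$1" in
    block) flags="-b" ;;
    allow) flags="-b -d" ;;
    *) echo "usage: pihole_commands block|allow" >&2; return 1 ;;
  esac
  for domain in $DOMAINS; do
    echo "pihole $flags $domain"
  done
}
```

Once the printed commands look right, piping them to a shell (pihole_commands block | sh) would actually run them on the Pi.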

Now we need to add them to cron. Run crontab -e and add these two entries:

    0 21 * * * bash -l -c '/home/pi/Documents/block.sh' | logger -p cron.info
    0 6 * * * bash -l -c '/home/pi/Documents/allow.sh' | logger -p cron.info

Next, make the following changes to enable a dedicated cron log file and more verbose cron logging:

    # uncomment line with #cron
    sudo nano /etc/rsyslog.conf

    # add EXTRA_OPTS='-L 15'. 15 is the *sum* of the logging options that you want to enable
    # I found this syntax very confusing and it wasn't until I read the manpage that I realized
    # why my logging levels were not taking effect.
    sudo nano /etc/default/cron

    # restart relevant services
    sudo service rsyslog restart
    sudo service cron restart

    # follow the new log file
    tail -f /var/log/cron.log

What's all this extra stuff around our script?

I wanted to see the stdout of my cron jobs in cron.log. Here's why the extra cruft around {block,allow}.sh enables this.

The bash -l -c is important: it ensures that the pi user's env configuration is used, which ensures the script can find pihole and other commands you might use in the script. Sourcing the user's environment is not recommended for a 'real' production system, but it's ok for our home-based pi project.

By default, the stdout of the script run in your cron definition is not sent to the parent process's stdout. Instead, it's emailed to you (if you don't have email configured on your pi it will land in /var/mail/pi). To me, this is insane, but I imagine this is the result of a decision made long ago and any seasoned sysadmin has this drilled into his memory.

As an aside, it is unfortunate that many ancient decisions made on a whim continue to cause wasted hours and lots of frustration to newcomers. Think of all of the lost time, or people who give up continuing to learn, because of the unneeded barriers to entry in various technologies. Ok, back to the explanation I promised!

In order to avoid having your cron job output sent to mail you need to redirect the output. | logger does this for us and sends the stdout to syslog. The -p cron.info argument sets the facility.level of the log message. Facility is a weird word used for 'process' or 'log category' and is important because it maps the log entry to the cron.log file specified in the rsyslog.conf modification we made earlier. In other words, it sets the facility of the log message so syslog can run it through its internal ruleset engine to determine which file it should go in. man logger has more nitty-gritty details about how this works.

How long will it take for these block/allow changes to take effect?

Since pi-hole uses DNS for the blacklist, the TTL on the DNS entry matters. Luckily, it's very very short (2m) by default. This means that it will take ~2m for websites to be blocked after the scripts above run on the Pi. You can check the local-ttl value by cat /etc/dnsmasq.d/01-pihole.conf. You can also see the TTL value on a specific DNS entry via the first number under the ANSWER SECTION response when running dig google.com.
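To check the TTL from a script rather than eyeballing the ANSWER SECTION, something like the following should work. It is a rough sketch: answer_ttls is a hypothetical helper name, and it assumes the usual dig answer layout of name, TTL, class, type, data.

```shell
# Read dig answer lines on stdin and print the TTL column.
# Answer lines look like: <name> <ttl> IN <type> <data>
# Lines whose third field is not "IN" (comments, blanks) are ignored.
answer_ttls() {
  awk '$3 == "IN" { print $2 }'
}
```

It can be fed with dig's answer-only output, for example dig +noall +answer facebook.com | answer_ttls.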

If you want to test query response times (and the response content!) between your previous DNS server and your pi-hosted DNS server you can specify a DNS server to use: dig @raspberrypi.local facebook.com. However, something funky is going on with the query response times: @raspberrypi.local takes longer to execute and reports < 3ms query times, while @192.168.1.2 definitely executes more quickly but reports longer query times (~40ms). Would be interesting to understand how dig is reporting these numbers under the hood, but I'm not interested enough to keep digging.
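If you want to compare the reported query times side by side, the number can be pulled out of dig's footer. A small sketch, assuming the standard ";; Query time: N msec" line in dig's output; query_time_ms is a made-up helper name.

```shell
# Read full dig output on stdin and print the reported query time
# in milliseconds, taken from the ";; Query time: N msec" footer line.
query_time_ms() {
  awk '/Query time:/ { print $4 }'
}
```

For example, dig @192.168.1.2 facebook.com | query_time_ms versus dig @raspberrypi.local facebook.com | query_time_ms. Wrapping each call in time would also show the wall-clock duration, which is where the .local name lookup presumably adds its delay.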

Resources:

- https://discourse.pi-hole.net/t/second-level-blacklist-triggered-on-a-schedule/23715/17
- https://discourse.pi-hole.net/t/change-the-ttl/6903
- https://www.raspberrypi.org/forums/viewtopic.php?t=186833
- https://raspberrypi.stackexchange.com/questions/3741/where-do-cron-error-message-go
- https://serverfault.com/questions/137468/better-logging-for-cronjobs-send-cron-output-to-syslog
- https://www.rsyslog.com/doc/master/concepts/multi_ruleset.html

CUPS: host USB printers on your network through your Pi

I have an old, but trusty, black and white Brother HL-L2340D printer. I rarely print stuff, but it's helpful to have a simple printer around when I need to.

The wireless connection on this printer never worked right. My router (Eero) doesn't host printers (which is very frustrating). The silver lining is this gave me an excuse to spend time learning printing on Linux. Here are some notes:

- CUPS is still the standard Linux print management software. I remember CUPS from over a decade ago. Amazing how slowly technology can change sometimes.
- Install it via: sudo apt-get install cups
- You need to add pi to the lpadmin group: sudo usermod -a -G lpadmin pi
- If CUPS is running properly you can view it locally via: https://localhost:631/. You'll need to ignore invalid certificates on Chrome. Use your login info for the pi user to log in.
- In my case, the default printer drivers included with CUPS didn't work for me. I needed to install a specific driver: sudo apt-get install printer-driver-brlaser
- To broadcast the printer on the network you'll need avahi: sudo apt-get install avahi-utils, then avahi-browse -a to list what is being advertised. I already had avahi-daemon installed for the drive hosting stuff I did. This broadcasted the printers across the network automatically for me.
- Then restart the services: sudo service cups-browsed restart and sudo service cups restart
- A couple weeks after I set this up, I tried to print something and it only printed every other page. I did some digging and it looks like v4 of the brlaser driver, not v6, is available via the pi's apt-get packages.
- Easiest solution looked to be to build brlaser v6 from source. There was a great guide online (linked below) that walked through this. It was super easy! However, this didn't fix the issue.
- After a bit of digging this ended up being a low-level driver issue with printing 'complex' (high resolution?) documents. https://github.com/pdewacht/brlaser/issues/40
- There was a fix for this on master, so I built the driver from master:

    wget https://github.com/pdewacht/brlaser/archive/master.tar.gz && tar xf master.tar.gz && cd brlaser-master && cmake . && make && sudo make install

- After installing the driver from master, everything worked really well.

Resources:

- https://blog.za3k.com/printing-on-the-brother-hl-2270dw-printer-using-a-raspberry-pi/
- https://www.openprinting.org/printer/Brother/Brother-HL-L2360D_series
- https://www.howtogeek.com/169679/how-to-add-a-printer-to-your-raspberry-pi-or-other-linux-computer/
- https://www.linuxbabe.com/ubuntu/set-up-cups-print-server-ubuntu-bonjour-ipp-samba-airprint
- https://support.brother.com/g/b/faqlist.aspx?c=us&lang=en&prod=hll2360dw_us&ftype3=100257

SMART Drive Monitoring & HD Spindown [TODO move to another blog post]

In one of the blog posts or forum threads, they mentioned that hard drives won't spin down automatically on linux. I wanted to dig into this a little bit, here's what I found:

- As a quick refresher, run mount to get a list of all devices on your machine. /dev/s* entries at the end are your hard drives.
- There's an interesting tool that allows you to inspect and set various parameters/functions: sudo apt-get install hdparm -y. The Pi OS seems to have a recent version of this, which is great. However, I've seen a couple of references to hdparm being useless on newer drives which manage more and more of the settings on the drive-level and don't allow any configuration (which makes sense).
- Get lots of info on a drive: sudo hdparm -I /dev/sda (replace sda with your device reference)
- It sounds like really old drives don't spin down by themselves, but most drives have spin-down/power management support built in. We shouldn't need to worry about drive spindown.
- The most supported toolset available seems to be smartmontools: sudo apt-get install smartmontools
- However, the packaged smartmontools is severely out of date (a 2017 version!). You can at least update the drive database using this command: sudo wget https://raw.githubusercontent.com/smartmontools/smartmontools/master/smartmontools/drivedb.h -O /var/lib/smartmontools/drivedb/drivedb.h. It's not recommended to run the latest ARM version because of nuanced differences in the ARM execution set that could interact badly with the lower level drive commands being used.
- Run a long test manually: sudo smartctl -t long /dev/sda
- Inspect lots of info on the drive, including test progress: sudo smartctl --all /dev/sda
- You can set it up to scan your drives periodically and email you.
- Uncomment smartd startup here: /etc/default/smartmontools
- Set up per-drive config in /etc/smartd.conf or use DEVICESCAN to monitor all drives:

    DEVICESCAN -d removable -n standby -m mike@mikebian.co -M exec /usr/share/smartmontools/smartd-runner

- In some cases, DEVICESCAN may not pick up all of the drives, but in my case I was able to verify through the logs that it did.
- Restart the service: sudo service smartmontools restart
- To test your config, add -M test right after DEVICESCAN and sudo service smartd restart. If everything is working, you'll get a message in sudo cat /var/mail/mail (or via email if you have that set up).
- Set up mail via SMTP (see the next section).
- You can customize the log destination by adding a facility to the smartd options in /etc/default/smartmontools. It's easier to simply tail -f /var/log/syslog | grep smartd
- This looks like a neat tool to monitor hard drive temperature: sudo apt-get install hddtemp
- Can't get your test to complete? The drive may be going to sleep. https://superuser.com/questions/766943/smart-test-never-finishes. If your drive still doesn't complete the test, it may be dying.

Setting up mail delivery

Here's how to set up mail delivery from your pi so everything doesn't get stuck in /var/mail:

    sudo apt-get install ssmtp -y
    sudo nano /etc/ssmtp/ssmtp.conf

    # I have an old dreamhost account with an smtp server setup
    # here's the config I used to route mail through that smtp server
    # it's critical that hostname matches the host in dreamhost
    hostname=yourdreamhostdomain.com
    mailhub=smtp.dreamhost.com:465
    # Secure connection (SSL/TLS)
    UseTLS=YES
    # Force the From: line
    FromLineOverride=YES
    AuthUser=email@dreamhost.com
    AuthPass=dreamhostpassword
    Debug=YES

You can then send a test email with echo "Testing pi delivery" | mail email@domain.com. Email sent to /var/mail will automatically be routed through this SMTP server.

Resources:

- https://www.raspberrypi.org/forums/viewtopic.php?t=188462
- https://brismuth.com/scheduling-automated-storage-health-checks-d470b4283e3e
- https://www.lisenet.com/2014/using-smartctl-smartd-and-hddtemp-on-debian/
- https://help.ubuntu.com/community/Smartmontools

Some nifty networking tips & tricks

A grab-bag of interesting networking tricks I ran into while working with the pi:

- arp -e (or arp -a on macOS) to scan the network for active IP addresses. ARP = Address Resolution Protocol and maps IP addresses and .local domains to MAC addresses.
- avahi-browse --all --resolve --terminate provides a more detailed view of local network devices.
- If you are curious about how .local works (as I was), dig into mdns a bit more.
- You can configure htop to include network IO. Here's how to set the defaults.
- dns-sd is an interesting tool to explore what services are being broadcasted on your local machine.
- Not networking related, but lsusb -t lists out everything connected to your USB ports.

Upgrading Raspberry Pi

Your Pi won't upgrade to the latest system version automatically. Here's how to upgrade:

    sudo apt update
    sudo apt full-upgrade

How to Block Distracting Websites on Your Laptop

"What exactly did I do the last 30 minutes?"

I'm sure you've been there, asking that same question, staring blankly into your computer screen.

I've written about how I'm working to minimize distraction. For me, a big component of that is blocking distraction on the device I spend the most time on: my laptop.

Here's what I'm looking to do:

- Automatically block distracting websites, but allow an easy way to temporarily unblock them. Example: I want to block Amazon by default, but sometimes I want to jump on and buy something quickly.
- I don't want to have to manage a schedule. Creating exceptions to schedules and then remembering to re-enable the schedule never works well.
- I don't want crappy software that is going to slow down my computer or cause weird networking issues.
- I want it to be hard, but not impossible, to disable. One or two clicks to disable is too easy.

The Easy Way

For most folks, you'll want to use one of the couple apps out there that do this for you. Here are some that I've tried:

- RescueTime
- Focus
- Freedom
- Cold Turkey

Focus is the best option I've found. It's a simple and nicely designed app. Check it out!

The Hard Way

If you like tinkering with your system setup, read on.

The pre-built applications always seemed to do strange things to the networking stack on my computer or hog lots of resources (GBs of memory in some cases). This is probably due to how much I customize my computer.

Also, I found that if I disabled my "blocking schedule" it didn't automatically re-enable. I would then find myself down the Twitter rabbit-hole with 20m wasted. That was a big issue for me.

Eventually, I got frustrated and built a solution which works surprisingly well:

- Maintain a simple file listing every host that is distracting.
- Run a script every time the computer wakes up. I used sleepwatcher for this.
- The script consumes a list of distracting hosts and adds them to /etc/hosts with a reference to a non-existent server. After trying a couple of tools, a node package hostile worked best.

1. Build a List of Distracting Websites

First, create a simple text file. Back it up on Gist or somewhere where it won't be lost. Version tracking the file allows you to view a history of what websites are distracting over time.

I keep this file in my dotfiles repo. Here's what it looks like:

    facebook.com
    twitter.com
    smile.amazon.com

(yes, that's Amazon Smile since I have a browser extension to redirect me there)

Then you'll want to clean the file, add www variants of each host, and point them to 127.0.0.1:

sed '/^$/d' ./distracting_websites.txt | sed $'s/\(.*\)/127.0.0.1 \\1\\\n127.0.0.1 www.\\1/' > ~/distracting_sites.txt

I put this script in the setup process of my dotfiles for easy ad-hoc execution (you'll want to continually update your distracting_websites.txt as new things distract you).
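The sed one-liner above is hard to read, so here is a roughly equivalent sketch in awk that may be easier to tweak. The hosts_entries function name is made up for illustration; like the sed version, it skips blank lines and emits both the bare host and a www variant pointed at 127.0.0.1.

```shell
# Read bare hostnames on stdin (one per line, blank lines allowed) and
# print /etc/hosts-style entries for each host and its www variant.
# NF is zero for empty lines, so they are skipped.
hosts_entries() {
  awk 'NF { print "127.0.0.1 " $1; print "127.0.0.1 www." $1 }'
}
```

It would be used the same way as the sed pipeline: hosts_entries < ./distracting_websites.txt > ~/distracting_sites.txt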

2. Block all Distracting Websites with a Script

Below is a script that is run every time I wake my computer. Here's what it does:

- Updates /etc/hosts using hostile and the distracting_sites.txt file
- Clears system DNS cache
- Clears Safari cache, which seems to have its own DNS cache. Chrome does not.

    # asdf is a node version management tool I use. Your exact execution paths will probably be different
    /Users/mike/.asdf/installs/nodejs/12.14.1/bin/node \
      /Users/mike/.asdf/installs/nodejs/12.14.1/.npm/bin/hostile \
      load /Users/mike/distracting_sites.txt

    # clear system cache
    # https://apple.stackexchange.com/questions/303110/flush-cache-of-dns-on-macos-sierra-high-sierra/303119#303119
    sudo killall -HUP mDNSResponder

    # clear safari cache
    osascript << EOF
    tell application "Safari"
      activate
    end tell
    tell application "System Events"
      tell process "Safari"
        tell menu bar 1 to tell menu bar item "Develop" to tell menu 1 to tell menu item "Empty Caches" to click
      end tell
    end tell
    EOF

You can test this script by running it with sudo:

    sudo /usr/local/sbin/sleepwatcher --verbose --wakeup .wakeup

3. Run Website Blocking Script When your Computer Wakes from Sleep

First, install sleepwatcher:

brew install sleepwatcher

Then, you'll want to find the location of the plist file which starts up sleepwatcher as a daemon process:

    $ brew services
    Name         Status  User Plist
    sleepwatcher started root /Library/LaunchDaemons/homebrew.mxcl.sleepwatcher.plist

You'll want to edit this plist:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>homebrew.mxcl.sleepwatcher</string>
      <key>ProgramArguments</key>
      <array>
        <string>/usr/local/sbin/sleepwatcher</string>
        <string>-V</string>
        <string>-w /Users/mike/.wakeup</string>
      </array>
      <key>RunAtLoad</key>
      <true/>
      <key>KeepAlive</key>
      <true/>
      <key>StandardOutPath</key>
      <string>/usr/local/var/log/sleepwatcher.log</string>
      <key>StandardErrorPath</key>
      <string>/usr/local/var/log/sleepwatcher.log</string>
    </dict>
    </plist>

Then, you'll want to ensure the process runs as root and the script you created is executable:

    chmod +x /Users/mike/.wakeup
    brew services stop sleepwatcher
    sudo brew services start sleepwatcher

And... you're done! Depending on your OS configuration you may need to grant some permissions on first run.

Was this overkill? Definitely. Does it prevent me from wasting any time on distracting websites? Absolutely.

How I Broke My Phone Addiction

The launch of Neuralink started a conversation across the web about the "merge". The day when you can plug your brain into a computer and communicate with it through your thoughts. No keyboard, mouse, or touch screen. Something out of a sci-fi film.

I think Sam Altman has an interesting take:

I believe the merge has already started, and we are a few years in. Our phones control us and tell us what to do when; social media feeds determine how we feel; search engines decide what we think.

This resonated very strongly with me. My phone does control me to a certain extent and I feel uncomfortable if I hop in the car without it.

I've been on a kick this year of being intentional about how I use technology. A big part of that is my phone. It's the most distracting—and the most useful—thing I own.

"What's wrong with your phone? Is it broken?" has been a common refrain when I show a friend a photo or map on my phone. Friends often complain about how slow I am to respond to texts. My phone isn't broken and I know how to use the messages app, but I have taken some "extreme" measures to disconnect from my phone.

Below is the list of things I've done to disconnect from my phone. They work. Not that I'm close to perfect, but I can easily leave my phone in another room now and forget to check it for nearly a day. That's huge.

If they seem extreme, I encourage you to try one or two and see what happens.

Turn off Text Notifications

Turn off all your text notifications. Settings > Notifications > Messages > Allow Notifications.

Yes, it's hard and annoying for a week. But man, it is amazing not to hear or see that ding from incoming messages. Once you get over the week or two of withdrawal you'll love it and never be able to go back. I've had my text notifications disabled for months at this point and it is the single biggest change you can make on your phone.

"Communication for work comes through text and I'm expected to respond instantly!"

You win. This won't work for you. Here are two ideas for you:

- Can you block texts from non-work numbers?
- If your work will pay for a separate phone, you could set up that phone with text notifications and disable them on your device.

"What if you are meeting someone in person and need to communicate in real-time?"

This happens to me often. Just open up your message app.

"What if you miss an emergency text from your spouse/friend/whatever?"

Tell your spouse and close friends you've disabled text notifications. If they need you right away, they can call you.

1. Setup an Incoming Call Whitelist

The idea here is only allowing calls from people you know. Any other calls can go to voicemail. If it's important, they'll leave a message and you can listen to it later.

Here's how to setup this whitelist:

- Navigate to Settings > Do Not Disturb.
- Set up a schedule and make it nearly the whole day.
- Allow Calls From: Favorites (or whatever list includes the people you need to be responsive to). This is your whitelist of people you want to hear from.
- Install Hiya. I've found this to be a helpful tool for identifying spam callers when reviewing my missed calls and voicemail.

"I may receive a callback from a customer service department or other unknown numbers"

Just disable do not disturb. Because you've setup a schedule, it'll automatically be enabled the next day.

2. Enable Grayscale Mode

This was a trick I pulled from Make Time. Makes your phone less addicting, but no less functional:

Settings > General > Accessibility > Display Accommodations > Color Filters. Switch Color Filters on and select Grayscale.

3. Remove Distracting Applications

Remove any apps which you find yourself pull-to-refresh'ing. Some examples:

- Social: Twitter, Facebook, Pinterest, whatever
- Dedicated news apps
- Email. Truth be told, I still have this on my phone for work communication.

Move any apps that you need to keep, but are distracting, off the home screen.

4. Block Nearly All Notifications

Go through every app on your phone (Settings > Notifications) and turn off notifications. Think hard about the couple apps you really need notifications from and enable them.

Here's my list:

- Google Calendar
- FaceTime
- Phone
- Airline apps
- Google Maps
- Scooter apps, Lyft, Uber
- DMs on Slack within working hours
- Todoist

5. Block Distracting Websites

It's not intuitive but you can block distracting websites on chrome/safari on your phone.

Navigate to 'Screen Time > Content & Privacy Restrictions > Content Restrictions > Web Content > Limit Adult Websites' and enter in all distracting sites under "Never allow".

For example, here's some of my list:

    facebook.com
    twitter.com
    news.ycombinator.com
    recode.net
    theverge.com
    techcrunch.com
    producthunt.com
    quora.com
    reddit.com

Setup a monthly reminder on your todo list to add any new distracting news sites that you've started looking at on your phone.

Other Tips & Tricks

If you've made it this far, I challenge you to stick with the setting changes for two weeks. That's about how long it took for me to stop being annoyed by the changes.

Below are some other configuration tips & tricks I've written down over the years for when I get a new iPhone.

Other Misc Tips & Tricks

- Move mail off of the home screen
- General > Display & Brightness > Night Shift. Enable, 9am-7pm
- General > Accessibility > Home Button > Reset Finger to Open
- Delete default apps I'll never use: Home, Books, iTunes Store, Watch, Tips, TV, Apple Mail, News, Stocks.
- Messages: Enable send as SMS, disable send read receipts, enable text message forwarding.
- iCloud: disable photos (use Amazon photos instead), enable contacts, disable calendar, enable Messages, disable Stocks, enable iCloud backup, disable Keychain (use 1Password instead)
- Phone > Call Blocking & Identification: Hiya Spam & Block
- Enable password autofill for 1Password. Password & Accounts > Autofill Passwords > 1Password. Disable keychain passwords.
- Amazon Music: download some music you like, enable automatic downloads of offline music, and disable cellular streaming.
- Download offline Google Maps for your local area.

Automatic Backup Configuration

- If you have Amazon Prime, install Prime Photos and use it to back up all of your photos.
- Settings > iCloud > Storage > Manage Storage > Backups > Disable Photo Backup. 5gb is not enough room for anything, and Amazon gives you unlimited photo storage for free. Plus, the additional storage tiers (if you take a lot of videos on your iPhone) are really cheap.
- Settings > iCloud > Photos > Disable Photo Stream.
- Manually initiate a backup to ensure everything goes smoothly. Without photos + videos, your iPhone backup should be able to fit into the 5gb default iCloud storage.

Warranty & Documentation

Some notes on warranty replacement:

- An IMEI number is a unique identifier for your phone. Document this number in a 1Password note. Settings > General > About > IMEI
- Your ICCID number is the unique identifier for your SIM card. Document this as well.
- If something is going wrong with your iPhone, try backing up the phone to iTunes and then doing a fresh restore. If that doesn't work it's a hardware issue. Try this before wasting your time with Apple/your cell provider.
- If you get an "Invalid SIM" error when switching cell phone providers or phones, your IMEI and ICCID numbers may not be "paired". You can often pair these numbers yourself through the settings area of your cell provider. The support reps often do not check this or understand it fully.

Learning Clojure by Automating an RSS Reader

I've been working on revamping how I consume information. Most of my information consumption has been moved to RSS feeds, but I can't keep up with the number of articles in my feeds. When I take a look at my reader I tend to get overwhelmed and spend more time than I'd like to trying to "catch up" on information I generally was consuming out of curiosity.

Not good.

I want articles to be automatically marked as read after they are a month old to eliminate the feeling of being "behind". This is a perfect little project to learn a programming language that's looked interesting for a while!

Building a small project in a new language or technology is the best way to learn. While I was building this tool, I documented what questions I was asking, answers to these questions, and what articles and resources I found helpful.

Posts like this have been interesting to me, hopefully this is a fun read for others!

What do I want to build?

I want to build a Clojure script for FeedBin that will:

- Inspect all unread articles
- If the publish date is more than two weeks in the past, mark the article as read
- Automatically run every day

Let's get started!

Resources

Here are some helpful blogs & tutorials I used while learning:

- http://slipset.github.io/posts/Why-Clojure-is-my-favourite-language
- https://ltriant.github.io/2019/08/13/clojure-learning-functional-design.html
- https://learnxinyminutes.com/docs/clojure/
- https://eli.thegreenplace.net/2017/notes-on-debugging-clojure-code/
- https://clojure.org/guides/getting_started

Also, I always try to grab a couple of large open-source repos to look at when I'm learning a new language. Here are some places I searched:

- https://github.com/trending/clojure
- https://clojars.org
- http://open-source.braveclojure.com

Some repos I found interesting:

- https://github.com/metabase/metabase This is probably the largest full-blown open-source Clojure application out there. Most other projects I found were libraries, not applications.
- https://github.com/LightTable/LightTable
- https://github.com/clojars/clojars-web
- https://github.com/dakrone/clj-http

Syntax & Structure

Now that I have some browser tabs open with documentation, let's start learning!

- How do I install this thing? https://clojure.org/guides/getting_started => brew install clojure/tools/clojure
- Going through the "Learn X in Y" guide, some interesting takeaways:
  - Clojure is built on the JVM and uses Java classes for things like arrays.
  - Code in Clojure is essentially a list-of-lists. A list is how you execute code: the first element is the method name, and then arguments separated by spaces. This feels very weird at first, but it's a really powerful concept. Simple made Easy explains the philosophy behind this a bit.
  - "Quoting" (prefacing a list with a single quote) prevents the list from executing. This is helpful for defining a list, or for passing code as a data structure that can be mutated later on.
  - Sequences (Arrays/Lists) seem to have some important different properties from vectors. I need to understand this a bit more.
  - When you define a function it doesn't get a name. You need to assign it (def) to a variable to give it a name.
  - The [] in a function definition is the list of arguments.
  - There are lots of ways to create functions: fn, defn, def ... #()
  - multi-variadic function is a new word for me! It's a function with a variable number of arguments. Looks like you can define different execution paths depending on the arguments, kind-of like Elixir's pattern matching.
  - [& args] is equivalent to (*args) in ruby
  - The beginner (me!) can treat ArrayMap and HashMap as the same.
  - Keywords == ruby symbols
  - The language looks to execute from the inside out, and the composition of functions is done via spaces, not commas, parens, etc.
  - Looks like everything is immutable in Clojure.
  - Everything is a function. So much so, that even basic control flow is managed the same way as a standard function.
  - Looks like "STM" is an escape hatch if you need to store state. Similar to Elixir's process state.
- The Clojure community is big on "repl driven development", but what exactly do they mean? How is that different from binding.pry in a ruby process to play around with code?
  - Looks like it's not that different. Some nice editor integrations make things a bit more clean, but more or less the same as opening up rails console with pry enabled.
- I've always disliked the ability to alias a module or function to a custom name. It makes it much harder for newcomers to the codebase to navigate what is going on. Looks like this is a pretty common pattern in Clojure: the require at the top of a file can set up custom aliases for all functions.
- "forms" have been mentioned a couple of times, but I still don't get it. What is a form?
- I've heard that Clojure is a Lisp. What is a "lisp"?
  - https://en.wikipedia.org/wiki/Lisp_(programming_language)
  - There was an original LISP programming language, but "a lisp" is a language patterned after the original LISP.
  - Seems like the unique property of a lisp-style language is that code is essentially a linked list data structure. Since all code is a data structure, you can define really interesting macros to modify your source code.
  - Another property is the parentheses-based syntax.
  - It's interesting to look at the different lisp styles available. I feel like the only lisp that is popular today is Clojure.
  - Sounds like immutability is unique to Clojure and isn't a core structure in other lisps.

I think I know just enough to start coding.

Coding in Clojure

Here's the learning process which generated the final source code:

Let's define the namespace and get a "Hello World" to make sure I have the runtime executing locally without an issue. 184408626bb41b87d53f9b0bb5485a8e9201d8d5
Ok, now let's outline the logic we'll need to implement. 7e018b05ff8ad925ef2bfe9c56c4a702dce4c3d0
Now, let's pick an HTTP library and figure out how to add it as a dependency.
https://clojars.org looks like the most popular package repository. It doesn't seem like there's any download/popularity indicator that you can sort by. Bummer. Hard to figure out what sort of HTTP library I should use.
Looks like project.clj is a gemspec type definition file.
Metabase's http library is clj-http. Let's use that. We'll also need to figure out how to setup this dependency file. https://github.com/metabase/metabase/blob/master/project.clj#L63
https://github.com/technomancy/leiningen is linked in the project.clj files I've seen. It's listed as a dependency manager on the clj-http library: https://clojars.org/clj-http. Let's install it via brew install leiningen.
lein new feedbin and mv ./feedbin/* ./ to setup the project structure. Looks like lein will help us with dependencies and deployment. b0b4022618abac840af6679f900584d04de510c1
There's this skip-aot thing in the :main definition which I don't understand. In any case, if I stuff a defn -main in the file for the namespace defined in :main, lein run works! 764d7a1e2a537d61b036df4229a2c96671725dd8
It looks like this ^: syntax is used often. What is it?
Ok, let's copy our logic outline from the other file we were working on over to src/feedbin/core.clj and try to add our HTTP dependency.
Added [clj-http "3.10.0"] to the dependency list in project.clj. lein run seemed to pull down a bunch of files and run successfully.
Now, let's pull the FeedBin variable from the ENV and store it to a var. Looks like you have to wrap let in parens, and include commands that rely on the var within the scope of the parens. I could see how this would force you to keep methods relatively short. 6f1f8099ffd0ed5f997be93685d18d1c574efb6b
Let's hit the API and get all unread entries and store them in a var. Looks like cheshire is a popular JSON decoder, let's use that.
It looks like let is only for when you want temporary bindings within a specific scope. Otherwise, you should use def to setup a variable. 5b63cd289052d9fcebec2cb2965d598927b0616a
Convention is - for word separation, not _ or camel case.
Let's refactor the getenv to use def. Much better! a6a95a1e4703c07e76ecce32b56b6b0f1903acca
Time to select entries that are two months old. A debugger is going to be helpful here to poke at the API responses. Looks like debugger is the pry equivalent. I had trouble getting this to work and deep-dived on this a bit:
(pst) displays the stacktrace associated with the last exception. This is not dependent on clj-debugger.
Looking closer at clj-debugger, it has ~no documentation and hasn't been updated in nearly two years. Is there a better option? Doesn't look like it.
(require 'feedbin.core :reload-all) seems like the best way to hot reload the code in a repl. Then you can rerun (feedbin.core/-main).
Ah, figured it out! (break) on its own won't do anything. It needs an input to debug. (break true) works. You need to run this in lein repl for it to work.
As a side note, I've found the REPL/debugging aspect of learning a new programming language to be really important. Languages that don't have great tooling and accessible documentation around this make it much harder for newcomers to come up to speed. The REPL feedback loop is just so much faster, and in developer tooling speed matters.
I was able to extract the published date, now I just need to do some date comparison to figure out which entries are over a month old. ca16f54f66a39753933168c3f8deac636144ca47
Now to mark the entries as "read" (in feedbin this is deleting the entries). Should be able to just iterate through the ID list and POST to the delete endpoint.
I started running into rate limiting errors as I was testing this.
# turns a string into a regex, but appears to do much more. Looks like it's a shorthand for creating a lambda. https://clojure.org/guides/weird_characters
macroexpansion is an interesting command to help with debugging.
With the rate limit errors gone, I can finally get this working for good. I tried passing in the article IDs as a comma-separated list in a query string and it didn't work. I need to send this data in as a JSON blob. 166ea49439ed690ff08c8fd987530b170b9bb80e
Got the delete call working. You can pass a hash directly to clj-http and it'll convert it into JSON. Nice. 63ac8bf1d4fd969326fffa9ad7b50ad1f0a4b56d
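
The def vs let distinction and the date comparison step can be sketched roughly as below. This is not the code from the commits above: the FEEDBIN_PASSWORD variable name, the function names, and the sample entries are all assumptions for illustration, and the HTTP call is left out so the snippet runs without any dependencies. It assumes each entry is a map with an :id and an ISO 8601 :published string.

```clojure
(import '(java.time Instant Duration))

;; def creates a namespace-level var. FEEDBIN_PASSWORD is a hypothetical
;; variable name; the value is nil when the variable is not set.
(def feedbin-password (System/getenv "FEEDBIN_PASSWORD"))

(defn older-than?
  "True when the entry's :published timestamp is more than `days` days before `now`."
  [now days entry]
  ;; let bindings only exist inside this form
  (let [published (Instant/parse (:published entry))
        cutoff    (.minus now (Duration/ofDays days))]
    (.isBefore published cutoff)))

(defn stale-entry-ids
  "IDs of all entries older than `days` days, relative to `now`."
  [now days entries]
  (->> entries
       (filter #(older-than? now days %))
       (map :id)))

;; Hypothetical sample data standing in for the decoded API response.
(def sample-entries
  [{:id 1 :published "2020-01-01T00:00:00.000000Z"}
   {:id 2 :published "2020-03-20T12:30:00.000000Z"}])

(def stale
  (stale-entry-ids (Instant/parse "2020-04-01T00:00:00Z") 30 sample-entries))
```

Passing now in as an argument (instead of calling (Instant/now) inside the function) keeps the comparison easy to poke at from the REPL with fixed dates.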

Great! We have the script working. Now, let's deploy it.

Clojure Deployment Using AWS Serverless

I have a friend who is constantly talking about how awesome serverless is (i.e. AWS Lambda). I also remember hearing that you can setup cron-like jobs in AWS that hit a lambda. Let's see if that's the case and if we can get this script working on lambda.

Some things we'll need to figure out:

How/where do I specify that an endpoint should be hit every X hours?
How do I specify where the entrypoint is for the lambda function?
How do we specify environment variables?

Notes

I jumped into the AWS lambda dashboard and created a function named "Mark-Feedbin-Entries-As-Read" with Java 11. It looks like the crazy AWS permission structure is generated for me.
I added the com.amazonaws/aws-lambda-java-core package and it looks like I need to run gen-class to expose my handler. What is gen-class? It generates a .class file when compiling, which I vaguely remember is a file which is bundled into the .jar executable.
Looks like aot compilation needs to be enabled as well. Still need to understand what aot is.
I ran lein uberjar and specified feedbin.core::handler as my handler. Created a test event with "testing" as the input. Used the -standalone jar version that was generated.
Looks like environment variables can be setup directly in the Lambda GUI.
"Cron jobs" are setup via CloudWatch events. What is CloudWatch? It's AWS's monitoring stack. Strange that this is the recommended way to setup cron jobs. I would have thought there was a dedicated service for recurring job schedules.
"Serverless" (looks like a CDK-like YML configuration syntax for AWS serverless) makes it look easy to deploy a lambda which executes on a schedule, but the blog post doesn't indicate how it's actually managed in AWS.
Aside: It's interesting that the more you dig into AWS, the more it feels like a programming language. Each of the services is a library and the interface to configure them is yaml.
It looks like "Amazon EventBridge" is the new "CloudWatch Events". Looks like we can setup a rule which triggers a lambda function at a particular rate.
Neat, you can setup a rule directly within the AWS Lambda GUI. Use an EventBridge trigger with rate(1 day) to trigger the function every day. Really easy!
I checked on it the next day and it's failing. How can we inspect the request? It's probably failing due to the input data being some sort of JSON object vs the simple string that I tested with.
Here's what I found: you can inspect the logs, use CloudTrail to view an event, enable X-Ray tracing, and send failed events to a dead letter queue. I enabled all of this stuff: my end goal is to inspect the event JSON passed to the lambda to determine how to fix it.
Ah! After a bit more digging, if you find the event in CloudTrail there's a "View event" button that will give you the JSON output. I can then copy the JSON into the test event in the configuration for the lambda and run it there to get helpful debugging information. Feels a bit primitive, but it works.
I wonder how you would run the function locally and write integration tests using example AWS JSON?
Looks like the function signature for my handler is incorrect. When handling events, the handler accepts two arguments [Object com.amazonaws.services.lambda.runtime.Context]. This fixed the issue! 8520e8a319bd5d41a67a01f9517ce4cf559ab381
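
For reference, a two-argument handler exposed through gen-class might look roughly like the sketch below. This is not the code from the commit above: the String return type and the mark-old-entries-as-read placeholder are assumptions, and building the real jar still needs aot compilation plus aws-lambda-java-core on the classpath. Outside of compilation gen-class does nothing, so the functions can still be loaded and called from a REPL.

```clojure
(ns feedbin.core
  (:gen-class
   ;; Expose a static Java method: handler(Object, Context).
   ;; The String return type is an assumption for this sketch.
   :methods [^:static [handler [Object com.amazonaws.services.lambda.runtime.Context] String]]))

(defn mark-old-entries-as-read
  "Hypothetical placeholder for the script's real work."
  []
  "done")

;; The leading - is how gen-class maps this function onto the handler method.
;; The scheduled event is not a plain string, so the first argument is typed
;; as Object; the second argument is the Lambda Context.
(defn -handler [event context]
  (mark-old-entries-as-read))
```

With this shape, the handler setting in the Lambda configuration stays feedbin.core::handler, and the event JSON copied out of CloudTrail can be used as the test event.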

Resources:

https://bernhardwenzel.com/articles/using-clojure-with-aws-lambda/
https://aws.amazon.com/blogs/compute/clojure/
https://thenewstack.io/move-your-cron-jobs-to-serverless-in-3-steps/
https://serverless.com/blog/cron-jobs-on-aws/
https://docs.aws.amazon.com/lambda/latest/dg/with-scheduledevents-example-use-app-spec.html
https://lumigo.io/blog/eventbridge-vs-cloudwatch-events-kinesis-and-sns/
https://docs.aws.amazon.com/eventbridge/latest/userguide/run-lambda-schedule.html
https://d0nkrs.com/post/building-aws-lambda-functions-with-clojure
https://github.com/aws/aws-cdk
https://github.com/jebberjeb/lambda-sample

Open Questions

Here's a list of questions that I wasn't able to answer during my learning process:

How can you parallelize operations in Clojure?
How easy is deployment?
How does interop with Java work?
Is there a rails-like web stack?
Is there a style guide?


Reclaiming Your Mind: Creating an Information Diet

There are a lot of areas of my life that I've been 'auditing', attempting to tweak the habits that have intentionally or accidentally fallen into place. One of these is my information diet: how I find, consume, and process information. I've been tracking the time I spend reading and on the internet, and I'm not liking the trend. I've felt more addicted to information this year and I want to eliminate that feeling. Revamping my information intake is one way I'm going to do that. It's worth thinking about why it's worth spending time consuming information, how I consume information, and how I want to change my information consumption.

Categories

Stories. I've been almost exclusively consuming non-fiction for the last decade and rarely read any fiction. At the suggestion of my ever-wise wife and the promptings of a great podcast on story, I've reprioritized fiction as something worth spending time on. Great stories can change our perspective on our life and increase our creative thinking.
Curiosity & Exploration. Investing in discovering new and interesting ideas has always paid off for me. For me, this generally looks like browsing community sites like Hacker News, reading a random newsletter, or following interesting people on Twitter. Learning about random, interesting topics has always been really enjoyable for me: it sparks creativity (and joy) and is useful later.
Work. Learning specific to a work-related problem.
Entertainment & Social. Twitter, Facebook, news, etc. When you look back at your time spent here you always feel like it was a waste. I've been convinced that keeping up with news is largely a waste of time (you'll hear anything worth knowing about through friends), and time spent communicating with friends over social is better spent with friends in person. This doesn't mean that there isn't a place for this category, but for me it means I need to bias towards eliminating any time spent on this category.
Infrequent Personal-ish Updates. There's a group of organizations or people I follow that I want to keep tabs on, but who don't send emails often. A friend running a non-profit, bands announcing a new album, etc.
Deals/Promotions.
Transaction/Service Emails.

Mediums

Podcasts.
Books.
Blogs.
Community News. Hacker News, Lobsters, Reddit, Product Hunt.
News Sites. Google News, Bloomberg, TechCrunch, etc.
Social. Twitter, Facebook.
Movies. Netflix, Amazon Prime, YouTube.
Email.
Personal communication. Texts, voicemails, etc.

Thoughts on Consumption.

Continually improve the system. Set aside time every month to quickly audit what I'm consuming and what tools I'm using.
Make consumption a choice rather than a reaction. Right now, I randomly visit Hacker News or see an article come through my email. Instead, I want to centralize the information in one category into a single place that I can go.
Optimize for pull vs push consumption. A great example here is email newsletters. They are push, not pull, and are often messy to read and pile up in my inbox. I want to separate "conversations with people" from "updates from companies/interesting news".
Categorizing information is critical. Email newsletters can't be categorized easily. I want to put feeds into separate buckets that I can prioritize and triage separately.
I should use RSS again. Way back when, I read everything via NetNewsWire. Email newsletters took over seemingly overnight and I forgot that RSS existed. Most sites I care about still support RSS (even if it's not advertised explicitly).
Use an RSS Reader. Specifically, one supported by paid subscriptions. Free is great, but most free things (without a huge market) die or have negative externalities over time. I don't want to have to mess with this part of my toolkit much and deal with a killed product. Paid subscriptions mean it's a real business that will continue to improve over time.
Limit consumption. I want to enforce a limit on the number of things I'm consuming. I wonder if there is a way to automatically reset the read count of various feeds so it doesn't look like there are too many articles to read when I use a reader.
Prefer books over articles. For most business/technology problems, blogs and Q&A sites are the main source of data, but work aside, books are generally higher-quality information compared to blogs. The time it took someone to create the content is a good indicator of the quality. Books > Blogs > Twitter. (This gets a bit tricky with low-cost kindle books: skip these.)
Optimize for highest impact & quality information at the beginning of the day. This means reading books and long-form articles at the beginning of the day while my mind is clear, instead of consuming blogs, tweets, texts, etc.
Treat books like a blog archive. I really like this concept, can't remember where I first heard it. Reading books from cover to cover doesn't make a ton of sense, although it's definitely how I'm trained to read books. Skimming through a chapter (or skipping it entirely) if you find it boring or too verbose shouldn't feel 'wrong'. If the writer can't keep your attention, that's their fault. Additionally, books are generally longer than they need to be in order to hit page quotas.
Don't switch contexts. If you are reading a book, don't stop and read a blog article. Cultivating sustained laser-focus attention on a single thing is critically important. I've found this to be more challenging as the years go by, and it's something I need to be even more intentional about.
Focus on managing written internet media. I don't over-consume podcasts or books. I struggle most with interesting, distracting news sources like Hacker News or

What I'm going to change

Limit number of news feeds to 30. I suspect this number will change as I continue to slowly improve how I'm processing information, but this is a good start.
Convert email newsletters to RSS. Most newsletters (like Ruby Weekly) have an RSS feed. For those that don't, FeedBin has a service to convert email newsletters into a feed, and I imagine there are standalone services that will do this for you automatically.
Mass-unsubscribe from email newsletters. I've been using unroll.me for years (I don't love the privacy component, but it's a useful tool). It looks like their unsubscribe option will actually click through the unsubscribe links for me. I should go through my daily Unroll.me summary and remove newsletters I'm not interested in, and convert the others into a feed.
Setup two aliases email+updates@gmail.com and email+promotions@gmail.com. Forward all updates to FeedBin and auto-archive. Auto-archive & tag all promotions.
Subscribe to weekly summaries on community news sites.
Mass unfollow everyone on Twitter, and limit the people I follow to 50.
Update my website blocking strategy, including blocking of all news & social sites. More on this in a separate post.
Stop using Apple Podcasts. I find it hard to keep things organized and Apple seems to randomly reverse the listing of certain podcasts. I should trim which podcasts I subscribe to and find another podcast application. Categories: engineering, tech
Review all gmail filters.
Review and trim all YouTube subscriptions.
Review all Twitter app connections.
Review any compromised passwords via 1Password.

After a bunch of investigation, I settled on using FeedBin.

https://reederapp.com/mac/#faq doesn't look like it is updated often.
https://github.com/mausba/rssheap went open source and hasn't been touched in over a year.
https://github.com/getstream/winds podcast and RSS reader, open-source, commercially supported and recently updated.
https://www.inoreader.com. Standalone paid product, has an API, doesn't look too complex. 64 feeds for free. Free and paid tiers.
Feedly.com is the most popular, but looks to be overdone.
http://newsblur.com. Standalone paid product. Not updated frequently. Doesn't look like a great design. Free and paid tiers. I found this reader most commonly referenced by HN and Lobsters.
https://readkitapp.com ties into various services to create a great reading experience.
https://www.goldenhillsoftware.com/unread/ another macOS reader.
https://yoleoreader.com web-based reader with a low-cost paid subscription.
https://feedbin.com well-designed feed reader. Supports podcasts and receiving email newsletters via a special email address. Also has a Twitter reader. Bootstrapped business. Also open source, very cool. https://github.com/feedbin/feedbin
https://github.com/ViennaRSS/vienna-rss Pete Cooper is involved in this one. Open source. Looks like a zombie product.
https://www.nooshub.com new reader from HN with some fancy "AI" grouping.
https://ranchero.com/netnewswire/
https://apps.apple.com/app/leaf/id576338668 Leaf. Looks dead. Hasn't been updated in two years.

Other interesting finds:

https://superfeedr.com RSS feed API
https://throttlehq.com
https://news.ycombinator.com/item?id=20167143 and https://news.ycombinator.com/item?id=19909102
