Learning Swift Development for macOS by Building a Website Blocker

I loved Focus App. It blocked websites and apps on a schedule. But years ago it started glitching out: sucking up tons of RAM and freezing my computer. The bug was never fixed, so I abandoned it and switched to a host-based blocking system, which has served me well.

However, there are some issues with the host-based approach:

- I can’t block specific URLs, only hosts (Focus App couldn’t do this either)
- I can’t set a schedule
- I can’t block apps
- If I remove a host, the change won’t take effect unless I sleep and wake the computer
- Sleepwatcher (the CLI tool) is dead and requires some manual setup to get working

My goal is to layer on top of the existing host-based system that has been working great and add another layer of focus tooling:

- CLI-first tool
- Allow configuration to be easily set using a JSON file
- Allow different blocking configurations to be scheduled
- Replace sleepwatcher by configuring script execution on wake
- Add a ‘first wake of the day’ trigger that I can tie into clean browsers and todoist scheduler
- Allow both hosts and partial-match URLs to be blocked. ‘Partial match’ means (a) anchors are excluded and (b) the configured block URL must only be a subset of the URL in the browser in order to be blocked. This enables things like blocking news or shopping searches on Google.
- Support blocking URLs in Google Chrome and Safari
- No UI; maybe build a simple REST API that could be tied into my beloved Raycast
- Run the CLI tool as privileged (in order to mutate /etc/hosts)

With a clear goal in mind for this learning project, I was able to get started and build this out. Here are the two repos with the resulting code:

- hyper-focus CLI source code
- hyper-focus GUI via Raycast extension

I haven’t touched macOS development in years and hadn’t done any Swift development before. Below are my notes from learning Swift and macOS development.

Swift Language

- The guard statement is explicitly used to return early. It’s like unless in Ruby with some special scoping properties. More info. Specifically, guard is useful for unwrapping an optional and assigning the unwrapped value to a variable that can be used in the outer scope (example below).
- There’s an official package manager, but it requires that you (a) have a Package.swift and (b) use a specific source code structure, both of which are a pain for a simple utility. I found later on that it’s better to just set up your application using a Package.swift, even if it’s small. You’ll end up needing a community package, and the swift CLI tooling is nice.
- There’s a built-in JSON decoder, but it requires you to describe the incoming JSON payload as a struct. This makes sense since Swift is strictly typed, but it makes fiddling with data structures a PITA (example below).
- There’s no built-in logging library with levels. There’s an open-source package out there, but not having this included in the stdlib is crazy to me. Here’s a < 50 line implementation of a simple stdout log.
- @objc exposes the Swift function/class to the Objective-C side of the world. You don’t have to worry too much about this: the compiler will warn you and enforce that you put these attributes in the right places.
- You can extend existing classes via extension String and add whatever methods you’d like onto them. I’m surprised by this for what seems an otherwise very structured language, but it’s a great compromise.
- One of the people who worked on the Swift language also built Rust. I don’t know Rust (it’s on my learning list!) but from what I’ve heard—and the adoption it’s gotten across the new CLI tooling that has been emerging—it’s an amazing language. Probably part of the reason Swift seems so well-designed.
- There don’t seem to be union types in Swift. You have to define an enum and then unwrap the enum using a switch statement. This seems insane to me and makes for very ugly code; I must be missing something here.
- You can nest struct definitions, which is nice.
- You can’t add a trailing comma to arrays or dicts, which drives me nuts. It makes it harder to refactor code and adds mental overhead to editing anything. It’s puzzling to me why more languages don’t allow this (one of the things I love about Ruby).
- You can typecast an object to a specific type with as! SafariWindow. I imagine, since Swift is strongly typed, this has some limitations and compile errors, but I didn’t bother to learn what they are.
- You only need an import to pull in a framework, not individual files. All files in the project are automatically compiled. Anything marked with public is available to everything in the project. This seems to indicate otherwise; still some more investigation needed here.
- Argument order matters even when using keyword arguments. Bummer.
- Crash reports are still nearly useless. They have a stack trace, but no line numbers. You need to convert the crash report into a usable stack trace, which requires a symbol-mapping file (dSYM) generated at the same time as the binary that produced the crash report. PLCrashReporter does a lot of this for you, but for a simple single-file Swift script this is a massive pain. There are no stack traces on the command line, even in debug mode.
- ! asserts that the optional is not nil. If it is, your app will crash.
- as? returns nil if the cast fails, and you can combine it with ?? to fall back to a default value (example below).
- Method overloads exist, so you can define a method multiple times with different params. I really like this pattern; I wish Swift had method guards like Elixir (one of my favorite things about Elixir).
- You have to explicitly indicate that a func could throw an exception with throws in the method signature. This is interesting; I think I like it, since it makes the design of the function more explicit.
- The empty dictionary literal is [:], and you can type a dictionary with Any values via varName: [String: Any]. I think Swift dictionaries are the same as an NSDictionary under the hood.
- dispatchMain() is not the same as RunLoop.main.run(), despite what some blog articles say.
- let is like const in JavaScript; var is roughly equivalent to JavaScript’s let.
- Multiple let statements in an if can be separated by a comma. If any of the let statements results in a nil value, the if statement fails (example below). I don’t understand the value of this syntax over &&; I don’t like this language design choice.
- There are some magic variables. For instance, if you are in a catch block, the error variable represents the exception. If you have a global function named error, it is shadowed by the local error variable.
- I didn’t read up on Swift’s memory management strategy, but my assumption is that if a var isn’t referenced any longer (i.e. out of scope) it’s deallocated (Swift uses automatic reference counting rather than a tracing garbage collector, but the effect here is the same). The footgun: if a class subscribes to a notification (NSWorkspace.shared.notificationCenter.addObserver) but is not assigned to a var that persists after the caller completes (i.e. a class or global variable), the object will be deallocated, you’ll never receive that notification, and no error will be thrown (example below). However, if a function creates a Task which creates its own run loop, that task will continue to run even after the caller that created the Task has completed. I would imagine this is a bad design pattern.
- This also applies to other systems which receive ‘notifications’. I use this word very vaguely because I don’t understand macOS subsystems very well/at all. It seems like there are Grand Central Dispatch queues, which feel similar to an SQS queue, and those seem to be impacted as well. Any async pub/sub-type interface would be affected by the subscriber being deallocated, and you will not receive an error. It puzzles me why errors are not thrown.
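To make a few of these notes concrete, here are some minimal sketches. The names and data are mine for illustration, not from hyper-focus. First, guard-based early returns, as? with ?? for defaults, and comma-separated if let unwrapping:

import Foundation

func hostname(fromConfig config: [String: Any]) -> String? {
    // guard unwraps the optional and exposes `host` to the rest of the function
    guard let host = config["host"] as? String else {
        return nil
    }
    return host
}

let config: [String: Any] = ["host": "news.ycombinator.com", "port": 443]

// `as?` returns nil on a failed cast; `??` supplies the default
let defaultedPort = config["port"] as? Int ?? 80

// multiple `let` bindings separated by commas: if any binding is nil, the whole `if` fails
if let host = config["host"] as? String, let port = config["port"] as? Int {
    print("\(host):\(port)")
}

Second, the struct-based JSON decoding: the incoming payload has to be described as a type up front (this schema is a made-up stand-in for a real config file):

import Foundation

struct ScheduleEntry: Codable {
    let name: String
    let blockedHosts: [String]
}

let json = #"{"name": "work", "blockedHosts": ["twitter.com"]}"#
let entry = try JSONDecoder().decode(ScheduleEntry.self, from: Data(json.utf8))
print(entry.name) // "work"

Finally, the observer deallocation footgun. The notification APIs here are real AppKit calls; the class is contrived:

import AppKit

class WakeWatcher {
    init() {
        NSWorkspace.shared.notificationCenter.addObserver(
            self,
            selector: #selector(didWake),
            name: NSWorkspace.didWakeNotification,
            object: nil
        )
    }

    @objc func didWake(_ notification: Notification) {
        print("woke from sleep")
    }
}

// if `watcher` were a local variable inside a function, it would be deallocated when
// the function returns and the notification would silently never arrive; keeping it
// at the top level (or on a long-lived object) keeps the observer alive
let watcher = WakeWatcher()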

Hosting a localhost server

This is simple as long as you bind to a local IP: localhost, 127.0.0.1, etc. If you bind to your router-assigned IP address you’ll run into all sorts of permissioning issues:

- The default permissioning is different depending on which macOS version you are on. Here’s an example of how to check an application’s default permissioning.
- You cannot change your entitlements/permissions if you are just building a simple binary or CLI app. You need an app with an Info.plist to set the proper security config, because of the new security protections Apple has introduced. This means you need to use Xcode to set up and build your application. I couldn’t find any good examples of an app that is built without using Xcode. The alternative is another layer of indirection, like tuist. This brought back memories of everything I hated about desktop application development.
- Don’t bind to the device IP (i.e. the wifi- or ethernet-assigned address) unless you need to. Bind to localhost so the server is only accessible on the device (see the sketch below).
- Swift server package options: https://criollo.io, https://github.com/httpswift/swifter, https://github.com/Building42/Telegraph, https://github.com/envoy/Ambassador
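As a reference point, here’s a minimal sketch of binding a TCP listener to localhost only, using the Network framework. This illustrates the localhost-only advice; it’s not necessarily how hyper-focus serves its API, and the port is an arbitrary choice:

import Dispatch
import Network

// restrict the listener to 127.0.0.1 so it is only reachable from this device
let parameters = NWParameters.tcp
parameters.requiredLocalEndpoint = NWEndpoint.hostPort(host: "127.0.0.1", port: 8080)

let listener = try NWListener(using: parameters)
listener.newConnectionHandler = { connection in
    connection.start(queue: .main)
}
listener.start(queue: .main)

// keep the process alive to service connections
dispatchMain()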

Packaging

Not using a Package.swift for anything even slightly complex will bring a world of pain:

- The VS Code tooling doesn’t work as well (no error highlights and LSP stuff)
- You can’t use a package manager, and therefore can’t easily pull in community packages
- Anything that uses swift build doesn’t work

You’ll want to use a Package.swift in your project. Generating a Package.swift is pretty easy:

swift package init --type executable

When running swift build I ran into:

no such module 'PackageDescription'

This post describes the issue, and the following command fixed it for me:

sudo xcode-select --reset

If you run into compilation errors because some features are not available on older macOS versions, you’ll need to add a platform requirement to your Package.swift:

platforms: [ .macOS(.v13) ],
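For reference, a minimal executable manifest with the platform requirement in place looks roughly like this. The package name and commented-out dependency are placeholders, not the actual hyper-focus manifest:

// swift-tools-version:5.7
import PackageDescription

let package = Package(
    name: "my-cli-tool", // placeholder name
    platforms: [
        .macOS(.v13) // minimum macOS version for the APIs you use
    ],
    dependencies: [
        // community packages go here, e.g.:
        // .package(url: "https://github.com/apple/swift-argument-parser", from: "1.2.0"),
    ],
    targets: [
        .executableTarget(name: "my-cli-tool", path: "Sources")
    ]
)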

Here’s an example Package.swift for the CLI tool.

Cleaning All Caches

I ran into a very weird build error:

❯ swift run
Building for debugging...
Build complete! (0.25s)
dyld[21481]: Symbol not found: (_$s10Foundation11JSONDecoderC6decode_4fromxxm_AA4DataVtKSeRzlFTj)
  Referenced from: '/Users/mike/Projects/focus-app/.build/x86_64-apple-macosx/debug/focus-app'
  Expected in: '/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation'
[1]    21481 abort      swift run

Even after resetting the project to a state where I knew it compiled, it still errored out. After walking away for a while, I found this post and tried updating the minimum macOS version. It magically fixed the issue.

Here’s what I used to clear all build caches:

rm -Rf .build/
rm Package.resolved
rm -Rf ~/Library/Developer/Xcode/DerivedData
rm -Rf /Users/mike/Library/Caches/org.swift.swiftpm

Open Questions

- Is there a way to open a repl with your application’s code imported? It was nice that a compiled language had a decent repl, but ideally I want to open a repl and be able to import/use my application’s code.
- How is the debugger? I just did caveman debugging for this project and didn’t bother understanding the GUI debug tooling.
- It’s unclear how good the package ecosystem is. It seems better than my Cocoa days, but there weren’t that many options and the package activity seems pretty dead.
- It doesn’t seem like you can build a .app without an Xcode project. This is annoying, especially if you are building a small tool and don’t want to learn and understand the Xcode toolchain (it still seems terrible). I wonder if I’m missing something here and if there’s some good tooling to support a CLI-based application build.
- I was surprised at how many errors were not reported. If you’ve subscribed an object as an observer to a notification center and the object is deallocated, that should give you an error. It seems like there were a good number of silent failures, which made it harder to discover unexpected failures, especially for someone who is not a desktop developer. I wonder if there are some env flags that change this behavior.
- I never understood/learned exactly what the @ does in Swift. It looks like a JS/Python decorator, but it’s unclear if all of the annotations are owned by Swift or if developers can write their own.
- Where is the documentation for all of the magic variables? i.e. error in a catch block?

Open Source

- https://github.com/Ranchero-Software/NetNewsWire
- https://github.com/rxhanson/Rectangle (has automated some of the release process)
- https://github.com/exelban/stats
- https://github.com/kean/PulsePro
- https://github.com/piemonte/Player
- https://github.com/cirruslabs/tart
- https://github.com/signalapp/Signal-iOS
- https://github.com/onevcat/Rainbow
- https://github.com/Sequel-Ace/Sequel-Ace
- https://github.com/HedvigInsurance/ugglan
- https://github.com/lvillani/chai
- https://github.com/halo/LinkLiar

Thoughts on Swift

Swift is a really nice language. I like how it is strongly typed, but the typing system is good at inferring types when it can, so you don’t have to specify that many types. The type inference seems very good—better than TypeScript, Sorbet, and Python from what I can tell.

I don’t like that there are no explicit imports, and that anything marked as public can clutter the global namespace. I hate this about Ruby, and it’s something I think Python gets very right. I wish there were explicit imports, and that any package-level functions were forced to be called with their package name. I can understand how this would get very messy with the objc stuff, but that could have been special-cased in some way.

Some of the objc interface stuff is strange, but I think the language designers did a very good job of dealing with it in a simple way.

The tooling isn’t bad, but there are some strange gaps in the stdlib, largely because of the legacy Cocoa infrastructure you can leverage. I found this annoying: there’s no simple logger, there’s no built-in YAML parser, etc. The Cocoa APIs have a lot of legacy decisions to deal with and are generally a pain to use. I wish the stdlib were more expansive and designed without thinking about the legacy APIs too much.

The package manager requires you to build your application in a specific way, which is annoying, but if you follow the golden path things work in a pretty clean way. It’s nice that there is an official package manager that Apple is committed to maintaining.

After writing something simple in Swift, I found myself wishing JavaScript were Swift. It feels like JavaScript in many ways, but it has fewer footguns and is simpler. The language designers did a great job, and it felt fun to work in.

Continue Reading

My Experience With GitHub Codespaces

I have an older Intel MacBook (2016, 2.9GHz) that I use for personal projects. My corporate machine is an M1 MacBook Pro and I love it, but I’ve been holding off on replacing my personal machine until the M2 Pro comes out (hopefully soon!).

I love playing with new technology, especially developer tools, so when I got accepted to the codespace beta I couldn’t resist tinkering with it: it was a chance to speed up development on my ancient MacBook, try some new tech, and set myself up to learn more ML/AI tooling in the future.

Summary

I largely agree with this analysis.

Codespaces are very cool. They work better than I expected—it felt like I was developing on a local machine. Given how expensive the sticker pricing is, I don’t get why you wouldn’t just buy a more powerful local machine in a corporate setting (codespaces are free for open source work). I can’t see devs being OK with a Chromebook vs a MacBook Pro, so the cost savings aren’t there (i.e. buy a cheaper machine and put the savings into a rented codespace).

You could run a similar dockerized setup locally on the MacBook if you wanted to normalize the dev environment (which is a big benefit, esp in larger orgs). I think this is one of the biggest benefits of codespaces—completely documenting and normalizing your development environment so it’s portable across machines.

Notes

Here are some notes & thoughts on my experience with codespaces:

- A codespace is essentially a docker image running on a VM in the cloud, wired up to your local VS Code installation in a way that makes the experience feel like you aren’t using a remote machine.
- Amazingly, code, gh pr view --web, etc. all work (i.e. open a local browser) and integrate with macOS. They’ve done a decent job integrating codespaces into the native experience, so you forget you are working on a remote machine. If you are curious, this is done by a magic environment variable: BROWSER=/vscode/bin/linux-x64/e7f30e38c5a4efafeec8ad52861eb772a9ee4dfb/bin/helpers/browser.sh
- Add Development Container Configuration is the command you need to run to autogenerate the default .devcontainer/ config for your codespace.
- Your dotfiles are magically cloned to /workspaces/.codespaces/.persistedshare/dotfiles
- File system changes are not instantly updated in the file explorer. There is a slight delay, which is frustrating.
- It looks like there is a reference that has emerged after the initial beta. Lots of examples/open source code still reference some of the old stuff, so you’ll have to be careful not to cargo-cult everything if you want to build things in the latest style that will be resilient to changes.
- /workspaces/.codespaces/shared/.env has a bunch of tokens and context about the environment.
- You can have multiple windows/editors against multiple folders: clone additional folders to /workspaces and then run code . when cd’d into that folder.
- Terminal state is not restored when a codespace is paused.
- Codespace logs are persisted to /workspaces/.codespaces/.persistedshare/EnvironmentLogbackup.txt. You can also access them via the CLI: gh codespace logs
- Some of the utilities used to communicate with your local installation of VS Code are located in ~/.vscode-remote/bin/[unique sha]/bin/. It’s interesting to poke around and understand how client communication works.
- /workspaces/.codespaces/shared/.env-secrets contains GitHub credentials and other important secrets.
- CODESPACE_VSCODE_FOLDER is not set up in /etc/profile.d; it’s injected into the environment via VS Code extension JavaScript. Therefore, this variable is not available during postCreateCommand execution.
- If you’ve used remote SSH development, much of the magic that makes that work is used in a codespace. There’s a hidden .vscode folder installed on the remote machine and some binaries which run there to make VS Code work properly.

Load order

I couldn’t find clear documentation on the load order: when your code gets copied to the container, when the VS Code tools start up on the machine, etc. See https://containers.dev/implementors/spec/ for the general devcontainer specification, but it’s not too helpful. Here’s the order I pieced together (a config sketch follows the list):

1. Dockerfile. Your application code does not exist and features are not installed.
2. Features (like brew). Each feature is effectively a bundle of shell scripts that are executed serially. Application code does not exist at this point.
3. Post install. The Dockerfile is built, features are installed, application code exists, dotfiles are not installed.
4. Dotfiles. At this step (and all previous steps), code (the VS Code CLI) has not yet been installed.
5. Sometime after this, the code binary is installed and some of the daemon-like processes that run on the remote machine are started up. From what I can tell, there’s no single-run lifecycle hook that you can use at this stage.
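For orientation, here’s a sketch of where those steps are configured in .devcontainer/devcontainer.json (the feature and command are placeholder examples, not a recommendation):

{
  // step 1: the Dockerfile build happens before your application code exists
  "build": { "dockerfile": "Dockerfile" },

  // step 2: features are bundles of shell scripts run serially after the build
  "features": {
    "ghcr.io/devcontainers/features/common-utils:2": {}
  },

  // step 3: runs once your application code exists (dotfiles are not installed yet)
  "postCreateCommand": "./script/setup.sh"
}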

ASDF: Version Manager for Everything

I really like asdf conceptually: one version manager to rule them all. Consistent versions and installation methods across machines and languages. Simple and beautiful. I’ve been using it for years on Elixir, Ruby, JavaScript, and Python projects and have had a great experience.

The devcontainer image examples had a completely different runtime for each major language. What if you use multiple languages? What if your environment is more custom?

I thought it would make sense to try to use asdf across all projects, as opposed to language-specific builds.

Some notes:

- If you install asdf via homebrew, it puts the installation files in /home/linuxbrew/.linuxbrew/Cellar/asdf/0.10.2/libexec/asdf.sh. Many tools, including ElixirLS, assume that the full installation exists in ~/.asdf. This caused issues on the codespace: it seems the shell script that starts ElixirLS was not using the default shell and did not source the standard environment variables. I’m guessing that, depending on how the extension is built, it does not properly run in a shell that sources your normal environment (see the workaround sketch below).
- I ran into weird issues with pyright: poetry run pyright . returned zero errors, while running pyright . inside of poetry shell triggered a lot of errors relating to missing imports (related issue).
- Erlang uses devcontainers and asdf, which is a good place to look for examples.
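If you hit the ~/.asdf assumption, one workaround is making sure your shell sources asdf from the brew prefix. A sketch based on brew’s own post-install caveat (adjust for your shell):

# source asdf from the homebrew install location instead of ~/.asdf
echo -e "\n. $(brew --prefix asdf)/libexec/asdf.sh" >> ~/.zshrc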

Here’s the image I ended up building and it’s been working great across a couple of projects.

Docker Compose & Docker-in-Docker

Using docker compose (to run postgres, redis, etc) is super helpful but is not straightforward. Here’s how I got it working:

- You can specify a docker-compose.yml file to be used in your devcontainer.json. This seems like a great idea until you realize that you can’t manage the other services that are started through the compose definition at all. You are "trapped" inside your application container and cannot inspect or manage the other processes.
- Most of the documentation and content out there recommends using dockerComposeFile in your devcontainer.json. This is not the best way. The more flexible approach is to install docker inside a single container (see the sketch below). This requires a bit more setup, specifically passing additional flags to the parent docker container in order to be able to run docker.
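Here’s roughly what that looks like in devcontainer.json using the docker-in-docker feature. This is a sketch; as I understand it, the packaged feature takes care of the extra flags the parent container needs:

{
  "features": {
    // runs a docker daemon inside the devcontainer so `docker` and
    // `docker compose` work from the integrated terminal
    "ghcr.io/devcontainers/features/docker-in-docker:2": {}
  }
}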

Dotfiles transformation

My dotfiles are very well documented, but they were not ready for codespaces. I needed to do some work to separate the macOS-specific stuff from the cross-platform compatible tools.

- Here’s a great guide on how to get your dotfiles set up.
- Thankfully, brew works on Linux and has a really easy integration with codespaces. This made my life easier since my dotfiles are built around brew.
- Pull out packages that are system-agnostic and stick them in a Brewfile. Here’s mine.
- Create an install script specifically for codespaces. Here’s what mine looks like.

VS Code Extensions

Extensions from Settings Sync are not installed automatically. You have to specify which extensions you want installed on the codespace through a separate configuration: github.codespaces.defaultExtensions.

Homebrew Installation Failure

Homebrew installation can fail due to old packages (or old apt-get state, not sure which) installed on the image. If you use a raw base image for your codespace, you need to run apt-get update in order for the homebrew install to work properly.

Another alternative is using the dev- variant of many of the base images (here’s an example).

GPG Signing

It looks like the codespace machine calls some sort of GH API to power the GPG signing. If you have a .gitconfig in your dotfiles, it will overwrite the custom settings GitHub creates when generating the codespace machine. You’ll run into errors writing commits in this scenario.

Here’s what you need to do to fix the issue:

git config --global credential.helper /.codespaces/bin/gitcredential_github.sh
git config --global gpg.program /.codespaces/bin/gh-gpgsign

You’ll also want to ensure that GPG signing is enabled for the repository you are working in. If it’s not, you’ll get the following error:

error: gpg failed to sign the data
fatal: failed to write commit object

You can ensure you’ve allowed GPG access by going to your codespace settings and looking at the "GPG Verification" header.

As an aside, this was an interesting post detailing how to debug git & gpg errors.

Awk and Other Tools

The version of awk on some of the base machines seems old or significantly different from the macOS version. It wouldn’t even respond to awk --version. I installed the latest version via homebrew, and it fixed an issue I was having with git fuzzy log where "no commit found on line" would be displayed when viewing the commit history.

I imagine other packages are old or have strange versions installed too. If you run into issues with tooling in your dotfiles that work locally, try updating underlying packages.

Shell Snippets

Here are some useful shell commands to make integrating codespaces with your local dev environment simpler.

# gh cli does not provide an easy way to pull the codespace machine name when inside a repo
targetMachine=$(gh codespace list --repo iloveitaly/$(gh repo view --json name | jq -r ".name") --json name | jq -r '.[0].name')

# copy files from local to remote machine. Note that `$RepositoryName` is a magic variable that is substituted by the gh cli
gh codespace cp -e -c $targetMachine ./local_file 'remote:/workspaces/$RepositoryName/remote_file'

# create a new codespace for the current repo in the pwd
gh alias set cs-create --shell 'gh cs create --repo $(gh repo view --json nameWithOwner | jq -r .nameWithOwner)'

Unsupported CLI Tooling

Here are some gotchas I ran into with my tooling:

- zsh-notify: the macOS popup when a command completes won’t work anymore.
- pbcopy/pbpaste don’t work in the terminal.
- You lose all of your existing shell history. There are some neat tools out there to sync shell history across machines, which might be a way to fix this.

Open Questions

- Is there more control available for codespaces generated by a pull request? Ideally, you could have a script that would run to generate sample data, spin up a web server, etc., and make that web server available to the public internet in some secure way. I think Vercel does this in some way, but it would be neat if this was built into GitHub, tied into VS Code, and allowed for a high level of control.
- I’m still in the process of learning/mastering tmux, and there seemed to be some incompatibilities that I’ll need to work around: cmd+f within the integrated shell doesn’t search through the scroll buffer, and clipboard integration doesn’t work (the main reasons for using tmux are keyboard scroll-buffer search and copy/paste support).
- pbcopy/pbpaste, which I use pretty often, don’t work. A good option is using something like Uniclip, but this will require some additional effort to get working. Other alternatives that might be worth investigating: https://github.com/jedisct1/piknik and https://gist.github.com/dergachev/8259104
- I had trouble with some specific VS Code tasks not working properly. This was due to how some tasks build the shell environment.
- Can you run GitHub Actions locally within the codespace? This would be super cool. It looks like it’s not possible right now, but there’s some open source tooling around this which looks interesting.
- There’s got to be a cleaner way to share a consistent SSH key with a codespace for deploys. This post had some notes around this.
- I’m not sure how the timeout works. What if I’m running a long-running test or some other terminal process? Will it be terminated? Is there a way to keep the session alive with some other side process?
- Can you mount the remote drive locally and have it available in the Finder? scp’ing files to view and manipulate locally is going to get tired fast.

Continue Reading

Blocking Websites on a Schedule With Pi-Hole

I’ve written about blocking ads and distracting websites before as part of my digital minimalism crusade. I’m a big fan of thinking through your lifestyle design and automating decisions as much as possible.

For instance, between 9pm and 7am there’s a set of distracting websites that I do not want myself, or anyone in my family, to be able to access. This introduces just enough friction to bad behavior (like scrolling Twitter at 9pm) that it prevents me from doing the wrong thing.

Below I’ve described how I block (and then subsequently allow) websites on a schedule, plus some other misc related tricks with the Raspberry Pi & Pi-hole.

Block Sites on a Schedule

I wanted to block my Roku TV based on a cron schedule. However, the TV uses a bunch of different subdomains across various services. With an /etc/hosts blocking method, you can’t block domains based on a pattern, but you can with Pi-hole.

The --wild flag converts your domain into a wildcard regex that matches the domain and any of its subdomains.

For example, if you have a blocklist file containing a simple list of URLs:

facebook.com
pinterest.com
amazon.com
netflix.com
feedbin.com
disneyplus.com
roku.com
youtube.com
twitter.com

Your block.sh would look like:

blockDomains=$(<blocklist)

for domain in ${blockDomains[@]}; do
  pihole --wild $domain
done

Note that the position of the -d is significant in your allow.sh:

blockDomains=$(<blocklist)

for domain in ${blockDomains[@]}; do
  pihole --wild -d $domain
done

Here’s a great discussion about how to block groups in Pi-hole.

Running Pi-Hole & Scheduled Blocking on Docker

I’ve codified most of this into a docker container and related docker-compose.

Whitelist Alexa-related Domains

If you block amazon (which I recommend to avoid buying stuff or getting sucked into Prime Video), you may want to whitelist Alexa-related domains so they work inside "blocked hours". Here are the domains you want to whitelist (and here’s a script to do it):

bob-dispatch-prod-na.amazon.com
avs-alexa-14-na.amazon.com
api.amazon.com
api.amazonalexa.com
latinum.amazon.com

DDNS with Dreamhost

Sometimes, if you are running a VPN or a node on a service (like Storj) you’ll want to have an external domain available which points to your network IP.

I have a Dreamhost server that runs a couple of WordPress sites for me. They have a nice API for modifying DNS records, which can be used to dynamically update a domain record that points to my home network.

Here’s the modified dreamhost script that worked for me (I couldn’t get the PR for this merged). Here’s how to set it up as a cron on the pi:

crontab -e

@hourly bash -l -c 'DREAMHOST_API_KEY=THEKEY DREAMHOST_UPDATE_DOMAIN=subdomain.domain.com /home/pi/Documents/dreampy_dns/dreampy_dns.py'

Watch the logs:

tail -f /var/log/cron.log

To test to make sure it’s working (from a server outside your network):

telnet node.thesite.com 28967

Even better: DDNS with Dreamhost + Docker

You can also run this as a docker image. Here’s an example docker-compose.

Blocking DNS over HTTP

iOS and specific applications on macOS use DNS over HTTPS. This breaks the blocking rules you set up on Pi-hole. You can configure Pi-hole to reject all DNS-over-HTTPS queries.

Here’s what this looks like in the pi-hole interface:

Here’s how to do this on the command line.

Blocking Spam, Porn & Other Sites on Raspberry Pi

Block List Project has a great index of various site groups you can block, including porn. Here’s another block list.

Navigate to Group Management > Ad List and then pick the "Original" version of the lists on the blocklist project.

Here’s a script which does this.

Continue Reading

Using the Multiprocess Library in Python 3

Python has a nifty multiprocessing library which comes with a lot of helpful abstractions. However, as with concurrent programming in most languages, there are lots of footguns.

Here are some of the gotchas I ran into:

- Logging does not work as you’d expect. Global state associated with your logger will be wiped out, although if you’ve already defined a logger variable it will continue to reference the same object from the parent process. The easiest solution for logging is to set up a new file-based logger in the child process. If you can’t do this, you’ll need to implement some sort of message-queue-based logging, which sounds terrible.
- Relatedly, be careful about using any database connections, file handles, etc. in a forked process. This can cause strange, hard-to-debug errors.
- When you pass variables to a forked process, they are ‘pickled’. This serializes the python data structure and deserializes it on the ‘other end’ (i.e. in the forked process). I was trying to decorate a function and pickle it, and ran into weird issues: only top-level module functions can be pickled.
- If you are using the macOS libraries via python, you cannot reference them in both a parent and a child process. The solution here is to run all functions which hit the macOS libraries in a subprocess. I was not able to get the decorator in this linked post working. Here’s a working example using a modified version of the source below.

I struggled to find full working examples of using the multiprocess library online (here’s the best I found). I’ve included an example of using multiprocessing to create a forked process to execute a function and return the results inline. The example demonstrates how to:

- Send a signal from the parent process to the child process to start executing, using multiprocessing.Condition. I was not able to get this working without first notify()ing the parent process.
- Kill the child process after 10m. This works around memory leaks I was running into with the applescript I was trying to execute.
- Configure logging in the forked process.
- Return the result synchronously to the caller using a shared queue implemented with multiprocessing.Queue.

import multiprocessing
import time
import logging

forked_condition = None
forked_result_queue = None
forked_process = None
forked_time = None

logger = logging.getLogger(__name__)


def _wrapped_function(condition, result_queue, function_reference):
    # this is run in a forked process, which wipes all logging configuration
    # you'll need to reconfigure your logging instance in the forked process
    logger.setLevel(logging.DEBUG)

    first_run = True

    while True:
        with condition:
            # notify parent process that we are ready to wait for notifications
            # an alternative here that I did not attempt is waiting for `is_alive()`
            # https://stackoverflow.com/questions/57929895/python-multiprocessing-process-start-wait-for-process-to-be-started
            if first_run:
                condition.notify()
                first_run = False

            condition.wait()

        try:
            logger.debug("running operation in fork")
            result_queue.put(function_reference())
        except Exception as e:
            logger.exception("error running function in fork")
            result_queue.put(None)


def _run_in_forked_process(function_reference):
    global forked_condition, forked_result_queue, forked_process, forked_time

    # terminate the process after 10m
    if forked_time and time.time() - forked_time > 60 * 10:
        assert forked_process
        logger.debug("killing forked process, 10 minutes have passed")
        forked_process.kill()
        forked_process = None

    if not forked_process:
        forked_condition = multiprocessing.Condition()
        forked_result_queue = multiprocessing.Queue()
        forked_process = multiprocessing.Process(
            target=_wrapped_function,
            args=(forked_condition, forked_result_queue, function_reference)
        )
        forked_process.start()
        forked_time = time.time()

        # wait until the fork is ready; if this isn't done, the process seems to miss
        # the parent process `notify()` call. My guess is `wait()` needs to be called before `notify()`
        with forked_condition:
            logger.debug("waiting for child process to indicate readiness")
            forked_condition.wait()

    # if forked_process is defined, forked_condition always should be as well
    assert forked_condition and forked_result_queue

    # signal to the process to run `getInfo` again and put the result on the queue
    with forked_condition:
        forked_condition.notify()

    logger.debug("waiting for result of child process")
    return forked_result_queue.get(block=True)


def _exampleFunction():
    # do something strange, like running applescript
    return "hello"


def exampleFunction():
    return _run_in_forked_process(_exampleFunction)


# you can use the wrapped function like a normal python function
print(exampleFunction())

# this doesn't make sense in a single-use script, but if you need to, you'll need to terminate the forked process
forked_process.kill()

Note that the target environment here was macOS. This may not work perfectly on Linux or Windows; it seems as though there are additional footguns on Windows in particular.

Continue Reading

Book Notes: The Hard Thing About Hard Things

Something new I’m doing this year is book notes. I believe writing down your thoughts helps you develop, harden, and remember them. Books take a lot of time to read; taking the time to document lessons learned is worth it.

Here are the notes for The Hard Thing About Hard Things by Ben Horowitz. Definitely worth reading, especially if you are actively building a company, although I wouldn’t say it’s in the must-read category.

Below are my notes! Enjoy.

Leadership

A much better idea would have been to give the problem to the people who could not only fix it, but who would also be personally excited and motivated to do so.

I think any good leader feels personally responsible for the outcome of whatever they are doing. Everything is their job, in the sense that ultimately if the project isn’t successful it is their fault.

However, I think Ben’s framing is important: it’s the leader’s job to clearly describe problems—instead of hiding them—no matter how large, and to get the right people, the ones energized by big scary problems, aligned to solving them.

The more you communicate without BS—describing reality exactly how it is—the more people will trust what you say. There are no lines to read between. It takes time for this trust to filter its way through an organization, but it makes any other communication (which is a prime job of a leader) way easier in the future.

Former secretary of state Colin Powell says that leadership is the ability to get someone to follow you even if only out of curiosity.

Sometimes only the founder has the courage to ignore the data;

It’s nice to lean on data to make decisions. All of the great decisions in life need to be made out of an absence of data; in the absence of certainty. The safety of the modern world has made us less comfortable with taking risks and being decisive in areas of life where it is impossible to get certainty.

the wrong way to view an executive firing is as an executive failure; the correct way to view an executive firing is as an interview/integration process system failure.

Ben has a lot of counterintuitive thinking about executive management throughout the book. I found the thinking around executive hiring, management, etc the part of the book most worth reading.

He articulates the executive hiring, management, and firing process as incredibly messy, opaque, and constantly changing. I think this is the thing that technical founders struggle with a lot—it’s not straightforward, requires a lot of tacit knowledge that can only be acquired through experience, and requires lots of conflict-laden conversations which everyone hates.

Part of the leader’s job is the ability to step in and cover any executive’s job if they leave or are fired. This helps the leader understand what’s really needed in that role at this stage of the company.

What is needed from an executive changes quickly as a company grows. It’s your job as a leader to understand what is needed right now, communicate that expectation, and then measure their performance off that revised standard. It’s up to the executive to figure out how to retool their skills to meet the new requirements; you don’t have time to help them here. If they can’t figure out the new role you need to let them go fast.

Management techniques that work with non-executives don’t work with executives. You can’t lead professional leaders in the same way. For instance, the "shit sandwich" approach feels babying to a professional when it may work well for a leaf-node individual contributor. What works on a leaf-node team doesn’t work when running a management team.

in my experience, look and feel are the top criteria for most executive searches.

Developing and holding to an independent standard in any area of life is incredibly hard. We are deeply mimetic, and avoiding pattern-matching on what the herd believes is right is one of the hardest tasks of leadership.

Consensus decisions about executives almost always sway the process away from strength and toward lack of weakness.

You want someone who is world-class at the thing you are hiring them for. Make sure your organization can swallow their faults; don’t try to avoid faults—even major ones—completely.

Relatedly, the concept of "madness of crowds" is a good mental model to keep in mind.

This is why you must look beyond the black-box results and into the sausage factory to see how things get made.

Understanding how things work at the ground-level in an organization is key to improving performance. I always thought Stripe’s leadership did a great job here: jumping into engineering teams for a week to understand what the real problems were can’t be replaced by having 100 1:1s.

I describe the CEO job as knowing what to do and getting the company to do what you want.

This is what I liked most about the book—plain descriptions of commonly amorphous concepts.

Company building

as often candidates who do well in interviews turn out to be bad employees.

If someone is good at cracking an interview, it could be a signal that they aren’t good at the core work. If someone is exceptional, they aren’t going to care about interviewing well or understanding the big-company decision-making matrix around hiring: they know they are smart and want to work at a place that values the work.

This is a distinct advantage startups have. I love the interview process at one of my new favorite productivity apps:

We don’t do whiteboard interviews and you’re always allowed to google. We’ll talk about things you’ve previously worked on and do a work trial – you’ll be paid as a contractor for this.

They can focus on the work and ignore the mess of other signals that are only important when you need to ensure quality at scale.

In good organizations, people can focus on their work and have confidence that if they get their work done, good things will happen for both the company and them personally. It is a true pleasure to work in an organization such as this. Every person can wake up knowing that the work they do will be efficient, effective, and make a difference for the organization and themselves. These things make their jobs both motivating and fulfilling.

Simple and true description of what makes a company great, and conversely what makes bureaucratic organizations painful to operate in.

Companies execute well when everybody is on the same page and everybody is constantly improving.

Constant improvement compounds over time.

What do I mean by politics? I mean people advancing their careers or agendas by means other than merit and contribution.

Good definition of politics.

I’d love to understand what companies have designed a performance process for higher management tiers that isn’t political. At larger companies, getting promoted to higher levels becomes more political almost by definition: it’s harder to describe your impact quantitatively because your work is more people-oriented and dependent on your leadership ability.

Perhaps the CEO’s most important operational responsibility is designing and implementing the communication architecture for her company.

I’d love to hear more stories about well-designed communication systems in companies.

Perhaps most important, after you and your people go through the inhuman amount of work that it will take to build a successful company, it will be an epic tragedy if your company culture is such that even you don’t want to work there.

Reminds me of the parenting idea "don’t raise kids that you don’t want to hang out with."

the challenge is to grow but degrade as slowly as possible.

Ben makes the assumption that all companies degrade over time. Things that were easy become difficult when you add more people, mostly because of the communication/coordination overhead and knowledge gaps across the organization.

I want to learn more about what organizations fought against this and when they felt there was an inflection point of degradation. How big can you grow before things degrade quickly?

Management

big company executives tend to be interrupt-driven.

They wait for problems to come to them, and they don’t execute work individually. Be aware of when you’ve reached this stage and then hire for these people. Hiring this type of person too early will most likely fail—if you are used to working in this style, it’s hard to change.

An early lesson I learned in my career was that whenever a large organization attempts to do anything, it always comes down to a single person who can delay the entire project.

Resonates with my experience. It’s amazing how one or two B players can destroy the ability to get anything significant done. The Elon Musk biography talks about how Elon’s employees were terrified of being "the blocker" and would do anything they needed to in order to avoid being that person. He would ask for status updates multiple times a day and force you to do whatever needed to be done to eliminate yourself as the primary blocker.

However, if I’d learned anything it was that conventional wisdom had nothing to do with the truth and the efficient market hypothesis was deceptive. How else could one explain Opsware trading at half of the cash we had in the bank when we had a $20 million a year contract and fifty of the smartest engineers in the world? No, markets weren’t “efficient” at finding the truth; they were just very efficient at converging on a conclusion—often the wrong conclusion.

[managing by the numbers] penalizes managers who sacrifice the future for the short term and rewards those who invest in the future even if that investment cannot be easily measured.

Not everything can be measured. You need to have qualitative and quantitative metrics, and you can’t rely too strongly on quantitative metrics. Building anything great requires great conviction in the absence of evidence supporting the outcome you believe is inevitable.

As Andy Grove points out in his management classic High Output Management, the Peter Principle is unavoidable, because there is no way to know a priori at what level in the hierarchy a manager will be incompetent.

This is the sort of thing that makes management so incredibly hard.

If you become a prosecuting attorney and hold her to the letter of the law on her commitment [to fix a problem that she discovered], you will almost certainly discourage her and everybody else from taking important risks in the future.

No easy answer to this question. You have to hold people accountable but understand the situation enough not to disincentivize critical behavior which improves the company. If you don’t do this right, people notice and will manage their work towards what is indirectly rewarded.

the best ideas, the biggest problems, and the most intense employee life issues make their way to the people who can deal with them. One-on-ones are a time-tested way to do that,

This rings true to me. Although, I think it’s critical to get as much state out of meetings and into central systems as possible so 1:1s can mostly be focused on the small batch of critically important stuff that cannot be handled async.

Sales

There’s an interesting thread in the story of Opsware that could yield the lesson "don’t rely too much on whales". I don’t think anyone would disagree with this advice in the abstract, but I think practically it’s hard to build a big business without whales. You want to avoid being too reliant on whales, but I believe you also need to be OK pandering to your largest customers in B2B SaaS and doing what needs to happen to keep them thrilled with you.

There was a really helpful appendix with some great questions and guides for hiring a sales leader. These people-oriented jobs can sometimes seem like a black art compared to the hyper-logical work that technical founders start out doing.

Continue Reading

Book Notes: Successful Fathers

An older book (no kindle version!), Successful Fathers, was recommended to me by a father I really respect. Here are my book notes for it.

Fathers and mothers today, isolated as they are from their own parents and extended family, need as much experienced advice as they can get. In our own era, they need to work harder to get it.

Rings very true. Most parents who aren’t under some other massive life stress (finances, health, etc.) are very concerned about raising their kids the "right way". I am too.

However, parenting advice is an industry. It’s a business. There’s lots of advice in books, videos, courses, etc and much of it is conflicting. We aren’t missing advice, we are missing advice we can trust.

What he identifies here is that this didn’t use to be a problem. Your parents, cousins, and tight-knit community passed down what they learned, and that was it. You trusted them because you saw the outcome and didn’t try to look anywhere else. It’s definitely more complicated now.

Love is the capacity and willingness to embrace hardship for the sake of someone’s welfare

Beautiful definition.

From the earliest infancy, we acquire [values] by imitating people whose character we admire, principally from our parents.

I really enjoyed Wanting which explores the idea of mimetic desire. This is another example of how we are deeply mimetic. It’s not something that’s good or bad, it just is, and it’s important to take into account when thinking about human nature.

If my kids copy who they admire, it’s important that I become someone they can admire. This may sound obvious, but it results in interesting tradeoffs: should I spend time with my kids, or time doing something that powers the rest of my life?

I think the answer is complicated. Modern parenting advice and the current cultural norms—at least in how I perceive them—seem to push for spending more time with your kids, always. You aren’t a good dad unless you are at the soccer game, dance practice, reading to your kids, etc. Those are all good things, but the equation is more complex when thinking about how to spend your time when you have kids: making sure you are living life to the fullest is a critically important thing and can’t be sacrificed on the altar of maximizing time with your kids.

A husband’s neglect for his wife, a failure to support her authority, leads eventually toward the children’s sass and disobedience at home.

Parents need to be fully aligned in how they are raising kids—standards of behavior, discipline, etc. If dad isn’t fully behind mom, and vice versa, kids will naturally use this against their parents to get what they want.

It’s up to parents to ensure kids don’t get what they want, but get what they need.

Psychologists have noted that much of the posturing and verbal defiance of adolescents is really testing and questioning of their father’s standards.

This aligns closely with the "high standard, high connection" parenting approach that most of the "emotion coaching"-type parenting books espouse.

The idea is that kids really want boundaries and rules and bad behavior—when they are young, or older—is their way of testing where the boundaries really are. This has felt very true in my experience.

Middle-class children today almost never see their father work

I think this is changing with remote work, but in a sense, it’s hard to ‘see’ the work that dad (or mom!) might be doing on a call or on the computer.

The idea still holds that kids rarely see dads working in an area where they excel. This decreases the respect for their parents and naturally encourages them to look elsewhere for a mimetic model for their life.

Bringing kids into your work in creative and sometimes strange ways feels key to giving them a greater understanding of what you do and why they should respect you for it.

Television and other entertainment have become the principal means by which children [form] concepts of adult life

The core of this statement is true: rather than learning about adult life from a group of adults, kids infer what ‘real life’ is like through screens. Videos, social media, news, articles, etc. have an outsized influence on how kids decide how the real world operates.

The book makes the argument that this influence has grown over time because of the decreased interaction with other adults. I think this is more true than ever before—we don’t interact with nearly as many people as we used to, because we don’t have to. Mostly everything you need can be ordered quickly on a phone, work can be done remotely, neighborhoods are more isolated, etc. The lack of adult interaction changes how kids build a model of how adult life really is.

Material riches crowd out the central realities of life

The more wealth we have, the more we are comfortable and forget the suffering of others. It’s easier to empathize with others’ suffering when you are experiencing it yourself.

children come to know their father’s mind inside and out

Letting kids into your inner life—what you are experiencing, what you are feeling, how you are thinking about a problem, how you are approaching a situation is key to giving them an understanding of who you are. If they don’t know who you are, they can’t decide if they should model their life after you.

I’ve always loved families that have intense discussions. Vigorous debate and explaining how you are thinking are key to a child’s formation.

Continue Reading

Book Notes: Making it All Work

Something new I’m going to try doing this year is book notes. I’m continually more bought into the idea that writing down your thoughts helps you harden and remember them. Books take a lot of time to read: if I’m going to invest the time in a book, I should be ok investing another ~hour in calcifying the lessons I learned—so that’s what I’m going to try to do. This should help me better filter what books to read: if it’s not worth spending the time to write the notes, I probably shouldn’t read the book (obviously excluding entertainment-only reading).

Here are the notes for Making it All Work. The book wasn’t great; I wouldn’t read it unless you haven’t read Getting Things Done and are new to personal productivity. Here are the notes! Enjoy.

Improving your productivity system

- David Allen has a zen feel to how he describes things. Meditation, and the idea of generally being self-aware about what you are thinking, feeling, experiencing, etc., is a very useful tool. When I sat back and thought about how I ‘felt’ when trying to work on something, I was able to identify various little blockers and better determine the best thing to do at the moment. In other words, being aware of where you aren’t performing well is key to improving. This takes time & effort.
- For instance, there were times I just felt distracted and didn’t have any state of a specific project ‘loaded up’ in my mind. When I already have ideas around a project, email, etc. ‘loaded’ into my mind, I can work much more quickly on executing.
- Doing something physical—kettlebell swings, going for a run, pull-ups, etc.—can shift me out of a distracted mode and convince my mind to focus its attention on the next project. I find that this sort of ‘focus shift’ using something physical can help when I’m switching between projects.
- I find this is true with parenting too. If your kid is in a bad mood / having a meltdown, shifting the physical environment can solve the problem. When my toddler is having a meltdown, I suggest we go on a "barefoot walk" around the neighborhood (I’m weird and like walking without shoes on) and she immediately stops crying and is ready to go.

And frankly, if you have any thought more than once, in the same way about the same subject, you’re probably involved in unnecessary work and exhausting your creative energy.

Thinking about the same thing over and over is a ‘process smell’: you aren’t clear on what you need to do next and don’t trust your system.

Paying attention to what has your attention

Good catchphrase from the book that encompasses this.

GTD is fundamentally much more about mind management than about time management

Time management tools have never worked well for me, but GTD has.

Given the vast changes in speed, volume, and ambiguity of what grabs our attention these days, we face an increasing need to have an “extended mind” that can truly relieve the pressure from our psyche and free it up for more valuable work.

I have felt the increase in information throughput in my life and have spent a lot of time intentionally decreasing it. The other half of the equation is developing processes and systems to help manage the information you do need to care about. Todoist and Obsidian have been helpful here, but I need to continue to improve my processes.

Create clarity about your work

- He talks a lot about ‘task dumping’: getting everything you are thinking about dumped somewhere you know you are going to look at again. I’ve made this an obsessive habit over the last decade, but reflecting on it, I need to do better at this in the mornings. I pray & read every morning, and tasks/projects can distract me during this time. Having a pad of paper (explicitly not using your phone, since it can be distracting just to have it around) to jot down thoughts to ‘clear’ them from your mind is an effective practice for me that I need to get better at.
- He defines this as "accepting, clarifying, sorting, reflecting, and engaging". I think this is a good articulation of the process we need to go through before intentionally working on the right thing.
- Chunking down loosely defined tasks (I use todoist) is something I don’t spend time on right now. Without a clearly defined next action, it’s hard to complete a task, and it takes extra cognitive overhead to start working on it (because it has multiple components). An easy improvement here for me is task splitting: if there’s a large task, even if it’s well defined, I can punt the ‘large’ task to be due in the future and add a simple next action to my near-term todo list.
- This applies to parenting too. A task can seem complex and overwhelming to kids, even if they have done it before and it seems obvious to you. "Unload the dishwasher" is more ambiguous than "put the forks and spoons away that are in the bottom of the dishwasher". I’ve seen breaking down a task be an important way of getting my kids more engaged with a project.
- In my todo list, there are a bunch of investigative tasks: writing or researching something I’m interested in. This could be anything from thinking about a parenting problem with my wife, investigating a new productivity tool like Raycast, implementing a new health habit, etc. These are rarely things I’m going to be able to complete in one sitting, and I’m rarely able to do more than one per day. However, they clutter up my todo list and weigh on my psyche (i.e. if I see a ton of tasks due in one day, I get overwhelmed, even if I know they don’t need to be completed right now). I need to think through this and determine how to automatically limit these types of tasks each day.

Ambiguity is a monster that can still take up residence and lurk in the sharpest, most productive places and among the most sophisticated people.

Unlocking the creative process

- Whenever I’m hugely productive I feel this: "Loss of control and perspective is the natural price you will pay for being creative and productive. The trick is not how to prevent this from happening, but how to shorten the time you stay in an unsettled state."
- Spending time organizing yourself is an important function: "Much of the energy in propelling a rocket is spent in course correction—it is, in a way, always veering out of control and off-target."
- Patrick Lencioni’s work is aligned with the idea that the unsaid human issues in the room (whether at work or home) affect your ability to be creative and solve problems together: "Perhaps we all are more attuned to one another than we realize, and if someone is disconnected from the mutual intention of the occasion because of unacknowledged issues, they just won’t participate fully in the game, which will mitigate the group’s cohesion and positive energy."
- Separating creativity from analysis or rigor is important. I’ve found you can spark creative thoughts by intentionally including bad, wild, or dumb ideas in a list. It makes it feel easier—even if it’s an exercise you are doing with yourself—to express more creative ideas. "Good brainstorming is stifled by any attempt to analyze and evaluate the meaning and merit of those ideas too soon."

Levels of perspective

I’ve always felt it’s hard to calculate the next best thing to do. There are too many options, there is too much to do. In the book, it’s articulated that the reason this is hard is that there are too many inputs and it’s impossible to determine the next best thing. By organizing your tasks and thoughts well, you can make a better intuitive judgment aided by your best-effort prioritization, and your psyche will slowly trust your judgment. For me, I often feel mental friction when I’m not certain about what to do next, and I think this idea will help me here.

He makes the argument that it’s critical to think in terms of "level of perspective" or "horizons of focus":

1. Purpose/Principles
2. Vision
3. Goals
4. Areas of Focus
5. Projects
6. Tasks

I think this is a good model, even if the exact wording of these categories might change depending on how you think about the world. I need to refine some of my thinking on the higher-level areas of focus; reflecting on this while reading the book made me realize how much I’m missing at the "top".

The true power in a long-range vision is the acceptance that holding that picture inside your consciousness permits you to imagine yourself doing something much grander than you would normally allow yourself.

Big thinking is a skill. Some people don’t have it naturally (like myself). Clarity around a long-range vision does seem to enable your mind to think bigger.

No effective framework will ever get any simpler than the continuum of purposes/principles, vision, goals, areas of focus, projects, and next actions.

This was a helpful structure for me. I was missing the higher-level categories and need to work on defining them and putting them in a place where I can be reminded and structure the lower levels around them.

Continue Reading

2021 Goal Retrospective

I’ve been doing yearly retrospectives on my yearly goals for a while now. My belief that incremental improvement is the key to achieving anything great in life has only been strengthened as time has gone on. Here’s my review of last year’s goals and my thoughts about what I can change going forward.

What Worked

- Focusing on just two habit changes for the year is the right balance for me. I think of a habit as a recurring behavior I want to change, as opposed to a specific one-time event or project completion. I implemented the two habits I targeted.
- Setting goals that are just hard enough is really critical. I set a couple of goals that were just enough of a stretch that I felt like I could push and get them across the line. If they were 20% harder I probably wouldn’t have made it, but if they were much easier I would have left something on the table. It’s impossible to get this 100% right, but you get better at it with time.
- Defining the why behind goals has been really important. This helps filter out goals that don’t matter as much (if you can’t articulate a compelling why, you shouldn’t include it in your goals), and a crisply-written why helps maintain motivation over time.
- Tracking habit-like goals weekly on an Excel sheet is a great tool for remembering to do them. However, (a) you can’t track too many goals and (b) tracking them has to be easy (< 30s to record).

It’s been interesting & rewarding to see over time how I’ve become more effective at setting goals. Sure, goal planning systems are more effective than nothing out of the box, but there’s a lot to learn by iterating on your own goal system that takes into account your specific psychology and quirks. It’s worth putting in the time to really think seriously about your goals and how to achieve them each year.

What Didn’t

- Goals without very clear, measurable metrics—or metrics that were hard to track/observe—were hard to hit. Even if you "complete" them it doesn’t give you the same feeling of accomplishment, and the lack of specifics doesn’t give you the motivation to push when it gets challenging.
- I wanted to do a screen-free day each week. Instead, I did ~20 over the course of the year. This was strangely hard to remember to do, and I’m not sure why. Weekly disciplines (as opposed to daily disciplines) are harder to build, especially when you don’t have anyone to hold you accountable. For example, I used to hate working out, but committed to a weekly time with a friend and now it’s a habit and I enjoy it. I need to determine how to pull that same sort of energy into changes that are not dependent on another person.
- I think part of the issue with the screen-free day and some other related habits is that exactly what it means isn’t clear. Does it mean I should put my phone + laptop in another room? What if we committed to be somewhere and need to use the phone to get there? What exceptions exist? This muddies the waters and makes it hard to focus on this sort of goal during the whirlwind of daily life.

What Should Change

- I want to look into apps, or some other low-friction reminder tool, to help build habits. There are some smaller, micro habits (like taking a vitamin every day, flossing, etc) that don’t fit well into the goal planning process. Streaks looks like a simple app that was recommended by a couple of folks.
- If there are goals or habits which seem hard to build, put some time into really clarifying the actions you need to take to make progress on the goal, or the exact actions the habit requires. As an example, I need to think on the screen-time goal and determine how I can really integrate it into my daily life and be reminded of it automatically.

Continue Reading

Learning TypeScript by Migrating Mint Transactions

Years ago, I built a Chrome extension to import transactions into Mint. Mint hasn’t been updated in nearly a decade at this point, and once it stopped connecting to my bank for over two months, I decided to call it quits and switch to LunchMoney, which is improved frequently and has a lot of neat developer-focused features.

However, I had years of historical data in Mint and I didn’t want to lose it when I transitioned. Luckily, Mint allows you to export all of your transaction data as a CSV and LunchMoney has an API.

I’ve spent some time brushing up on my JavaScript knowledge in the past, and have since used Flow (a TypeScript competitor) in my work, but I’ve heard great things about TypeScript and wanted to see how it compared. Building a simple importer tool like this in TypeScript seemed like a great learning project, especially since the official bindings for the LunchMoney API are written in TypeScript.

Deno vs Node

Deno looks cool. It uses V8, Rust, supports TypeScript natively, and seems to have an improved REPL experience.

I started playing around with it, but it’s not backwards compatible with Node/npm packages, which is a non-starter for me. It still looks pretty early in its development and adoption. I hope Deno matures and becomes more backwards compatible in the future!

Learning TypeScript

- You can’t run TypeScript directly via node (this is one of the big benefits of Deno). There are some workarounds, although they all add another layer of indirection, which is the primary downfall of the JavaScript ecosystem in my opinion.
- ts-node looks like the easiest way to run TypeScript without a compilation step. npm i ts-node will enable you to execute TypeScript directly using npx ts-node the_script.ts. However, if you use ESM you can’t use ts-node. This is a known issue, and although there’s a workaround, it’s ugly and it feels easier to just have a watcher compile in the background and execute the raw JS.
- .d.ts files within repos define types on top of raw JS. The reason this is done is to allow a single package to support both standard JavaScript and TypeScript: when you are using TypeScript, the .js and .d.ts files are included in the TypeScript compilation process.
- Use npx tsc --init to set up an initial tsconfig.json. I turned off strict mode; it’s easier to learn a new typing system without hard mode enabled.
- Under the hood, the TypeScript compiler transpiles TypeScript into JavaScript. If you attempt to debug a TypeScript file with node inspect -r ts-node/register, the code will look different and it’ll be challenging to copy/paste snippets to debug your application interactively. The same applies to debugging in a GUI like VS Code. You can enable sourcemaps, but the debugger is not smart enough to map variables dynamically for you when inputting strings into the console session. This is a massive bummer for me: I’m a big fan of REPL-driven development, and not being able to copy/paste snippets of code between my editor and REPL really slows me down.
- Similar to most other languages with gradual typing (python, ruby, etc), there are ‘community types’ for each package. TypeScript is very popular, so many/most packages include types within the package itself.
- The typing packages need to be added to package.json. There’s a nice utility, typesync, to do this automatically for you. If you want to be really fancy you can overload npm i and run typesync automatically.
- VS Code has great support for TypeScript: you can start a watcher process which emits errors directly into your VS Code status bar via cmd+shift+b. If you make any changes to tsconfig.json you’ll need to restart your watcher process.
- You can define a function signature that dynamically changes based on the input. For instance, if you have a configuration object, you can change the output of the function based on the structure of that object. Additionally, you can inline-assign an object to a type, which is a nice change from other languages (ruby, python). Example of inline type assignment: {download: true} as papaparse.ParseConfig<Object>. In this case, Object is an argument into the ParseConfig type and changes the type of the resulting return value. Very neat!
- I ran into Element implicitly has an 'any' type because expression of type 'string' can't be used to index type 'Object'. No index signature with a parameter of type 'string' was found on type 'Object'. The solution was typing the map/object/hash as theVariable: { [key: string]: any }. I couldn’t change the any type of the value without causing additional typing errors, since the calling function was typed with a plain Object return. (There’s a sketch of this pattern right after this list.)
- There’s a great, free, extensive book on TypeScript development.
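To make the index-signature fix and the "return type follows the input" idea concrete, here’s a minimal sketch (the variable and function names are mine, not from the importer project):

```typescript
// An explicit string index signature sidesteps the "no index signature
// with a parameter of type 'string'" error when using dynamic keys.
const row: { [key: string]: any } = { Date: "2021-01-01", Amount: "4.20" };

const field = "Amount";
console.log(row[field]); // type-checks because of the index signature

// A generic function whose return type follows its input type, similar
// in spirit to how papaparse.ParseConfig<T> shapes the parse result.
function firstElement<T>(items: T[]): T | undefined {
  return items[0];
}

const n = firstElement([1, 2, 3]);   // inferred as number | undefined
const s = firstElement(["a", "b"]);  // inferred as string | undefined
```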

One of the most interesting pieces of TypeScript is how fast it’s improving. Just take a look at the changelog. Even though JavaScript isn’t the most well-designed language, "One by one, they are fixing the issues, and now it is an excellent product." A language with wide adoption will iterate its way to greatness. There’s a polish that only high throughput can bring to a product, and it’s clear that after a very long time JavaScript is finally getting a high level of polish.

Linting with ESLint, Code Formatting with Prettier

- ESLint looks like the most popular JavaScript linting tool. It has lots of plugins and huge community support.
- You can integrate ESLint with Prettier, which looks like the most popular code formatting tool.
- VS Code couldn’t run ESLint after setting it up; it had trouble loading /node_modules/espree/dist/espree.cjs. Restarting VS Code fixed the problem.

Here’s the VS Code settings.json that auto-fixed ESLint issues on save:

{ "[typescript]": { "editor.formatOnSave": true, "editor.codeActionsOnSave": { "source.fixAll.eslint": true }, }, "eslint.validate": ["javascript"] }

And here’s the .eslintrc.json which allowed ESLint, prettier, and ESM to play well together:

{ "env": { "browser": false, "es2020": true }, "extends": [ "standard", "plugin:prettier/recommended" ], "parser": "@typescript-eslint/parser", "parserOptions": { "ecmaVersion": 2020, "sourceType": "module" }, "plugins": [ "@typescript-eslint" ], "rules": { } } Module Loading

As with most things in JavaScript-land, the module definition ecosystem has a bunch of different community implementations/conventions. It’s challenging to determine what the latest-and-best way to handle module definitions is. This was a great overview, and I’ve summarized my learnings below.

- require() == CommonJS == CJS. You can spot modules in this format by module.exports in their package. This was originally designed for backend JavaScript code.
- AMD == Asynchronous Module Definition. You can spot packages in this style by define(['dep1', 'dep2'], function (dep1, dep2) { at the header of the index package. Designed for frontend components.
- UMD == Universal Module Definition. Designed to unify AMD + CJS so both backend and frontend code could import a package. The signature at the top of a UMD-packaged module is messy and basically checks for define, module.exports, etc.
- import == ESM == ES Modules. This is the latest-and-greatest module system officially baked into ES6. It has wide browser adoption at this point. This is most likely what you want to use. import requires module mode in TypeScript (or other compilers) not set to commonjs. If you use ESM, your transpiled JS code will look a lot less garbled and you’ll still be able to use the VS Code debugger. The big win here is your import variable names will be consistent with your original source, which makes it much easier to work with a REPL. (The sketch after this list shows the formats side by side.)
- There are certain compatibility issues between ESM and the rest of the older package types. I didn’t dig into this, but buyer beware.
- It looks like experimental support for loading modules from a URL exists. I hope this gets baked into the runtime. There are downsides (major security risks), but it’s great for getting the initial version of something done. This was one of the features I thought was neat about Deno: you could write a script with a single JavaScript file without creating a mess of package*, tsconfig.json, etc files in a new folder.
- https://unpkg.com is a great tool for loading a JS file from any package on npm.
- You’ll get Cannot use import statement inside the Node.js REPL, alternatively use dynamic import if you try to import inside of a repl. This is a known limitation. The workaround (when in es2020 mode) is to use await import("./out/util.js").
- When importing a CommonJS-formatted package, you’ll probably need to import specific exports via import {SpecificExport} from 'library'. However, if the thing you want to import is the default export, you may run into issues and need to modify the core library. Here’s an example commit which fixed the issue in the LunchMoney library.
- When importing a local file, you need to specify the .js (not the .ts) extension in the import statement: import { readCSV, prettyPrintJSON } from "./util.js";
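Here’s a minimal sketch contrasting the two formats you’ll run into most often (the file layout is hypothetical; util.ts is the helper file mentioned above):

```typescript
// util.ts — ESM-style named export (compiles to util.js):
//
//   export function prettyPrintJSON(o: unknown): void {
//     console.log(JSON.stringify(o, null, 2));
//   }
//
// The CJS equivalent would assign to module.exports and be consumed
// with require("./util").

// main.ts — a static import must appear at a module's top level:
import { prettyPrintJSON } from "./util.js";

prettyPrintJSON({ hello: "world" });

// Inside a REPL (where static imports aren't allowed), fall back to a
// dynamic import, which returns a promise of the module's exports:
//   const util = await import("./util.js");
//   util.prettyPrintJSON({ hello: "world" });
```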
Package Management

- You can install a package directly from a GitHub reference: npm i lunch-money/lunch-money-js.
- You can’t put comments in package.json, which is terrible. There are lots of situations where you want to document why you are importing a specific dependency, or a specific forked version of a dependency.
- npm install -g npm updates to the latest npm version.
- By default, npm update only updates packages to the latest minor semver. Use npx npm-check-updates -u && npm i to update all packages to the latest version. This is dangerous, and only makes sense if there are a small number of packages.
- https://openbase.com is a great tool for helping decide which package to use.

JavaScript Learnings

- You’ll want to install underscore and use chain for data manipulation: _.chain(arr).map(...).uniq().value(). Lots of great tools you are missing from ruby or python.
- ES6 introduced computed property names, so you can use a variable as an object key: { [variableKey]: variableValue }.
- I had trouble getting papaparse to read a local file without using a callback. I hate callbacks; here’s a promise wrapper that cleaned this up for me (a version of it is sketched after this list).
- Merge objects with _.extend.
- The dotenv package didn’t seem to parse .env files that contain exports. Got tripped up on this for a bit.
- require can be used to load a JSON file, not just a JavaScript file. Neat!
- There are nice iterators now: for (const item of list).
- There’s array destructuring too: const [a, b] = [1, 2].
- Underscore has a nice memoize method. I hate the pattern of having a package-level variable for memoization. Just feels so ugly.
- There’s an in keyword that can be used with objects, but not arrays (at least in the way you’d expect).
- There’s a null-safe operator now. For instance, if you want to safely check a JSON blob for a field and set a default, you can do something like const accounts = json_blob?.accounts || [].
- You can iterate over the keys and values of an object using for (const [key, value] of Object.entries(object)).
- https://github.com/ccxt/ccxt is a neat project which transpiles JavaScript code into multiple languages.
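The papaparse promise wrapper looks roughly like this (a minimal sketch, assuming papaparse’s Node stream support; the names are illustrative, not the exact code linked above):

```typescript
import fs from "fs";
import Papa from "papaparse";

// Wrap papaparse's callback-style API in a promise so it can be awaited.
function parseCSV(path: string): Promise<Papa.ParseResult<Record<string, string>>> {
  return new Promise((resolve, reject) => {
    Papa.parse<Record<string, string>>(fs.createReadStream(path), {
      header: true, // treat the first row as column names
      complete: resolve,
      error: reject,
    });
  });
}

// Usage:
//   const { data } = await parseCSV("mint-transactions.csv");
```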
Hacking & Debugging

- The most disappointing part of the node ecosystem is the REPL experience. There are some tools that (very) slightly improve it, but there’s nothing like iPython or Pry:
  - nbd is dead and hasn’t been updated in years.
  - node-help is dead as well, and just made it slightly easier to view documentation.
  - node-inspector is now included in node and basically enables you to use Chrome devtools.
  - local-repl looks neat, but also hasn’t been updated in about a year.
  - The updated repl project wouldn’t load for me on v16.
- The debugging happy path seems to be using the GUI debugger.
- You can use toString() on a function to get its source code. A helpful alternative to show-source from ruby or ll from python. However, it has some gotchas: it’s specifically discouraged since it’s been removed from the standard, and arguments and argument defaults are not included.
- It’s not obvious how to list local variables in the CLI debugger. There’s a seemingly undocumented exec .scope that you can run from the debugger context (but not from a repl!).
- You can change the target to ES6 to avoid some of the weird JS transpiling stuff.
- Run your script with node inspect and then, before continuing, type breakOnUncaught to ensure that you can inspect any exceptions. I prefer terminal-based debugging; if you want to connect to a GUI (Chrome or VS Code) use --inspect.
- There’s no way I could find to add your own aliases to the debugger (i.e. c == continue == cont).
- It’s worth writing your own console.ts to generate a helpful repl environment to play with, with imports pre-loaded and some aliases defined. Unfortunately, this needs to be done on a per-project basis. (There’s a tiny sketch of this at the end of this section.)
- You can’t redefine const variables in a repl, which makes it annoying to copy/paste code into a console. It looks like there are some hacks you can use to strip out the const and replace it with let before the copy/pasted code gets eval’d. This seems like a terrible hack and should just be a native flag added to node.
- In more recent versions of node (at least 16 or greater), you can use await within a repl session.
- If you are in a debugger session, await does not work, unlike in a standard node repl. You cannot resolve promises and therefore cannot interact with async code. This is a known bug, will not be changed, and makes debugging async code interactively extremely hard. Very surprised this is still a limitation.
- console.dir is the easiest way to inspect all properties of an object within a REPL. This uses util.inspect under the hood, so you don’t need to import that package and remember the function arguments.
- There’s a set of functions only available in the web console. Most of these seem to be modeled after jQuery functions.

Open Questions

- How can I run commands without npx? Is there some shim I can add to my zsh config to conditionally load all npx-enabled bins when node_modules exists?
- Is there anything that can be done to make the repl experience better? This is my biggest gripe with JavaScript development. https://github.com/11ways/janeway looks interesting but seems dead (no commits in over a year). This code looks like an interesting starting point for removing all const from code pasted into a repl.
- The number of configuration files you need just to get started in a repo is insane (tsconfig.json, package*.json, .eslintrc.json). Is there a better way to handle this? Some sort of single configuration file to rule them all?
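As promised above, here’s a tiny sketch of the console.ts idea, using node’s built-in repl module (the pre-loaded helpers are the ones from util.ts):

```typescript
// console.ts — bootstrap a project REPL with useful things in scope.
import repl from "repl";
import { readCSV, prettyPrintJSON } from "./util.js";

const server = repl.start("importer> ");

// Anything attached to the REPL context becomes a global inside the
// session, so helpers are ready to use without manual imports.
server.context.readCSV = readCSV;
server.context.prettyPrintJSON = prettyPrintJSON;
```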

Continue Reading

Building a SouthWest Price Monitor and Learning Server Side JavaScript

I originally wrote a draft of this post in early 2019. I’m spending some time learning TypeScript, so I wanted to finally get my JavaScript-related posts out of draft. Some notes and learnings here are out of date.

Both sides of our family live out of state. Over the last couple of years, we’ve turned them on to credit card hacking to make visiting cheap (free). SouthWest has some awesome point bonuses on credit cards, but you can’t watch for price drops on Kayak and other flight aggregators.

After a bit of digging, I found a basic version of a tool to do just this. It’s a self-hosted bot to watch for flight cost drops so you can book (or rebook for free). I’ve been wanting to dig into server side JavaScript development, and this is the perfect excuse.

Here’s what I’d like to do:

- Get the tool running somewhere simple: Heroku, Raspberry Pi, etc.
- Convert the use of redis to mongodb. Redis isn’t a database, it’s a key-value store, but this project is using it for persistence. Why switch to MongoDB? I’ve been wanting to understand document databases a bit more. Postgres would have been easier for me, but this project is all about learning.
- Possibly add the option of searching for the best flight deal in a particular month.

Below is a ‘learning log’ of what I discovered along the way. Let’s get started!

Learning JavaScript

As I mentioned in an earlier post, my JavaScript knowledge was very out of date (pre-ES6). Some findings and musings below will be obvious to a seasoned JavaScript developer, but for someone more experienced in Ruby/Python/etc there’ll be some interesting tidbits.

- Looks like express is the dominant HTTP router + server. It’s equivalent to the routing engine of Rails combined with rack and unicorn.
- It doesn’t seem like there are strong conventions for how you set up an express-based app: you bring your own ODM/ORM, testing library, etc. There is a consistent template/folder structure, but express doesn’t make any assumptions about a database library, although it does support a couple of different templating languages and has a preferred default (pug).
- app.use adds additional middleware to the stack. Middleware is simply a function with three arguments. Very similar to rack in ruby-land or plugs in Elixir-land. (There’s a minimal example after this list.)
- There’s a part of me that loves the micro-modularity of the node/npm ecosystem, but the lack of declarative programming like DateTime.now + 1.day starts to feel really messy. The equivalent in node is (new Date()).setDate((new Date()).getDate() + 1). Another example: there’s no built-in sortBy, and sort mutates the original array. There are popular packages that solve this (moment, date-fns, underscore, etc) and the popular choice is to just pull in these packages and use them heavily. I always find including many external dependencies adds a lot of maintenance risk to your code: these packages can die, cause strange performance issues, cause weird compatibility issues with future iterations of the language, etc. The good news is the JavaScript ecosystem is so massive that the most popular packages have a very low risk of abandonment.
- Variable scoping is weird in debugger mode. If a variable isn’t referenced in the function, it’s not available to inspect in the debugger repl. Make sure you reference the variable to inspect/play with it in real time.
- Node, express, etc are not billed as full-stack web frameworks like rails. I find this super frustrating: not being able to spin up a console (rails console) with your entire app’s environment loaded up is annoying. For this particular problem, it looks like the best alternative is to write your own console.js (here’s another guide) with the things you need and start up a repl. The annoying thing here is you need to manually connect to your DB and trigger the REPL after the DB connection is successful. Blitz and Redwood are solving these problems, although they didn’t exist when this post was written.
- It seems like node inspect + a debugger line doesn’t run the code ‘completely’. For instance, if the code runs past a mongodb.connection line it doesn’t connect. I wonder if this is because the .connection call runs async and doesn’t get a chance to execute before the debugger line is hit? Is there a way to instruct the repl to execute anything in the async queue? I found that starting up a vanilla node console and requiring what you needed works better.
- There are some interesting utility libraries that convert all methods on an object to be promises (async): http://bluebirdjs.com/docs/api/promise.promisifyall.html
- Languages with declarative convenience methods are just so much nicer. args.priceHistory[args.priceHistory.length - 1] is ugly compared to args.priceHistory.last.
- My time at a BigCo has helped me understand the value of typing. I still find the highest velocity developer experience is type-hinting (i.e. types are not required) combined with a linter. This lets you play with code without getting all the details hardened, but still enforces guardrails to avoid a class of production errors.
- I’m not seeing the value in the event-loop programming paradigm. I get how it allows you to handle more concurrent connections, but isn’t that something that should be handled by the language or in some lower-level abstraction? It’s much easier to reason about code when it runs sequentially. For instance, not having object.save throw an exception right away is really annoying: I need to either use callbacks to act when the code has executed OR use async and await everywhere. I do not understand why this pattern has become so popular.
- https://repl.it is very cool. The idea of sending out links with a console running your code is very handy. This is used a lot in the JavaScript community.
- It’s fascinating to me how there’s always the 10x-er that becomes a hero of the community. https://github.com/substack has created a ridiculous number of npm packages.
- Think about let r = await promise as sugar for let r = null; promise.then(rr => r = rr), with execution pausing until the promise resolves.
- Instead of hash.merge(h2) you write Object.assign({}, h2, hash).
- There are many unintuitive sharp edges to the language. As you’re learning, just googling "how to do X with JavaScript" is the best way to determine the JavaScript equivalent.
- http://jsnice.org is great at parsing obfuscated JS. It tries to rename variables based on the context. Very cool.
- ... is the splat operator and can be used on objects; JavaScript calls it the ‘rest’ operator.
- constructor is the magic method for class initialization.
- Looks like function definitions within a class don’t need the function keyword.
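Here’s the minimal middleware example referenced above, assuming the standard express API (the logging behavior is just for illustration):

```typescript
import express from "express";

const app = express();

// Middleware is just a function of (request, response, next), very much
// like rack middleware in Ruby. This one logs every request.
app.use((req, res, next) => {
  console.log(`${req.method} ${req.path}`);
  next(); // hand control to the next middleware in the stack
});

app.get("/", (req, res) => {
  res.send("hello");
});

app.listen(3000);
```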

Puppeteer, Proxies, and Scraping

Part of this project involved scraping information from the web. Here are some tidbits about scraping that I learned:

- The node ecosystem is great for web scraping. Puppeteer is a well-maintained chrome-controller package and there’s lots of sample code you can leverage to hack things together quickly. (There’s a minimal example at the end of this section.)
- Websites have gotten very good at detecting scrapers. There are some workarounds to try to block bot detection, but if you are scraping a popular site, you will most likely be detected if you are using the default puppeteer installation.
- A common (and easy) detection method is IP address. If you are scraping from an AWS/cloud IP, you’ll be easily blocked. The way around this is a proxy to a residential IP address. Another option is to host your scraper locally on a Raspberry Pi or on your local computer.
- https://chrome.browserless.io is a cool way to test puppeteer scripts.
- I learned a bit about web proxies. Firstly, there are a bunch of proxy protocols (SOCKS, HTTP with basic auth, etc). Different systems support different types of proxies.

Package Management

- You can’t effectively use npm and yarn in the same project. Pick one or the other. Yarn is a more stable, more secure version of npm (but doesn’t have as many features / as much active development).
- module.exports lets a file expose constants to others which import the file, similar to python’s import system (but with default exports). I like this compared with ruby’s "everything is global" approach. It allows the author to explicitly define what it wants other users to access.
- npm will run pre & post scripts simply based on the name of the scripts.
- import Section, {SectionGroup} assigns Section to the default export of the file and imports SectionGroup explicitly.
- If you try to import something that isn’t defined in the module.exports of a file, you will not get an error and will instead get an undefined value for that import.

Testing

- tape is the test runner this particular project used. It doesn’t look like it’s possible to run just a single test in a file without changing the test code to use test.only instead of test.
- The "Test Anything Protocol" is interesting: http://testanything.org. Haven’t run into this before. I like consistent test output across languages.
- I do like how tape tests list out the status of each individual assertion. It becomes a bit verbose, but it’s helpful to see which assertions after the failing assertion succeeded or failed.
- VS Code + node debugging is very cool when you get it configured. You need to modify your VS Code launch.json in order to get it to work with test files. https://gist.github.com/dchowitz/83bdd807b5fa016775f98065b381ca4e#gistcomment-2204588
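As referenced above, here’s a minimal puppeteer sketch (the URL is a placeholder, and a real scraper would need bot-detection workarounds):

```typescript
import puppeteer from "puppeteer";

async function main() {
  // Launch a headless chrome instance and open a new tab.
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Load the page and wait for the network to settle before scraping.
  await page.goto("https://example.com", { waitUntil: "networkidle2" });
  console.log(await page.title());

  await browser.close();
}

main();
```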

Debugging & Hacking

I’m a big fan of REPL-driven development and I always put effort into understanding the repl environment in a language to increase development speed. Here are some tips & tricks I learned:

- Tab twice (after inputting ob.) in a repl exposes everything that is available on the object under inspection.
- node inspect THE_FILE.js allows debugger statements to work. You can also debug remotely with Chrome or with VS Code. Visual debugging is the happy path with node development; the CLI experience is poor.
- You don’t need to set up variables properly in the node repl. Nice! You can just type a = 1 instead of let a = 1.
- I’ll often copy code into a live console to play around with it, but if it’s defined as const I need to restart the console and make sure I don’t copy the const part of the variable definition. That’s annoying. There are a lot of sharp edges to the developer ergonomics.
- console.dir outputs the entire javascript object.
- Unlike pry, you need to explicitly call repl after you hit a breakpoint when running node inspect.
- Also, debugger causes all promises not to resolve when testing puppeteer. https://github.com/berstend/puppeteer-extra/wiki/How-to-debug-puppeteer
- Cool! Navigating to about:inspect in Chrome allows you to inspect a node/puppeteer process.
- list is equivalent to whereami. You need to execute it explicitly with params: list(5).
- _ exists like in ruby, but it doesn’t seem to work in a repl triggered by a debugger statement.
- _error is a neat feature which keeps the last exception that was thrown.
- .help while in a repl will output a list of "dot commands" you can use in the repl.
- I had a lot of trouble getting puppeteer to execute within a script run with node inspect and paused with debugger. I’m not sure why, but I suspect it has something to do with how promises are resolved in inspect mode.
- You can enable await in your node console via --experimental-repl-await. This is really helpful to avoid having to write let r; promise.then(o => r = o) all of the time.

Mongo & ODMs

- You’ll want to install mongo and the compass tool (brew install mongodb-compass) for GUI inspection.
- Running into startup problems? tail -f ~/Library/LaunchAgents/homebrew.mxcl.mongodb-community.plist
- If you had an old version of mongo installed long ago, you may need to: brew services stop mongodb-community && rm -rf /usr/local/var/mongodb && mkdir /usr/local/var/mongodb && brew services start mongodb-community -dv
- The connection string defaults to mongodb://localhost:27017.
- Mongoose looks like a well-liked JavaScript ODM for Mongo. (There’s a small sketch after this list.)
- You can think of each "row" (called a "document") as a JSON blob. You can nest things (arrays, objects, etc) in the blob. The blob is named using a UID, which is like a primary key but alphanumeric. You can do some fancy filtering that’s not possible with SQL and index specific keys on the blob.
- Looks like you define classes that map to tables ("schemas") but it doesn’t look like you can easily extend them. You can add individual methods to a class, but you can’t extend a mongoose model class.
- It looks like a mongoose.connection call creates an event loop. Without closing the event loop, the process will hang. Use process.exit() to kill all event loops.
- Relatedly, all mongo DB calls run async, so you’ll want to await them if you expect results synchronously.
- brew install mongodb-compass-community gives you a GUI to explore your mongo DB. Similar to Postico for Postgres.
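Here’s the small mongoose sketch mentioned above (the schema, model name, and connection string are hypothetical, loosely modeled on this project’s data):

```typescript
import mongoose from "mongoose";

// A hypothetical schema for the flight watcher: each document is a JSON
// blob, and nested arrays like priceHistory embed naturally.
const flightSchema = new mongoose.Schema({
  from: String,
  to: String,
  priceHistory: [Number],
});
const Flight = mongoose.model("Flight", flightSchema);

async function main() {
  await mongoose.connect("mongodb://localhost:27017/southwest");

  await Flight.create({ from: "STL", to: "DEN", priceHistory: [149] });
  const watched = await Flight.findOne({ from: "STL" });
  console.log(watched?.priceHistory);

  // Close the connection, otherwise the event loop keeps the process alive.
  await mongoose.connection.close();
}

main();
```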
Open Questions

- How are event loops, like the one mongoose uses, implemented? Is the node event loop built in JavaScript, or are there C-level hooks used for performance?
- There are lots of gaps in the default REPL experience. Is there an improved repl experience for hacking?
- Do Blitz/RedwoodJS/others materially improve the server-side JS experience?
- What killer features does mongodb have? How does it compare to other document databases? Is there a real reason to use document databases now that most SQL databases have a jsonb column type with an array of json operators built in?

Continue Reading