
Scraping the web with OpenAI

One of the really interesting LLM use cases is extracting structured data from unstructured data. In the old days (six months ago), extracting structured data from web pages, such as the price of a house on Redfin, required custom XPath or CSS selectors for each website, and those selectors constantly broke as the host changed its page structure. This is why Plaid (and similar competitors) break so often: many of their integrations "screen scrape," which means they need a team of people updating XPath and CSS selectors on various bank sites (TreasuryDirect, for example, is broken constantly). I built an open source database of venture capital firms that used this approach to extract team member information from each firm…
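The idea in this excerpt, asking an LLM for structured fields instead of maintaining per-site selectors, can be sketched roughly as follows. This is a minimal illustration and not the code from the post: the function names, the prompt wording, and the `call_llm` parameter (any callable that takes a prompt string and returns the model's text reply) are all assumptions, and no real OpenAI client call is shown.

```python
import json
from typing import Callable


def build_extraction_prompt(page_text: str, fields: list[str]) -> str:
    """Build a prompt asking the model to return the given fields as JSON."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the page text below and reply "
        f"with a single JSON object using exactly these keys: {field_list}. "
        "Use null for any field that is not present.\n\n"
        f"Page text:\n{page_text}"
    )


def parse_extraction(response_text: str, fields: list[str]) -> dict:
    """Parse the model's reply, keeping only the requested keys."""
    data = json.loads(response_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object in the model reply")
    # Missing keys become None; unexpected keys are dropped.
    return {field: data.get(field) for field in fields}


def extract(page_text: str, fields: list[str], call_llm: Callable[[str], str]) -> dict:
    """Hypothetical glue: prompt the model, then parse its reply."""
    prompt = build_extraction_prompt(page_text, fields)
    return parse_extraction(call_llm(prompt), fields)
```

Passing the model call in as a parameter keeps the sketch independent of any particular client library, and it means the parsing logic can be exercised with a stub instead of a live API.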

Continue Reading

Fixing Word Navigation in ZSH

Moving to zsh from bash has been a great quality-of-life improvement. However, there is one thing that has driven me nuts that I have not been able to figure out: customizing the word boundary definition. I’m using zsh 5.9 and have a lot of plugins. forward-word, backward-word, and the kill variants are the main widgets that I use; I used bindkey to identify them. After some investigation, it seems like these widgets are controlled via zstyle ':zle:*' configuration. You can dump configuration via zstyle -L. You can determine what underlying zsh function is used by a widget via zle -lL…

Continue Reading

Learning Swift Development for macOS by Building a Website Blocker

I loved Focus App. It blocked websites and apps on a schedule. But years ago it started glitching out: sucking up tons of RAM and freezing my computer. They didn’t fix the bug, so I abandoned it and switched to a host-based blocking system, which has served me well. However, there are some issues with the host-based approach: I can’t block specific URLs, only hosts (Focus App couldn’t do this either); I can’t set a schedule; I can’t block apps; if I remove a host from the block list, it will not automatically get blocked again unless I sleep and wake the computer; Sleepwatcher (CLI tool) is dead and requires some manual setup to get working…
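For readers unfamiliar with host-based blocking, the usual technique is to map each blocked hostname to a non-routable address in the hosts file. A minimal sketch of generating those entries follows. The function name and the choice of 0.0.0.0 as the sink address are assumptions for illustration, and the setup described in the post may differ.

```python
def hosts_entries(blocked_hosts, sink="0.0.0.0"):
    """Return hosts-file lines that point each blocked host at `sink`."""
    # Normalize case, drop blank entries, and de-duplicate so the
    # generated output is stable from run to run.
    hosts = sorted({h.strip().lower() for h in blocked_hosts if h.strip()})
    return [f"{sink} {host}" for host in hosts]
```

Because each line maps a whole hostname, this also shows why the approach cannot block a specific URL: the hosts file never sees paths, only names.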

Continue Reading

My Experience With GitHub Codespaces

I have an older Intel MacBook (2016, 2.9 GHz) that I use for personal projects. My corporate machine is an M1 MacBook Pro and I love it, but I’ve been holding off on replacing my personal machine until the pro M2 comes out (hopefully soon!). I love playing with new technology, especially developer tools, and when I got accepted to the Codespaces beta I couldn’t resist tinkering with it: I wanted to speed up my ancient MacBook, try some new tech, and be able to learn more ML/AI tooling in the future. Summary: I largely agree with this analysis. Codespaces are very cool. They work better than I expected; it felt like I was developing on a local machine…

Continue Reading

Learning TypeScript by Migrating Mint Transactions

Years ago, I built a Chrome extension to import transactions into Mint. Mint hasn’t been updated in nearly a decade at this point, and once it stopped connecting to my bank for over two months I decided to call it quits and switch to LunchMoney, which is improved frequently and has a lot of neat developer-focused features. However, I had years of historical data in Mint and I didn’t want to lose it when I transitioned. Luckily, Mint allows you to export all of your transaction data as a CSV and LunchMoney has an API. I’ve spent some time brushing up on my JavaScript knowledge in the past, and have since used Flow (a TypeScript competitor) in my work, but I’ve heard great things about TypeScript and wanted to see how it compared…
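The core of a migration like this is reshaping exported CSV rows into records an API can accept. A rough sketch of that step follows, written in Python and not the TypeScript the post uses. The CSV column names, the sign convention for debits, and the output keys are all assumptions for illustration; they are not taken from the post and are not LunchMoney's actual API schema.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal


def mint_rows_to_transactions(csv_text):
    """Convert exported CSV text into a list of plain transaction dicts."""
    transactions = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        amount = Decimal(row["Amount"])
        # Assumed convention: debits are represented as negative amounts.
        if row["Transaction Type"].strip().lower() == "debit":
            amount = -amount
        transactions.append({
            # Assumed input format M/D/YYYY, emitted as ISO 8601.
            "date": datetime.strptime(row["Date"], "%m/%d/%Y").date().isoformat(),
            "payee": row["Description"],
            "amount": str(amount),
            "category": row["Category"],
            "account": row["Account Name"],
        })
    return transactions
```

Using Decimal for the amount avoids floating-point rounding on currency values, and keeping the output as plain dicts leaves the actual API request (not shown) as a separate concern.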

Continue Reading

Building a SouthWest Price Monitor and Learning Server Side JavaScript

I originally wrote a draft of this post in early 2019. I’m spending some time learning TypeScript, so I wanted to finally get my JavaScript-related posts out of draft. Some notes and learnings here are out of date. Both sides of our family live out of state. Over the last couple years, we’ve turned them on to credit card hacking to make visiting cheap (free). SouthWest has some awesome point bonuses on credit cards, but you can’t watch for price drops on Kayak and other flight aggregators. After a bit of digging, I found a basic version of a tool to do just this. It’s a self-hosted bot to watch for flight cost drops so you can book (or rebook for free)…

Continue Reading

Building a Chrome Extension to Import Transactions into Mint

I originally wrote a draft of this post in early 2019. I’ve since stopped using Mint and switched to LunchMoney. However, I’m spending some time learning TypeScript so I wanted to finally get my JavaScript-related posts out of draft. I use Mint (although it’s rotting on the vine after being acquired by Intuit), and want to import a list of transactions from a bank account that isn’t supported. There’s no way to do this through the Mint UI, but there is a hack someone documented. I know old-school JavaScript but haven’t learned ES6, and I’ve never built a Chrome extension. Building a Chrome extension to use the private Mint API to import transactions from a CSV is a perfect learning project…

Continue Reading

Building a Docker image for a Python Django application

After building a crypto index fund bot I wanted to host the application so the purchase routines would run automatically. In addition to this bot, there were a couple of other smaller applications I’ve been wanting to see if I could self-host (Monica, Storj, Duplicati). In addition to what I’ve already been doing with my Raspberry Pi, I wanted to see if I could host a couple of small utilities/applications on it, and wanted to explore Docker more. A perfect learning project! Open Source Docker Files: As with any learning project, I find it incredibly helpful to clone a bunch of repos with working code into a ~/Projects/docker directory so I can easily ripgrep my way through them. https://github.com/schickling/dockerfiles/: older, but simple Dockerfiles…

Continue Reading

Using GitHub Actions With Python, Django, Pytest, and More

GitHub Actions is a powerful tool. When GitHub was first released, it felt magical: clean, simple, extensible, and it added so much value that it felt like you should be paying for it. GitHub Actions feels similarly powerful and has positively affected the package ecosystem of many languages. I finally had a chance to play around with it as part of building a crypto index fund bot. I wanted to set up a robust CI run which included linting, type checking, etc. Here’s what I learned: It’s not possible to test changes to GitHub Actions locally. You can use the GH CLI locally to run them, but GH will use the latest version of the workflow that exists in your repo. The best workflow I found is working on a branch and then squashing the changes…

Continue Reading