database Archives - Michael Bianco

Scraping the web with OpenAI

One of the really interesting LLM use cases is extracting structured data from unstructured data. In the old days (six months ago), extracting structured data from web pages required custom XPath or CSS selectors for each website, and those selectors constantly broke as the host changed its page structure (extracting the price of a house on Redfin, for instance). This is why Plaid (and similar competitors) break so often: many of their integrations "screen scrape," which means they need a team of people updating XPath and CSS selectors on various bank sites (TreasuryDirect, for example, is broken constantly). I built an open source database of venture capital firms that used this approach to extract team member information from each firm…
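To make the idea concrete, here is a minimal sketch of selector-free extraction. It is not the code from the post: `build_extraction_prompt`, `extract_fields`, and the `complete` callable are hypothetical names, and `complete` stands in for whatever LLM client you use (it takes a prompt string and returns the model's text reply). The sketch only shows the shape of the technique: describe the fields you want, ask for JSON, and parse the answer.

```python
import json
from typing import Callable, Dict, List, Optional


def build_extraction_prompt(page_text: str, fields: List[str]) -> str:
    """Describe the wanted fields in plain language instead of per-site selectors."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the page text and reply with "
        f"a single JSON object using exactly these keys: {field_list}. "
        "Use null for any field that is not present.\n\n"
        f"Page text:\n{page_text}"
    )


def extract_fields(
    page_text: str,
    fields: List[str],
    complete: Callable[[str], str],
) -> Dict[str, Optional[object]]:
    """Send the prompt through `complete` (a hypothetical LLM call) and parse the reply."""
    raw_reply = complete(build_extraction_prompt(page_text, fields))
    data = json.loads(raw_reply)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object in the model reply")
    # Keep only the requested keys; anything the model omitted becomes None.
    return {name: data.get(name) for name in fields}
```

Because the model call is passed in as a function, the same code works with any provider, and you can test it with a canned reply instead of a live request. In practice the reply may not be valid JSON, so a real scraper would want retry or validation logic around `json.loads`.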

How to Create a Read-Only MySQL User

Simple enough problem, but I couldn’t find a quick solution that worked out of the box for my configuration. Here’s a quick guide to setting up a read-only user on your MySQL server (useful for safely inspecting production databases). Note that you may need to change Host = 'localhost' depending on your MySQL server configuration.
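The excerpt above does not include the post's own commands, so what follows is only a rough sketch of one common way to do this, using the standard `CREATE USER` and `GRANT SELECT` statements. The helper name `read_only_user_sql` and the example user, password, and database names are made up for illustration. The password escaping assumes the server's default SQL mode, where backslash is an escape character.

```python
import re
from typing import List


def read_only_user_sql(
    user: str, password: str, database: str, host: str = "localhost"
) -> List[str]:
    """Build the statements for a SELECT-only MySQL account (hypothetical helper)."""
    # Restrict names to a conservative character set so they are safe to interpolate.
    for label, value in (("user", user), ("database", database)):
        if not re.fullmatch(r"[A-Za-z0-9_]+", value):
            raise ValueError(f"unsupported characters in {label}: {value!r}")
    # Hosts may also contain dots, dashes, and the % wildcard.
    if not re.fullmatch(r"[A-Za-z0-9_.%-]+", host):
        raise ValueError(f"unsupported characters in host: {host!r}")

    # Escape backslashes and single quotes inside the password literal.
    escaped = password.replace("\\", "\\\\").replace("'", "''")
    account = f"'{user}'@'{host}'"
    return [
        f"CREATE USER {account} IDENTIFIED BY '{escaped}';",
        f"GRANT SELECT ON `{database}`.* TO {account};",
    ]


if __name__ == "__main__":
    for statement in read_only_user_sql("readonly", "change-me", "app_production"):
        print(statement)
```

Run the printed statements as an administrative user. If you connect from another machine, pass a different `host` (for example `%` for any host), which is the same adjustment the note about 'localhost' refers to.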
