Using ChatGPT to Convert LabCorp PDFs into a Google Sheet
Tags: ai, chatgpt, health • Categories: Productivity
Over the last couple of years I’ve monitored my food, blood levels, etc. more closely. That’s a topic for another blog post, but it’s been really interesting to watch how key blood levels have changed over time and reacted to changes in my diet and exercise.
I use LabCorp for all my blood work, and I’ve been relatively happy with them. However, their online portal does not allow you to download a CSV or Excel document of your blood work over time. They only offer a PDF download.
This makes it hard to track your levels over time and to understand how they respond to lifestyle changes.
Enter ChatGPT. With the latest vision models, you can use it to extract tabular data from the unstructured PDF that LabCorp provides you.
Converting PDF to JPEGs
The first step is converting the PDF into multiple JPEGs that you can easily drop into ChatGPT. I was not able to figure out how to combine multiple pages into a single JPEG while capping the number of pages in each image. In my opinion, the ImageMagick command line tool is poorly documented and does not provide enough examples to make this easy to work out. If anyone knows how to instruct ImageMagick to split a PDF into multiple JPEGs with a certain number of pages in each JPEG, let me know!
Here’s the zsh script you can use (after `brew install imagemagick`) to convert your LabCorp PDF into JPEGs, one per page. I found that using lower-quality JPEGs did not harm model performance and actually improved it when attaching multiple images to a single chat.
convert-pdf-to-jpeg() {
  local filepath=$1
  # strip the extension: report.pdf -> report
  local outputpath="${filepath:r}"

  # the position of `filepath` matters! Put this before the operators
  # `%d` in the output name is replaced with the zero-based page number
  magick -density 300 "${filepath}" -alpha remove -resize 1024x -quality 100 "${outputpath}-%d.jpeg"

  # select the first output in the finder
  open -R "${outputpath}-0.jpeg"
}
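On the open question of grouping several pages into one JPEG: one possible approach, not verified against a real LabCorp report, is to select a page range with ImageMagick’s `file.pdf[0-2]` syntax and stack those pages vertically with `-append`. The sketch below assumes that works for your PDF. The function names `page_ranges` and `convert_pdf_to_stacked_jpegs` and the default of three pages per image are made up for illustration.

```shell
# Print zero-based page ranges ("0-2", "3-5", ...) covering `total` pages,
# with at most `per` pages in each range.
page_ranges() {
  local total="$1"
  local per="$2"
  local start=0
  local end
  while [ "$start" -lt "$total" ]; do
    end=$((start + per - 1))
    # cap the last range at the final page
    if [ "$end" -ge "$total" ]; then
      end=$((total - 1))
    fi
    echo "${start}-${end}"
    start=$((end + 1))
  done
}

# Hypothetical variant of convert-pdf-to-jpeg: writes one JPEG per group of
# pages instead of one JPEG per page. Second argument is pages per image.
convert_pdf_to_stacked_jpegs() {
  local filepath="$1"
  local per="${2:-3}"
  local outputpath="${filepath%.*}"
  local total range
  local i=0
  # %n prints the page count once per page, so keep only the first line
  total=$(magick identify -format '%n\n' "$filepath" | head -n 1)
  for range in $(page_ranges "$total" "$per"); do
    # resize each page first, then stack the group top to bottom
    magick -density 300 "${filepath}[${range}]" -alpha remove -resize 1024x -append -quality 100 "${outputpath}-${i}.jpeg"
    i=$((i + 1))
  done
}
```

Calling `convert_pdf_to_stacked_jpegs report.pdf 2` would then aim to produce `report-0.jpeg`, `report-1.jpeg`, and so on, each holding up to two pages. Tall stacked images may be harder for the model to read, so it is worth comparing the results against the one-page-per-image output.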
Converting LabCorp JPEGs into Tabular Data
It’s a little-known trick, but you can paste tab-separated content directly into Google Sheets. This makes it easy to run command line tools, convert their output to tab-separated content, and paste it straight into a sheet.
This is exactly what we do here: instruct ChatGPT to convert the JPEGs into tab-separated content that can be pasted into Google Sheets.
Here’s the prompt that worked for me:
Extract the data in the attached labcorp blood report into the following columns:
- Date
- Specimen ID
- Test
- Units
- Result. Numeric value, can include `<`
- Normal Range. In the format of `lower-higher` where `lower` and `higher` are both integers.
Format the result as a TSV in a code block. Omit header line.
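Model output is not always perfectly formed, so before pasting it can be worth checking that every row really has the six columns the prompt asks for. The following is a minimal sketch rather than part of the original workflow; the function name `check_labcorp_tsv` is hypothetical, and it assumes the TSV from ChatGPT has been saved to a file or is piped in.

```shell
# Report any row that does not have exactly six tab-separated fields
# (Date, Specimen ID, Test, Units, Result, Normal Range).
# Reads the files given as arguments, or standard input when there are none.
# Exits non-zero if at least one malformed row was found.
check_labcorp_tsv() {
  awk -F '\t' '
    /^$/ { next }   # ignore blank lines
    NF != 6 { printf "line %d has %d columns: %s\n", NR, NF, $0; bad = 1 }
    END { exit bad + 0 }
  ' "$@"
}
```

On macOS, after copying the code block from ChatGPT, something like `pbpaste | check_labcorp_tsv && echo "looks good"` checks the clipboard contents before you paste them into the sheet.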
Here’s the Google Sheets template that you can paste the LabCorp blood work directly into.
I’ve also published a custom GPT with this prompt and some additional settings (like disabling the code interpreter).