Simon Willison's Weblog
April 23, 2026

Extract PDF text in your browser with LiteParse for the web

by Simon Willison's Weblog

LlamaIndex have a most excellent open source project called LiteParse, which provides a Node.js CLI tool for extracting text from PDFs. I got a version of LiteParse working entirely in the browser, using most of the same libraries that Lit