09 Dec Install Poppler on Databricks cluster
I'm working on a project where I have to use Optical Character Recognition (OCR) to extract and analyze data from scanned PDF documents. This ETL process will run on a Databricks cluster. To accomplish this, I am using the following Python libraries: pdf2image and easyocr....
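Since pdf2image is a wrapper around the Poppler command-line tools, a quick sanity check is to see whether those tools are on the PATH of the node. The snippet below is only a minimal sketch of that check: the helper name `poppler_available` is hypothetical, and the assumption that `pdftoppm` and `pdfinfo` are the binaries worth looking for reflects pdf2image's default behaviour, not anything specific to Databricks.

```python
import shutil


def poppler_available(binaries=("pdftoppm", "pdfinfo")):
    """Return True if every listed Poppler command-line tool is on PATH.

    Hypothetical helper for illustration; the default binary names are an
    assumption based on what pdf2image invokes by default.
    """
    return all(shutil.which(name) is not None for name in binaries)


if __name__ == "__main__":
    if poppler_available():
        print("Poppler tools found on PATH.")
    else:
        # On Ubuntu-based images the tools usually ship in the
        # poppler-utils package (assumption about the base image).
        print("Poppler tools not found on PATH.")
```

Running this in a notebook cell only tells you about the driver node; worker nodes would need the same check if the conversion runs inside a distributed job.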