Extracting hidden text from a PDF, meaning text that exists only as pixels rather than as a selectable text layer, requires Optical Character Recognition (OCR). Unlike the FullText and Native methods, which read selectable text, OCR recognizes text in images and scanned documents.
FullText and Native work for text-based PDFs, but fail if text is embedded in images.
OCR engines (such as Tesseract, Microsoft OCR, or OmniPage OCR) analyze the rendered page image and extract the text they recognize, so content without a text layer can still be retrieved. Accuracy depends on the quality of the image.
UiPath provides the Get OCR Text and Read PDF With OCR activities for this purpose.
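The choice between the two approaches can be sketched as a simple fallback: try the selectable text layer first, and use OCR only when it yields nothing. This is a minimal Python sketch of that decision, not UiPath's implementation. The names `extract_text`, `read_native`, `read_ocr`, and `min_chars` are hypothetical, and the two stand-in readers only simulate a text-based PDF and a scanned one.

```python
from typing import Callable, Tuple


def extract_text(
    pdf_path: str,
    read_native: Callable[[str], str],
    read_ocr: Callable[[str], str],
    min_chars: int = 1,
) -> Tuple[str, str]:
    """Return (method, text): try the native text layer, fall back to OCR."""
    text = read_native(pdf_path).strip()
    if len(text) >= min_chars:
        # The PDF has selectable text, so OCR is unnecessary.
        return "native", text
    # No usable text layer (e.g. a scanned page), so recognize the pixels.
    return "ocr", read_ocr(pdf_path).strip()


# Stand-in readers for illustration only (assumption): real ones would wrap
# a PDF text extractor and an OCR engine.
def fake_native(path: str) -> str:
    return "" if path.endswith("scan.pdf") else "Invoice 1001"


def fake_ocr(path: str) -> str:
    return "Invoice 1001 (recognized)"


if __name__ == "__main__":
    print(extract_text("report.pdf", fake_native, fake_ocr))
    print(extract_text("scan.pdf", fake_native, fake_ocr))
```

Passing the readers in as functions keeps the decision logic independent of any particular PDF library or OCR engine.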
✅ Reference: UiPath Official Documentation – PDF and OCR