this post was submitted on 23 Oct 2023

Self-Hosted Main

Hello, I'm starting a new course and the materials are all view-only PDFs. For convenience I often use online services to convert images to text (even ChatGPT 4 does it). Does anybody know of some kind of self-hosted OCR converter to turn screenshots into text?

Thanks

[–] sixtyfifth_snow@alien.top 1 points 1 year ago

tesseract-ocr? You can install it via apt or a similar package manager.
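
For screenshots, a minimal sketch of calling the tesseract command line from Python might look like this. It assumes tesseract is already installed and on the PATH; the function names and the `eng` language default are just illustrative choices, not anything tesseract prescribes.

```python
import shutil
import subprocess
import sys


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for one tesseract run (hypothetical helper)."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognized text instead of writing it to a .txt file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_screenshot(image_path, lang="eng"):
    """Run tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH; install tesseract-ocr first")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python ocr_screenshot.py screenshot.png
    print(ocr_screenshot(sys.argv[1]))
```

Other languages need the matching language pack installed, for example `lang="ita"` for Italian.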

[–] DaHunni@alien.top 1 points 1 year ago (1 children)

paperless-ngx has built-in OCR, but I don't think it would fit your needs.

[–] t1nk3rz@alien.top 1 points 1 year ago

I will check it out.

[–] The_Laki@alien.top 1 points 1 year ago (1 children)

Windows 11 has this built in if you take a screenshot

[–] t1nk3rz@alien.top 1 points 1 year ago

Didn't know that. I use Flameshot for screenshots. I will take a look, thanks.

[–] BadGroundbreaking243@alien.top 1 points 1 year ago (1 children)

You could spin up paperless-ngx. Or use PDF24 Creator. Beware that paperless will delete the file once it has consumed it.

I used paperless-ngx before and it works pretty well.

[–] t1nk3rz@alien.top 1 points 1 year ago

I will check it out. I have Stirling-PDF and I see it also has OCR support.

[–] henry_tennenbaum@alien.top 1 points 1 year ago

I'm not sure I understand you correctly. Do you want to apply OCR to PDFs or to screenshots?

For PDFs there's the excellent ocrmypdf, which paperless-ngx uses under the hood.
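
As a rough sketch, ocrmypdf can be driven from a small Python script like the one below. It assumes the `ocrmypdf` command is installed and on the PATH; the helper names and file names are made up for illustration. The `--skip-text` flag tells ocrmypdf to leave pages alone that already have a text layer.

```python
import shutil
import subprocess
import sys


def build_ocrmypdf_command(input_pdf, output_pdf, lang="eng"):
    """Build the argument list for one ocrmypdf run (hypothetical helper)."""
    # -l selects the tesseract language; --skip-text skips pages that
    # already contain text instead of failing on them.
    return ["ocrmypdf", "-l", lang, "--skip-text", str(input_pdf), str(output_pdf)]


def add_text_layer(input_pdf, output_pdf, lang="eng"):
    """Write a searchable copy of input_pdf to output_pdf."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH; install it first")
    # check=True raises CalledProcessError if ocrmypdf reports a failure.
    subprocess.run(build_ocrmypdf_command(input_pdf, output_pdf, lang), check=True)


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python add_text_layer.py course_notes.pdf course_notes_ocr.pdf
    add_text_layer(sys.argv[1], sys.argv[2])
```

The input file is left untouched; the searchable version is written to the second path.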

[–] lilolalu@alien.top 1 points 1 year ago

Nextcloud AIO (all-in-one) comes with full-text search installed, which brings Tesseract to Nextcloud. So you can let tesseract-ocr run over all your documents, and they will then be searchable with Elasticsearch.