Look at Meta's Nougat OCR: use the API, run a Flask server, and play with it.
"Guys I'm new to medicine, I'm gonna solve cancer, any guidance would be helpful."
If you're interested in a new task, a good place to start is its Papers with Code leaderboard, where you can find the recent research papers on that task.
As of about a year ago, I hadn't seen anything that really outperforms Tesseract across multiple benchmarks. You can get near 100% accuracy if the image is clean and the font isn't anything unusual. But if the image is noisy, you need to lower your expectations.
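If you want to put a number on "near 100% accuracy" when comparing engines, one common choice is character-level accuracy, i.e. 1 minus the character error rate (edit distance divided by the length of the ground truth). Here is a minimal sketch in plain Python; the function names `edit_distance` and `char_accuracy` are made up for illustration, and this is not necessarily the metric any particular benchmark uses.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between ground truth `ref` and OCR output `hyp`."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Character accuracy = 1 - CER, clamped at 0 (hypothetical helper)."""
    if not ref:
        return 1.0 if not hyp else 0.0
    cer = edit_distance(ref, hyp) / len(ref)
    return max(0.0, 1.0 - cer)


if __name__ == "__main__":
    truth = "hello world"
    ocr_output = "hel1o world"  # one substituted character
    print(f"{char_accuracy(truth, ocr_output):.3f}")
```

Run it over the same set of ground-truth/OCR-output pairs for each engine so the comparison is on equal footing.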
Azure Form Recognizer OCR is very good.
hey there, as a beginner in ML, staying updated on OCR is key. to surpass Tesseract or EasyOCR, focus on deep learning models, like CNNs or transformers. achieving near 100% accuracy is tough, but pre-processing, data augmentation, and model fine-tuning can help get you there. it's not fully solved, but keep experimenting and learning. good luck!
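To make the pre-processing point a bit more concrete: one simple denoising step that is often tried before OCR is a 3x3 median filter, which removes isolated speckles while mostly preserving stroke edges. The sketch below is pure Python on a grayscale image stored as a list of rows; `median_filter_3x3` is a hypothetical helper written for this example, and in practice you would likely use an image library instead.

```python
from statistics import median


def median_filter_3x3(img: list[list[int]]) -> list[list[int]]:
    """Replace each pixel with the median of its 3x3 neighbourhood.

    `img` is a grayscale image as a list of equal-length rows (0-255).
    At the borders the window is clipped to the pixels that exist.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = int(median(window))
    return out


if __name__ == "__main__":
    # White page with a single black speckle in the middle.
    page = [[255] * 5 for _ in range(5)]
    page[2][2] = 0
    cleaned = median_filter_3x3(page)
    print(cleaned[2][2])
```

Whether this helps depends on the noise: it suits salt-and-pepper speckles, but it can also erase very thin strokes, so check the OCR output before and after.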
have you checked out the latest research papers on OCR? following top conferences like CVPR, ICCV, and NeurIPS can help you stay updated. consider exploring deep learning models like Transformers for improved accuracy. good luck!