andrew0

joined 2 years ago
[–] andrew0@lemmy.dbzer0.com 2 points 5 days ago

Be aware that their docs are so-so. Nanonets OCR, Mistral OCR and MinerU will also extract formulas and images.

One other model I forgot to mention is Docling. It is quite quick to set up in a Docker container, and it comes with a web interface where you can upload documents. It roughly follows the PaddleOCR pipeline, but also allows you to use VLMs.

Good luck!

[–] andrew0@lemmy.dbzer0.com 5 points 6 days ago (2 children)

If you find that OCR doesn't get you very far, maybe try a small VLM to parse PNGs of the pages. For example, Nanonets OCR will do this, although it is quite slow if you don't have a GPU. It will give you a Markdown version of the page, which you can then translate with another tool.
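A minimal sketch of the "page PNG in, Markdown out" idea, assuming the VLM is served through a local Ollama instance (the comment above does not say how the model is hosted, so that part is an assumption). The model name and the prompt wording are hypothetical placeholders; the endpoint is Ollama's default `/api/generate`, which accepts base64-encoded images for vision-capable models.

```python
import base64
import json
import urllib.request

# Default address of a local Ollama server (assumption: Ollama is the host).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ocr_request(png_bytes: bytes, model: str) -> dict:
    """Build a non-streaming generate request carrying one page image."""
    return {
        "model": model,  # hypothetical: whichever vision model you have pulled
        "prompt": "Transcribe this page to Markdown. Keep headings, tables and formulas.",
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,
    }


def page_to_markdown(png_path: str, model: str) -> str:
    """Send one page PNG to the local server and return the model's text output."""
    with open(png_path, "rb") as f:
        payload = build_ocr_request(f.read(), model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `page_to_markdown("page_001.png", "<your-vlm>")` once per page would give one Markdown string per page, which can then be passed to a translation tool.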

PaddleOCR might also be useful, since it focuses on Chinese, but it's more difficult to set up. Some other options are MinerU and Mistral OCR (the latter is paid, but you can test it for free if you upload your document to Mistral's library).

[–] andrew0@lemmy.dbzer0.com 120 points 6 days ago (5 children)

Sure, if all politicians make all their data available to the public. Their phone chat messages, photos taken, everything.

No...? Then don't bring it up ever again. Initiatives like these will only make it look like you're a villain if you want privacy.

[–] andrew0@lemmy.dbzer0.com 2 points 3 weeks ago

All the ones I mentioned can be installed with pip or uv if I am not mistaken. It would probably be more finicky than containers that you can put behind a reverse proxy, but it is possible if you wish to go that route. Ollama will also run system-wide, so any project will be able to use its API without you having to create a separate environment and download the same model twice in order to use it.
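As a rough illustration of the shared, system-wide API: any project can ask the same local Ollama server which models are already downloaded, instead of fetching its own copy. The sketch below assumes Ollama is listening on its default port and uses its `/api/tags` endpoint; the helper names are made up for this example.

```python
import json
import urllib.request

# Default address of a local Ollama server; adjust if yours differs.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_response: dict) -> list[str]:
    """Extract sorted model names from a decoded /api/tags response."""
    return sorted(m["name"] for m in tags_response.get("models", []))


def list_local_models(url: str = OLLAMA_TAGS_URL) -> list[str]:
    """Query the running server for the models it already has on disk."""
    with urllib.request.urlopen(url) as resp:
        return model_names(json.loads(resp.read()))
```

A project can call `list_local_models()` at startup and only pull a model if its name is missing from the list.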

[–] andrew0@lemmy.dbzer0.com 17 points 4 weeks ago (4 children)

Ollama for API, which you can integrate into Open WebUI. You can also integrate image generation with ComfyUI I believe.

It's less of a hassle to use Docker for Open WebUI, but Ollama works as a regular CLI tool.

[–] andrew0@lemmy.dbzer0.com 5 points 1 month ago

Look into Cosmic DE. It has a similar vibe if you set up tiling, but without all the headache of configuring all the components so that it is usable.

[–] andrew0@lemmy.dbzer0.com 12 points 1 month ago* (last edited 1 month ago) (2 children)

Didn't this guy just go on French national TV and say that Macron is a dictator?

I think if this guy gets voted in, Romania might say bye bye to its EU funds.

[–] andrew0@lemmy.dbzer0.com 33 points 2 months ago* (last edited 2 months ago)

I heard that Poland is also cheering for some MAGA guy in the next election... Troubling times ahead.

For Romania, there might still be a chance in the run-off. However, the difference between the two candidates was quite large (20% difference; 1.8 million votes). Similarly, the other candidates seemed to have voters that would rather vote for the nazi. Most likely all hope is lost, but that 1% chance is still there.

[–] andrew0@lemmy.dbzer0.com 11 points 2 months ago (1 children)

It would be a shame if someone were to make a post with their office locations across Europe and share it in all the European communities on Lemmy...

[–] andrew0@lemmy.dbzer0.com 20 points 2 months ago (3 children)

What's the plan against this? It's pretty clear that this type of grifting works. Hungary kept Orban in power for far too long, and now Romania might be next.

[–] andrew0@lemmy.dbzer0.com 8 points 3 months ago (1 children)

Romania has previously jumped into a war, only to change sides later. I wouldn't be surprised if they end up taking the bait before the upcoming presidential elections. From what I've heard, the far-right candidate that's left in the race is betting that he will get the votes of everyone who voted for Calin Georgescu. His platform? Being a bootlicker for Trump.

Troubling times for the Balkans.

[–] andrew0@lemmy.dbzer0.com 2 points 3 months ago

It was just announced that the EU is pausing sustainability requirements on smaller businesses (< 500 employees) for 2 years. This stems from fears related to the trade war, as they want to keep smaller businesses competitive. Nevertheless, I'm pretty sure that this won't be great for the environment.

 

Hello everyone! I am interested in replacing the Google Speech Recognition and Synthesis app on Android. For Speech-to-Text (STT), I've tried Whisper and FUTO, and settled on the latter because it seemed to be more versatile. FUTO also has decent recognition, but it is not yet capable of handling all the languages that I want. Regardless, I'm happy with STT so far. The only annoyance I have is that it does not appear as an option in the settings for Speech recognition :(

However, I can't seem to find any replacements that have good Text-to-Speech (TTS) quality. I tried espeak-ng and RHVoice, but both have robotic outputs.

Given the recent advancements in AI, I was expecting that there would be ways to incorporate open source TTS models like Kokoro to generate speech on the go. Nevertheless, I could not really find any such apps so far.

Has anyone managed to completely replace the Google app with (an)other privacy-focused FOSS app(s)?

 

Hi! I'm trying to archive papers as soon as they appear in a scientific journal, and I've attempted to search for PDF links on each page using some regular web scraping.

The problem is that most of these journals add their own fancy PDF readers, so downloading the file is not as straightforward as it seems. However, the Zotero Connector works flawlessly when you trigger the extension. Therefore, I attempted to set up a Selenium instance with this extension to download the papers given a link, but I struggle to actually get the extension to trigger. I tried sending a Shift + Ctrl + S command, but that doesn't seem to get picked up. Similarly, I can't figure out how to call the extension from the console.

Did anyone else attempt such a workflow before? Am I doing something completely unnecessary, as there are better options available? Help a fellow sailor out. Thanks a lot in advance for your help!
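For the plain web scraping route mentioned at the start of this post, one thing that may be worth checking before automating the extension: many journal pages (though not all) expose the PDF location in a `citation_pdf_url` meta tag, a convention publishers use for scholarly indexing. The sketch below only parses HTML that has already been fetched; the class and function names are made up for this example, and the link it finds may still require cookies or a subscription to download.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfMetaParser(HTMLParser):
    """Collect the content of every <meta name="citation_pdf_url"> tag."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == "citation_pdf_url" and attr.get("content"):
            self.pdf_urls.append(attr["content"])


def find_pdf_urls(html: str, base_url: str) -> list[str]:
    """Return absolute PDF URLs advertised in the page's meta tags."""
    parser = PdfMetaParser()
    parser.feed(html)
    # Resolve relative links against the article URL.
    return [urljoin(base_url, u) for u in parser.pdf_urls]
```

If the list comes back empty for a given journal, that page likely needs the heavier browser-based approach described above.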

90
submitted 9 months ago* (last edited 9 months ago) by andrew0@lemmy.dbzer0.com to c/technology@lemmy.world
 

I recently discovered that Redox OS got a new release earlier this month. I'm quite surprised how far they managed to get, given that only a handful of people are working on this project (compared to the Linux kernel).

Now, I'm curious what it would take to get bigger players to focus on this project. Given the recent Linux + Rust drama, it would surprise me if the backers of Rust for Linux would not give this project some attention.

 

Hello everyone! I've been playing around with Wayland for a bit and was hoping to start learning some more about it. For example, I would be interested in making a lock screen, similar to Swaylock, as a toy project.

What GUI toolkit would you use to develop apps on Wayland? I've added a little poll below with some of the popular choices I've seen thrown around. Feel free to add your own suggestions and maybe leave a comment as to why you'd use that!

Link to poll

44
Jump from Arch to NixOS? (lemmy.dbzer0.com)
submitted 2 years ago* (last edited 2 years ago) by andrew0@lemmy.dbzer0.com to c/linux@lemmy.ml
 

As the title implies, should I do it? I love Arch so far, and I can fix most issues that pop up. However, I sometimes wish to start fresh without too much hassle, but I get a feeling NixOS isn't as mature as Arch.

Have any of you used both, and if so, what do you miss from Arch? What are you grateful for in NixOS?

1
submitted 2 years ago* (last edited 2 years ago) by andrew0@lemmy.dbzer0.com to c/piracy@lemmy.dbzer0.com
 

Hi everyone! I'll soon take the DP-100 exam for Microsoft Azure, and I am interested in finding more leaked exam questions. At the moment, I am using examtopics for this, but it sucks because it basically cuts you off halfway through.

I heard there are some private trackers that specialize in exam questions, such as LearnFlakes, but I do not have anyone that can invite me to them. Therefore, I was wondering if there is another way to find the information I need for this exam.

Do you know any other sources that are fully free?
