AfterAte
[–] AfterAte@alien.top 1 points 11 months ago

Any model that won't give me instructions on how to make napalm is censored in my book. ¯\\_(ツ)_/¯

[–] AfterAte@alien.top 1 points 11 months ago

Run a test with the llama.cpp binary directly and then through oobabooga (which uses llama-cpp-python), and see if there's a consistent difference. I'm guessing even the glue layer can be a bottleneck.
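A minimal timing sketch for the Python side, assuming a local GGUF model and the llama-cpp-python package (the model path and prompt are placeholders):

```python
# Rough tokens/sec check through llama-cpp-python (the "glue" layer).
# Compare against the bare llama.cpp CLI on the same model and settings.
import time
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", n_ctx=2048)  # placeholder path

start = time.perf_counter()
out = llm("Write a haiku about GPUs.", max_tokens=128)
elapsed = time.perf_counter() - start

generated = out["usage"]["completion_tokens"]
print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.2f} t/s")
```

If the Python path is consistently slower than the bare binary, the wrapper overhead is your bottleneck.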

[–] AfterAte@alien.top 1 points 11 months ago

If nobody has any other suggestions, try Phind.com with GPT-4 selected in the drop-down. You get 10 free tries a day; the downside is that they use your data to train their in-house model. That's as good as you're gonna get for lesser-known languages, for free.

Since all local models suck at Rust, I'm gonna assume all general-purpose coding models suck at anything but the most common languages (Python, JS/TypeScript, C/C++, Java). Although there is an SQL-only model that is really good with SQL. Maybe someone did one for PS...

[–] AfterAte@alien.top 1 points 11 months ago

Great! Have fun!

[–] AfterAte@alien.top 1 points 11 months ago

I don't know much about Rust, but Easy Rust is a good source for learning: https://github.com/Dhghomon/easy_rust

But in a useful format for fine-tuning... no idea where to get that, and I'm not qualified to make it either. But I don't want to burden you with extra work, so I guess C++ will have to do for now :) Thank you for the model, from me and everyone else with a potato PC m(_ _)m
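For reference, a minimal sketch of what one instruction-tuning record might look like (the field names follow the common Alpaca convention, and the Rust content is just an illustrative placeholder):

```python
# One Alpaca-style training record, appended to a JSONL dataset file.
# Field names follow the common Alpaca convention; the content is made up.
import json

record = {
    "instruction": "Explain what the ? operator does in Rust.",
    "input": "",
    "output": "The ? operator propagates errors: if a Result is Err, the "
              "function returns that error immediately; otherwise it "
              "unwraps the Ok value.",
}

with open("rust_dataset.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```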

[–] AfterAte@alien.top 1 points 11 months ago

Also, set the temperature to 0.1 or 0.2. Those two things helped me get it to work nicely.

[–] AfterAte@alien.top 1 points 11 months ago

Btw, does your dataset include coding examples? If so, do you include Rust? I find current models really suck at Rust, but they can make a pretty good Snake game in Python 😂

[–] AfterAte@alien.top 1 points 11 months ago

Try using the Alpaca template, turn the temperature down to 0.1 or 0.2, and set the repetition penalty to 1. I haven't tested this yet, but those settings work for Deepseek-Coder. If you're using oobabooga, the StarChat preset works for me.
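A sketch of those sampling settings through llama-cpp-python (the model path is a placeholder; repeat_penalty=1.0 disables the penalty, and the prompt follows the standard Alpaca layout):

```python
# Low-temperature generation with the repetition penalty disabled,
# using an Alpaca-style prompt template.
from llama_cpp import Llama

llm = Llama(model_path="deepseek-coder.gguf")  # placeholder path

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a Rust function that reverses a string.\n\n"
    "### Response:\n"
)

out = llm(
    prompt,
    max_tokens=256,
    temperature=0.1,     # 0.1-0.2 keeps code generation near-deterministic
    repeat_penalty=1.0,  # 1.0 = repetition penalty disabled
)
print(out["choices"][0]["text"])
```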

[–] AfterAte@alien.top 1 points 11 months ago

Wow, that's amazing. On the EvalPlus leaderboard, Deepseek-Coder-1.3B-instruct gets 64.6, so that's a ~4-point increase. It's only about 3 points below Phind-v2's result, which is amazing.

[–] AfterAte@alien.top 1 points 11 months ago

Oh nice! I'll have to try those settings and compare them with the StarChat preset in Oobabooga. I hear ya, I get 1 t/s too... it's unbearable.

[–] AfterAte@alien.top 1 points 11 months ago

Nice, I didn't know that.

[–] AfterAte@alien.top 1 points 11 months ago

Take a look at Phind.com. They use the web to enhance their model's answers, which means you can get up-to-date information on APIs instead of relying on data with a 2021 or 2022 cutoff. You can use their in-house Phind V8 model for free; if you want GPT-4, you get 10 tries a day, and they have paid plans for more. They recently announced that their free V8 model is as good as GPT-4, but other people here have disagreed with them. I've never used GPT-4, but their free Phind model is better than anything we have locally.
