BalorNG

joined 1 year ago
[–] BalorNG@alien.top 1 points 11 months ago

I say:

  1. It takes a performance hit, but it remains to be seen whether going with a much larger model can compensate for that.
  2. The model needs to be trained from scratch; apparently, you cannot fine-tune an existing model for this...
[–] BalorNG@alien.top 1 points 11 months ago

I mean, you can jailbreak/browbeat ChatGPT/Claude into going against their guardrails relatively easily, so I smash "X" for doubt that Grok is going to be any different. If it is, now THAT is going to be huge, though maybe not in a way we'd like, I guess...

[–] BalorNG@alien.top 1 points 11 months ago (1 children)

That explains why Goliath worked and yours, not so much, I guess...

[–] BalorNG@alien.top 1 points 11 months ago (1 children)

"Prompt Template: Alpeca" Wut?

Looks like a scam, to be fair. I bet if you apply, you'll get "Just send us $100 for access!"

[–] BalorNG@alien.top 1 points 11 months ago

Did you do post-merge retraining? Without at least some, the results are going to be poor...

[–] BalorNG@alien.top 1 points 11 months ago (3 children)

Did you do post-merge training, and how much?

[–] BalorNG@alien.top 1 points 11 months ago

10 s/tok and a couple of kilowatts of power... OK, if it were as smart as Einstein and as unerring as an oracle it might make sense, but you can use it for free on Petals at 3 tok/s, and it is most certainly not...

[–] BalorNG@alien.top 1 points 11 months ago

Technically, you can somewhat automate the testing process by creating a script that makes the model answer a series of questions that are relevant to YOU and are unique (so they cannot be gamed by training on benchmarks), and then evaluate the answers yourself.

Make sure you experiment with different sampling methods and run several tests, due to the inherent randomness of the output.
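
Something like this would do it (a rough sketch, untested; the endpoint URL and payload format assume a local OpenAI-compatible server, e.g. a llama.cpp server, so adjust both for your own setup):

```python
# Minimal personal-benchmark sketch: loop your own questions through the
# model several times with different sampling settings, then dump all the
# answers to a file so YOU can grade them yourself.
import json
import requests

API_URL = "http://localhost:8000/v1/completions"  # hypothetical local server

QUESTIONS = [
    # Put your own, un-gameable prompts here.
    "Summarize this story in two sentences: ...",
    "Write a SQL query that ...",
]

# Try several sampling configurations, since output quality varies with them.
SAMPLER_CONFIGS = [
    {"temperature": 0.7, "top_p": 0.9},
    {"temperature": 1.0, "top_p": 0.95},
]

RUNS_PER_CONFIG = 3  # repeat runs to average over sampling randomness

results = []
for question in QUESTIONS:
    for config in SAMPLER_CONFIGS:
        for run in range(RUNS_PER_CONFIG):
            response = requests.post(
                API_URL,
                json={"prompt": question, "max_tokens": 512, **config},
                timeout=300,
            )
            text = response.json()["choices"][0]["text"]
            results.append(
                {"question": question, "config": config,
                 "run": run, "answer": text}
            )

# Review the answers by hand; the whole point is that only you grade them.
with open("personal_benchmark.json", "w") as f:
    json.dump(results, f, indent=2)
```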

[–] BalorNG@alien.top 1 points 11 months ago (4 children)

Please, dear Tzeentch, have someone leak GPT-4 in the general confusion, I MUST know if it is really 10 7B models in a trench coat :)

[–] BalorNG@alien.top 1 points 11 months ago

My name is Mensch. Uber Mensch.

[–] BalorNG@alien.top 1 points 1 year ago (1 children)

He MUST become the CEO of Uber, too! :))))

 

https://arxiv.org/abs/2310.17680

Ok, technically a tiny language model for now:

Imagine a developer who can only change their last line of code, how often would they have to start writing a function from scratch before it is correct? Auto-regressive models for code generation from natural language have a similar limitation: they do not easily allow reconsidering earlier tokens generated. We introduce CodeFusion, a pre-trained diffusion code generation model that addresses this limitation by iteratively denoising a complete program conditioned on the encoded natural language. We evaluate CodeFusion on the task of natural language to code generation for Bash, Python, and Microsoft Excel conditional formatting (CF) rules. Experiments show that CodeFusion (75M parameters) performs on par with state-of-the-art auto-regressive systems (350M-175B parameters) in top-1 accuracy and outperforms them in top-3 and top-5 accuracy due to its better balance in diversity versus quality.

And it is only for code. And it seems to be much slower. But it looks extremely interesting as a "proof of concept".

I think that instead of a lot of "denoising" steps to generate text from gibberish, a dual-model system that takes a typical autoregressive output and then runs a few "denoising" steps over it to look for errors and inconsistencies might be the best of both worlds, compared to typical ways of improving output quality, like progressive refinement, that require rewriting the entire text token-by-token several times...
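
To make the control flow concrete, here is a toy sketch of that dual-model idea with both models stubbed out (ar_draft and denoise_pass are placeholders I made up, not anything from the CodeFusion paper):

```python
# Toy sketch of the dual-model pipeline: one cheap autoregressive pass
# produces a full draft, then a few whole-sequence "denoising" passes edit
# the draft in place instead of regenerating it token-by-token.
def ar_draft(prompt: str) -> str:
    """Stand-in for a normal autoregressive generation pass."""
    # Deliberately returns a draft with a planted bug for the demo.
    return "def add(a, b):\n    return a - b"

def denoise_pass(prompt: str, text: str) -> str:
    """Stand-in for one whole-sequence refinement/denoising step.

    A real denoiser would re-score the entire draft conditioned on the
    prompt and edit inconsistent spans; here we just patch the known bug
    to show the control flow.
    """
    return text.replace("a - b", "a + b")

def generate(prompt: str, denoise_steps: int = 3) -> str:
    text = ar_draft(prompt)              # single cheap AR pass for the draft
    for _ in range(denoise_steps):       # a few global cleanup passes
        new_text = denoise_pass(prompt, text)
        if new_text == text:             # stop early once the draft is stable
            break
        text = new_text
    return text

print(generate("Write a function that adds two numbers."))
```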
