this post was submitted on 14 Nov 2023
1 points (100.0% liked)

LocalLLaMA

3 readers
1 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 1 year ago
MODERATORS
 

Warning, this is still work in progress.

https://huggingface.co/piotr-ai/polanka-7b-v0.1

First version of 7b Polish LLM finetuned using custom data in Polish language.

As a base model I used uncensored https://huggingface.co/ehartford/dolphin-2.1-mistral-7b so Dolphin "personality" should also be there.

It was trained using 4K context in ChatML format. All done on a single 4090 for multiple days.

top 3 comments
sorted by: hot top controversial new old
[โ€“] k0setes@alien.top 1 points 11 months ago (1 children)

I have tested several small 7B models for speaking Polish and it seems to me that currently openchat_3.5.Q4_K_S.gguf

is probably the best.
Of course this was not a large-scale study, so it is not necessarily 100% true ;)
And I look forward to the final release ๐Ÿ‘

[โ€“] Significant_Focus134@alien.top 1 points 11 months ago

Thanks! For the record, that version is very under-trained. Today I started to train on much bigger dataset (50k entries) that is mostly built from the wikipedia.

[โ€“] paryska99@alien.top 1 points 1 year ago

I hope we can get quantized gguf soon from the legendary TheBloke