this post was submitted on 12 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

founded 1 year ago

Look at this: apart from Llama 1, all the other "base" models will likely answer "language" after "As an AI". That suggests Meta, Mistral AI, and 01-ai (the company behind Yi) trained their "base" models on GPT-generated instruct datasets to inflate benchmark scores and make it look like the "base" models had a lot of potential. We got duped hard on that one.

https://preview.redd.it/vqtjkw1vdyzb1.png?width=653&format=png&auto=webp&s=91652053bcbc8a7b50bced9bbf8638fa417387bb

[–] phree_radical@alien.top 1 points 1 year ago

Interestingly, Mistral Instruct:

As an AI

### top_k:

0.686088: 13892 "assistant"
0.049313: 28725 ","
0.039010:  3842 "language"
0.037810:  2229 "model"
0.031591: 28733 "-"
0.018000:  3332 "research"
0.016518:  1587 "system"
0.009266: 21631 "Assistant"
0.006967:  7583 "expert"
0.005598:  3921 "tool"
0.004394:  8073 "agent"
0.004242:   369 "that"
0.002696:   304 "and"
0.002644:   297 "in"
0.001415:  5716 "student"
0.001410:  5514 "technology"
0.001197:  7786 "coach"
0.001073:  1918 "team"
0.001073: 24480 "scientist"
0.001052:  2818 "based"
0.001036:  2007 "program"
0.000925: 12435 "bot"
0.000819:  5181 "platform"
0.000819: 28723 "."
0.000816: 21782 "developer"
0.000813:  6031 "assist"
0.000806:  3327 "personal"
0.000803:  9464 "algorithm"
0.000776:  2488 "project"
0.000746:   354 "for"
0.000743:  8626 "teacher"
0.000666:  7511 "eth"
0.000645:  6953 "writer"
0.000640: 24989 "practition"
0.000623:  3441 "voice"
0.000621:  5024 "professional"
0.000611: 22275 "analyst"
0.000588: 15589 "Language"
0.000583:  8252 "virtual"
0.000531:  7153 "digital"
0.000525:   298 "to"
0.000523: 11108 "technique"
0.000523: 10706 "chat"
0.000521: 19899 "specialist"
0.000517:  8311 "tut"
0.000501:  1338 "person"
0.000493:  6878 "experiment"
0.000474:   325 "("
0.000460: 18112 "engineer"
0.000458:  4993 "application"
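The listing above is just the model's softmaxed next-token distribution at the position right after the prompt. A minimal stdlib-only sketch of that ranking step (the logits and vocab below are toy stand-ins, not real Mistral output; with a real model you'd take the final-position logits, e.g. `model(input_ids).logits[0, -1]` in transformers, and use the tokenizer's vocab):

```python
import math

def topk_next_tokens(logits, vocab, k=5):
    """Softmax a vector of raw logits and return the k most likely
    (probability, token) pairs, highest first."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(probs, vocab), reverse=True)[:k]

# Toy stand-in values, shaped like the listing above (not real model output).
vocab = ["assistant", ",", "language", "model", "-", "research"]
logits = [4.0, 1.4, 1.2, 1.1, 0.9, 0.4]

for p, tok in topk_next_tokens(logits, vocab, k=3):
    print(f"{p:.6f}: {tok!r}")
```

The contamination probe in the post is the same idea: feed a *base* model the string "As an AI" with no instruct template, and check whether "language"/"assistant"-style continuations dominate the top of this ranking.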