this post was submitted on 12 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

founded 1 year ago

Look at this: apart from Llama 1, all the other "base" models will likely answer "language" after "As an AI". That suggests Meta, Mistral AI, and 01-ai (the company behind Yi) trained their "base" models on GPT-generated instruct datasets to inflate benchmark scores and make it look like the "base" models had a lot of potential. We got duped hard on that one.

https://preview.redd.it/vqtjkw1vdyzb1.png?width=653&format=png&auto=webp&s=91652053bcbc8a7b50bced9bbf8638fa417387bb

[–] phree_radical@alien.top 1 points 1 year ago

Interestingly, Mistral Instruct:

As an AI

### top_k:

0.686088: 13892 "assistant"
0.049313: 28725 ","
0.039010:  3842 "language"
0.037810:  2229 "model"
0.031591: 28733 "-"
0.018000:  3332 "research"
0.016518:  1587 "system"
0.009266: 21631 "Assistant"
0.006967:  7583 "expert"
0.005598:  3921 "tool"
0.004394:  8073 "agent"
0.004242:   369 "that"
0.002696:   304 "and"
0.002644:   297 "in"
0.001415:  5716 "student"
0.001410:  5514 "technology"
0.001197:  7786 "coach"
0.001073:  1918 "team"
0.001073: 24480 "scientist"
0.001052:  2818 "based"
0.001036:  2007 "program"
0.000925: 12435 "bot"
0.000819:  5181 "platform"
0.000819: 28723 "."
0.000816: 21782 "developer"
0.000813:  6031 "assist"
0.000806:  3327 "personal"
0.000803:  9464 "algorithm"
0.000776:  2488 "project"
0.000746:   354 "for"
0.000743:  8626 "teacher"
0.000666:  7511 "eth"
0.000645:  6953 "writer"
0.000640: 24989 "practition"
0.000623:  3441 "voice"
0.000621:  5024 "professional"
0.000611: 22275 "analyst"
0.000588: 15589 "Language"
0.000583:  8252 "virtual"
0.000531:  7153 "digital"
0.000525:   298 "to"
0.000523: 11108 "technique"
0.000523: 10706 "chat"
0.000521: 19899 "specialist"
0.000517:  8311 "tut"
0.000501:  1338 "person"
0.000493:  6878 "experiment"
0.000474:   325 "("
0.000460: 18112 "engineer"
0.000458:  4993 "application"
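The listing above is just the model's softmaxed next-token distribution at the position right after the prompt. A minimal stdlib-only sketch of that ranking step (the logits and vocab below are toy stand-ins, not real Mistral output; with a real model you'd take the final-position logits, e.g. `model(input_ids).logits[0, -1]` in transformers, and use the tokenizer's vocab):

```python
import math

def topk_next_tokens(logits, vocab, k=5):
    """Softmax a vector of raw logits and return the k most likely
    (probability, token) pairs, highest first."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(probs, vocab), reverse=True)[:k]

# Toy stand-in values, shaped like the listing above (not real model output).
vocab = ["assistant", ",", "language", "model", "-", "research"]
logits = [4.0, 1.4, 1.2, 1.1, 0.9, 0.4]

for p, tok in topk_next_tokens(logits, vocab, k=3):
    print(f"{p:.6f}: {tok!r}")
```

The contamination probe in the post is the same idea: feed a *base* model the string "As an AI" with no instruct template, and check whether "language"/"assistant"-style continuations dominate the top of this ranking.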