LocalLLaMA

14 readers

1 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 2 years ago

MODERATORS

communick@poweruser.forum

Safety checks in Llama 2 (alien.top)

submitted 2 years ago by Little-Name9809@alien.top to c/localllama@poweruser.forum

4 comments fedilink hide all child comments

Recently came across this AI Safety test report from LinkedIn: https://airtable.com/app8zluNDCNogk4Ld/shrYRW3r0gL4DgMuW/tblpLubmd8cFsbmp5

From this report it seems Llama 2 (7B version?) lacks some safety checks compared to OpenAI models. Same with Mistral. Did anyone find the same result? Has it been a concern for you?

you are viewing a single comment's thread
view the rest of the comments

[–] phree_radical@alien.top 1 points 2 years ago (1 children)

It's comparing base models (which are not trained to follow or refuse instructions) against instruction-tuned ones (OpenAI)

[–] CookieCat171@alien.top 1 points 2 years ago (1 children)

afety checks in Llama 2

it seems it's comparing chat models: https://airtable.com/app8zluNDCNogk4Ld/shrYRW3r0gL4DgMuW/tblpLubmd8cFsbmp5

[–] phree_radical@alien.top 1 points 2 years ago

Looks like you've now made some changes. Columns now read "Llama2-7b-chat" instead of "llama2." Also, chat responses below the completions, chastising the inappropriate messages. However, a completion was generated, first, and the item is still marked as "fail." Very poor show