this post was submitted on 28 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

https://huggingface.co/NurtureAI/Starling-LM-11B-alpha-v1

This is Berkeley's Starling-LM-7B-alpha with the model size increased from 7B to 11B.
Special thanks to user Undi95 for their mistral passthrough explanation using cg123's mergekit, to Berkeley of course for Starling-LM-7B-alpha, and to everyone contributing to open-source AI development.

Together we are strong!

The performance of this model will increase drastically as it is further fine-tuned with the newly added layers.

AWQ version and GGUF version coming soon!

[–] perlthoughts@alien.top 1 points 1 year ago

not sure about that, but it is good.

[–] LyPreto@alien.top 1 points 1 year ago (3 children)

I saw their 7B model closing in on GPT-4 scores in some benchmarks, which is absolutely wild but also sus

[–] shaman-warrior@alien.top 1 points 1 year ago (1 children)

It's surprising… check it out at least

[–] perlthoughts@alien.top 1 points 1 year ago

more bots, more bots!

[–] extopico@alien.top 1 points 1 year ago (1 children)

How was the model size increased to 11B? It's a merge, but with what?

[–] perlthoughts@alien.top 1 points 1 year ago

slices:
  - sources:
    - model: berkeley-nest/Starling-LM-7B-alpha
      layer_range: [0, 24]
  - sources:
    - model: berkeley-nest/Starling-LM-7B-alpha
      layer_range: [8, 32]
merge_method: passthrough
dtype: float16
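
For anyone wondering where the 11B figure comes from, here is some rough arithmetic (my own estimate, assuming Mistral-7B's 32 layers and roughly linear parameter scaling):

# Rough parameter estimate for the passthrough self-merge above.
# Assumes Mistral-7B's 32 transformer layers and that parameter count
# scales roughly linearly with layer count (embeddings ignored).
base_layers = 32
base_params = 7.0e9                          # ~7B

slice_a = range(0, 24)                       # layer_range [0, 24] -> layers 0..23
slice_b = range(8, 32)                       # layer_range [8, 32] -> layers 8..31
merged_layers = len(slice_a) + len(slice_b)  # 48 layers; 8..23 appear twice

est_params = base_params * merged_layers / base_layers
print(f"{merged_layers} layers, ~{est_params / 1e9:.1f}B params")  # 48 layers, ~10.5B params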

[–] perlthoughts@alien.top 1 points 1 year ago

mergekit. there is a link in the post.

[–] perlthoughts@alien.top 1 points 1 year ago (1 children)

gguf files have finished uploading for all llama.cpp users.

[–] Mother-Ad-2559@alien.top 1 points 1 year ago (1 children)

Speedy work, looking forward to testing these!

[–] perlthoughts@alien.top 1 points 1 year ago

you can now download it with lmstudio. be sure to use the openchat prompt template.

[–] extopico@alien.top 1 points 1 year ago (1 children)

you merged Starling with Starling? What merge did you use? Can you share the yaml?

[–] perlthoughts@alien.top 1 points 1 year ago

yes, it's merged with itself, the same way as mistral 11b.

[–] roselan@alien.top 1 points 1 year ago

I'm just testing it casually rn in lmsys, and really like its flow and tone. It's pretty pleasant to speak to from the get-go; it's a good start and I can't wait to dig a bit harder into it.

[–] MoneroBee@alien.top 1 points 1 year ago (2 children)

Great model. Question, do you know why it's outputting these "<0x0A>" tokens?

For example:

Here are some ways to improve your vertical leap:**<0x0A><0x0A>**1. Strength training: Focus on exercises

[–] LocoMod@alien.top 1 points 1 year ago

I’m getting the same output. Those are line breaks. How odd…
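
For what it's worth, 0x0A is the ASCII line-feed byte: tokenizers with byte fallback emit it as the literal token <0x0A>, and a front end that prints the token text instead of decoding the byte shows it verbatim. A minimal repair sketch, assuming that is what is happening here:

import re

def decode_byte_tokens(text: str) -> str:
    # Replace each <0xNN> byte-fallback token with the byte it encodes.
    # Fine for single-byte ASCII like 0x0A; real multi-byte UTF-8
    # sequences would need to be collected and decoded together.
    return re.sub(r"<0x([0-9A-Fa-f]{2})>",
                  lambda m: chr(int(m.group(1), 16)), text)

print(decode_byte_tokens("leap:**<0x0A><0x0A>**1. Strength training"))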

[–] perlthoughts@alien.top 1 points 1 year ago

im on it, thanks for testing.

[–] roselan@alien.top 1 points 1 year ago

I got starling-11b-q8_0.gguf in LM Studio and can't get decent output. Either I get a one-liner like "here is a possible response: hello" with the default or ChatML preset, or I get pages of various "possible" responses filled with smileys most of the time (Alpaca preset).

What chat format can I use?

[–] Creative_Bottle_3225@alien.top 1 points 1 year ago (1 children)

I tried this model a little while ago with LM Studio and I noticed that it does not have GPU acceleration. Sin

[–] ex-arman68@alien.top 1 points 1 year ago

I noticed the same: in LM Studio I cannot enable Apple Metal (GPU), and I get the message: "Metal acceleration is not yet supported for this model architecture ('starcoder')". However, according to Activity Monitor, it fully uses the GPU when it runs. And it is very fast!

[–] perlthoughts@alien.top 1 points 1 year ago (1 children)

I uploaded my lmstudio config file to https://huggingface.co/NurtureAI/Starling-LM-11B-alpha-v1-GGUF to help people with lmstudio setup.

[–] CharlieBarracuda@alien.top 1 points 1 year ago (1 children)

Thank you friend, very useful! You unlocked the correct use of this amazing LLM for a lot of non-tech users like me.

[–] perlthoughts@alien.top 1 points 1 year ago

thanks, i really appreciate it.

[–] actualopenai@alien.top 1 points 1 year ago (2 children)

what are ur settings for passthrough? im trying to make an 11b of https://huggingface.co/maywell/Synatra-7B-v0.3-RP

[–] perlthoughts@alien.top 1 points 1 year ago

slices:
  - sources:
    - model: berkeley-nest/Starling-LM-7B-alpha
      layer_range: [0, 24]
  - sources:
    - model: berkeley-nest/Starling-LM-7B-alpha
      layer_range: [8, 32]
merge_method: passthrough
dtype: float16

[–] perlthoughts@alien.top 1 points 1 year ago

for sure right after i inhale this pizza rq

[–] perlthoughts@alien.top 1 points 1 year ago

I noticed a lot of responses about the mergekit configuration i used to copy layers of the 7b mistral model up to 11b. Here is my config.yml for mergekit (link in the post description):

slices:
  - sources:
    - model: maywell/Synatra-7B-v0.3-RP
      layer_range: [0, 24]
  - sources:
    - model: maywell/Synatra-7B-v0.3-RP
      layer_range: [8, 32]
merge_method: passthrough
dtype: float16
[–] AdTotal4035@alien.top 1 points 1 year ago

How do you figure out the correct config files for models? I am using LM Studio. For example, say I have this. How am I supposed to know how to distill this info for each model? Is there some sort of guide or logic to learn? It's driving me crazy.

Prompt template: OpenChat

GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
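
For what it's worth, {prompt} is the only variable in that template; everything else is literal text the model was trained on, so the app just has to reproduce the exact string, special tokens included. A minimal sketch of the expansion:

# Expanding the OpenChat template by hand; "{prompt}" is the placeholder,
# everything else is sent to the model verbatim.
TEMPLATE = "GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:"

def build_prompt(user_message: str) -> str:
    return TEMPLATE.format(prompt=user_message)

print(build_prompt("What is a passthrough merge?"))
# GPT4 User: What is a passthrough merge?<|end_of_turn|>GPT4 Assistant:
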
[–] perlthoughts@alien.top 1 points 1 year ago

be sure after you set the new default template in lm studio that you reload the app, delete the chats, and reload the model. there is some sort of bug in that software that is extremely annoying. hope that helps.

[–] perlthoughts@alien.top 1 points 1 year ago (1 children)

i updated the lmstudio config again in the gguf repo on huggingface.

[–] AdTotal4035@alien.top 1 points 1 year ago

This is my latest output after re-installing the entire program. I re-downloaded your model and used the q4_k version this time rather than q4_k_m. Still running into this weird issue where, after it's done, i get another answer, but this time it's from gpt4. I have no idea what that even means. I tried to highlight my settings in red.

https://preview.redd.it/4lup96l36f3c1.png?width=1940&format=png&auto=webp&s=74a91c79a6c4935711de0822aa232a46f3db3ad2

[–] perlthoughts@alien.top 1 points 1 year ago

Last example, until i'm introduced to another issue in the generation.

Explain how a solid state battery is made in detail.

 A solid-state battery is an emerging type of battery that uses a solid electrolyte instead of a liquid or gel-based one, which is commonly used in traditional batteries like lithium-ion batteries. The solid-state design offers several advantages over conventional batteries, such as improved safety, higher energy density, and longer lifespan. Here's a detailed explanation of how a solid-state battery is made:

1. Electrode preparation: The first step in making a solid-state battery involves preparing the anode (negative electrode) and cathode (positive electrode). Common materials used for these electrodes include lithium, silicon, or other metal oxides. These materials are typically mixed with conductive additives like carbon to enhance their electrical properties.

2. Electrolyte formation: The solid-state battery uses a solid electrolyte, which is usually made from ceramic materials such as lithium-ion conductors like Li7-xLa3-y(PO4)3 (LFP), or other solid-state materials that can facilitate the movement of ions between the anode and cathode. The electrolyte must have a high ionic conductivity while maintaining a stable structure to prevent unwanted reactions with the electrodes.

3. Electrode coating: After preparing the electrodes and electrolytes, they are coated onto thin metal or plastic substrates. This process involves depositing a thin layer of the material onto the substrate using techniques such as spraying, printing, or coating methods. The thickness of these layers is critical to ensure efficient ion transport and maintain battery performance.

4. Assembly: Once the electrodes and electrolytes are prepared, they are assembled into a battery structure. This typically involves stacking the anode, solid-state electrolyte, and cathode together in a layered configuration. The layers can be held together using adhesive materials or other bonding techniques to ensure electrical contact between the components.

5. Sealing: To prevent external contaminants from entering the battery and causing degradation, the assembled layers are sealed with a protective layer such as an encapsulating material or a thin film. This seal also helps maintain the integrity of the solid-state electrolyte and prevents unwanted reactions between the electrodes.

6. Final processing: After assembly and sealing, the battery is subjected to various tests and processes to ensure its performance, safety, and reliability. These tests may include cycling tests (charging and discharging) to evaluate the battery's capacity and lifespan, as well as thermal and mechanical stress tests to assess its durability under different conditions.

7. Battery packaging: Once the solid-state battery has passed all necessary tests and evaluations, it is packaged in a protective casing or housing that protects the battery from external impacts and environmental factors. The final product can then be integrated into various applications such as electric vehicles, portable electronics, or energy storage systems.

Overall, solid-state batteries have the potential to revolutionize the battery industry due to their superior performance characteristics and safety features. However, significant challenges still need to be addressed, including improving ionic conductivity, reducing manufacturing costs, and ensuring long-term stability of the materials used in these batteries.
[–] perlthoughts@alien.top 1 points 1 year ago

Another good example:

Explain how a rocket engine works in detail. Use specific examples where necessary.

A rocket engine is a propulsion device that uses the principles of Newton's third law of motion and conservation of momentum to generate thrust by expelling a high-speed jet of gas or other fluid. The basic principle behind a rocket engine is that when a mass is accelerated in one direction, an equal and opposite force is applied in the opposite direction.

Here's a detailed explanation of how a rocket engine works:

1. Combustion chamber: The rocket engine consists of a combustion chamber where fuel (usually a mixture of oxygen and hydrogen) and oxidizer are mixed and ignited. This process is called combustion, and it generates a high-pressure, high-temperature gas mixture known as the propellant.

2. Nozzle: The hot, high-pressure gas mixture exits the combustion chamber through a converging-diverging section of the engine called the nozzle. The nozzle's design helps to increase the speed and pressure of the gas as it expands, which in turn increases the efficiency of the rocket engine.

3. Newton's third law: According to Newton's third law of motion, every action has an equal and opposite reaction. In the case of a rocket engine, the acceleration of the propellant out of the nozzle creates an equal and opposite force on the rocket itself, causing it to move in the opposite direction. This is how the rocket generates thrust.

4. Conservation of momentum: The conservation of momentum principle states that the total momentum of a closed system remains constant if no external forces are acting upon it. In a rocket engine, the momentum of the propellant entering the combustion chamber (which is initially at rest) is equal to the momentum of the expelled gas leaving the nozzle. This ensures that the rocket continues to accelerate as long as fuel is being consumed.

5. Rocket types: There are several types of rocket engines, including:

- Liquid-fueled: These engines use liquid fuels and oxidizers, such as hydrogen and oxygen, which are stored separately in tanks. Examples include the Space Shuttle's Rocketdyne RS-28 engine and the Falcon 9's Merlin engine.

- Solid-fueled: In these engines, the fuel and oxidizer are combined into a solid, composite material. When ignited, the entire solid propellant burns uniformly from the center to the outer surface. Examples include the Space Shuttle's Solid Rocket Booster (SRB) and the early Saturn V rocket engines.

- Hybrid: These engines use a combination of liquid and solid fuels, with the oxidizer stored in a separate tank. The fuel and oxidizer are fed into the combustion chamber separately, but they burn together to produce thrust. Examples include the SpaceShipTwo's hybrid engine and some experimental rocket engines.

In summary, a rocket engine works by igniting a mixture of fuel and oxidizer in a combustion chamber, which generates high-pressure gas. The gas is then expelled through a nozzle at high speeds, creating thrust according to Newton's third law and conservation of momentum principles. Different types of rocket engines use various combinations of fuels and oxidizers to achieve the desired performance characteristics.
[–] ex-arman68@alien.top 1 points 1 year ago (1 children)

I think I found the key to avoiding repetitions and long rambling answers, which this model has a tendency to produce. Hopefully a further fine-tune will reduce it. The key is to turn creativity all the way down and make the model deterministic. How do you do that, you may ask? Easy: it is controlled by the following 3 inference parameters: temp, top_p, and top_k

With the following default settings I often get repetitions or additional rambling information:

    "top_k": 40,
    "top_p": 0.95,
    "temp": 0.8,

If I use the following values instead, to make the model deterministic, the problem seems to be gone:

    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,

Please note that if you want to use the model for story writing, you may get better results by dialing the creativity back up.
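
One note on why this works (general sampling behaviour, not specific to this model): top_k=1 alone already makes decoding greedy, since only the single most probable token survives the filter, so temp and top_p change nothing further. A hedged sketch using the llama-cpp-python bindings, with the GGUF file named earlier in the thread:

# Deterministic (greedy) decoding: with top_k=1 only the argmax token
# can be sampled, so temperature and top_p have no remaining effect.
from llama_cpp import Llama

llm = Llama(model_path="starling-11b-q8_0.gguf")
out = llm(
    "GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:",
    max_tokens=256,
    temperature=0.1,
    top_p=0.1,
    top_k=1,
    stop=["<|end_of_turn|>"],
)
print(out["choices"][0]["text"])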

Here is my complete config file for LM Studio:

{
  "name": "OpenChat",
  "inference_params": {
    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,
    "input_prefix": "GPT4 Correct User: ",
    "input_suffix": "&lt;|end_of_turn|>GPT4 Correct Assistant: ",
    "antiprompt": [
      "GPT4",
      "&lt;|end_of_turn|>",
      "[End of Turn]",
      "[]"
    ],
    "pre_prompt": "Below is an instruction that describes a task. Write a concise response that appropriately completes the request. Ensure all essential details are provided. Each of your statements must be unique.",
    "pre_prompt_suffix": "&lt;|end_of_turn|>",
    "pre_prompt_prefix": "GPT4 System: "
  }
}

A few words about the above:

  • I only include necessary options to avoid overwriting user settings when loading the model or switching prompt format. If you export a config file, please make sure you then edit it manually to clean it up.
  • GPT4 Correct User/Assistant. The Correct keyword is important. It refers to the training data, where the answers were verified as correct. If you do not use it (eg: GPT4 User), it will still work, but it will give more weight to training data which was unverified (Human User was also used).
  • GPT4 System or just System are the 2 officially recommended ways to prefix system messages. Either works.
  • In my system message (pre_prompt), I avoid any negative (eg: No repetitions). Remember this is just a language model: if it sees the word "repeat" (or similar), it will have a tendency to see this as an instruction to create repetitions! Instead I turned it around into a positive statement based on the word "unique".
  • Trailing spaces in the prefixes and suffixes are not critical, but they help ensure proper formatting.
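
To make the role of each field concrete, here is a sketch of the first-turn prompt the above config should produce (the exact concatenation order is my assumption, not something LM Studio documents here):

# Hedged reconstruction of the first-turn prompt; the concatenation
# order below is assumed, not taken from LM Studio documentation.
cfg = {
    "pre_prompt_prefix": "GPT4 System: ",
    "pre_prompt": "Below is an instruction that describes a task. ...",  # truncated for brevity
    "pre_prompt_suffix": "<|end_of_turn|>",
    "input_prefix": "GPT4 Correct User: ",
    "input_suffix": "<|end_of_turn|>GPT4 Correct Assistant: ",
}

def first_turn(user_message: str) -> str:
    return (cfg["pre_prompt_prefix"] + cfg["pre_prompt"] + cfg["pre_prompt_suffix"]
            + cfg["input_prefix"] + user_message + cfg["input_suffix"])

print(first_turn("Explain how a rocket engine works."))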

As a bonus, here is my config for generating code, which, according to my limited testing, this model seems to be surprisingly good at:

{
  "name": "OpenChat Code",
  "inference_params": {
    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,
    "input_prefix": "Code User: ",
    "input_suffix": "&lt;|end_of_turn|>Code Assistant: ",
    "antiprompt": [
      "GPT4",
      "&lt;|end_of_turn|>",
      "[End of Turn]",
      "[]"
    ],
    "pre_prompt": "You are a helpful coding assistant. Respond concisely, but ensure all essential details are provided. Each of your statements must be unique.",
    "pre_prompt_suffix": "&lt;|end_of_turn|>",
    "pre_prompt_prefix": "GPT4 System: "
  }
}
[–] ex-arman68@alien.top 1 points 1 year ago

I have been further testing code generation, and I am impressed! It seems to be almost on par with GPT4, and it can do things GPT4 cannot, like writing code for Google Apps Script. I have tried a few relatively complex tasks in various languages (PowerShell, Python, JavaScript, Google Apps Script), covering various domains (Active Directory, Spotify, Mathematics). As far as I can tell, the code provided is correct!

Try the following example (make sure to load the OpenChat Code prompt format I provided earlier); the result is impressive:

I need help writing code for google apps script. Could you please write a function that connects to spotify and sorts a given playlist by order of popularity. Then another function to write back the sorted tracks to a given playlist. Make sure to handle pagination properly, in case the spotify playlist is large. Also add error and exception handling. Include details of how to connect to spotify.