not sure about that, but it is good.
I saw their 7B model closing in on gpt-4 scores in some benchmarks which is absolutely wild but also sus
It's surprising… check it out at least
thanks.
more bots, more bots!
zuck?
How was the model size increased to 11B? It’s a merge, but with what?
slices:
  - sources:
      - model: berkeley-nest/Starling-LM-7B-alpha
        layer_range: [0, 24]
  - sources:
      - model: berkeley-nest/Starling-LM-7B-alpha
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
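For anyone wondering where "11B" comes from: a passthrough merge just stacks the listed slices, so two 24-layer slices of a 32-layer model give 48 layers. Here is a rough sketch of the arithmetic. It assumes layer_range is half-open (start included, end excluded) and uses the published Mistral-7B dimensions, which are assumptions here and not something stated in this thread; Starling's vocabulary is slightly larger, which barely changes the total.

```python
# Sketch: count layers and estimate parameters for a passthrough self-merge.
# All dimensions are assumed Mistral-7B values, not taken from this thread.

HIDDEN = 4096          # model width
INTERMEDIATE = 14336   # MLP inner width
KV_DIM = 1024          # 8 key/value heads x 128 dims (grouped-query attention)
VOCAB = 32000          # assumed vocabulary size
BASE_LAYERS = 32       # layers in the original 7B model


def stacked_layers(layer_ranges):
    """Total layers after concatenating half-open [start, end) slices."""
    return sum(end - start for start, end in layer_ranges)


def params_per_layer():
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q, o plus k, v
    mlp = 3 * HIDDEN * INTERMEDIATE                        # gate, up, down
    norms = 2 * HIDDEN                                     # two norm weights
    return attention + mlp + norms


def total_params(num_layers):
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings plus output head
    final_norm = HIDDEN
    return num_layers * params_per_layer() + embeddings + final_norm


slices = [(0, 24), (8, 32)]
merged_layers = stacked_layers(slices)
print(merged_layers)                                # 48
print(round(total_params(BASE_LAYERS) / 1e9, 2))    # 7.24
print(round(total_params(merged_layers) / 1e9, 2))  # 10.73
```

Layers 8 to 23 appear twice in the result, which is why no second model is needed: the extra parameters are copies of the middle of the same network.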
mergekit. there is a link in the post.
gguf files have finished uploading for all llama.cpp users.
Speedy work, looking forward to testing these!
thanks!
you can now download it with lmstudio. be sure to use the openchat prompt template.
with itself.
you merged Starling with Starling? What merge did you use? Can you share the yaml?
yes, it's merged the same way as mistral 11b. With itself.
lmao
😀
I'm just testing it casually rn in lmsys, and really like its flow and tone. It's pretty pleasant to speak to from the get-go. It's a good start and I can't wait to dig a bit harder on it.
Great model. Question, do you know why it's outputting these "<0x0A>" tokens?
For example:
Here are some ways to improve your vertical leap:**<0x0A><0x0A>**1. Strength training: Focus on exercises
I’m getting the same output. Those are line breaks. How odd…
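Until the files are fixed, the stray tokens can be cleaned up after generation. They look like byte-fallback tokens (0x0A is the newline byte), so a small post-processing step can turn them back into real characters. This is only a workaround sketch, assuming the tokens always appear in the literal form `<0xNN>`; the function name is made up for illustration, and it does not fix whatever went wrong in the conversion.

```python
import re

# Matches one or more consecutive byte tokens such as <0x0A><0x0A>.
BYTE_RUN = re.compile(r"(?:<0x[0-9A-Fa-f]{2}>)+")


def decode_byte_tokens(text: str) -> str:
    """Replace runs of <0xNN> tokens with the bytes they stand for."""
    def replace(match):
        hex_values = re.findall(r"<0x([0-9A-Fa-f]{2})>", match.group(0))
        raw = bytes(int(h, 16) for h in hex_values)
        # Decode the whole run at once so multi-byte UTF-8 characters survive.
        return raw.decode("utf-8", errors="replace")
    return BYTE_RUN.sub(replace, text)


sample = "leap:**<0x0A><0x0A>**1. Strength training"
print(decode_byte_tokens(sample))
```

Decoding each run as a whole matters because a non-ASCII character can be spread over several byte tokens.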
im on it, thanks for testing.
I got starling-11b-q8_0.gguf in LM Studio and can't get a decent output. Either I get a one-liner like "here is a possible response: hello" with the default or ChatML preset, or I get pages of various "possible" responses, filled with smileys most of the time (Alpaca preset).
What chat format can I use?
I tried this model a little while ago with LM Studio and I noticed that it does not have GPU acceleration. Sin
I noticed the same: in LM Studio I cannot enable Apple Metal (GPU), and I get the message: "Metal acceleration is not yet supported for this model architecture ('starcoder')". However, according to Activity Monitor, it fully uses the GPU when it runs. And it is very fast!
I uploaded my lmstudio config file to https://huggingface.co/NurtureAI/Starling-LM-11B-alpha-v1-GGUF to help people with lmstudio setup.
Thank you friend, very useful! You unlocked the correct use of this amazing llm for a lot of non-tech users like me
thanks, i really appreciate it.
also i released more 11b mistral sizes, in case anyone is interested.
https://huggingface.co/NurtureAI/SynthIA-11B-v1.3
https://huggingface.co/NurtureAI/Mistral-11B-Instruct-v0.1
https://huggingface.co/NurtureAI/dolphin-2.2.1-mistral-11b
https://huggingface.co/NurtureAI/zephyr-11b-beta
https://huggingface.co/NurtureAI/openchat_3.5-11B
https://huggingface.co/NurtureAI/neural-chat-11b-v3-1
May the force be with you.
can u add this one :) https://huggingface.co/maywell/Synatra-7B-v0.3-RP
for sure right after i inhale this pizza rq
what are ur settings for passthrough as im trying to make a 11b https://huggingface.co/maywell/Synatra-7B-v0.3-RP
slices:
  - sources:
      - model: berkeley-nest/Starling-LM-7B-alpha
        layer_range: [0, 24]
  - sources:
      - model: berkeley-nest/Starling-LM-7B-alpha
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
I noticed a lot of responses asking about the mergekit configuration I used to copy layers of a Mistral 7B model into an 11B. Here is my config.yml for mergekit (link in post description):
slices:
  - sources:
      - model: maywell/Synatra-7B-v0.3-RP
        layer_range: [0, 24]
  - sources:
      - model: maywell/Synatra-7B-v0.3-RP
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
How do you figure out the correct config files for models? I am using LM Studio. For example, say I have this. How am I supposed to know how to distill this info for each model? Is there some sort of guide or logic to learn? It's driving me crazy.
Prompt template: OpenChat
GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
Be sure that after you set a new default template in LM Studio you reload the app, delete the chats, and reload the model. There is some sort of bug in that software that is extremely annoying. Hope that helps.
i updated configuration again for lmstudio config in gguf repo on huggingface.
This is my latest output after re-installing the entire program. I re-downloaded your model and used the q4_k version this time rather than q4_k_m. Still running into this weird issue where, after it's done, I get another answer, but this time it's from GPT4. I have no idea what that even means. I tried to highlight my settings in red.
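A second answer "from GPT4" usually means generation ran past the end of the assistant turn and the model started writing the next speaker label itself. Stop strings are the usual guard (the LM Studio presets in this thread call them antiprompt). As a rough illustration of what a stop string does, assuming you have the raw generated text in hand; the function name and the sample text are made up:

```python
def truncate_at_stop(text, stop_strings):
    """Cut text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stop_strings:
        if not stop:
            continue  # an empty stop string would match everywhere
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()


stops = ["<|end_of_turn|>", "GPT4"]
raw = "Paris is the capital of France.<|end_of_turn|>GPT4 Correct User: and Spain?"
print(truncate_at_stop(raw, stops))
```

If the end-of-turn token never shows up in the output, the template in use probably does not match the one the model was trained on, and no stop string will fully hide that.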
Last example, until i'm introduced to another issue in the generation.
Explain how a solid state battery is made in detail.
A solid-state battery is an emerging type of battery that uses a solid electrolyte instead of a liquid or gel-based one, which is commonly used in traditional batteries like lithium-ion batteries. The solid-state design offers several advantages over conventional batteries, such as improved safety, higher energy density, and longer lifespan. Here's a detailed explanation of how a solid-state battery is made:
1. Electrode preparation: The first step in making a solid-state battery involves preparing the anode (negative electrode) and cathode (positive electrode). Common materials used for these electrodes include lithium, silicon, or other metal oxides. These materials are typically mixed with conductive additives like carbon to enhance their electrical properties.
2. Electrolyte formation: The solid-state battery uses a solid electrolyte, which is usually made from ceramic materials such as lithium-ion conductors like Li7-xLa3-y(PO4)3 (LFP), or other solid-state materials that can facilitate the movement of ions between the anode and cathode. The electrolyte must have a high ionic conductivity while maintaining a stable structure to prevent unwanted reactions with the electrodes.
3. Electrode coating: After preparing the electrodes and electrolytes, they are coated onto thin metal or plastic substrates. This process involves depositing a thin layer of the material onto the substrate using techniques such as spraying, printing, or coating methods. The thickness of these layers is critical to ensure efficient ion transport and maintain battery performance.
4. Assembly: Once the electrodes and electrolytes are prepared, they are assembled into a battery structure. This typically involves stacking the anode, solid-state electrolyte, and cathode together in a layered configuration. The layers can be held together using adhesive materials or other bonding techniques to ensure electrical contact between the components.
5. Sealing: To prevent external contaminants from entering the battery and causing degradation, the assembled layers are sealed with a protective layer such as an encapsulating material or a thin film. This seal also helps maintain the integrity of the solid-state electrolyte and prevents unwanted reactions between the electrodes.
6. Final processing: After assembly and sealing, the battery is subjected to various tests and processes to ensure its performance, safety, and reliability. These tests may include cycling tests (charging and discharging) to evaluate the battery's capacity and lifespan, as well as thermal and mechanical stress tests to assess its durability under different conditions.
7. Battery packaging: Once the solid-state battery has passed all necessary tests and evaluations, it is packaged in a protective casing or housing that protects the battery from external impacts and environmental factors. The final product can then be integrated into various applications such as electric vehicles, portable electronics, or energy storage systems.
Overall, solid-state batteries have the potential to revolutionize the battery industry due to their superior performance characteristics and safety features. However, significant challenges still need to be addressed, including improving ionic conductivity, reducing manufacturing costs, and ensuring long-term stability of the materials used in these batteries.
Another good example:
Explain how a rocket engine works in detail. Use specific examples where necessary.
A rocket engine is a propulsion device that uses the principles of Newton's third law of motion and conservation of momentum to generate thrust by expelling a high-speed jet of gas or other fluid. The basic principle behind a rocket engine is that when a mass is accelerated in one direction, an equal and opposite force is applied in the opposite direction.
Here's a detailed explanation of how a rocket engine works:
1. Combustion chamber: The rocket engine consists of a combustion chamber where fuel (usually a mixture of oxygen and hydrogen) and oxidizer are mixed and ignited. This process is called combustion, and it generates a high-pressure, high-temperature gas mixture known as the propellant.
2. Nozzle: The hot, high-pressure gas mixture exits the combustion chamber through a converging-diverging section of the engine called the nozzle. The nozzle's design helps to increase the speed and pressure of the gas as it expands, which in turn increases the efficiency of the rocket engine.
3. Newton's third law: According to Newton's third law of motion, every action has an equal and opposite reaction. In the case of a rocket engine, the acceleration of the propellant out of the nozzle creates an equal and opposite force on the rocket itself, causing it to move in the opposite direction. This is how the rocket generates thrust.
4. Conservation of momentum: The conservation of momentum principle states that the total momentum of a closed system remains constant if no external forces are acting upon it. In a rocket engine, the momentum of the propellant entering the combustion chamber (which is initially at rest) is equal to the momentum of the expelled gas leaving the nozzle. This ensures that the rocket continues to accelerate as long as fuel is being consumed.
5. Rocket types: There are several types of rocket engines, including:
- Liquid-fueled: These engines use liquid fuels and oxidizers, such as hydrogen and oxygen, which are stored separately in tanks. Examples include the Space Shuttle's Rocketdyne RS-28 engine and the Falcon 9's Merlin engine.
- Solid-fueled: In these engines, the fuel and oxidizer are combined into a solid, composite material. When ignited, the entire solid propellant burns uniformly from the center to the outer surface. Examples include the Space Shuttle's Solid Rocket Booster (SRB) and the early Saturn V rocket engines.
- Hybrid: These engines use a combination of liquid and solid fuels, with the oxidizer stored in a separate tank. The fuel and oxidizer are fed into the combustion chamber separately, but they burn together to produce thrust. Examples include the SpaceShipTwo's hybrid engine and some experimental rocket engines.
In summary, a rocket engine works by igniting a mixture of fuel and oxidizer in a combustion chamber, which generates high-pressure gas. The gas is then expelled through a nozzle at high speeds, creating thrust according to Newton's third law and conservation of momentum principles. Different types of rocket engines use various combinations of fuels and oxidizers to achieve the desired performance characteristics.
I think I found the key to avoiding the repetitions and long, rambling answers this model tends to produce. Hopefully a further fine-tune will reduce that. The key is to turn creativity all the way down and make the model deterministic. How do you do that, you may ask? Easy: it is controlled by the following three inference parameters: temp, top_p, and top_k.
With the following default settings I often get repetitions or additional rambling information:
"top_k": 40,
"top_p": 0.95,
"temp": 0.8,
If I use the following values instead, to make the model deterministic, the problem seems to be gone:
"top_k": 1,
"top_p": 0.1,
"temp": 0.1,
Please note that if you want to use the model for story writing, maybe you get better results by dialing up the creativity.
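To see why these values make the output deterministic, here is a toy sampler over a short list of logits. It is a sketch, not how any particular runtime implements it: the function name is made up, and the order in which the three filters are applied is an assumption that differs between runtimes. With top_k set to 1 the order does not matter, because only the single most likely token survives.

```python
import math
import random


def sample_token(logits, temp=0.8, top_k=40, top_p=0.95, rng=random):
    """Pick a token index using temperature, then top-k, then top-p."""
    # Temperature: divide logits before softmax; lower values sharpen the distribution.
    scaled = [value / temp for value in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample from what is left; random.choices normalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


logits = [1.0, 3.0, 2.0]
greedy = {sample_token(logits, temp=0.1, top_k=1, top_p=0.1) for _ in range(100)}
print(greedy)  # {1}
```

With top_k at 1 the candidate list has one entry, so every call returns the highest-scoring token no matter what the random generator does. With the default values all three toy tokens stay in play and the result varies from call to call.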
Here is my complete config file for LM Studio:
{
  "name": "OpenChat",
  "inference_params": {
    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,
    "input_prefix": "GPT4 Correct User: ",
    "input_suffix": "<|end_of_turn|>GPT4 Correct Assistant: ",
    "antiprompt": [
      "GPT4",
      "<|end_of_turn|>",
      "[End of Turn]",
      "[]"
    ],
    "pre_prompt": "Below is an instruction that describes a task. Write a concise response that appropriately completes the request. Ensure all essential details are provided. Each of your statements must be unique.",
    "pre_prompt_suffix": "<|end_of_turn|>",
    "pre_prompt_prefix": "GPT4 System: "
  }
}
A few words about the above:
- I only include necessary options to avoid overwriting user settings when loading the model or switching prompt format. If you export a config file, please make sure you then edit it manually to clean it up.
- GPT4 Correct User/Assistant: the "Correct" keyword is important. It refers to the training data, where the answers were verified as correct. If you do not use it (e.g. GPT4 User), it will still work, but it will give more weight to training data that was unverified (Human User was also used).
- GPT4 System or just System are the two officially recommended ways to prefix system messages. Either works.
- In my system message (pre_prompt), I avoid any negatives (e.g. "No repetitions"). Remember this is just a language model: if it sees the word "repeat" (or similar), it will tend to treat it as an instruction to create repetitions! Instead I turned it around into a positive statement based on the word "unique".
- Trailing spaces in the prefixes and suffixes are not critical, but ensure proper formatting.
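To make the prefix and suffix fields less abstract, this is roughly the string the model ends up seeing for a chat. It is a sketch under the assumption that the fields are simply concatenated in order; how LM Studio actually joins them internally is not confirmed here, and the function name and sample messages are made up.

```python
def build_prompt(messages, system=None,
                 system_prefix="GPT4 System: ",
                 user_prefix="GPT4 Correct User: ",
                 assistant_prefix="GPT4 Correct Assistant: ",
                 end_of_turn="<|end_of_turn|>"):
    """Assemble an OpenChat-style prompt from (role, text) pairs."""
    parts = []
    if system:
        parts.append(f"{system_prefix}{system}{end_of_turn}")
    for role, text in messages:
        prefix = user_prefix if role == "user" else assistant_prefix
        parts.append(f"{prefix}{text}{end_of_turn}")
    # Leave the assistant turn open so the model writes the reply.
    parts.append(assistant_prefix)
    return "".join(parts)


prompt = build_prompt([("user", "Hello")], system="Be concise.")
print(prompt)
```

Every finished turn ends with the end-of-turn marker and the prompt stops right after the assistant label, which is why the input suffix in the config carries both.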
As a bonus, here is my config for generating code, which according to my limited testing, this model seems to be surprisingly good at:
{
  "name": "OpenChat Code",
  "inference_params": {
    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,
    "input_prefix": "Code User: ",
    "input_suffix": "<|end_of_turn|>Code Assistant: ",
    "antiprompt": [
      "GPT4",
      "<|end_of_turn|>",
      "[End of Turn]",
      "[]"
    ],
    "pre_prompt": "You are a helpful coding assistant. Respond concisely, but ensure all essential details are provided. Each of your statements must be unique.",
    "pre_prompt_suffix": "<|end_of_turn|>",
    "pre_prompt_prefix": "GPT4 System: "
  }
}
I have been further testing code generation, and I am impressed! It seems to be almost on par with GPT4, and it can do things GPT4 cannot, like writing code for Google Apps Script. I have tried a few relatively complex tasks in various languages (PowerShell, Python, JavaScript, Google Apps Script), covering various domains (Active Directory, Spotify, mathematics). As far as I can tell, the code provided is correct!
Try the following example (make sure to load the OpenChat code prompt format I provided earlier); the result is impressive:
I need help writing code for google apps script. Could you please write a function that connects to spotify and sort a given playlist by order of popularity. Then another function to write back the sorted track to a given playlist. Make sure to handle pagination properly, in case the spotify playlist is large. Also added errors and exception handling. Include details of how to connect to spotify.