Chromix_@alien.top 1 points 1 year ago

You wrote that it works for short prompts. Did you also try slightly longer prompts (up to 4k tokens)? That is still short of the sliding window, yet for me and some others it already produced little useful output.

I tried MistralLite because its small model size paired with the large context would be great for quick summarization of long texts. It does not summarize for me, though, neither with the transformers example code and the original model nor when running via llama.cpp.

Instead of summarizing the text, it continues the story, the way other models do when given an incorrect prompt format or none at all. It does not seem to be broken in general, though, as the example prompt on the model page works fine. Here is a prompt that doesn't work for me with this model:

<|prompter|>Summarize the following story:

The Time Traveller (for so it will be convenient to speak of him) was expounding a recondite matter to us. His pale grey eyes shone and twinkled, and his usually pale face was flushed and animated. The fire burnt brightly, and the soft radiance of the incandescent lights in the lilies of silver caught the bubbles that flashed and passed in our glasses. Our chairs, being his patents, embraced and caressed us rather than submitted to be sat upon, and there was that luxurious after-dinner atmosphere, when thought runs gracefully free of the trammels of precision. And he put it to us in this way - marking the points with a lean forefinger - as we sat and lazily admired his earnestness over this new paradox (as we thought it) and his fecundity.

"You must follow me carefully. I shall have to controvert one or two ideas that are almost universally accepted. The geometry, for instance, they taught you at school is founded on a misconception."

"Is not that rather a large thing to expect us to begin upon?" said Filby, an argumentative person with red hair.

"I do not mean to ask you to accept anything without reasonable ground for it. You will soon admit as much as I need from you. You know of course that a mathematical line, a line of thickness nil, has no real existence. They taught you that? Neither has a mathematical plane. These things are mere abstractions."

"That is all right," said the Psychologist.

"Nor, having only length, breadth, and thickness, can a cube have a real existence."

"There I object," said Filby. "Of course a solid body may exist. All real things-"

"So most people think. But wait a moment. Can an instantaneous cube exist?"

"Don’t follow you," said Filby.

"Can a cube that does not last for any time at all, have a real existence?"<|assistant|>

I get this in return from MistralLite:

The Time Traveller paused, and then said: "Consider a sphere of infinitesimal thickness and an infinitesimal radius punctured in the centre. Such a sphere has no volume and no surface, and as it moves forward in time it will appear to an observer to pass through him. Such a sphere, I contend, has no reality."

"But of course a volume less than a cubic centimetre can be imagined," said the Mathematician.

Mistral-OpenOrca, on the other hand, gives a proper summary, even though this prompt format does not match the one it expects:

The Time Traveller shared his groundbreaking idea with his friends during a lively conversation after dinner. He explained that the geometry they learned in school was based on a misconception and that mathematical lines and planes were mere abstractions without real existence. He further argued that a solid body, such as a cube, could not have real existence if it did not last for any time at all. His friends found his ideas challenging and controversial, but they were intrigued by his passion and creativity.

Am I doing something wrong? Even after Amazon's finetune, the model shouldn't have lost its summarization capabilities.
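
One thing that might be worth ruling out is the prompt format itself. As far as I recall, the example on the MistralLite model card puts an end-of-sequence token `</s>` between the user text and `<|assistant|>`, which the prompt I pasted above does not have. Here is a minimal sketch of how the prompt could be assembled that way; the helper name `build_mistrallite_prompt` and the `eos_token` default are my own assumptions for illustration, not part of any library, and I can't say whether this changes the output:

```python
def build_mistrallite_prompt(user_text: str, eos_token: str = "</s>") -> str:
    """Wrap user text as <|prompter|>...{eos}<|assistant|>.

    Hypothetical helper: the placement of the EOS token is an assumption
    based on the example prompt shown on the model card.
    """
    return f"<|prompter|>{user_text.strip()}{eos_token}<|assistant|>"


# Shortened stand-in for the story text from the post above.
story = '"Can a cube that does not last for any time at all, have a real existence?"'

prompt = build_mistrallite_prompt("Summarize the following story:\n\n" + story)
print(prompt)
```

The resulting string can then be passed to the transformers example code or to llama.cpp in place of the hand-written prompt, which at least makes it easy to compare both variants on the same text.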