In most scenarios, models with extended context are optimized for long sequences. If the sequence is not very long, it is often recommended to use a regular model instead.
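The decision rule above can be sketched as a small helper that routes a request to the regular checkpoint unless the input exceeds its context window. The model names and the 8192-token default are placeholders, not real checkpoints or documented limits; substitute your own models and their actual context sizes.

```python
# Placeholder identifiers for illustration only.
BASE_MODEL = "my-org/base-model"             # regular context window
LONG_CTX_MODEL = "my-org/long-context-model" # extended-context variant

def pick_model(num_tokens: int, base_ctx: int = 8192) -> str:
    """Prefer the regular model; fall back to the extended-context
    variant only when the input does not fit its window."""
    return BASE_MODEL if num_tokens <= base_ctx else LONG_CTX_MODEL

print(pick_model(1_000))   # short prompt -> regular model
print(pick_model(50_000))  # long document -> extended-context model
```

In practice `num_tokens` would come from the tokenizer of the model you are targeting, since token counts differ between tokenizers.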
DataLearnerAI
Alibaba open-sourced a 72B model called Qwen-72B: Qwen/Qwen-72B · Hugging Face
It supports Chinese and English, and its performance on MMLU is remarkable.