Does anyone know if they expose all the good stuff that Guidance uses for their guided generation and speedup? This plus guidance (kv cache, grammar control, etc) would be fast fast!
Does anyone know if they expose all the good stuff that Guidance uses for their guided generation and speedup? This plus guidance (kv cache, grammar control, etc) would be fast fast!