Among other things, GPTQ, GGUF's K-quants, and bitsandbytes FP4 are relatively "easy" quantization methods. Not to discount them... they're very sophisticated, but models can be quantized with them very quickly.
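To give a feel for the "easy" end, here's a minimal sketch of a bitsandbytes FP4 load via transformers: quantization happens on the fly at load time, no calibration pass. The model id is just a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Weights are quantized to FP4 as they're loaded -- no profiling data needed.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",             # bitsandbytes FP4 (the other option is "nf4")
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for the actual matmuls
)

# "meta-llama/Llama-2-7b-hf" is a placeholder; any causal LM works the same way.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
```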
EXL2 and AWQ are much more intense. You feed them calibration data, text you want to use as a reference so the quantization is optimized toward it. The quantization takes forever and requires a lot of GPU, but the quantized weights you get out of them are very VRAM-efficient.
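For contrast, here's a rough sketch of the calibration-driven flow using the AutoAWQ library. The quant_config values, paths, and calibration dataset are illustrative assumptions, not recommendations; the point is that the quantize step actually runs text through the model.

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-hf"  # placeholder model id
quant_path = "llama-2-7b-awq"            # placeholder output dir

# Illustrative settings: 4-bit weights with group size 128 is a common choice.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# This is the slow, GPU-hungry part: AWQ feeds the calibration text through
# the model, measures activations, and scales weights accordingly.
# "pileval" is AutoAWQ's default calibration set; you can swap in your own text.
model.quantize(tokenizer, quant_config=quant_config, calib_data="pileval")

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

The upshot: you pay the calibration cost once, at quantization time, and everyone who loads the resulting weights gets the VRAM savings for free.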