Resources

LoRA

  • LoRA Paper - “LoRA: Low-Rank Adaptation of Large Language Models” (Hu et al., 2021)
  • PEFT on GitHub - Hugging Face’s Parameter-Efficient Fine-Tuning library, supporting LoRA, QLoRA, and other adapters
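The update rule from the LoRA paper can be sketched in a few lines of dependency-free Python: the pretrained weight matrix W stays frozen, and only a low-rank pair B (d×r) and A (r×k) is trained, with the scaled product added back. This is a toy illustration of the math, not the PEFT library's actual implementation.

```python
# Minimal sketch of the LoRA update from Hu et al. (2021):
#   W' = W + (alpha / r) * B @ A
# Matrices are plain lists of lists to keep the example self-contained.

def matmul(X, Y):
    """Multiply two matrices given as lists of lists."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][t] * Y[t][j] for t in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_update(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A); W is frozen, only A and B train."""
    BA = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Frozen 2x2 weight, rank-1 adapter (r = 1), scaling alpha = 2.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d x r
A = [[0.5, 0.5]]     # r x k
print(lora_update(W, A, B, alpha=2.0, r=1))  # → [[2.0, 1.0], [2.0, 3.0]]
```

Because r is much smaller than d and k in practice, the trainable parameter count drops from d·k to r·(d + k), which is what makes LoRA parameter-efficient.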

QLoRA

  • QLoRA Paper - “QLoRA: Efficient Finetuning of Quantized LLMs” (Dettmers et al., 2023)
  • bitsandbytes on GitHub - Library providing 4-bit and 8-bit quantization support used by QLoRA
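To make the 4-bit idea concrete, here is a toy absmax block-quantization sketch. It only illustrates the quantize/dequantize round trip; the real bitsandbytes library uses the NF4 data type with fused CUDA kernels, not this integer scheme, and all names below are illustrative.

```python
# Toy absmax quantization: scale a block of floats so the largest magnitude
# maps to the top of the signed integer range, then round. 4 bits gives a
# signed range of [-7, 7] here (the zero point is implicit at 0).

def quantize_absmax(values, bits=4):
    """Map floats to signed ints in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / qmax
    return [round(v / scale) for v in values], scale

def dequantize(qvalues, scale):
    """Recover approximate floats from the quantized ints."""
    return [q * scale for q in qvalues]

weights = [0.12, -0.5, 0.33, 0.9]
q, s = quantize_absmax(weights)
print(q)                     # → [1, -4, 3, 7]
print(dequantize(q, s))      # approximate reconstruction of `weights`
```

Each weight is stored as a small integer plus one shared per-block scale, which is where the roughly 4x memory saving over fp16 comes from; QLoRA then trains LoRA adapters on top of the frozen quantized base model.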

Fine-Tuning Frameworks

  • Weights & Biases - Experiment-tracking platform for logging fine-tuning metrics, hyperparameters, and model artifacts
  • Hugging Face - Hub for models and datasets, plus the Transformers and TRL libraries commonly used for fine-tuning

Quantization and Local Deployment

  • llama.cpp on GitHub - C/C++ library for GGUF quantization (llama-quantize) and local inference
  • LM Studio - Desktop GUI for discovering, downloading, and chatting with quantized models from Hugging Face

Citations