GCP Compute Engine with NVIDIA A100 80 GB
The LLM Fine-Tuning Lab is the crown jewel of our lab infrastructure. Each student gets a dedicated NVIDIA A100 with 80 GB of VRAM — the highest-memory GPU available on GCP — giving you enough room to fine-tune 7B parameter models with full precision or run QLoRA on models up to 70B parameters. This is the same hardware that AI companies use to create their custom language models. The 1 TB SSD ensures fast model loading — downloading a 70B model from Hugging Face takes minutes, not hours. Pre-installed tooling includes Unsloth for 2x faster fine-tuning, Axolotl for config-driven training, and Flash Attention 2 for memory-efficient attention computation.
LLM fine-tuning labs are scheduled sessions (3-6 hours). Book from your course dashboard. Base models are pre-cached on the instance SSD.
Your A100 80GB instance launches with the selected base model already loaded on the 1 TB SSD. No download wait time.
Format your training dataset using the provided templates — instruction/response pairs, chat format, or preference pairs for DPO.
Set training hyperparameters: LoRA rank, learning rate, batch size, quantization settings. Use provided config templates or customize.
Launch training with Unsloth or Axolotl. Monitor loss curves, learning rate schedule, and GPU memory in real-time via W&B.
Run your fine-tuned model on evaluation prompts. Compare outputs against the base model. Run automated quality benchmarks.
Merge LoRA adapters, save the final model, and push to Cloud Storage or Hugging Face Hub for use in serving labs.
Other AI Labs environments students typically use alongside this one.
Pre-configured environment for building retrieval-augmented generation systems. Includes vector databases, embedding model APIs, document pr…
Explore lab →Single-GPU environment for training deep learning models, running computer vision pipelines, and experimenting with neural network architect…
Explore lab →Environment for deploying, serving, and benchmarking LLM inference. Students learn to optimize serving throughput, configure quantized model…
Explore lab →Enroll in a course that uses this lab, or visit our Houston center for a hands-on demo.