Blog

Thoughts on AI/ML, research findings, and technical insights

Featured Post

How I Customized Llama 3.1 8B on a Budget

Democratizing AI: Fine-tuning Large Language Models with Limited Resources

Graduate Student Research · Northeastern University · Budget-Friendly AI

I was impressed by how fine-tuned large language models can outperform retrieval-augmented systems, especially at inference time, where they skip the extra retrieval step. So I set out to fine-tune an open-source model like Meta's Llama 3.1 (8B parameters). But most sources said you needed giant, expensive GPUs and huge amounts of storage, resources I simply didn't have...

Read Full Article
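For a flavor of the approach the post describes, here is a minimal QLoRA-style sketch: a 4-bit quantized base model with small trainable LoRA adapters, using Hugging Face transformers, peft, and bitsandbytes. The model ID, LoRA rank, and target modules below are illustrative assumptions, not the post's exact configuration.

```python
# A minimal QLoRA-style sketch: 4-bit frozen base model + small LoRA adapters.
# Assumes transformers, peft, and bitsandbytes are installed and that you have
# access to the (gated) Llama 3.1 weights; hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.1-8B"

# Quantize the frozen base weights to 4-bit NF4 so the model fits on a
# single consumer GPU; compute still happens in bfloat16.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)

# Attach trainable low-rank adapters to the attention projections only;
# everything else stays frozen, so very few parameters actually train.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the 8B weights
```

The resulting model can be passed straight to a standard Trainer loop; since only the adapter weights receive gradients, optimizer state stays small enough for budget hardware.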

Recent Posts

Optimizing LLMs on AMD MI300X: What I Learned Porting Qwen to ROCm

A deep dive into porting transformer kernels from NVIDIA CUDA to AMD ROCm/HIP, reaching 12.1 ms per-token generation latency at 79-83% memory bandwidth utilization on the MI300X architecture.

GPU Optimization · CUDA to HIP · AMD MI300X
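As a rough way to probe numbers like the one quoted above, here is an illustrative latency sketch. PyTorch's ROCm builds reuse the familiar torch.cuda API (with torch.version.hip set instead of torch.version.cuda), so the same script runs on NVIDIA and AMD GPUs; the shapes and step counts are assumptions for illustration, not the post's methodology.

```python
# Illustrative per-step latency probe; runs unchanged on CUDA or ROCm builds
# of PyTorch, since ROCm builds expose the torch.cuda namespace.
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
print(f"device={device}, backend={backend}")

def sync():
    # Kernel launches are asynchronous; synchronize before reading the clock.
    if device == "cuda":
        torch.cuda.synchronize()

# Decode-style workload: one token's hidden state against an LM head.
# Shapes are roughly Llama/Qwen scale, chosen for illustration only.
dtype = torch.float16 if device == "cuda" else torch.float32
hidden, vocab = 4096, 151_936
x = torch.randn(1, hidden, device=device, dtype=dtype)
w = torch.randn(hidden, vocab, device=device, dtype=dtype)

for _ in range(10):  # warm-up so the allocator and kernels are primed
    _ = x @ w
sync()

steps = 100
start = time.perf_counter()
for _ in range(steps):
    _ = x @ w
sync()
print(f"mean step time: {(time.perf_counter() - start) / steps * 1e3:.2f} ms")
```

A single matmul is of course only one slice of a real decode step; a full comparison would time an end-to-end generation loop, but the harness pattern (warm-up, synchronize, average over many steps) is the same.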

How I Customized Llama 3.1 8B on a Budget

Learn how to fine-tune large language models without expensive hardware, using LoRA and other memory-saving optimizations.

LLM Fine-tuning · LoRA · Budget AI
Dec 15, 2024 · Read More →

More Content on Substack

Discover additional articles, research insights, and technical deep dives on my Substack

Visit My Substack

Stay Updated

Get notified about new blog posts, research findings, and AI/ML insights