🌌 Fine-Tuning LLMs Locally in 2025: A Comprehensive Guide
Large language models (LLMs) are transforming the way we interact with technology. These powerful models can generate text in many formats, translate languages, produce creative content, and answer questions informatively, revolutionizing many fields [1]. Fine-tuning lets you adapt an LLM to a specific task or domain, yielding greater accuracy and efficiency.
🌟 Introduction
🚀 LLMs are deep learning models that understand and generate human-like text. Trained on massive amounts of text data, they can perform a wide range of language tasks out of the box. To truly harness their power for a specific application, however, fine-tuning is essential: adapting a pre-trained LLM to a new dataset and task so it learns specialized knowledge and performs better in the target domain.
🌟 Why Fine-Tune LLMs?
Fine-tuning offers several advantages over using a generic pre-trained LLM:
- Enhanced Accuracy: Fine-tuning lets the model learn the nuances and specific patterns in task-related data, leading to more accurate and relevant outputs. For example, a model fine-tuned on medical texts will better understand medical terminology and provide more accurate information about diseases and treatments [2].
- Domain Adaptation: Different domains have unique vocabularies, styles, and language conventions. Fine-tuning helps the model adapt to these differences, making it more effective in specialized fields like finance, law, or scientific research [2].
- Resource Efficiency: Training an LLM from scratch is computationally expensive and time-consuming. Fine-tuning a pre-trained model leverages the knowledge already acquired during pre-training, allowing developers to deploy solutions faster and with fewer resources [2].
- Customization: Fine-tuning enables developers to tailor the model to their specific needs, whether adjusting the tone and style of generated text, improving instruction following, or focusing on particular types of queries [2].
Fine-tuning builds on transfer learning, where knowledge gained from one task is applied to a different but related task. By starting with a pre-trained LLM, you transfer the knowledge acquired from a massive dataset to your specific task, leading to faster learning and improved performance [2]. In essence, fine-tuning lets you customize LLMs for specific tasks and domains, with significant gains in performance and efficiency.
🌟 Advantages and Disadvantages of Local Fine-tuning
Fine-tuning LLMs locally offers several benefits:
- Data Privacy: Running LLMs locally keeps sensitive data in your own environment, which is crucial for applications handling confidential information such as healthcare or financial data [3].
- Cost Savings: Using local hardware eliminates cloud fees, which can add up to significant savings for long-term projects or high-volume applications [3].
- Control and Customization: Local fine-tuning gives you full control over the model and its training process. You can experiment with different parameters, datasets, and fine-tuning techniques without relying on external services.
However, there are also potential limitations:
- Computational Resources: Fine-tuning large LLMs requires significant computational power, including high-performance GPUs and ample RAM. This can be a barrier for individuals or organizations with limited resources [3].
- Technical Expertise: Setting up and managing a local fine-tuning environment requires expertise in deep learning, Python programming, and hardware configuration.
- Model Accuracy: While fine-tuning improves task-specific accuracy, a locally tuned model may not match the most advanced models available through cloud-based APIs [3].
🌟 Fine-tuning Methods
There are different approaches to fine-tuning LLMs, each with its own advantages and trade-offs:
- Prompt Tuning: Optimizes a small set of learned parameters that are prepended to the input prompt. It is a lightweight approach that requires minimal computational resources and can be effective for certain tasks.
- Full Fine-tuning: Updates all parameters of the pre-trained model. It typically requires a large dataset and significant compute, but can deliver the best performance on a specific task. It also carries a higher risk of overfitting, especially with smaller datasets [2].
- Parameter-Efficient Fine-tuning (PEFT): Updates only a subset of parameters or adds lightweight modules to the model. Techniques like adapters, prefix tuning, and LoRA (Low-Rank Adaptation) fall into this category. PEFT requires less data and compute, making it more accessible and scalable, and it helps preserve the generalization capabilities of the original model [4]. A minimal LoRA sketch follows this list.
Choosing the right fine-tuning method depends on the specific task, the size of the model and dataset, and the available computational resources.
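To make PEFT concrete, here is a minimal sketch of attaching LoRA adapters with the Hugging Face `peft` library. The model name and hyperparameter values are illustrative assumptions, not recommendations:

```python
# Minimal LoRA sketch with Hugging Face peft; model choice is illustrative
# (meta-llama checkpoints require accepting the license on the Hub).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

lora_config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor for the adapter outputs
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Because only the adapter weights are trainable, the optimizer state and gradients stay small, which is what makes LoRA practical on consumer hardware.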
🌟 Tools and Libraries
Fine-tuning LLMs locally requires a set of specialized tools and libraries:
- Hugging Face Ecosystem: Hugging Face provides a central hub for accessing pre-trained LLMs, datasets, and tools for fine-tuning and deploying models [4].
  - Hugging Face Hub: A vast repository of pre-trained LLMs, including Llama, Granite, Phi, and Gemma models.
  - Transformers: A user-friendly Python library that simplifies working with LLMs, including loading models, tokenizing text, and fine-tuning [6].
  - Datasets: An efficient library for loading and preprocessing datasets for fine-tuning [7].
- TRL (Transformer Reinforcement Learning): Streamlines supervised fine-tuning, reinforcement learning from human feedback (RLHF), and alignment of open LLMs [4].
- PEFT (Parameter-Efficient Fine-Tuning) Libraries: Implement techniques like LoRA and QLoRA, enabling efficient fine-tuning with reduced computational resources.
  - BitsAndBytes: Enables loading and fine-tuning large models with a reduced memory footprint using quantization techniques [8].
  - Unsloth: Provides custom kernels for faster training and reduced memory usage, especially beneficial for local fine-tuning [8].
- Accelerate: Facilitates distributed training, allowing you to utilize multiple GPUs or TPUs to speed up fine-tuning [4].
PEFT techniques like LoRA and QLoRA are crucial for efficient local fine-tuning, especially with large models and limited resources: they achieve significant performance improvements while fine-tuning only a small fraction of the model's parameters.
🌟 Data Format
Fine-tuning requires a well-structured dataset that aligns with the task you want the model to learn. Choosing the correct data format is crucial for successful fine-tuning and depends on the specific task and model. Common data formats include:
- Instruction Format: Pairs of prompts and their corresponding ideal completions, often used for tasks like instruction following, question answering, and code generation [4]:

```json
{"prompt": "<prompt text>", "completion": "<ideal completion>"}
```

- Conversational Format: A conversation between a user and an assistant, with each turn labeled with its role (e.g., "user" or "assistant"). It is suited to fine-tuning dialogue models and chatbots [4]:

```json
{
  "messages": [
    {"role": "system", "content": "You are..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ]
}
```
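Models each expect their own role markers and separators, so conversational data is usually rendered into a training string with the tokenizer's built-in chat template. A minimal sketch, with an illustrative model name:

```python
# Render a conversational-format example into a single training string using
# the tokenizer's chat template (inserts model-specific role markers).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is fine-tuning?"},
    {"role": "assistant", "content": "Fine-tuning adapts a pre-trained model to a new task."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```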
🌟 Data Preprocessing
Before fine-tuning, it’s crucial to preprocess your dataset to ensure its quality and compatibility with the LLM. This involves several steps:
- Identifying and Handling Missing Values: Missing data can lead to inaccurate results. Options include deleting rows or columns with missing data, imputing missing values using statistical methods, or using algorithms that tolerate missing data [1].
- Removing Duplicates: Duplicate entries can distort analysis and bias the model. Removing them ensures each data point is unique and contributes meaningfully to learning [1].
- Standardizing Data Formats: Inconsistent formats cause errors and confusion. Standardization involves converting text to a common case, ensuring uniform date formats, and cleaning up inconsistencies in data representation [1].
- Outlier Detection and Treatment: Outliers are data points that deviate significantly from the rest of the data and can skew statistics and training. Identify them with statistical methods, then decide whether to remove, transform, or keep them based on their impact [1].
- Normalization and Scaling: Features on different scales can hurt model performance. Techniques like min-max scaling or Z-score normalization bring values into a common range so no single feature dominates learning [1].
- Encoding Categorical Variables: Machine learning algorithms often require numerical input. Methods like one-hot encoding or label encoding convert categorical variables (e.g., "red", "green", "blue") into numerical representations the model can understand [1].
Data augmentation can further improve fine-tuning: creating new training examples by slightly modifying existing ones increases the size of the training set, helps prevent overfitting, and improves generalization to unseen data [1]. A short pandas sketch of a few of these steps follows.
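A minimal pandas sketch of a few of the preprocessing steps above, applied to a hypothetical instruction dataset; the file and column names are illustrative:

```python
# Clean a toy prompt/completion dataset: drop incomplete rows, remove exact
# duplicates, and standardize whitespace before fine-tuning.
import pandas as pd

df = pd.read_json("train.jsonl", lines=True)  # hypothetical local dataset

# Handle missing values: here we simply drop incomplete examples.
df = df.dropna(subset=["prompt", "completion"])

# Remove exact duplicates so no example is over-represented.
df = df.drop_duplicates(subset=["prompt", "completion"])

# Standardize formats: strip stray leading/trailing whitespace.
df["prompt"] = df["prompt"].str.strip()
df["completion"] = df["completion"].str.strip()

df.to_json("train_clean.jsonl", orient="records", lines=True)
```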
🌟 Fine-Tuning Tutorial
Here’s a general workflow for fine-tuning LLMs locally:
1. Define a Use Case: Clearly define the task you want the model to perform. This guides your choice of model, dataset, and fine-tuning strategy. For example, to fine-tune a model for generating creative stories, you might choose a Llama model and a dataset of fictional stories [4].
2. Set Up the Environment: Install the necessary libraries, including Hugging Face Transformers, Datasets, and TRL, plus any PEFT libraries you plan to use. Ensure you have sufficient computational resources, including a capable GPU and ample RAM [4].
3. Prepare the Dataset: Choose a dataset from Hugging Face Hub or create your own. Preprocess the data as outlined in the "Data Preprocessing" section and format it (instruction or conversational format) for your chosen task and model [4].
4. Load the Model and Tokenizer: Use the transformers library to load the pre-trained LLM and its tokenizer from Hugging Face Hub. Specify the model name and any quantization configuration, such as 4-bit quantization to reduce memory usage [9].
5. Apply PEFT: If using LoRA or QLoRA, configure and apply it with the appropriate PEFT library, setting parameters such as the rank of the LoRA layers and the target modules where the adapters are inserted [4].
6. Fine-Tune the Model: Use the TRL library and the SFTTrainer to fine-tune the model on your prepared dataset. Configure training arguments such as learning rate, batch size, and number of epochs, and monitor metrics like loss during training [4].
7. Evaluate the Model: Assess the fine-tuned model on a separate evaluation dataset, using metrics appropriate to your task to measure accuracy, fluency, and relevance [4].
8. Save and Deploy: Save the fine-tuned model in a suitable format, such as PyTorch weights or a quantized format like GGUF, and deploy it locally or via a platform like Hugging Face Spaces [4].
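Steps 4-6 can be condensed into a short script. This is a minimal sketch assuming recent versions of transformers, datasets, peft, trl, and bitsandbytes; the model name, dataset file, and hyperparameters are illustrative:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative choice

# Step 4: load the model in 4-bit to fit consumer GPUs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Step 3 output: the cleaned prompt/completion dataset from preprocessing.
dataset = load_dataset("json", data_files="train_clean.jsonl", split="train")

# Step 5: LoRA configuration; rank and target modules are typical starting points.
peft_config = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)

# Step 6: supervised fine-tuning with TRL's SFTTrainer.
training_args = SFTConfig(
    output_dir="llama-3.1-finetuned",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=3,
    logging_steps=10,
)
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
trainer.save_model("llama-3.1-finetuned")  # saves the LoRA adapter weights
```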
🌟 Example Programs
This section provides specific examples and tutorials for fine-tuning different LLMs locally using Python code:
⚡ Llama 3.1
- Cloning and Running Llama 3.1: [10] provides a step-by-step guide to cloning the Llama 3.1 model from Hugging Face and running it locally, including setting up the environment, installing the necessary libraries, and running a basic inference example.
- Data Extraction from PDFs: [11] demonstrates how to use Llama 3.1 to extract data from PDF documents and generate question-answer pairs, showcasing the model's ability to turn unstructured data into a structured fine-tuning dataset.
- Step-by-Step Fine-tuning Guide: [9] offers a comprehensive guide to fine-tuning Llama 3.1 locally: setting up the environment, preparing the dataset, fine-tuning with Hugging Face Transformers and PEFT techniques, and quantizing the model for efficient deployment.
Key Takeaways: Llama 3.1 is a powerful and versatile LLM that can be fine-tuned for various tasks, including data extraction, question answering, and code generation. Using tools like Hugging Face Transformers and PEFT techniques like LoRA can significantly improve the efficiency of local fine-tuning.
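For reference, a basic local inference example along the lines of the walkthrough in [10] can be sketched with the transformers pipeline API (the generation prompt is illustrative):

```python
# Minimal local inference with Llama 3.1 via the transformers pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # gated: accept the license on the Hub first
    device_map="auto",
)

result = generator("Explain fine-tuning in one sentence.", max_new_tokens=64)
print(result[0]["generated_text"])
```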
⚡ Llama 3.2
- Fine-tuning on Windows: [12] provides a guide to fine-tuning Llama 3.2 in a Windows environment, covering installation of a C compiler and Python development dependencies, dataset creation, and setting up the training script with the Unsloth library.
- Fine-tuning for Python Code: [13] demonstrates how to fine-tune Llama 3.2 3B Instruct for Python code using a curated dataset and the Unsloth library, highlighting the model's suitability for code completion, generation, and debugging tasks.
- Dataset Preparation and Training: [14] walks through preparing a dataset and training Llama 3.2 with LoRA and a standardized chat template, emphasizing the importance of data formatting and efficient fine-tuning techniques.
Key Takeaways: Llama 3.2 builds upon the capabilities of Llama 3.1 and offers improved performance in various tasks, including code-related applications. Fine-tuning with Unsloth and LoRA can further enhance its efficiency and accuracy.
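The Unsloth loading pattern used in [13] and [14] can be sketched as follows; this assumes Unsloth's FastLanguageModel API, and the model name is one of Unsloth's published builds:

```python
# Load Llama 3.2 with Unsloth in 4-bit and attach LoRA adapters.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=32,
)
# The resulting model plugs into TRL's SFTTrainer as in the tutorial above.
```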
⚡ Granite
- Fine-tuning with Transformers: [15] provides a comprehensive guide to fine-tuning the IBM Granite model using Hugging Face Transformers: loading the model, preparing the dataset, applying PEFT techniques, and evaluating the fine-tuned model's performance.
- Prompt Tuning with watsonx: [16] walks through prompt tuning a Granite model using IBM watsonx, including environment setup, dataset import, and evaluation with metrics like accuracy and F1-score.
Key Takeaways: Granite is a powerful LLM that can be fine-tuned for various tasks using different approaches like prompt tuning and full fine-tuning. Tools like Hugging Face Transformers and IBM watsonx can streamline the fine-tuning process.
⚡ Phi-4
- Fine-tuning on Google Colab: [17] provides a guide to fine-tuning Microsoft's Phi-4 model on Google Colab, leveraging Colab's free GPU resources to fine-tune the model for specific tasks.
- Fine-tuning with LoRA: [18] explains how to fine-tune Phi-4 locally using LoRA adapters, covering model setup, dataset preparation, and monitoring GPU usage during fine-tuning.
Key Takeaways: Phi-4 is a powerful open-source LLM that can be efficiently fine-tuned using PEFT techniques like LoRA. Utilizing platforms like Google Colab can provide access to the necessary computational resources for fine-tuning large models.
⚡ Gemma 2
- Fine-tuning for Custom Data Extraction: [19] demonstrates how to fine-tune Gemma 2 for custom data extraction using local GPUs, the Unsloth library, and synthetic data, highlighting the model's ability to learn specific patterns and extract relevant information from text.
- Fine-tuning for Medical Data: [20] shows how to fine-tune Gemma 2 on a medical dataset, using Kaggle for GPU access and Weights & Biases for performance tracking, showcasing the model's potential in healthcare applications such as medical question answering and information retrieval.
- Fine-tuning with JAX and Flax: [21] provides a tutorial on fine-tuning Gemma using JAX and Flax: loading the model, preparing the dataset, and configuring the training loop with JAX's functional programming paradigm and Flax's neural network library.
Key Takeaways: Gemma 2 is a versatile LLM that can be fine-tuned for various tasks, including data extraction, medical applications, and code generation. Different tools and libraries like Unsloth, Kaggle, and JAX can optimize the fine-tuning process.
🌟 Retrieval Augmented Generation (RAG)
While fine-tuning can enhance an LLM's knowledge and performance, it has limitations when dealing with long documents or dynamic content. Retrieval Augmented Generation (RAG) complements fine-tuning by letting the LLM access and use external knowledge sources [3]. RAG retrieves relevant information from a knowledge base or external documents and incorporates it into the LLM's response, so the model can give more comprehensive and up-to-date answers even when the information is not in its training data. RAG is particularly beneficial when:
- You want to dynamically change the content the LLM can access, such as regularly updating a knowledge base or personalizing information access for specific users [3].
- You need to restrict access to certain parts of the data, since fine-tuning can potentially expose the entire training data to anyone using the model [3].
- You want the LLM's answers to reference particular documents, allowing verification and traceability of information [3].
A toy sketch of the retrieve-then-generate pattern follows.
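This sketch embeds a tiny document store, retrieves the passage closest to the query, and prepends it to the prompt. The sentence-transformers model and the documents are illustrative assumptions:

```python
# Toy retrieve-then-generate: embed documents, pick the best match for the
# query by cosine similarity, and build an augmented prompt for the LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "LoRA fine-tunes a small number of low-rank adapter weights.",
    "QLoRA combines LoRA with 4-bit quantization of the base model.",
]
doc_embeddings = embedder.encode(documents, normalize_embeddings=True)

query = "How does QLoRA reduce memory usage?"
query_embedding = embedder.encode([query], normalize_embeddings=True)

# Dot product of normalized vectors = cosine similarity.
best = int(np.argmax(doc_embeddings @ query_embedding.T))
prompt = f"Context: {documents[best]}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # feed this prompt to the LLM instead of the bare question
```

Production RAG systems replace the in-memory list with a vector database and retrieve several passages, but the pattern is the same.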
🌟 Open-Source vs. Proprietary LLMs
When choosing an LLM for fine-tuning, you have the option of using open-source or proprietary models. Each has its own advantages and disadvantages:
⚡ Open-Source LLMs:
- Pros:
  - Accessibility: Free to use and modify, enabling community contributions and rapid innovation [2].
  - Transparency: Users can inspect the model architecture and training data, fostering trust and understanding [2].
  - Customization: Organizations can fine-tune models to meet specific needs without licensing restrictions [2].
- Cons:
  - Support: Limited official support compared to proprietary models, with troubleshooting relying on community forums [2].
  - Quality Control: Model quality and performance vary, as contributions may not always meet high standards [2].
⚡ Proprietary LLMs:
- Pros:
  - Support and Maintenance: Often come with dedicated support teams and regular updates, ensuring reliability [2].
  - Performance: Typically optimized for specific tasks, providing high-quality outputs [2].
  - Security: Vendors may offer better data protection and compliance with regulations [2].
- Cons:
  - Cost: Licensing fees can be significant, especially for large-scale deployments [2].
  - Limited Customization: Users may have restricted access to modify the model or its training data [2].
The choice between open-source and proprietary LLMs depends on budget, the desired level of customization, and the need for support and maintenance.
🌟 Evaluating LLMs
Evaluating the performance of fine-tuned LLMs is crucial to ensure they meet the desired requirements. This involves using appropriate evaluation metrics and tools to measure the model's accuracy, fluency, and relevance for the specific task [22]. Common evaluation metrics for LLMs include:
- Accuracy: Measures how often the model generates the correct answer or output.
- Fluency: Assesses the grammatical correctness and naturalness of the generated text.
- Relevance: Evaluates how well the model's response addresses the given prompt or question.
Several tools are available for evaluating LLMs:
- Open LLM Leaderboard: A benchmark for comparing the performance of different LLMs on various tasks [22].
- Language Model Evaluation Harness: A framework for evaluating LLMs on a wide range of tasks, including question answering, text generation, and code generation [22].
- Chatbot Arena: A platform for evaluating and comparing the conversational abilities of different LLMs [22].
A toy accuracy check for a fine-tuned model is sketched below.
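This sketch computes a simple exact-match-style accuracy over a held-out set. It reuses the hypothetical "llama-3.1-finetuned" directory from the tutorial sketch; the evaluation examples and generation settings are illustrative, and real evaluations usually use a harness such as the Language Model Evaluation Harness:

```python
# Toy accuracy check: generate an answer per prompt and count how often the
# reference answer appears in the model's output.
from transformers import pipeline

generator = pipeline("text-generation", model="llama-3.1-finetuned", device_map="auto")

eval_set = [
    {"prompt": "Capital of France?", "answer": "Paris"},
    {"prompt": "2 + 2 =", "answer": "4"},
]

correct = 0
for example in eval_set:
    output = generator(example["prompt"], max_new_tokens=16, return_full_text=False)
    prediction = output[0]["generated_text"].strip()
    correct += int(example["answer"].lower() in prediction.lower())

print(f"accuracy: {correct / len(eval_set):.2%}")
```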
🌟 Quantization
Quantization is a technique used to reduce the memory footprint and computational requirements of LLMs. It converts the model's parameters from higher precision (e.g., 32-bit floating point) to lower precision (e.g., 4-bit integers), which can significantly speed up inference and reduce the resources needed for local fine-tuning [22]. Different quantization levels trade accuracy against efficiency:
- FP32 (Full Precision): The standard format for storing model parameters, offering high accuracy but requiring the most memory.
- FP16 (Half Precision): Halves memory usage relative to FP32, with minimal impact on accuracy.
- INT8 (8-bit Integer): Further reduces memory usage and improves speed, but may cost some accuracy.
- 4-bit Quantization: Offers the greatest memory savings and speed improvements, but requires careful tuning to maintain acceptable accuracy.
Libraries like BitsAndBytes provide tools for quantizing LLMs and fine-tuning them at reduced precision. The back-of-envelope arithmetic below shows why these levels matter.
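A quick estimate of the memory needed just to store the weights at each precision (activations, gradients, and optimizer state add more during training):

```python
# Weight-only memory estimate: parameters * bits per parameter / 8 bytes / 1e9 GB.
def weight_memory_gb(n_params: float, bits: int) -> float:
    return n_params * bits / 8 / 1e9

for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("4-bit", 4)]:
    print(f"8B params @ {name}: {weight_memory_gb(8e9, bits):.1f} GB")
# 8B params @ FP32: 32.0 GB, FP16: 16.0 GB, INT8: 8.0 GB, 4-bit: 4.0 GB
```

This is why an 8B-parameter model that cannot fit on a 24 GB consumer GPU in FP32 becomes comfortable at 4-bit.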
🌟 Computational Resources
Fine-tuning LLMs locally requires significant computational resources; the exact requirements depend on the model size, dataset size, and fine-tuning technique [1]. Key considerations when choosing hardware for local fine-tuning:
- Processing Power: High-performance CPUs or GPUs are essential for the computational demands of LLM fine-tuning; GPUs are generally recommended for faster training [1].
- Memory (RAM): Sufficient RAM is crucial to hold the model parameters, training data, and intermediate activations; larger models and datasets require more [1].
- Storage Solutions: Choose storage that can handle the size of the model and dataset; SSDs offer much faster access than traditional HDDs [1].
- Network Infrastructure: A robust network is important for downloading models and datasets, especially large files [1].
- Backup and Recovery: Implement reliable backup solutions to protect your models and data from loss [1].
- Scalability Options: Consider hardware that allows easy upgrades as your computational needs evolve [1].
A quick GPU sanity check is sketched below.
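Before committing to a run, it is worth checking what the local machine actually offers. A small PyTorch sketch:

```python
# List visible CUDA devices and their total VRAM before starting a run.
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.1f} GB VRAM")
else:
    print("No CUDA GPU detected; fine-tuning will be impractically slow on CPU.")
```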
🌟 Model Comparison
| Model | Key Features | Advantages | Disadvantages |
|---|---|---|---|
| Llama 3.1 | Powerful and versatile, supports various tasks | Relatively accessible, can be fine-tuned with PEFT | May require significant computational resources |
| Llama 3.2 | Improved performance, optimized for code-related tasks | Efficient with Unsloth and LoRA | Still computationally demanding for large-scale fine-tuning |
| Granite | Supports prompt tuning and full fine-tuning | Can be fine-tuned with watsonx | May require specialized hardware or cloud resources |
| Phi-4 | Open-source, efficient with LoRA | Can be fine-tuned on Google Colab | May have limitations in certain tasks compared to larger models |
| Gemma 2 | Versatile, suitable for data extraction and medical applications | Can be fine-tuned with JAX and Flax | May require careful tuning for optimal performance |
🌟 Conclusion
Fine-tuning LLMs locally empowers you to create custom AI solutions tailored to your specific needs. By following the guidelines and examples in this article, you can harness the power of LLMs for a wide range of applications: choose the right tools, prepare your data carefully, and match your computational resources to the job.

Local fine-tuning offers significant advantages in data privacy, cost savings, and control over the model, but it also presents challenges in computational resources and technical expertise. By carefully weighing these factors and choosing appropriate fine-tuning methods and tools, you can effectively adapt LLMs to your specific tasks and domains.

As LLMs continue to evolve, local fine-tuning will become increasingly important for developing specialized AI applications. Advances in PEFT techniques, quantization methods, and hardware capabilities will further improve the accessibility and efficiency of local fine-tuning, enabling wider adoption and innovation in the field.
🔧 Works cited
1. Ultimate Guide to LLM Fine-tuning 2025 - Rapid Innovation, accessed on February 26, 2025, https://www.rapidinnovation.io/post/fine-tuning-large-language-models-llms
2. Fine-Tuning LLMs: Expert Guide to Task-Specific AI Models - Rapid Innovation, accessed on February 26, 2025, https://www.rapidinnovation.io/post/for-developers-step-by-step-guide-to-fine-tuning-llms-for-specific-tasks
3. LLMs for Business: Fine-Tuning and Risks - Serokell, accessed on February 26, 2025, https://serokell.io/blog/llms-fine-tuning-avoiding-risks
4. How to fine-tune open LLMs in 2025 with Hugging Face - Philschmid, accessed on February 26, 2025, https://www.philschmid.de/fine-tune-llms-in-2025
5. Fine-Tuning Large Language Models (LLMs) - Towards Data Science, accessed on February 26, 2025, https://towardsdatascience.com/fine-tuning-large-language-models-llms-23473d763b91/
6. Top 10 LLM Tools to Run Models Locally in 2025 - God of Prompt, accessed on February 26, 2025, https://www.godofprompt.ai/blog/top-10-llm-tools-to-run-models-locally-in-2025
7. Fine-Tuning LLMs: A Guide With Examples - DataCamp, accessed on February 26, 2025, https://www.datacamp.com/tutorial/fine-tuning-large-language-models
8. Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth | Towards Data Science, accessed on February 26, 2025, https://towardsdatascience.com/fine-tune-llama-3-1-ultra-efficiently-with-unsloth-7196c7165bab/
9. How to Fine-Tune LLaMA 3.1 Locally: A Step-by-Step Guide | by Adarsh Ajay | Medium, accessed on February 26, 2025, https://medium.com/@adarsh.ajay/how-to-fine-tune-llama-3-1-locally-a-step-by-step-guide-341de509d64f
10. How to Run Llama-3.1 Locally Using Python and Hugging Face - DEV Community, accessed on February 26, 2025, https://dev.to/debapriyadas/cloning-and-running-llama-31-model-from-hugging-face-using-python-3m80
11. Building Fine Tuning Dataset Using Llama 3.1 - YouTube, accessed on February 26, 2025, https://www.youtube.com/watch?v=q_VjdFXAjEQ
12. Fine-Tuning Llama 3.2 for Targeted Performance: A Step-by-Step Guide - Spheron’s Blog, accessed on February 26, 2025, https://blog.spheron.network/fine-tuning-llama-32-for-targeted-performance-a-step-by-step-guide
13. Fine-Tuning Llama 3.2 3B Instruct for Python Code: A Comprehensive Guide with Unsloth, accessed on February 26, 2025, https://www.marktechpost.com/2025/02/04/fine-tuning-llama-3-2-3b-instruct-for-python-code-a-comprehensive-guide-with-unsloth/
14. EASIEST Way to Fine-Tune LLAMA-3.2 and Run it in Ollama - YouTube, accessed on February 26, 2025, https://www.youtube.com/watch?v=YZW3pkIR-YE
15. Supervised Fine-tuning of IBM granite model using transformers and Optuna - Medium, accessed on February 26, 2025, https://medium.com/@shilpahegde03/supervised-fine-tuning-of-ibm-granite-model-using-transformers-5752cd01fae5
16. Prompt tune a Granite model in Python using watsonx - IBM, accessed on February 26, 2025, https://www.ibm.com/think/tutorials/prompt-tune-a-granite-model-using-watsonx
17. Fine-tune Microsoft’s new open-source LLM, Phi-4 for free via Colab! - Reddit, accessed on February 26, 2025, https://www.reddit.com/r/GoogleColab/comments/1i2422b/finetune_microsofts_new_opensource_llm_phi4_for/
18. How to Fine-Tune Phi-4 Locally? - Analytics Vidhya, accessed on February 26, 2025, https://www.analyticsvidhya.com/blog/2025/01/fine-tune-phi-4-locally/
19. Fine-tuning Gemma 2 2B for custom data extraction, using Local GPU, Unsloth and your own synthetic… | by Vasudevan Vijayaragavan, Associate Vice President - Medium, accessed on February 26, 2025, https://medium.com/@vasudevan.vijay/fine-tuning-gemma-2-2b-for-custom-data-extraction-using-local-gpu-unsloth-and-your-own-synthetic-6ac4fb8064e8
20. Fine-Tuning Gemma 2 and Using it Locally - DataCamp, accessed on February 26, 2025, https://www.datacamp.com/tutorial/fine-tuning-gemma-2
21. Fine-tuning Gemma using JAX and Flax | Google AI for Developers - Gemini API, accessed on February 26, 2025, https://ai.google.dev/gemma/docs/jax_finetune
22. Top AI/LLM learning resource in 2025, accessed on February 26, 2025, https://originshq.com/blog/top-ai-llm-learning-resource-in-2025/