Technical Documentation

A Comprehensive Guide To Using Unsloth AI In Google Colaboratory


👤 Author: Cosmic Lounge AI Team
📅 Updated: 6/1/2025
⏱️ Read Time: 7 min
Topics: #llm #ai #model #fine-tuning #training #gpu #cuda #pytorch #introduction #design


🌌 A Comprehensive Guide to Using Unsloth AI in Google Colaboratory



🌟 1. Introduction to Unsloth AI and Google Colab:

🚀 Unsloth AI presents a significant advancement in the efficient fine-tuning of large language models (LLMs).

It functions as a framework engineered to accelerate fine-tuning by up to 2x and reduce memory consumption by as much as 70% for prominent LLMs such as Llama 3, Mistral, Phi-4, and Gemma, without any discernible loss in accuracy [1]. This efficiency holds substantial value for researchers and practitioners who operate with limited computational resources or require rapid iteration during model development. Unsloth's primary advantage lies in optimizing the fine-tuning workflow, making this complex process accessible and practically viable for a broader spectrum of users. The repeated emphasis on speed and memory reduction strongly suggests that these are its key differentiators in the competitive landscape of LLM fine-tuning tools.

Google Colaboratory emerges as the preferred environment for leveraging Unsloth's capabilities. The platform provides complimentary access to GPU resources, making it exceptionally well suited for training models that demand substantial computational power [1]. Unsloth is specifically designed to integrate seamlessly with the Colab ecosystem, offering pre-configured notebooks optimized to make effective use of these freely available resources. This natural alignment broadens access to LLM fine-tuning by eliminating the need for expensive local hardware.

The official Unsloth AI documentation, accessible at https://docs.unsloth.ai/, serves as the authoritative source of information on installing, using, and understanding the framework [1]. It covers fundamentals such as installation procedures, dataset creation for fine-tuning, and deployment of the resulting custom models. The presence of comprehensive official documentation signifies a level of maturity and robust support, offering users a reliable foundation for their work.



🌟 2. Getting Started with Unsloth Colab Notebooks:

The official Unsloth notebooks, which provide pre-configured environments for fine-tuning various language models, are available through the GitHub repository at https://github.com/unslothai/notebooks [6]. Users can open these notebooks directly in Google Colab, either by supplying the GitHub URL in Colab's open dialog or by navigating to the repository in a browser and following the links to open a notebook in Colab. Hosting the notebooks on GitHub facilitates version control, enables community contributions, and ensures straightforward access.

A typical Unsloth Colab notebook is a sequence of markdown cells and code cells. Markdown cells provide explanations, instructions, and context for the fine-tuning process, while code cells contain the executable Python code: installing the Unsloth library and its dependencies, loading and preparing the training data, configuring the language model, running fine-tuning, and finally performing inference with the fine-tuned model. This standard structure allows anyone already familiar with Colab to quickly grasp Unsloth's implementation.

The code cells are organized to guide the user through the fine-tuning workflow. The initial cells install Unsloth and other required dependencies [2]. Subsequent cells handle data preparation, which might involve loading a dataset from a file or cloud repository and formatting it appropriately for the chosen model. Further cells load and configure the pre-trained model, define training parameters such as the learning rate, batch size, and number of training steps, and then initiate fine-tuning. Notebooks typically conclude with cells that evaluate the fine-tuned model and demonstrate inference on new data. This sequential organization provides a clear pathway that simplifies the often complex process of LLM fine-tuning; the sketch below illustrates the canonical cell sequence.

To interact with the notebooks, users need a few basic Colab skills. The "Play" button to the left of each code cell executes that cell [2]; the keyboard shortcut CTRL + ENTER does the same. Cells must be executed in the order they appear, because later cells rely on variables and results defined in earlier ones, and skipping cells often leads to errors. The "Runtime" menu in the top toolbar offers additional options, including "Run all," which executes every code cell sequentially [2]. While this is a convenient way to run the entire workflow, particularly for initial trials or default configurations, it bypasses any customization steps the user might want to make. Finally, the "Connect" / "Reconnect T4" button manages the GPU instance provided by Google Colab. The T4 GPU is a free resource, and connecting to it is essential for accelerating the computationally intensive work of fine-tuning large language models.
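
As a concrete illustration, here is a minimal sketch of that canonical cell sequence, condensed from the pattern the official notebooks follow. The model name and hyperparameter values are examples rather than prescriptions, and individual notebooks vary:

```python
# Cell 1 - install Unsloth and its Colab dependencies (run once per session):
# !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

# Cell 2 - load a 4-bit quantized base model.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # any supported model
    max_seq_length=2048,   # context window used during fine-tuning
    dtype=None,            # None = auto-detect float16 / bfloat16
    load_in_4bit=True,     # 4-bit quantization to fit the free T4 GPU
)

# Cell 3 - attach LoRA adapters for parameter-efficient fine-tuning.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # LoRA rank
    lora_alpha=16,         # LoRA scaling factor
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # Unsloth's memory-saving variant
)

# Cells 4+ - prepare the dataset, run the trainer, and test inference
# (sketched in Sections 6 and 7 below).
```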



🌟 3. Supported Language Models in Unsloth Colab:

Unsloth Colab notebooks offer support for a diverse array of popular large language models [1]. This includes models from the Llama family, such as Llama 3 (in 8-billion- and 70-billion-parameter versions, as well as the 3.1-8B and 3.2-1B/3B variants), and vision-enabled models like Llama 3.2 Vision (11B).

Models from the Mistral family are also well-represented, including Mistral v0.3 Instruct (7B), Mistral NeMo (12B), Mistral Small (22B), and Mistral v0.3 (7B). The Gemma family, developed by Google, is supported with models like Gemma 3 (4B and 1B) and Gemma 2 (9B and 2B).

For users interested in the Phi architecture, Unsloth supports Phi-4 (14B), Phi-3.5 (mini), and Phi-3 (medium).

Additionally, models from the Qwen family are included, such as Qwen 2.5 (in various sizes like 3B, 7B, 14B, 32B, and 1.5B), Qwen 2 (7B), Qwen 2.5 Coder (14B), and the vision-language model Qwen2-VL (7B and 72B).

Beyond these major families, Unsloth also supports other notable models such as DeepSeek-R1, TinyLlama, CodeGemma, Yi, Vicuna, and Open Hermes. The inclusion of vision models like Llama 3.2 Vision (11B) and Pixtral (12B) further extends Unsloth to multimodal applications. This breadth underscores Unsloth's versatility and its aim to serve as a comprehensive solution for fine-tuning diverse LLM architectures.

While the official documentation provides a comprehensive list of supported models, specific version requirements or limitations may exist for certain models. These details are often found in the individual Colab notebooks dedicated to each model or in the "Unsloth Requirements" section of the documentation [1]. For instance, running Unsloth directly on Windows using Triton requires PyTorch 2.4 or later and CUDA 12 [11]. Users should check the instructions for their chosen model to ensure compatibility and prevent errors during installation and training; a quick environment check is sketched below.
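
As a sanity check, the following cell (a minimal sketch; it assumes PyTorch is already installed) prints the versions that most often cause compatibility problems:

```python
# Minimal environment check against Unsloth's documented requirements:
# Python 3.10-3.12, and (for Windows/Triton) PyTorch >= 2.4 with CUDA 12.
import sys
import torch

print("Python :", sys.version.split()[0])      # should be 3.10, 3.11, or 3.12
print("PyTorch:", torch.__version__)           # >= 2.4 needed for Windows/Triton
print("CUDA   :", torch.version.cuda)          # 12.x needed for Windows/Triton
print("GPU    :", torch.cuda.is_available())   # True when a GPU runtime is attached
```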

⚡ Table 1: Supported Language Models in Unsloth Colab

| Model Family | Specific Models | Notes |
| --- | --- | --- |
| Llama | 3 (8B, 70B), 3.1 (8B), 3.2 (1B, 3B), 3.2 Vision (11B) | GRPO reasoning notebooks available for some versions |
| Mistral | v0.3 (7B), v0.3 Instruct (7B), NeMo (12B), Small (22B) | |
| Gemma | 3 (4B, 1B), 2 (9B, 2B) | GRPO reasoning notebook available for Gemma 3 (1B) |
| Phi | 4 (14B), 3.5 (mini), 3 (medium) | GRPO reasoning notebooks available for Phi-4 (14B) |
| Qwen | 2.5 (1.5B, 3B, 7B, 14B, 32B), 2 (7B), 2.5 Coder (14B), Qwen2-VL (7B) | GRPO reasoning notebook available for Qwen 2.5 (3B) |
| Other | DeepSeek-R1, TinyLlama, CodeGemma, Yi, Vicuna, Open Hermes, Pixtral (12B) | Specific notebooks available for various tasks (e.g., conversational) |

This table provides a consolidated view of the language models supported by Unsloth in its Colab notebooks. It organizes the information by model family, drawing from various parts of the documentation [8], and includes a "Notes" column highlighting specific functionalities or versions, such as the availability of GRPO reasoning notebooks. This structured presentation gives readers an at-a-glance summary of the range of models that can be used with Unsloth in Colab.



🌟 4. Installation and Setup in Google Colab:

The primary method for installing Unsloth and its required dependencies within Google Colab is the pip package installer [1]. A commonly used command is `!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"` [4]. The `[colab-new]` extra suggests configurations or optimizations tailored specifically to the Colab environment. Users may also need to install other essential dependencies, such as bitsandbytes, accelerate, transformers, peft (Parameter-Efficient Fine-Tuning), trl (Transformer Reinforcement Learning), and xformers [4]; the specific versions of these supporting libraries can be critical for compatibility and proper functioning of the framework.

The basic command pip install unsloth is generally recommended on Linux [1]. On Windows, installation is typically more involved and may require prerequisites such as the NVIDIA CUDA Toolkit and Microsoft C++ Build Tools, as well as specific environment variables [11]. Unsloth provides a dedicated Windows installation script, unsloth_windows.ps1, to simplify this process, and Conda installation is available as an alternative for users who have the Conda package and environment manager set up locally [11]. These platform-specific procedures mean users should consider their operating system when setting up the Unsloth environment.

Dependency conflicts and version requirements deserve attention. The documentation advises Python 3.10, 3.11, or 3.12; Python 3.13 is not currently supported [11]. Compatibility among PyTorch, CUDA, cuDNN, Triton, and xformers is essential for proper functioning, especially with GPU acceleration, and the PyTorch Compatibility Matrix is a helpful resource for verifying it [11]. For certain advanced functionality, such as running Unsloth directly on Windows, PyTorch 2.4 or later and CUDA 12 are required [11]. Careful management of the software environment is therefore paramount for a successful installation.

The official Unsloth documentation does not directly address permanent installation within Google Colab, but a Stack Overflow discussion [15] describes several methods for achieving it, primarily by leveraging Google Drive or Google Cloud Storage: libraries are installed into a directory on the user's Drive or a mounted Cloud Storage bucket, and that directory is added to the Python path. This approach is relevant for users who wish to avoid reinstalling Unsloth and its dependencies at the start of each new Colab session; a sketch follows.
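
Here is a minimal sketch of the Google Drive technique from reference 15, adapted for Unsloth. The directory name is a hypothetical choice, and compiled dependencies (e.g., bitsandbytes) may still need a per-session install:

```python
# Hypothetical "permanent" install in Colab: place packages on Google Drive
# and add that directory to the Python path at the start of each session.
from google.colab import drive
drive.mount("/content/drive")

import sys
LIB_DIR = "/content/drive/MyDrive/colab_libs"   # hypothetical location
sys.path.insert(0, LIB_DIR)

# Run the install only once; later sessions just need the lines above.
# !pip install --target={LIB_DIR} "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
```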



🌟 5. Executing Unsloth Colab Notebooks:

To execute the code within an Unsloth Colab notebook, click the "Play" button to the left of each code cell [2], or press CTRL + ENTER to run the currently selected cell. These interaction methods are standard in Google Colab and will be familiar to most users of the platform.

Running the cells in order is critical, especially in complex workflows like fine-tuning LLMs with Unsloth [2]. Subsequent cells often depend on results, variables, or models defined or loaded in earlier cells; skipping a cell can cause a later cell to fail when it references something that does not yet exist. Maintaining the intended execution sequence is essential for a smooth, error-free experience.

Beyond individual cells, the "Runtime" menu offers whole-notebook controls. "Run all" executes every code cell sequentially from top to bottom [2], which is quick and convenient when using the notebook's default configuration or for initial testing, but it bypasses any customization or modification the user intends to make to parameters or code.

As cells execute, each one's output is displayed directly below it, letting users monitor library installation, data loading and preprocessing, model configuration, and training progress, and spot warnings early. When an error occurs, Colab displays an error message, typically with a traceback indicating the location and type of the error. Examine these messages carefully to understand the root cause; a common troubleshooting step is to rerun the failing cell, or even some preceding cells, since the issue may stem from an earlier part of the workflow.



🌟 6. Running Unsloth Models and Notebooks Locally:

Fine-tuned language models created with Unsloth in Google Colab can be saved in several ways. Users can save them locally to disk [4], or push them to the Hugging Face Hub, a popular platform for sharing and accessing machine learning models [4], which makes sharing with the community and access from different environments convenient. The Unsloth Colab notebooks themselves, containing the fine-tuning code and configurations, can be downloaded directly from the Unsloth GitHub repository. A minimal saving sketch appears at the end of this section.

Running Unsloth models and notebooks on a local machine requires an appropriate environment: Python 3.10, 3.11, or 3.12, plus a set of essential libraries [1] including unsloth itself and core machine-learning packages such as torch (PyTorch), transformers, accelerate, peft, bitsandbytes, and potentially triton and xformers, depending on the functionality used. On Linux-based systems, pip is the recommended installer [1]. On Windows, setup is generally more complex and often requires additional prerequisites such as the NVIDIA CUDA Toolkit and Microsoft C++ Build Tools; using the Windows Subsystem for Linux (WSL) to create a Linux environment within Windows can simplify the installation [11]. Alternatively, Conda can be used to manage the Python environment and install the necessary libraries [11]. In all cases, setting up a local Unsloth environment demands careful adherence to the instructions specific to your operating system.

Leveraging GPU acceleration locally brings further considerations. On Linux, installation is generally straightforward with pip, given a compatible Python. On Windows, install the NVIDIA CUDA Toolkit and Microsoft C++ Build Tools and set the environment variables the C++ compiler requires [11], or use WSL for a more streamlined experience [11]. To use a local GPU, install NVIDIA drivers and a CUDA toolkit compatible with the installed PyTorch version [11]; running the command nvcc in a terminal verifies that CUDA is correctly installed. As noted earlier, Unsloth currently supports Python 3.10, 3.11, and 3.12, so one of these versions must be present.

Once the local environment is configured with Python and all necessary libraries, the .ipynb notebooks downloaded from the GitHub repository can be run with Jupyter Notebook or JupyterLab. Before running them, confirm that the Jupyter kernel is connected to the Python environment where Unsloth and its dependencies were installed, so the notebook can access the required libraries.
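
As a minimal sketch of the saving options described above, assuming the standard save_pretrained / push_to_hub methods that PEFT-wrapped Hugging Face models expose (the directory and repository names are placeholders):

```python
# Save the fine-tuned LoRA adapters and tokenizer to local disk.
model.save_pretrained("lora_model")        # placeholder directory name
tokenizer.save_pretrained("lora_model")

# Optionally push to the Hugging Face Hub (requires a write-access token):
# model.push_to_hub("your-username/llama3-ft", token="hf_...")
# tokenizer.push_to_hub("your-username/llama3-ft", token="hf_...")

# For inference, Unsloth's notebooks switch the model into inference mode:
# FastLanguageModel.for_inference(model)   # enables faster generation
```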



🌟 7. Configurable Parameters and Settings in Unsloth Colab:

Unsloth Colab notebooks are designed to be highly configurable, exposing a wide array of parameters that users can adjust to tailor the fine-tuning process to their specific needs and resources [2]. These adjustable parameters include:

  • model_name: This parameter specifies the pre-trained language model that will serve as the foundation for fine-tuning, for example "unsloth/llama-3-8b-bnb-4bit".

  • max_seq_length: This determines the maximum length of the input sequences the model will process [2]. Unsloth is notable for efficiently handling very long context lengths during fine-tuning [2].

  • dataset: This parameter indicates the dataset that will be used to fine-tune the model.

  • per_device_train_batch_size: This controls the number of training examples processed by each GPU in a single training iteration.

  • gradient_accumulation_steps: This setting simulates larger batch sizes than would otherwise fit in memory by accumulating gradients over multiple smaller batches before performing a weight update.

  • learning_rate: This crucial parameter sets the magnitude of the steps taken to update the model's weights during training.

  • num_train_epochs or max_steps: These parameters define the total duration of training, either as the number of passes through the entire dataset (epochs) or the total number of training iterations (steps).

  • r (rank), lora_alpha, lora_dropout: These parameters configure LoRA (Low-Rank Adaptation), the parameter-efficient fine-tuning technique Unsloth commonly employs to reduce memory usage and training time [2].

  • target_modules: This parameter specifies which layers of the pre-trained model receive LoRA adapters [2]. Unsloth often recommends training all relevant modules for optimal performance [2].

  • use_gradient_checkpointing: This memory optimization reduces the GPU memory footprint during training [2]. Unsloth often suggests its own optimized variant, selected with the string value "unsloth" [2].

  • dtype: This parameter sets the data type used for training: None for automatic detection based on the hardware, or a specific type such as torch.float16 or torch.bfloat16 [2].

  • load_in_4bit: This boolean enables 4-bit quantization of the model, significantly reducing memory usage and making it possible to fine-tune larger models on devices with limited GPU memory [7].

  • output_dir: This parameter specifies the directory where the fine-tuned model's weights and configuration files are saved after training completes.

This extensive range of configurable parameters gives users a high degree of control over the fine-tuning process, allowing training to be tailored precisely to specific tasks, datasets, and computational resources, and enabling optimization for different hardware configurations and performance goals. Several key training parameters warrant further explanation due to their significant impact on the fine-tuning process:

  • max_seq_length: This parameter directly controls the context window the model can process. Longer sequence lengths let the model understand and generate longer texts but require more GPU memory; Unsloth's efficient implementation permits longer sequences than some other frameworks [2].

  • batch_size: The batch size affects both the speed and stability of training. Larger batches often train faster because more data is processed in parallel, but they demand more GPU memory; smaller batches may be necessary on limited hardware.

  • learning_rate: This critical hyperparameter determines the size of the weight updates the model makes in response to its errors. A higher learning rate can speed convergence but may destabilize training or overshoot the optimal weights; a lower rate yields more stable training at the cost of slower convergence.

  • r (rank): In LoRA, the rank defines the dimensionality of the low-rank matrices inserted into the model's layers. Higher ranks add trainable parameters, potentially improving accuracy on more complex tasks while increasing memory usage and training time. Suggested values in Unsloth notebooks range from 8 for fast fine-tunes up to 128 for more demanding tasks [2].

  • lora_alpha: This parameter acts as a scaling factor for the LoRA matrices; it is often recommended to set it equal to or double the rank r [2]. Its exact impact depends on the task and the dataset.

Understanding the role and interplay of these key parameters is fundamental to fine-tuning LLMs effectively with Unsloth and achieving the desired performance on downstream tasks. Beyond the core training parameters, the notebooks also offer options for model configuration and data handling. Model configuration involves selecting the pre-trained starting model and, optionally, loading it in a quantized format such as 4-bit to reduce memory consumption [7]. Data handling covers specifying the fine-tuning dataset, formatting input prompts appropriately for the chosen model, and optionally using data collators to batch and prepare the data efficiently. Unsloth includes utility functions like to_sharegpt that simplify the often time-consuming work of dataset preparation by automatically merging columns into a unified prompt format [2]. A sketch of a typical trainer cell, showing how these parameters fit together, follows.
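
The following is a hedged sketch of a trainer cell in the style the official notebooks use, built on trl's SFTTrainer. Exact argument names vary across trl versions, and `dataset` is assumed to be a prepared Hugging Face dataset with a "text" column:

```python
import torch
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,                 # LoRA-wrapped model from get_peft_model
    tokenizer=tokenizer,
    train_dataset=dataset,       # assumed: a formatted fine-tuning dataset
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # effective batch size = 2 * 4 = 8
        learning_rate=2e-4,
        max_steps=60,                    # or set num_train_epochs instead
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        output_dir="outputs",            # where weights and configs are saved
    ),
)
trainer.train()
```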



🌟 8. Modifying and Updating Settings for Customization:

Within the Google Colab environment, modifying the configurable parameters in Unsloth notebooks is straightforward. Locate the code cells where the parameters are defined and change their assigned values, typically by finding assignments such as max_seq_length = 2048 and editing the number or string as needed. The interactive nature of Colab makes it exceptionally easy to experiment with different settings and observe their impact on the fine-tuning process.

Several strategies help tailor training or inference to a specific task. If the task involves processing or generating longer sequences of text, increase max_seq_length, keeping in mind that longer sequences also use more memory; Unsloth's architecture handles long context lengths more efficiently than some other frameworks [2]. Adjusting the learning rate and the total training duration (num_train_epochs or max_steps) can significantly affect convergence and final performance. Experimenting with the LoRA parameters, the rank r and scaling factor lora_alpha, helps find the right balance between memory usage and accuracy: a higher rank may benefit tasks sensitive to linguistic nuance, while a lower rank suits resource-constrained environments.

A few common scenarios illustrate the impact of these settings. When working with long documents, raising max_seq_length lets the model process the full context; this increases memory consumption, though Unsloth's optimizations aim to mitigate the cost [2]. If training loss plateaus early, the model may no longer be learning effectively; consider lowering the learning rate to allow finer weight adjustments, or increasing the number of training steps to give the model more opportunity to learn. On GPUs with limited memory, enabling 4-bit quantization by setting load_in_4bit to True can be a crucial step toward fine-tuning larger models, and reducing the LoRA rank (r) and alpha (lora_alpha) further shrinks the memory footprint. A sketch of such adjustments follows.
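
For instance, a memory-constrained, long-document run might edit the notebook's parameter cells roughly as follows. All values here are illustrative assumptions, not recommendations:

```python
# Illustrative adjustments for a memory-constrained, long-context run.
max_seq_length = 4096              # raised for long documents; costs memory
load_in_4bit = True                # 4-bit quantization to fit a small GPU

per_device_train_batch_size = 1    # smallest per-step memory footprint
gradient_accumulation_steps = 8    # keeps the effective batch size at 8

learning_rate = 1e-4               # lowered if training loss plateaus early
max_steps = 120                    # more steps to compensate for slower learning

r = 8                              # smaller LoRA rank to save memory
lora_alpha = 16                    # kept at roughly 1-2x the rank
```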



🌟 9. Advanced Techniques and Community Insights:

Beyond the official documentation, a wealth of insight on using Unsloth Colabs appears in recent tutorials, blog posts, and community discussions. These resources often provide step-by-step guides and practical advice for fine-tuning specific language models or working with custom datasets in the Unsloth Colab environment [4]. Platforms like Reddit host community discussions where users share troubleshooting tips, best practices, and advanced techniques [2]. A recurring theme is Unsloth's ability to handle very long-context fine-tuning efficiently, a significant advantage for tasks that require understanding and generating long sequences of text [2]. The Unsloth GitHub repository itself is an invaluable resource, providing the latest code, documentation, issue tracking, and community contributions [12].

Community resources also explore advanced fine-tuning techniques, optimization strategies, and troubleshooting. For example, Unsloth supports reward-based training through Group Relative Policy Optimization (GRPO), with dedicated notebooks available for the technique [1]. Community members share their findings on parameter settings that work well for specific models and tasks, and discussions frequently address common challenges such as out-of-memory (OOM) errors; a commonly recommended remedy is to reduce the context window (by adjusting max_seq_length) or decrease the batch size [20]. To stay current on the latest information and support, key resources include the Unsloth GitHub repository (https://github.com/unslothai/unsloth).

This repository is the central hub for the project, containing the source code, comprehensive documentation, and a platform for users to report issues and contribute. Additionally, the official Unsloth documentation website (https://docs.unsloth.ai/) may link to community forums or other discussion platforms; its welcome page mentions both Reddit and Discord as community channels [1].



🌟 10. Conclusion:

Unsloth AI, in conjunction with the accessibility of Google Colaboratory, offers a powerful and efficient platform for fine-tuning large language models. The framework’s emphasis on speed and reduced memory usage makes advanced NLP tasks more feasible for a wider range of users, while the extensive support for various model architectures provides considerable flexibility. By understanding the structure and operation of Unsloth Colab notebooks, users can effectively leverage the provided resources for their specific needs. The ability to configure a wide array of parameters allows for fine-grained control over the training process, enabling optimization for different tasks and hardware constraints. Furthermore, the active community surrounding Unsloth provides a valuable source of information, advanced techniques, and troubleshooting support beyond the official documentation.

🔧 Works cited

1. Unsloth Documentation: Welcome, accessed on March 22, 2025, https://docs.unsloth.ai/
2. Step-By-Step Tutorial: How to Fine-tune Llama 3 (8B) with Unsloth + Google Colab & deploy it to Ollama : r/LocalLLaMA - Reddit, accessed on March 22, 2025, https://www.reddit.com/r/LocalLLaMA/comments/1e416fo/stepbystep_tutorial_how_to_finetune_llama_3_8b/
3. Google Colab - Unsloth Documentation, accessed on March 22, 2025, https://docs.unsloth.ai/get-started/installing-+-updating/google-colab
4. Fine-tuning made easy with Unsloth and Colab | by Ahamed Musthafa R S | Medium, accessed on March 22, 2025, https://medium.com/@amrstech/fine-tuning-made-easy-with-unsloth-and-colab-e0993f3f4c07
5. Phi_4-Conversational.ipynb - Google Colab, accessed on March 22, 2025, https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb
6. Unsloth Fine-tuning Notebooks for Google Colab, Kaggle, Hugging Face and more. - GitHub, accessed on March 22, 2025, https://github.com/unslothai/notebooks
7. Fine-tuning Gemma using Unsloth - Colab - Google, accessed on March 22, 2025, https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Gemma/[Gemma_2]Finetune_with_Unsloth.ipynb?hl=tr
8. Unsloth Notebooks | Unsloth Documentation, accessed on March 22, 2025, https://docs.unsloth.ai/get-started/unsloth-notebooks
9. All Our Models | Unsloth Documentation, accessed on March 22, 2025, https://docs.unsloth.ai/get-started/all-our-models
10. Ollama + Unsloth + Llama-3 + Alpaca.ipynb - Colab - Google, accessed on March 22, 2025, https://colab.research.google.com/drive/1WZDi7APtQ9VsvOrQSSC5DDtxq159j8iZ?usp=sharing
11. Windows Installation | Unsloth Documentation, accessed on March 22, 2025, https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation
12. unslothai/unsloth: Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! - GitHub, accessed on March 22, 2025, https://github.com/unslothai/unsloth
13. Pip Install - Unsloth Documentation, accessed on March 22, 2025, https://docs.unsloth.ai/get-started/installing-+-updating/pip-install
14. A Rapid Tutorial on Unsloth - Stephen Diehl, accessed on March 22, 2025, https://www.stephendiehl.com/posts/unsloth/
15. How do I install a library permanently in Colab? - Stack Overflow, accessed on March 22, 2025, https://stackoverflow.com/questions/55253498/how-do-i-install-a-library-permanently-in-colab
16. Pixtral Vision Finetuning Unsloth - General Dataset.ipynb - Google Colab, accessed on March 22, 2025, https://colab.research.google.com/drive/1K9ZrdwvZRE96qGkCq_e88FgV3MLnymQq?usp=sharing
17. Fine-Tuning TinyLLaMA with Unsloth tutorial - Lablab.ai, accessed on March 22, 2025, https://lablab.ai/t/fine-tuning-tinyllama
18. Fine-tuning Llama 3 with Unsloth: A Beginner's Guide | by Seekmeai | Medium, accessed on March 22, 2025, https://medium.com/@seekmeai/fine-tuning-llama-3-with-unsloth-a-beginners-guide-d239d48eaf71
19. How to run Llama 3.1 (or) any LLM in Google Colab | Unsloth - YouTube, accessed on March 22, 2025, https://www.youtube.com/watch?v=4ahdsiKgx6c
20. Fine-Tuning Gemma (Easiest Method with Unsloth & Colab) - YouTube, accessed on March 22, 2025, https://www.youtube.com/watch?v=pWZfufhF45o
21. Tutorial: How to Train your own Reasoning model using Llama 3.1 (8B) + Unsloth + GRPO, accessed on March 22, 2025, https://www.reddit.com/r/LocalLLaMA/comments/1iyuz01/tutorial_how_to_train_your_own_reasoning_model/