Running Large AI Models Locally vs. in the Cloud: A Comprehensive Comparison

Dean Lofts
2 min read · Apr 26, 2023


As AI models like `oasst-sft-6-llama-30b` become more prevalent, users are faced with a decision: should they run these models locally or in the cloud? This article compares these two options comprehensively, including the necessary hardware upgrades for local execution and the cost implications of cloud-based solutions.

Local Execution: Upgrading Your Hardware

Running large AI models locally can offer increased privacy and control but requires substantial hardware upgrades to handle the resource-intensive tasks. To run models like `oasst-sft-6-llama-30b` comfortably, consider the following upgrades:

GPU: Upgrade to a powerful GPU designed for deep learning, such as the NVIDIA A100 or NVIDIA A40, which offer more CUDA cores and higher memory bandwidth.

RAM: Increase your RAM to at least 64 GB to prevent crashes and allow multitasking while the model is running.

CPU: Upgrade your CPU to a more powerful option, such as the Intel Core i9 or AMD Ryzen 9, for improved model training and inference performance.

Storage: Add an SSD for faster read/write speeds and ensure ample storage space for model files and temporary data.

Cooling: Implement adequate cooling solutions to handle the heat generated by the upgraded hardware.

Power Supply: Upgrade your PSU to accommodate the increased power requirements of the new components.
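To see why these upgrades matter, it helps to estimate how much memory the weights alone occupy. The sketch below uses common rule-of-thumb byte counts per parameter for different precisions; these are generic approximations, not measured figures for `oasst-sft-6-llama-30b` specifically.

```python
# Rough memory estimate for a 30B-parameter model's weights.
# Bytes-per-parameter values are standard approximations for each
# precision, not measurements of any specific model.
def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

for precision, bpp in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{precision}: ~{model_memory_gb(30, bpp):.0f} GiB")
```

Even at fp16 a 30B model needs roughly 56 GiB for weights alone, before activations and KV cache, which is why a single consumer GPU falls short and 64 GB of system RAM is a sensible floor for CPU offloading.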

Pros of Local Execution

  • Better privacy and control, since your data never leaves your machine and no internet connection is required
  • No recurring costs (besides initial hardware investment and electricity)

Cons of Local Execution

  • The high upfront cost of hardware upgrades
  • Resource-intensive tasks may impact overall system performance

Cloud Execution: Cost Implications

Cloud-based solutions, such as Google Cloud Platform (GCP) and Amazon Web Services (AWS), provide powerful computing resources for AI tasks. Costs vary depending on the provider, the resources used, and the duration of usage. For example, at the time of writing, a preemptible NVIDIA A100 GPU on GCP costs around $1.20 per hour, while a p4d.24xlarge instance (with 8 NVIDIA A100 GPUs) on AWS costs approximately $32.77 per hour.
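A quick back-of-the-envelope comparison makes the trade-off concrete: divide the hardware outlay by the cloud hourly rate to find the break-even point. The $10,000 local build cost below is an illustrative assumption, not a quote; the $1.20/hour rate is the preemptible A100 figure mentioned above.

```python
# Hedged sketch: break-even point between buying hardware and renting
# cloud GPUs. Dollar figures are illustrative assumptions.
def break_even_hours(hardware_cost: float, cloud_rate_per_hour: float) -> float:
    """Hours of cloud usage at which renting costs as much as buying."""
    return hardware_cost / cloud_rate_per_hour

hours = break_even_hours(10_000, 1.20)  # hypothetical $10k build vs. ~$1.20/hr A100
print(f"Break-even after ~{hours:.0f} GPU-hours (~{hours / 24:.0f} days of continuous use)")
```

Under these assumptions, renting only overtakes buying after thousands of GPU-hours, so occasional users are usually better served by the cloud, while sustained heavy use favours local hardware.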

Pros of Cloud Execution

  • Access to powerful computing resources without upfront hardware investment
  • Easily scalable resources to accommodate varying workloads
  • Pay-as-you-go pricing model

Cons of Cloud Execution

  • Data privacy concerns
  • Recurring costs that can accumulate over time
  • Risk of inadvertently leaving instances running, leading to unexpected charges
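The last point is easy to underestimate. Using the p4d.24xlarge rate quoted earlier (~$32.77/hour), a short sketch shows what a forgotten idle instance costs:

```python
# What an idle p4d.24xlarge (~$32.77/hour, AWS on-demand figure quoted
# above) costs if accidentally left running.
RATE_PER_HOUR = 32.77

def idle_cost(hours: float, rate: float = RATE_PER_HOUR) -> float:
    """Total charge for leaving an instance running for `hours` hours."""
    return hours * rate

print(f"Over a weekend (48 h): ${idle_cost(48):,.2f}")
print(f"Over a month (720 h): ${idle_cost(720):,.2f}")
```

A weekend of idle time costs more than $1,500 at this rate, which is why billing alerts and automatic instance shutdown policies are worth configuring from day one.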

The choice between running large AI models locally or in the cloud ultimately depends on your priorities and requirements. If privacy, control, and a one-time investment are your main concerns, local execution may be the best option. However, cloud-based solutions are worth considering if you prefer flexible, scalable resources without upfront hardware costs. Carefully weigh the pros and cons of each approach to determine the most suitable option for your needs.
