Running Large AI Models Locally vs. in the Cloud: A Comprehensive Comparison

Dean Lofts
2 min read · Apr 26, 2023


As AI models like `oasst-sft-6-llama-30b` become more prevalent, users are faced with a decision: should they run these models locally or in the cloud? This article compares these two options comprehensively, including the necessary hardware upgrades for local execution and the cost implications of cloud-based solutions.

Local Execution: Upgrading Your Hardware

Running large AI models locally can offer increased privacy and control but requires substantial hardware upgrades to handle the resource-intensive tasks. To run models like `oasst-sft-6-llama-30b` comfortably, consider the following upgrades:

GPU: Upgrade to a powerful GPU designed for deep learning, such as the NVIDIA A100 or NVIDIA A40, which offer more CUDA cores and higher memory bandwidth.

RAM: Increase your RAM to at least 64 GB to prevent crashes and allow multitasking while the model is running.

CPU: Upgrade your CPU to a more powerful option, such as the Intel Core i9 or AMD Ryzen 9, for improved model training and inference performance.

Storage: Add an SSD for faster read/write speeds and ensure ample storage space for model files and temporary data.

Cooling: Implement adequate cooling solutions to handle the heat generated by the upgraded hardware.

Power Supply: Upgrade your PSU to accommodate the increased power requirements of the new components.
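To see why these upgrades matter, it helps to estimate how much memory the weights alone occupy. The sketch below uses common rule-of-thumb byte counts per parameter for different precisions; these are generic approximations, not measured figures for `oasst-sft-6-llama-30b` specifically.

```python
# Rough memory estimate for a 30B-parameter model's weights.
# Bytes-per-parameter values are standard approximations for each
# precision, not measurements of any specific model.
def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

for precision, bpp in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{precision}: ~{model_memory_gb(30, bpp):.0f} GiB")
```

Even at fp16 a 30B model needs roughly 56 GiB for weights alone, before activations and KV cache, which is why a single consumer GPU falls short and 64 GB of system RAM is a sensible floor for CPU offloading.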

Pros of Local Execution

  • Better privacy and control, since your data never leaves your machine and no internet connection is required
  • No recurring costs (besides initial hardware investment and electricity)

Cons of Local Execution

  • The high upfront cost of hardware upgrades
  • Resource-intensive tasks may impact overall system performance

Cloud Execution: Cost Implications

Cloud-based solutions, such as Google Cloud Platform (GCP) and Amazon Web Services (AWS), provide powerful computing resources for AI tasks. Costs vary depending on the provider, the resources used, and the duration of usage. For example, at the time of writing, a preemptible NVIDIA A100 GPU on GCP costs around $1.20 per hour, while a p4d.24xlarge instance (with 8 NVIDIA A100 GPUs) on AWS costs approximately $32.77 per hour.
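A quick back-of-the-envelope comparison makes the trade-off concrete: divide the hardware outlay by the cloud hourly rate to find the break-even point. The $10,000 local build cost below is an illustrative assumption, not a quote; the $1.20/hour rate is the preemptible A100 figure mentioned above.

```python
# Hedged sketch: break-even point between buying hardware and renting
# cloud GPUs. Dollar figures are illustrative assumptions.
def break_even_hours(hardware_cost: float, cloud_rate_per_hour: float) -> float:
    """Hours of cloud usage at which renting costs as much as buying."""
    return hardware_cost / cloud_rate_per_hour

hours = break_even_hours(10_000, 1.20)  # hypothetical $10k build vs. ~$1.20/hr A100
print(f"Break-even after ~{hours:.0f} GPU-hours (~{hours / 24:.0f} days of continuous use)")
```

Under these assumptions, renting only overtakes buying after thousands of GPU-hours, so occasional users are usually better served by the cloud, while sustained heavy use favours local hardware.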

Pros of Cloud Execution

  • Access to powerful computing resources without upfront hardware investment
  • Easily scalable resources to accommodate varying workloads
  • Pay-as-you-go pricing model

Cons of Cloud Execution

  • Data privacy concerns
  • Recurring costs that can accumulate over time
  • Risk of inadvertently leaving instances running, leading to unexpected charges
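The last point is easy to underestimate. Using the p4d.24xlarge rate quoted earlier (~$32.77/hour), a short sketch shows what a forgotten idle instance costs:

```python
# What an idle p4d.24xlarge (~$32.77/hour, AWS on-demand figure quoted
# above) costs if accidentally left running.
RATE_PER_HOUR = 32.77

def idle_cost(hours: float, rate: float = RATE_PER_HOUR) -> float:
    """Total charge for leaving an instance running for `hours` hours."""
    return hours * rate

print(f"Over a weekend (48 h): ${idle_cost(48):,.2f}")
print(f"Over a month (720 h): ${idle_cost(720):,.2f}")
```

A weekend of idle time costs more than $1,500 at this rate, which is why billing alerts and automatic instance shutdown policies are worth configuring from day one.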

The choice between running large AI models locally or in the cloud ultimately depends on your priorities and requirements. If privacy, control, and a one-time investment are your main concerns, local execution may be the best option. However, cloud-based solutions are worth considering if you prefer flexible, scalable resources without upfront hardware costs. Carefully weigh the pros and cons of each approach to determine the most suitable option for your needs.
