You run ollama pull llama3.3:70b on a workstation with a 1 Gbps connection and it crawls to 47% and hangs. The progress bar stops refreshing, the terminal shows no new bytes, and hitting Ctrl+C then re-running the same command restarts from 0% rather than resuming. This happens across Ollama 0.3 and 0.4 on both macOS and Linux, and it almost never means the model registry is down. The root causes are nearly always local: a stalled TCP connection, a full disk mid-write, a corrupted partial blob, or a proxy that terminates long-lived HTTP streams.
Common causes
Ordered by hit rate, highest first.
1. Network TCP stream times out silently
Ollama streams each layer as a single HTTP range request. Corporate proxies, home routers with aggressive idle-connection timeouts, and some ISPs will silently drop the TCP connection after 30-120 seconds of low throughput — especially for large layers (10+ GB). Ollama 0.3.x does not retry individual layer chunks, so the download appears stuck without an error.
How to spot it: While the pull hangs, run ss -tn | grep :443 (Linux) or netstat -an | grep ESTABLISHED (macOS or Linux) in another terminal. If the connection to registry.ollama.ai disappears from the list, the TCP session dropped.
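To watch for the drop over time instead of checking once, the check can be wrapped in a small counter. This is a minimal sketch for Linux, assuming the usual `ss -tn` column layout (state in column 1, peer address in column 5); `count_https_estab` is a hypothetical helper name, and it counts every established HTTPS connection on the machine, not only Ollama's.

```shell
# Reads `ss -tn` output on stdin and prints how many ESTAB sockets
# have a peer on port 443 (column 5 is "Peer Address:Port").
count_https_estab() {
    awk '$1 == "ESTAB" && $5 ~ /:443$/ { n++ } END { print n + 0 }'
}

# Example loop (run manually while the pull hangs):
#   while sleep 5; do
#       printf '%s %s\n' "$(date +%T)" "$(ss -tn | count_https_estab)"
#   done
```

If the count falls while the progress bar is frozen, a dropped connection is the likely cause.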
2. Target disk is full mid-download
Ollama writes blobs to ~/.ollama/models/blobs/ on macOS/Linux or %USERPROFILE%\.ollama\models\blobs\ on Windows. If that volume fills up during the download, the write stalls silently rather than throwing an error.
How to spot it: Run df -h ~/.ollama while the download is hanging. If “Use%” (shown as “Capacity” on macOS) is at or near 100%, disk space is the cause.
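The same check can be scripted as a pre-flight test before starting a pull. The sketch below uses `df -Pk` (POSIX output format, 1024-byte blocks); `has_free_space` is a hypothetical helper, and the 50 GiB figure in the usage comment is only an illustrative threshold, not a measured model size.

```shell
# Succeeds if the filesystem holding path $1 has at least $2 GiB available.
# Column 4 of `df -Pk` is the available space in KiB.
# The body runs in a subshell so no variables leak into the caller.
has_free_space() (
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $(( $2 * 1024 * 1024 )) ]
)

# Example (the directory must already exist):
#   has_free_space "$HOME/.ollama" 50 || echo "not enough free space" >&2
```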
3. Corrupted partial blob from a previous interrupted download
A prior interrupted pull leaves a partial .part file in ~/.ollama/models/blobs/. Ollama 0.3 tries to continue writing into it but the checksum boundary is wrong, so it stalls trying to validate a corrupt chunk.
How to spot it: Run ls -lh ~/.ollama/models/blobs/ | grep '\.part' (the backslash makes grep match a literal dot). Any .part file that predates the current pull is leftover state.
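Reading timestamps by eye gets tedious, so `find` can filter by age directly. This is a sketch that assumes the partial files carry the .part suffix described above; `stale_parts` is a hypothetical helper name.

```shell
# Prints *.part files under directory $1 whose last modification
# is more than $2 minutes old.
stale_parts() {
    find "$1" -type f -name '*.part' -mmin +"$2"
}

# Example: list partial blobs untouched for over an hour.
#   stale_parts "$HOME/.ollama/models/blobs" 60
```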
4. DNS resolution failure for layer CDN hosts
The blob CDN (blob.ollama.ai or Cloudflare R2 buckets) resolves differently from registry.ollama.ai. If your DNS returns a bad address for the CDN host, the manifest downloads fine but the first large layer request hangs waiting for a TCP connection that never completes.
How to spot it: Run curl -v --max-time 10 https://blob.ollama.ai and check whether it times out while registry.ollama.ai responds normally.
5. OLLAMA_MODELS points to a slow or network-mounted path
If you have OLLAMA_MODELS=/mnt/nas/ollama or similar, writes to a network share can stall under sustained 5-15 GB writes. NFS and SMB can buffer and then block when the server’s write cache fills.
How to spot it: Run echo $OLLAMA_MODELS. If it points outside your local disk, move the download to a local path first.
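A path does not always reveal whether it lives on a network mount. On Linux, the filesystem type can be read with `findmnt` (part of util-linux) and compared against common network types. This is a sketch: both helper names are hypothetical, and the list of types is an assumption that may need extending for your environment.

```shell
# Succeeds if the filesystem type name in $1 looks like a network filesystem.
is_network_fs() {
    case "$1" in
        nfs*|cifs|smb*|fuse.sshfs|afpfs|9p) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the filesystem type backing path $1 (Linux only, needs findmnt).
fs_type_of() {
    findmnt -n -o FSTYPE -T "$1"
}

# Example:
#   is_network_fs "$(fs_type_of "${OLLAMA_MODELS:-$HOME/.ollama}")" \
#       && echo "models directory is on a network mount" >&2
```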
6. Antivirus or firewall scanning the blob writes in real time
Real-time antivirus scanning of multi-gigabyte files being written to disk can stall file I/O enough to cause the download loop to time out its write buffer.
How to spot it: Temporarily disable real-time protection, restart the pull, and check whether throughput returns. On Windows check Windows Defender exclusions for %USERPROFILE%\.ollama.
Shortest path to fix
Step 1: Free disk space and remove partial blobs
# Check disk space
df -h ~/.ollama
# Remove all partial blobs from previous attempts
rm -f ~/.ollama/models/blobs/*.part
# Also check for zero-byte blobs
find ~/.ollama/models/blobs/ -size 0 -delete
Step 2: Pull again with verbose output to watch progress per-layer
OLLAMA_DEBUG=1 ollama pull llama3.3:70b
With OLLAMA_DEBUG=1 Ollama prints the URL of each layer being fetched and the byte range. If it stalls on a specific layer hash, note the hash for step 3.
Step 3: Test direct CDN connectivity
# Confirm registry responds
curl -I https://registry.ollama.ai/v2/
# Test large download throughput to the CDN (replace with actual blob URL from debug output)
curl -o /dev/null --max-time 60 -w "%{speed_download}" \
"https://blob.ollama.ai/sha256:abc123..."
If throughput is under 1 MB/s or the request times out, the network path is the bottleneck.
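curl reports `%{speed_download}` in bytes per second, so the 1 MB/s threshold needs a unit conversion. A small sketch of that comparison follows; `is_slow` is a hypothetical helper, and `BLOB_URL` in the usage comment stands for the blob URL taken from your own debug output.

```shell
# Succeeds if a curl %{speed_download} value ($1, bytes per second) is
# below the threshold in MB/s ($2, default 1). awk does the decimal math.
is_slow() {
    awk -v bps="$1" -v min="${2:-1}" 'BEGIN { exit !(bps / 1000000 < min) }'
}

# Example:
#   speed=$(curl -s -o /dev/null --max-time 60 -w '%{speed_download}' "$BLOB_URL")
#   is_slow "$speed" && echo "network path is the bottleneck" >&2
```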
Step 4: Pull over a different network or with a proxy bypass
# Bypass corporate proxy for Ollama traffic
NO_PROXY=registry.ollama.ai,blob.ollama.ai ollama pull llama3.3:70b
# Or force Go's built-in DNS resolver instead of the system resolver
GODEBUG=netdns=go ollama pull llama3.3:70b
Step 5: Point OLLAMA_MODELS to local fast storage
# In ~/.bashrc or ~/.zshrc
export OLLAMA_MODELS=/path/to/local/ssd/ollama
# Restart the Ollama service
ollama serve &
ollama pull llama3.3:70b
Step 6: Pull a smaller model first to verify the stack works
# Verify the pipeline works end-to-end with a fast model
ollama pull llama3.2:3b
ollama run llama3.2:3b "say hello"
# If this succeeds, the setup itself works; the 70B stall comes from how long the large transfer has to stay open
Prevention
- Add ~/.ollama to your antivirus exclusion list before pulling large models.
- Keep at least 20 GB of free space on the Ollama models volume; check with df -h ~/.ollama before starting a pull over 10 GB.
- In your shell profile, export OLLAMA_MODELS set to a local NVMe path, not a network share.
- Use OLLAMA_DEBUG=1 for any pull over 20 GB so you can see exactly which layer stalls.
- On macOS, shorten the TCP keepalive idle time so quiet connections are probed sooner: sudo sysctl -w net.inet.tcp.keepidle=60000 (the value is in milliseconds).
- If behind a corporate proxy, add NO_PROXY=*.ollama.ai to your environment permanently.
- After a stall, always run rm -f ~/.ollama/models/blobs/*.part before retrying to avoid checksum loops.
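The clean-up-then-retry habit can be automated with a generic retry wrapper. This is a sketch, not something Ollama ships: `retry` is a hypothetical helper, and it only helps when the pull exits with an error. For a silent hang, the command would also need a time limit, for example GNU coreutils `timeout`, assuming it is installed.

```shell
# Runs the command given after the first two arguments up to $1 times,
# sleeping $2 seconds between failed attempts. Returns 0 on first success,
# 1 if every attempt fails.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        echo "attempt $attempt/$max failed" >&2
        [ "$attempt" -lt "$max" ] && sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 1
}

# Example: remove partial blobs, then pull, up to three times.
#   pull_clean() { rm -f "$HOME"/.ollama/models/blobs/*.part; ollama pull "$1"; }
#   retry 3 30 pull_clean llama3.3:70b
```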
FAQ
Q: Does Ollama resume downloads after a network interruption?
A: Partially — Ollama 0.4 will skip already-downloaded layers (blobs already present with valid checksums) but will restart the specific layer that was in-progress when the connection dropped. Removing .part files before retrying ensures a clean restart of that layer.
Q: Can I pre-download a model on another machine and copy the blobs?
A: Yes. Copy the entire ~/.ollama/models/ directory (blobs + manifests subdirectories) to the target machine. Then run ollama list — the model will appear without a new network download.
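After copying, it can be worth confirming that the blobs arrived intact. The sketch below assumes blob files are named sha256-<hex digest>, so the file name itself carries the expected checksum; `verify_blob` is a hypothetical helper that falls back to `shasum` where `sha256sum` is not available.

```shell
# Succeeds if the SHA-256 of file $1 matches the hex digest in its name.
# Assumption: the file is named "sha256-<hex digest>".
verify_blob() {
    expected=${1##*/}               # strip the directory part
    expected=${expected#sha256-}    # strip the "sha256-" prefix
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$1" | awk '{ print $1 }')
    else
        actual=$(shasum -a 256 "$1" | awk '{ print $1 }')
    fi
    [ "$actual" = "$expected" ]
}

# Example: report any blob whose content does not match its name.
#   for f in "$HOME"/.ollama/models/blobs/sha256-*; do
#       verify_blob "$f" || echo "mismatch: $f" >&2
#   done
```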
Q: Why does the percentage jump backwards sometimes?
A: Ollama downloads layers in parallel. The percentage shown is an aggregate across all layers. A layer that was partially written and then failed its checksum is discarded and restarts, which causes the aggregate to drop.
Q: How do I pull a specific quantization variant rather than the default?
A: Use the tag syntax: ollama pull llama3.3:70b-instruct-q4_K_M. Check https://ollama.com/library/llama3.3/tags for the full tag list, including the Q4_K_M, Q5_K_M, and Q8_0 variants.