---
name: huggingface-hub
description: "HuggingFace hf CLI: search/download/upload models, datasets."
version: 1.0.0
author: Hugging Face
license: MIT
tags: [huggingface, hf, models, datasets, hub, mlops]
---

# Hugging Face CLI (`hf`) Reference Guide

The `hf` command is the modern command-line interface for interacting with the Hugging Face Hub, providing tools to manage repositories, models, datasets, and Spaces.

> **IMPORTANT:** The `hf` command replaces the now-deprecated `huggingface-cli` command.

## Quick Reference
*   **Installation:** `curl -LsSf https://hf.co/cli/install.sh | bash -s`
*   **Help:** Use `hf --help` to view all available functions and real-world examples.
*   **Authentication:** Recommended via `HF_TOKEN` environment variable or the `--token` flag.
*   **Python API patterns:** See `references/python-api-patterns.md` for correct HfApi usage, parameter notes, and common pipeline tags.

---

## Core Commands

### General Operations
*   `hf download REPO_ID`: Download files from the Hub.
*   `hf upload REPO_ID`: Upload files/folders (recommended for single-commit).
*   `hf upload-large-folder REPO_ID LOCAL_PATH`: Recommended for resumable uploads of large directories.
*   `hf sync`: Sync files between a local directory and a bucket.
*   `hf env` / `hf version`: View environment and version details.

### Authentication (`hf auth`)
*   `login` / `logout`: Manage sessions using tokens from [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens).
*   `list` / `switch`: Manage and toggle between multiple stored access tokens.
*   `whoami`: Identify the currently logged-in account.

### Repository Management (`hf repos`)
*   `create` / `delete`: Create or permanently remove repositories.
*   `duplicate`: Clone a model, dataset, or Space to a new ID.
*   `move`: Transfer a repository between namespaces.
*   `branch` / `tag`: Manage Git-like references.
*   `delete-files`: Remove specific files using patterns.

---

## Specialized Hub Interactions

### Datasets & Models
*   **Datasets:** `hf datasets list`, `info`, and `parquet` (list parquet URLs).
*   **SQL Queries:** `hf datasets sql SQL` — Execute raw SQL via DuckDB against dataset parquet URLs.
*   **Models:** `hf models list` and `info`.
*   **Papers:** `hf papers list` — View daily papers.

### Discussions & Pull Requests (`hf discussions`)
*   Manage the lifecycle of Hub contributions: `list`, `create`, `info`, `comment`, `close`, `reopen`, and `rename`.
*   `diff`: View changes in a PR.
*   `merge`: Finalize pull requests.

### Infrastructure & Compute
*   **Endpoints:** Deploy and manage Inference Endpoints (`deploy`, `pause`, `resume`, `scale-to-zero`, `catalog`).
*   **Jobs:** Run compute tasks on HF infrastructure. Includes `hf jobs uv` for running Python scripts with inline dependencies and `stats` for resource monitoring.
*   **Spaces:** Manage interactive apps. Includes `dev-mode` and `hot-reload` for Python files without full restarts.

### Storage & Automation
*   **Buckets:** Full S3-like bucket management (`create`, `cp`, `mv`, `rm`, `sync`).
*   **Cache:** Manage local storage with `list`, `prune` (remove detached revisions), and `verify` (checksum checks).
*   **Webhooks:** Automate workflows by managing Hub webhooks (`create`, `watch`, `enable`/`disable`).
*   **Collections:** Organize Hub items into collections (`add-item`, `update`, `list`).

---

## Pitfalls & Troubleshooting

### Network Connectivity (China/Restricted Networks)
The Hugging Face API (`huggingface.co`) may be unreachable from China or other restricted networks. When it is, both the `hf` CLI and the `huggingface_hub` Python library time out.

**Diagnosis:**
```bash
curl -s --connect-timeout 5 "https://huggingface.co" > /dev/null && echo "OK" || echo "BLOCKED"
```
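The same check can be scripted in Python when `curl` is not available. This is a minimal sketch using only the standard library; the function name `hub_reachable` and its defaults are illustrative assumptions, not part of `huggingface_hub`. Unlike `curl`, it opens a direct TCP connection and ignores proxy environment variables, so it tests the direct route only.

```python
import socket


def hub_reachable(host: str = "huggingface.co", port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a direct TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, refused connection, and timeout.
        return False
```

Calling `hub_reachable()` with no arguments probes `huggingface.co` on port 443; pass a mirror's hostname to probe that instead.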

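**Proxy configuration:**
```bash
# Option 1: Environment variables (may not work if proxy is down)
export https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890
```

```python
# Option 2: Route the Python library through an explicit httpx client.
# Assumption: huggingface_hub >= 1.0, which is httpx-based and exposes
# set_client_factory. HfApi itself does not accept an httpx client argument.
import httpx
from huggingface_hub import HfApi, set_client_factory


def proxied_client() -> httpx.Client:
    # follow_redirects is enabled because Hub downloads are served via redirects.
    return httpx.Client(proxy="http://127.0.0.1:7890", follow_redirects=True)


set_client_factory(proxied_client)
api = HfApi()
```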

**Fallback when API is unreachable:** Use cached data, web search for model info, or hardcoded model lists for UI/reporting tasks.
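One way to implement the cached-data fallback is to write every successful result to disk and read it back when the live call fails. The sketch below assumes the result is a plain list of model IDs; `load_model_ids`, `fetch_top_models`, and the cache path are hypothetical names chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable


def load_model_ids(fetch: Callable[[], list[str]], cache_path: Path) -> list[str]:
    """Try the live fetch first; fall back to the last cached result on failure."""
    try:
        model_ids = fetch()
    except Exception:  # network errors differ by library version, so catch broadly
        if cache_path.exists():
            return json.loads(cache_path.read_text(encoding="utf-8"))
        raise  # no cache yet: surface the original error
    cache_path.write_text(json.dumps(model_ids), encoding="utf-8")
    return model_ids


def fetch_top_models() -> list[str]:
    """Live fetch; requires `huggingface_hub` and network access."""
    from huggingface_hub import HfApi  # imported lazily so the fallback works offline

    return [m.id for m in HfApi().list_models(sort="downloads", limit=20)]
```

A typical call would be `load_model_ids(fetch_top_models, Path("models_cache.json"))`. The first successful run seeds the cache, and later runs on a blocked network return the cached list.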

### Python API Parameter Gotchas
- `list_models()` has no `direction` parameter in current releases (it was deprecated, then removed). Use `sort="downloads"` on its own; results come back highest first, so reverse client-side if you need ascending order.
- The `expand` parameter accepts a list of property names to include in the response (e.g., `["downloads", "likes", "pipeline_tag"]`). Without it, many fields may be `None`.
- The iterator returned by `list_models()` is lazy. Wrapping it in `list()` forces a full fetch, which may be slow for large result sets.
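Because the iterator is lazy and the sort order is fixed by the server, slicing and re-ordering are done on the client. The helpers below are an illustrative sketch (the names `first_n` and `by_downloads` are not part of `huggingface_hub`); they work on any iterable of objects with a `downloads` attribute, such as the items yielded by `list_models()`.

```python
from itertools import islice
from typing import Iterable


def first_n(models: Iterable, n: int) -> list:
    """Consume at most `n` items from a lazy iterator without fetching the rest."""
    return list(islice(models, n))


def by_downloads(models: Iterable, ascending: bool = False) -> list:
    """Sort client-side; a missing or None `downloads` value is treated as 0."""
    return sorted(
        models,
        key=lambda m: getattr(m, "downloads", None) or 0,
        reverse=not ascending,
    )
```

For example, `by_downloads(first_n(api.list_models(sort="downloads"), 50), ascending=True)` would fetch only as many pages as needed for 50 models and then reverse the order locally.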

### Timeout Issues
Default timeouts may be too short for slow connections:
```python
from huggingface_hub import HfApi

api = HfApi()
# list_models() has no direct timeout parameter; it relies on httpx defaults.
# If requests time out, lower `limit` or narrow the query with `filter`.
models = api.list_models(sort="downloads", limit=20, filter="text-generation")
```
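The advice to reduce `limit` can be automated with a retry loop that halves the limit after each failure. This is a sketch under assumptions: `fetch_with_shrinking_limit` is a hypothetical helper, `fetch` is any callable that takes a limit and returns a list, and exceptions are caught broadly because the exact timeout exception depends on the HTTP backend.

```python
import time
from typing import Callable


def fetch_with_shrinking_limit(
    fetch: Callable[[int], list],
    limit: int = 100,
    min_limit: int = 5,
    pause: float = 1.0,
) -> list:
    """Call fetch(limit); on failure halve the limit and retry down to min_limit."""
    while True:
        try:
            return fetch(limit)
        except Exception:
            if limit <= min_limit:
                raise  # already at the smallest allowed query: give up
            limit = max(min_limit, limit // 2)
            time.sleep(pause)  # brief pause before retrying
```

With the library installed, `fetch` could be `lambda n: list(HfApi().list_models(sort="downloads", limit=n, filter="text-generation"))`.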

### CLI vs Python Library
- `hf models list` — CLI, may hang on slow networks
- `HfApi().list_models()` — Python, same network requirements
- Both share the same backend API and will fail together if network is blocked

## Advanced Usage & Tips

### Global Flags
*   `--format json`: Produces machine-readable output for automation.
*   `-q` / `--quiet`: Limits output to IDs only.
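For automation, the JSON output can be consumed from Python through `subprocess`. The wrapper below is a minimal sketch: `hf_json` is a hypothetical helper, and it assumes the chosen subcommand accepts `--format json` and prints a single JSON document to stdout. The `run` parameter exists only so the process call can be replaced in tests.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def hf_json(args: Sequence[str], run: Callable = subprocess.run) -> Any:
    """Run `hf <args> --format json` and return the parsed output."""
    completed = run(
        ["hf", *args, "--format", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

For instance, `hf_json(["models", "list"])` would return whatever structure the CLI emits for that command; inspect it before relying on specific keys.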

### Extensions & Skills
*   **Extensions:** Extend CLI functionality via GitHub repositories using `hf extensions install REPO_ID`.
*   **Skills:** Manage AI assistant skills with `hf skills add`.

---

## Python API (`huggingface_hub`)

The `huggingface_hub` Python package provides programmatic access to the Hub. A common pattern for listing models:

```python
from huggingface_hub import HfApi
api = HfApi()
models = api.list_models(sort="downloads", limit=30)
for m in models:
    print(m.id, m.downloads, m.pipeline_tag)
```

### Key Parameters for `list_models()`
| Parameter | Type | Description |
|-----------|------|-------------|
| `sort` | str | `"downloads"`, `"likes"`, `"trending_score"`, `"created_at"`, `"last_modified"` |
| `limit` | int | Max models to return (None = all) |
| `pipeline_tag` | str | Filter by task: `"text-generation"`, `"image-classification"`, etc. |
| `filter` | str | Tag filter (library, language, task, or other tags) |
| `author` | str | Filter by org/user |
| `search` | str | Substring match on model ID |
| `expand` | list | Extra fields: `["downloads", "likes", "pipeline_tag", "tags", "library_name"]` |

See `references/python-api-notes.md` for pitfalls, proxy config, and offline fallback model list.
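The parameters in the table can be assembled and checked before any request is sent. The helper below is an illustrative sketch, not part of `huggingface_hub`: `list_models_kwargs` and `ALLOWED_SORT` are hypothetical names, and the allowed sort values are copied from the table above.

```python
from typing import Optional

# Sort keys taken from the parameter table; adjust if your library version differs.
ALLOWED_SORT = {"downloads", "likes", "trending_score", "created_at", "last_modified"}


def list_models_kwargs(
    sort: str = "downloads",
    limit: int = 30,
    pipeline_tag: Optional[str] = None,
    author: Optional[str] = None,
    search: Optional[str] = None,
    expand: Optional[list] = None,
) -> dict:
    """Build keyword arguments for list_models(), rejecting risky values early."""
    if sort not in ALLOWED_SORT:
        raise ValueError(f"unsupported sort key: {sort!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("set a positive limit to avoid fetching every model")
    kwargs = {"sort": sort, "limit": limit}
    optional = {
        "pipeline_tag": pipeline_tag,
        "author": author,
        "search": search,
        "expand": expand,
    }
    # Only forward the filters that were actually provided.
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs
```

The result is passed straight through, for example `HfApi().list_models(**list_models_kwargs(pipeline_tag="text-generation", expand=["downloads", "likes"]))`.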

---

## Pitfalls

### Network connectivity (China / restricted networks)
The Hugging Face API (`huggingface.co`) is often unreachable without a proxy. If `HfApi` calls hang or time out:
1. Set a proxy: `export https_proxy=http://127.0.0.1:7890`
2. If the proxy also fails, use the **offline fallback**: hardcode a curated model list (see `references/python-api-notes.md`)
3. The `hf` CLI respects the `HF_ENDPOINT` environment variable, which can point to a mirror such as `https://hf-mirror.com`
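The mirror setting from step 3 can also be applied from Python. This sketch assumes the library reads `HF_ENDPOINT` once, when `huggingface_hub` is first imported, so the variable has to be set before that import; `use_mirror` is a hypothetical helper name.

```python
import os
from typing import MutableMapping


def use_mirror(endpoint: str, env: MutableMapping[str, str] = os.environ) -> None:
    """Point Hub clients at `endpoint`; call before importing `huggingface_hub`."""
    # Strip a trailing slash so URL joins do not produce a double slash.
    env["HF_ENDPOINT"] = endpoint.rstrip("/")
```

Call `use_mirror("https://hf-mirror.com")` at the top of a script, then import `huggingface_hub` afterwards. Child processes started from the same script, including the `hf` CLI, inherit the variable.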

### `list_models()` has NO `direction` parameter
Unlike some APIs, you cannot pass `direction=-1` to reverse sort. The `sort` parameter alone determines order (highest first for `downloads`). Passing `direction` raises `TypeError`.

### Timeout on large queries
`list_models(limit=None)` fetches ALL models (millions) and can hang. Always set a reasonable `limit` (e.g., 30-100).

### `expand` parameter trade-off
Using `expand=["downloads", "likes", ...]` makes each response larger but avoids needing separate calls. However, it can slow down the request significantly for large result sets.
