Installing and Running OpenELM on Ubuntu
1. Environment Preparation
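This section has no commands in the original; a minimal setup sketch, assuming Ubuntu 22.04 and a virtual-environment workflow (the environment name and package versions are suggestions, not pinned by the OpenELM docs):

```shell
# Install Python tooling (assumes Ubuntu 22.04; package names may differ on other releases)
sudo apt-get update
sudo apt-get install -y python3 python3-pip python3-venv git
# Isolate dependencies in a virtual environment (directory name is an arbitrary choice)
python3 -m venv "$HOME/openelm-env"
. "$HOME/openelm-env/bin/activate"
# Core inference stack used in the sections below
pip install --upgrade pip
pip install torch transformers accelerate
```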
2. Obtaining the Model Weights
1) Create a Hugging Face access token (https://huggingface.co/settings/tokens, with the read permission checked).
2) Install Git LFS: curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash && sudo apt-get install git-lfs && git lfs install
3) Clone the model: git clone https://huggingface.co/apple/OpenELM-3B-Instruct
After the clone finishes, the weight shards should be present in the repository directory:
-rw-r--r-- 1 user user 4.2G model-00001-of-00002.safetensors
-rw-r--r-- 1 user user 1.8G model-00002-of-00002.safetensors
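As an alternative to git + LFS, the huggingface_hub library can fetch the same repository; a sketch (the helper name and target directory are arbitrary choices, and a `huggingface-cli login` or HF_TOKEN may be required for gated dependencies):

```python
from huggingface_hub import snapshot_download

def download_openelm(repo_id: str = "apple/OpenELM-3B-Instruct",
                     local_dir: str = "OpenELM-3B-Instruct") -> str:
    """Download every file of the repo (several GB) and return the local path."""
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling download_openelm() pulls roughly 6 GB, so run it on a connection and disk that can handle the transfer.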
3. Quick Verification and Command-Line Inference
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "apple/OpenELM-3B-Instruct"
# OpenELM repositories do not ship a tokenizer of their own; the model card
# points to the Llama-2 tokenizer, which is gated and requires an approved
# Hugging Face access token.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,  # OpenELM's model code lives in the repo itself
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

prompt = "Once upon a time there was"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, temperature=0.7, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
4. Docker GPU Deployment (Optional)
docker run -it --gpus all \
-v $(pwd):/workspace \
-p 7860:7860 \
--name openelm-deploy \
nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04 /bin/bash
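The CUDA runtime image above ships without Python, so the inference stack must be installed inside the container before anything will run; a suggested sequence (not part of any official OpenELM instructions):

```shell
# Run these inside the container started by the docker command above
apt-get update && apt-get install -y python3 python3-pip git
pip3 install torch transformers accelerate
# Port 7860 is already published by docker run, e.g. for a Gradio demo served from /workspace
```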
5. Common Issues and Optimization