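If the target machine has no network access at all, complete the following preparation on a machine that does, then copy the results to the offline machine. Note that pip download fetches packages for the platform and Python version it runs on, so the networked machine should match the offline machine's operating system, CPU architecture, and Python version.

Recommended: check the Python and pip versions first.

python3 --version
pip3 --version

Clone the model repository with Git LFS:

git lfs install
git clone https://huggingface.co/apple/OpenELM-270M-Instruct

Common models:

OpenELM-270M
OpenELM-450M
OpenELM-1_1B
OpenELM-3B
OpenELM-270M-Instruct

Alternatively, visit:

https://huggingface.co/apple

Download (along with any .py files in the repository, since OpenELM ships its model code there):

model.safetensors
config.json
tokenizer.model
tokenizer_config.json
generation_config.json

On the networked machine, run:

pip download torch transformers sentencepiece -d ./offline_packages

For the CPU build of PyTorch (smaller):

pip download torch --index-url https://download.pytorch.org/whl/cpu -d ./offline_packages

Package the model and the dependencies:

tar -czvf openelm_offline.tar.gz \
    OpenELM-270M-Instruct \
    offline_packages

On the offline machine, extract the archive:

tar -xzvf openelm_offline.tar.gz
cd OpenELM-270M-Instruct

Install the dependencies:

pip install --no-index --find-links=../offline_packages \
    torch transformers sentencepiece

Verify: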
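python -c "import transformers; print(transformers.__version__)"

Save the following as run_openelm.py:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Path is relative to where the script is run; use "." when running inside the model directory.
model_path = "./OpenELM-270M-Instruct"

# Load the tokenizer and model strictly from local files (no network access).
tokenizer = AutoTokenizer.from_pretrained(model_path, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    local_files_only=True,
    torch_dtype=torch.float32,
    # OpenELM ships its model code in the repository, so this is typically required.
    trust_remote_code=True,
)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 50 new tokens after the prompt.
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Run it:

python run_openelm.py

Make sure the model directory contains the tokenizer files: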
tokenizer.model
tokenizer_config.json
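special_tokens_map.json

The OpenELM model card points to the Llama 2 tokenizer (meta-llama/Llama-2-7b-hf), so if these files are missing from the model directory they likely need to be copied in from that tokenizer's repository.

To run on CPU:

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    local_files_only=True,
    torch_dtype=torch.float32,
    trust_remote_code=True,  # typically required for OpenELM's custom model code
    device_map="cpu",  # requires the accelerate package; omit it to load on CPU by default
)

Use torch.float32 rather than float16
Quantize with bitsandbytes or gguf (requires extra tools)
mlx framework (Mac only)

Online machine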
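├─ Download the model weights
├─ Download the Python dependencies
└─ Package → copy to the offline machine
Offline machine
├─ Extract
├─ Install the dependencies
└─ Run inference

If you need anything more specific, tell me your operating system and hardware environment and I can put together a tailored plan.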
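
As a supplement to the packaging steps above, it can help to confirm that nothing is missing before the archive leaves the networked machine, and to record a checksum that can be compared after the copy. The following is a minimal sketch using only the Python standard library. The helper names (missing_model_files, sha256_of) and the REQUIRED_MODEL_FILES list are illustrative assumptions, not part of any library; the file names are taken from the lists in this guide and may need adjusting to the actual repository layout.

```python
import hashlib
from pathlib import Path

# Assumed list, collected from the file names mentioned in this guide.
REQUIRED_MODEL_FILES = [
    "config.json",
    "generation_config.json",
    "model.safetensors",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]


def missing_model_files(model_dir, required=REQUIRED_MODEL_FILES):
    """Return the names in `required` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Paths follow the directory and archive names used earlier in this guide.
    missing = missing_model_files("./OpenELM-270M-Instruct")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All expected model files are present.")

    archive = Path("openelm_offline.tar.gz")
    if archive.is_file():
        print("SHA-256:", sha256_of(archive))
    else:
        print("Archive not found; run the tar step first.")
```

Running the same script on the offline machine after copying and comparing the printed SHA-256 values is one way to detect a truncated or corrupted transfer.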