Controls randomness in responses (0 = deterministic, 2 = very random)
Limits token selection to the K most likely tokens
Nucleus sampling: considers only the most likely tokens whose cumulative probability reaches P
Penalty for repeating tokens (1 = no penalty, >1 = less repetition)
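
The four settings above can be sketched together in a few lines of Python. This is only an illustration of the general technique, not the code any backend listed here actually runs: the function name `sample_token`, its parameter names, and the order in which the filters are applied are assumptions made for this example, and real engines differ in details such as filter order and defaults.

```python
import math
import random


def sample_token(logits, recent_tokens=(), temperature=1.0, top_k=0,
                 top_p=1.0, repeat_penalty=1.0, rng=random):
    """Pick one token id from raw logits (hypothetical helper, illustrative only)."""
    logits = list(logits)

    # Repetition penalty: lower the score of recently generated tokens.
    # Dividing a positive logit or multiplying a negative one both reduce it,
    # so a value of 1 leaves the scores unchanged.
    for tok in set(recent_tokens):
        if logits[tok] > 0:
            logits[tok] /= repeat_penalty
        else:
            logits[tok] *= repeat_penalty

    # Temperature 0 is treated here as greedy (deterministic) decoding.
    if temperature <= 0:
        return max(range(len(logits)), key=logits.__getitem__)

    # Temperature-scaled softmax: higher temperature flattens the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)

    # Top-K: keep only the K most likely tokens (0 disables the filter),
    # then renormalize so the kept probabilities sum to 1 again.
    if top_k > 0:
        ranked = ranked[:top_k]
        kept_mass = sum(prob for prob, _ in ranked)
        ranked = [(prob / kept_mass, tok) for prob, tok in ranked]

    # Top-P (nucleus): keep the smallest prefix whose cumulative
    # probability reaches top_p.
    kept, cumulative = [], 0.0
    for prob, tok in ranked:
        kept.append((prob, tok))
        cumulative += prob
        if cumulative >= top_p:
            break

    tokens = [tok for _, tok in kept]
    probs = [prob for prob, _ in kept]
    return rng.choices(tokens, weights=probs, k=1)[0]
```

For example, `sample_token([1.0, 3.0, 2.0], temperature=0)` always picks token 1, while raising the temperature toward 2 makes tokens 0 and 2 progressively more likely to be chosen.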
🔥 Hot Models
🔧 By Recipe
llama.cpp
OGA Hybrid
OGA NPU
OGA CPU
🏷️ By Category
Coding
Vision
Reasoning
Reranking
Embeddings
Custom
Add a Model