# nexus-scribe-models

ML model management, loading, and caching.
## Overview

Centralized model management with:

- Model discovery and validation
- Checksum verification
- Version extraction from filenames
- LRU caching
- Warm-up and benchmarking
## Model Manager

```rust
use nexus_scribe_models::ModelManager;
use std::path::PathBuf;

let manager = ModelManager::new(
    PathBuf::from("/opt/NexusScribe/models"),
    hailo_device, // Optional
)?;
```
## Model Discovery

```rust
// Discover all models
let models = manager.discover_models().await?;

// With checksum computation
use nexus_scribe_models::DiscoveryOptions;

let options = DiscoveryOptions::new().with_checksums();
let models = manager.discover_models_with_options(&options).await?;
```

Supported formats: `.hef`, `.onnx`, `.pt`
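Conceptually, discovery is a recursive walk of the model directory filtered by these extensions. The sketch below is illustrative only: the `discover` helper and its signature are assumptions for this document, not the crate's API.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Illustrative sketch: recursively collect files whose extension is one of
/// the supported model formats (.hef, .onnx, .pt).
fn discover(dir: &Path, found: &mut Vec<PathBuf>) -> std::io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Recurse into subdirectories (e.g. whisper/, speaker/, vad/).
            discover(&path, found)?;
        } else if matches!(
            path.extension().and_then(|e| e.to_str()),
            Some("hef" | "onnx" | "pt")
        ) {
            found.push(path);
        }
    }
    Ok(())
}
```

The real `discover_models()` additionally builds `ModelInfo` for each file; this sketch only shows the filesystem side.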
## Loading Models

```rust
// Load Whisper model
let whisper = manager.load_whisper_model("large-v3").await?;

// Load speaker embedding model
let speaker = manager.load_speaker_model().await?;

// Generic model info by name
let info = manager.get_model_info("custom-model").await?;
```
## Model Types

```rust
pub enum ModelType {
    Whisper,
    SpeakerEmbedding,
    VAD,
    LanguageDetection,
    Custom(String),
}
```
Auto-detected from the model name:

| Name prefix | ModelType |
|---|---|
| `whisper-*` | `Whisper` |
| `speaker-*`, `diarization-*` | `SpeakerEmbedding` |
| `vad-*`, `voice-activity-*` | `VAD` |
| `lang-*`, `lid-*` | `LanguageDetection` |
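The prefix rules can be sketched as a plain match on the model name. This is an illustrative reimplementation for the sketch, not the crate's actual code:

```rust
/// Stand-in for the crate's ModelType, redeclared here so the sketch is
/// self-contained.
#[derive(Debug, PartialEq)]
enum ModelType {
    Whisper,
    SpeakerEmbedding,
    VAD,
    LanguageDetection,
    Custom(String),
}

/// Illustrative prefix-based auto-detection; anything unrecognized
/// falls through to Custom.
fn detect_model_type(name: &str) -> ModelType {
    if name.starts_with("whisper-") {
        ModelType::Whisper
    } else if name.starts_with("speaker-") || name.starts_with("diarization-") {
        ModelType::SpeakerEmbedding
    } else if name.starts_with("vad-") || name.starts_with("voice-activity-") {
        ModelType::VAD
    } else if name.starts_with("lang-") || name.starts_with("lid-") {
        ModelType::LanguageDetection
    } else {
        ModelType::Custom(name.to_string())
    }
}
```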
## Model Info

```rust
pub struct ModelInfo {
    pub name: String,
    pub version: String,
    pub model_type: ModelType,
    pub path: PathBuf,
    pub size_bytes: u64,
    pub checksum: Option<String>,
    pub metadata: HashMap<String, String>,
}
```
## Version Extraction

Versions are extracted automatically from filenames:

| Filename | Name | Version |
|---|---|---|
| `whisper-large-v3.hef` | `whisper-large` | 3.0.0 |
| `speaker-v2.1.onnx` | `speaker` | 2.1.0 |
| `model_v1.0.0.pt` | `model` | 1.0.0 |
| `vad-silero.hef` | `vad-silero` | 1.0.0 |
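The table's behavior amounts to splitting a trailing `-vX[.Y[.Z]]` or `_vX[.Y[.Z]]` suffix off the file stem and padding the result to three semver components, defaulting to `1.0.0` when no suffix is found. A hedged sketch of that rule (`split_name_version` is a hypothetical helper, not the crate's API):

```rust
/// Illustrative sketch of the extraction rule in the table above:
/// returns (model name, three-part version string).
fn split_name_version(filename: &str) -> (String, String) {
    // Drop the extension (.hef / .onnx / .pt).
    let stem = filename.rsplit_once('.').map(|(s, _)| s).unwrap_or(filename);

    // Look for a trailing "-v<digits and dots>" or "_v<digits and dots>".
    for sep in ["-v", "_v"] {
        if let Some(idx) = stem.rfind(sep) {
            let candidate = &stem[idx + sep.len()..];
            if !candidate.is_empty()
                && candidate.chars().all(|c| c.is_ascii_digit() || c == '.')
            {
                // Pad "3" -> "3.0.0", "2.1" -> "2.1.0".
                let mut parts: Vec<&str> = candidate.split('.').collect();
                while parts.len() < 3 {
                    parts.push("0");
                }
                return (stem[..idx].to_string(), parts.join("."));
            }
        }
    }

    // No version suffix: keep the stem, default the version.
    (stem.to_string(), "1.0.0".to_string())
}
```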
## Checksum Verification

```rust
// Validate model integrity
let valid = manager.validate_model("whisper-large-v3").await?;

// Or manually
if let Some(info) = manager.get_model_info("model-name").await? {
    if info.checksum.is_some() {
        let valid = info.verify_checksum()?;
    }
}
```
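Verification boils down to recomputing the file's digest and comparing it with the stored checksum. To keep this sketch dependency-free it uses a stand-in FNV-1a hash where the crate computes SHA-256; both function names here are illustrative, not the crate's API:

```rust
/// Stand-in digest (FNV-1a over the bytes, hex-encoded). The real crate
/// uses SHA-256; only the shape of the check matters here.
fn digest(bytes: &[u8]) -> String {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100_0000_01b3);
    }
    format!("{h:016x}")
}

/// Recompute the digest and compare against the stored checksum.
/// This is an integrity check, not a security boundary.
fn verify(bytes: &[u8], stored: &str) -> bool {
    digest(bytes) == stored
}
```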
## Caching

LRU cache for loaded models:

```rust
// Check cache stats
let stats = manager.cache_stats();
println!("Cached: {}/{}", stats.cached_models, stats.max_cache_size);

// Clear cache
let unload_stats = manager.clear_cache().await?;
```
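For readers unfamiliar with the eviction policy, an LRU cache can be sketched with nothing but the standard library. This is an illustrative toy, not the manager's internal cache; names and capacity are assumptions:

```rust
use std::collections::HashMap;

/// Toy LRU cache: least-recently-used entry is evicted when full.
struct LruCache<V> {
    capacity: usize,
    map: HashMap<String, V>,
    order: Vec<String>, // front = least recently used, back = most recent
}

impl<V> LruCache<V> {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: Vec::new() }
    }

    fn get(&mut self, key: &str) -> Option<&V> {
        if self.map.contains_key(key) {
            // Touch: move the key to the most-recently-used position.
            self.order.retain(|k| k != key);
            self.order.push(key.to_string());
            self.map.get(key)
        } else {
            None
        }
    }

    fn put(&mut self, key: String, value: V) {
        if self.map.contains_key(&key) {
            self.order.retain(|k| k != &key);
        } else if self.map.len() == self.capacity {
            // Evict the least recently used entry.
            let lru = self.order.remove(0);
            self.map.remove(&lru);
        }
        self.order.push(key.clone());
        self.map.insert(key, value);
    }
}
```

In the real manager, eviction would also trigger model unloading (see below), not just removal from the map.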
## Unloading Models

```rust
// Unload specific model
let stats = manager.unload_model(&loaded_model).await?;
println!("Released {} bytes", stats.model_size_bytes);

// Unload by name
if let Some(stats) = manager.unload_model_by_name("whisper-large-v3").await? {
    println!("NPU released: {}", stats.npu_resources_released);
}
```
## Unload Stats

```rust
pub struct UnloadStats {
    pub model_name: String,
    pub model_size_bytes: u64,
    pub removed_from_cache: bool,
    pub npu_resources_released: bool,
    pub duration_ms: u64,
}
```
## Warm-up

Pre-load models for faster first inference:

```rust
// Warm up default models
manager.warmup_models().await?;
```
## LoadedModel

```rust
pub struct LoadedModel {
    pub info: ModelInfo,
    pub hailo_model: Option<HailoModel>,
}
```
## Registry

```rust
use nexus_scribe_models::ModelRegistry;

let mut registry = ModelRegistry::new();

// Register model
registry.register(model_info)?;

// Find by name
let info = registry.find_by_name("whisper-large-v3");

// List all
let all = registry.list_all();
```
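A minimal registry reduces to a map keyed by model name. The sketch below stands in `String` paths for `ModelInfo` and is illustrative only; the real `ModelRegistry` may differ in shape and error type:

```rust
use std::collections::HashMap;

/// Toy registry: models indexed by name, with register / find / list.
struct Registry {
    models: HashMap<String, String>, // name -> path (stand-in for ModelInfo)
}

impl Registry {
    fn new() -> Self {
        Self { models: HashMap::new() }
    }

    /// Reject duplicate names so lookups stay unambiguous.
    fn register(&mut self, name: &str, path: &str) -> Result<(), String> {
        if self.models.contains_key(name) {
            return Err(format!("model '{name}' already registered"));
        }
        self.models.insert(name.to_string(), path.to_string());
        Ok(())
    }

    fn find_by_name(&self, name: &str) -> Option<&String> {
        self.models.get(name)
    }

    fn list_all(&self) -> Vec<&String> {
        self.models.keys().collect()
    }
}
```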
## Discovery Options

```rust
// Compute SHA-256 checksums during discovery
let options = DiscoveryOptions::new().with_checksums();

// Skip checksum computation
let options = DiscoveryOptions::new().without_checksums();
```
## Usage

```toml
[dependencies]
nexus-scribe-models = { path = "../nexus-scribe-models" }
```
## Model Directory Structure

```text
/opt/NexusScribe/models/
├── whisper/
│   ├── ggml-tiny.bin
│   ├── ggml-base.bin
│   └── ggml-large-v3.bin
├── speaker/
│   └── speaker-embedding-v2.onnx
└── vad/
    └── silero-vad-v4.onnx
```