arm64 upgrade
arm64 - GPU tools
Maintain wheels for vLLM and olmOCR for 1 year
Add support for Gemma, Qwen, Mistral and DeepSeek models
Keep wheel size optimized: use proper build arguments
--
Measure token speed with and without Docker
Measure olmOCR text extraction times
Vision models: enable the fast processor? May give better speed for document extraction
-- Use bitsandbytes for quantization of models
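
Sketch for the token speed measurement above (not a final benchmark). Assumes the model call can be wrapped in a function that takes a prompt and returns the generated token ids; `measure_tokens_per_second` and `fake_generate` are hypothetical names, and `fake_generate` is only a stand-in for a real vLLM or olmOCR call.

```python
import time
from typing import Callable, List, Sequence


def measure_tokens_per_second(
    generate: Callable[[str], Sequence[int]],
    prompt: str,
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Return generated tokens per second, averaged over `runs` timed calls."""
    for _ in range(warmup):
        generate(prompt)  # warm-up calls are not timed

    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        total_seconds += time.perf_counter() - start
        total_tokens += len(tokens)

    if total_seconds <= 0:
        raise ValueError("elapsed time was zero; cannot compute throughput")
    return total_tokens / total_seconds


def fake_generate(prompt: str) -> List[int]:
    """Hypothetical stand-in for a model call: sleeps, then returns 50 token ids."""
    time.sleep(0.01)
    return list(range(50))


if __name__ == "__main__":
    tps = measure_tokens_per_second(fake_generate, "hello", warmup=1, runs=3)
    print(f"{tps:.1f} tokens/s")
```

Run the same script once on the host and once inside the container, with the same prompt and model, and compare the two numbers. The same wrapper can time text extraction if the wrapped function returns the extracted tokens.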
--
Create GitHub Actions runner to build the image
Copy complete logic for image building from jetson containers