arm64 upgrade

arm64 - GPU tools

Maintain wheels for vLLM and olmOCR for 1 year

Add support for Gemma, Qwen, Mistral, and DeepSeek models

Keep wheel size optimized - use proper build arguments

--

Measure token speed (tokens per second) with and without Docker
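
A minimal sketch of how the token speed measurement could be taken, assuming the backend can be wrapped in a callable that returns the generated tokens. The names `tokens_per_second` and `fake_generate` are hypothetical and not part of vLLM or olmOCR; the same script would be run once on the host and once inside the container to compare the two numbers.

```python
import time
from statistics import median
from typing import Callable, Sequence


def tokens_per_second(
    generate: Callable[[str], Sequence],
    prompt: str,
    runs: int = 3,
    warmup: int = 1,
) -> float:
    """Return the median generated-tokens-per-second over `runs` timed calls."""
    for _ in range(warmup):
        generate(prompt)  # warm-up calls are not timed

    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        elapsed = time.perf_counter() - start
        if elapsed > 0:  # guard against a zero-length interval
            rates.append(len(tokens) / elapsed)
    if not rates:
        raise ValueError("no timed runs produced a measurable interval")
    return median(rates)


def fake_generate(prompt: str) -> list:
    """Stand-in backend (assumption): sleeps briefly, returns 50 dummy token ids."""
    time.sleep(0.01)
    return list(range(50))


if __name__ == "__main__":
    rate = tokens_per_second(fake_generate, "hello")
    print(f"{rate:.1f} tokens/s")
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache) from skewing the comparison. Replace `fake_generate` with a wrapper around the real model call.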

Measure olmOCR text extraction times

Vision models - enable the fast processor? It may give better speed for document extraction

-- Use bitsandbytes for quantization of models
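
As a rough back-of-the-envelope check on what quantization buys, the sketch below estimates the memory taken by model weights alone at 16, 8, and 4 bits per weight. The 7-billion parameter count is an illustrative assumption, not a figure from these notes, and the estimate ignores quantization metadata, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes, weights only."""
    return num_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    num_params = 7e9  # illustrative model size (assumption)
    for bits in (16, 8, 4):
        gib = weight_memory_gib(num_params, bits)
        print(f"{bits:>2}-bit weights: {gib:.2f} GiB")
```

Halving the bit width halves the weight footprint, which is the main reason to consider 8-bit or 4-bit loading on memory-constrained arm64 GPU devices.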

--

Create a GitHub Actions runner to build the image

Copy the complete image-building logic from jetson-containers