Cool tip for running LLMs on Apple Silicon! By default, macOS lets the GPU use up to 2/3 of RAM on machines with <=36GB and 3/4 on machines with >36GB. I used `sudo sysctl iogpu.wired_limit_mb=57344` to override that and allocate 56GB of my 64GB to the GPU. This let me load all layers of larger models and run them faster! #MacOS #LLM #AI #ML
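
If you want to try it, here's a minimal sketch, assuming an Apple Silicon Mac on a recent macOS where the `iogpu.wired_limit_mb` key is available (the setting does not survive a reboot):

```sh
# Check the current GPU wired-memory limit (0 means macOS applies its default split)
sysctl iogpu.wired_limit_mb

# Allow the GPU to wire up to 56GB (57344 MB) of a 64GB machine
sudo sysctl iogpu.wired_limit_mb=57344

# Revert to the default behaviour
sudo sysctl iogpu.wired_limit_mb=0
```

Leave enough RAM for the OS and your apps, or the system can become unstable under memory pressure.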
