😲 DeepSeek-V3-4bit runs at >20 tokens per second and <200W using MLX on an M3 Ultra with 512GB. This might be the best and most user-friendly way to run DeepSeek-V3 on consumer hardware, and possibly the most affordable too. You can finally run a GPT-4o-level model locally, possibly with even better quality. #LLM #AI #ML #DeepSeek #OpenAI #GPT #OpenWeight #OpenSource venturebeat.com/ai/deepseek-v3…
DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
DeepSeek's free 685B-parameter AI model runs at 20 tokens/second on Apple's Mac Studio, outperforming Claude Sonnet while using just 200 watts, challenging OpenAI's cloud-dependent business model.
Michael Nuñez (VentureBeat)
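For anyone wanting to reproduce the setup from the post, here is a minimal sketch of running a 4-bit DeepSeek-V3 conversion locally with the mlx-lm package. The Hugging Face repo name "mlx-community/DeepSeek-V3-4bit" and the prompt are assumptions for illustration; substitute whichever 4-bit MLX conversion you actually use, and note you need a machine with enough unified memory (the post uses a 512GB M3 Ultra).

```python
# Minimal sketch: run a 4-bit DeepSeek-V3 MLX conversion locally.
# Assumes: pip install mlx-lm, an Apple Silicon Mac with enough
# unified memory, and a 4-bit MLX conversion on Hugging Face.
from mlx_lm import load, generate

# The repo name below is an assumption; swap in the actual 4-bit
# MLX conversion you want. load() downloads the weights (hundreds
# of GB for a 685B-parameter model) and loads them for inference.
model, tokenizer = load("mlx-community/DeepSeek-V3-4bit")

prompt = "Explain mixture-of-experts models in two sentences."

# verbose=True prints the generated text plus tokens-per-second
# stats, which is how throughput figures like ">20 tok/s" are read.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```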