How much RAM would be needed on Mac Studio M2 Ultra for inferring from a 65B LLM model. There are three options: 64GB, 128GB and 192GB. If using Apple M2 Ultra with 24‑core CPU, 76‑core GPU, 32‑core Neural Engine with 192GB Unified Memory, how would be the performance of the model?
Asked
Active
Viewed 728 times