Accelerators are becoming key elements of computing platforms for both data centers and mobile devices because they deliver energy-efficient high performance for key computational kernels. However, the design and integration of such components is complex, especially for Big Data applications, which have very large workloads to process. Properly customizing the accelerators' private local memories (PLMs) is of critical importance. To analyze this problem, we design an accelerator for Collaborative Filtering by applying a system-level design methodology that allows us to synthesize many alternative micro-architectures as we vary the PLM sizes. We then evaluate the resulting accelerators in terms of resource requirements for both embedded archi...