Performance constraints:

  1. Shared Memory vs Global Memory

  2. Atomic instructions

  3. Shared Memory Bank conflicts

  4. Memory Coalescing - SoA (structure of arrays) vs AoS (array of structures)
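
As a rough illustration of item 4, the host-side C++ sketch below counts how many aligned memory segments one warp touches when each thread reads a single float field, first with an AoS layout and then with an SoA layout. It is a simplified model, not a measurement. The `ParticleAoS` record, the helper names, and the 128-byte segment size are assumptions made for this example, and real hardware may use a different transaction granularity.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical particle record, used only for illustration.
struct ParticleAoS { float x, y, z; };

constexpr std::size_t kWarpSize = 32;       // threads per warp
constexpr std::size_t kSegmentBytes = 128;  // assumed transaction granularity

// Count the distinct aligned segments touched when thread i of a warp reads
// one float at byte offset i * stride_bytes from a segment-aligned base.
std::size_t segments_touched(std::size_t stride_bytes) {
    assert(stride_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t i = 0; i < kWarpSize; ++i) {
        const std::size_t first = i * stride_bytes;
        const std::size_t last = first + sizeof(float) - 1;
        segments.insert(first / kSegmentBytes);  // segment holding the first byte
        segments.insert(last / kSegmentBytes);   // segment holding the last byte
    }
    return segments.size();
}

// AoS: consecutive threads read p[i].x, so the stride is the whole struct.
std::size_t aos_segments() { return segments_touched(sizeof(ParticleAoS)); }

// SoA: consecutive threads read x[i], so the stride is one float.
std::size_t soa_segments() { return segments_touched(sizeof(float)); }
```

Under these assumptions the SoA read fits in fewer segments than the AoS read, because neighbouring threads access neighbouring floats instead of skipping over the unused `y` and `z` fields.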