I'm studying PTX and I don't understand how a CTA (cooperative thread array) is different from a CUDA block.
Are they the same thing? So far (I'm just at the beginning of the PTX documentation), they seem to be exactly the same.