On Cell, developers will have compiler optimizations such as autovectorisation and other ...
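As a rough illustration of what autovectorisation works on, here is a plain C loop of the kind a vectorising compiler can usually turn into SIMD instructions on its own. The function name `scale_add` and its parameters are made up for this sketch, and whether a given compiler really vectorises it depends on the compiler and its flags.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * 'restrict' promises the compiler that x and y do not overlap,
 * which removes one common obstacle to autovectorisation. */
void scale_add(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The source stays scalar; any SIMD version is produced by the compiler, not written by hand.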
I find it easier to think of Cell as a mini cluster:
the PowerPC core is the master (conductor) and the SPUs are the slaves (orchestra), with crunching jobs submitted to them.
The vector (SIMD) code, aka the symphony, is arranged by the compiler (composer) at another time, i.e. at compile time.
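The conductor/orchestra picture can be sketched in portable C. This is only an analogy in code: it uses POSIX threads as stand-ins for the SPUs, not the real Cell SDK, and the names `conductor_sum`, `worker_run`, `struct job` and `NUM_WORKERS` are all invented for the example. On some systems it needs to be built with `-pthread`.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4   /* hypothetical count of stand-in "SPUs" */

/* One "crunching job": sum a slice of the input. */
struct job {
    const int *data;
    size_t     count;
    long       result;
};

/* Worker (orchestra member): does its own job and nothing else. */
static void *worker_run(void *arg)
{
    struct job *j = arg;
    long sum = 0;
    for (size_t i = 0; i < j->count; ++i)
        sum += j->data[i];
    j->result = sum;
    return NULL;
}

/* Master (conductor): splits the work, submits it, collects the results.
 * Returns 0 on success, -1 if a worker thread could not be started. */
int conductor_sum(const int *data, size_t n, long *total)
{
    pthread_t  threads[NUM_WORKERS];
    struct job jobs[NUM_WORKERS];
    size_t chunk = n / NUM_WORKERS;
    int started = 0;
    int status = 0;

    for (int w = 0; w < NUM_WORKERS; ++w) {
        size_t offset = (size_t)w * chunk;
        jobs[w].data = data + offset;
        /* The last worker also takes whatever is left over. */
        jobs[w].count = (w == NUM_WORKERS - 1) ? n - offset : chunk;
        jobs[w].result = 0;
        if (pthread_create(&threads[w], NULL, worker_run, &jobs[w]) != 0) {
            status = -1;
            break;
        }
        ++started;
    }

    /* Wait for every worker that was started, then add up the parts. */
    *total = 0;
    for (int w = 0; w < started; ++w) {
        pthread_join(threads[w], NULL);
        *total += jobs[w].result;
    }
    return status;
}
```

The master only hands out slices and gathers results; all the number crunching happens in the workers, which is the division of labour the analogy describes.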
Don't forget that memory is closer to the CPU (integrated memory controller), so it doesn't need as much cache, OOo execution or branch prediction for branchy code (low latency).