Delay translating DCT tokens into coefficients until immediately before IDCT

This is typically about 12% faster than the previous method, which built
a linked list for each block as tokens were read; the speedup ranges
from 8% to 28% depending on the file and CPU.
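
The idea can be illustrated with a minimal sketch. It is not the code in
this commit: the DctToken layout and the expand_tokens() helper are
hypothetical and only show the general shape of keeping compact tokens
around and expanding them into a 64-entry coefficient array right before
the IDCT would run, instead of building a per-block list while parsing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical token: `zero_run` zeros followed by one coefficient `value`.
 * `eob` set means the rest of the block is zero.
 * This is an assumed layout for illustration, not the decoder's real one. */
typedef struct {
    uint8_t zero_run;
    int16_t value;
    uint8_t eob;
} DctToken;

/* Expand one block's tokens into coeffs[64] just before the IDCT.
 * Returns the number of tokens consumed (including an EOB token). */
static int expand_tokens(const DctToken *tokens, int n_tokens,
                         int16_t coeffs[64])
{
    int pos = 0, used = 0;

    assert(tokens && coeffs);
    memset(coeffs, 0, 64 * sizeof(coeffs[0]));

    while (used < n_tokens && pos < 64) {
        const DctToken *t = &tokens[used++];
        if (t->eob)
            break;                 /* remaining coefficients stay zero */
        pos += t->zero_run;        /* skip the run of zeros */
        if (pos >= 64)
            break;                 /* malformed run: stop, do not overflow */
        coeffs[pos++] = t->value;
    }
    return used;
}
```

In this sketch the parser only has to append tokens to a flat array, and
the per-block work is a single linear pass at render time.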
Originally committed as revision 22190 to svn://svn.ffmpeg.org/ffmpeg/trunk