centralize tmp buffer

Instead of each _1 stage creating the tmp buffer it needs, thread a
single buffer through for all the _1 stages to use.  I think this makes
them read a little more clearly, and it also reduces code size.
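
For illustration only, a minimal sketch of the pattern, not the code
touched here.  Stage1, double_1, add_one_1, run_1 and kLanes are all
hypothetical names; the point is just that the driver owns the one tmp
buffer and every _1 stage borrows it instead of declaring its own.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical names throughout; this is not the code from the change.
constexpr std::size_t kLanes = 4;

// Every single-element ("_1") stage takes the shared scratch buffer
// as a parameter instead of declaring a local one.
using Stage1 = void (*)(float* px, float* tmp);

// Widen one value into the scratch lanes, double each lane, read lane 0 back.
static void double_1(float* px, float* tmp) {
    std::memset(tmp, 0, kLanes * sizeof(float));
    tmp[0] = *px;
    for (std::size_t i = 0; i < kLanes; i++) { tmp[i] *= 2.0f; }
    *px = tmp[0];
}

// Same shape as above, but adds one to each lane.
static void add_one_1(float* px, float* tmp) {
    std::memset(tmp, 0, kLanes * sizeof(float));
    tmp[0] = *px;
    for (std::size_t i = 0; i < kLanes; i++) { tmp[i] += 1.0f; }
    *px = tmp[0];
}

// The driver allocates the only tmp buffer and threads it through each stage.
static float run_1(float px, const Stage1* stages, std::size_t n) {
    float tmp[kLanes];
    for (std::size_t i = 0; i < n; i++) { stages[i](&px, tmp); }
    return px;
}
```

With stages {double_1, add_one_1}, run_1 applies them in order using the
same scratch storage, so no stage carries its own buffer setup.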

Change-Id: I1e6a02eadb38f3d923c3b119389ebcfceb024943
1 file changed