Add fast paths in BlitNtoNKey

All following conversions are faster (with colorkey, but no blending).
(ratio isn't very accurate)

ABGR8888 -> BGR888 :  faster x9   (2699035 -> 297425)

ARGB8888 -> RGB888 :  faster x8   (2659266 -> 296137)

BGR24 -> BGR24 :  faster x5   (2232482 -> 445897)
BGR24 -> RGB24 :  faster x4   (2150023 -> 448576)

BGR888 -> ABGR8888 :  faster x8   (2649957 -> 307595)

BGRA8888 -> BGRX8888 :  faster x9   (2696041 -> 297596)

BGRX8888 -> BGRA8888 :  faster x8   (2662011 -> 299463)
BGRX8888 -> BGRX8888 :  faster x9   (2733346 -> 295045)

RGB24 -> BGR24 :  faster x4   (2154551 -> 485262)
RGB24 -> RGB24 :  faster x4   (2149878 -> 484870)

RGB888 -> ARGB8888 :  faster x8   (2762877 -> 324946)

RGBA8888 -> RGBX8888 :  faster x8   (2657855 -> 297753)

RGBX8888 -> RGBA8888 :  faster x8   (2661360 -> 296655)
RGBX8888 -> RGBX8888 :  faster x8   (2649287 -> 308268)
1 file changed