Add SSE4 version of BlurImage optimizations.

Adds an SSE4.1 version of the existing BlurImage optimizations.
Performance of blur_image_filter_* benchmarks show a 10-50%
improvement on Linux/Ubuntu Core i7.

Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>

R=mtklein@google.com

Author: henrik.smiding@intel.com

Review URL: https://codereview.chromium.org/366593004
5 files changed