TextOutputDev: Break words on all whitespace characters

Some PDF producers, such as Chrome, separate words with no-break spaces
or other whitespace characters instead of plain spaces, so
pdftotext -bbox does not break words as expected.  Fix this by breaking
words on any character with the Unicode whitespace property.
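
For illustration only, a minimal standalone sketch of the idea, not the
code in this patch.  The names isUnicodeWhitespace and splitWords are
hypothetical, and the code point list is assumed to match the Unicode
White_Space property:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns true if c has the Unicode White_Space
// property (assumed list: tab through CR, space, NEL, no-break space,
// Ogham space mark, U+2000..U+200A, line/paragraph separators,
// narrow no-break space, medium mathematical space, ideographic space).
bool isUnicodeWhitespace(char32_t c) {
    return (c >= 0x0009 && c <= 0x000D) || c == 0x0020 || c == 0x0085 ||
           c == 0x00A0 || c == 0x1680 || (c >= 0x2000 && c <= 0x200A) ||
           c == 0x2028 || c == 0x2029 || c == 0x202F || c == 0x205F ||
           c == 0x3000;
}

// Hypothetical helper: splits text into words at every whitespace
// character, not only at U+0020.  Runs of whitespace yield no empty words.
std::vector<std::u32string> splitWords(const std::u32string &text) {
    std::vector<std::u32string> words;
    std::u32string current;
    for (char32_t c : text) {
        if (isUnicodeWhitespace(c)) {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

With this sketch, text such as "foo", U+00A0, "bar" yields two words
rather than one.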

Bug #97399
3 files changed