When using Unicode databases (default in more recent bogofilter
installations), upon encountering invalid input sequences,
bogofilter or bogolexer could overrun a malloc()'d buffer,
corrupting the heap, while converting character sets. Bogofilter
would usually be processing untrusted data received from the
network at that time.
This problem was aggravated by an unrelated bug that made
bogofilter process binary attachments as though they were text, and
attempt charset conversion on them. Given the MIME default
character set, US-ASCII, all input octets in the range 0x80...0xff
were considered invalid input sequences and could trigger the heap
corruption.