The Downside of Converting Full-Text PDFs to XML for Text Mining