Modules

TextWizard
Extract Text
Extract Text (Azure)
Clean HTML
Clean XML
Clean CSV
Named-Entity Recognition (NER)
Spell Checking
Language Detection
Text statistics
Text similarity
Beautiful HTML
HTML → Markdown