There's a simple algorithm that uses the Bayes theorem that can be used to classify documents, using their text tokenized into individual words, into categories (e.g. tags on a website). The classifier needs to be trained with existing data, and then it will return which categories a new document probably belongs to.
Read more