The aim of this thesis is to address the problem of Automated Essay Scoring using natural language processing (NLP). As interest in self-education and Massive Open Online Courses (MOOCs) has grown steadily in recent years, systems that automate the evaluation of academic progress are gaining popularity. Such systems can reduce the cost of online education and substantially shorten the time spent on scoring assignments.
In this work, text pre-processing, feature engineering, text vectorization, feature selection, cross-validation, ensemble methods, and supervised classification were implemented. The performance of the final model was evaluated using Cohen's kappa coefficient.
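For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how Cohen's kappa can be computed between human-assigned and model-predicted scores. It shows the unweighted form of the coefficient; whether the thesis uses an unweighted or weighted variant is not stated here, so this is an assumption. The function name `cohen_kappa` and the sample scores are hypothetical and serve only as illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both sequences must be non-empty and of equal length")

    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so this sketch treats the case as perfect agreement (an assumption).
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical essay scores, for illustration only.
human_scores = [1, 2, 3, 3, 2, 1]
model_scores = [1, 2, 3, 2, 2, 1]
kappa = cohen_kappa(human_scores, model_scores)
```

A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance; for the hypothetical scores above, `kappa` is approximately 0.75.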