This project was submitted to the Department of Mathematics and Statistics at Amherst College in partial fulfillment of the requirements for the degree of Bachelor of Arts in Statistics.


Natural Language Processing (NLP) is the field concerned with helping computers understand and interpret human language, a task made difficult by tone, connotation, sarcasm, and other cues that human readers grasp intuitively. “Fake news” and the validity of news articles, Tweets, and other media are at the forefront of national conversations about free speech and the spread of misinformation. This project uses NLP to classify news articles as real or fake, using lemmatization, stop-word removal, bag-of-words feature extraction, a multilayer perceptron neural network, a recurrent neural network, and other machine learning models.
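To make the feature-extraction step concrete, the following is a minimal sketch of stop-word removal followed by bag-of-words counting. It is an illustration only, not the project's actual pipeline: the stop-word list, the function names (`tokenize`, `build_vocabulary`, `bag_of_words`), and the two example headlines are all assumptions, and lemmatization is omitted because it would normally rely on an external library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "that"}


def tokenize(text):
    """Lowercase the text, extract alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocabulary(documents):
    """Map each distinct token across all documents to a column index (alphabetical)."""
    vocab = sorted({t for doc in documents for t in tokenize(doc)})
    return {token: i for i, token in enumerate(vocab)}


def bag_of_words(text, vocabulary):
    """Return a list of token counts for `text`, one entry per vocabulary token."""
    counts = Counter(tokenize(text))
    return [counts.get(token, 0) for token in vocabulary]


# Made-up example headlines, used only to show the shape of the output.
docs = ["The senate passed the bill", "Aliens passed the senate"]
vocabulary = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

print(list(vocabulary))  # ['aliens', 'bill', 'passed', 'senate']
print(vectors)           # [[0, 1, 1, 1], [1, 0, 1, 1]]
```

Each article becomes a fixed-length vector of word counts, which is the kind of input a classifier such as a multilayer perceptron can consume. Word order is discarded in this representation, which is one reason a recurrent neural network, operating on token sequences instead, is a natural point of comparison.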


  • Natural Language Processing
  • Machine Learning
  • Fake News
  • Lemmatization
  • Bag-of-words model
  • Deep Learning
  • Recurrent Neural Network