Sentiment Detector
A program that learns from a list of comments to determine whether each comment in a test set has a positive or negative connotation.
Languages: Python.

Overview
This Python program takes a list of comments (the training set) and uses it to decide whether each comment in another list (the test set) is positive or negative. The program takes the training set and the test set as command-line arguments. Both are text files that contain comments, each with a value of 0 or 1 that indicates whether the comment is positive or negative. The program learns from the training set and uses that knowledge to make its best guess at which comments in the test set are positive and which are negative.
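The exact layout of the input files is not described here. As a minimal sketch, assuming each line holds a comment, then a tab, then its 0/1 label, and assuming 1 marks a positive comment (both are assumptions, and the helper name `parse_labeled_lines` is hypothetical), the files could be read like this:

```python
def parse_labeled_lines(lines):
    """Parse lines of the assumed form 'comment<TAB>label' into (comment, label) pairs."""
    examples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so tabs inside the comment text are preserved.
        comment, _, label = line.rpartition("\t")
        examples.append((comment.strip(), int(label)))
    return examples


# Made-up sample lines in the assumed layout (1 = positive is an assumption).
sample = ["I loved this place\t1\n", "Terrible service\t0\n"]
parsed = parse_labeled_lines(sample)
```

If the real files use a different separator, only the `rpartition` call would need to change.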
How Does It Work?
The program takes the training set and creates a vocabulary, or a list of words that it has learned from the training set.
Once it has created the vocabulary, the program uses the classifier to find probabilities for every word in the vocabulary.
These statistics include the number of times a word from the vocabulary is used or not used in the test set, the probability of a word appearing or not appearing in a positive comment, and the probability of a word appearing or not appearing in a negative comment.
Finally, it uses these probabilities to decide whether the comments in the test set are positive or negative.
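The steps above can be sketched in a few functions. This is an illustration, not the code in sentiment.py: it assumes a Bernoulli naive Bayes classifier with add-one smoothing (the description suggests this kind of model but does not name it), assumes 1 means positive and 0 means negative, and uses hypothetical function names and made-up training comments.

```python
import math
import re


def tokenize(comment):
    """Lowercase a comment and return the set of distinct words in it."""
    return set(re.findall(r"[a-z']+", comment.lower()))


def train(examples):
    """Build the vocabulary and per-class word probabilities.

    examples: list of (comment, label) pairs; label 1 = positive (assumed), 0 = negative.
    """
    vocabulary = sorted({w for comment, _ in examples for w in tokenize(comment)})
    class_counts = {0: 0, 1: 0}
    word_counts = {c: dict.fromkeys(vocabulary, 0) for c in (0, 1)}
    for comment, label in examples:
        class_counts[label] += 1
        for w in tokenize(comment):
            word_counts[label][w] += 1
    total = len(examples)
    # Add-one smoothing keeps every probability strictly between 0 and 1,
    # so the logarithms taken in classify() are always defined.
    priors = {c: (class_counts[c] + 1) / (total + 2) for c in (0, 1)}
    word_probs = {
        c: {w: (word_counts[c][w] + 1) / (class_counts[c] + 2) for w in vocabulary}
        for c in (0, 1)
    }
    return vocabulary, priors, word_probs


def classify(comment, vocabulary, priors, word_probs):
    """Return 1 (positive) or 0 (negative) for a single comment."""
    words = tokenize(comment)
    scores = {}
    for c in (0, 1):
        score = math.log(priors[c])
        for w in vocabulary:
            p = word_probs[c][w]
            # Use P(word appears) if the word is present, else P(word does not appear).
            score += math.log(p if w in words else 1.0 - p)
        scores[c] = score
    return 1 if scores[1] > scores[0] else 0


# Made-up training comments for illustration only.
training = [
    ("great food and great service", 1),
    ("loved the food", 1),
    ("terrible service", 0),
    ("the food was bad", 0),
]
vocabulary, priors, word_probs = train(training)
prediction = classify("great service", vocabulary, priors, word_probs)
```

Summing log-probabilities instead of multiplying raw probabilities avoids numeric underflow when the vocabulary is large.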
How to Set Up and Use
- To make the script executable, enter this into the command line: chmod +x sentiment.py
- To start the program, enter this into the command line: ./sentiment.py "training data file" "test data file"
- Example: ./sentiment.py trainingSet.txt testSet.txt
Output and Results
Once the program has finished testing, it creates a results.txt file containing the test results. The contents of results.txt are also printed when the program finishes.
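The layout of results.txt is not documented here. As a rough sketch of the write-then-print behavior, assuming one line per test comment plus an accuracy summary (the line format and the helper name `write_results` are hypothetical):

```python
def write_results(path, predictions, labels):
    """Write one line per test comment plus an accuracy summary, then print the file."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    accuracy = correct / len(labels) if labels else 0.0
    with open(path, "w", encoding="utf-8") as handle:
        for index, (p, y) in enumerate(zip(predictions, labels), start=1):
            # Assumed line format; the real file may look different.
            handle.write(f"comment {index}: predicted={p} actual={y}\n")
        handle.write(f"accuracy: {accuracy:.2f}\n")
    # Echo the file back so the results also appear on the console.
    with open(path, encoding="utf-8") as handle:
        print(handle.read(), end="")
    return accuracy
```

Calling `write_results("results.txt", predictions, labels)` with the predicted and actual labels for the test set would create the file and print its contents in one step.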
