SissonWorks

Overview

This Python program learns from a list of labeled comments (the training set) and uses what it learns to classify the comments in a second list (the test set) as positive or negative. The program takes the training set and the test set as arguments. Both are text files in which each comment is paired with a value of 0 or 1 that marks it as positive or negative. The program learns from the training set and then uses that knowledge to make its best guess at which comments in the test set are positive and which are negative.

How Does It Work?

The program takes the training set and creates a vocabulary, or a list of words that it has learned from the training set.
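As a rough illustration of the vocabulary step, here is a minimal sketch. The function names tokenize and build_vocabulary are hypothetical, and the tokenization rule (lowercase, letters and apostrophes only) is an assumption; the actual sentiment.py may split words differently.

```python
import re


def tokenize(comment):
    """Lowercase a comment and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z']+", comment.lower())


def build_vocabulary(comments):
    """Return the sorted list of distinct words seen in the comments."""
    vocab = set()
    for comment in comments:
        vocab.update(tokenize(comment))
    return sorted(vocab)
```

Sorting the vocabulary is a design choice that keeps the word order stable between runs, which makes the learned probabilities easier to inspect.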

Once it has created a vocabulary, the program uses the classifier to find probabilities for all words in the vocabulary.

These figures include the number of times a word from the vocabulary is used or not used in the test set, the probability of a word appearing or not appearing in a positive comment, and the probability of a word appearing or not appearing in a negative comment.

Finally, it uses these probabilities to decide whether the comments in the test set are positive or negative.
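The steps above resemble a Bernoulli naive Bayes classifier, which scores each comment by whether each vocabulary word is present or absent. The sketch below works under that assumption and is not necessarily how sentiment.py is written. The names train and predict, the add-one smoothing, and the choice that 1 means positive are all assumptions made for illustration.

```python
import math


def train(docs, labels, vocab):
    """Estimate class priors and per-word presence probabilities.

    docs is a list of word sets, labels a list of 0/1 values
    (1 is assumed to mean positive).
    """
    class_counts = {0: 0, 1: 0}
    word_counts = {0: dict.fromkeys(vocab, 0), 1: dict.fromkeys(vocab, 0)}
    for words, label in zip(docs, labels):
        class_counts[label] += 1
        for word in words:
            if word in word_counts[label]:
                word_counts[label][word] += 1

    model = {}
    for label in (0, 1):
        # Add-one smoothing (an assumption) keeps every probability
        # strictly between 0 and 1, so the logarithms are defined.
        prior = (class_counts[label] + 1) / (len(labels) + 2)
        probs = {
            word: (word_counts[label][word] + 1) / (class_counts[label] + 2)
            for word in vocab
        }
        model[label] = (prior, probs)
    return model


def predict(model, words, vocab):
    """Return the label with the highest log-probability for one comment."""
    best_label, best_score = None, -math.inf
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for word in vocab:
            p = probs[word]
            # Use P(word present) or P(word absent) as appropriate.
            score += math.log(p if word in words else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Summing logarithms instead of multiplying raw probabilities avoids numeric underflow when the vocabulary is large.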

How to Compile and Use

No compilation is needed because the program is a Python script. To make the script executable, enter this into the command line: chmod +x sentiment.py
To start the program, enter this into the command line: ./sentiment.py "training data file" "test data file"
- Example: ./sentiment.py trainingSet.txt testSet.txt
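For context, running a script as ./sentiment.py generally requires a shebang line such as #!/usr/bin/env python3 at the top of the file. The sketch below shows one way the two file arguments might be read. The file layout (one comment per line, followed by a tab and the 0 or 1 label) is an assumption, since the exact format is not described here, and parse_line, load_comments, and main are hypothetical names.

```python
import sys


def parse_line(line):
    """Split one line into (comment, label).

    Assumes the label is the last tab-separated field on the line.
    """
    comment, label = line.rstrip("\n").rsplit("\t", 1)
    return comment.strip(), int(label)


def load_comments(path):
    """Read every non-blank line of a data file as a (comment, label) pair."""
    with open(path, encoding="utf-8") as handle:
        return [parse_line(line) for line in handle if line.strip()]


def main(argv):
    """Check the arguments and load both data files."""
    if len(argv) != 3:
        print("usage: ./sentiment.py <training data file> <test data file>")
        return 1
    training = load_comments(argv[1])
    test = load_comments(argv[2])
    print(f"loaded {len(training)} training and {len(test)} test comments")
    return 0
```

In a script, main(sys.argv) would be called from the entry point and its return value passed to sys.exit.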

Output and Results

Once the program has finished testing, it creates a results.txt file containing the test results. It also prints the contents of results.txt.
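A minimal sketch of this reporting step follows. The contents of results.txt are not specified here, so the single accuracy line written below is an assumption, and accuracy and write_results are hypothetical names.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def write_results(predicted, actual, path="results.txt"):
    """Write a one-line report to path, then print the file's contents."""
    report = (
        f"Accuracy: {accuracy(predicted, actual):.3f} "
        f"on {len(actual)} comments\n"
    )
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(report)
    # Read the file back so the printed text matches what was saved.
    with open(path, encoding="utf-8") as handle:
        print(handle.read(), end="")
    return report
```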

Want to View My Code?

Click here to access the Google Drive folder containing this project.
