Project 1 #1

Open
7 of 10 tasks
CKKKKKKKKK opened this issue Oct 27, 2020 · 0 comments


Data Splitting

  • Split the original ‘train.csv’ into ‘train.csv’, ‘valid.csv’ and ‘test.csv’
    with the ratio of 0.8 : 0.1 : 0.1, respectively.
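
A minimal sketch of the 0.8 : 0.1 : 0.1 split, assuming the rows of the original file are already loaded into a list (for example with `csv.reader`). The function name `split_rows`, the fixed seed, and the choice to shuffle before cutting are illustrative assumptions, not requirements stated in this issue.

```python
import random


def split_rows(rows, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle rows and cut them into train/valid/test parts by ratio."""
    rows = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * ratios[0])
    n_valid = round(len(rows) * ratios[1])
    train = rows[:n_train]
    valid = rows[n_train:n_train + n_valid]
    test = rows[n_train + n_valid:]  # whatever remains goes to test
    return train, valid, test
```

Because the new training file reuses the name ‘train.csv’, it may be worth reading the original fully into memory (or keeping a backup copy) before writing the three outputs.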

Data Preprocessing

  • Convert labels into two classes: low (0, 1) and high (2, 3)

  • Data normalization (i.e., scaling values of attributes to
    the same level, e.g., [0, 1])

  • For Naive Bayes, discretize continuous attributes into intervals

  • For Naive Bayes, split large numeric values into ranges
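
The preprocessing steps above could be sketched with NumPy roughly as follows. All function names are hypothetical, equal-width binning with `n_bins=4` is only one possible discretization, and fitting the scaling and bin edges on the training split alone is an assumption made here to avoid leaking validation or test statistics.

```python
import numpy as np


def binarize_labels(y):
    """Map labels 0 and 1 to class 0 (low), labels 2 and 3 to class 1 (high)."""
    return (np.asarray(y) >= 2).astype(int)


def fit_min_max(X_train):
    """Compute per-column minimum and range from the training split."""
    X_train = np.asarray(X_train, dtype=float)
    lo = X_train.min(axis=0)
    span = X_train.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns: avoid division by zero
    return lo, span


def apply_min_max(X, lo, span):
    """Scale columns so that training values fall in [0, 1]."""
    return (np.asarray(X, dtype=float) - lo) / span


def fit_bin_edges(X_train, n_bins=4):
    """Equal-width interior bin edges, shape (n_features, n_bins - 1)."""
    X_train = np.asarray(X_train, dtype=float)
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return np.linspace(lo, hi, n_bins + 1)[1:-1].T


def discretize(X, edges):
    """Replace each value by its bin index in 0 .. n_bins - 1."""
    X = np.asarray(X, dtype=float)
    cols = [np.digitize(X[:, j], edges[j]) for j in range(X.shape[1])]
    return np.stack(cols, axis=1)
```

Values in the validation or test split that fall outside the training range simply land in the first or last bin, and may scale slightly outside [0, 1].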

Model Implementation

  • Implement Naive Bayes

  • Implement Logistic Regression
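
One possible shape for the two models, given as a rough sketch rather than a reference solution. It assumes Naive Bayes receives integer bin indices (the discretized attributes) and uses Laplace smoothing, and that Logistic Regression receives scaled features and is trained with plain batch gradient descent. The class names, `alpha`, `lr`, and `n_iter` are illustrative choices that do not come from this issue.

```python
import numpy as np


class CategoricalNaiveBayes:
    """Naive Bayes for integer-coded features with Laplace smoothing."""

    def fit(self, X, y, n_values=4, alpha=1.0):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        # Log prior of each class.
        self.log_prior_ = np.log([(y == c).mean() for c in self.classes_])
        # log_lik_[k, j, v] = log P(feature j == v | class k)
        n_features = X.shape[1]
        self.log_lik_ = np.zeros((len(self.classes_), n_features, n_values))
        for k, c in enumerate(self.classes_):
            Xc = X[y == c]
            for j in range(n_features):
                counts = np.bincount(Xc[:, j], minlength=n_values) + alpha
                self.log_lik_[k, j] = np.log(counts / counts.sum())
        return self

    def predict(self, X):
        X = np.asarray(X)
        cols = np.arange(X.shape[1])
        # For each class, add the prior to the summed per-feature log-likelihoods.
        scores = np.stack(
            [self.log_prior_[k] + self.log_lik_[k][cols, X].sum(axis=1)
             for k in range(len(self.classes_))],
            axis=1,
        )
        return self.classes_[scores.argmax(axis=1)]


class LogisticRegressionGD:
    """Binary logistic regression trained by batch gradient descent."""

    def fit(self, X, y, lr=0.1, n_iter=1000):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.w_ = np.zeros(X.shape[1])
        self.b_ = 0.0
        for _ in range(n_iter):
            p = 1.0 / (1.0 + np.exp(-(X @ self.w_ + self.b_)))
            err = p - y  # derivative of the log loss w.r.t. the logits
            self.w_ -= lr * (X.T @ err) / len(y)
            self.b_ -= lr * err.mean()
        return self

    def predict(self, X):
        # A non-negative logit corresponds to a probability of at least 0.5.
        z = np.asarray(X, dtype=float) @ self.w_ + self.b_
        return (z >= 0).astype(int)
```

Working in log space in the Naive Bayes model avoids underflow when many small probabilities are multiplied, and the smoothing term keeps unseen bin values from producing a probability of zero.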

Empirical Study

  • Compare the three methods (Naive Bayes, SVM, and Logistic
    Regression) with respect to classification accuracy on the
    training set and the test set separately. Report the accuracy
    comparison using figures and tables.

  • Report the time required by each of the methods, excluding the
    time needed for loading and processing data. This may be done
    using the time module. The result can be shown in a table.

  • Provide a description for each figure or table. Analyze the
    differences, pros, and cons of the three methods.
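
The accuracy and timing measurements could be collected with a small harness like the one below. It is a sketch under the assumption that each model exposes `fit(X, y)` and `predict(X)`; the helper names and the returned dictionary keys are made up for illustration. Only fitting and prediction are timed, so loading and preprocessing must finish before `evaluate` is called.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = list(y_true), list(y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return train/test accuracy plus fit and predict wall-clock times."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    start = time.perf_counter()
    train_pred = model.predict(X_train)
    test_pred = model.predict(X_test)
    predict_seconds = time.perf_counter() - start

    return {
        "train_acc": accuracy(y_train, train_pred),
        "test_acc": accuracy(y_test, test_pred),
        "fit_seconds": fit_seconds,
        "predict_seconds": predict_seconds,
    }
```

Running `evaluate` once per method yields one row of the accuracy table and one row of the timing table, which keeps the comparison consistent across the three methods.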
