ML Projects

Goal: Given an article’s metadata and text descriptors plus a candidate contributor ID, predict whether the candidate is a true contributor. The test set is balanced: half of the cases are positive and half are negative.

How it works (high level):

Parse sparse textual indicators from titles/abstracts and combine them with metadata (year, venue, known contributors).

Train supervised classifiers to output 0/1 for “candidate is a true contributor.”

Submit predictions to Kaggle in the required CSV format (id,predictions) for 2,000 test rows.

Submissions are scored by classification accuracy; the leaderboard uses a public/private split to discourage overfitting to the public scores.
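
The steps above can be sketched end to end in a few dozen lines. This is a minimal illustration, not the project's actual code. The article field names (title, abstract, year, venue, contributors), the feature naming scheme, and the toy records are all hypothetical. The write-up does not say which classifiers were used, so a small logistic regression trained by SGD stands in for them here. Only the id,predictions header comes from the description above.

```python
import csv
import io
import math
import re


def featurize(article, candidate_id):
    """Turn one (article, candidate) pair into a sparse {feature: value} dict."""
    feats = {"bias": 1.0}
    text = f"{article['title']} {article['abstract']}".lower()
    for token in set(re.findall(r"[a-z0-9]+", text)):
        # Candidate-specific token indicator (hypothetical naming scheme).
        feats[f"cand={candidate_id}|tok={token}"] = 1.0
    feats[f"venue={article['venue']}"] = 1.0
    feats[f"cand={candidate_id}|venue={article['venue']}"] = 1.0
    feats["year_scaled"] = (article["year"] - 2000) / 10.0  # rough scaling only
    for known in article["contributors"]:
        feats[f"cand={candidate_id}|with={known}"] = 1.0
    return feats


def score(weights, feats):
    """Linear score: dot product of the weights and a sparse feature dict."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_logreg(rows, labels, epochs=50, lr=0.5):
    """Stand-in classifier: logistic regression fitted with plain SGD."""
    weights = {}
    for _ in range(epochs):
        for feats, y in zip(rows, labels):
            z = max(-30.0, min(30.0, score(weights, feats)))  # avoid exp overflow
            p = 1.0 / (1.0 + math.exp(-z))
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * (y - p) * value
    return weights


def predict(weights, feats):
    """Hard 0/1 decision: 1 means 'candidate is a true contributor'."""
    return 1 if score(weights, feats) > 0 else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def write_submission(handle, ids, preds):
    """Write rows in the id,predictions CSV layout."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["id", "predictions"])
    writer.writerows(zip(ids, preds))


# Toy, made-up records: (article, candidate ID, label).
toy = [
    ({"title": "Sparse kernels", "abstract": "kernel methods", "year": 2015,
      "venue": "ICML", "contributors": ["a2"]}, "a1", 1),
    ({"title": "Protein folding", "abstract": "structure search", "year": 2012,
      "venue": "ISMB", "contributors": ["a7"]}, "a1", 0),
    ({"title": "Kernel approximation", "abstract": "sparse features", "year": 2017,
      "venue": "ICML", "contributors": ["a2", "a3"]}, "a1", 1),
    ({"title": "Genome assembly", "abstract": "graph search", "year": 2014,
      "venue": "ISMB", "contributors": ["a7", "a8"]}, "a1", 0),
]
rows = [featurize(article, cand) for article, cand, _ in toy]
labels = [label for _, _, label in toy]
weights = train_logreg(rows, labels)
preds = [predict(weights, feats) for feats in rows]

buffer = io.StringIO()  # a real run would open a file instead
write_submission(buffer, range(1, len(preds) + 1), preds)
print(buffer.getvalue())
print("training accuracy:", accuracy(labels, preds))
```

Because the test set is balanced, always predicting one class scores about 50% accuracy, so that is the baseline any model should beat. Accuracy measured on the training rows, as in this toy run, says little about leaderboard performance; a held-out validation split gives a more honest estimate.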