this post was submitted on 27 Oct 2023

Machine Learning

I'm currently working on a machine learning task on Kaggle, and I'm aiming for an accuracy of at least 0.97. While I've made some progress, I've hit a plateau at 0.91 and can't seem to improve beyond that.

Task Description: I'm working on a classification task where tweets need to be classified into two categories: "Sports" or "Politics." I've used various models, including BERT, and have explored hyperparameter tuning, but I haven't been able to achieve the desired accuracy.

Current State: My best model currently has an accuracy of 0.91. I'm looking for ideas, strategies, and any advice that might help me break through this barrier and achieve a 0.97 accuracy score. I'm open to trying new approaches or techniques, and I'd love to hear from anyone who has experience with similar tasks.

Questions:

  • Are there specific techniques or approaches you recommend for improving model accuracy?
  • How can I make the most out of my training data and optimize the model further?
  • Any insights on feature engineering or data augmentation that could help?

I greatly appreciate any insights or feedback you can provide. Please share your experiences, suggestions, or any resources you think might be helpful.

[–] dual_carriageway@alien.top 1 points 1 year ago (1 children)

First off, it’s kind of a funny task, given that people like to complain that some treat politics like a sport these days.

What data do you have access to: just the tweet text, or is there other metadata like username, time, bio, profile picture, etc.?

[–] poolyhymnia@alien.top 1 points 1 year ago

I added sample data to the post body. It's basically this:

Data fields

  • TweetId - an anonymous id unique to a given tweet
  • Label - the associated label which is either Sports or Politics
  • TweetText - the text in a tweet
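
With only those three columns to work with, one cheap sanity check before more tuning might be to look at label balance and at near-identical tweets, since duplicates shared between train and validation splits can distort accuracy numbers. The sketch below is only an illustration using the standard library: the `normalize` and `summarize` helpers, the placeholder tokens, the sample rows, and the `train.csv` file name are all assumptions, not part of the competition data.

```python
import csv
import io
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def normalize(text):
    """Lowercase a tweet and replace URLs and @mentions with placeholder tokens."""
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    # Collapse runs of whitespace so trivially different copies compare equal.
    return " ".join(text.lower().split())


def summarize(rows):
    """Return (label counts, number of duplicate texts after normalization)."""
    labels = Counter(row["Label"] for row in rows)
    texts = Counter(normalize(row["TweetText"]) for row in rows)
    duplicates = sum(count - 1 for count in texts.values())
    return labels, duplicates


# Made-up sample in the same column layout. For the real data, something like
# csv.DictReader(open("train.csv", newline="", encoding="utf-8")) would replace
# the StringIO below ("train.csv" is a hypothetical file name).
SAMPLE = """TweetId,Label,TweetText
1,Sports,Great win tonight! https://example.com/a @coach
2,Politics,The vote on the budget is tomorrow
3,Sports,great win tonight! https://example.com/b @fan
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
label_counts, duplicate_count = summarize(rows)
print(label_counts, duplicate_count)
```

If the duplicate count is high, deduplicating before splitting may give a more honest validation score; if the labels are heavily skewed, plain accuracy may be a misleading target.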