Machine Learning

1 readers

1 users here now

Community Rules:

Be nice. No offensive behavior, insults or attacks: we encourage a diverse community in which members feel safe and have a voice.
Make your post clear and comprehensive: posts that lack insight or effort will be removed. (ex: questions which are easily googled)
Beginner or career related questions go elsewhere. This community is focused in discussion of research and new projects that advance the state-of-the-art.
Limit self-promotion. Comments and posts should be first and foremost about topics of interest to ML observers and practitioners. Limited self-promotion is tolerated, but the sub is not here as merely a source for free advertisement. Such posts will be removed at the discretion of the mods.

founded 1 year ago

MODERATORS

communick@academy.garden

[Discussion] What curve can I fit or what model can I train? (alien.top)

submitted 1 year ago by ninadsutrave@alien.top to c/machinelearning@academy.garden

10 comments fedilink hide all child comments

I have a dataset of two column values something like the one shown below. I need to predict the values of y for values of x greater than 60. The curve must follow the increasing trend it is shown till x=60.

I have tried polynomial regression and SVR but it declines for values greater than 60. I have tried to fit the curve y = alnx + b to this curve but the R2score is 0.94. What model can I train for this purpose, or how can I improve the R2score but regressing over an appropriate logarithmic function?

https://preview.redd.it/f9oxc20zga2c1.png?width=1208&format=png&auto=webp&s=b7918c9d9dd2bb930a2e903483d5a230f2dcfce5

you are viewing a single comment's thread
view the rest of the comments

[–] JPyoris@alien.top 1 points 1 year ago

Not a direct answer, but be aware that overfitting will be a thing here too. You might get an R2 of 0.99+ but the extrapolation could be horrendous (for example, using a high-degree polynomial, you already saw that). 0.94 with only two parameters does not sound too bad for me.

Maximizing R2 and eyeballing the extrapolation is not really a valid approach. You should use a goodness of fit test that includes model complexity. You could also implement a simple validation by leaving out the last x% of your data when fitting and then look at the test error.

I also have to agree that it looks somewhat piecewise. Without knowing the generating process the correct continuation could be anything.