this post was submitted on 12 Nov 2023

Machine Learning

I'm looking for some advice regarding a project idea I have. I would like to predict the Big Five personality traits of authors based on an analysis of their writing samples. However, would I need to have some authors take the Big Five personality assessment so that I have a training set with those results? Or is there a way to "guess" which traits certain writing patterns correlate with? What would be a good strategy for setting up an ML project like this?

[–] Veggies-are-okay@alien.top 1 points 10 months ago

Since it’s writing style, it’s unstructured data (as opposed to tabular), so a neural network is usually a strong option. Because you’re looking at text, you have two options:

  1. theoretical: rnn -> lstm -> transformer

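More so if you’re into the inner workings. Recurrent neural networks (RNNs) bring in the concept of recurrence: the network carries a hidden state from one token to the next. LSTMs (long short-term memory) add gates that help hold on to information over longer sequences, which gives you more power (but is a little more complicated). Finally, transformers drop recurrence in favor of attention, so every token can look at every other token directly. The original transformer has an encoder/decoder structure, and BERT uses only the encoder.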
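
If it helps to see the recurrence idea concretely, here is a toy sketch of a single recurrent unit in plain Python. It is not how you'd build a real model; the scalar weights `w_x`, `w_h`, `b` and the function names are made up for illustration, and a real RNN uses learned weight matrices.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent update: mix the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the unit one item at a time, returning every hidden state."""
    h = 0.0  # hidden state starts empty
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)  # the new state depends on the old state
        states.append(h)
    return states
```

Calling `run_rnn([1.0, 0.0, 0.0])` gives a nonzero state at the later steps even though those inputs are zero, which is the "memory" part. An LSTM replaces the single `tanh` update with gated updates.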

  2. huggingface! For simple classification from text this is gonna be real easy and pretty effective:

https://huggingface.co/bert-base-cased

The big thing here is how you're going to fine-tune it. You’ll need some classification outcomes (trait labels) to attach to your samples. Because the traits aren’t mutually exclusive, you might want to make a few binary classifiers (yes/no for a specific trait). The link has some examples of fine-tuning too.
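
A rough sketch of what the labels could look like, assuming you have numeric Big Five scores per author. The median split, the helper names, and the example numbers are all made up for illustration, not a recommended protocol. `build_model` assumes the `transformers` library is installed; it uses one model with five yes/no outputs, which is a swap-in for training five separate binary classifiers.

```python
from statistics import median

TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

# Made-up placeholder scores for three hypothetical authors (not real data).
example_rows = [
    {"openness": 4.2, "conscientiousness": 3.1, "extraversion": 2.0,
     "agreeableness": 3.8, "neuroticism": 2.5},
    {"openness": 2.9, "conscientiousness": 4.0, "extraversion": 3.5,
     "agreeableness": 3.0, "neuroticism": 3.9},
    {"openness": 3.5, "conscientiousness": 2.2, "extraversion": 4.1,
     "agreeableness": 4.4, "neuroticism": 1.8},
]

def binarize_scores(score_rows):
    """Median-split each trait: 1.0 if the score is strictly above the sample median."""
    cutoffs = {t: median(row[t] for row in score_rows) for t in TRAITS}
    return [[1.0 if row[t] > cutoffs[t] else 0.0 for t in TRAITS]
            for row in score_rows]

def build_model(checkpoint="bert-base-cased"):
    """Load BERT with five independent yes/no outputs (downloads weights when called)."""
    from transformers import AutoModelForSequenceClassification
    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(TRAITS),
        problem_type="multi_label_classification",  # one sigmoid per trait
    )
```

Each row from `binarize_scores(example_rows)` is a list of five floats, one per trait, so an author can be "yes" on several traits at once. The multi-label setup expects float labels like these.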

Hope this gets you off to a decent start!