How AI works is often a mystery — that's a problem (www.nature.com)

submitted 11 months ago by boem@lemmy.world to c/technology@lemmy.world

[–] General_Effort@lemmy.world 5 points 11 months ago

An Artificial Neural Network isn't exactly an algorithm. There are algorithms to "run" ANNs, but the ANN itself is really a big bundle of equations.

An ANN has an input layer of neurons and an output layer. Between them there are usually one or more hidden layers. In the basic, fully connected design, each neuron in one layer is connected to every neuron in the next layer. Let's do without hidden layers for a start. Let's say we are interested in handwriting. We take a little grayscale image of a letter (say, 16*16 pixels) and want to determine whether it shows an upper-case "A".

Your input layer would have 16*16 = 256 neurons and your output layer just 1. Each input value is a single number representing how bright that pixel is. You take these 256 numbers and multiply each one by another number, which represents the strength of the connection between that input neuron and the single output neuron. Then you add up the products, and the sum represents how likely the image is to show an "A".
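To make the arithmetic concrete, here is a minimal sketch in plain Python. The pixel values and weights are random placeholders (my assumption, not trained values), and `output_score` is just a name picked for illustration.

```python
import random

WIDTH = HEIGHT = 16
N_INPUTS = WIDTH * HEIGHT  # 256 input neurons, one per pixel


def output_score(pixels, weights):
    """Single output neuron: multiply each pixel by its weight, then add up."""
    if len(pixels) != len(weights):
        raise ValueError("need exactly one weight per pixel")
    return sum(p * w for p, w in zip(pixels, weights))


# Placeholder data, only to show the shapes involved.
rng = random.Random(0)
pixels = [rng.random() for _ in range(N_INPUTS)]              # brightness in [0, 1)
weights = [rng.uniform(-1.0, 1.0) for _ in range(N_INPUTS)]   # connection strengths

score = output_score(pixels, weights)
print(f"{N_INPUTS} inputs -> one score: {score:.3f}")
```

With random weights the score means nothing, of course. It only becomes a useful "A" detector once the weights have been trained.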

I think that wouldn't work well (or at all) without a hidden layer but IDK.

The numbers representing the strength of the connections are the parameters of the model, aka the weights. In this extremely simple case, they can be interpreted easily. If a parameter is large and positive, then a high value at that pixel makes it more likely that we have an "A". If it's negative, then a high value there makes it less likely. Finding these numbers/parameters/weights is what training a model means.
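A toy sketch of what "finding the weights" can look like. Everything here is an assumption for illustration: 4-pixel "images" instead of 256, made-up examples, and the simple delta rule (nudge each weight in proportion to the error) as the training method. Real models are trained with more elaborate variants of the same idea.

```python
def predict(pixels, weights):
    """Weighted sum of the inputs, as in the no-hidden-layer network."""
    return sum(p * w for p, w in zip(pixels, weights))


def train(examples, n_inputs, epochs=200, lr=0.1):
    """Delta rule: shift each weight a little to reduce the prediction error."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for pixels, target in examples:
            error = target - predict(pixels, weights)
            for i, p in enumerate(pixels):
                weights[i] += lr * error * p
    return weights


# Hypothetical training data: target 1.0 means "is an A", 0.0 means "is not".
# In this made-up set, pixel 0 is lit in every "A" and pixel 2 in every non-"A".
examples = [
    ([1.0, 0.0, 0.0, 1.0], 1.0),
    ([1.0, 1.0, 0.0, 0.0], 1.0),
    ([0.0, 1.0, 1.0, 0.0], 0.0),
    ([0.0, 0.0, 1.0, 1.0], 0.0),
]

weights = train(examples, n_inputs=4)
print([round(w, 2) for w in weights])
```

After training, the weight for pixel 0 comes out clearly positive and the one for pixel 2 negative, which is the kind of direct interpretation that is still possible without a hidden layer.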

When you add a hidden layer, things get murky. You have an intermediate result and don't know what it represents.
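A small sketch of a forward pass with one hidden layer, to show where the intermediate result appears. The weights are hand-picked, and the bias terms and the ReLU step (clipping negative values to zero) are my assumptions. They are common in practice but not something described above.

```python
def dense(inputs, weight_rows, biases):
    """One fully connected layer: one weighted sum per neuron, plus a bias."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weight_rows, biases)
    ]


def relu(values):
    """Clip negative values to zero (a common choice of nonlinearity)."""
    return [max(0.0, v) for v in values]


def forward(pixels, hidden_w, hidden_b, out_w, out_b):
    hidden = relu(dense(pixels, hidden_w, hidden_b))   # the intermediate result
    output = dense(hidden, [out_w], [out_b])[0]        # single output neuron
    return hidden, output


# Hypothetical 4-pixel input, 3 hidden neurons, 1 output.
pixels = [1.0, 0.0, 0.5, 0.0]
hidden_w = [
    [1.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, -1.0],
    [-1.0, 0.0, 0.0, 1.0],
]
hidden_b = [0.0, 0.0, 0.0]
out_w = [1.0, 0.5, -2.0]
out_b = 0.0

hidden, output = forward(pixels, hidden_w, hidden_b, out_w, out_b)
print(hidden, output)  # [1.0, 1.0, 0.0] 1.5
```

The three numbers in `hidden` are exactly that murky intermediate result: they feed the output, but nothing says what any one of them is supposed to mean.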

The impressive AI models take much more input, produce much more diverse output and have many hidden layers. The small ones, which you can run on a gaming PC, have several billion parameters. The big ones, like ChatGPT, have several hundred billion. Each of these numbers is potentially involved in creating the output.
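For a feel of how quickly the numbers grow, here is a rough sketch that counts the connection weights in a fully connected network. The layer sizes are made-up examples, and biases are ignored to match the description above. Large language models are built differently, so this only illustrates the growth, not how their sizes are actually computed.

```python
def count_weights(layer_sizes):
    """Weights in a fully connected network: one per pair of neurons
    in adjacent layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


print(count_weights([256, 1]))        # no hidden layer: 256
print(count_weights([256, 32, 1]))    # one hidden layer of 32: 8224
print(count_weights([256, 512, 512, 1]))
```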