Understanding the basics #286

Answered by thomasantony
FiveTechSoft asked this question in Q&A
Here is the basic idea of how a machine learning model like LLaMA works. You take the input (the text from the prompt) and convert it into a sequence of numbers (called "tokens"). A lot of math is then done on these numbers. This math is defined in terms of matrices (think of each one as a grid of numbers). You can multiply them, add them, and so on, and there are specific rules for how each of these operations is done.
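To make that concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration: the four-word `VOCAB`, the `EMBEDDINGS` grid, the `WEIGHTS` matrix, and the helper names `tokenize` and `matvec` are all assumptions, not anything from LLaMA itself. A real model uses a learned subword tokenizer and vastly larger matrices, but the flow (text, then numbers, then matrix math) is the same in spirit.

```python
# Toy illustration only: all names and numbers here are hypothetical.

# Made-up vocabulary mapping words to integer token IDs.
VOCAB = {"the": 0, "cat": 1, "sat": 2, "down": 3}


def tokenize(text):
    """Split on whitespace and look up each word's token ID."""
    return [VOCAB[word] for word in text.lower().split()]


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Each token ID selects one row of this grid; that row is the token's vector.
EMBEDDINGS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [2.0, 0.5],
]

# A single made-up 2x2 matrix standing in for the model's many matrices.
WEIGHTS = [
    [0.5, -1.0],
    [2.0, 0.0],
]

token_ids = tokenize("the cat sat")          # text -> numbers
vectors = [EMBEDDINGS[t] for t in token_ids]  # numbers -> vectors
outputs = [matvec(WEIGHTS, v) for v in vectors]  # the "math" step

print(token_ids)
print(outputs)
```

The only rule used here is ordinary matrix-vector multiplication: each output number is a row of the matrix multiplied element by element with the input vector and summed.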

The "model", in this case LLaMA, defines how many matrices to use and how to multiply, add, or otherwise combine them with the input to get the output. This structure (the list of operations) is openly available. The actual numbers that go into these matrices, which are then used in the mathematical operati…
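The split between the list of operations and the numbers that fill the matrices can be sketched like this. It is a hypothetical toy, not LLaMA's real architecture: `tiny_model` and both sets of weights are invented for illustration. The function plays the role of the structure, and the dictionaries play the role of the numbers plugged into it.

```python
# Toy illustration only: the function and both weight sets are hypothetical.

def tiny_model(weights, inputs):
    """The 'structure': multiply by W, add b, then clamp negatives to zero."""
    outputs = []
    for row, bias in zip(weights["W"], weights["b"]):
        value = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, value))
    return outputs


# Two different sets of numbers for the same structure.
weights_a = {"W": [[1.0, 0.0], [0.0, 1.0]], "b": [0.0, 0.0]}
weights_b = {"W": [[2.0, -1.0], [0.5, 0.5]], "b": [1.0, -3.0]}

x = [1.0, 2.0]
out_a = tiny_model(weights_a, x)
out_b = tiny_model(weights_b, x)

print(out_a)
print(out_b)
```

The code of `tiny_model` never changes, yet the two weight sets give different outputs for the same input. Knowing the recipe alone is not enough to reproduce a model's behavior; you also need the numbers.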

Answer selected by FiveTechSoft