How LLMs Work
A simplified view of how large language models work
With AI co-pilots everywhere, have you ever wondered how the LLMs that power them work?
In this post, I will give you a 10,000-foot view of how LLMs work.
At its core, an LLM is nothing but a large number of parameters, called weights and biases, packed into a file, plus a small piece of code that uses these parameters to predict the next token given some text.
Parameters are the knowledge a model learns through training on a vast amount of text data; they represent the skills of the model in a compressed format. Think of a lossy compression like JPEG or MP3 that is then further compressed using zip. A token is a word or a part of one. For example, as shown in Figure 4, GPT-4 has 1.76 trillion parameters, Llama-2 has 70 billion parameters, and Gemini has 1.56 trillion parameters.
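To make the "parameters plus a small piece of code" idea concrete, here is a toy sketch, not a real LLM. The "parameters" are just bigram counts learned from a tiny made-up corpus, and "prediction" is looking up the most frequent next token. Real LLMs use billions of neural-network weights and much more sophisticated code, but the overall shape (learned parameters + a prediction loop) is the same:

```python
from collections import Counter, defaultdict

# Toy "training corpus" (an assumption for illustration only).
corpus = "the cat sat on the mat the cat ate".split()

# "Training": the parameters here are simply counts of how often
# each token follows each other token in the corpus.
params = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    params[prev][nxt] += 1

def predict_next(token):
    """Return the most frequently seen next token after `token`."""
    return params[token].most_common(1)[0][0]

# "cat" follows "the" twice in the corpus, "mat" only once.
print(predict_next("the"))  # -> cat
```

A real LLM replaces the count table with trillions of learned weights and predicts a probability distribution over its whole vocabulary, but this captures the essence: parameters learned from text, plus a small prediction routine.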