How to train your large language model

A new technique is speeding up the process


  • March 13th 2024
  • in Science and technology
It is no secret that building a large language model (LLM) requires vast amounts of data. In conventional training, an LLM is fed mountains of text, and encouraged to guess each word before it appears. With each prediction, the LLM makes small adjustments to improve its chances of guessing right. The end result is something that has a certain statistical “understanding” of what is proper language and what isn’t.

But an LLM that has only undergone this so-called “pretraining” is not yet particularly useful. When asked for a joke to cheer your correspondent up, for instance, the pretrained model GPT-2 just repeated the question back three times. When asked who the American president was, it responded: “The answer is no. The president is not the president.” Clearly, teaching an LLM to do what humans want requires something more.
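For readers who want to see the guessing game in code, the sketch below shows the idea in miniature, assuming PyTorch is available. The toy corpus, the TinyLM class and all the sizes are illustrative choices, not the settings of any real LLM: the model predicts each character from the one before it, and a gradient step nudges its weights after every batch of guesses.

```python
# A minimal sketch of next-token "pretraining" (illustrative, not a real LLM setup).
import torch
import torch.nn as nn

# Toy corpus, tokenised at the character level for simplicity.
text = "the cat sat on the mat. the dog sat on the rug."
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in text])

class TinyLM(nn.Module):
    """Embeds each token and predicts the next one from the current token alone."""
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        return self.head(self.embed(tokens))  # logits over the next token

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Training loop: guess each token from the one before it, then adjust the
# weights slightly in the direction that would have made the guesses better.
for step in range(200):
    inputs, targets = data[:-1], data[1:]
    logits = model(inputs)
    loss = loss_fn(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final loss: {loss.item():.3f}")
```

Real pretraining differs only in scale: trillions of tokens, a transformer with billions of parameters, and far longer context than a single preceding token, but the loop of predict, score and adjust is the same.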
