Training my own 70 million parameter Chatbot (like ChatGPT)

With the release of the Dolly-15k dataset from Databricks and the Pythia-70M model from EleutherAI, it’s easy to train your own ChatGPT-like chatbot using the Hugging Face Transformers library!
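Before fine-tuning, each Dolly-15k record has to be flattened into a single training string. Here is a minimal sketch of that step. The field names (`instruction`, `context`, `response`, `category`) are the ones the dataset uses, but the `### Instruction:` / `### Response:` template, the `format_example` helper, and the sample record are my own illustrative assumptions, not necessarily what was used for the models in this post.

```python
def format_example(record: dict) -> str:
    """Turn one Dolly-15k style record into a single training string.

    The section markers below are a hypothetical template chosen for
    illustration; any consistent template should work.
    """
    parts = [f"### Instruction:\n{record['instruction']}"]
    # Only some records carry supporting context, so skip it when empty.
    if record.get("context"):
        parts.append(f"### Context:\n{record['context']}")
    parts.append(f"### Response:\n{record['response']}")
    return "\n\n".join(parts)


# A made-up record in the dataset's shape (not copied from Dolly-15k).
sample = {
    "instruction": "What is the capital of Peru?",
    "context": "",
    "response": "The capital of Peru is Lima.",
    "category": "open_qa",
}

print(format_example(sample))
```

At inference time you would send the same template up to and including the `### Response:` line and let the model generate the rest.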

I trained several chatbots on models of different sizes (from 70 million parameters up to 1.4 billion parameters). At 70 million parameters, the answers were sometimes properly structured but heavily hallucinated. As I worked my way up to 1.4 billion parameters, I started to get legitimate answers!

Loss by model size (lower is better). Reductions at lower loss values are more impactful: for example, going from 1.5 to 1.4 tends to be a much bigger improvement than going from 3.0 to 2.9.
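One rough way to see part of this is to look at the drop relative to the starting loss. The snippet below is only that arithmetic, using the example numbers from the caption. The `relative_reduction` helper is a name I made up for illustration, and relative change is my own framing, not a full explanation of why answer quality improves.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction of the starting loss that the improvement removed."""
    return (before - after) / before


low = relative_reduction(1.5, 1.4)   # same 0.1 drop, starting from a low loss
high = relative_reduction(3.0, 2.9)  # same 0.1 drop, starting from a high loss

print(f"1.5 -> 1.4: {low:.1%} of the loss removed")
print(f"3.0 -> 2.9: {high:.1%} of the loss removed")
```

The same absolute drop of 0.1 removes about twice as large a share of the loss when starting from 1.5 as it does when starting from 3.0.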

How did the models do? I asked each one 5 questions:

Question 1: What is the capital of Peru?
Pythia-70m: The capital of Peru is the capital of Peru.

Pythia-410m: The capital of Peru is Lima.

Pythia-1b: The capital of Peru is Lima.

Pythia-1.4b-LoRA (8-bit): Lima is the capital of Peru and is the largest city in the country.

Question 2: What is the boiling point of water?
Pythia-70m: The boiling point of water is about 20.5 degrees Celsius.

Pythia-410m: The boiling point of water is the temperature at which boiling water will vaporize and become steam.

Pythia-1b: The boiling point of water is 212 degrees Fahrenheit.

Pythia-1.4b-LoRA (8-bit): The boiling point of water is 212.4 °F (100 °C).

Question 3: Explain the RGB color space.
Pythia-70m: RGB is a color space that is used to represent the light in a wide range of colors. This space is also used to represent the light in a wide range administrative area.

Pythia-410m: The RGB color space is a color space that represents a color as a single number. It is a color space that is used in the image processing field to represent a color as a single number.

Pythia-1b: RGB stands for Red, Green, Blue. The RGB color space is a color space that describes the way that light is perceived by humans. It is a three-dimensional space that is comprised of a red, green, and blue color.

Pythia-1.4b-LoRA (8-bit): The RGB color space is a color representation of the red, green, and blue components of light. It is a common color space used in color science and computer graphics.

Question 4: Share an inspirational quote.
Pythia-70m: “I am so honored to be the first person to write a book that I could write for my family and friends.” – John D. Roc

Pythia-410m: “I don’t believe in coincidences, I believe in the ability to predict the future.”

Pythia-1b: “If you want to be happy, be happy. If you want to be healthy, be healthy. If you want to be successful, be successful.”

Pythia-1.4b-LoRA (8-bit): “The world is a book and those who do not travel read only one page”.

Question 5: What is the best flavor of Starburst?
Pythia-70m: The best flavor of Starburst is the flavor of the sun. It is very sweet and very light.

Pythia-410m: The best flavor of Starburst is the one with the most carbon dioxide. It is the most bitter. The other flavors are not as bitter. The other flavors are the most sweet.

Pythia-1b: The best flavor of Starburst is a combination of chocolate and vanilla.

Pythia-1.4b-LoRA (8-bit): The answer is subjective. Some people enjoy the flavor of Starburst, while others do not. Some people enjoy the flavor of Starburst with a hint of cinnamon and nutmeg, while others do not.

Even the smallest model returns properly structured answers (for the quote, I did not add the quotation marks; the model returned them itself). The models clearly hallucinate less as the number of parameters grows.

This is all with fine-tuning on just 15k examples. “Snarky” responses such as “The capital of Peru is the capital of Peru” (which is technically correct) can be eliminated with more training and feedback. The quality of answers can also increase on the same-sized model with more feedback and training.

The 70-million-parameter model runs comfortably on a CPU, and the 1.4-billion-parameter model can run on an 8 GB GPU.
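A back-of-the-envelope estimate shows why these sizes fit. The sketch below multiplies the nominal parameter count by the bytes used per parameter. It covers the weights only, so activations, optimizer state, and framework overhead come on top. The `weight_memory_gb` helper and the rounded parameter counts are assumptions for illustration, not measured values.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Approximate memory for the model weights alone, in gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal sizes taken from the model names (70m, 1.4b).
small = weight_memory_gb(70_000_000, "fp32")
large_int8 = weight_memory_gb(1_400_000_000, "int8")
large_fp32 = weight_memory_gb(1_400_000_000, "fp32")

print(f"70M in fp32:  {small:.2f} GB")
print(f"1.4B in int8: {large_int8:.2f} GB")
print(f"1.4B in fp32: {large_fp32:.2f} GB")
```

By this estimate the 70M weights take roughly 0.28 GB at full precision, and loading the 1.4B model in 8-bit cuts its weights from about 5.6 GB to about 1.4 GB, which leaves headroom on an 8 GB card.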
