This post is being written during a time of quick change, so chances are it’ll be out of date within a matter of days; for now, if you’re looking to run Llama 7B on Windows, here are some quick steps.
Start by running PowerShell. Create a new directory and enter it.
mkdir llama
cd llama
I am assuming you already have Python and pip installed; if not, you can find steps on ChatGPT.
Next, create a Python virtual environment. You can do this without one, but as of now the setup requires nightly builds of PyTorch (for flash attention) and an unmerged branch of transformers, so keeping it isolated is the safer choice.
python -m venv .venv
.\.venv\Scripts\Activate.ps1
This should create and activate a virtual Python environment. Next we’re going to install everything you need:
Now create a file called llama.py with the following body:
import transformers
device = "cpu"
tokenizer = transformers.LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = transformers.LLaMAForCausalLM.from_pretrained("decapoda-research/llama-7b-hf").to(device)
batch = tokenizer(
    "The capital of Canada is",
    return_tensors="pt",
    add_special_tokens=False,
)
batch = {k: v.to(device) for k, v in batch.items()}
generated = model.generate(batch["input_ids"], max_length=100)
print(tokenizer.decode(generated[0]))
That’s all there is to it! Use the command “python llama.py” to run it, and you should be told the capital of Canada! You can modify the above code as you desire to get the most out of Llama!
You can replace “cpu” with “cuda” to use your GPU.
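If you would rather not edit the string by hand, here is a minimal sketch that picks the device automatically. The helper name pick_device is my own, not part of transformers, and it assumes PyTorch is what you installed into the virtual environment; without PyTorch it simply falls back to "cpu".

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch still runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Hypothetical drop-in for the hard-coded device line in llama.py.
device = pick_device()
print(f"Using device: {device}")
```

You could use this in place of the device = "cpu" line in llama.py; the rest of the script stays the same.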
I haven’t updated this site in a long time, but I’ve been paying a monthly hosting fee all along. This year, when the site came up for renewal with Bluehost, I felt their prices were high and their pricing scheme was deceptive. I wasn’t able to switch packages to the lower tier on the dashboard, and when I called in, the price quoted wasn’t the same as the price on the website. Before calling, I decided I’d not let the call run for more than 5 minutes; if they couldn’t figure it out that fast, I might as well self-host. I’ve been wanting to set this up for a while anyway.
First, Get a Server
You’re going to need a server, and I am going to assume you know how to get one up and running. I used a nano AWS instance, but a basic Digital Ocean droplet would do nicely.
Prepare the Server
Now you need to install Docker and Docker Compose. SSH in to the server and let’s get started. The lines below are meant for Amazon Linux; if you’re using something else, you will need to adjust accordingly:
# Update your packages, always good to start with this.
sudo yum update -y
# Install Docker
sudo yum install docker
# Start Docker
sudo service docker start
# Allow the default ec2-user to interact with Docker
sudo setfacl --modify user:ec2-user:rw /var/run/docker.sock
# Install Docker-Compose (all one line, wrapped because of WordPress)
sudo curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
# Enable Docker-Compose to be executable
sudo chmod +x /usr/local/bin/docker-compose
# Make sure Docker-Compose is working
docker-compose version
# If you see something like "Docker Compose version v2.1.1", it worked!
Now that your machine is all ready to go, let’s install the software.
Optional Swap Space
If you’ve chosen a server that doesn’t have much memory, like I did, you will probably see errors when you start the Docker network. I added 2GB of swap space to fix this:
# Create a 2GB file
sudo fallocate -l 2G swap
# Set the file permissions
sudo chmod 600 swap
# Turn it in to a swap
sudo mkswap swap
# Enable the swap
sudo swapon swap
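To confirm the swap is actually active, one option is to read the SwapTotal line from /proc/meminfo. The sketch below is only an illustration: the function name swap_total_kib is my own, and it parses a sample string so it runs anywhere; on the server you would pass it the contents of /proc/meminfo instead.

```python
def swap_total_kib(meminfo_text: str) -> int:
    """Return the SwapTotal value (in kB) from /proc/meminfo-style text, or 0 if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Lines look like "SwapTotal:       2097148 kB"
            return int(line.split()[1])
    return 0


# Sample text standing in for the real file; the numbers are made up.
sample = "MemTotal:         485012 kB\nSwapTotal:       2097148 kB\nSwapFree:        2097148 kB\n"
print(f"Swap configured: {swap_total_kib(sample)} kB")
```

On the server itself, something like swap_total_kib(open("/proc/meminfo").read()) should report roughly 2 GB once swapon has run.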
Install WordPress + MySQL + Redis
Let’s install all the software.
# Create the directory for the files, chown it, and enter it
sudo mkdir /var/WordPress
sudo chown ec2-user:ec2-user /var/WordPress
cd /var/WordPress
# Open "docker-compose.yml" for editing
nano docker-compose.yml
The last line above should open a basic text editor called Nano. Copy and paste the following contents in to the file; I will break down this file below. After pasting, use “ctrl+x”, “y”, “enter” to save it.
Version 3.9 is the version of the Compose file format, nothing special here. The file creates a network of 3 containers. First, a WordPress container that maps port 80 of the host to the container. Second, a “db” container running MySQL 5.7. You might have noticed it has a random root password; you will not need to access the database as root, so we can do this. Lastly, a generic Redis container.
It’s not good practice to embed your secret password in the open like I did above. You should use Docker secrets for this, but for now, this will do.
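For reference, here is a rough sketch of what a Compose file matching the description above could look like, written as Python that prints JSON. Almost everything in it is an assumption on my part rather than the exact file: the image tags, the restart policy, the "wordpress" database name and user, and the ./site and ./db volume paths. The environment variable names are the ones documented for the official wordpress and mysql images. The password is generated with Python's secrets module so that no fixed password ends up in the file.

```python
import json
import secrets


def build_compose(db_password: str) -> dict:
    """Build a Compose definition with WordPress, MySQL 5.7, and Redis (assumed details)."""
    return {
        "version": "3.9",
        "services": {
            "wordpress": {
                "image": "wordpress",
                "restart": "always",           # assumed restart policy
                "ports": ["80:80"],            # host port 80 -> container port 80
                "environment": {
                    "WORDPRESS_DB_HOST": "db",
                    "WORDPRESS_DB_USER": "wordpress",
                    "WORDPRESS_DB_PASSWORD": db_password,
                    "WORDPRESS_DB_NAME": "wordpress",
                },
                "volumes": ["./site:/var/www/html"],  # assumed host path
                "depends_on": ["db", "redis"],
            },
            "db": {
                "image": "mysql:5.7",
                "restart": "always",
                "environment": {
                    "MYSQL_DATABASE": "wordpress",
                    "MYSQL_USER": "wordpress",
                    "MYSQL_PASSWORD": db_password,
                    "MYSQL_RANDOM_ROOT_PASSWORD": "1",  # root password is randomised
                },
                "volumes": ["./db:/var/lib/mysql"],    # assumed host path
            },
            "redis": {"image": "redis", "restart": "always"},
        },
    }


# Generate the password once; re-running would produce a different one.
print(json.dumps(build_compose(secrets.token_urlsafe(24)), indent=2))
```

Since JSON is also valid YAML, the printed output should be usable as docker-compose.yml, but treat it as a starting point and check it against your own setup before relying on it.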
Turn it on!
Use the following command to turn it on:
docker-compose up
You should see a lot of text on the screen. Once it’s up and running, test it by visiting http://<SERVER IP> in your browser. You should see this:
The WordPress setup screen!
If this works, go back to your SSH session and hit ctrl+c to stop the network. Let’s run it in detached mode (so it stays running in the background):
docker-compose up -d
Now go back to your browser and go ahead and set up your WordPress instance. We still need to finish setting up Redis, so continue once you’ve set up WordPress.
Enabling Redis for Max Performance
Go to “Plugins” and click the “Add New” button.
Search for “redis” and install “Redis Object Cache”.
Don’t forget to activate it!
Now go to settings:
and enable it:
We’re not done yet: this will result in a “Status: Not connected”. This is expected, because by default the plugin tries to connect to localhost as the Redis server, and we need to change that. Let’s go back to our SSH session.
# Edit the wp-config.php file
sudo nano /var/WordPress/site/wp-config.php
# Add this after "<?php" to use "redis" as the hostname.
define('WP_REDIS_HOST', 'redis');
# ctrl+x , y , enter to save
Your file should look like this.
Finally, refresh the WordPress admin panel and you should have WordPress connected to Redis.
Redis is connected.
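If you would rather script the wp-config.php change than edit it in Nano, here is a minimal sketch of the same edit. The function name add_redis_host is my own; it only inserts the define line right after the opening "<?php" tag, and it does nothing if WP_REDIS_HOST is already mentioned in the file.

```python
def add_redis_host(config_text: str, host: str = "redis") -> str:
    """Insert the WP_REDIS_HOST define right after the opening <?php tag."""
    if "WP_REDIS_HOST" in config_text:
        return config_text  # already configured, leave the file untouched
    head, tag, tail = config_text.partition("<?php")
    if not tag:
        raise ValueError("no <?php opening tag found")
    return f"{head}{tag}\ndefine('WP_REDIS_HOST', '{host}');{tail}"


# Tiny stand-in for a real wp-config.php.
sample = "<?php\n/** The base configuration for WordPress */\n"
print(add_redis_host(sample))
```

To apply it for real you would read /var/WordPress/site/wp-config.php, pass the text through the function, and write the result back (with sudo, since the file is owned by the container).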
That’s all! You can now set up your WordPress instance and use it as you wish!
So, you’ve created a project on an Arduino and want to deploy it in to the world. The problem is that an Arduino is a big and relatively expensive device that has far more things than necessary for your project.
What you really need is just the microcontroller to run your code and control the pins. All you need to do is buy an ATtiny45, ATtiny85 or similar Atmel chip, and then upload your program to it.
The Arduino makes uploading your code to the Atmel chip, and powering it, very easy. Getting your code on to a stand-alone Atmel chip is a little more tricky; there are ways to program the chip with an Arduino, but they aren’t as easy as using a USBasp AVR programmer. The one I used is this one found on Ali Express.
If you are on Windows, you will need to install the drivers for the programmer. A lot of sites will say you need to disable driver signing and do a whole bunch of steps, but the drivers found here will install on Windows 10 (they are signed) when installed using the installer (but not when you try to install the drivers yourself). Once the drivers are installed, plug in the programmer and confirm the drivers are working in Device Manager:
Next, you need to install the ATtiny board in to the Arduino IDE. To do this, follow these steps:
Go to File > Preferences.
Click on the button to edit the “Additional Boards Manager URLs”.
Add “https://raw.githubusercontent.com/damellis/attiny/ide-1.6.x-boards-manager/package_damellis_attiny_index.json” on a new line.
Click “OK” to close the URL editor, and “OK” again to close preferences.
Then go to Tools > Board > Boards Manager.
Search for “ATtiny”.
Install the attiny boards that appear (you should only have 1 result).
Next we need to set the board to be the ATtiny and the programmer to “USBasp”. Use the following screenshot for reference:
We are now all set on the software end; next we need to wire up the hardware. The following pins will need to be connected:
The MOSI from the programmer to the MOSI of the ATtiny (Pin 5)
The MISO from the programmer to the MISO of the ATtiny (Pin 6)
The SCK (Clock) from the programmer to the SCK of the ATtiny (Pin 7)
The RESET from the programmer to the RESET of the ATtiny (Pin 1)
The Vcc from the programmer to the Vcc of the ATtiny (Pin 8)
Ground from the programmer to the ground of the ATtiny (Pin 4)
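The same wiring can be written down as a small lookup table, which is handy for double-checking a breadboard before soldering. This is only a sketch: the names WIRING and check_wiring are my own, and the range check assumes the 8-pin package that the pin numbers above refer to.

```python
# Programmer signal -> ATtiny physical pin, copied from the list above.
WIRING = {
    "MOSI": 5,
    "MISO": 6,
    "SCK": 7,
    "RESET": 1,
    "VCC": 8,
    "GND": 4,
}


def check_wiring(wiring: dict) -> None:
    """Raise ValueError if two signals share a pin or a pin is outside 1..8."""
    pins = list(wiring.values())
    if len(set(pins)) != len(pins):
        raise ValueError("two signals are mapped to the same pin")
    if any(pin not in range(1, 9) for pin in pins):
        raise ValueError("pin number outside the 8-pin package")


check_wiring(WIRING)
for signal, pin in WIRING.items():
    print(f"{signal:<5} -> pin {pin}")
```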
Setting these pins up can be done in a couple of ways. I decided to make a dedicated board for this with a socket, so I can program chips quickly in the future. If you do this, I recommend running some tests on a breadboard first and making sure you have all your pins properly connected before soldering it together. Here is my programmer.
I can quickly program ATtiny chips by putting the chip in the IC socket and plugging a USBasp in the ribbon socket.
Now, all that needs to be done is plug the USBasp in to the USB port of a computer and hit play. Here is what the console will look like when it is all working:
Now you can use the newly programmed ATtiny chip in a project, without having to lug around the entire Arduino with it.
Running a simple blink program on a breadboard with a stand-alone ATtiny85.