Local LLM Llamafile

Try a local LLM using Mozilla’s Llamafile:

  1. Get the 4.3 GB file:
    curl -LO https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4-server.llamafile

  2. Make it executable.

  3. Run it. If you get errors at this point, see this post.

  4. Visit the web UI in your browser.
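The steps above, pulled together as one shell sketch. The chmod line and the localhost URL are not spelled out in the list; they are inferred from the filename in step 1 and the port-8080 server log quoted later in this thread:

```shell
# 1. Get the 4.3 GB file
curl -LO https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4-server.llamafile

# 2. Make it executable
chmod +x llava-v1.5-7b-q4-server.llamafile

# 3. Run it (starts a local web server)
./llava-v1.5-7b-q4-server.llamafile

# 4. Visit the web UI -- port 8080, per the server log later in the thread
#    e.g. open http://localhost:8080 in a browser
```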

H/t Simon Willison; more here: llamafile is the new best way to run an LLM on your own computer


It actually works pretty well. I got it a few days ago and it worked out of the box. Merging many of these models looks like it will be doable soon, too.


Have you asked it to repeat a word forever yet? I’m still downloading, but it’s the first thing I’m gonna try :imp:


No, not yet. It’s quite a good model, though, and helpful.


I think my favourite thing about llamafile is what it represents. This is a single binary file which you can download and then use, forever, on (almost) any computer.

Except for mine apparently (Linux Mint) :frowning_face:

user@dell:~/Documents/Llama$ ./llava-v1.5-7b-q4-server.llamafile 
run-detectors: unable to find an interpreter for ./llava-v1.5-7b-q4-server.llamafile

Sweary words…
Looks like I need to learn to get CUDA set up correctly first.

llama_new_context_with_model: compute buffer total size = 162.63 MB
Available slots:
 -> Slot 0 - max context: 2048

llama server listening at

*** stack smashing detected ***: terminated
Terminating on uncaught SIGSEGV. Pass --strace and/or ShowCrashReports() for details.

Try wine ./llava-v1.5-7b-q4-server.llamafile and then see what errors you get.

I get “you need to install wine…”



I went over to the dark side a few years back and moved to macOS. This stuff all works on Apple Metal, which does make things much simpler.


Well, that worked, more or less. Thanks @Southside

Some issue with the webserver though

failed to open in a browser tab using /c/windows/explorer.exe: Error 1359 (win32 error 6)
loading weights...
HTTP server listening","hostname":"","port":8080}

After getting into nvidia driver hell, I managed to make some progress. Still not seeing anything at localhost:8080 but the error msgs are starting to make more sense.

The GitHub repo (Mozilla-Ocho/llamafile: Distribute and run LLMs with a single file) may help.

EDIT: sorry, forget about wine.

./llava-v1.5-7b-q4-server.llamafile worked for me eventually, after a LOT of faffing with NVIDIA drivers. I have a GTX 1650, which was once upon a time a pretty good card.

It’s working out of the box on Ubuntu :man_shrugging:


Working now, but I had to bash the box into shape a wee bit.


It’s no use at counting people. Dunno where it gets 12 from.

Scene from the wedding of an ex-colleague of @JimCollinxson btw

EDIT: It’s still inventing people. I asked it to identify the car in the photo below.

User: identify this vehicle

Llama: The vehicle in the image is a red car, possibly an old-style sports car. It appears to be driving on dirt or gravel and has been described as being “in motion.” This suggests that it may have been racing or participating in some sort of off-road event. Additionally, there are two people visible near the vehicle; one person is closer to the left side while another person can be seen further back towards the right side of the image. The scene captures an exciting moment as the car navigates through its environment.


Their names being “A” and “B” LOL

I still cannot find the guy holding his Tie up to show it off


“All work and no play makes JackGPT a dull AI.”

Be careful with that! :laughing:

On some Linux systems, you might get errors relating to run-detectors or WINE. This is due to binfmt_misc registrations. You can fix that by adding an additional registration for the APE file format llamafile uses:

sudo wget -O /usr/bin/ape https://cosmo.zip/pub/cosmos/bin/ape-$(uname -m).elf
sudo chmod +x /usr/bin/ape
sudo sh -c "echo ':APE:M::MZqFpD::/usr/bin/ape:' >/proc/sys/fs/binfmt_misc/register"
sudo sh -c "echo ':APE-jart:M::jartsr::/usr/bin/ape:' >/proc/sys/fs/binfmt_misc/register"
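As a quick sanity check (my own sketch, not from the original post), you can confirm a downloaded file really is an APE binary by looking for the same magic strings those registrations match on, "MZqFpD" and "jartsr":

```shell
# is_ape FILE -- succeeds if FILE starts with one of the APE magic
# strings the binfmt_misc registrations above key on.
is_ape() {
  head -c 6 "$1" | grep -q -e '^MZqFpD' -e '^jartsr'
}
```

For example, is_ape ./llava-v1.5-7b-q4-server.llamafile should succeed for a complete download.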


That works. Thanks @TylerAbeoJordan !


For some reason the server does not seem as good as the CLI. I have not tried images, but that part is not so great. However, the difference between nearly there and beyond belief is just more compute in training, and that is the amazing part.


It sees dead people :skull:
