locallama: Local LLM Chrome Chat

This talk demonstrates using a local GPU-powered LLAMA model via a Chrome extension to answer questions on any webpage, exploring future WebGPU integration and macOS app development.

Overview

Hey!

This extension uses your local GPU to run LLAMA and answer questions about any webpage.
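To make the idea concrete, here is a minimal sketch of how page text and a question could be combined into one prompt for the local model. It is an illustration only, not the extension's actual code: the `buildPrompt` name, the prompt wording, and the 4000-character budget are all assumptions.

```typescript
// Hypothetical helper (not from the extension's source): builds a single
// prompt string from the visible page text and the user's question.
function buildPrompt(pageText: string, question: string, maxChars = 4000): string {
  // Collapse runs of whitespace so the limited context is not wasted on blanks.
  const cleaned = pageText.replace(/\s+/g, " ").trim();
  // Truncate to a rough character budget (assumed value); a real
  // implementation would more likely count tokens than characters.
  const context = cleaned.slice(0, maxChars);
  return (
    "Answer the question using only the page content below.\n\n" +
    `Page:\n${context}\n\n` +
    `Question: ${question.trim()}\nAnswer:`
  );
}
```

In a content script, the page text could come from `document.body.innerText`, and the resulting string would then be handed to whatever process runs the model.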

I am planning to make it 100% free and open source!
I will need help soon, since this requires a macOS app that runs the LLM efficiently in the background, and I have no experience with that at all!

Another important technology I want to adopt is WebGPU, so that all computation happens in the browser.
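As a rough sketch of what that could look like, the extension might check for WebGPU and fall back to the background app when it is missing. `navigator.gpu.requestAdapter()` is the standard WebGPU entry point; the `pickBackend` function, the `Backend` labels, and the fallback policy are assumptions made for illustration.

```typescript
// Hypothetical backend labels: run in the browser, or use the background app.
type Backend = "webgpu" | "local-app";

// Minimal structural type so the sketch compiles without WebGPU type
// definitions; the real navigator.gpu.requestAdapter() resolves to a
// GPUAdapter or null.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

// Decide where inference should run. Pass `navigator` in a browser.
async function pickBackend(nav: { gpu?: GpuLike }): Promise<Backend> {
  // Browsers without WebGPU do not expose navigator.gpu at all.
  if (!nav.gpu) return "local-app";
  try {
    // A null adapter means the API exists but no usable GPU was found.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "local-app";
  } catch {
    // Treat any failure as "WebGPU unavailable" and fall back.
    return "local-app";
  }
}
```

Keeping the fallback means the extension would still work in browsers where WebGPU is unavailable.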

The original tweet demo is linked under Links below. DM me if you are interested and want to help me open source it.

This will be an alternative to Fireflies AskFred, Perplexity, Merlin AI and others!

Links

https://www.reddit.com/r/LocalLLaMA/s/AHzMP1qDJr
https://x.com/Karmedge/status/1718352446710222877
Extension uses local Llama model for private, offline page querying.

Tech stack