Local LLM inference on desktop and mobile

Run real-time AI on users' devices

Engineered for real-time speed, data security, and cost reduction.

Fast, free, and open source

NobodyWho is an on-device inference engine that runs LLMs locally on the end user's device. Developers can embed fast language models into mobile and desktop apps with just a few lines of code. Because inference runs entirely on-device, no servers are required and applications scale automatically with user adoption, which significantly cuts both operating costs and infrastructure footprint. The result is offline, privacy-preserving, and sustainable AI. The library is open source under the EUPL 1.2 and is free for both individuals and companies.
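
NobodyWho's exact API depends on the host platform, so as a rough, hypothetical illustration of what a few-lines on-device integration looks like, the sketch below uses the independent llama-cpp-python package, which loads the same GGUF model files. The model path and parameters are placeholders, and this is not NobodyWho's own API.

    # Illustrative sketch only: llama-cpp-python consumes the same GGUF
    # format, but it is a separate project with its own API.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./model.gguf",  # placeholder: any GGUF model file
        n_gpu_layers=-1,            # offload all layers to the GPU if present
        n_ctx=4096,                 # context window size
    )

    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Say hello in five words."}]
    )
    print(reply["choices"][0]["message"]["content"])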

  • Fully offline to protect your data
  • Choose from thousands of open-weight LLMs
  • Hardware acceleration on GPUs and CPUs alike
  • Don't pay for inference
  • Works on all five major operating systems: Windows, macOS, Linux, Android, and iOS
  • Dead-simple to use

Cut costs. Shrink your footprint. Keep your data private.

  • Local, offline

    Everything runs on the end-user device. Great for software resilience, data privacy, and eliminating running costs. No more stressing about server capacity.

  • Fast on any hardware

    Powered by Vulkan for cross-vendor GPU support, Metal for fast inference on Apple devices, and SIMD instructions to fully utilize the CPU.

  • Keep your data private

    User data never has to leave the user's device. Provide hard privacy guarantees to your end-users.

  • Free for commercial use

    Want to build a company using the NobodyWho inference library? Go ahead! The EUPL 1.2 license permits commercial use at no cost.

  • Thousands of LLMs

    Because NobodyWho uses the GGUF open standard, you can choose from thousands of pre-trained LLMs of various sizes and capabilities. Pick one that suits your specific task!

  • Boilerplate-free tool calling

    Just pass in a function. NobodyWho will figure out the rest, and even generate a formal grammar to guarantee that the types match! A sketch of the underlying technique follows this list.
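
To make the tool-calling claim concrete, here is a minimal, self-contained Python sketch of the general technique, not NobodyWho's internal code: reflect over a plain function's signature to build a machine-readable parameter schema, which an engine can compile into a formal grammar (such as llama.cpp's GBNF) that constrains token sampling to well-typed arguments. The example function and its types are hypothetical.

    # Self-contained sketch of schema extraction for tool calling.
    import inspect
    import json
    from typing import get_type_hints

    def schema_for(func):
        """Build a JSON-schema-style tool description from a plain function."""
        hints = get_type_hints(func)
        json_types = {int: "integer", float: "number", str: "string", bool: "boolean"}
        params = {
            name: {"type": json_types[hints[name]]}
            for name in inspect.signature(func).parameters
        }
        return {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": params,
                "required": list(params),
            },
        }

    def get_weather(city: str, fahrenheit: bool) -> str:
        """Look up the current weather for a city. (Hypothetical tool.)"""
        return f"Sunny in {city}"

    # An engine would compile this schema into a formal grammar that restricts
    # sampling, so generated tool calls always match the declared types.
    print(json.dumps(schema_for(get_weather), indent=2))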