Run any LLM anywhere

Engineered for real-time speed, data security and cost reduction.

Fast, free and open source

NobodyWho is an inference engine that lets you run LLMs locally and efficiently on any
device. Our engine enables developers without prior LLM experience to embed fast,
scalable LLMs into mobile and desktop applications using just a few lines of code.
The library is open source under the EUPL 1.2 license and is free to use for both
individuals and companies.

  • Fully offline to protect your data
  • Choose from thousands of open-weight LLMs
  • Blazingly fast hardware acceleration on any device
  • Don't pay for inference
  • Works on all five major operating systems: Windows, macOS, Linux, Android, and iOS
  • Dead-simple to use

We're building it right.

  • Local, offline

    Everything runs on the end-user device. Great for software resilience, data privacy, and eliminating running costs. No more stressing about server capacity.

  • Fast on any hardware

    Powered by Vulkan to run on any GPU, Metal to run fast on Apple devices, and SIMD instructions to fully utilize CPUs.

  • Keep your private data private

    User data never has to leave the user's device. Provide hard privacy guarantees to your end users.

  • Free for commercial use

    Want to build a company using the NobodyWho inference lib? Go ahead!

  • Thousands of LLMs

    By using the GGUF open standard, you can choose from thousands of pre-trained LLMs of various sizes and capabilities. Pick one that suits your specific task! (See the GGUF header sketch at the bottom of this page.)

  • Boilerplate-free tool calling

    Just pass in a function. NobodyWho will figure out the rest, and even generate a formal grammar to guarantee that the types match! (The technique is sketched below.)
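
To make the tool-calling point concrete: the general technique behind it is grammar-constrained decoding. The sketch below is standalone Rust, not NobodyWho's actual code, and `get_weather` and every other name in it are made up for illustration. It derives a GBNF grammar (the grammar dialect used by llama.cpp) from a function signature, so the only strings a grammar-aware sampler can emit are well-typed calls to that function.

```rust
// Illustrative sketch only -- NOT NobodyWho's actual implementation.
// Shows the general idea: derive a formal grammar (here GBNF, the
// dialect used by llama.cpp) from a function signature, so the sampler
// can only ever produce a well-typed call to that function.

/// Parameter types supported by this toy generator.
enum ParamType {
    Number,
    String,
}

/// Build a GBNF grammar whose only valid strings are JSON objects of
/// the form {"name":"<fn>","arguments":{"<param>":<typed value>,...}}.
fn grammar_for(fn_name: &str, params: &[(&str, ParamType)]) -> String {
    let args = params
        .iter()
        .map(|(name, ty)| {
            let value_rule = match ty {
                ParamType::Number => "number",
                ParamType::String => "string",
            };
            // Each argument is its literal key followed by a typed value rule.
            format!("\"\\\"{name}\\\":\" {value_rule}")
        })
        .collect::<Vec<_>>()
        .join(" \",\" ");

    format!(
        r#"root   ::= "{{\"name\":\"{fn_name}\",\"arguments\":{{" {args} "}}}}"
number ::= "-"? [0-9]+ ("." [0-9]+)?
string ::= "\"" [^"]* "\""
"#
    )
}

fn main() {
    // `get_weather` is a made-up tool; any function signature works.
    let grammar = grammar_for(
        "get_weather",
        &[("city", ParamType::String), ("days", ParamType::Number)],
    );
    println!("{grammar}");
}
```

Fed to a grammar-aware sampler, this guarantees the model's output parses as a call to `get_weather` with a string `city` and a numeric `days`, no matter what the model would otherwise ramble.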
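
The GGUF claim is just as concrete: every GGUF file begins with a tiny fixed header (magic bytes, format version, tensor count, metadata entry count), so you can inspect any downloaded model with nothing but the Rust standard library. A minimal sketch, independent of NobodyWho's APIs; the layout follows the published GGUF spec:

```rust
// Minimal GGUF header peek, standard library only.
// Per the GGUF spec, a file starts with the magic bytes "GGUF",
// then a little-endian u32 version, a u64 tensor count, and a
// u64 metadata key/value count.
use std::env;
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    // Path to any .gguf model, e.g. one downloaded from Hugging Face.
    let path = env::args().nth(1).expect("usage: gguf-peek <model.gguf>");
    let mut header = [0u8; 24];
    File::open(path)?.read_exact(&mut header)?;

    assert_eq!(&header[0..4], b"GGUF", "not a GGUF file");
    let version = u32::from_le_bytes(header[4..8].try_into().unwrap());
    let tensors = u64::from_le_bytes(header[8..16].try_into().unwrap());
    let metadata = u64::from_le_bytes(header[16..24].try_into().unwrap());

    println!("GGUF v{version}: {tensors} tensors, {metadata} metadata entries");
    Ok(())
}
```

Everything else about a model (architecture, tokenizer, quantization, the tensor data itself) lives in those metadata key/value pairs and tensor records, which is exactly what lets a single engine load thousands of different models.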