On-premises custom LLMs

Harness the most complex LLMs (Large Language Models) locally. Run Generative AI with full privacy: on-premises, customized models, state-of-the-art performance, offline and without subscriptions.

RaiderChip offers Generative AI hardware solutions based on affordable and readily available devices such as FPGAs. The main advantages are:

  • In the continuously evolving field of AI, FPGAs offer reprogrammability, which is paramount for adapting to new algorithms (for example, new quantization techniques) without upgrading the hardware or the underlying board.

  • Our IP cores use single-precision floating-point arithmetic (i.e. FP32), extracting the highest precision from LLMs. This translates into inference results of better quality and coherence, without curtailing the original model’s reasoning abilities (see the quantization sketch after this list).

  • Always available without any network connection, a must for meeting the privacy requirements of many applications (such as home assistants with in-home cameras).

  • Stand-alone operation, much like a set-top box, without bulky extra devices.

  • Runs forever, without any monthly subscriptions to cloud AI providers (e.g. ChatGPT).

  • Supports open-source LLMs with commercial-friendly licenses (e.g. Meta Llama-3, Microsoft Phi-2, Microsoft Phi-3).

  • Runs custom models fine-tuned for specific tasks (industrial control, home automation, consumer devices, etc.).

  • Runs at interactive speeds on low-cost FPGAs, enabling real-time implementations with Automatic Speech Recognition and Text-to-Speech engines (see the throughput estimate after this list).

  • Low power consumption: our efficiency advantage translates directly into minimal energy consumption per inference token (a worked example follows below).
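
To illustrate why FP32 arithmetic matters for output quality, here is a minimal sketch of the rounding error introduced when weights are quantized to a lower-precision format. The bit width and random weights are our own assumptions for illustration; this does not reflect RaiderChip's implementation.

```python
import numpy as np

# Minimal sketch: round-trip FP32 weights through symmetric 4-bit
# quantization and measure the resulting error. Purely illustrative.
rng = np.random.default_rng(0)
w = rng.standard_normal(1_000_000).astype(np.float32)  # stand-in for LLM weights

bits = 4
qmax = 2 ** (bits - 1) - 1                  # 7 for int4
scale = np.abs(w).max() / qmax              # per-tensor symmetric scale
q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # quantize to integers
w_hat = (q * scale).astype(np.float32)             # dequantize

rmse = np.sqrt(np.mean((w - w_hat) ** 2))
print(f"int{bits} round-trip RMSE: {rmse:.4f}")
# FP32 inference avoids this error entirely, which is why it preserves
# the original model's quality and coherence.
```

This rounding error accumulates across the billions of weights in an LLM, which is the source of the quality degradation that full-precision inference avoids.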
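For a feel of what "interactive speed" means in a speech pipeline, here is a back-of-the-envelope estimate of the token throughput an LLM needs to keep up with spoken output. The speaking rate and tokens-per-word ratio are generic assumptions, not measured figures.

```python
# Rough throughput needed for an LLM to match real-time speech.
# Assumptions (not RaiderChip figures): conversational speech runs at
# about 150 words per minute, and one English word is roughly 1.3 tokens.
words_per_minute = 150
tokens_per_word = 1.3

tokens_per_second = words_per_minute * tokens_per_word / 60
print(f"~{tokens_per_second:.1f} tokens/s to match real-time speech")  # ~3.3 tokens/s
```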
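The energy claim reduces to a simple ratio: energy per token equals average power draw divided by token throughput. The figures below are hypothetical placeholders chosen to show how the comparison works, not vendor measurements.

```python
# Energy per generated token = average power (W) / throughput (tokens/s),
# giving joules per token. Both example figures are assumptions.
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    return power_watts / tokens_per_second

print(joules_per_token(power_watts=15.0, tokens_per_second=5.0))    # 3.0 J/token (small FPGA, assumed)
print(joules_per_token(power_watts=300.0, tokens_per_second=50.0))  # 6.0 J/token (data-center GPU, assumed)
```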