The BitNet b1.58 model has an 8X or more smaller memory footprint than a conventional 32-bit LLM, so it can deliver responses to complex engineered prompts using fewer cores at the same memory bandwidth offered by processors with direct-attached memory, while using 8X less memory in the process. Per their paper, accuracy on complex prompts lands within 5% of the very best full-precision models quantized down to 8-bit Transformer stacks, which still consume 4-6X more compute cores, memory and power.
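For reference, the core trick in the BitNet b1.58 paper is "absmean" quantization: scale each weight matrix by its mean absolute value, then round every weight to -1, 0 or +1. A minimal numpy sketch of that idea (my own illustration, not the authors' code):

```python
import numpy as np

def absmean_ternary_quantize(W: np.ndarray):
    """Quantize a weight matrix to {-1, 0, +1} per the BitNet b1.58 paper:
    scale by the mean absolute weight, then round and clip."""
    gamma = np.mean(np.abs(W)) + 1e-8              # absmean scale factor
    W_ternary = np.clip(np.round(W / gamma), -1, 1)
    return W_ternary.astype(np.int8), gamma        # keep gamma to rescale at inference

# Toy usage: 16 fp32 weights (64 bytes) pack into 4 bytes at 2 bits/weight.
W = np.random.randn(4, 4).astype(np.float32)
Wq, g = absmean_ternary_quantize(W)
print(Wq)                                          # entries are only -1, 0 or 1
print(W.nbytes, "bytes fp32 vs", W.size * 2 // 8, "bytes packed ternary")
```

Because the weights are only -1/0/+1, the matmuls collapse into adds and subtracts with no multiplies at all, which is why fewer and simpler cores can keep up.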
I’ll cautiously say this is a seminal development by Microsoft and their Chinese research partners, one that needs multiple independent 3rd-party validations AND commercial competition to emerge. I suspect ternary computing will be revived in new silicon forms to tackle the AI inference challenge more effectively at much higher speeds, perhaps 10X or more in the medium term, which would mean an 80X or greater improvement over what we currently have.
N.B. The IOTA folks in Norway and Germany, as well as in Asia-Pac, were heading down this path a few years ago, then bailed on their plans for a ternary processor to handle Edge IoT gateway jobs serving lots of sensors, until the guy in Austria steered the IOTA Foundation into smart contracts and their Shimmer overlay network effort.
Maybe it's time for Autonomi to revisit that earlier IOTA work, which applied ternary math to their DAG PoW validation method, conceptually very similar imo to what Autonomi does with PoW today.
The point is that there are likely a few ternary-math people in the IOTA sphere who could be contacted and tapped as contractors to help @dirvine and co. create an Autonomi-specific LLM bolt-on inspired by BitNet b1.58, which in theory (and likely in practice) could run on everyday 10th-gen Core i5 notebook CPUs with 8GB of memory.
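Some quick back-of-envelope math on why an 8GB notebook is plausible, assuming a ~7B-parameter model packed at 2 bits per ternary weight (my numbers, illustrative only):

```python
# Back-of-envelope weight memory for a hypothetical 7B-parameter model.
params = 7e9
fp32_gb    = params * 4 / 2**30        # ~26 GB at 32-bit floats
ternary_gb = params * 2 / 8 / 2**30    # ~1.6 GB at 2 bits per packed ternary weight
print(f"fp32: {fp32_gb:.1f} GB, ternary: {ternary_gb:.1f} GB, "
      f"ratio: {fp32_gb / ternary_gb:.0f}x")
```

Activations and the KV cache still sit in higher precision, which is why the realistic end-to-end saving is closer to the 8X figure above than the raw 16X on weights alone.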
The idea would be to create a private Autonomi LLM that trains on whatever data the end user points it at, whether on their own local private storage or in secure Autonomi-stored data, fronted by an Autonomi genAI Assistant client accessible via the CLI, or via a plugin that loads into any Chromium-based browser. (Brave Browser is a reasonable place to start.)
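Purely as a hypothetical shape for the "point it at your data" step, something like this (every name below is invented; nothing here exists in the Autonomi stack yet):

```python
from pathlib import Path

# Hypothetical sketch only: gather user-permissioned local documents for the
# assistant to train on or index, without anything leaving the machine.
def collect_local_corpus(roots, suffixes=(".txt", ".md", ".pdf")):
    docs = []
    for root in roots:
        for path in Path(root).expanduser().rglob("*"):
            if path.is_file() and path.suffix.lower() in suffixes:
                docs.append(path)
    return docs

corpus = collect_local_corpus(["~/Documents"])   # folders the user explicitly picked
print(f"{len(corpus)} candidate documents found")
```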
In this way the proposed Autonomi genAI Assistant plugin could be uploaded to the Chrome Web Store and installed into any of the several Chromium-based browsers. Once installed, the plugin could, in a user-permissioned way, trigger the Autonomi safe client, safenode and safenode-manager installs plus wallet setup…
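One plumbing note: a Chromium extension can't launch local installers directly; it has to go through Chromium's native messaging API (`chrome.runtime.sendNativeMessage`) to a small host program registered on the machine. The length-prefixed JSON framing below is the real Chromium protocol, but the message schema and the exact safenode-manager subcommand are my guesses:

```python
import json, struct, subprocess, sys

# Sketch of a Chromium native-messaging host: Chromium frames each message as
# a 32-bit native-byte-order length followed by UTF-8 JSON on stdin/stdout.
def read_message():
    raw_len = sys.stdin.buffer.read(4)
    if not raw_len:
        sys.exit(0)                                  # browser closed the pipe
    length = struct.unpack("=I", raw_len)[0]
    return json.loads(sys.stdin.buffer.read(length))

def send_message(msg):
    data = json.dumps(msg).encode("utf-8")
    sys.stdout.buffer.write(struct.pack("=I", len(data)) + data)
    sys.stdout.buffer.flush()

while True:
    msg = read_message()
    if msg.get("action") == "install" and msg.get("user_confirmed"):
        # Fires only after explicit consent in the plugin UI; the exact
        # safenode-manager subcommand here is a guess on my part.
        subprocess.run(["safenode-manager", "add"], check=False)
        send_message({"status": "install started"})
    else:
        send_message({"status": "ignored"})
```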
Think of it as a Trojan horse operation to boost adoption of the Autonomi Network if you like, one that could go viral across Win11 and Linux desktops, and even Apple desktops via a Chromium-based browser on macOS.
It would probably mean making part of the dataset immutably public to support browser-plugin use of a useful LLM found on the Autonomi Network, optionally hosted as a node by Autonomi users who each contribute some CPU clock ticks to autonomously train up portions of the proposed BitNet b1.58-like public Autonomi LLM. Nobody could shut such a model down, because the inference/training nodes that come and go can be replaced in the same fashion the storage nodes work today. Anyway, it's raw concept stuff, food for thought.
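P.S. For the curious, a raw sketch of how that shard hand-off on node churn might work, loosely mirroring the Kademlia-style XOR-distance routing the network already lives in (all names below invented):

```python
import hashlib

# Raw-concept sketch: a training shard always belongs to whichever live node
# hashes closest (XOR distance) to the shard ID, so a departed node's shard
# is simply picked up by the next-closest peer -- node churn heals itself.
def node_key(name: str) -> int:
    return int(hashlib.sha256(name.encode()).hexdigest(), 16)

def shard_owner(shard_id: str, live_nodes: list) -> str:
    shard_key = node_key(shard_id)
    return min(live_nodes, key=lambda n: shard_key ^ node_key(n))

nodes = ["node-a", "node-b", "node-c"]
owner = shard_owner("layer-07", nodes)
print("owner:", owner)
nodes.remove(owner)                                   # that node churns out...
print("new owner:", shard_owner("layer-07", nodes))   # ...shard moves over
```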