Autonomi AI Talk

This thread is for general discussion of the personal Safe AIs David has been mentioning as a possible first app for interfacing easily with the Safe Network.

15 Likes

We are moving further towards David’s vision at pace. We need to hit the ground running. This device is actually quite neat, and the reasoning, transparency, etc. make it better. My favorite bit was that they will allow you to choose which LLMs or AI engines you use with theirs in the future. :wink::wink::wink: see where I’m going?

7 Likes

@Nigel Thanks for starting a topic on David’s safe AI needs and making his comment that … “this is probably is a whole new thread” … real. Establishing the safe network as an autonomous, AI-enhanced network with, as David has also described, “a personal ai assistant with localised data” (a natural extension of the current safe primer) is fundamental to saving humanity’s data and humanity’s knowledge.

It comes down to a proprietary versus an open society. This will be the life choice each of us will have to make, and safe is a fundamental part of that! So this safe AI talk is a very important one to progress. To me it will come down to a gap analysis between the AI development product team and the safe/Maidsafe team, and establishing a working agenda that leads from a mutually beneficial testnet to a commercial outcome … ideally around an MVP using an existing client core dataset/agenda that the AI and safe product teams can use to identify how it can be done, i.e. a safe.ai product delivery … ideally with solid involved, as per David’s previous comments about reviving the safe-solid discussion/contact when appropriate. :slightly_smiling_face:

5 Likes

How to start 2024?

I suggest spending ten minutes catching up on the state of AI in the form of Large Language Models like ChatGPT…

Here’s a superb summary based on Simon Willison’s first-hand explorations with numerous LLMs, which he’s been generously sharing all year, even showing us how to run our own local LLMs on regular computers.

12 Likes

Excellent write up and one with which I totally agree. I think though that the fact code is perhaps the best use case for LLMs (at least right now) is not appreciated.

What I mean is not that folk don’t “get it”, but that what it actually means is not appreciated. If you can get an LLM to write code, write tests and run it all to prove it works, and you can define the tests yourself (test X, then test Y, etc.), then you have quite a powerful system. With that powerful system you can then expand software engineering exponentially.
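As a rough illustration of that test-gated idea (a minimal sketch, not anything from the Safe codebase): the human defines the tests, and a candidate is accepted only if it passes all of them. The two candidate functions here are hard-coded stand-ins for LLM output, and `failing_tests` and `first_passing` are hypothetical helper names.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int, int], int]
Test = tuple[str, Callable[[Candidate], bool]]  # (name, check)

# Human-defined tests: "test X and then test Y".
TESTS: list[Test] = [
    ("test X: 2 + 3 == 5", lambda add: add(2, 3) == 5),
    ("test Y: -1 + 1 == 0", lambda add: add(-1, 1) == 0),
]


def failing_tests(candidate: Candidate, tests: Iterable[Test]) -> list[str]:
    """Return the names of the tests the candidate does not pass."""
    return [name for name, check in tests if not check(candidate)]


def first_passing(candidates: Iterable[Candidate],
                  tests: list[Test]) -> Optional[Candidate]:
    """Accept the first candidate that passes every test, else None."""
    for candidate in candidates:
        if not failing_tests(candidate, tests):
            return candidate
    return None


# Stand-ins for two generated attempts (assumption: a real setup would
# get these from a model, and feed the failing test names back to it).
def buggy_add(a: int, b: int) -> int:
    return a - b


def good_add(a: int, b: int) -> int:
    return a + b


accepted = first_passing([buggy_add, good_add], TESTS)
```

The gate is only as strong as the tests: a candidate that passes X and Y can still be wrong on inputs nobody thought to check.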

What is missing right now is whole code base analysis and rewriting. It’s not far away though and then we could easily tell it, “add range based queries to rust-libp2p” then with that tell it to add anti sybil defences to safe_node and so on.

Right now though, well when we stabilise and confirm the client API, ordinary folk who have never coded could tell it “write me a dns layer for the safe network and embed that in a client”, then “use the dns layer module and write a web hosting solution” and so on. (when the models have absorbed that API or we fine tune an open source model on it)

We are right on the edge of this. But that is meh for me. What excites me is the completely creative folk who can then say things like

  • Write me a multisig wallet for safe
  • Write me a mechanism for a personal AI in SAFE that provides the full security and privacy of all my personal data.
  • Write me an interface between SAFE and the tesla bot (or any other robot)
  • Write me a desktop app that is completely in memory that I can log into any hardware I wish and leave no trace, but still access all my data.
  • Write me twitter/facebook/medium etc. on SAFE

And much, much more, and do all of the above in less than 5 mins of playing. The things I cannot imagine are immense, but when all the creatives get involved I am sure there will be apps (will they even be called apps?) that will blow our minds.

12 Likes

I don’t doubt there are uses, but I think many of the things you suggest there are questionable. Not that it can’t generate code that passes tests, but it’s a long way from a calculator that generates sums that add up to a human who can engineer a solution and maintain it, especially when the solution is to critical or safety-related problems.

So make me a blog, yes, but create me a wallet or sybil defence for a critical storage platform? I’m very doubtful.

And I think the danger here is very real because people will use it to do those and other critical tasks, and will deploy them, sell them and use them without an engineering team that has the understanding of the solution to consider all the ways it might go wrong and to minimise risks. What could possibly go wrong? :roll_eyes:

You can say, well, the LLM can do those roles too, and one day that may be so, but not until some pretty nasty things have happened due to a rush to be first, and testing on an unsuspecting public.

It’s one thing for a bridge to collapse and a train full of Victorians to act as a catastrophic learning mechanism by being plunged into a ravine, and another when your systems are responsible for critical functions relied on by millions or billions of people.

For me it is very concerning that we already have earlier AI deployed in dubious ways, and to see this rapidly amplified by the advent of LLMs with little discussion or even acknowledgement of the many risks this is creating.

1 Like

You can actually do this right now. The misconception is taking “LLMs cannot do maths right now” to mean they cannot do maths at all, which is wrong (the vanilla LLM RL part is terrible at maths, but it’s not hard to see it getting better, and in the meanwhile you can use tools to ensure correctness). OpenAI code interpreter, for instance, will write you a calculator and test it; here is a 10 second test of that https://chat.openai.com/share/20513467-519f-46b1-b018-3de2ca71b77f It can be expanded on with reliably sourced data sets as well.

I think using these extensively is an eye opener. Already there are hackathons being won by non programmers.

Where we agree is the computational capability of the builder/human and the ability to request enough testing and so on, but it’s already doable IMO

6 Likes

I wasn’t saying it can’t do maths, I was pointing out that it doesn’t have understanding of what it does, and can only generate things that are simplistic (calculator-like) responses to the inputs.

Simon’s examples of how hard it is to work out how to get a particular model to do what you want illustrates this.

If you asked Joshuef or any of your team to do any of the tasks you listed, the result would be very different from asking an LLM.

The process they would go through would be very different, being driven by understanding and their own initiative: to actively establish exactly what you meant, enumerate and evaluate options, research possible implementations etc. This involves an awful lot of skill and understanding, practice of methods, and not least the ability to realise they might not know how to do some or all of what is requested.

Working in a contract design and development organisation also teaches that the client rarely knows what they want, even though they believe they do.

As an analogy, it’s like everyone has a personal genie who will create whatever they ask for. What could possibly go wrong? :man_facepalming:

What’s missing is the engineering team that takes time to understand the domain and realise what is needed, what’s possible, and to help the client to understand what is practical and how to achieve it before even starting to discuss the possible solutions.

4 Likes

Right now AFAIK all of the guys will start with asking an LLM. It gets you the API calls you need, the libs you need and connects them. They will also have it write tests and fix bugs, right now. (see cursor.sh or gh copilot)

What I am saying is this already is enough in many cases (the hackathon examples). But, this will accelerate extremely quickly and I feel folk need to be on this train now. Yes understand the current limitations, but be ready.

[EDIT - maybe fun to run a poll of who has used an LLM to code and ask if they would go back to a vanilla IDE?]

6 Likes

I DON’T think that’s a reliable test.

See also my edits to the above.

I think we are almost saying the same thing in respect to the computational mind you need. If you are versed in programming or design or product engineering, then this is much simpler to do for certain.

An example: I was using an LLM last week that was text → CAD and had it create some enclosures for me that I could then 3D print. No AutoCAD etc. involved. I did have to know what I wanted, but I did not need to learn AutoCAD to do it. I see CAD engineers and software engineers being very different beasts in the very near future.

5 Likes

This is probably not exactly what you’re talking about, which is more about how intelligent, capable and functional the results would be, but it is a hacky way of giving code interpreter an entire code base to make changes to.

4 Likes

That’s my evening sorted :smiley:

3 Likes

Your CAD example is a good example use case because:

  • if it goes wrong, only you are affected
  • you know enough about the domain, problem and the solution to direct the process in a way that proved reliable.

What I’m seeing very little of anywhere is recognition of the difference between that and, say, creating a critical service that will be inflicted on millions. Creating software is one of the areas, probably the area, where that is most likely to go catastrophically wrong, and that’s why talk of it being so suitable without acknowledging this is dangerous.

Some no doubt recognise that and will do better, but my concern is that is not going to be the norm and that corporations, governments etc will press on whether or not they recognise the risks.

I think if you are going to be the only user of an AI generated thing that’s one thing, but if it is going to affect others, maybe millions that’s a completely different proposition.

2 Likes

Interesting and again, for me, very scary.

It’s like outsourcing your project to a programmer on the other side of the world who you know nothing about.

What agentgrunt et al. do depends entirely on what the underlying model was trained on and on the prompts and configurations applied to it, which you know little if anything about.

These could well lead it to include things in your code that you don’t want, are unlikely to notice, and may well not understand even if you look.

Whether by design or accident unwanted elements (phoning home, vulnerabilities, backdoors etc) may be introduced and only you will be responsible, not the provider of the AI.

Now, come to think of it, even using it to make a blog that only I will use seems a leap of faith.

3 Likes

It’s just like anything we use and blindly trust really. We use phones and operating systems that run on chips, none of which we make or most of us would be able to verify is safe or trustworthy.

Agentgrunt is interesting, but it’s just a tool like a calculator, and mostly prone to user error. The calculator is just doing what it’s programmed to do, and the user can definitely do the maths wrong by not following order of operations or using bad formulas, so the output will be garbage. On the other hand, was it programmed wrong? Maybe. Nothing is perfect, and we now know about bias being programmed into AIs, which is also a danger.

I don’t know, I just see these things as tools. Some better than others, those will lead to better designs, etc.
They are impressive but not fully capable, though that can’t be too far off. We should definitely be wary, and I like the idea of training our own LLMs most, but this stuff is happening and we either keep up or get left behind. That attitude alone is probably widely held and dangerous, I’ll admit. So I see you being quite wise in warning others to slow down and think harder about this.

5 Likes

Let us know how you get on!

2 Likes

I think what could be very useful is an IDE for writing code through an LLM.

I don’t think writing a single line like “Write me a Facebook on SAFE” would work in itself, but it could be used as a start.

When starting a project the IDE could ask what you want to build and you’d write “Write me a Facebook on SAFE”. What would then happen in the background is that it would create a number of prompts to make the LLM spell out a step-by-step plan for an MVP. You’d go back and forth a bit until you’re happy with the plan, and then it would start to make unit tests for the first parts. You’d look through the unit tests to see if what it’s trying makes sense; if it gets something wrong you’d tell it and go back and forth a bit more, making manual changes if the LLM got stuck. Then you’d make it implement code for the unit tests and run these. This should be done with something like tree of thought, where it’s basically creating a tree of possible solutions; reinforcement learning can then be used to improve the model.
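To make the “tree of possible solutions” part concrete, here is a toy sketch: a small beam search over a tree of candidate programs, scored by how many unit tests each one passes. This is only the search skeleton that tree-of-thought style approaches build on. The step names, the tests and the `search` function are all made up for illustration, and the hard-coded `STEPS` table stands in for whatever an LLM would propose at each node.

```python
from typing import Callable, Optional

# Hypothetical building blocks a model might propose at each step.
STEPS: dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

# Unit tests as (input, expected output); the intended behaviour is (x + 1) * 2.
TESTS = [(1, 4), (3, 8), (0, 2)]

Path = tuple[str, ...]


def run(path: Path, x: int) -> int:
    """Apply the steps in order to x."""
    for name in path:
        x = STEPS[name](x)
    return x


def score(path: Path) -> int:
    """Number of unit tests the candidate passes."""
    return sum(run(path, given) == expected for given, expected in TESTS)


def search(max_depth: int = 3, beam_width: int = 2) -> Optional[Path]:
    """Grow the tree level by level, keeping only the best-scoring branches."""
    frontier: list[Path] = [()]
    for _ in range(max_depth):
        children = [path + (name,) for path in frontier for name in STEPS]
        for child in children:
            if score(child) == len(TESTS):
                return child  # passes every test
        # Prune: keep the beam_width most promising branches.
        frontier = sorted(children, key=score, reverse=True)[:beam_width]
    return None


solution = search()
```

In a setup like the one described above, the per-branch test score is also the kind of signal that could be fed back as a reward, though how that would be wired into training is outside this sketch.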

It would be a very iterative process, and for some things, like making a simple browser-based game, it could probably be done by a person who is not very technical. In general though, until we have AGI, it is more useful as a tool for engineers who know the domain (or are learning/exploring it) to essentially write code as a kind of specification in English, but where a lot of accidental complexity goes away. What I mean by accidental complexity in this case is basically non-domain-specific code. If you’re writing a game, the essential complexity would be things like the rules of the game, while the name of the function to call to display the score would be accidental complexity. The LLM could be useful for suggestions on essential things too, like first generating some example rules for the game that you could then start iterating on.
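A small made-up example of that split, using a hypothetical three-dice game: the first function is the essential part (the rules, which the engineer has to decide), the second is the accidental part (plumbing that could be named or written a dozen ways without changing the game).

```python
# Essential complexity: the rules of the (hypothetical) game.
def score_roll(dice: list[int]) -> int:
    """Three of a kind scores 10x the face value; otherwise sum the dice."""
    for face in set(dice):
        if dice.count(face) >= 3:
            return face * 10
    return sum(dice)


# Accidental complexity: how the score happens to be displayed.
def render_score(player: str, score: int) -> str:
    return f"{player}: {score} points"


line = render_score("Ann", score_roll([4, 4, 4]))
```

The second function is the sort of thing you would happily leave to the LLM; the first is where you would want to read every line.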

3 Likes

This is really what cursor.sh is aiming for. You can select code and improve it, send errors to the LLM directly and have them fixed and applied. It tries to be an AI first IDE. It’s actually very good and works almost as you outline above.

4 Likes

It’s very early stage yet. Every Rust repo is 64 MB, which is too much for it, even after cargo clean etc. The process is good though, a bit like prompt stuffing. I suspect we won’t beat fine tuning on the codebase really. I will dive more into that though.
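For anyone wanting to try the prompt-stuffing route anyway, a rough sketch of trimming a repo down to what fits a size budget. Everything here is an assumption for illustration: the skipped directories, the file suffixes, the helper names and the budget are not taken from agentgrunt or any other tool.

```python
import os
from pathlib import PurePosixPath

SKIP_DIRS = {".git", "target"}              # assumption: history and build output are not needed
SOURCE_SUFFIXES = {".rs", ".toml", ".md"}   # assumption: what counts as "source" here


def is_source(path: str) -> bool:
    """True for files worth sending, given a '/'-separated relative path."""
    p = PurePosixPath(path)
    if any(part in SKIP_DIRS for part in p.parts):
        return False
    return p.suffix in SOURCE_SUFFIXES


def pick_files(files: list[tuple[str, int]], budget_bytes: int) -> tuple[list[str], int]:
    """Greedily keep the smallest source files until the byte budget is used up."""
    chosen, used = [], 0
    candidates = sorted((f for f in files if is_source(f[0])), key=lambda f: f[1])
    for path, size in candidates:
        if used + size > budget_bytes:
            break
        chosen.append(path)
        used += size
    return chosen, used


def scan(root: str) -> list[tuple[str, int]]:
    """Collect (relative path, size in bytes) for every file under root."""
    found = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            found.append((rel, os.path.getsize(full)))
    return found
```

Usage would be something like `pick_files(scan("path/to/repo"), budget_bytes=...)`. Picking smallest-first is a crude choice; ranking files by relevance to the requested change would likely do better.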

3 Likes