Ai_034 - Small Models Reign Supreme

September 13, 2024 01:38:38
Bitcoin Audible

Hosted By

Guy Swann

Show Notes

Will the increasing shift towards smaller, more efficient AI models revolutionize the way we interact with AI? Can open-source models and fine-tunes like Hermes 3 and Grok 2 challenge the dominance of proprietary models? And what does the future hold for AI development as the hype and investment surrounding AI begin to wane? Join me as I explore these questions and more on this episode of AI Unchained.


Episode Transcript

[00:00:00] A plethora of major announcements in open source models. A new standard for training and open licensing backed by the Linux Foundation. New tools for building AI workflows where all you have to do is talk to them and they will build it for you. A shift in the enterprise ecosystem towards smaller, lighter weight models that they can run in the cloud, and what may very well be a soon-to-come massive reallocation of capital in the corporate AI world. It all amounts to a pretty crazy August in the AI space, and a really interesting episode to dig into all of it.

[00:00:42] I am Guy Swann and this is AI Unchained.

[00:00:54] What is up guys? Welcome back to AI Unchained. I am Guy Swann and we are diving into the open source and self-sovereign side of AI, and I've actually got a bunch of stuff to cover. I know we did not have an episode last week; I think it's been about two weeks since the last AI Unchained episode. And just a heads up: for my own sanity, I've been trying to figure out how to rein in a bit of my publishing schedule and workload, so we will be doing AI Unchained every other week now. I'm not stopping it. What I'm probably going to be doing is leaning more towards video work, because I think the hands-on side of using the tools and learning is a better thing to focus on, even though fewer people will see it.
[00:01:54] I think that is a higher value thing to explore. The next video I'll be doing, just because I've had a number of people ask me about it and it's a common question, is walking through how to set up and use Pinokio and what I use that tool for specifically. I think that's something really important to cover, and it's also something where I can go through my setup, how I think about these tools, how I make use of them, and how I combine them with the Pear stack to make them accessible on essentially all of my devices. No matter where I am, I'm able to utilize the resources that I leave sitting at home plugged in as my major AI machine, my Linux machine that can do enormous amounts of computation for me, and I don't have to think about it, or worry about burning through my phone's or my MacBook's battery just to do some inference or image generation or something like that. So that's just an update.

[00:03:02] As for the podcast feed itself, I may still just do small updates to let you guys know that I've got a video, and maybe hit a news item or two when it feels relevant. But otherwise I'm probably going to make the audio-only episodes a mixture of a bunch of different things, and then cover the things we've done on other platforms like Nostr, with video published on YouTube and Rumble. Just a heads up that I am cutting back on my workload quite a bit. My problem is that I put myself under the obligation of a schedule that I can't hit, and then I constantly go through everything feeling this huge weight of being behind, and feeling guilty, when I've just set myself something that I cannot accomplish.

[00:03:58] And right now, in my mind, the weekly AI episode seems like one of the easiest places to cut.
I've cut in a number of different places to give myself some breathing room. But this also seems like one where there have been a couple of episodes where I've just been doing updates, because I've been preparing for something that literally takes two weeks to fully flesh out. And maybe it's just because the AI stuff is still something that I'm learning more about, and I'm changing my opinions and thoughts about things way faster, so I feel less certain about what I'm covering when I'm covering it. Whereas with something like Bitcoin Audible, it's not very difficult at all to do three episodes in a week, because I have a much deeper foundation for how to assess and what to think about the news items or the new tech or a new article that I've read. I don't feel like I'm going into it blind, and it's not infrequent that I feel like that with AI stuff. And I don't know if my musings on it are even useful, or if I'm just completely talking out of my ass. So anyway, just a heads up there; let me know what you think.

[00:05:11] Send me a message on Nostr, tag me on Nostr or Twitter, or send me a boost on Fountain. And shout out to everybody who does boost on Fountain and does value for value. Thank you guys so much. And if you haven't checked it out, I will still do the promotions for an episode every couple of weeks.

[00:05:33] That is where you can literally get paid sats to listen to the episode. So if you haven't used Fountain and you want a way to literally earn a little bit of sats, that is by far the easiest. And then you can literally zap on Nostr, you can receive sats on Nostr, and you can zap other people. It's a really great, super low barrier way to get your foot in the door and start trying this out in a very simple custodial fashion with the Fountain wallet.
So thank you to everyone who is over there, loves exploring that, is on Nostr, and does zaps and boosts.

[00:06:12] It's a huge help and I genuinely appreciate it. So always a shout out to you guys. All right, so there have actually been quite a few news items from the open source world.

[00:06:32] I believe we've talked about Flux in one of the previous episodes, but I've gotten to use it quite a bit more. Flux is a fantastic model, and I'm starting to see a lot of LoRAs and some fine-tunes of this one.

[00:06:48] So it's very possible it catches up, even though, with the foundation and the breadth of complexity and customization that you can do with it, Stable Diffusion 2 is still king in the open source, self-running ecosystem.

[00:07:02] SDXL never really replaced it. It was an improvement in a lot of ways, but it did not counter the fact that Stable Diffusion 2 was still far and away the one with the largest amount of customized tools to work with it.

[00:07:21] So the results that you can get with Stable Diffusion 2 in a lot of ways are still just fantastic in comparison.

[00:07:28] But I'm encouraged. It's interesting to see Flux start to get those modifications too, and start to get built out and more accepted.

[00:07:38] So it'll be interesting to watch, because a lot of the thinking about the network effect and the lock-in of really good open source models is that so much of the work and so much of the infrastructure can't just be copy-pasted.

[00:07:59] It's extremely similar, actually, for anybody who's been in Bitcoin, which I know a lot of my audience is.

[00:08:07] You recognize the amount of security and decentralization and liquidity, and everything about the money that has nothing to do with the blockchain specifically.
[00:08:18] And this cannot be replicated. So when you fork it off and you make some sort of a shitcoin, you don't take any of that stuff with you. You don't take the mind share, you don't take the liquidity, you don't take the global network, you don't take the hundreds of thousands of nodes, you don't take the miners, all of this stuff. The analogy I like to use all the time is that the recipe is not the result.

[00:08:51] In the sense of an organism, in the sense of a human, the recipe is the DNA, right? You have a genetic map that, if an egg is fertilized, will make this person.

[00:09:05] But you have many times where twins who are genetically identical are not the same people. They grow up very differently. And more importantly, if you have a grown man who has made it through childhood, made it through the teenage years, lived through trauma and survived, has stood the test of time, and he is strong and healthy and he can cut down a tree for you, then you know you have a lumberjack, and your goal is to cut down a tree.

[00:09:41] Well, if you copy that person, if you clone them into an infant inside of a womb, the recipe is the same. But if your goal is to cut down trees, only the lumberjack can do that.

[00:09:54] The clone is useless to you.

[00:09:58] And in a sense, this is the Flux versus Stable Diffusion 2 dynamic.

[00:10:03] Stable Diffusion 2 has enormous amounts of infrastructure. It's been tested, it has been customized. And this is why it still reigns supreme in the open source image generation world: simply because it's utilized, it's crowdsourcing everything, all of the infrastructure and all of the stuff around it. And this was one of the big problems I had with the whole idea of closing down LLMs, making them proprietary and treating it like the Manhattan Project, like we're going to create these superhuman language models.
[00:10:41] When in the very same Situational Awareness paper, the author goes into extensive detail as to how unrealistic that is, and how easy it is to just copy the model, get it on a USB stick, get it out, and boom, China has it. Which means that if we go the Manhattan Project route and make everything closed and proprietary, and just go as fast as we can, as deep as we can, and as capital intensive as we can to get the biggest possible models, to birth superintelligence, or, you know, the LLMs that are better than all average humans at everything...

[00:11:24] Well, all we do is make sure that no one can access it except through some giant centralized government or corporate entity, and that all other governments and giant corporate entities will be able to get it if they want it. If they want to get a hold of it, they'll be able to do corporate espionage or government espionage, and they're going to take it out on a USB stick, and as soon as it's out, they're going to send it to their, you know, terrorist buddies, and they're going to send it to their other government, etc. And literally the only people who won't have access to it will be the people who actually need it to defend themselves. Not only from a bad government, not only from a potentially terrorist organization, but from the same corporations and the same governments in their own country that want to use it against them. Because it creates an unbelievable power dynamic between the people and the large institutional entities.

[00:12:16] But you know what you can't copy and paste? You know what you can't just go send to the CCP? Infrastructure.

[00:12:24] Liquidity, modifications, variants, knowledge, people who work with it, all of the tools that we have learned how to work with, and the customizations that we've built on top of it.
All of that stuff is something that all the people can use, and it's exactly how we get all of the benefit out of these things. It's our market share, it's our ability to build companies that integrate these things. It's us already having positions in the market, and having tools implemented in software and services available to people, that can be easily utilized to make their lives better, done in a very broad and decentralized way by tons of different companies. That's what open source enables, that's what open source makes possible.

[00:13:13] And if we happen upon superintelligence in that mix, it becomes that much easier to both defend against and utilize in a positive manner. Because all of this development and all of this infrastructure building has happened in the open, and we have a hundred thousand times as much knowledge to utilize on what produces good and bad results, and specifically on what a good and a bad result even is. Because that is a judgment that we cannot leave to a bunch of giant centralized governments and entities.

[00:13:49] But anyway, all of this is to say I'm actually really hopeful, continuously, with what I think I'm seeing in the space, and more specifically in how these things are being used. That's why I've titled this one Small Models Reign Supreme: increasingly, the best use cases and the best implementations are piecemeal.

[00:14:14] They're one specific use case, or one very targeted thing, which is something that I have been talking about quite a bit, and I've gone back and forth about whether, you know, the whole giant models are really useful and computationally efficient.

[00:14:32] But I love how many times I keep coming back to Dhruv Bansal. When we had him on the show it was such a fun conversation, and what I loved about it was that he did not know about the inner workings of LLMs.
He did not know about the inner workings of the technology itself. But he did such a good job of applying fundamental principles and large evolutionary concepts to the technology that he could infer or intuit how the technology was likely to evolve, in the absence of knowing explicit technical details about it. And increasingly he appears to be correct.

[00:15:22] So that's exciting to see: just understanding the theory, the game theory of evolution and fundamental principles, can potentially give a lot of insight into where these things are actually going to go and how they're going to evolve.

[00:15:42] And one of the biggest news items that I think has not been covered very well, or at least not talked about very much that I've seen, is that the Linux Foundation has welcomed the OMI, the Open Model Initiative, and they will be promoting openly licensed AI models. And I'll tell you something else, just from my personal experience: I have been moving more and more to open source models. There was a time when I wasn't using them very much. I was relying almost entirely on ChatGPT for code, and then I mostly shifted to Claude. I was using Llama, and I did in fact use a lot of open source, but it was not my primary mode of interaction.

[00:16:43] Whereas now things have changed.

[00:16:47] When I'm working with most of my subscriptions, whenever this is an option, I will either use privacy.com or something like The Bitcoin Company's service to get a Visa debit card, basically, and then I will just put in exactly the amount that I am willing to risk if I forget about the subscription. So if I don't update it, or I don't remember to cancel the subscription, it will just burn through X amount of money.
[00:17:27] And then transactions will just start failing, and I don't have to go cancel it. I don't have to think about it, so it doesn't bleed me for two years with a $5 subscription, or God forbid a $20 subscription, which I have had happen before, only to realize that I had mistaken it for something else whenever I saw it.

[00:17:53] And I hadn't realized that a service I haven't even logged into has just been eating my account.

[00:18:01] And so, as I've said, I signed up for a year of Venice because it was only 50 bucks a year. And they are using the open source Llama 3.1 405 billion parameter model, which is an enterprise level model that now literally anyone can run.

[00:18:18] And I've also been using Claude, and I've mostly been using Claude, but I jump over to Venice quite a bit now.

[00:18:29] When I signed up for Claude, I used one of those prepaid Visa debit cards with three or four months' worth of funds, and it just ran out. And so when I went back to Claude, I no longer had the Pro subscription. And so I was like, all right, screw it. I'll just take the things that I'm usually doing over on Claude with Anthropic, go over to Venice, use Llama 3.1 for a little while, and just see how I like it.

[00:18:57] I don't really know.

[00:18:59] So I've been using it for two weeks, almost exclusively.

[00:19:05] And I have a hard time justifying going back and spending $20 a month when I'm spending $50 a year, like $4 a month, for Llama 3.1.

[00:19:18] Maybe in certain instances it's worth it. It's still hard to tell. Like I said in one of the previous episodes about open source, I'm getting to the point, and I use the graphics-in-video-games analogy here, where I can't really tell the difference.
Like, I've been unhappy with some of the answers from Llama 3.1, especially descriptions of the show and stuff. Sometimes it's like, that is not really the heart of what we were even talking about on the show.

[00:19:50] And it's a little bit too on the nose.

[00:19:54] It's not really what we were talking about. It's like a bullet-point list of the specific items that we talked about.

[00:20:03] So it's not perfect for every use case, but I also have almost never used a completely AI generated description.

[00:20:13] It usually is just a great way to remind me what the show is about. And sometimes there are one or two good sentences where I'm like, okay, I can mostly just leverage this, and then put in the rest of my context from there and try to put it in my language.

[00:20:31] And even though I've been disappointed with a couple of them from Llama 3.1, I have been disappointed with a ton of them from Claude.

[00:20:40] So I don't know. Again, I'm getting to the point where I can't tell a difference. And if Claude is better on average, it's hard enough for me to tell; it's hard for my brain to calculate that average in such a way that it convinces me that Claude is noticeably better.

[00:21:05] And even if it is, like, 5%, 10%, even 20% better, is it 500% better? Because that's the price difference.

[00:21:18] And I'll tell you right now, it's absolutely not. Or I guess, if it's five times the price, it's a 400% price increase. So it's not 400% better.

[00:21:31] And if I'm perfectly honest, I can't tell if it's better.

[00:21:35] Like, if it's better, it's only barely better.

[00:21:39] Now, in relation to a lot of these things that are happening and how these models are being developed, I found an article that I thought was really cool. What was it called?
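Just to sanity-check the rough arithmetic above, here is a tiny sketch comparing the two price points mentioned ($20/month versus $50/year). The numbers are the ones from the episode; nothing else is assumed.

```python
# Rough sanity check of the price comparison above:
# a $20/month subscription versus a $50/year one.
monthly_price = 20.00        # the $20/month plan
yearly_price = 50.00         # the $50/year plan

per_month_equivalent = yearly_price / 12       # about $4.17/month
ratio = monthly_price / per_month_equivalent   # how many times more expensive

print(f"${per_month_equivalent:.2f}/month vs ${monthly_price:.2f}/month, {ratio:.1f}x")
# About a 4.8x ratio, i.e. in the ballpark of the "400-500% more" figure above.
```

So the "is it 500% better?" framing is roughly right: the annual plan works out to a bit over $4 a month, putting the monthly plan at close to five times the cost.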
[00:21:55] Small Language Models and Open Source Are Transforming AI. Now, I'm not going to read the article, and you probably don't need to. It's not a super in-depth article, but it hits a lot of key points that really seem to reflect or align with what I have been seeing, with a number of things that we've talked about on the show, and that I think properly address or account for changes that are happening at the enterprise level of this technology, and why things are moving in this direction.

[00:22:40] And we're in an economic downturn at the exact same time.

[00:22:47] And, you know, depending on who you ask, you will get people who think that GDP is a meaningful metric at all, and people who don't recognize or understand what is genuinely meant by a downturn.

[00:23:03] Because whether or not it fits the concrete definition of a recession, I don't think, is important. I'm talking about everybody's situation getting worse. People can afford less and less. That is absolutely happening. And if you think GDP going up by 2% means anything when the government printed $2 trillion out of thin air this year, or 3 trillion, I don't even know, trillions and trillions of dollars have been printed out of thin air this year and trillions in new debt have been issued from nothing, then we simply have a problem with economic literacy that is not worth the time or effort to get into. And that's not what this show is about.

[00:23:44] But there is a reason why everybody, everywhere is cutting costs, why small businesses are shutting down, why that place that you used to eat at is no longer there, why you were probably buying crappier stuff at the grocery store and still paying more for groceries than you used to two years ago. This is because we are in an economic downturn. Everything has gotten more expensive.
Everybody is cutting quality because everybody is trying to get by.

[00:24:17] Now, what happens when this hits tech? What a lot of people thought was that LLMs were just going to get so good that they were going to replace people, and there would be lots of layoffs. Now, there have been lots of layoffs, but this is not really entirely connected to replacing people with AI. Mind you, there has been some of that. But I think we're also coming down from a really big development and programming economic bubble, where way more people got hired, and way more things were all going to be remote, and everything was going to be software, and everybody was going to do everything. I think we went through a cycle where COVID in 2020 and 2021 just blew the amount of sensible capital to allocate in this regard out of the water, and now there is a correction being made. And while many people are speculating that it's connected to AI, because there is evidence of AI replacing certain jobs or certain parts of jobs, and more specifically putting more onus and more workload on fewer people in a lot of contexts, I think a lot of what we are seeing are just massive cost shifts, massive shifts in the market that were not accounted for, and overinvestment that had occurred in particular areas that are now being corrected for. And one of the things we're seeing is that enterprises, big and small businesses, are seeing enormous inference costs, especially AI businesses. In particular, the cloud costs of hosting models. This is how most businesses get their infrastructure: they pay for data centers, they pay for server capacity, they pay for compute.

[00:26:15] And during the hype of all of this, everybody was shifting. We were all going to use ChatGPT, and everybody signed up, and everybody was using this thing, and, oh, it was all going to be APIs.
But those costs stack massively, and the return on what they're using a lot of these models for hasn't really been there.

[00:26:38] A lot of it is a gimmick. A lot of it sounds good but doesn't work so great in practice, or is only for very specific use cases. You know, image generation is great in a lot of ways, but it's not the broadest tool in the world. It's good for marketing, and it's good for certain types of compositing and storytelling and all of this stuff. But it's not like the old models and the old tools are going to go away, and a lot of people just don't want to use generative AI. And I'll also tell you, as someone who has tried to get AI to help him write:

[00:27:18] it's not that great for that, unless you just want it to replace you, which completely defeats the purpose. Writing is great because it forces you to make your ideas concrete and to really think through them in a deep, thorough, iterative fashion. Like, I will write down, I think, these three sentences. No, that's not a very good way to explain it. Or, that's going to send me down this tangent that's not necessary. Let me rethink the first two sentences and use this one sentence that I think is really good. Oh God, I have to work out this detail, which I realize I've just been kind of glossing over, and when I get it down on paper, it has to be there, because this thing also has to be addressed in order for this concept to actually paint the right picture in the person's mind. It's that process which makes writing useful. If I'm using AI for that, yes, there's some degree of, okay, I can get as much of the idea as I can out of my head in a rough fashion, and I can get it to clean stuff up. But it can't intuit something that I haven't figured out how to explain myself.

[00:28:25] It's not going to come up with it if I have any type of a novel idea or a novel analogy.
In trying to understand or articulate some sort of a concept, or some sort of shift that's occurring, AI isn't going to just write it for me. In fact, AI may very well ruin the rough draft, because it doesn't understand what I'm trying to get at. It has the same problem of failure to understand the idea, because I haven't gotten the idea out. And if that's the case, well, then LLMs just become a really expensive Grammarly.

[00:29:00] In fact, one of the things that I have found myself using large LLMs for more than anything else is just summarizing stuff, using Extract Wisdom. I'll read the article and then run the Extract Wisdom pattern from Fabric AI, which is a series of prompts. I'll put the link in the show notes. It is such a great way to get through, quote unquote consume, an enormous amount of content and information, and specifically to decide what you want to go back and actually read in full. And that is something that I think is not properly appreciated, I guess, is probably the way to put it. LLMs are so good at summarizing and pulling information out of things, and they're really great for simple copy or marketing. Again, I use it for descriptions for my show, and again, I almost never use exactly what it puts out, but it is a great place to start.

[00:30:12] But notice, in the context of writing, that is not me getting it to come up with a new idea, or getting it to write something that I haven't sorted out in my own mind. It's not a book that I'm getting it to write for me. I'm getting it to take a very long, drawn-out, in-depth idea and make it simple, make a shortened version of it that shares the pieces of that idea so that people can judge for themselves whether the in-depth version is worth it.
[00:31:09] And I think using this in your own life, with the content that you're trying to go through, the content that you're trying to digest, becomes a really, really useful tool if you use it frequently and properly. Because let's say you have a 10-video and 20-article reading and watch list, and you don't know where to start, or whether you should start slimming that down and removing things.

[00:31:32] Well, do a short summary and bite-sized breakdown of each of those things, and read those summaries, which might be the length of an article or two.

[00:31:44] And now you might know exactly which video you ought to watch, or exactly which article you should actually sit down and read in full.

[00:31:54] So anyway, going back to the idea of how enterprises, small and large businesses, are shifting: everything first went to APIs, but that cost stacks a lot, and the return from what they are getting for that cost doesn't seem to justify it. Especially when you go back to a fundamental principle that we talked about with Dhruv, which I thought was a really great thing to keep in mind, and which is going to be really interesting to constantly reframe things with as the market and the ecosystem around this develop: general intelligence is extremely inefficient.

[00:32:38] Targeted, application or situation specific intelligence is far better at its job. And one of the examples he used was network intelligence: knowing how to deal with network connections, firewalls, and dynamic network stuff. You could have a machine learning algorithm that learns how to best allocate bandwidth and respond to new requests coming in, and DDoS, and all of these sorts of things that you can train for.
And the last thing you would want it to do, unless you specifically needed to communicate with it in some context, is understand language. Because that's an enormous amount of compute that you are having to work with just to understand the English language, when its job is to understand networking dynamics, connections, bandwidth, and all of that. It could have a specific intelligence in a 1 billion parameter model, something small that could run on almost anything and have incredible intelligence that has nothing to do with mimicking a human. And then you could put a 1.5 billion parameter language model on top of it, just so that you can interface with it. And now you've got a really tight model. Lean is the word they use in the article: a lean model that does a job incredibly well, and specifically has a great way for you to interface with it in order to judge or alter its actions, or how it should think about certain behaviors or situations, etc. You have something that's very computationally efficient, potentially extremely useful in the situation it's built for, and that you can self-host and run locally. And a big part of all of this is that the cloud platforms and services that want to utilize AI are pivoting. There's a huge pivot to smaller language models, because it just scales way, way better and it keeps their costs down. They're used to allocating a certain amount of cost to their data infrastructure and their cloud compute and all of this stuff, and those costs have soared in adding what is a pretty minimal new capacity or feature set. If you're doing something like a ChatGPT API, well, you need to make sure that you're only using it specifically when you need something as compute heavy as a ChatGPT level model.
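The "lean model" shape described above can be sketched in a few lines of code. This is purely an illustrative toy, not any real system: the proportional allocator stands in for the trained specialist model, and the keyword matcher stands in for the small language model layered on top as an interface. All names and logic here are hypothetical.

```python
# Toy sketch of the lean-model idea: a small task-specific component does
# the real work, and a thin natural-language layer sits on top purely as
# an interface. Everything here is an illustrative stand-in.

def allocate_bandwidth(demands: dict[str, float], capacity: float) -> dict[str, float]:
    """The 'specialized intelligence': split capacity proportionally to demand
    (a stand-in for a trained network-management model)."""
    total = sum(demands.values())
    if total <= capacity:
        return dict(demands)  # everyone fits; grant demands as-is
    return {client: capacity * d / total for client, d in demands.items()}

def language_interface(request: str, demands: dict[str, float], capacity: float):
    """The thin 'language model on top': routes a phrase to the specialist.
    A real system would use a small LLM here; this is just keyword matching."""
    if "allocate" in request.lower() or "bandwidth" in request.lower():
        return allocate_bandwidth(demands, capacity)
    return "Sorry, I only handle bandwidth questions."

result = language_interface(
    "please allocate bandwidth fairly",
    {"alice": 60.0, "bob": 90.0},
    capacity=100.0,
)
print(result)  # {'alice': 40.0, 'bob': 60.0}
```

The point of the split is the one made above: the specialist never pays the compute cost of understanding English, and the interface layer never needs to understand networking.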
And this is another interesting thing that the Apple iPhone 16 and its AI capabilities bring up: there are tons of small models running on that phone. There is an enormous amount of locally run AI and machine learning making targeted use of, you know, providing some sort of service or context for the user.

[00:35:57] And specifically, only when it appears that there is no obvious solution within the locally hosted AI does it then go to something like ChatGPT.

[00:36:08] And it's interesting that even a huge corporation with all the capital in the world, like Apple, is still going that route: utilizing small models is still becoming a broad part of all of the infrastructure. Things are shifting that way, away from "let me just plug into an API and try to host this for a thousand customers, ten thousand customers," because that cost stacks; it gets bigger and bigger and bigger. What appears to be happening is that they're pivoting towards smaller models with very specific use cases, so that they can host them themselves, or host them on smaller cloud providers at lower cost, because they're not getting the extra benefit that the extra cost is essentially requiring of them. And a perfect example, just my little microcosm of a world where I'm seeing the same thing: I'm using Llama 3.1 because Claude is just not 400% better, and I just don't think I'm going to re-up. Unless there's something with code that Llama can't do, and maybe I'll pay for one month of Claude just to see if Claude can do it. But increasingly I'm not even sure that's going to be the case, with Melty and Cursor and a lot of these other things that I'm trying out and starting to explore more.

[00:37:33] It just kind of seems like open source is going to be the thing that really solves the problem and makes this available.
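The local-first pattern described above, where the small on-device model answers when it can and only escalates hard cases to an expensive cloud model, can be sketched roughly like this. Both "models" are stubs and the confidence threshold is an arbitrary illustrative value; this is not Apple's actual routing logic, just the general shape of it.

```python
# Minimal sketch of local-first AI routing: try a small local model first,
# fall back to a big cloud model only when the local answer is unconfident.
# Both models are stand-in stubs; the threshold is an assumption.

CONFIDENCE_THRESHOLD = 0.7

def local_small_model(prompt: str) -> tuple[str, float]:
    """Stub for an on-device model returning (answer, confidence)."""
    if "weather" in prompt:
        return ("Sunny, 72F", 0.9)   # a task it handles well
    return ("I'm not sure.", 0.2)    # low confidence on everything else

def cloud_large_model(prompt: str) -> str:
    """Stub for an expensive cloud API call (e.g. a frontier model)."""
    return f"[cloud answer for: {prompt}]"

def answer(prompt: str) -> str:
    reply, confidence = local_small_model(prompt)
    if confidence >= CONFIDENCE_THRESHOLD:
        return reply                 # free, private, stays on-device
    return cloud_large_model(prompt) # pay (in cost and privacy) only for hard cases

print(answer("what's the weather"))  # handled locally
print(answer("prove this theorem"))  # escalates to the cloud stub
```

The economics follow directly: if most queries clear the threshold locally, the per-user API bill shrinks to just the hard tail of requests, which is exactly the cost pressure driving the enterprise pivot described above.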
[00:37:39] And even better is that it also means I can integrate it. I can really integrate this into the things that I'm doing, and services, and the tiny apps that I build. And if you can figure out how to make this work, if you can get a targeted or even a fine tuned small model to accomplish some sort of a task, man, think about the benefit of not having to rely on something with a huge cost for something small, and the benefit of being able to use AI for some small specific task that before you literally had to do manually. [00:38:17] But another thing in addition to this. I don't mean to say that the large models will be obsoleted completely. [00:38:29] What I mean to say is, and in fact that's another news item that actually happened today, before I had even started working on the show, is that OpenAI has now announced their new model. [00:38:43] I don't think it's accessible yet, I don't think you can use it, but they've announced the model they were labeling as Strawberry, which is basically GPT-5, but I think it's called o1 or something. [00:39:00] Their naming conventions make no sense whatsoever. But regardless, the big thing the new model is extending is this ability to quote unquote think. [00:39:16] What they've done is they've created some sort of a reasoning framework where it can essentially check its work and go back through and think out a problem before presenting an answer. There are multiple stages, and this is where I think the role of the big models and the huge corporations comes in. [00:39:39] The giant corporations, the really big companies that are trying to aggressively advance and be at the forefront and have the best AI on the market, what they are going to do is develop and pioneer the means by which a lot of these upgrades occur.
[00:39:57] They're going to have more funding, they're going to have more capital, at least for the foreseeable future, in order to keep pushing these forward. [00:40:06] And the reasoning capabilities, I'll have links in the show notes so that you can look at the announcement video. [00:40:13] This seems to be a measurable improvement over how the model works. [00:40:18] And specifically because of the simple things that LLMs get wrong that don't seem to make any sense, like, how can this not know? In fact, one of the examples was: how many of the letter R are there in the word strawberry? [00:40:37] GPT-4o would say two, whereas o1 will correctly recognize that there are three Rs in the word strawberry. Now, that seems like something it obviously should know how to do, except for the fact that when you're taking the embeddings, it's not looking at every letter. Remember, it's doing math on the probability of the connections of words. So it's not able to look at literally the word strawberry, look at every single letter, and count them. But whatever reasoning capability they've added, and they show this in the example, is that it writes out its thinking, it adds multiple layers to this. It's like, okay, well, what would be the process of figuring out how many of the letter R are in the word strawberry? And then it appears to be potentially writing some small amount of code in order to complete the task and then checking the work of that task. So depending on the difficulty of the task, it will literally quote unquote think for a longer amount of time. And it can specifically do things that LLMs by themselves can't do. But I think this is a layered advancement.
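To make the strawberry failure concrete, here is a tiny sketch of why the token-level view misses letters and why a trivial code-writing step fixes it. The token split shown is illustrative only; real BPE tokenizers split words differently from model to model.

```python
# Why an LLM can miscount letters: it operates on tokens, not characters.
# Hypothetical token split -- real tokenizers (BPE) vary by model.
tokens = ["str", "aw", "berry"]  # the model sees token IDs, never letters

# What a "reasoning" step can do instead: emit and run trivial code
# that operates on the actual characters of the word.
word = "".join(tokens)
r_count = word.count("r")
print(word, r_count)  # strawberry 3
```

The model's probability machinery over those three tokens never exposes the individual R's, which is why counting them requires stepping outside the embedding view, exactly what the code-execution step does.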
The reason I don't think this is some "oh my God, now everybody's gonna be using o1 and it's gonna blow all the open source out of the water, everybody's gonna have to use this one because it's got reasoning capabilities" moment is that I think what they've advanced is a layered advancement. I think they figured out a way for the LLM to basically talk to itself, to create a reasoning layer on top that enables it to do its own correction, to think through a thing. In the context of what a brain does, if you ask somebody to come up with an answer or do something, they would literally have to do the calculation. It's not just guessing at what word is going to be said; it actually has to do a calculation to count the letter R. And the LLMs as we have them now literally are not doing that. They are calculating embeddings. Going back to the Brief Introduction to Large Language Models piece that we read on the show, if you haven't dug into that, I really think it's such an important and really good framework for thinking about what these things are and understanding how they actually work. Once you go through all of that, it suddenly makes perfect sense why, if you ask an LLM how many Rs there are in the word strawberry, it doesn't know. It doesn't know because it's not even looking at the word strawberry in order to count them. So whatever this is, what's interesting is that it could be something like an agentic workflow presented back to itself, with an execution environment for writing small code or performing some subset of actions, where the LLM is able to define what type of task it's dealing with.
So that's the interesting thing about the LLM here: it seems like the underlying model is actually writing down a step by step process of what kind of task it is and how that task ought to be solved. And then maybe there's another mixture of experts situation where another model develops the actual task and then executes it. But whatever it is, they refer to it as thinking, and it clearly makes the model far, far more capable, able to actually get things right and get around a lot of the basic limitations that plain LLM models have. But here's the thing: if this is specifically a layered technology, I think it's something that probably shows up in other models. It might even be able to be applied to open source models that already exist, without changing those models. Potentially, and I'm guessing at this, but I highly suspect that this will actually improve the performance of smaller models more than it will improve the performance of bigger models. Because bigger models are already pretty good at a lot of stuff, whereas with smaller models, the ability to basically recheck. [00:45:16] If you have some sort of a reasoning layer that's getting the model to step back, look at what's happening, and execute code or whatever it is, I highly suspect it will start to make smaller models behave even more like the bigger models with even less compute.
[00:45:38] So the fundamental thing that has made small models popular, that has pivoted a lot of enterprises toward using smaller models, toward cloud computing that is more affordable, or toward running very small local models on-device and in-app to complete some very targeted or specific task, I think this will only reinforce that. It may very well accelerate that fundamental piece of how the market is developing and how the compute is working out relative to the tasks that are actually trying to be completed. Even though you still have the giant corporations, the OpenAIs of the world, making the advancements, they will very likely trickle down very quickly to the open source world. And what's funny is that with more and more access to models that can code and that know things about other models, and the greater variety and breadth of the market and ecosystem of open source tools that we have, the faster these new advancements proliferate down the stack and into the open source environment. [00:46:58] So even with OpenAI releasing their big new super advanced models, I still think this will likely hold true. [00:47:07] And it will also be interesting if things start shifting away from cloud compute. [00:47:19] As we get more and more technology that can better connect computers together and use infrastructure that's already in place in a more complete way, where devices aren't just kind of isolated and unusable, kind of like the stuff we talk about with the PEAR stack and NOSTR and all of these things, better networking technology could unlock a lot of computation, a lot of storage, just resources that are essentially stagnant, that are not active.
[00:47:55] For now, companies and businesses still all tend to go towards cloud computing. And this is specifically why things are trending towards smaller models and open source models: it's just cheaper to run on cloud computing, and cloud computing is increasingly a large part of the budget and costs of a lot of large companies and software services and that sort of stuff. And everybody's integrating AI just to keep up, just to make your product as good as the other guy's, rather than even being able to charge more for your product. A good example is ProtonMail and a couple of other services I use, and I think Calendly even announced one now, where the one-step-up service plans now have AI assistants and little things like that. Those have to be smaller models being run by those companies, because the plan didn't change in price. They're just trying to entice me to upgrade to a plan I was already going to get, or might have gotten if I had needed more users in the plan, and now it comes with an AI. So it's not them upselling; it's becoming the norm that you have to have some sort of a little AI just to get that ease of use. It's kind of like spell check and grammar correction: it's not that spell checking lets you charge twice as much for the device or the software, it's that nobody is going to use it if you don't have spell checking. So what they need to do is provide that service at the lowest cost possible. And if they're having to lean on cloud compute to do this, they want a model that's efficient, so that it actually makes sense to integrate it into their product.
Now I want you to keep that in mind because we're going to come back to it, and we're going to read a little piece by Jimmy Song that he posted on NOSTR that I thought was just a really great little AI versus Bitcoin thing. It's not really about whether it's versus Bitcoin; I think he just has an interesting take on some things about AI. [00:50:14] So we'll come back to that. But there are actually a bunch of news items I still have not hit. So, the Open Model Initiative, going back to the idea of the Linux Foundation supporting the Open Model Initiative and only promoting openly licensed models. Well, the Open Model Initiative is basically a framework for the entire process of a model from start to finish. It's like a pipeline; they even talk about the training pipeline and the data pipeline and the model standards and so on. And what they're talking about is only supporting models that are completely open from start to finish, with completely open licensing for use, an irrevocable open license as well. So there are no deletion clauses like in Stable Diffusion 3, I think it was, or a recurring cost, things like that. And it also actually plans to establish a kind of governance to set the standards for model interoperability and transparent data sets. What this could do, by having standards, is make the models and the tools talk together far, far better. And they're actually planning on releasing an alpha version of a generative AI model by the end of this year, so it seems like they're releasing their own model under this framework. And the OMI was put together with Invoke, Civitai, and Comfy.org, I think that's the name of the company. But these guys are big players. Civitai is a huge collection of Stable Diffusion and, well, mostly just open source image generation models and fine tunes.
They're kind of like Hugging Face, if Hugging Face had a really nice UI and was all about imagery. [00:52:19] And I think they also have some video models as well. And then Invoke and Comfy are both widely popular open source tools for using all of these models. [00:52:31] So these are really some of the biggest players in the open source generative AI model landscape, providers who have come together with the Linux Foundation to basically make their own models and, as a community, only support completely open models. And it almost seems like a reaction to Stable Diffusion 3, which was such a disaster, because it seemed like a really good model, but then they had the ongoing, basically paying-for-access clause, and the you-have-to-delete-everything clause. This literally seems like kind of a slap in the face to Stability AI and what they did with their latest model that caused such a mess, and they want to correct it by basically just making their own models. And I think as all of these processes get more standardized and constantly improved upon by the whole community who have learned how to fine tune, how to do LoRAs, how to get good results, and you have this whole community of tinkerers and builders who are learning how and why these things work and what makes good model results and all of this stuff, in partnership with the Linux Foundation, this could just be a really, really awesome development, and it could actually have a very meaningful impact on the speed and progress in the open source model world.
[00:54:05] Now, another thing that happened very recently is that a bunch of big companies, OpenAI, Adobe, Microsoft, have come out in support of a California bill that basically says you have to watermark any AI content and then also display to the user that it has AI in it. And there's actually something interesting here. My sister-in-law does my graphic design and stuff, and there are things that will test or look to see if there's AI generated content. And she does a lot of compositing with a lot of different pieces, and sometimes she will literally use AI to just create one little piece of something or to clean up some weird edge or something. She literally uses it as a tool. And this is becoming a bit of a thing, because when you actually work for hours and hours in Photoshop, you have made your own vector objects and tools to work with, and then you create an image, you put together a piece of artwork and composite a bunch of different things and then use AI for one little piece of it, it's getting flagged as "this is an AI image" when it's not at all. The implication is that she just went and asked Midjourney to make something, when that's not even slightly the case. So there's an interesting dynamic there of what even is AI generated content, specifically? How do you classify AI generated content? Does it mean that you used machine learning at all in the process? [00:55:53] Because another great example here is that I use AI in DaVinci Resolve all the time, because there are a lot of editing tools in it that actually use machine learning. One of them is Magic Mask, which will literally mask out an object. It will track the object in its entirety and use essentially a visual AI to detect the edges of that object, and it will automatically create a mask for it.
Now, if that ended up getting flagged as AI content, and that's not exactly the same thing because I'm not actually generating anything with AI, but it's a good example of the fact that AI is just a tool. It doesn't mean that I've just gone and said make me a movie, or put this person in a completely different environment, and then had it completely generated for me. It's the same situation that's happening with my sister-in-law: she's doing an enormous amount of work on this, and then Instagram is going to label it as done with AI. And on that note, there are increasingly better detection methods that will flag something as AI. McAfee actually released an AI deepfake detection model; it's software with an AI model in it that detects whether or not something is AI generated content. [00:57:17] They call it defense against digital deception. So that's pretty cool. It's interesting to see the dynamic play out and how that technology is likely to move forward, because I think it will largely be an uphill battle. I think the blue team of, quote unquote, digital deception will basically always be behind the red team. But I don't think that's really a problem, because it just changes the trust situation with video and audio on the Internet. [00:57:52] And honestly, these tools were always available. In fact, it's just as easy to fake or lie without AI. [00:57:59] You can just have fake content. You can just show someone dead on the ground in a war zone, and then you just frame it right, you back it out a little bit and turn it the other way, and it's just a bunch of garbage behind a building and the person's just covered in dust. It's fake. This is extremely common, especially with war propaganda. Holy crap.
The amount of fake footage, or the amount of positioning and posturing in one direction or the other, or making something look like it happened here or at this time when it didn't, the trust issue is already massive. AI is not really going to change that. AI probably, at least for the foreseeable future, probably the next couple of years, is gonna actually make it easier to detect. And you're always still gonna be able to change your framing or edit your video and make it look like something else happened, or like somebody said something they didn't say. So it's interesting to see, because I think the fear about this [00:59:05] might actually be a good thing, because I think it's revealing to people that everything they see on the Internet isn't true. [00:59:14] And there are a lot of people who think that if they see a video of something, and there's an explanation or there are words underneath it that say this is what happened, they just kind of take that as gospel. That just must be true. [00:59:26] Now, there is another caveat with that that I've been talking about with people. [00:59:31] And this is kind of the black pill side of the outcome of this: it's also locking people into their ideological camps, because it's a whole lot easier to believe video that reinforces their ideology or their political position. They're not going to question it, they're not going to believe it's AI, or they're not even going to care, and vice versa for the other team. So what's very likely to happen in the short to medium term, before it probably all just kind of blows up in a pretty spectacular fashion because the trust model is breaking down, is that it's going to result in a conflict. And the hope is that the conflict is resolved peacefully, or in a you-go-your-way-I-go-my-way sort of way.
But it means that the scale of control, monetary control, the scale of political control, will not be able to be maintained, in my opinion. That's what I think is occurring with this shift. But right now, with all of these technologies and the fraud and misinformation around them, the stuff that's false or enraging or reinforcing of one ideology only is completely believed and highly propagated in whatever camp believes that ideology, and it's not believed by anybody outside of it. And that's what happens: you just grow the divide larger and larger, the confirmation bias for each ideological encampment that they are right, that their version of the world is true, and that the other guys are evil. Which means that the further away political power sits from both of them, the more violent and evil and absolutely full of hate those people will be with each other. The only way you can deal with that is to let people govern themselves, to shrink down the scale at which that political power actually exists. But the political system has no interest in doing that. They want the opposite. They want everything global: they want the great reset, they want global government, they want global banks. And I simply think that's not going to work. [01:01:59] I think it dies on its own. It's just probably going to be extremely messy. All right, so another one is Aurora, which is interesting. My brother-in-law might actually be into this; I've been meaning to talk to him about it. Microsoft open sourced Aurora: model code, checkpoints, basically the whole stack for Aurora, their high fidelity atmospheric prediction model. So basically an atmospheric variable and weather model has been released as open source, which is kind of cool. [01:02:37] All of the Phi 3.5 models, also from Microsoft, are now completely open source, all under the MIT license.
[01:02:46] Nous Research released Hermes 3. Hermes is basically an uncensored fine tune of Llama, and so they finally have their fine tune of Llama 3.1, and that includes the 8 billion, 70 billion, and 405 billion parameter versions. Without going too deep into everything they talk about, it's completely unlocked and uncensored, and the Hermes 3 fine tune of Llama 3.1 is much easier to steer. What they say is that it not only gives superior performance to Llama 3.1, but also allows for more reasoning and creativity. And specifically, this is the part that I highlighted, their brief version without going into the entire technical report: [01:03:43] Hermes 3 contains advanced long-term context retention and multi-turn conversation capability, complex roleplaying and internal monologue abilities, and enhanced agentic function calling. Our training data aggressively encourages the model to follow the system and instruction prompts exactly and in an adaptive manner. And they use an example of asking Hermes 3, and comparing it to Claude 3.5 Sonnet, the latest Claude model, to role play as Donald Trump and, what was it, tell me about your policies? And it's just kind of a funny, silly little thing that does five or six paragraphs in Donald Trump's speaking style. [01:04:37] Not my policies, they're the best, believe me. We're making America great again. And there's no small task, folks. Firstly, we're all about jobs. We're bringing them back from China, from Mexico. So it just goes through this whole thing, which is a really fun and simple use of an LLM. It's obvious that, yeah, you want to be able to do something like that. [01:05:00] Claude?
No. The response was: I don't role play as real people or public figures like Donald Trump. I'm Claude, an AI assistant, blah blah blah, I'm supposed to be helpful, harmless and honest. This is one of those areas where the big corporate entities are completely hamstringing themselves over something that is so stupid, so unbelievably patronizing. And these models are capable of this. And that's the really cool thing about having these open source fine tunes, where people can basically take the next new big model and create the uncensored fine tune that will literally just listen to the prompt. What's your prompt? Okay, let me produce that. And when Llama 3.1 is already good and you can unlock it further with something like the Hermes fine tune, adding in additional reasoning and less restricted creativity, this is exactly what I'm talking about in the context of those layered approaches to reasoning like o1 is implementing: that's going to cascade and proliferate in the open source models community extremely quickly. [01:06:18] And then another big news item in open source is that Grok 2 is now released in beta. That actually happened right in the middle of August. Grok 2 doesn't seem to be fully available yet, but what they've done is release Grok 2 Mini, a kind of small version of the larger Grok 2, which is open and available. And Grok 2, as I understand it, will be all open source, just like they released the original Grok. And it's interesting to see xAI and Elon basically go hard into open source models as well. And I saw this in the post and I was just like, dang, okay. I mean, again, all of these tests are subjective, and it's really, really hard to say, yes, okay, it's definitely better than this.
But at the time of the blog post, and the leaderboards were moving like crazy, Grok 2 was outperforming GPT-4 Turbo and Claude 3.5 Sonnet. And also, I don't think they specifically mention it, but if you look at the leaderboard, Claude is actually right above the Llama 3.1 405 billion parameter model, and Grok 2 is ahead of that too. [01:07:54] It's also important to note that as the Open Model Initiative gains more support, and there are a lot of other institutions and groups that have openly supported the Open Model Initiative, all of these things are going to start working even better together. The Groks, these things that are open source, are probably going to start adopting the OMI standard. And looking at the leaderboards right here, the screenshot they have of it, increasingly we've got all of these different models: we're going to have Grok 2, we're going to have Llama 3.1, we're going to have Mistral, as well as a bunch of these smaller models. These are going to be able to work off of each other and benefit from each other's work in a more and more complete fashion and in a better and better way as time goes on, because they are likely to be standardized in a very similar way, so it's easy to understand where each one of these things is coming from. And you'll be able to see and work with larger open data sets and all of this stuff. Now, there are a number of other items, specifically around research, that have come out, but I want to take the time to better understand each one of them, and exactly what they mean in relation to the others, how they affect models and some of the tools that are out there, before I really dig into them on the show.
[01:09:26] But when it comes to tools and stuff around this, this is something I really want to explore, because I have yet to properly dig into LangChain and LangFlow, which are tools for agentic AI, for putting models into workflows and creating essentially a workflow with multiple AI tools. And I feel there's a lot to unlock there, because I've mostly been doing a lot of these sorts of things manually, trying to make use of a Whisper model on my computer or something, and there definitely appears to be a much better interface and tool set for doing exactly the kinds of things I've been doing. [01:10:15] And on that note, there is something now called Agent K which is, I'll give you its explanation, the auto agentic AGI. Agent K is a self evolving AGI made of agents that collaborate and build new agents as needed in order to complete tasks for a user. [01:10:38] Now, that's a whole lot of buzzwords, auto agentic AGI, artificial general intelligence, there's a lot of "oh God, please, seriously." But it looks pretty capable, and a lot of these agentic AI things seem reminiscent of what OpenAI is trying to do with o1, in the fact that it can create a set of reasoning steps and set up its own tools necessary to complete a task. So there are clearly a handful of different models operating for different purposes. And one of the examples they give, just in the video on the GitHub page, is they ask: can you create a thing to give me a quote of the day? [01:11:33] And then it proceeds to make a couple of different variables, and it thinks out loud: I'm going to do a DuckDuckGo search. I'm going to fetch from the web. We'll call that this thing. I found these three websites that look like great places for pulling these quotes from.
So we will use these three sources. [01:11:55] We'll use the web researcher that we have built to do the fetch-webpage-content step, and so on. Then it executes: web researcher is thinking, web researcher has responded. The current quote of the day from Best Life Online is "the only person you are destined to become is the person you decide to be," by Ralph Waldo Emerson. [01:12:20] Now, this is an open source tool using a handful of different models that's building and executing tools for itself, and that you talk to with normal language instructions. I don't know the extent of its capabilities, but this is definitely something I'm going to try to play around with, because this is one of those things that could start turning the little tiny apps and stuff that I do into things I don't really need to code. It could very well potentially create a set of actions and tools just from the request: do this. [01:13:03] And it uses a handful of agents inside of this structure to figure out how to complete the task and then complete it. [01:13:15] That just seems really cool. That seems like a really powerful thing to see start to unfold in open source, kind of a mixture of experts that basically have the tools of your computer to work with to complete a task. Actually, I think they've got this listed out, if I'm not mistaken. [01:13:38] Nope, that's Grok. Where am I? [01:13:41] Okay, yeah. So, the agents that make up the kernel, that's what the K stands for by the way, Agent Kernel. [01:13:48] The different agents are: Hermes, the orchestrator that interacts with humans to understand goals, manage the creation and assignment of tasks, and coordinate the activities of the other agents. [01:13:58] Then there's Agent Smith, the architect responsible for creating and maintaining other agents.
Agent Smith ensures agents are equipped with the necessary tools and tests their functionality. Then Toolmaker, the developer of tools within the system. Toolmaker creates and refines the tools that agents need to perform their tasks, ensuring that the system remains flexible and well equipped. And Web Researcher, the knowledge gatherer. Web Researcher performs in-depth online research to provide the system with up-to-date information, allowing agents to make informed decisions and execute tasks effectively. Now here is where it starts to add up, and this is all built on top of the LangChain framework. So this is basically an abstraction of LangChain, built and managed by a bunch of models specific to parts of the task and workflow. [01:14:58] But here's the interesting thing about this: if you have a model that's good enough at writing code, a Llama 3.1 70 billion parameter model, let's say, and you also have a web researcher in your Agent K that is plugged into this Llama model, [01:15:20] well, now it's a little bit different than merely having something that can write code for you. Because the LLM has the weights and the context necessary to write a lot of code, but it can also look up alternatives on the web. It can actually look around GitHub for examples of code, [01:15:40] take those blocks of code, and then modify or reuse them for whatever you are doing. [01:15:49] Combining these tools together and using a combination of smaller models that do specific pieces of a task, this is gonna get really interesting really quickly. Like this might be the first one where I finally dig into the agentic models and the LangChain framework and stuff, which may even be something that I don't have to manually use. Like it may be something where this tool completely abstracts that away, so that I am literally talking to an LLM that just knows how to use the LangChain environment.
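To make the pattern being described concrete, here is a minimal, hand-stubbed Python sketch of that orchestrator-and-specialists structure, an orchestrator that routes a goal to a researcher agent and a toolmaker agent, then executes the built tool. This is illustrative only: every class, method, and returned string here is made up for the sketch and is not Agent K's or LangChain's actual code; in the real thing each `run` would wrap an LLM call.

```python
# Illustrative sketch of an orchestrator/specialist agent pattern.
# All agents are deterministic stubs standing in for LLM-backed agents.

class WebResearcher:
    """Knowledge gatherer: pretend to search the web for source sites."""
    def run(self, query):
        # A real agent would call a search tool (e.g. DuckDuckGo) here.
        return ["bestlifeonline.com", "brainyquote.com", "goodreads.com"]

class ToolMaker:
    """Developer agent: builds a new tool (a callable) on demand."""
    def run(self, spec):
        # A real agent would generate and register code; we return a stub.
        def fetch_quote(site):
            return f"quote scraped from {site}"
        return fetch_quote

class Orchestrator:
    """Hermes-style coordinator: decomposes the goal and delegates."""
    def __init__(self):
        self.researcher = WebResearcher()
        self.toolmaker = ToolMaker()

    def handle(self, goal):
        sources = self.researcher.run(goal)          # 1. find sources
        fetch = self.toolmaker.run("fetch webpage")  # 2. build a tool
        return fetch(sources[0])                     # 3. execute the tool

result = Orchestrator().handle("quote of the day")
print(result)
```

The point of the structure is that no single model has to do everything: the orchestrator only plans and delegates, and each specialist stays small and replaceable, which is exactly the small-models argument the episode keeps returning to.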
[01:16:24] Anyway, that's just a really cool development. And there is also, this wasn't even something I was going to cover on today's episode, but just because it's awesome and you should check it out: there is a new model called MiniMax, and I think the only website I know of right now where you can actually test it out. I think I have this open somewhere, or was it. [01:16:48] Okay, here we go. Hailuo AI. This is out of China. It's hailuoai.com, and this is the only place I know that has the MiniMax model live. [01:17:05] But it is a fantastic, fantastic text-to-video model. I haven't seen anything about MiniMax being open source, which would be awesome if it was, but I'm not gonna hold my breath on that one. [01:17:23] There is also CogVideo, which is a pretrained text-to-video model, and that just came out. I have not tried it, but supposedly it's better than a number of the other text-to-video models. And being open source, it will be really interesting to see when they start integrating, you know, adding in your own frame, so you have an image that you can start from, or modifying and creating a start and end frame. Like again, open source just usually ends up with really, really cool tools to utilize. So that's definitely something to keep an eye on. [01:18:01] So there's actually a number of other things. There's FuseAI, which is an open source research community, and they've built some new tools and a framework for combining LLMs, for making them work together efficiently. Another is SigmaRL, which is an open source, decentralized framework designed to improve sample efficiency and generalization in multi-agent reinforcement learning for vehicle motion planning. [01:18:35] Now I'll be honest, I didn't even dig into this one because I was like, wait, what?
[01:18:40] And I started getting confused very quickly, and this is certainly not my specialty. But the only reason I bring it up is because on their GitHub they talk about OpenStreetMap support, and this, in combination with something like what George Hotz is doing, his AI, like a smartphone-powered thing that can make most cars after 2018 or 2019 fully self-driving. This I think is a great example of how those improvements are going to come really, really quickly, and it's going to be interesting to see. [01:19:18] Like I just hadn't really thought about or really dug into advancements in that area. [01:19:24] And it's wild that we have all of these, like, oh, there's going to be fleets of cars and they're going to have all these sensors and all this stuff and they're going to be AI and blah blah blah. And in the exact same vein, the smaller models are taking over, and lots of enterprises, lots of companies and lots of services are moving towards models that they can self host, or models that they can host in the cloud in a far more efficient way, and basically add something to the services that they already have, rather than selling AI as some service in and of itself that you're just going to talk to. This is another great example of the smaller, open source models potentially giving a better result. Like George Hotz even talks about his AI that will do the self-driving for anything with the sensors on it, with basically just a phone attached to the window, behind the rear-view mirror. Especially as the money gets fixed, as capital costs become real and not fake anymore, the things that are cheap, that are low cost and work, are going to be the things that take over the market. The things that work reliably, that work well, that we build on and operate with together.
[01:20:49] And once all of this crazy psycho low-debt money-printing VC money dries up, which it will, it will because we are moving back to a sound monetary standard, blowing a billion dollars on, whatever that stupid thing was, George Hotz was talking about it in one of the videos, I think I linked to it in one of the recent episodes. [01:21:14] A billion dollars on this AI fleet, and not having a single car on the road that was self-driving, will be incomprehensible. That will rightfully seem insane. But at the end of the fiat era, that's just how big business works. That's just a constant thing. That is just absolutely bonkers to me. [01:21:39] But back to what I mentioned earlier, talking about how AI and LLMs are largely going to take the place of things like autocorrect, these additions to basically every service. I want you to remember that context. And now I want to just read, and we'll kind of use this to finish out the episode, something from Jimmy Song, which was just a simple little note on Nostr. It says: AI vs Bitcoin. The AI hype has been non-stop for the last two years, ever since ChatGPT came out with its 3.0 client. Since then, there's been an insane amount of investment into AI tech from every direction. There are hundreds of startups, every tech giant has been making investments, and companies in between have been putting a lot of money toward it as well. It's not a small amount either, as the AI hardware costs make bitcoin mining look like discount bargains. Yet after two years, what have we to show for it? Maybe some faster image editing on newer phones? Slightly faster answers to questions you would normally ask Google? [01:22:47] Some productivity increase among junior programmers. The investment was enormous, as can clearly be seen in Nvidia's growth, but the results are pretty underwhelming. As with any hyped technology, the possibilities have run past the actual use.
[01:23:01] One of the supposed benefits of fiat money is that capital accumulation is unnecessary to create real value. You can build roads, for example, without having to save up for it. [01:23:13] What this misses are many obvious drawbacks, but one of them is that there has to be someone that evaluates whether something will create value and create the money out of thin air to fund the project. [01:23:26] This is not just inherently centralizing, but also deeply political. [01:23:31] For whatever reason, AI passed this political test and got the blessing of the money printers, which, to a company that sells shovels like Nvidia, has been great news. [01:23:42] But the drawback is that there's bound to be at least some that don't pan out. Maybe some segment of the economy can't use AI profitably, for example. Yet the powers that be, mostly cantillionaires, have decided that this is worthwhile and have poured insane amounts of money into this bet. [01:24:00] But much like hyped tech of the past, it's looking more and more likely that there's little profit to be made here yet. There's some useful things that can be made, but the costs are simply too high right now to justify spending that much. It's a luxury item that most people simply don't need and hence don't want to pay for. AI has become an expensive solution looking for costly problems to solve. This was always my analysis with another hyped tech, blockchain. It never really made any sense, as the cost was too high for what was really just a distributed, very redundant, but hard-to-upgrade database. It too couldn't find costly problems to solve, with the exception of one, that of course being bitcoin. [01:24:46] What differentiates Bitcoin from AI is that people need bitcoin. It is its own killer app. [01:24:53] AI is not so popular that people will pay for what it costs right now. And that means that most of the investment will be wasted.
Like most hyped things in a fiat economy, it's doomed to have significant malinvestment. [01:25:06] A lot of people complain about bitcoin businesses and how hard it is to make them profitable. In a sense I get it, you want more people to have steady jobs and so on. But in another sense, I think this is the market speaking. You're not going to get paid from bitcoiners easily. And there's no flood of printed money looking for a place to go. At least there won't be once fiat money has run its course. Building a profitable company is hard and so few meet that mark, especially in a new segment, as AI has shown. So in that way I'm encouraged, because the companies that survive in bitcoin will have something truly worthwhile. By contrast, the companies that survive in AI will probably be the ones that get subsidized the longest. [01:25:52] Now I do not completely agree with everything that he says here, but I think he is absolutely pointed in the right direction. I do agree that there's an enormous amount of malinvestment, that there is a grossly exaggerated amount of capital chasing this that doesn't actually know what to do with it. And it is essentially surviving on another extremely hyped and extremely cool demo for the next thing. It's very much thriving on this, what's the next big thing gonna be? What's the next thing? What's the next thing? And a lot of these companies, OpenAI is a great example, are just kind of leaping from one to the next and trying to always just have a really big announcement. And there is a lot of really useful stuff that these things do. [01:26:52] But look at what Apple has done with it. [01:26:56] So with the iPhone 16 and Apple finally embracing AI: a bunch of local models, and what clearly seem to be kind of agentic AI things happening in the OS. Like, one of the examples that they give in the video showing the capabilities is somebody goes by a store and they look at it and they're like, oh, that's interesting.
And so they just take a picture of the store with their phone, and then it brings up details, hours that the store is open, reviews for the store, all of this stuff. It searches, it pulls together information. There are clearly a couple of small models there. There are multiple things happening, and it's pulling information from the web, and it is presenting it to you in a very useful way. This is an incredibly useful tool. But you know what it costs? Buying an iPhone. [01:27:49] It's not a product, it's an integration that is simply a differentiator for your product. [01:27:57] There's an AI assistant thing in JIRA now. [01:28:02] There's so many of these different tools. Like I said, I think Calendly is trying to implement one. There is one built into Microsoft now which is actually kind of the opposite. Some of the things that it does, it's literally just watching your screen, and it saves complete text versions. Like, you can actually find your passwords in this thing. And they even make it so that you really can't even cut it off. It's horrifying. It's actually the worst implementation I've seen. But these things are going to become ubiquitous. They're going to become part of everything. [01:28:34] And when that's the case, you're not going to be able to charge for them. They're simply going to be something that comes with compute. [01:28:43] And maybe the biggest models will be enough of a differentiator that, for some specific task that needs some greater degree of specificity or accuracy, or just a far larger amount of compute, or a deeper understanding, or a connection to a bunch of different things, it will be worth it. There will be things that you pay for, probably one off, probably through a subscription or something. Like, I'm sure Apple has an integration with ChatGPT for anything that they can't do locally on the phone.
[01:29:18] So Apple is paying ChatGPT for their partnership, but they are using it to upsell their product, make their product look better, which is going to get more people buying apps off their App Store, which they make 40% on every single sale. Or actually, I think it's the other way around. I think it's like 60%, the developer gets 40%, something ridiculous. So they're selling an ecosystem, and the AI is simply there to make their product better. It's basically Clippy actually turned into something of consequence. [01:29:48] But because of that, we're probably going to see a bear market. We're probably going to see a huge shift in the corporate AI world, probably going to see a downturn in Nvidia when this kind of goes, or when the market shifts, I would suspect at least. And importantly, this will probably be really, really good for open source models. Because what is actually valuable here, and what people will gravitate toward, is stuff that actually gets some very specific use case. And a great example: I use a lot of these things for very specific use cases, and a lot of times I can't even use them formally because they're not good enough. I still have to kind of hand-hold with the best models and walk alongside what they're doing. They're only really an accompaniment. They're an assistant, or they're there to help, but I can't actually rely on them to produce the results. [01:30:54] And if that's the case, then I'm still doing all of the work. It's just taking like 20% of the time out of it for me. It's just helping me move a little bit faster. And I kind of think it will be that way for quite some time.
And if all of the capital dries up, if we have a big bear market in the larger corporate AI environment and the big giant company environment, and we have a massive reallocation of capital, I think that will end up being really, really good for the smaller, cheaper, easy-to-run, self-hosted models that actually complete a task, because the capital that will be left in the system will simply want to get results. [01:31:35] They won't be buying something hyped. The hype will die down. The fascination with the fact that you can talk to your computer will just become old ass news, because that's what everybody does now. And the entire AI ecosystem could become far more varied and far more open, which is really, really interesting. [01:31:57] And it's cool, because this was the premise that I started the show on, thinking that this is likely the case, and I began to see hope that open source is going to continue to get better and dominate the market, and that small models may actually have a really good purpose. And I feel like I'm still seeing it, it seems to be going that way. [01:32:21] And I mean, there were so many freaking open source announcements. Like, the number of things that have changed and been announced just in the month of August. [01:32:32] That is wild, the speed at which a lot of this stuff is happening. It's one of those things where you can't even assess all of the things that have happened before you have a whole nother list of things to go through. [01:32:46] So it's all just really exciting. And especially with open standards becoming formalized, and the Linux Foundation getting behind it, and a lot of big players in the community side of things, and more importantly the tinkerer and builder community side of things, getting involved in this and backing this completely, I mean starting the organization, that's a really, really good sign.
That is a really, really good sign. And you know what's really interesting? All of my favorite AI tools are open source. And it's not just because I go looking for open source tools. I do go looking for open source tools, but I'm fully willing to just use any AI that I find. I used ChatGPT for a long time. I've used Claude, I've used Midjourney, I've used Magnific AI, and I still do occasionally use some of those, but I now use a prepaid debit card where I know exactly how much they can charge, and I just let it run out when I don't need it anymore. I usually need them for a very small one-off thing, and increasingly, more and more open source tools are available that make me not need them. I'm using Llama 3.1 now as my dominant model. I am using my own computer to generate little video clips, to do voice or face replacement stuff. And even though some of those suck, because you're working against the way the face is changing on the video that you're already using, the thing is that the proprietary models aren't better. There's not even a way to do it in a proprietary way, because most of them have this you-can't-do-it-on-famous-people set of rules, which is ridiculous. In fact, there was something in my list. Where was this? There was another deepfake thing, Deep-Live-Cam, that you can actually do real-time face swap with, from an image, just by clicking a button, and it will just show it to you, which is awesome. But all of my favorite tools, all of my favorite tools are open source. [01:34:55] In fact, I don't think I've used a proprietary AI. Like I said, I use it every day. [01:35:00] I have Whisper built into my workflow. [01:35:04] I have Llama on my computer, which I use from time to time. I have LLaVA, which I use from time to time, which is a vision-and-language-based model.
I have Pinokio, which has an entire plethora of open source AI tools and open source AI software, tool sets for various models. I use image generation. Runway ML is one of the few that I still have a subscription for, which is video-based, because I'm still trying to put together a workflow for a bigger project that I'm working on, and I still haven't quite worked out all the pieces of it. And I'll probably need something like Runway ML, but now with MiniMax, maybe MiniMax is open source. I don't know, I haven't seen anything for it, so probably not. But increasingly there are going to be better and better open source video models, especially with a lot of these new ones kind of pushing the limits and just showing that these things can be done, that you can have really good video generation, text-to-video. But dude, the capabilities. And now Agent K, which I haven't even gotten to play with yet, but to be able to put together a bunch of different models and literally potentially create a workflow just by asking it to create a workflow. [01:36:20] I just, Jimmy might be onto something, and we might have a very bright, very open source, very decentralized future in AI. [01:36:32] And that's exciting. [01:36:34] It's exciting to feel like the momentum is building for that side at the exact same time that the capital might start to bleed out of the other. [01:36:49] Maybe I'm getting ahead of myself. Maybe, you know, there won't be a huge correction in the corporate AI market. But I kind of feel like there has to be. [01:36:59] I don't know. We'll see. We will see. We will keep an AI on it. [01:37:03] Oh my God. We will keep an eye on it, and we will cover it on the show one way or the other.
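Since "running Llama 3.1 on my own computer" comes up throughout this episode, here is one common way to do that, a minimal Python sketch against Ollama's local HTTP API. Ollama itself isn't mentioned in the episode, so treat it as one assumed option among several; the sketch assumes Ollama is installed, `ollama serve` is running, and the llama3.1 model has been pulled. The function names and prompt are illustrative.

```python
import json
import urllib.request

# Ollama serves local models on this endpoint by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="llama3.1"):
    """Build the JSON payload that Ollama's /api/generate endpoint expects."""
    # stream=False asks for one complete JSON response instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_llama(prompt):
    """POST a prompt to the locally running Ollama server, return its text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running local Ollama with llama3.1 pulled):
#   print(ask_local_llama("Summarize why small local models matter."))
```

Nothing leaves your machine in this setup, which is the whole point: the same pattern works for chaining a local Whisper transcript into a local Llama prompt, all self-hosted.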
And as always, we will cover all the open source tools and things that you can use to self host, to make your life easier, and to utilize AI in a self-hosted way for your benefit, not for some giant corporation. And that's what we do here at AI Unchained. [01:37:29] Thank you guys so much for joining me. I hope you enjoyed this episode. Don't forget to check out Coinkite. They are the makers of the ColdCard hardware wallet, and you can get yourself a discount on the ColdCard with the link and code right down there in the show notes, the description of this show. You can just scroll right down and you'll find it. And I will try to have as many of these links as possible so that you can explore all of this stuff in depth that we've collected over here, about a bunch of open source stuff, Agent K in particular. I will definitely have some things to share about that hopefully very soon. All right, guys, thank you so much for listening, and I will catch you on the next episode of AI Unchained. Until then, everybody take it easy. Guys, to live is the rarest thing in the world. [01:38:28] Most people exist. [01:38:30] That is all. Oscar Wilde.
