Hosts Eamon McErlean and Joe Devon interview Simon Willison, the creator of Datasette, an open source tool for exploring and publishing data.
OUTLINE:
00:00 Opening Teaser
00:36 Introduction
01:35 Working On Django
04:35 Future of Generative AI & Accessibility
07:23 Latest Tools & Models (Google Gemini Flash 2.0, OpenAI, Video Streaming APIs, Amazon Nova)
11:39 Frontrunners of AI?
14:48 Daily Tools
19:14 LLM Command Line Tool
22:10 Using LLM For Alt Text For Images
24:58 Making LLM More Accessible
32:36 Will AI Replace Jobs?
39:50 The Dangers of AI
43:13 Launching Django
46:52 Datasette Open Source Tool
51:29 Developers Working With The Accessibility Community
57:34 Using NotebookLM To Prepare For This Podcast
1:00:43 Wrap Up
--
EPISODE LINKS:
Datasette
https://datasette.io
The Book on Accessibility by Charlie Triplett
https://www.thebookonaccessibility.com
Accessibility Acceptance Criteria
https://www.magentaa11y.com
NotebookLM
https://notebooklm.google
Simon Willison's Blog
https://simonwillison.net
Simon Willison's Social Media
https://x.com/simonw
https://bsky.app/profile/simonwillison.net
- You can't talk about NotebookLM without talking about their brilliantly weird podcast thing where they can generate a podcast about whatever content you've thrown in there. So I like doing things like I fed in a big, boring report about some like generative AI study, and I told them, you are banana slugs, and I want you to have a conversation about this paper from the perspective of banana slugs talking about how this could affect your slug society. And they did it.
- It's like a Monty Python skit right there. I like it.
- Oh my God.
- Welcome to episode six of Accessibility and Gen AI, a podcast that interviews the newsmakers and thought leaders in the world of accessibility and artificial intelligence. I'm Joe Devon, and I'm joined by my co-host Eamon McErlean. And today we are interviewing Simon Willison, a true OG of the web. He created my favorite framework called Django. He created Lanyrd and Datasette, and now he is a prolific blogger, talking about and building projects on top of AI on a daily basis. Simon has contributed so much to the world we live in, and it is a true pleasure to have him join us today. Simon, welcome to the pod.
- Thank you very much. I'm excited to be here.
- You know, when I think about all the things that you've done, your impact on the web spans so many influential projects, but I'd love to understand what has been the most meaningful to you personally, and what does a typical day look like for you?
- So I think the most impactful project I've worked on is still Django. It's extraordinary how far that framework's gone. You know, NASA have used it, Instagram and Pinterest were built on top of it. Meta's Threads is just another Django application, I found out recently. So that's amazing. And I love the fact that Django can be classified now as boring technology, and that it's the safe choice, right? If you just want to build something on the web, if you pick Django, you're not gonna run into any sharp edges. There won't be any surprises. I'm really proud that it's made it to that point. But these days I'm really excited about the more recent stuff I'm working on. I'm working on open source tools for data journalism, where the dream here is I want the tools that I'm building to help somebody win a Pulitzer Prize in investigative reporting. And that sounds wildly ambitious, and I think it is ambitious, but that would be such a cool example of the kinds of problems I want to solve. I build tools for other people, and I want those other people to achieve sort of world-changing results with them.
- That is so cool. And I'm just gonna tell you an anecdote which I have not mentioned to you before when we met. I had never touched Python, and I needed to deliver an e-commerce project for a client. And I learned Python, Django, built the app in two weeks, and my client was so happy, because, you know, it had that admin that you guys put in there, and she was like, you should sell this whole admin thing. And I was just laughing so loud 'cause this was a two week project for me.
- So that is so thrilling, and I've heard a lot of that. A lot of people got their start in programming learning Django. And when we built Django, we never dreamed it would be a beginner's project. We thought it was, you know, for advanced, for experienced engineers to quickly build web applications. But since then, there are the Django Girls tutorials that have been running for I think nearly a decade now. There's been so much activity on that front. And yeah, I meet people, and they're like, oh, Django was my introduction to programming and web development. That's so exciting. You know, that wasn't the plan, but it's so thrilling to see that happening.
- Yeah, though, to be fair, I did start with PHP and MySQL before, but.
- Oh, did you?
- That was my intro to Python. It was really hard to get Python working, but the Django part of it was awesome and easy.
- Nice.
- It's funny you say that, Simon. Pleased to meet you. You and I have never met before, so thanks for spending your time with us today. I've heard Python a lot over the past several weeks. Our youngest son, who's a sophomore at Oregon State, is currently doing a Python class, and he loves it compared to C++. He's like, I'm all over it, it's much easier. He's a fan. He's a huge fan, he is. You mentioned in your initial response your core goal of helping people and making things easier. Tying that into accessibility, digital accessibility, how do you see the gen AI roadmap and accessibility coming together, and maybe the improvements from an inclusiveness perspective?
- So this is something that, so I'm not an expert in accessibility. It's something I've cared about throughout my entire career. Django from the very start was always built with semantic HTML and all of those sort of 20-years-ago accessibility concerns in place. And even these days, there is an accessibility working group that I'm not involved with, but that's doing amazing work on the Django admin and so forth, and that's something I care very deeply about. I'm fascinated to learn more about that intersection between generative AI and accessibility myself. I feel like the most exciting trend over the past year for me has been the vision models, or these multimodal models, right? They can consume images and video, and the audio stuff has got incredibly powerful in the past two months. As of now, it's not even surprising that you can have an audio conversation with one of these models. Two months ago, that was hardly a feature. That's so cool. And I feel like the accessibility benefits of these seem underexplored, at least from my perspective, but so, so promising. Some people I've talked to have been skeptical about the accessibility benefits, because their argument is, you know, if you give somebody unreliable technology that might hallucinate and make things up, surely that's harming them. I don't think that's true. I feel like people who use screen readers are used to unreliable technology. You know, a guide dog is a wonderful thing and a very unreliable piece of technology. So when you consider that people with accessibility needs have agency, they can understand the limitations of the technology they're using. I feel like giving them a tool where they can point their phone at something and it can describe it to them, it's got really good OCR capabilities built in, you can have an audio conversation with it, this just feels like a world away from accessibility technology of just three or four years ago.
- Agree, completely agree. And I think that concern ties into, well, maybe the concern of LLMs not being fully inclusive. And if LLMs are not inclusive, there will be gaps. But we can solve that. We can solve that by engaging with individuals with disabilities, with the prompts. We can resolve that by making sure that we have a truly comprehensive, non-biased dataset when we're building datasets. So I think as long as we get ahead of it, as long as we're aware of that potential gap, we can solve it. I do, I believe that.
- Simon, we are now, I think it is day 12, was it 12 days of OpenAI or 14 days? I think it's 12 days, and we've reached the final day, but it also feels like the 12 days of Google AI, because while Google was doing their announcements, OpenAI sort of did their own bunch of announcements. It just feels like we're in a war right now, massive competition, so much to talk about with respect to that. But what were you most excited about in the last couple of weeks that was released? Like what are the top items, and then as an aside, if any of those tie into accessibility, I would love your thoughts, because I haven't had enough time to even look at most of these.
- It's been bewildering, the whole month of December has just been a whirlwind. And since when is December the month that people release everything, right? You'd expect people to be dialing down for the holidays. But no, we've had extraordinary releases from OpenAI. Google have managed to undercut OpenAI, which has never happened before. Like last year, every time Google made a Gemini announcement, OpenAI would launch something better that morning, almost as sort of a power move. The opposite is happening today, which is so fascinating. Google's Gemini team are really ramping up. And so there's a bunch of Gemini stuff that's really exciting. They released Gemini Flash 2.0, which is sort of the cheapest version of their Gemini 2.0 series, and it's a really impressive model. I've been playing around with that one a whole lot. The Gemini models can do audio input, and they can do video input, which puts them a step ahead of OpenAI. OpenAI have some preview audio models, but nothing like what Gemini can do on that multimodal front. The really fun thing is Gemini and OpenAI both now have streaming video APIs where you can literally point your webcam at something, and you can then stream video images into the model, talk over them and have it talk back to you. And this is absolute science fiction. Gemini managed to squeeze their version of this out the day before OpenAI did, which was extraordinary. OpenAI, however, productized it; it's in the ChatGPT mobile app now. So I can fire up ChatGPT, I can turn on my webcam, I can point and I can start having a conversation, including with Santa Claus. They've got a gimmicky Santa Claus voice that you can talk to. I introduced it to my chickens. I said, here are my chickens, these are their names. And then a few minutes later, I pointed at a chicken and said, "Which chicken is this?" And it got the answer right. What are we even doing? That's amazing, right? And the accessibility implications of streaming video and audio into these things, that's extraordinary. Absolutely extraordinary. Those capabilities became available, what, three or four days ago. This is the absolute cutting edge. The stuff is available over APIs as well. Just, was it yesterday? I'm losing track of the days. OpenAI now have a new WebRTC API for their real-time stuff, and I knocked out a little test webpage where you can click a button, and now you're having an audio conversation with one of their GPT voices. And it was like a dozen lines of JavaScript to get that working. Unbelievable, right? And again, it's just so new, like these streaming APIs didn't exist two weeks ago, now they're rolling out, and I feel like we've hardly even started dipping our toes into what those can do. The other exciting thing is OpenAI dropped the prices of their audio API by a lot. Previously it was prohibitively expensive, now it's just about affordable. Gemini haven't announced the pricing on their API yet, but all of their other models are just bargain basement prices already. Part of the benefit of the competition is that the pricing just keeps on going down. It's unbelievably inexpensive to use these vision models right now. A little while ago, like two weeks ago, Amazon announced their Amazon Nova models, which are effectively their version of the Google Gemini models. They're similarly priced, they have similar capabilities.
And I did a napkin calculation, and found that if I wanted to take 67,000 photographs from my photo library, and run all of those through either Gemini 1.5 Flash or the cheap Amazon Nova one, it would cost me $10.25 to do 67,000 photos, to get an actual useful text description of each of those photos. I ran those numbers three times because I didn't believe them the first time I calculated them.
- That's amazing.
- Wow, right? Absolutely incredible.
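For readers curious how that kind of napkin math works, here is a minimal sketch in Python. The per-image token counts and per-token prices below are purely illustrative placeholders, not the actual Gemini or Nova rates Simon used; substitute the provider's published prices to run the calculation for real.

```python
# Back-of-envelope cost estimate for captioning a photo library with a
# cheap vision model. All numbers below are illustrative assumptions,
# not real December 2024 prices; swap in the provider's published rates.

photos = 67_000                    # size of the photo library
tokens_per_image = 1_000           # assumed input tokens per photo
output_tokens_per_image = 150      # assumed tokens per generated description

input_price_per_million = 0.075    # assumed $ per million input tokens
output_price_per_million = 0.30    # assumed $ per million output tokens

input_cost = photos * tokens_per_image * input_price_per_million / 1_000_000
output_cost = photos * output_tokens_per_image * output_price_per_million / 1_000_000

print(f"Estimated total: ${input_cost + output_cost:,.2f}")
```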
- You know, I was gonna ask you what tools you currently use. It sounds like you use everything on a daily basis. But from what you see now or where you see things growing, do you believe that there's a current front runner, or somebody that's just gaining that constant momentum and getting ahead of the game?
- No, no. And this is new. Like six months ago, it was OpenAI, right? OpenAI launched GPT-4, and GPT-4 was the best available model for like nine months, it felt like an eternity. Then that broke when Anthropic released their Claude 3 Opus model. I'd have to look it up, I think it was March this year. And they followed that up with this model, Claude 3.5 Sonnet. None of these names are very catchy. That's been the best model, like the model I use every day, for about six months now. And I think it's still just ahead of the rest of the pack. But the Gemini models are catching up really quickly. The OpenAI models, and there are the new o1 models, they released another one of those a few days ago. Those are really interesting as well. But meanwhile, the Chinese labs put out Qwen 2.5, an openly licensed model that I can run on my laptop, which is in that GPT-4 space. The Llama models from Meta, Llama 3.3, again, GPT-4 level, runs on my laptop directly. I never thought that would happen. I thought I'd need a $50,000 mainframe computer to run that kind of model. This is all changing so quickly. The flip side is there is a suspicion that the scaling thing is no longer holding. It used to be, you chuck in more data, and more parameters, and more GPU time, and you get better models. But everything seems to be sort of leveling off at the GPT-4o, Claude Sonnet level. The new models are slightly better, but they're not like two times better. And so the new frontier appears to be this idea of inference scaling. It's this thing where you take a model that's really strong, and you just give it longer to churn away to try and come up with answers. OpenAI started that with their o1 model. Google, four hours ago, released their first version of this, a thing called Gemini Flash Thinking, I think it's called. I literally got some software working against that like half an hour ago. And it looks--
- I'd be happy if I had even seen it half an hour ago. You already have code working. This is Simon for you.
- There are the Chinese ones too. There's a Qwen model called QwQ, which is the same pattern, and there are others. And so this is all happening now. And what's interesting about those is they're not better at every task, but they are better at tasks like coding and mathematics, where it helps, if you're a human being with a big notepad, to write down your thinking step by step; that would help you solve the problem. These models are now capable of doing that kind of stuff. So it's a different kind of better. They're not just getting universally better at everything, but for certain sort of problem-solving tasks, we've got a new frontier that people are working on. And all of this happened so recently as well. Like it's a full-time job keeping up with it, definitely.
- It really is, and it's definitely eating into my days just trying to keep up. I am curious, since Eamon brought it up, what do you use on a daily basis, and what is good enough for you to have taken out the subscription for, including, are you spending that $200 a month on the new OpenAI model? Is that any good? I've seen some people say it's not that good, and then a few people are like, this is the best thing out there.
- I'm so torn on that one. So I'm not spending the $200 a month yet. At the moment, I pay for ChatGPT, $20 a month, and Claude, $20 a month. I would be paying for GitHub Copilot, but I get it for free as an open source maintainer. And that's it for my subscriptions. But then I've also got API accounts with basically everyone, and I'm constantly experimenting with the APIs. The thing is, they're so cheap that most months, my API bill across everyone comes to like 10 bucks. I've never managed to spend more than $20 on all of the APIs in any given month. So it's not a huge amount of money that I'm spending right now. Yeah, the $200 thing, it gets you fewer caps on the o1 model. Like the o1 model, I think you can use it 50 times before you get locked out of it for a few days, which is a bit frustrating, 'cause I've started using that one a little bit more. And you get this thing called o1 Pro, which I really want to try, but I don't wanna spend $200 a month to try. So just give me one free go at it, you know? I don't know, I might end up paying $200 a month at some point, but I haven't quite justified it to myself yet.
- We just had Ed Summers, the Head of Accessibility for GitHub, and he announced that Copilot is now free.
- Yeah.
- So, that's not even--
- Free with limits.
- Yes.
- I forget what the limits are, but the great thing about that is it's not just a free trial. This is a free tier that GitHub are planning to make permanent. It's especially important for people around the world, you know, people in India are much less likely to be able to set up that credit card subscription and so forth. Now they get access to the Copilot experience. I'm really excited about that. It's actually the oldest generative AI tool in mainstream use. Copilot, it turns out, predates ChatGPT by, what, nearly two years. They released the first version of Copilot in 2021. And I love that it's not a chat, or at least originally it was that auto-complete interface, which was really innovative. It was a really, really interesting way of interacting with those models. Yeah, I'm a huge Copilot user. I'm at that point now where if it's not running in my text editor, I feel restricted. Like wow, now I'm having to actually type the code out in full myself.
- Yeah, and now that they've added a system prompt for accessibility, that's so helpful, because then you can really make sure that what it spits out is much more likely to be accessible.
- Oh.
- I've been asking them about that for a year. So they announced it just now at GitHub Universe.
- That's amazing. That's something, I wrote up a thing last night, where one of the things you can do with Claude, and ChatGPT has this as well now, there's this thing called a project, where you set up a project, and you can dump a bunch of files into it, but you can also set custom instructions in there. So it's a nice easy way of doing system prompts. And I've been setting up little projects for different types of code that I write, with custom instructions that just have things like: I always start my HTML documents in this way, always include box-sizing: border-box in the CSS, little things like that. And it's fantastic. I can now one-shot prompt a full page of working code, and have all of those little ideas baked into it. It's also interesting 'cause it means that you can use the model for things that aren't in its training data. Like I've started using the Python uv tool a lot, which has ways of running Python scripts where you list the dependencies in a magic comment at the top of the file, and then you don't have to think about your dependencies, it just uses them correctly. And so I built a little custom project which teaches it, gives it one example of, here's how you list your dependencies, and now it can one-shot a fully working, self-contained Python script. So yeah, I absolutely buy that if you have expertise in accessibility, in an area of stuff that the models aren't doing, you give them one example, just one example of your ideal framework, your ideal layout. And from that point on, they'll be really good at producing code that fits that example.
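For concreteness, the uv "magic comment" Simon mentions is PEP 723 inline script metadata: a single-file script declares its own dependencies, and uv installs them before running. A minimal sketch; the httpx dependency and URL are just examples:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "httpx",
# ]
# ///
# Run with: uv run fetch_page.py
# uv reads the metadata block above, creates an environment with httpx
# installed, and runs the script; no manual dependency management needed.
import httpx

response = httpx.get("https://datasette.io")
print(response.status_code, len(response.text))
```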
- I love to see your enthusiasm here. It's clear how much you love it. It really is. You personally created a command line tool called LLM. Can you share with our listeners what that's all about and how useful that can be?
- Yeah, so the initial idea around this was I spend a lot of time in the terminal. I'm on macOS, but you know, I'm a terminal person. And I realized that large language models and the terminal are a really good match, because at their most basic form, a large language model is a thing where you give it a prompt and it gives you a response. And in the terminal, you're always piping things from one tool to another. So wouldn't it be great if you could just pipe text into a language model, and then have the response come back out again. So the first version of LLM was exactly that. It was using the OpenAI API, and I noticed that nobody had LLM on the Python package repository yet. So I grabbed that, it was like a namespace grab, 'cause a three letter acronym tool felt like a cool thing to have. And so I built that, and it turns out it is great. It's really fun being able to say, cat myfile.py, pipe that into llm, explain this code, and it spits out an explanation. That's really fun. And then I added plugin support to it so that you could have it support additional models, because, you know, why talk to just OpenAI when you could talk to Anthropic, or Gemini, or all of these other models as well. And because that's based on plugins, anyone else can write a plugin that adds support for a new model. And I can also write plugins that do local models. So now my little command line tool out of the box does OpenAI, and if you install a plugin, it can do Gemini, and Anthropic, and Claude, and so on. And then you can install some plugins that will install models on your laptop, and now you've got a completely offline language model environment. So much fun. It also means that whenever a new model comes out, I've got something I can do with it. I can be like, okay, new Gemini model, tap, tap, tap, tap, tap. Now my LLM plugin for that can support that model. So it sort of helps me stay on top of new developments, because I'm actually writing code that interacts with these models, and I use that on a daily basis. There are all sorts of things where it's convenient to be in the terminal, and to quickly ask a question or quickly analyze something, or you can do things like curl a URL and pipe that into the model, and now you can ask questions against a webpage. It's really, really fun. My one problem with it is that a lot of people don't know how to use a terminal. It's a power user tool, and it bothers me that a lot of the stuff I'm building is then only available to people who are terminal users. So I have an ongoing goal to build a sort of web application on top of LLM, so you can type llm, space, web, Enter, it runs a local web server, it pops open your browser, and now you've got a GUI where you can start playing with models. And I'm forever two weeks away from getting that feature working.
- Forever, yeah.
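For anyone who wants to try what Simon describes, the terminal version is roughly "cat myfile.py | llm 'explain this code'", and the LLM tool also exposes a Python API. A minimal sketch, assuming the llm package is installed, an API key is configured, and using an illustrative model name and file:

```python
import llm

# Pick a model; the name here is illustrative, and plugins add many more.
model = llm.get_model("gpt-4o-mini")

# Command-line equivalent:  cat example.py | llm "explain this code"
with open("example.py") as f:
    code = f.read()

response = model.prompt(f"Explain this code:\n\n{code}")
print(response.text())
```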
- That'll be fun. And you're also doing multimodal on the command line. And I think I saw something you did that could be used for accessibility, like alt text or image description, or something like that, correct?
- That's something, I actually use large language models for most of my alt text these days. Whenever I tweet an image or whatever, I've got a Claude project that's called Alt Text Writer, and it's got a prompt and an example, and I dump an image in, and it gives me the alt text. And I very rarely use it unedited, because that's rude, right? You should never just dump text onto people that you haven't reviewed yourself. But it's always a good starting point. And normally I'll edit a tiny little bit, I'll delete an unimportant detail, or I'll bulk something up, and then I've got alt text that works. And often it's actually got really good taste. A great example is if you've got a screenshot of an interface, there are a lot of words on a screenshot of an interface, and most of those words don't matter. The message you're trying to convey in the alt text is, okay, it's two panels, on the left is a conversation, on the right there's a preview of the SVG file or something. My Alt Text Writer normally gets that right. It's even good at summarizing tables of data, where it will notice that actually what really matters is that Gemini got a score of 57, and Nova got a score of 53. And so it'll pull those details out and ignore the release dates and so forth. That's really cool. So yeah.
- So it'll be able to prioritize what actually matters on the overall screen, it'll be able to pick out the key components and key metrics on the screen.
- It just does it already. It's just got good taste by default, and then you can always talk to it. So it can give you alt text, and you can reply and say, yeah, ignore this column, and it'll try again. I love that. I take pride in the alt text on these images because so many people don't bother, and I'll often try and drop in little Easter eggs, not jokes that would spoil the experience of somebody who's actually using a screen reader, but just little things that make it clear that I'm trying to convey the message that's embedded in the image. It's really fun, you know?
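A rough sketch of the kind of alt-text helper Simon describes, again using the LLM library's Python API with an image attachment. The model name, system-style instructions, and file path are illustrative assumptions, and the output is a draft to review, not something to publish as-is:

```python
import llm

# Vision-capable model; the name is illustrative, and any plugin that
# supports image attachments would work the same way.
model = llm.get_model("gpt-4o")

response = model.prompt(
    "Write concise alt text for this image. Focus on the information that "
    "matters, skip decorative detail, and reduce any table to its key numbers.",
    attachments=[llm.Attachment(path="screenshot.png")],
)

# Treat the output as a draft: review and edit before publishing.
print(response.text())
```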
- Simon, I listened to about a one hour podcast that you did a couple of months ago, an engineering podcast, and you really did a good job of explaining where fine tuning made sense, which I think you said doesn't usually make sense. And you compared it to RAG. And recently there was another release that OpenAI did to improve the fine tuning. And where I'm going with this is I'd love for you to explain, like I'm very passionate about the coding LLMs, and that they should be accessible by default. So there are two aspects to this. One is how do you create a model, or a version of the model, that is more accessible? In other words, you feed it accessible code, right? Just to try and counteract the bad code that's in the training data, which, as you mentioned on that podcast, you're going to really struggle to make a difference with by adding data, because there's too much existing data that might, in this case, not be accessible. So how would you recommend customizing the model? And then part two of the question is how would you recommend that I go ahead with my mission of trying to get AI researchers to pay a little more attention to this, perhaps with a benchmark, or somehow get them to compete with each other so that every time there's a new foundation model, they would take a look at the accessibility and say, hey, we perform well on this benchmark.
- A benchmark is an incredibly good idea, like an absolutely fantastic idea, to have an accessibility benchmark. And I feel like it's difficult, building benchmarks is not a trivial thing, but it's definitely achievable. There are lots of examples out there, there are people who could help with that. I love that idea, because one of the things that's becoming increasingly clear with these models is that a lot of people have this idea that all you do is you scrape the entire internet and dump it into the model, and then try and get even more data and dump it in. That's not actually an accurate model of how this stuff works. The more the AI labs experiment, the more it's becoming clear that the quality of the data matters so much. You really don't just want a random scrape of a bunch of junk. You want really high quality, well-curated data. There's a lot of work going on right now with synthetic data, where people are artificially creating vast amounts of data and feeding it into their models, because they know, for example, that they've just fed in a bunch of Python code that passes its unit tests. That's just better. And the flip side of that is that occasionally you hear little hints that the labs are hiring experts just to help with their training data. They will hire experts, like expert biologists, to help refine and dump in way more high quality biology data. There is no reason at all that they couldn't hire expert accessibility engineers to help curate and dump stuff in there. They need to see that there's demand for that. So yeah, if there were benchmarks, that would help push the needle on that one.
- I've written to all of the foundation models, and not gotten any response so far. But anyway, what were you gonna say?
- Well, yeah, so we should talk a little bit more about fine tuning. Everyone who starts working with these models, one of the first things they think is, I wish it knew my stuff, right? I wish it had been trained on all of the documents within my company. And so obviously I should fine tune a model to train it to understand that information. That's the thing which mostly doesn't work. I mean, you can try and do it, but it turns out dumping a little bit of extra information into a model that's been trained on a giant scrape of the internet, there's so much in there already, it's very difficult to bias it in the correct direction. And my big frustration with fine tuning is lots of people will sell it to you. There are very expensive APIs from all of these providers. There are companies and startups that will help you do this. When you ask them for demos, like I just want somebody to show me a really clear demo of, look, here's the default model, here's the fine tuned one, the default one sucks at answering this question, the fine tuned one is really good at it. And those demos are really hard to come across, which is one of the reasons I remain skeptical of fine tuning as a technique. I think someday it's gonna be useful, and people will have those demos, but right now I feel like you can spend a lot of time and money and energy, and just not get really great results out of it. The flip side is, the thing that's getting increasingly easy these days is just straight-up prompting using these long context models. Just two years ago, most models only accepted up to like 8,000 tokens, which is maybe 20 pages of text, I'd have to look that up. Today, almost all of the good models will accept 100,000 tokens, and Gemini takes a million or 2 million tokens, which means you can dump multiple novels' worth of information into Gemini in one go. So if you wanted to build a model that was really good at accessibility engineering, find 50,000 tokens' worth, 10,000 tokens' worth, of really high quality code. Stick that in the prompt, and it'll pick up from those examples. Models are amazingly receptive to examples. That's the most powerful way to work with 'em, to give them examples of what you want. And honestly, even three or four really good examples of well-written accessible code might be enough to start the models along the right route. And that's a really cheap experiment to run. There are also these prompt caching mechanisms that a lot of the providers have now, where if you give it the exact same system prompt, it costs way less money on the second and third and fourth goes. And that's really useful as well. If you're gonna have a long prompt full of examples, you pay money up front for the first one, and then from then on it gets cheaper. I think that's the way to do it. It's also really quick to iterate on these: you build a really big prompt, try it, then you tweak it, and try it again, and see if you get better results. So I think that's the most promising avenue right now.
- Now I finally understand the caching because I didn't totally get it, but it's the system prompt, that makes so much sense, because that's gonna be a hit every single time.
- I mean, it's also common prefixes. So you might have a system prompt that says, you are a useful robot that answers questions based on this document. And then a regular prompt, that's the document, and then questions after that. As long as the document stays the same, you'll get that benefit. Also, if you're thinking about doing chat interfaces, the way chats work is each time you say something new, it replays the previous conversation. And again, that's where caching kicks in. So if the caching is happening, the subsequent posts in the conversation save a lot of money.
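A sketch of the "long prompt full of examples" approach applied to accessible code, with the curated examples held in a fixed prefix so provider-side prompt caching can kick in on repeat calls. The example snippets, model name, and instruction wording are all illustrative assumptions, not a prescribed prompt:

```python
import llm

# Curated, known-good accessible snippets kept in a fixed prefix. Because
# this text is identical on every call, providers that cache common prompt
# prefixes can charge much less for it after the first request.
ACCESSIBLE_EXAMPLES = """\
Example 1, a real button rather than a clickable div:
<button type="button">Save</button>

Example 2, an image with meaningful alt text:
<img src="chart.png" alt="Bar chart: Gemini scored 57, Nova scored 53">
"""

SYSTEM = (
    "You write semantic, accessible HTML. Follow the style of these examples.\n\n"
    + ACCESSIBLE_EXAMPLES
)

model = llm.get_model("gpt-4o-mini")  # illustrative model name
task = "Build a tabbed settings page with three tabs: profile, billing, security."
print(model.prompt(task, system=SYSTEM).text())
```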
- That makes sense, that makes sense. And then for your approach of these really long prompts with examples in them, how would you contrast that with RAG as an approach?
- I think it's the same kind of idea. So RAG stands for Retrieval Augmented Generation. And the first version of it was a trick where you ask the model a question, and rather than just answering, it goes and tries to look in your big corpus of documents for anything that looks roughly similar to that. So basically it does a search, gets the results, sticks those into the prompt hidden from you, and then tries to answer the question. It's a really effective trick. The answer to "how do I teach the model about my company" isn't fine tune a model, it's set up a RAG system that can run searches against things. And really, the lesson from that is most of prompt engineering, most of building on top of LLMs, is thinking about the context. It's thinking, okay, what is the best thing I can cram into those 8,000, 100,000, million tokens to increase the chance of a good answer. And the examples thing is almost like a fixed version of RAG. There are actually things you can do where you could have a system where the user says, I want to build an interface that does this, and you do effectively a RAG search against 100 examples, find the five most relevant pieces of example code, bung those in the prompt, and then answer the question that way. And that would work really well. That's a very effective technique.
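A toy sketch of the RAG pattern Simon outlines: run a search over your documents, stick the best matches into the prompt, then ask the model. Real systems usually use embedding or full-text search; the keyword-overlap scoring and the documents here are stand-ins to keep the example self-contained.

```python
# Toy retrieval-augmented generation: search, stuff results into the
# prompt, then hand the prompt to a model. The "search" is naive keyword
# overlap, standing in for proper embedding or full-text search.

documents = {
    "hospital_beds.md": "Hospital bed availability by county, updated weekly.",
    "school_scores.md": "Test scores per school, including demographic breakdowns.",
    "budget_2024.md": "The 2024 city budget allocates funds across departments.",
}

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    words = set(question.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda item: len(words & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def build_prompt(question: str) -> str:
    """Assemble the hidden context plus the user's question."""
    context = "\n\n".join(retrieve(question))
    return (
        "Answer the question using only the documents below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

# This assembled prompt is what you would send to whichever model or API
# you use, for example via the llm library shown earlier.
print(build_prompt("Which school had the highest test scores?"))
```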
- You touched upon it a few minutes ago about how AI has got the potential actually to generate jobs. As you well know, there's a general concern out there right now about AI replacing many jobs. What's your response to that? I know it's a very general question, but it is one of the larger concerns out there right now.
- And it deservedly should be. This is a very disruptive technology. There are jobs that will be damaged by this. There are jobs that will be enhanced by this. And so there's the pessimistic and the optimistic way to look at this. And I can actually speak to this as a software engineer, because it turns out writing code is one of the things these models are best at. It's interesting, because the great thing about code is that it's got fact checking built in: if a model spits out code, and you run it and get an error, then obviously there's a mistake. If it spits out an essay, you need to fact check every line of that essay, and that's a much harder process than just trying to run the compiler or whatever. So a lot of software engineers are terrified of this. They're like, hey, this is a technology where anyone can get it to write code now. My 20 years of experience are no longer valuable. I need to find a new career in plumbing or something which won't be replaced by AI. My perspective on this, as a developer who's been using these systems on a daily basis for a couple of years now, is that they enhance my value. I am so much more competent and capable as a developer because I've got these tools assisting me. I can write code in dozens of new programming languages that I never learned before. But I still get to benefit from my 20 years of experience. Take somebody off the street who's never written any code before, and ask them to build an iPhone app with ChatGPT, and they are going to run into so many pitfalls, because programming isn't just about can you write code, it's about thinking through the problems, understanding what's possible and what's not, understanding how to QA, what good code is, having good taste. There's so much depth to what we do as software engineers. And I've said before that generative AI probably gives me a two to five times productivity boost on the part of my job that involves typing code into a laptop. But that's only 10% of what I do. As a software engineer, most of my time isn't actually spent typing the code, it's all of those other activities. The AI systems help with those other activities too. They can help me think through architectural decisions, and research library options and so on. But I still have to have that agency to understand what I'm doing. So as a software engineer, I don't feel threatened. My most optimistic view of this is that the cost of developing software goes down, because an engineer like myself can be more ambitious and take on more things. As a result, demand for software goes up, because if you're a company that previously would never have dreamed of building a custom CRM for your industry, because it would've taken 20 engineers a year before you got any results, and it now takes four engineers three months to get results, maybe you're in the market for software engineers now in a way that you weren't before. But that's the software engineering side of things. That's my rosy-glasses view. There are other industries where this stuff is just massively disruptive, and I don't think there's a happy ending. My favorite example there is language translation, right?
If you are a human being who earns money translating text from one language into another, the models are not as good at it as you, but they are good enough, and they are so vanishingly inexpensive, that I know a lot of translators now are finding that their job has changed from translate from one language to another, to here is an AI-generated translation, fix it up. And you get paid less for that. And that sucks. That's an entire industry of people where, even before ChatGPT, just Google Translate about five years ago got good enough that the industry took a massive hit. And the question then is, how many other examples are there? Professional illustrators who worked at the lower end of the scale, like being commissioned to do illustrations for blog posts, they are having a terrible time, because you can now prompt an image generation model and get an illustration that's good enough for your blog post, right? It's nowhere near what a professional illustrator could do, but it's really taking a chunk out of that end of the market. In the movie industry, the group that's most affected, to my understanding, is concept artists, right? It used to be that if a director is dreaming up a sci-fi scenario, they are paying artists to come up with those initial concepts to help 'em think it through. That's the kind of thing which maybe they're turning to generative AI for. So yeah, I'm not gonna say that there aren't huge negative implications for a whole bunch of people around this stuff. And ideally, hopefully, this shakes out to a point where now maybe concept artists are taking on more ambitious projects, and they find a new niche in the market that pays well and so forth. But I can't guarantee that's going to happen, and that sucks, you know? You talk about cars and horses, where the automobile came along, caused mass unemployment for people in the horse industry, and created loads more jobs. Now there are more jobs as drivers and people making cars. It still sucks if you're a professional farrier working on horses, you know? Being disrupted by technology that creates new jobs isn't great if you're in one of those jobs that gets disrupted.
- It's interesting, Justine Bateman, who in my youth was Mallory on "Family Ties", she has a computer science degree, I believe, and as an actress and a filmmaker she has been railing against AI in a really big way. And it's kind of interesting to see a techie go against it, because even though it is awful, the jobs that are gonna be lost, there's nothing we're gonna do to stop it, and it will at least take filmmaking away from the big names in Hollywood and democratize it. So a lot of people are going to be able to make movies for a lot cheaper, right?
- Right. It's like the thing with translators, there's a trade-off here. On the one hand it really sucks if that was your profession, and it's been impacted in that way. But if we now have technology that means a billion people can have conversations who never could have afforded a human translator before, that is, I mean, I don't like to make straight-up statements about one thing being worth more than another, but that's a pretty compelling trade-off at that point, you know? And yeah, for filmmaking, so much of filmmaking is expensive, and slow moving, and frustrating. As a programmer, I love when frustrating parts of my job get sped up. But does the economy shake out so that people who were doing skilled but frustrating aspects of their work still get employed and earn more money doing more creative things? God, I hope so. But I don't know, I'm not in that industry, I'm not an economist. I can't say with any certainty that it's gonna play out in a good way.
- That's my fear too. And here's another angle for you. So yesterday, the sister of a friend of mine got scammed out of $20,000, and they kept her on the phone the entire time, and she went from Bitcoin ATM to Bitcoin ATM throwing in some money here, some money there, otherwise she would get arrested, and she totally fell for it. And it hit me that it won't be long before you're going to see some scammers take the voice of somebody you know. And a lot of people in AI have talked about this. And the solution is everybody should create a safe word. But then it hit me that all you have to do to get that safe word is you call the person whose safe word you need, and whose voice you're gonna grab. And you know what I mean? Like, if you have, let's say, two siblings, you get both of their voices, you use ElevenLabs or something to emulate their voices, and then you play man in the middle, right? And you grab that safe word, and then you hand it over. It sort of feels like you need a double safe word, and you really have to spend some time to get this right. And most people are not gonna be able to handle that.
- Yeah, I mean, there are a lot of bad things, like bad people can use this technology to do a lot of bad things. And in most of these cases, there's always an argument that they could have done it before. If you are talented at impersonating voices, you could have pulled off that scam. But not a lot of people are talented at impersonating voices, and that extra friction meant it wasn't a widespread scam. The quality of voice cloning and so forth these days is shockingly good. I recently found out that the really good OpenAI voices, they can train those on like a 15 second audio sample. They pay a professional voice actor for 15 seconds of their voice, and they've deliberately not made that capability available to everyone else, but it's the way the models work. So yeah, that's one of the other things that scares me about this stuff: as a society, are we ready to understand and to cope with this? And if not, how quickly can we get up to speed? The one that worries me the most isn't voice cloning, it's the romance scams, right? The thing where you get a text message out of the blue, you reply, they try and form a relationship with you. Those romance scams have been run out of effectively sweatshops, in places like Indonesia and the Philippines, for years, where they get people with good written English skills and effectively force them to pull these scams on people. It's even cheaper if you can get a generative AI model to do that. And yeah, I think that's just going to be a growing problem that we have, that scams are going to become more prevalent, and they'll be cheaper to run. Like, and yeah, it's something--
- And scale.
- Yeah, it's always scale. Like so many of these problems come down to the fact that the bad thing was possible before, but now it's possible at a hundred times the volume. And yeah, how do we fight back against that? I don't know.
- Yeah, I watched you talk about the romance scams on 60 Minutes or some show last week, and it was exactly around that, and how that is growing, and growing, and growing globally, and it's just so unfair. It really is. Many of our listeners on today's podcast will be interested to get a little bit more insight about your Django story, how you created it. I know we don't have a lot of time, but could you give us just an overview of how that was initiated, and your journey there, and where it's at today?
- Absolutely. So this is going back a long time, 21 years ago, 2003. I was a university student, and I had a blog. And in 2003, there were only about 100 people with blogs talking about web development, so we all knew each other. And this chap, Adrian Holovaty, was a journalist and web developer working in Kansas. And on his blog, he put up a job ad, and my university offered a year-in-industry placement program, so you could take a year off of university, go and work somewhere, and then come back again. And it meant that you could get a student visa. So I got in touch with Adrian and said, hey, would this work as like a year-long, sort of paid internship kind of arrangement? And it did. So I moved from London, from England, to Lawrence, Kansas, and spent a year working at this tiny little local newspaper. And yeah, Adrian and I were both PHP developers who wanted to use Python, and none of the Python web frameworks at the time quite did what we wanted. So we ended up building our own little thin abstraction layer over the mod_python Apache module to build newspaper websites with. And honestly, we had no idea that it was ever going to be an open source thing. We thought it was the CMS that we were using to build these newspaper websites. But I was there for a year, I left, and then six months after I left, they got the go-ahead from the publishers to release this as open source, partly because Ruby on Rails had just come out and was taking the world by storm. And they were looking at their thing and saying, hey, we've got a thing that looks a bit like Ruby on Rails, but it's for Python. This company, 37signals, are doing well out of their release. We should go ahead and put that out into the world. And they did. And they called it Django, because Adrian Holovaty is a huge Django Reinhardt gypsy jazz fan. He actually has a YouTube video where he does gypsy jazz guitar covers of different things. He's a very talented musician. And that put Django out into the world, and it just grew, and grew, and grew. It's been out for nearly 20 years now. We're planning a 21st birthday party for it hopefully next year, which would be really fun. And so I was involved at the very start, and then tangentially involved after that. I haven't been a core contributing developer for a very long time. But I still throw out ideas over the fence and occasionally knock up a few patches and so forth. It's just been amazing watching that grow, watching the community around it grow around the world, and seeing all of these things that people have built on top of it. Yeah, I'm really excited to see how that's worked out.
- The engineering quality of Django is just top-notch. It's the only ORM I ever liked. I always hated ORMs, it's like, just go straight to SQL.
- Same here. And that was nothing to do with me. The ORM, so when I was working on it, Adrian built a code generator, 'cause database code is really repetitive. So we built a thing that generated Python code for you to talk to your models. And then Malcolm Tredinnick was the person who joined the Django community and helped turn that into what we have today. An incredible piece of work. It's such a good design. And yeah, for years after the ORM came out, I still wasn't very good at SQL, I just relied on what the Django ORM did. It's only in the past maybe five years that I've got super confident in using SQL for these things instead.
- Yeah, interesting. Well, speaking of data, you have another project called Datasette, which I would love for you to explain. What does it do that no other database does? Like what problem is it that you're trying to solve, and where are you going with Datasette?
- This is a very interesting question, and I wish I had the one sentence answer, but I don't, so I'll have to give you a few paragraphs. Datasette is an open source tool I've been building for nearly seven years now, and it's a Python web application for exploring, analyzing, and publishing data. The initial idea actually came out of work I'd done at newspapers, where when you're at a newspaper, you often publish data-driven stories. You'll have a story about the number of hospital beds currently available across the state, whatever. And those stories come with data. And I wanted to start encouraging newspapers to publish the data behind the stories. This is something we started doing when I worked at The Guardian back in 2010, 2011, where the idea was you'd publish a story, and then we'd put out the data behind the story, and we'd just publish it as a Google spreadsheet. We'd have a Google spreadsheet with: these are the raw numbers that went into this piece of reporting. I always felt there should be a better thing than a Google spreadsheet, something a little bit more open, with more capabilities. And so the first version of Datasette was just that. It was like, okay, take a bunch of data, stick it in a SQLite database, because SQLite doesn't require a separate server, it's just a file. And then deploy an application that gives you a little interface, like a web UI on top of this database where you can click around through it, and a JSON API so that you can start building things against it as well. And then other features like the ability to export CSV versions and so forth. So that was the initial idea. It was, what's the best possible way of publishing data online? Because to my surprise, there weren't really any solutions to that. If you want to publish a million rows of data online, your options are basically stick a CSV file in an S3 bucket, and how is that useful? It's great for CSV nerds, but it's not exactly something that people can generally engage with. So that was the initial idea. And then I added plugins, and my inspiration there was actually WordPress, right, where WordPress is a perfectly decent blogging engine with tens of thousands of plugins that mean any publishing problem you have, you can solve with WordPress plus some plugins. And I thought, okay, what if that was the answer for data exploration and analysis projects? Like any project you have that involves data, which is basically everything. If you could take Datasette, plus specific plugins for the visualizations that you want to run or the export formats, that would be a really cool thing to build. So that's how the project's been evolving over the past few years. And so there are, I think, 150 plugins now for things like GeoJSON export, or visualize everything on a map, and I've started building plugins for editing data as well. So you can actually use Datasette as a kind of Airtable alternative, where you are loading in your data, making edits to it, running transformations against it, doing geocoding operations, all of that kind of thing. And I love this project, because thanks to plugins, if there's anything in the world that I think is interesting, I can justify it as a plugin for Datasette.
I can be like, okay, this week I'm into GIS, and I'm gonna do geospatial plugins, and then next week I'm doing some weird AI stuff, and I can write plugins for Datasette that use language models to generate SQL queries, or whatever it is. So effectively this is the project I want to work on for the rest of my life. If I'm gonna do that, it needs to earn its keep. So for the past year and a bit, I've been putting together the business model side of it, which is effectively the WordPress thing again. It's the hosted SaaS version of Datasette. So Datasette is completely open source. If you are comfortable running it on an Ubuntu virtual machine somewhere, go ahead and install it and run it. Or you'll be able to pay me so much a month and I will run a private Datasette instance for you and your team, with all of the plugins and the integrations and API stuff, and all of that kind of thing. At the moment I need to put the final touches on the billing side so I can actually turn on self-service payments for it. But it's getting there. That's called Datasette Cloud. It's datasette.cloud, as opposed to Datasette, which is datasette.io, spelt like the word cassette. So it's D-A-T-A-C-A-S-E-T-T-E. But yeah, that's something I'm spending a lot of time on at the moment, the commercial side of the open source project.
- Did you say C, do you mean S?
- I did mean S. D-A-T-A-S-E-T-T-E. Thank you for that, yeah.
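To make the Datasette workflow concrete: the data lives in an ordinary SQLite file, and Datasette serves a browsable UI and JSON API on top of it. A minimal sketch using only the Python standard library to build the file; the table contents are made up, and serving it with the datasette command is shown in a comment.

```python
import sqlite3

# Build an ordinary SQLite file; no database server required.
conn = sqlite3.connect("hospitals.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS beds (county TEXT, available INTEGER)"
)
conn.executemany(
    "INSERT INTO beds VALUES (?, ?)",
    [("Lawrence", 42), ("Douglas", 17), ("Johnson", 65)],  # made-up rows
)
conn.commit()
conn.close()

# Then publish it with Datasette (installed separately):
#   datasette hospitals.db
# which serves a web UI, a JSON API, and CSV export for the table.
```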
- So Simon, obviously the engagement between developers and the accessibility community is key. It's key for the ongoing progress, both from a conformance and a usability perspective. As a lead developer yourself, how can we bridge that gap? What are specific areas where we can assist by sharing with you?
- So the thing I find most difficult about building accessible sites at the moment is there are the WCAG guidelines, and there are Chrome extensions that do audits and so forth. And I just don't trust them, because just because my site passed an audit, that doesn't mean it's going to work in a screen reader. Especially with modern JavaScript things, where if I'm building an interface where I click through tabs and parts of the page update, how should I make sure that screen readers are notified about the correct piece of the page? I don't just want to be told that with WCAG guidelines. I want demos. The thing that's missing for me is I would like almost a cookbook of accessibility patterns, where it's like, here's how to do modal dialogs, and here's how to do tab switching. And for each one, I want a video of what a screen reader does with that demo. Or I want multiple videos: show me, for the two or three most popular screen readers, how do they behave when you lay out your modal dialogs or your tab interface? The one that's really relevant right now is chat interfaces, right? LLMs do that streaming text thing where you ask them a question, and everything comes back a word at a time. How do I make that accessible to a screen reader? There must be patterns. What those patterns are, it's very difficult for me to find examples of them put together by experts, with proof, in video form, that they do the right thing. I would love to see more of that kind of documentation.
- Yeah, I think from our perspective, you know, the accessibility community shares as much as they possibly can across the board, and Joe has played a phenomenal part in that. But as you well know, there are different tech stacks, though there are commonalities, to your point. And I do believe there should be some type of base documentation of lessons learned. And again, the videos would go a long way. We've done it from our perspective, from a ServiceNow perspective, and we're more than willing to share what we've built. But more open source access, if you will, to that type of content could go a long way.
- My technical preference here, I like HTML and JavaScript. I don't want React and I don't want like Vite and so forth. I just want, give me an accessible HTML and JavaScript demo of how to do like five or six of these common interactions. And that's enough, because then if somebody's using React, they should be able to port that to React. You know, but having those examples, the thing I want is proof. Like every time a new JavaScript library comes out, one of the first things I do is check to see if they've got any documentation about their accessibility. Most of 'em don't at all. Some of them will say, we have the right ARIA attributes. That's still not enough for me. I won't believe that it's accessible until I see video evidence that it's been tested and shown to work with a screen reader. And I feel like, I think Adobe do have some of this documentation for some of their accessible React things.
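For illustration, this is roughly what one of those plain demos might look like for the tab-switching case, sketched in TypeScript over the DOM using the commonly documented ARIA tabs idiom (role="tablist"/"tab"/"tabpanel", aria-selected, a roving tabindex, arrow-key navigation). It assumes markup where each tab button has `role="tab"` and an `aria-controls` pointing at a matching `role="tabpanel"` element, and it is exactly the kind of thing that would still need the video evidence described above before being trusted.

```typescript
// A rough sketch of an accessible tab widget over plain DOM. Assumes markup like:
//   <div role="tablist"><button role="tab" aria-controls="panel-1">...</button>...</div>
//   <div role="tabpanel" id="panel-1">...</div> ...
function initTabs(tablist: HTMLElement): void {
  const tabs = Array.from(tablist.querySelectorAll<HTMLElement>('[role="tab"]'));

  function select(tab: HTMLElement, focus = true): void {
    for (const t of tabs) {
      const selected = t === tab;
      t.setAttribute("aria-selected", String(selected));
      t.tabIndex = selected ? 0 : -1; // roving tabindex: only the active tab is in the tab order
      const panel = document.getElementById(t.getAttribute("aria-controls") ?? "");
      if (panel) panel.hidden = !selected; // hide the panels that aren't active
    }
    if (focus) tab.focus();
  }

  tablist.addEventListener("click", (e) => {
    const tab = (e.target as HTMLElement).closest<HTMLElement>('[role="tab"]');
    if (tab && tabs.includes(tab)) select(tab);
  });

  tablist.addEventListener("keydown", (e) => {
    const i = tabs.indexOf(document.activeElement as HTMLElement);
    if (i === -1) return;
    if (e.key === "ArrowRight") select(tabs[(i + 1) % tabs.length]);
    if (e.key === "ArrowLeft") select(tabs[(i - 1 + tabs.length) % tabs.length]);
  });

  if (tabs.length > 0) select(tabs[0], false); // start with the first tab active, without stealing focus
}
```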
- Yeah, so what I'd recommend that you look at is Charlie Triplett, who's one of the accessibility architects that helps create what we're working on at my company. He wrote "The Book on Accessibility". So you can just go to thebookonaccessibility.com and he was working for T-Mobile, and for T-Mobile, built MagentaA11y, so magentaa11y.com. And it's got acceptance criteria, it's got sample HTML, it's got videos of screen readers or assistive technology. And I think it's gonna provide a lot of the solutions that you're looking for.
- That's really exciting. Like that for me, the thing that would make my life as an engineer who cares about this easier, it's demos, it's straight up demos that are proven to work, with explanations of why this is the right pattern for doing a tabbed interface, or replacing parts of the screen. The other problem I've got at the moment, which is a really interesting one, is tables, just actual tables of data, 'cause my software Datasette presents tables of data. I have no control over what those tables are. Often it'll be like a 40 column table with like 2000 rows in it. I don't care how good my table markup is, that's gonna suck in a screen reader if you're trying to make sense of like 2000 rows of data with 40 columns. I had a great conversation actually with Ed Summers about this, where we talked about how this is a great opportunity for chat-based interfaces, right? If you've got that table and you can then say to it, what's the school with the highest number of children from this particular background? And it gives you an answer. That's just better, that's just a better way of interacting. So I'm really excited for my own Datasette tool about, like, what are the plugins I can build that use language models to give you that sort of conversational interface? Because I realized, I thought that was a gimmick. I thought having a conversation with the table feels like that's kind of fun, but is it really useful? From an accessibility perspective, it's amazingly useful.
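As a purely hypothetical sketch of that conversational-table idea (not a description of any real Datasette plugin): give a model the table's schema plus the user's question, and ask it for a single read-only SQL query. The endpoint, model name, and prompt below are assumptions for the sake of illustration, and real code would want to validate and sandbox the generated SQL before running it.

```typescript
// Illustrative only: turn a natural-language question about a table into SQL
// by handing the schema and question to an LLM (OpenAI chat completions here).
async function questionToSql(
  schema: string,   // e.g. "CREATE TABLE schools (name TEXT, enrolment INTEGER, ...)"
  question: string, // e.g. "Which school has the highest number of children from this background?"
  apiKey: string
): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // assumed model choice for the sketch
      messages: [
        {
          role: "system",
          content: "Answer with a single read-only SQLite SELECT query and nothing else.",
        },
        { role: "user", content: `Schema:\n${schema}\n\nQuestion: ${question}` },
      ],
    }),
  });
  const data = await response.json();
  // The generated SQL should be checked (read-only, valid, limited) before execution.
  return data.choices[0].message.content.trim();
}
```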
- Completely, completely agree. And we touched upon this yesterday with Ed from a ServiceNow perspective. We will be releasing a conversational AI with Now Assist, or Assist AI, full conversational AI, with the goal of creating a full conversational AI for an entire user journey from start to finish. Then we're going to be hopefully overlaying that with what we call intuitive page summarization. So, similar to what you just touched upon before, the key parts of that page, the most critical parts of that page, will be read back right away, so they don't have to be tapping across everything. So we're actually working on both of them right now.
- And isn't it amazing that what you just described is now something that can be built, like it's now feasible to build these kinds of things, with like just interacting with a webpage. Straight up, a screenshot of an entire webpage fed into a high quality model will do a good enough job right now that it can start being useful, and that's fascinating.
- Yep, love it.
- Yeah. And in fact, to prepare for this podcast, I took your blog, which is just incredible. I threw it into NotebookLM, I took a whole bunch of your podcasts, threw that into NotebookLM, and then some of the questions I asked you, I did honestly get out of NotebookLM because it was able to mung through so much data, and provide that information. It's such a cool tool. Have you played with it?
- Yeah, I'm really fascinated by NotebookLM. What you've just described, it's the best current consumer RAG product. Like really what you're doing there is you're just dumping a bunch of documents into a RAG system, and when you ask it a question, it looks things up for you and so forth. And it works really, really well. The questions that you gave me, I've been on a few podcasts, they were by far the best selection of questions in advance I've ever seen. It's so interesting to hear that that was part of your process for putting those together. The other thing, you can't talk about NotebookLM without talking about their brilliantly weird podcast thing, where they can generate a podcast about whatever content you've thrown in there. I love that. It's both a gimmick and incredibly useful. And it is spookily good. Like it's the best version of like fake human voices that I've heard from anything. They just released a new preview feature last week, I think, where you can now interrupt the podcast and ask them a question. And so you can basically join in and be part of the podcast experience. Very, very weird, right? They also added custom instructions to it. So I like doing things like, I fed in a big boring report about some like generative AI study, and I told them, you are banana slugs, and I want you to have a conversation about this paper from the perspective of banana slugs talking about how this could affect your slug society. And they did it.
- It's like a Monty Python skit right there. I like it.
- Oh my God, that's incredible.
- It's so fun, yeah.
- And their team announced that they're leaving and starting their own company, so that should be a good one.
- Yep, that's the constant pattern with this thing. If you build anything good at all around AI, you should quit your job and raise $10 million of VC straight away.
- Or a billion, some of them are getting a billion.
- The money is flowing freely right now. If you ever want to do a startup, if you can prove yourself on a product, then yeah, it's the time to be doing that. Something I always like to emphasize is that it's important to have fun with this stuff. Like a lot of people haven't cottoned on to how deeply entertaining these things can be if you give them the chance. Like don't just ask them to tell you jokes, they'll tell you crap jokes. But if you ask them to be banana slugs and talk about the impact of generative AI on their society, that keeps me entertained all the time. There's just so much you can do with that.
- I can listen to you all day. I really, really could, Simon. Enjoyable, educational, and just honestly, again, as I mentioned before, your passion for what you do is pretty apparent and your authenticity is pretty apparent. So thank you so much for your time today. Greatly, greatly appreciate it.
- Thanks very much. This has been a really fun conversation.
- Thanks Simon. And just let our audience know where they can reach you and read your stuff.
- So I'm online at simonwillison.net, that's my blog, which I have updated every day since January the 1st this year. So I'm just about to hit a year-long streak, which I'm quite excited about.
- Congrats.
- And that will link to all of my other stuff. I have a very active GitHub account with 900 projects on it at the moment. I'm on Blue Sky, and Mastodon, on Twitter and so forth as well.
- Great. Under SimonW, at SimonW, right?
- Yes, or at simonwillison.net on Blue Sky, and at simonwillison.net on Mastodon.