
Recording: Notion AI Lead Linus Lee

On Linus' journey at Notion, the shortcomings of chat interfaces, and how AI can augment and accelerate human work

Amazing to host a fireside chat with Linus Lee, lead for Notion AI. We discussed the shortcomings of AI chat interfaces, how AI is often used to augment and accelerate, rather than replace, human work, why Linus decided to join Notion, and his hopes for the future of human-AI collaboration. Here are our top takeaways…


Key Takeaways…

On why AI will augment and accelerate human work:

“A lot of times, even if people are writing something from scratch, they’ll often get an initial draft going with the help of AI, then do rounds of polishing themselves. AI is more augmentative and accelerating than replacing. I’m optimistic about a future where humans always steer the interaction or relationship with AI. I think AI is good for executing on things, but there always has to be a kind of tastemaker. I think for a very long time that’s going to be a human realm.” 

On the shortcomings of chat interfaces for AI:

“There are all these people building the face of God, and you're telling me we're going to interact with this by texting it? It doesn't seem quite right.”

“In real human-to-human dialogue, there’s so much context around the conversation. We don’t lock ourselves in a blank room with no walls and no context and ask a prompt. The people that are in the conversation take advantage of the context to be interesting and useful.

“With ChatGPT, you are locked in a blank room with an AI, and you're telling the AI to do something. And so there's a lot of context that's not being put to use.”

“To make chat more practical in a software application, say when you’re writing in an app, there's context you can use. There's information that's already on the page. There are edits you've made in the past, the location of your cursor, whether you have something selected, other collaborators on the page, etc. There's tons of context. 

“And if the AI is able to better take advantage of the context and look at where you're pointing and things like that, there's much less information you have to funnel into the text itself.”

On why Linus chose to join Notion, rather than continue as an independent researcher:

“I spent most of 2022 building tools and interfaces to navigate text and input information into systems. That year was like sailing around in a speedboat. You have a lot of exploration you can do, you can switch directions very quickly, and you can put down a project if you don’t like it and work on something else. A speedboat is good for exploration and picking a direction, but once you pick a direction you can’t actually move a lot of water with a speedboat. You want larger ships with more people, resources, and momentum, and that helps you figure out if the direction you picked was a good direction. I joined Notion because I had some theses around how to build good interfaces for AI and I wanted to put them to the test.” 

On the opportunity to define humans’ relationship to AI:

“In the aftermath of a new technology, there are winners that set the defaults... For example, like scrolling directions, the concept of apps in mobile and how they differ from desktop, zooming in and out... There are so many subtle things that we take for granted, but actually someone decided them. So there was an appeal to being part of a team that could potentially set those things in motion.”

“Beyond interfaces, there’s also a larger, important thing: setting the tone for what kind of relationship humans have with AI… On our CEO’s recommendation, I was reading Steve Jobs’ memoir, put together by the Steve Jobs Archive. One of the stories in there talks about this moment, a week before they launched the original Mac, when Steve had everyone in the room. Before the Mac, the computer was this cool, dark, mechanical machine. It was powerful, and everyone could kind of see that the computer was going to be everywhere, but it was not clear that the relationship was going to be this fun, creative, often humane kind of thing. So when everyone at Apple saw the machine, the Mac they had built, they were like, okay, now that we have this more human, more creative, inviting thing, this is so obviously going to be the tone of the relationship that humans should have to computers. And it's not a matter of whether this is going to be the relationship, but how long it's going to take to get there.”

“And I found that really motivating. AI is still kinda weights that run inside data centers. And, that's one way things could go, or maybe there are products and designs and things we can build and messages we can send that set the course to be more like AI as a kind of a partner or collaborator, a more human kind of relationship. And so the chance to be a part of something that pushes the world a little bit in that direction was also alluring.”

To watch the full conversation, click here. You can also listen to this recording on Spotify. Join us for our upcoming firesides! See the full list here.


Transcript:

This transcript was edited for brevity.

Q: Can you tell us more about your journey? What led you to join Notion?

Ali Rohde: You joined Notion earlier this year after a year spent prototyping, building, and hacking on your own. You considered starting your own company, doing research, joining a lab, kind of staying in the realm of the theoretical. Instead, you chose to join Notion. What helped you reach the conviction that this was the place to be at this time?

Linus Lee: I spent all of 2022 prototyping and hacking on my own research. Before then, I was working on a lot of stuff on the side around tools for thought and note-taking apps; my interest in AI was more recent. In 2022, I spent most of the year investigating this high-level problem: so much of modern life, especially modern knowledge work, is inundated with text. Walls of text. Reading walls of text, producing walls of text, searching through walls of text.

There's all this information out there, but it's in what I think is a pretty inhumane, pretty inaccessible form. Looking at a page of text is not like looking at a picture. You have to parse it, do this very mechanical, intellectually laborious activity of working through the text and trying to figure out what nutritional information is actually in there. So I spent a lot of time building automatic highlighting and summarization interfaces, tools to navigate text and input information into these systems. Some of it was more interface, some of it was more AI; some of it I might talk about later. I like to say that that year was like sailing around in a little speedboat.

You have a lot of exploration you can do. You can switch directions very quickly. You can put down a project if you don't like it and work on something else, and you get to meet and talk to lots of people without being anchored to a larger entity. Speedboats are good for exploration and picking a direction, but once you've actually picked a direction, you can't move a lot of water with a speedboat. You want a larger ship with lots more people, resources, and momentum.

Once you pick that direction, that larger ship helps you actually figure out whether it was a good direction or not. For me, part of it was that I had come out of the year with some theses around how to build good interfaces for AI and for tools for thought, and I wanted to put them to the test.

So that's one reason. The other reason is the second thing you alluded to: in the aftermath of a new technology or medium, there are winners that set the paradigms. My thoughts on this have expanded a bit. Part of it is definitely that the winners set the interface defaults.

As an example: scrolling directions, the concept of apps on mobile and how they differ from desktop, multitouch, zooming in and out, things like that. There are so many subtle things we're just used to, that we take for granted about how these mediums work, but someone decided them.

So the appeal of being part of a team that could potentially set those things in motion, given Notion's large distribution, was part of it. Beyond interfaces, though, there's also a larger, important thing, which is setting the tone for what kind of relationship humans have with AI.

This is a little grandiose, but on Ivan's (our CEO's) recommendation, I was reading this book, I think it's called Make Something Wonderful, Steve Jobs's memoir put together by the Steve Jobs Archive. In one of the stories in there, Steve talks about this moment, a week before they launched the original Mac, when he had everyone in the room, and he said the Mac is significant in part because it changed the color of the relationship people have with the computer. Before then, the computer was this cool, dark, mechanical machine. It had automation, it was powerful, and everyone could kind of see where the power was going, could see that this was going to be everywhere. But it was not to be taken for granted that the relationship was going to be this very fun, creative, often humane kind of thing. It was more of a machine in the room. And when they saw the machine they had built, the Mac, they were like, okay, now that we have this more human, more creative, inviting thing, this is so obviously going to be the tone of the relationship that humans should have with computers. It's not a matter of whether this is going to be the relationship, but how long it's going to take to get there.

And I found that really motivating. AI is still kind of weights that run inside data centers. That's one way things could go, or maybe there are products and designs we can build and messages we can send that set the course to be more like AI as a kind of partner or collaborator, whatever you want to call it, a more human kind of relationship.

And so the chance to be a part of something that pushes the world a little bit in that direction, I think was also really alluring. 

Q: Do you think there’s a default path for human-computer interaction?

Linus Lee: I don't. There are a lot of really smart people pushing this in all kinds of directions. A lot of people are obviously working at the model and training level, trying to steer the tone and speaking style of these models; a lot of people work at the interface level. There's also a lot of valuable work happening at the policy, strategy, and regulation level.

So I don't know if there's a default path that I would push against, but there are paths that I have opinions about.

Q: Can you tell us more about Notion AI?

Linus Lee: Yeah, our AI journey starts about ten months ago at this point, in November of 2022. We started exploring, very quickly prototyped together, and launched the initial version of Notion AI, which put the instruction-following power of GPT-3 inside your Notion editor.

So you could take a document and pull information out of it: summarize it, pull out action items. You could use it to translate things, change tone, or help you write or edit. So initially there were a lot of things you could do. We had a brief preview period; I don't know if you'd call it an alpha or a beta, but a preview.

Then we made it available to everyone in February, and since then we've added a few incremental things on top of it. The initial thing, which we internally called the AI writer, was the first push. Then we made AI blocks, a special type of Notion block with a prompt built into it, which you can use to update and regenerate information on a page. More recently, in May, we launched AI autofill, which puts AI inside Notion databases.

So this is in contrast to writing a single document. If you have lots of structured information, like a thousand meeting notes, or a thousand interviews, or candidates you're looking at for recruiting, you can take that information, maybe transcripts or notes, and pull structured information out of it, or translate, or summarize.

We've taken this step by step, learning along the way how this new technology, which can generate all kinds of content and follow instructions, actually fits into the workflows people have day to day. Crossing the gap from "this is a cool tech demo" to "this is something I now depend on for my day-to-day work, something valuable every time I start a meeting or take notes": that's the interface gap, right? That's the general story.

Q: Why do people prefer using AI as a thought partner?

Ali Rohde: I was reading the blog posts that Notion put out when you announced a new feature, and one of them really caught my eye. It said that the number one way people interact with Notion AI is by highlighting existing text and asking AI for help. This suggests that most people start by writing their own content and then treat AI as a thought partner and editor. Do you think that's a short-term pattern until the technology gets better and reaches a place where it can do work on its own? Or do you think this kind of augmenting thought partner will be the paradigm?

Linus Lee: Yeah, I think that discovery is pretty interesting. Personally, one of the more fun use cases I have for Notion AI is when I see a word or an idea or an acronym I don't recognize, I'll just highlight it and ask Notion AI to define it. Which is funny, because you have this massive tower of GPUs burning tons of electricity and it's just: define this word. But it's true. This is something we learned by looking at data, but also by talking to people who actively use AI in their day-to-day work: a lot of times, even if they're writing something from scratch, they'll get an initial draft going, maybe with the help of AI, and then do rounds of polish themselves, maybe change wording, maybe go back to something they didn't catch. So it's definitely a pattern, and I would say it's more augmentative and accelerating than replacing.

I'm optimistic about a future where humans are fundamentally the ones steering. I don't know how I feel about the terminology "human in the loop," because it implies this industrial kind of loop happening, but I'm excited about humans always being the ones steering this interaction, steering this relationship.

I think AI is good for executing on things, but there always has to be a kind of tastemaker. Taste is a word that's usually thrown around in creative contexts, but even in business decision-making, even decisions that look really dry, like should we prioritize shipping this product or that other product, someone has to take ownership of the decision, to take responsibility or credit for it.

AI can make it easier to understand the situation in which you make that decision. It can help communicate the decision to the right people and keep people updated on it. But at all of these touch points, someone is actually exercising their human taste and experience to set things in motion.

Q: Are there use cases you've been most surprised about people using Notion AI for?

Linus Lee: There are lots of little surprising, unexpected things. Lots of people use Notion, but there's a small, very enthusiastic fraction of people who use Notion AI to write fanfiction, which it makes really easy: you can just say, give me a romantic story that involves Jungkook and Suga from BTS going to the zoo, and it'll write it, and then you can polish it. That process is made 10x easier, right? On the more pragmatic side, I've been pleasantly surprised, and kind of uplifted, by how many people are using Notion AI to translate things. Obviously there are existing tools like Google Translate, but there are a couple of interesting things about how people use Notion.

Translation in Notion AI is right there in Notion. So it's close at hand if you're working on an international team, and Notion is an international team: we have English speakers in the US and Dublin, folks in Hyderabad, folks in Japan and Korea. I got a chance to visit the Korea office early this year when I was visiting my family. Within the office, people often speak Korean and a lot of the docs are written in Korean, and obviously everyone speaks English to some extent, but it's helpful to have the mechanical assistant. If I'm translating a long document, instead of painstakingly translating every sentence, the AI can give me starting material and I can go back and polish. And in aggregate data, too, we see a lot of really enthusiastic users in those parts of the world who rely on Translate. The other interesting thing about translating with a language model, as opposed to a more traditional NLP model built just for translation, is that you can ask for more nuance and tone to be conveyed in the text.

I don't know if Geoffrey Litt is here, but he's a research friend, and he and I were talking about this. He's a Japanese speaker, and he was trying to get some text translated into or back from Japanese. In these languages there's much more of a distinction between registers of respect.

The register changes when you're speaking to older people or in more professional contexts, and oftentimes translation software doesn't get that quite right. In those cases, you can ask Notion AI: hey, can you make this a little more formal? Can you make this more casual? Can you phrase it the way you'd speak in a certain setting? You could even say, I'm trying to write an email to my boss, help me translate it, and it'll capture more of that context than rote translation software would. So translation is something that I, being an immigrant and a Korean speaker in addition to an English speaker, have been pleasantly surprised by.

Q: How is prompting affected for a multi-lingual product like Notion?

Ali Rohde: Yeah, that's something that's really interesting about Notion and Notion AI, which is that it needs to work in several different languages because Notion works in several different languages. So, I think you have English, Korean, and Japanese right now, and then German and French in beta. How does that affect how you think about the product?

Linus Lee: I have a lot of thoughts on this. One really interesting place it comes up is prompting, because the developers in the room are mostly English speakers, and GPT is particularly good at understanding English because of the composition of its training data.

So all of our prompts are written predominantly in English, and it would suck to have to rewrite them in every other language we support. I remember the week before we launched general availability for Notion AI, Simon, our CTO, was literally prompt engineering full-time for the entire week. One of the things he was doing was writing few-shot examples to make sure that behavior, especially around language following, was really robust. We depend a lot on few-shot prompts, in addition to zero-shot instructions, for language following.

Q: How does Notion AI work under the hood?

Linus Lee: Our basic prompt is: summarize this document, following X, Y, Z rules. And then we say: make sure the reply is written in the language the document is written in. We have few-shot examples, and they vary in topic and material, as they should, but they also vary in language. So we have four or five examples in our summarization prompt that are different documents in multiple languages.

I think one is a tour guide to Italy written in Italian, and another is something in Korean, along with example summaries. Sometimes we grab foreign-language speakers in the office and ask: does this sound like a reasonable translation? Because when I write them, I just use Google Translate and then summarize, and it might not be right. But that's generally the shape it takes. And then we've built tooling around evaluation, to make sure that when we make changes to prompts, those behaviors stay in place. 
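For the curious, here is a minimal sketch of what a few-shot, language-following summarization prompt of this general shape could look like. The instruction wording and example documents are illustrative assumptions, not Notion's actual prompts:

```python
# Sketch of a few-shot summarization prompt whose examples vary in language,
# so the model learns to reply in the document's own language.

INSTRUCTIONS = (
    "Summarize the document. Keep the summary shorter than the document, "
    "stay on topic, and write the summary in the same language as the document."
)

# Illustrative few-shot examples that vary in both topic and language.
FEW_SHOT = [
    ("Guida di Roma: il Colosseo apre alle 9:00, i Musei Vaticani il lunedì...",
     "Una breve guida di Roma con gli orari di apertura dei principali monumenti."),
    ("주간 회의록: 출시 일정이 2주 연기되었고, 디자인 검토가 필요하다...",
     "출시가 2주 연기되었으며 디자인 검토가 필요하다는 회의 요약."),
]

def build_summarization_prompt(document: str) -> str:
    parts = [INSTRUCTIONS, ""]
    for doc, summary in FEW_SHOT:
        parts += [f"Document:\n{doc}", f"Summary:\n{summary}", ""]
    parts += [f"Document:\n{document}", "Summary:"]
    return "\n".join(parts)
```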

Q: What does the future of Notion AI look like?

Linus Lee: A big part of our effort is still improving the way people use the current Notion AI. One of the things I've been really excited about is the potential to do a lot more with translation, as we just talked about. That's something I particularly got to feel when I went to Korea and saw how people are using it there. 

Ali Rohde: Is there anything interesting and different about how they're using AI in Korea? 

Linus Lee: To be honest, I mostly stayed with my family and worked out of the Notion office, so the things I saw were pretty much what I expected.

So, one direction is investing in our portfolio of bets around AI and AI products; there's a lot we can do there. Another thing that's pretty interesting, and that we've obviously thought a lot about, is that the most enthusiastic Notion users have a lot of their personal knowledge inside their Notion workspaces. You're probably aware of what other companies and founders in the community are doing here, but being able to take more advantage of that data, all the knowledge you have in your Notion workspace, would be really exciting and really valuable. So that's something we've thought a lot about.

Q: You were very skeptical of few shot prompting, what happened?

Linus Lee: Few-shot. There are a lot of things you can't communicate zero-shot, or it's really difficult. Going back to the Japanese translation example, where you're trying to express a specific register of speech: when you're prompting in English and trying to communicate the specific register of Japanese you want the model to reply in, you're using the wrong tool for the job. There's not as much vocabulary available to you. And I think that problem generalizes. If you want a model to summarize in a particular style you like, it's often really hard to put into words exactly what that style is. There are some interesting tricks around this.

You can try to get the model to write in the style of a specific person, which is kind of an interesting pointer to a specific spot in the latent space. But in general, it's hard to fully specify the behavior you want with just zero-shot instructions. Few-shot examples, I think, are a great balance of speed and ease of iteration, and they get you to a specific style or output format that you can't quite convey in plain language. 
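As a concrete illustration of that point, here is a hedged sketch of steering register through example pairs rather than trying to describe the register in English; the pairs are invented for illustration:

```python
# Sketch: conveying a specific register (formal Japanese) through few-shot
# example pairs instead of a zero-shot description. Example pairs are invented.

FORMAL_PAIRS = [
    ("Thanks for the update.",
     "ご報告いただきありがとうございます。"),
    ("Can we move the meeting?",
     "会議の日程を変更していただくことは可能でしょうか。"),
]

def formal_japanese_prompt(text: str) -> str:
    lines = ["Translate from English into formal Japanese, matching the "
             "register of the examples:", ""]
    for english, japanese in FORMAL_PAIRS:
        lines += [f"English: {english}", f"Japanese: {japanese}", ""]
    lines += [f"English: {text}", "Japanese:"]
    return "\n".join(lines)
```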

Ali Rohde: Why were you initially skeptical?

Linus Lee: That was probably a very long time ago. In the broader scheme of how you customize a model, how you get it to perform a particular task in the style you want, there's a spectrum of options. There are zero-shot instructions, few-shot examples, LoRA fine-tuning, and further steps of full fine-tuning.

Initially, I was more inclined to think that the best balance for building interesting LLM applications was probably fine-tuning. And I think that's still true if your task looks very different from what the model was trained on. If you're trying to teach the model to output data in some very specific schema, or teach it a new programming language, then you still need training data; you can't just prompt it.

In those cases, even few-shot prompts are probably not enough. But if you're trying to get it to do things like pull action items out of meeting notes, few-shot looks much more promising in comparison. 

Q: What feels off to you about chat?

Kanjun Qiu: You've written about how chat interfaces are not the end-all-be-all of interfaces for AI, which I agree with. But I'm curious: what feels off to you about chat?

Linus Lee: Yeah, I talk about this a lot. I think most people generally agree, and I'm looking at the audience to get a vibe check, that chat is probably not it.

There are all these people building the face of God, and you're telling me we're going to interact with this being by texting it? It doesn't seem quite right. But my stance on this has also changed a bit in the last few weeks. There are a lot of reasons chat doesn't feel right. The most obvious one shows up in specific use cases: for applications like photo editing, words are definitely not enough, right? You want to sketch, you want to take full advantage of the canvas. You want to condition generations not just on text descriptions of what you want, but also on "draw X, Y, Z here" and so on.

So for specific applications, I think it's pretty obvious why chat isn't the answer. For general things, especially things that stay in the realm of text, it's a little more nuanced, but there are a couple of ways chat could be better. One that I've thought a lot about is that there's so much context around a conversation. If I'm talking with someone, we don't lock ourselves in a blank room with no walls and no context and ask each other prompts. We're usually meeting for a reason. I know what's happened in their life; they know what's happening in mine. Maybe we're in a meeting that has some purpose. They see what I'm wearing; they see if I'm tired. The conversation happens in that context, or maybe we're pointing at something. The people in the conversation take advantage of the context to be interesting and useful.

In something like ChatGPT, you are locked in a blank room with an AI and you're telling the AI to do something. So there's a lot of context that's not being put to use. To make that more practical in a software application, if you're writing in an app, for example, there's still a lot of context. There are edits you've made in the past. There's information that's already on the page. There are even things like: what edits did you make last? Where is your cursor? Do you have something selected? Maybe there are other collaborators on the page. If you're in a richer workspace like Notion, there are other pages the page links to, and people the page is shared with. So there's tons of context. And if the AI is able to take better advantage of that context, there's much less information you have to funnel into the text itself.

And that's one big challenge with chat interfaces: you have to funnel all of the context into the chat in a single shot. That makes prompting difficult for professional prompt engineers, which definitely makes it intractable for most end users.
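A rough sketch of that idea follows: gathering the ambient editor context so the chat message itself can stay short. All the field names here are hypothetical, invented for illustration, not any real Notion API:

```python
from dataclasses import dataclass, field

@dataclass
class EditorContext:
    # Hypothetical fields standing in for the ambient context described above.
    page_text: str
    selection: str | None = None
    recent_edits: list[str] = field(default_factory=list)
    collaborators: list[str] = field(default_factory=list)
    linked_pages: list[str] = field(default_factory=list)

def build_request(user_message: str, ctx: EditorContext) -> str:
    """Fold ambient context into the prompt so the user doesn't have to."""
    return "\n".join([
        f"Page contents:\n{ctx.page_text}",
        f"Current selection: {ctx.selection or '(none)'}",
        f"Recent edits: {'; '.join(ctx.recent_edits) or '(none)'}",
        f"Collaborators: {', '.join(ctx.collaborators) or '(none)'}",
        f"Linked pages: {', '.join(ctx.linked_pages) or '(none)'}",
        f"User request: {user_message}",
    ])
```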

Q: What is the best approach for developing enterprise projects with security and privacy requirements?

Ali Rohde: Notion serves enterprises, and these enterprises have security and privacy requirements, plus I'm sure many others I'm not familiar with. How do you build with that in mind? 

Linus Lee: It's a really interesting problem to work on. From the beginning, when Ivan and Simon were working on it, we were very clear that our relationship with our customers was not going to change. We have data security and privacy practices in place, and we are still not training on anyone's data. We're not using your data directly to improve any of our models.

Your data stays where it is. The relationship you have with us, where we're custodians of your valuable data, doesn't change. A lot of people at Notion also use Notion for their personal work and personal life. I have my whole life inside Notion as well, and it would really suck if some of that data ended up in a model somewhere.

So we've really focused on being transparent and committed: we are not training on your data or using it in any of our engineering unless you expressly give us permission to. I think that works well to earn, and keep, the trust of our enterprise users.

That does raise interesting technical challenges, though. A lot of machine learning is about using data to improve and evaluate model performance. In the absence of cheap but morally questionable user data that you can't train on, you have to invest in other kinds of techniques. One thing we've spent a lot of time building out internally is synthetic data: generating very elaborate fake Notion workspaces that we can use to evaluate and train our models and see how they work. There's literally a script somewhere; you run it and it generates a new workspace for a fake company that we make up, with HR guidelines for the company, recent product releases, and so on.

Investing in those kinds of tools, and in infrastructure for evaluating and testing prompts and models on top of them, means we won't have to resort to the kinds of things we don't want to do. 
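In the spirit of the script he describes, here is a minimal sketch of a synthetic-workspace generator. The `generate` function is a placeholder for whatever LLM completion call you have available; the company naming and document types are made up:

```python
import random

DOC_TYPES = [
    "HR guidelines",
    "product release notes",
    "weekly meeting notes",
]

def generate(prompt: str) -> str:
    # Placeholder: plug in your own LLM completion call here.
    raise NotImplementedError

def fake_workspace(seed: int = 0) -> dict:
    rng = random.Random(seed)
    company = f"Acme-{rng.randint(100, 999)}"  # made-up company name
    return {
        "company": company,
        "documents": {
            doc_type: generate(
                f"Write realistic {doc_type} for a fictional startup named {company}."
            )
            for doc_type in DOC_TYPES
        },
    }
```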

Q: How do you evaluate if a summary is good or if it did a good job in a certain situation?

Linus Lee: Yeah, this is, I would say, at least on the data and model side, one of the toughest problems we work on, and it's definitely not solved for us or for the rest of the industry. There are a bunch of different levels at which you can talk about whether what a model is doing is good, and they go up in complexity. At the base level, the question is: is the model doing something blatantly incorrect?

If you ask a model to summarize something and it uses the wrong language, or the summary is longer than the original text, or it's off topic, or it's in the wrong format, giving you a bulleted list when you asked for paragraphs: those are blatantly incorrect, and I think you can pretty reliably catch them with things that look like unit tests. I know a lot of people in the community are working on model-graded evaluations, where you generate a response from a model and then ask something like GPT-4 whether it follows a certain set of rules.

I think that's a good smoke test, and we have things like that internally, but those can only catch blatant incorrectness. As a total side note, the way some popular language models are tuned makes them very chatty and verbose. A lot of times, when you ask one to summarize a blank document, it will write you a paragraph about why the document is empty, when it could just say there's nothing there. Checking for behavior like that belongs with the unit tests; it's a pass-or-fail kind of situation.
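A sketch of what those unit-test-style checks might look like. Language detection here uses the langdetect package, which is one possible choice, not necessarily what Notion uses:

```python
from langdetect import detect  # pip install langdetect

def check_summary(document: str, summary: str) -> list[str]:
    """Return a list of blatant, pass/fail-style failures."""
    failures = []
    if not document.strip():
        # An empty document should get an empty summary, not a
        # paragraph explaining that the document is empty.
        if summary.strip():
            failures.append("summarized an empty document")
        return failures
    if len(summary) >= len(document):
        failures.append("summary is not shorter than the document")
    if detect(summary) != detect(document):
        failures.append("summary is in the wrong language")
    return failures
```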

Slightly above that, if you have other ways of measuring correctness for a particular task, you can measure things more quantitatively. For example, if the task is to pull out action items, you can build a test set of documents containing action items and check how many of them the model managed to extract per example. That's a little more quantitative; it measures, as you mentioned, the quality of the work rather than just correctness. Ultimately, though, the arbiter at the end is always humans, and there's still nothing that comes close to putting things in front of real people.
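And a toy version of that more quantitative level: per-example recall for action-item extraction over a labeled test set. The data below is shaped like the real thing but invented:

```python
def recall(expected: set[str], extracted: set[str]) -> float:
    # Fraction of expected action items the model managed to extract.
    return len(expected & extracted) / len(expected) if expected else 1.0

test_set = [  # invented examples; a real set would come from labeled documents
    {"expected": {"email the vendor", "update the roadmap"},
     "extracted": {"update the roadmap"}},
    {"expected": {"schedule interviews"},
     "extracted": {"schedule interviews"}},
]

scores = [recall(ex["expected"], ex["extracted"]) for ex in test_set]
print(f"mean recall: {sum(scores) / len(scores):.2f}")  # 0.75 on this toy set
```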

So when we build things internally, Notion runs on Notion, and we do a lot of dogfooding. When we have new features, we roll them out internally to everybody, see how people feel about them and how they use them, and catch a lot of bugs that way. For features that seem more significant, we do canary testing: some portion of our global user base gets new or test versions of prompts, and if those seem to work well, we roll them out to everyone else.

Q: What haven't we talked about that is exciting to you either within or outside of Notion?

Linus Lee: That's a really good question. There's a lot; I'll settle on three, because rule of threes. I'll try to be brief, but if anyone has questions or wants to talk about it after, I'm happy to. The first is architectures beyond the transformer. Fundamentally, the reason transformers have been good, and the reason we have this whole AI situation right now, is that we have tons and tons of data we can shovel into these systems, and we have ways to train them in a massively parallel fashion. None of that is really unique to the transformer; the transformer just makes it possible to run these things very parallel. There have been incremental improvements to the transformer, but when you look at modern designs, it's actually quite surprising how closely we adhere to the original transformer architecture from six years ago.

To me it's inconceivable that this is the last language model architecture we're ever going to make, right? It's only been six years. There are other things, like state space models, that improve on transformers in various ways, and there are approaches that are simpler. I think the same is true of training algorithms, like RL, where there are approaches that purport to be better. I'm interested in what comes after the transformer, and in what we'll learn by understanding how those approaches work and why they work better.

The second is foundation models for scientific discovery. This is less my wheelhouse, but one of the nice side effects of the language model boom is that we now have tons of infrastructure, compute, tooling, and interest in running these very large models: evaluating them, studying them, and understanding how they work.

There are a lot of problems in the biological sciences, for example, or the social sciences, where it's fairly easy to collect medium-quality data but very difficult to understand the underlying phenomena. Protein structure prediction is the popular one, but there are a lot of others, in psychology and genetics, where you can teach models to draw correlations and reasonable causal relationships between data points, but where it's very hard for humans to go in and come up with an explanation.

So building models for these use cases, and taking advantage of the tooling and infrastructure we've gained to tackle those problems, is, I think, super interesting.

The third, and the one I'm most enthusiastically playing around with on the side, is model interpretability. There's a researcher named Neel Nanda whose work I like to read, and he has a way of putting it: we have computer programs that know how to write a sonnet, that can write a Dr. Seuss poem, and we don't know how they work. It seems like a thing we should try to understand, because it seems like it should not be possible, and yet we have these machines that do these things.

A lot of the stuff I did in the last few months of my year off was somewhat related to this: trying to control how models generate text by interfering inside the model, messing with model activations internally, and looking at latent spaces and embedding spaces in models.

I mean, what do these numbers actually mean? Part of it is just that understanding stuff is cool. I think it would be really cool to look at an embedding and say: okay, here's the number that corresponds to whether this embedding is about cars or not.

More practically, I think understanding how these things work is going to help us a lot. The difference between understanding which parts of transformers work and which don't is the difference between having better-performing models or not. So advances in interpretability are super interesting to follow.
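As a toy version of the embedding-probing idea he describes, a linear probe can recover a concept direction from embeddings. The data below is random, with a planted "about cars" dimension, just to show the method:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))      # stand-ins for 200 text embeddings
y = (X[:, 3] > 0).astype(int)       # pretend dimension 3 encodes "about cars"

probe = LogisticRegression().fit(X, y)
direction = probe.coef_[0]          # the learned "cars" direction
print(int(np.argmax(np.abs(direction))))  # prints 3: the planted dimension
```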

Ali Rohde: Awesome. And I know you're also a very prolific writer, and you're thinking about these things and talking about them. So, we'll have a question and answer session, but if people want to learn more, where should they go? 

Linus Lee: I have a website, thesephist.com. It's hard to spell, but if you find me on Twitter or Google my name, it'll be there.
