Recording: Demos and Q&A with LangChain's Harrison Chase, Chroma's Anton Troynikov, and Unstructured.io's Brian Raymond

Talking all about open-source and its future with the founders of some of the most popular open-source frameworks for AI

At our last Thursday Nights in AI, we interviewed and watched demos from three open-source leaders building in AI: Harrison Chase of LangChain, Brian S. Raymond of Unstructured.io, and Anton Troynikov of Chroma.

Watch the demos:

In our Q&A, topics covered include each company's north star metric, the importance of growth versus revenue when building a company around an open-source project, working weekends, the best advice they've ever gotten, the worst advice they've ever gotten, whether we need government regulation in AI, how similar founder life is to life on the TV show Silicon Valley, the future of open-source, each founder's role models, and more.

Watch the full conversation here. You can also listen on Apple Podcasts and Spotify. RSVP to future Thursday Nights in AI events here.


Key Takeaways

On the attributes of great founders:

Brian: "Great founders don't just chase tech; they hunt for market gaps. It's a tightrope act between customer needs and real-world know-how. Leading a startup isn't just hard work; it's about sparking innovation in the face of competition."

Harrison: "Having a good sense of product, storytelling, and design is crucial. It's what I enjoy doing and essential for understanding and empathizing with users."

Anton: "Truly exceptional founders are driven by a unique vision that transcends ordinary ambitions. Their unwavering dedication to their vision sets them apart, fueled by a passion that surpasses conventional limits. It's this unparalleled commitment that propels them to greatness in a sea of mediocrity."

On value creation over revenue generation:

Brian: "We want to build something that would deliver success... the only moat that we're going to build is product-market fit."

Harrison: "We're really in it to provide value... to be able to generate some real revenue, you've got to provide something of real value to people."

Anton: "I think the question of Revenue...the real question here is about value creation...to validate whether the thing you're building has value or not."

On the nature of AI development:

Brian: "There's been a lot of experimentation...a lot of failure and a lot of learning...how we can get these things into production."

Harrison: "You need to think creatively... There's a lot of shortcomings of LLMs but they're also good so if you can communicate that well...then maybe it's acceptable."

Anton: "The responsibility...is to get as much stuff out of your way as developers...enabling experimentation is really the number one thing that we can do."


Transcript:

This transcript was edited for brevity.

Q: Who uses your product, and what are they using it for?

Harrison Chase: Developers, both software engineers and data scientists and machine learning engineers, are using LangChain to help orchestrate a lot of their applications, making it really easy to get started up and running. And then people are using LangSmith to help go from that kind of prototype to production. I think it's very easy to get a Twitter demo that you can put out in five minutes, but then you're trying to iterate on the prompts, iterate on the orchestration logic, and even just have any idea of what's going on under the hood when you ship something. Everyone from startups to larger enterprises is using LangSmith for that particular thing.
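(For readers new to these tools: a minimal sketch of the kind of prompt-to-model orchestration Harrison describes might look like the following. This is an illustrative sketch only, assuming LangChain's LCEL pipe syntax and the langchain-openai integration package; exact module paths vary across versions.)

```python
# Minimal LangChain-style orchestration: prompt -> model -> output parser.
# Assumes `pip install langchain-core langchain-openai` and an OPENAI_API_KEY.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Summarize the following support ticket in one sentence:\n\n{ticket}"
)
llm = ChatOpenAI(model="gpt-3.5-turbo")

# LCEL composes the three steps into a single runnable chain.
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"ticket": "My order arrived damaged and support hasn't replied."}))
```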

Ali Rohde: Got it. And what are they building with it? What's the end product? 

Harrison Chase: You know, mostly chatbots. I think there's a lot of chatbots. There's a wide variety of use cases, but we're seeing a lot of conversational agents, I would say. And kind of simple ones that have one tool, like a search tool or a SQL tool. It's still a lot of conversational use in a chatbot setting, hooked up to one or two tools. If you go beyond that, it's rare to see those.

Ali Rohde: Do you think other things will emerge? Do you think this time next year, it'll be far fewer chatbots?

Harrison Chase: I think probably the percentage of chatbots will go down. Yeah, but I think it's a good UX.

The models still aren't reliable enough to just let them run wild. In a chat setting, you can see what's going on, you can correct it, and you can give it feedback. I think it is interesting to think about what a different UI or UX would allow for those things.

I think something like a manager looking over his workers, where he can kind of see what they're doing, review their work, and say, oh, you messed up this step, go back here. And maybe that's less of a chatbot setting. I don't think I've seen anything like that, but I like that idea. But yeah, we'll still see a lot of chatbots.

Ali Rohde: Anton, you were smiling while he was answering. What do you think?

Anton Troynikov: Oh, I was just smiling at the part where the models are not running wild. I think a lot of people are glad of that. 

Ali Rohde: Alright, Anton, tell us about you. Who's using Chroma?

Anton Troynikov: Yeah, look, I think at this point it's fairly clear, like I said earlier: retrieval is a fundamental part of how you build applications with AI. So the most expansive answer is anybody building an application with a large language model or another AI in the loop that needs to work with a corpus of data, or with data that you have available.

To make that more concrete though, I mean, we've got Chroma being used in everything from one-day hackathon projects to industry commercial deployments where we don't have our open-source distributed data plane out yet, but people have hacked it together by spinning up Chroma in Docker containers and just running it to try to scale it that way.
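(As context for the container pattern Anton just described: a hedged sketch of running Chroma as a server in Docker and connecting to it from Python, assuming Chroma's documented Docker image and HTTP client; details vary by version.)

```python
# Run the Chroma server in a container, then connect to it over HTTP:
#   docker run -p 8000:8000 chromadb/chroma
# Scaling "that way" means running several such containers side by side.
import chromadb

# Connect to the server running in the container.
client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # simple liveness check against the server
```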

One of the things that I'm most proud of is that Chroma is being used to advance AI research itself. I'm sure many people here are familiar with the Voyager paper out of NVIDIA Research, from Dr. Jim Fan's team. That uses Chroma under the hood as the storage system and the retrieval system for the skills and situations that that agent learns. And we're hearing similar stories from other AI researchers, and we didn't coordinate with them at all. I'm very proud to have built something that they chose for that project because it just works off the shelf.

Brian Raymond: I think broadly, we are Venn diagrams that are almost sitting right on top of each other in terms of user bases. We think about developers, most broadly, and they're using our APIs and our SDKs pretty heavily. Data scientists who are using the Python libraries, and increasingly data engineers as well.

In terms of the use cases, I think most folks are working with a RAG framework right now rather than populating knowledge graphs. On the use case side, we don't see as much, right? Because we're just on the data side. But think about it in four categories of ascending difficulty: information retrieval and search; then QA as a second tier; third might be summarization; and fourth is more process automation.

And there's been a lot of experimentation over the last year around all four, and a lot of failure and a lot of learning for all of us as an industry on how we can get these things into production. So I think increasingly we're seeing a lot of folks focus on the infrastructure layer: get the compute layer, the data layer, the model serving layer right. And then they're trying to prototype far more rapidly with the kinds of tools the folks sitting to the left and right of me are developing. So that's maybe the common theme, especially for the last quarter or two.

Q: What gave you the confidence to build a free open-source technology company at its core while ensuring it remains a for-profit venture?

Brian Raymond: Well, we watch Silicon Valley, and they said you want to stay pre-revenue, right? I'm just kidding. Look, I had product-market fit PTSD from the last company I was at, where we went from project to project. Historically, 80-95 percent of machine learning initiatives in organizations fail, right? And we wanted to build something that would deliver success. And I thought, in the first year, the only moat that we're going to build is product-market fit. We're not going to develop any real secret sauce. We're going to build something that delights users and then points us in the right direction. So, the way that we've positioned what we're building on the open-source side and on the commercial side is that we want it to be absolutely frictionless for a developer or data scientist who's trying to get a chatbot going to get clean data for the pilot or prototype they're working on.

So, let's make a fabulous suite of tooling for them to accomplish that without needing to swipe a credit card or go into procurement. But then also on the commercial side, let's give them something that's enterprise-grade so that when they're ready to graduate into production, they have something that meets their needs. That's how we're kind of thinking about positioning on it. And then also the investments in both the commercial and the open-source side.

Q: How many of you three make revenue with your products, and with your companies?

Ali Rohde: That makes sense. I guess before we continue on this question, a good question would be, how many of you three make revenue with your products, with your companies?

Ali Rohde: Alright, Harrison? Brian?

Anton Troynikov: Hell no. 

Ali Rohde: A "hell no" for Anton. Yes for Brian, and yes for Harrison.

Ali Rohde: How much of your focus is on revenue right now? Pause on Anton, because that's zero.

Anton Troynikov: I don't get to talk, I don't make any money. You see how it is, capitalism.

Ali Rohde: You guys have some of the most popular open-source projects in AI. And a lot of VCs believe that if you go build something useful, people will pay. I'm curious how you're thinking about the revenue side of the company, and whether you're getting any pressure from your VCs to think about that more.

Anton Troynikov: I'm taking notes. 

Ali Rohde: Harrison?

Harrison Chase: We have zero pressure to think about it more. Maybe answering a little bit of the previous question as well: LangChain started as a side project while I was at a previous job. It was open-source because I didn't think there'd be a company around it. And then the confidence to go in and build a company came basically from the space just moving super fast. What became apparent is that this is a really powerful technology and there's a super big opportunity, even though it's super early on.

And so, right now we're focused very much on the value creation side of things. And part of that, we definitely want to test hypotheses around what is valuable to people and what they will kind of pay for, but it's more around just validating the value that it's creating rather than trying to capture as much as possible.

Ali Rohde: So the revenue that you're generating now, is that more about hypothesis testing than an actual focus on revenue?

Harrison Chase: Yeah, I would say so.

Ali Rohde: Brian?

Brian Raymond: We had a board meeting last week, and we had a slide on this. We had a consensus there, which is: we've driven adoption on the Unstructured side, and we've achieved some milestones that we're all kind of proud of as a team. But the path leads through adoption to RAG success and then to revenue. And far too many people are not being successful yet, because we haven't created that value as an industry yet.

Because there are lots of cool Twitter demos going on, but we're hopefully at the inflection point where people can move into production. And so I think there's a recognition here. There is a recognition, at least from our investors and from our backers, that it's going to be a nonlinear path. And that this is a unique moment compared to like 10 years ago, 20 years ago, 30 years ago. You have Anthropic, you have OpenAI, and you have other LLM builders that are making huge capital investments. But the enabling technologies kind of came out of open-source around this, right?

We have this interesting relationship where we're all trying to figure out how all these pieces need to come together in a way that can deliver on the promise of generative AI for business value. And so, I think that's the task that we have here in 2024: how do we push that into production?

Anton Troynikov: Am I allowed to push back on you? 

Ali Rohde: Please. 

Anton Troynikov: I think the question of revenue, and I think the real question here is about value creation. And I agree with what Harrison said as well. The purpose of trying to get revenue from the market is to validate whether the thing you're building has value or not.

But to look at this as a revenue question, or how big is your company going to be, or, you know, what's going to turn out to matter. My mental model for where AI is at is like the web in 1994. People have just figured out that putting phone books online might be a good idea, right? And you can't talk about how big Amazon is going to be in 2020 based on the fact that people put phone books online in 1994.

This is a period of experimentation and, you know, these guys spoke to it a minute ago, and I'd like to add to it that I think the responsibility of, in particular, these companies to the ecosystem is to get as much stuff out of your way as developers so you can actually do those experiments. I'm a lot less negative on those Twitter experiments than other people are because I'm old enough to remember when the Dancing Baby GIF was online.

Half of you have no idea what I'm talking about because I'm too old. But it was essentially the first viral piece of media that went up on the web. This thing was on late-night talk shows; sitcoms would make jokes about it. And it's a toy. But it changes how you think about what the medium can do.

It changes the web from being a place where you upload a phone book to a place where you can display dynamic media and content. And then you start thinking about interactive content. And in parallel, Peter Thiel and Elon Musk are thinking about payments for the internet and also libertarian paradises on floating islands.

All of these things have to come together, and so enabling experimentation is really the number one thing that we can do. Also, thinking with my business person hat on: if the giant opportunities are in the future, we have to get this entire ecosystem and industry to move forward enough that we can reach those giant opportunities. Yeah, that's my response to the revenue question.

Ali Rohde: I appreciate that. I'll push back a little bit more to Harrison's point of revenue as a form of experimentation. What about that? Isn't revenue useful, at least there?

Anton Troynikov: Oh, absolutely. I agree with you. And, you know, Chroma is moving towards its first revenue as we launch our cloud product in the next couple of months. And certainly, we'll be charging for that. And it's a test. The aim of the company is not to ramp revenue. Chroma is in this interesting place as a business. On the one hand, we're dealing with a new technology and a new ecosystem that is getting figured out and built for the first time, and nobody really knows where the value is getting created. Although I think it's going to be in business process automation, which is possibly the single most boring opinion about AI you'll hear in San Francisco. So I think it's real, and we're dealing with this new technology. But on the other hand, Chroma is ultimately a data business, and data businesses have a great path to becoming large businesses.

So I'm much more interested in what we need to execute on in this ecosystem to follow that path and build a generational company.

Q: What is your north star metric as a company?

Harrison Chase: Oh, that's a good question. I think the north star metric is probably the number of applications that we see out there, weighted by creativeness and inventiveness, times whether they're actually being used in production and actually creating value.

And I agree that experimentation is super important. I also agree that you hopefully want to see some places where it starts to catch on with people using it. So, the number of applications that we're helping enable, whether they're using LangChain and not LangSmith, or LangSmith and not LangChain, with particular bonus points if it's really creative and future-looking.

Brian Raymond: Yeah, ours are boring. It's files processed and the number of folks using us. And then also looking at the contours of usage, so we can back into who's actually moved something into production; we can kind of see the success there. And there are a lot of smaller LLM-enabled startups that have a ton of volume moving through us.

And we could see that they're commercializing, they're being successful, they're turning into larger businesses. That's that RAG success, or at least like Gen AI success that they're having. And so that's what we're looking at in the data, day in and day out.

Anton Troynikov: I mean, Chroma is more purely something that you install on your machine. We have telemetry, which is well-documented, so we see what people are doing. In purely empirical terms, it's the number of people who spin up a Chroma instance, fill a collection, and then go on to query it and continue querying, right? It's actually a fairly straightforward user acquisition and retention set of metrics. But again, you know, we get most excited by the same sort of stuff that Harrison talks about: creativity and real production deployments where people are clearly getting value out of this. I'm actually also super bullish on these chat-with-your-documents applications. I think they're hugely useful. Yeah.
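(The loop Anton describes, spin up an instance, fill a collection, query it, is short enough to show directly; a minimal sketch using Chroma's Python client:)

```python
# pip install chromadb
import chromadb

client = chromadb.Client()  # in-memory instance; PersistentClient stores to disk
collection = client.create_collection("docs")

# "Fill a collection": Chroma embeds these with its default embedding function.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Refunds are processed within five business days.",
        "Forklift manuals live in the logistics wiki.",
    ],
)

# "Query it": the query text is embedded and matched against the collection.
results = collection.query(query_texts=["how long do refunds take?"], n_results=1)
print(results["documents"])
```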

Q: Speaking of those builders you're enabling, with your unique perspective on the ecosystem, are there any successful technical approaches that you believe more people should take note of?

Harrison Chase: Can I add to that question, Brian? Because you said that RAG success was the blocker there. People aren't being successful, so do you have any hot takes about why?

Brian Raymond: You've probably heard this before, Harrison, but we use this kind of made-up vignette of Hank, who works in logistics at some forklift manufacturer or something like that, and Hank's been given a new chatbot, right? To help accelerate his work. He's running a logistics department, and when Hank is given this, is he delighted by it, or does he say, "This seems like a pile of shit," and it just pisses him off, and he says, "Cool prototype, bro, but I'm not going to change my workflows for this"? When it comes down to it, you're going to realize the business benefits when workflows start changing, right?

And that's what, in my mind, RAG success is: when Hank, who couldn't care less about LangChain and Unstructured and Chroma but just wants to do his job faster and better, can actually realize those benefits. And so it's the non-technical user across the enterprise, and lots of different use cases where you can start delivering that value. You can start harvesting all this human-generated data that an organization produces, for example, that the forklift manufacturer might produce.

Feed that back into all of these different workflows pretty elegantly, and then it has sticky adoption, right? And it's driving value. 

Harrison Chase: But why isn't that happening? 

Brian Raymond: It's the point that you made earlier: it's too early to set the models loose. The reliability of the inferences isn't there to do more ambitious things. We saw so much hype around agents in the first half of last year, and there are companies like Adept, and I'm not saying anything bad about Adept, but with really tall ambitions, right?

And the models weren't sophisticated enough; they weren't sufficiently performant to support those ambitions. We're going to get there, right? But for the multistage, agent-driven processes where you're going to start seeing that, I think the vision is there. On the LLM side, for the folks making those huge capital investments, we're going to have to go a generation or two further ahead on the models, and then also refine the architectures so that we can drive that reliability and that performance.

Harrison Chase: Yeah, I think that last point is maybe the thing I'm most excited about. I forget the original question, but that's the thing that seems interesting to look into.

There was a paper that came out, I think this morning, from Codium that had some really good results on coding. Basically, their whole thing was the architecture that they used. They didn't have a new model, but they had a different way: they generated tests and then ran the code that they generated against those tests to see if it would pass, so it had a feedback loop. The cognitive architecture wasn't just a language model calling a tool, calling a language model, calling a tool. They imparted their domain knowledge of how a coding workflow works, maybe even as humans would; there's test-driven development for a reason. They encoded that into a graphical representation of the flow, and it yielded pretty good results.
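(In rough pseudocode, the generate-test-retry loop Harrison describes might look like this. This is a hedged sketch of the general technique, not Codium's actual implementation; llm_generate_tests, llm_generate_code, and run_tests are hypothetical helpers standing in for model calls and a sandboxed test runner.)

```python
# Sketch of a test-driven feedback loop around a code-generating LLM.
# llm_generate_tests / llm_generate_code / run_tests are hypothetical helpers.

def solve_with_feedback(problem: str, max_attempts: int = 5):
    tests = llm_generate_tests(problem)    # model writes tests first
    code = llm_generate_code(problem)      # first candidate solution
    for _ in range(max_attempts):
        passed, failures = run_tests(code, tests)  # execute candidate vs. tests
        if passed:
            return code
        # Feed the failing-test output back to the model and try again.
        code = llm_generate_code(problem, feedback=failures)
    return None  # no candidate passed within the attempt budget
```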

So, even if these models do become better, I still think that for reliability, in a lot of cases, you'll need to think more deeply about the flow that's going on. The second thing I would say is to also think creatively about different UXs.

There are a lot of shortcomings of LLMs, but they're also good. So, if you communicate that well to the user, then maybe it's acceptable. How do you communicate that well? I think when I talk to a bunch of people these days, a lot of it's around UX things that they're trying to experiment with.

Anton Troynikov: Look, something that we've been saying since February, when we launched, is that simple vector search will never be enough to make retrieval work robustly in the AI context. You can't just expect to embed a document or a document passage and a query and have your application work robustly out of the box.

Now, at the same time, Chroma's entire mission is that developers shouldn't think about the retriever at all. It shouldn't be complex for them. We're looking at the best approaches to solving a lot of these problems and then just shipping them directly into the product. This year, we're spinning up our applied research competency looking directly at applications of AI in retrieval.

What's interesting is that information retrieval is by no means a new area of research or science, right? But it's kind of been this dusty corner of natural language processing for two or three decades. The reason is mainly that the beneficiaries of information retrieval advances have been either the sort of companies that needed large-scale recommender systems, so you're looking at, you know, Amazon, Spotify, Pinterest, or web-scale search at Google.

And there's only so many of those that can exist, because they're very much aggregators. Now, every single AI application has an information retrieval component. So not only is the problem surface area much larger, because these information retrieval systems are being deployed in many more domains, but the return on investment for working on these problems is much higher.

So, I expect, and we have brand-new tools, right? We have general-purpose natural language understanding systems, the same LLMs we're using to start doing some of this process automation. That's very exciting. That just blows the lid right off what's possible, and you know, we intend to contribute to that and figure out how to solve some of these problems beyond vector search.

Ali Rohde: Awesome. Thank you all for that. We'll switch into a rapid-fire session in the last 10 minutes, and then we'll go to the audience Q&A. All right, I'll just point out who will start. 

Q: One AI company you're bullish on, present company excluded.

Brian Raymond: I love the work that Mistral's doing.

Ali Rohde: Mistral. I love it. Harrison? You're allowed to say OpenAI if you want, but... 

Harrison Chase: I'll say Fireworks AI. Open-source model hosting and fine-tuning, from the creators of PyTorch.

Ali Rohde: Alright, Mistral, Fireworks, Anton? 

Anton Troynikov: Since I can't cop out, I am going to name a specific division of one company. I think Google Brain Robotics does excellent work. Specifically, they are probably in the lead in understanding how we can use large language models in a physical computing context, and they've put out just fantastic work over the last 12 months. I love seeing what they're doing.

Q: Do you use Copilot regularly?

Anton Troynikov: Yes, all the time. 

Brian Raymond: We're about 50/50 in the company.

Harrison Chase: At the company, I think it's about the same. I don't use it myself.

Ali Rohde: No? Why not? 

Harrison Chase: I don't get to code much anymore. 

Ali Rohde: Ah, okay. All right.

Harrison Chase: But I use ChatGPT a lot when I code. That's my main thing.

Q: How many hours of sleep do you get every night?

Harrison Chase: Seven to eight. 

Brian Raymond: Yeah, probably about the same. It was less before; I have three small kids, and that was the main driver. But about seven to eight these days. It's good.

Anton Troynikov: Anywhere between 3 and 18 when I recently got COVID.

Q: What time did you leave the office last night?

Anton Troynikov: Oh, good question. I can check when the alarm turned on, but probably around 9 p.m.

Ali Rohde: 9 p.m? Brian? 

Brian Raymond: 10 or 11. 

Ali Rohde: 10 or 11. Harrison?

Harrison Chase: I left around 6 p.m.

Ali Rohde: I love it. 

Anton Troynikov: Doesn't mean any of us stopped working. 

Ali Rohde: Yeah. Fair enough, fair enough, fair enough.

Anton Troynikov: The food is in my house. I have to go there. 

Q: Do you work weekends?

Anton Troynikov: Pretty much every weekend, yeah. 

Brian Raymond: Maybe four hours. 

Harrison Chase: Yes. 

Q: On a scale of one to ten, how similar is your experience of Silicon Valley life to the TV show "Silicon Valley"?

Anton Troynikov: It's a hundred, a hundred percent accurate. 

Brian Raymond: Zero. I live out in the country outside of Sacramento. 

Harrison Chase: Eight or nine. 

Anton Troynikov: One of the best things about that show is watching it in mixed company. And when I say mixed company, I mean watch it with a person who's done a startup in Silicon Valley and San Francisco and someone who's never been in this whole ecosystem. You will laugh at different jokes. It's the most bizarre thing.

Harrison Chase: I watched it before coming here, rewatched it last year, and it hits differently.

Q: Favorite open-source model?

Brian Raymond: I'll just stick with Mistral. 

Anton Troynikov: Yeah, I mean, look, probably the fine-tuning work being done with retrievers and Mistral.

Harrison Chase: Probably Mistral. 

Anton Troynikov: Llama 3 is dropping soon, Zuck. It's going to be okay; I'm still your friend.

Q: What editor do you use?

Anton Troynikov: VS Code. 

Ali Rohde: VS Code. Brian?

Brian Raymond: None. 

Harrison Chase: PyCharm. 

Q: In five years, will the gap between OpenAI and the next best model provider be bigger or smaller than it is today?

Anton Troynikov: Smaller. 

Brian Raymond: Smaller. 

Harrison Chase: Smaller. 

Q: What would you be doing if you weren't building your company? 

Harrison Chase: I'd be building an application company that uses long-term memory or something like that.

Ali Rohde: Okay, this is very specific.

Harrison Chase: Here's a free idea. The number two complaint about Character.AI, besides the not-safe-for-work thing, is that it forgets stuff about you. And I think people want that memory. So, yeah, some sort of, I don't know if it's a chatbot or an advanced journaling app or something like that. But, yeah, I'd want to build that.

Brian Raymond: I don't have a good answer for this. This is too hard. I've done so many different things in my career. I've worked at the CIA, done investment banking, and done this. I love this space right now. It is so much fun. I would be doing something else in this space, but something in the NLP world or multimodal models. I just wake up every day, and I'm like, it's so exciting. Like, this is such a cool time. So, something in this space.

Ali Rohde: Why is that? Why are you so excited right now? 

Brian Raymond: It's moving so fast, and there's so much happening. It's not being driven by big companies. Big companies are enabling it, but it's being driven by thousands, tens of thousands, or hundreds of thousands of folks globally. And it's so exciting to witness. I don't think we've ever seen anything quite like this. We're going to look back in 20 years and see that this was a really special time.

Ali Rohde: I love it. Anton? 

Anton Troynikov: Look, the comedy, meme-y answer is that I'd probably go on a vision quest and figure out what to build. Realistically, though, I think the space is so accessible that I would just be experimenting and trying things that I personally found entertaining. One of the things that I think about here and there, and work on as a little side project, is how we can automate mathematical reasoning using language models.

While on my vision quest, I would pour time into that and things like it. I think the best thing anyone could be doing in this ecosystem right now is just wild experimentation, because you don't know where it will pay off. If people already knew what would pay off, there'd be no point, but that's not the case.

The fact that we don't know yet where all the great things will come from is a huge opportunity for everybody. And AI makes it so accessible. It's like that analogy I gave with the early web: back then, you could just put up a website, and everyone in the world could see it.

That's how accessible it was. And now you can get the most advanced AI model in the world over an API for a couple of bucks a week, maybe. That's huge. I would just do stuff.

Q: What do you expect to see from OpenAI this year?

Brian Raymond: Better, faster, cheaper. 

Anton Troynikov: Stable governance of the company. 

Ali Rohde: Stable governance. One can hope. Harrison?

Harrison Chase: More agent-like things.

Q: On a scale of one to ten, how much does OpenAI affect your product roadmap?

Harrison Chase: Eight or nine. 

Ali Rohde: Eight or nine. Brian? 

Brian Raymond: Two or three. 

Anton Troynikov: I think, yeah, anything that they're doing specifically doesn't impact us. But it was great to see, at their developer day, a talk they called, I think, something like "Improving LLM Performance," where 80 percent of the talk was things you need to do to your retrieval system to make it work better. For us, that was enormously validating, because, well, we've been saying this. So we stay the course. It's more like, I don't know, maybe what Chroma is doing influences OpenAI.

Q: What's one of the best pieces of advice you've gotten, either in your founding journey or just your life? 

Brian Raymond: My father-in-law gave this to me a few years back. He's like, "Bet on yourself and bet on yourself early." I had never done anything entrepreneurial before, and that was really formative for me, and kind of gave me the confidence to go and give this a shot.

Anton Troynikov: Yeah, one of our investors, Nat Friedman, said to me, "Before you do the radical thing, you've got to get permission from your users." Which means: have these big, ambitious goals for what the future could be like, but make sure that the people you're building for support you in doing that, and deliver something that they want now. This is why we spent 12 months calling Chroma a vector database, because we're not; we're a retrieval company for AI. But to get it into people's hands, we had to call it a vector database.

Harrison Chase: I think my parents have always told me to, you know, do something that I enjoy doing, and I love what I'm doing right now.

Q: Worst piece of advice you’ve ever gotten?

Anton Troynikov: It's pointless to build this company because Pinecone and Weaviate already exist.

Brian Raymond: "Don't listen to others. Just build what you think you ought to build." I think that's terrible advice. Listen to others, test, retest, iterate.

Harrison Chase: Some early conversations with VCs who said, you know, thinking about how to monetize is really important.

Anton Troynikov: I think one of the worst pieces of advice to you came from me. Because remember we had that conversation. You were pitching me on LangChain. I was like, "Harrison, I don't understand what this is."

Harrison Chase: Yeah, yeah, yeah, it was at some happy hour thing. I had four ideas that I was thinking about at the time. I think the second one was LangChain. I think he walked away after that in disgust.

Anton Troynikov: So, no, it wasn't disgust; I got distracted, possibly. But yeah, that was a terrible piece of advice to give to Harrison.

Q: How often do you check your star count on GitHub?

Anton Troynikov: Star count, rarely. Python downloads, daily. 

Brian Raymond: We were checking a lot, until we realized we weren't going to get a lot of GitHub stars. So, not a lot. There are a lot of other metrics that are really important to us today.

Harrison Chase: Similar to Anton, I think Python downloads like every few days. GitHub stars, not really.

Q: Policymaking, regulation of AI, and of large language models. Yes or no? 

Harrison Chase: No, but I'm not really smart enough to have an opinion on it. 

Brian Raymond: No.

Anton Troynikov: I've never seen an industry try to regulate itself before it existed. So, I'm just deeply confused.

Q: In five years, will there be more or fewer software engineers than there are today?

Brian Raymond: I think it's going to be more. I think it's going to create a lot more surface area for value.

Anton Troynikov: I think more, and I think what a software engineer does will change.

Harrison Chase: I think more, but, like, plus-plus on what Anton said: what a software engineer does will change.

Q: How many years until AGI?

Anton Troynikov: I wrote an article about this. I said 2034. I'm willing to stick to that deadline.

Brian Raymond: We're like 20 years off. Twenty years. If you knew the stupid stuff we're dealing with on just ingestion and pre-processing, it would put such a lens on this AGI talk. It's ridiculous. So anyway, I'm looking at it from a very different perspective.

Harrison Chase: 10 to 15. 
