How Artificial Intelligence Is Sculpting the Future of the Startup Ecosystem [Podcast #104]

In this episode of the SOL Podcast, we are joined by Eric Landau, the highly accomplished co-founder of Encord. With years of experience in the tech industry, Eric has a proven track record of driving growth and innovation through his expertise in product development, marketing, and business strategy.

Tune in to the SOL Podcast to hear about Eric’s journey with Encord, an active learning platform for computer vision that helps teams annotate, manage, and evaluate their training data.

Ozan Dagdeviren

Hello and welcome to the Startups of London podcast. I’m your host, Ozan, founder of Startups of London. Today I’m joined by Eric Landau, founder of Encord, and it’s a fascinating business. The description that I can see on the website is that it’s an active learning platform for computer vision, so it does all sorts of things: AI-assisted labelling, model training, and diagnostics. Find and fix dataset errors and biases, all in one collaborative active learning platform, to get to production AI faster. First of all, welcome to our chat, Eric.

Eric Landau

Yeah, thank you very much.

Yeah, it’s really nice to have you here. I know how strong your experience and background in this field are. But before we get onto that, let’s start off with your speculations, and your feelings perhaps, about the spotlight AI has been getting with all these discoveries around generative AI, Midjourney, ChatGPT, and everything else. It has become more of a household name, right? We were talking, it’s going to be a big thing, a big thing, and then we waited for 10 years and it never became that big thing. And all of a sudden, was it November last year, or December? It became a big thing overnight. A “10 years in the making, overnight” kind of situation, I think. How do you see all of this? Do you think it’s positive? How do you think it’s going to change how people understand AI? Let’s start with that.

Yeah, no, I think it’s definitely positive, especially that there’s much more public attention on it, and I think it’s just good for the general development of AI that a bunch of startups are now being built around things like the OpenAI API and ChatGPT. A lot of people are thinking about it, but it really is an overnight sensation 10 years in the making, as you said. From our perspective, it’s really been more about incremental progress, and the main change in November of last year, when ChatGPT was released, was just the way that it was packaged and presented. The technology hasn’t really changed very much. GPT-3, which is the main model that ChatGPT is based off of, has been around for almost three years, and its capability has been very similar to what you see with ChatGPT, except now it’s built in a way that the public can see the power of AI directly, they can interact with it directly, and it’s much more accessible. That perceptive change is the main one, which has allowed OpenAI to make ChatGPT, I think, the fastest growing application in history. So, it really isn’t a technological shift. The technology has been developing now for the last 10 years, but it’s more the way that it’s presented.

Yeah, I think that’s a good point. I remember when I first read about ChatGPT a few years back, before last year, I think I first heard of GPT and there was a mention of Geppetto. Right? Have you heard of the GPT name coming from that?

Yeah, I mean, there’s a long history of these large language models and the previous iterations of them, GPT-2, as you said, and BERT before them. We’ve been in the space for a little bit of time, kind of watching the language part of AI develop, and it has felt more incremental. But seeing ChatGPT now, it’s pretty amazing to see everyone clicking in and seeing the power of these models.

It’s crazy, and I think you’ve put it really well. It is more about the front end and the interaction interface, because there were some people who really understood how it worked, and they were saying, oh, AI is going to be huge. And people were looking at them like, I mean, Alexa doesn’t feel like that. It can’t even get a Spotify song right. So, how is this going to happen, right? Like, in 2050 or something? But yeah, as you said, with the front end, the interaction, the interface with the users being established, I think what happened was, we as humans have this inherent tilt to anthropomorphize; in a way, we give them human attributes, right? So, it becomes an emotional thing, and we think, okay, this is just creepy. I think that kind of very emotional connection shaped how people saw AI, and for the first time people really got awakened in terms of the possibilities. Oh wow, this is going to be really, really disrupting, right?

Yeah, that’s right. It’s turning a model into an application which is very resonant with human emotions. And one of the reasons why we see less of AI in the regular world is because the bar, the threshold for performance, for AI models, in most of the applications it works on, is very, very high. So, self-driving cars, they need to be right not 99% of the time, but 99.999% of the time, right? Even 99% will cause a lot of car accidents in the street. But ChatGPT, or a chatbot, can afford to be right only 90% of the time, or 95% of the time. And it can be trained in a way that people can interact with it very fluidly. So those two changes, the performance error threshold for the model and the ability to really interact directly with the public, have been the main spark for the current growth and craze in AI, and more specifically generative AI.

How does thinking around computer vision differ from, or compare to, thinking around generative AI or language-based models?

In a way, the fundamentals are the same, where the base of all these AI models is having the right data and the right training data. The main difference is that these generative models, or a lot of these foundational models, are trained in what’s called an unsupervised way, versus many predictive AI applications, or ones in kind of traditional computer vision use cases, which are trained in what’s called a supervised way. Both require huge amounts of data to train the models, but one requires having labelled training data, examples to learn from, and with the other, you just give it the data by itself and it kind of learns from the inherent patterns of the data. So, with supervised learning, you have to tell the machine what everything is. An example is you might have a video and you’re showing it cups and chairs and tables, and for each item in the video you have to say, hey, this is a chair, this is a table, this is a cup. So, you’re pointing directly to what you want the model to learn. Versus unsupervised learning: you give it a bunch of data and you say, here’s kind of the general structure, and you need to learn from the structure of the data. And text lends itself very well to this, because if you give a large language model Wikipedia, for instance, you don’t have to label anything. You just have to ask it to predict: given a sentence in a Wikipedia article, please predict the next sentence or the next word. And so that doesn’t require any additional human labelling; it’s just giving it the data with a task that allows it to learn from the data by itself.
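A rough sketch of the contrast Eric describes: a supervised dataset needs a human-provided label per example, while a next-word-prediction dataset can be derived mechanically from raw text. This is an illustrative toy, not Encord's tooling; the filenames and labels are invented.

```python
# Supervised learning: every example needs a human-provided label.
supervised_examples = [
    ("frame_001.jpg", "chair"),   # a person had to say "this is a chair"
    ("frame_002.jpg", "table"),
    ("frame_003.jpg", "cup"),
]

def next_word_pairs(text):
    """Turn raw text into (context, next word) training pairs -- no human labels."""
    words = text.split()
    return [(tuple(words[:i]), words[i]) for i in range(1, len(words))]

# "Labels" fall out of the raw text itself: the next word is the target.
pairs = next_word_pairs("the cat sat on the mat")
```

Each pair asks the model: given the words so far, predict the next one, which is the self-supervised task Eric describes for Wikipedia-scale text.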

That’s a very, very good point, actually. Yeah. Because text itself is an abstraction, and according to some philosophers, it is the way that human intelligence actually developed, right? Thinking through language is what in most ways defines most of our intellectual existence: abstract concepts, being able to compare them, et cetera. But with vision, there is another layer there. What we see is not always describable in words, for humans as well, actually, right? There’s an assumption layer that we make. We might look at an object and think, okay, this is a circle, and then it’ll rotate and we will realize it’s a bottle. So, there is always that uncertainty that the human brain tries to work through in daily life, making assumptions. So I understand it as a much more layered and complicated, perhaps, barrier to break. Or is that off the mark? Do you think there’s some truth in that?

No, I think there’s some truth in that. You could say that language is in a sense the natural operating system for the human mind, because even if you’re perceiving things visually, even in my example from before, where you have a table and a chair and a cup, we’re parametrizing the visual concepts with words. So, the word chair describes the thing that looks to us like a chair. The concepts are really encapsulated by the linguistic patterns that we have, and that helps shape the way that we think about the world in a more intrinsic sense. Of course, language is not the bare bones of how we think about things. Everything linguistic is pointing to some other higher-level concept. It’s just a very useful way, from a human perspective, to organize our principles in the way that we’re interpreting the world. So, the mental model that we have tends to really be underwritten by language itself. So, I think there is something there. It probably requires a little bit more thought on how it works, and it’s not something that I’ve really considered too deeply myself, but I think it’s an interesting observation.

I think that’s just incredibly fascinating, and this is the first time I’m thinking through this as well. Because in a natural language model, the abstraction is also the level we are operating at. So, when we say chair, chair essentially is a placeholder for whatever the person is going to understand, right? A chair, visually, might mean a million different things, but we use chair as a placeholder. So that’s the level that we operate at. So, when text completes and it sounds right, it’s easy, I think, for us to ascribe a certain intelligence to the engine. Whereas in the visual realm, and perhaps because the visual cortex at the back of the brain, the occipital lobe, if I’m not mistaken, is the first part of the brain, and the biggest part of the brain as well, we have a huge processor for understanding the nuance and the light and the direction and the speed of objects, and that’s basic to our survival. So it is much more layered. So, I think it is also easier to find faults in an AI system when it performs visual recognition rather than chat. And I’m guessing that’s kind of diving into the realm of Encord. And maybe with that segue, do tell us about the business, and what the model is, and what are the technical challenges you’re working on.

Sure. So, I think you did a good job introducing the company in your introduction. But not to repeat anything: we are what we call an active learning platform, and what that means is we work with a lot of companies and institutions and groups that are building computer vision models themselves, and we are the infrastructure layer for their training data. In managing their training data, annotating their training data, and evaluating it with respect to their model and their model’s performance, we kind of handle the base layer to make sure that all of the potential training data problems that they have are solved, so that they can focus on really building strong models, deploying them into production environments, and making sure that they work in practice. So, what we offer is a software platform to help solve these various problems, and we work with a lot of sophisticated AI companies that have a huge corpus of training data and need these things to be resolved in efficient and scalable ways.

So, if I were to speculate, just from a market and market-size perspective, in terms of the commercial potential for the business, please correct me when I’m wrong here. There is a certain interest that has been proven by the functionality of AI, especially around generative learning and also in computer vision, and companies want to use these. But they don’t necessarily have the infrastructure to be able to do that. They don’t know how to create and perhaps work through their data sets. That expertise is very expensive. So, you are a layer of service, in a way, that is enabling these companies to be able to use their own models. Is that the case?

Yeah, it’s an infrastructural software platform to solve the more granular training data problems that these companies have.

Is there an example you could perhaps share with us so we can maybe understand it more?

Yes, I think that’s a great idea. An analogy: a lot of these companies train their models in cloud infrastructure, so in AWS or GCP, and they use the Kubernetes clusters and the GPUs offered by these larger cloud infrastructure providers. The analogy is we do a similar thing, but on the training data layer. To give an example, back in the supervised learning context, where you need to label a ton of data to feed to the models to actually train them properly: what a lot of companies were doing before Encord, and similar companies in our space, was building their own internal tooling to handle how to annotate this data, and how to do it in a way that was semi-efficient and scalable. And it’s very hard for companies to build this internal tooling themselves, because it’s quite expensive and requires a lot of maintenance and a lot of work. So, one of the main components that we offer is this toolkit to be able to annotate data efficiently, and to do it in a way that can also use automated strategies to generate a bunch of labels quickly. So, if you have a car that’s driving on the street and you’re trying to train an autonomous self-driving system, you want to label all the pedestrians. And the way that this was being done before, you would just get a thousand people to go in and draw boxes over every pedestrian that they see in the video.

Or use reCAPTCHA, that was the purpose of it, right?

I actually have an interesting story on that, but I can get to it…

I’ll remind you in a second.

Yes. And people were using these other kinds of strategies as well, but the main way was just shipping the data overseas, getting it annotated by factories filled with people, and then having it sent back, corrected, and re-specified, and then sent back overseas, et cetera, et cetera. And that process is just very slow and cumbersome. So, we offer this annotation toolkit and platform that allows you to do all the pedestrian labelling in a much more intelligent and scalable way. And one of the strategies that we use is this idea of micro-models: specialized models that are trained to do one specific thing very well. So you have a very quick model, focused on pedestrians in a video, and you use that to help label a bunch of pedestrians quickly. So that’s one of the toolkits that we provide to these AI companies.

Okay, it makes more sense, it’s more tangible in my head now. With that in mind, I’d like to ask you, before we get to the reCAPTCHA thing: for most startups that I have interviewed over the years, there are usually a few points, like the first big wins. These are sometimes new big clients, sometimes investment. But let’s not focus on the investment side, rather on the first big win for the business. What was that like? What’s the story you could tell us there?

The very first big win, I would say, was getting into Y Combinator, which happened after our first year. The company started with my co-founder and me, and we built the initial product during the pandemic, essentially.

You were in the 2020 cohort?

Yes, 2020, exactly. And so, we were doing it from our couches, and we spent a lot of time kind of developing it ourselves. And we were thinking, okay, we have this product now, and we can think about raising money. We are based in London, and my co-founder had the idea: oh, let’s apply to Y Combinator. Originally, I was sceptical, because the acceptance rate for companies to get into Y Combinator is very low, I think at the time under 2% or something like that. But we spent the time, wrote the application, were lucky to get an interview, did the interview, and then were very lucky to get in. And I think it really changed the trajectory of the company. Because really, looking back on it now, if we had gone out to try to raise money from where we were, we would’ve just failed, and failed pretty badly. So, it was a huge win.

It’s such a badge of approval. It’s amazing. It’s big, big, big legitimacy for the business. And investors, they want to take risks, but it’s just like the “nobody got fired for buying IBM” type of tale, right? When you invest in a company that’s gone through Y Combinator, the person who’s making that decision will not get fired, because they can always say, hey, look, they’ve gone through Y Combinator, right? At least. And that really turns out to be incredibly important when people take risk. Yeah. Carry on, please.

No, definitely. And it’s the validation that you get and the signalling effects, but also the mentorship and the network and the advice. All of that was very positive for us, and I think put us in a very good foundational place to build the business from just kind of a product that we had into a proper business. So that was definitely a huge win, and it was one of the pieces that brought us to where we are today.

Amazing. In terms of the customers, the B2B side, or some big partnerships, anything that comes to mind?

Yeah, so the other big win that we’ve had since is getting some of the most sophisticated AI companies in the world to work with us. It’s really a pleasure to see some very sophisticated companies that are building production-scale models, working at a scale that is quite impressive, to see how they work and what problems they’re having, and to let that shape a product that really serves them well. Serving the most sophisticated AI players in the space matters, because if you build for them, then all the new companies coming up now, the ones that are just starting to build POC models, for instance, will have a similar set of problems to what the more sophisticated players have right now. So, in a way, we’re allowed to see the future by working with the companies that have built their entire model infrastructure and gotten models working in the real world. And some of those include companies like Tractable and Iterative Health, companies which we’re very impressed by and which we’re glad to work with.

Amazing. I mean, this is such an exciting field, and the real-life applications are what makes it exciting, the way that I see it. So, what jumps out at you? What do you find most exciting in terms of the real-life applications of your current or potential customers, or potential use cases for what you’re working on?

Yes. So, one of the areas that we work in quite a lot is healthcare, and there are a lot of subspecialties and sublocations within healthcare. But I think AI applied within the healthcare field has the potential to really turn healthcare from being a de facto reactive system into a proactive system: to focus more on preventative measures and finding things early, in a way that lets you treat things much faster than if you wait a long time and have to come in at the end of the cycle rather than the beginning of the cycle. So, a lot of these diagnostic use cases, finding cancer in gastroenterology use cases, or detecting strokes in radiology, all of these things which we are working in, I think, are very exciting and have the potential to really make a transformative effect within medicine.

Amazing. If I were to speculate on that a bit: I was having a conversation with a really smart person that I respect. He was running a marketing agency back then, and we had a chat about how correlation is a lead indicator rather than causation, in some cases, especially in complex systems. You don’t always understand why something happens, but if you see correlation, okay, that usually can save you a bit of time. Correct me if I’m wrong here, because you have more in-depth information on this than me. I’m assuming there are cases, brain imagery, for example, where AI works through it and catches something the human eye has missed. Yes. But then there are these other categories of cases where it assigns a probability of a risk of an illness in the absence of any clear, identifiable causation, based on, at least, visual signals. Is there any truth in that?

Yes, it’s interesting you brought up this point, because this is one of the problems that late-stage companies have: how to disentangle correlation and causation, and how to make sure that when AI models are making decisions, they’re making the right decisions on the right factors, and have the right kind of explainable concepts that you would expect from a human domain expert’s point of view. So, one of the tools that we have in our active learning toolkit is a way of breaking down model performance such that you can understand it better with respect to the data domain. And if you find that the model is making these kinds of predictions within a specific set of conditions in the data, conditions which should not actually be causally linked with the actual illness or inference result that the model is trying to predict, then our tool will help find that, in a way that is a little bit difficult now for companies to understand. So, this is one of the problems that you have once you have a model that’s actually working: trying to really understand why the model is doing what it’s doing, and making sure that it’s making decisions based on the right factors. And that’s one of the things that we’ve seen from these sophisticated companies that we’re building towards as well.

This is what makes it perhaps the most exciting invention of our century. Even the model doesn’t know. The model of course doesn’t know; the model isn’t a conscious thing, right? It doesn’t even know what is happening. A combination of data points, let’s say hundreds of thousands of them, shows a statistically significant change, and then that’s our conclusion, kind of. So, it is actually an amazing way for us to discover reality, the fabric of health. Also, the social application areas are very exciting for me. My background is in social sciences, so I can’t wait for the day we apply these models in a meaningful way and say, hey, you know what? When we as a society do these things, it actually decreases the level of crime, or it increases the level of innovation. We don’t understand why that happens, but hey, change this, and then, okay, this happens, and then before we understand it, we can do some of these things. That was kind of my point: up until this point in history, we needed to really understand the mechanism of how something works before we were able to apply it. But now I think AI is training us as well, in a way, to think differently, right? So maybe we can apply some of these things before we understand how they work, as long as there is statistically significant data there.

Yeah, I think that’s a great point. And it’s very fascinating to see this evolution, because there are kind of two subpoints here. One is that the normal way a computer programmer would come in and try to fix or debug a system is to really decompose it within the code itself, in terms of what in the system is doing the wrong thing, and then go in and fix that particular thing. But these models are so complex and have so many layers that we can’t do that directly. So, we have to develop all these indirect methods of trying to correct and improve the model. And in a way, it’s going to develop similarly to how we try to change human behaviour, right? We can’t go in and affect neurons directly in people’s brains, but we have things like therapy. You talk to a person, you try to understand why they’re doing what they’re doing, and you have all these other interventions that, in this indirect way, are trying to move things in the right direction. So, we’re going to have a kind of AI therapy, in parallel to how we’re doing it with people now. And the second bit is understanding that the model is only as good as the data that it’s being trained on. If you give the model a lot of data that’s wrong, then it will just come to the wrong conclusions. So instead of trying to understand the model directly, or the AI application directly, put your attention on understanding the data that’s feeding it. Here there’s also a human analogy: if you give a student a textbook with a lot of factual errors and a lot of misspellings, that student will learn misinformation and won’t learn how to spell. So, let’s focus on making sure that the textbooks are correct before feeding them to the students. And I think these two processes are going to evolve with more sophistication as we get more of these models to work in production use.
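One concrete, hypothetical instance of "fixing the textbook" before training: flag duplicate examples whose human labels disagree, so they can be reviewed before the model learns from them. The scan IDs and labels below are invented for illustration; real training-data audits check many more properties than this.

```python
from collections import defaultdict

def conflicting_labels(dataset):
    """dataset: list of (input_id, label) pairs.
    Returns the ids that were labelled inconsistently by annotators."""
    seen = defaultdict(set)
    for input_id, label in dataset:
        seen[input_id].add(label)
    return sorted(i for i, labels in seen.items() if len(labels) > 1)

dataset = [
    ("scan_01", "tumour"),
    ("scan_01", "healthy"),   # same scan, contradictory label: review before training
    ("scan_02", "healthy"),
]
to_review = conflicting_labels(dataset)
```

Catching contradictions like this before training is the data-layer analogue of proofreading the textbook rather than re-educating the student afterwards.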

Well said, well said. The ChatGPT stuff made me think, okay, humans will have more of a curator role rather than a creator role in how some creative industries are heading. And I think that’s not so bad, because we always talk about this tension. Especially when you start a new craft, an art, any artisanal pursuit, there is a gap between what your hands are able to create, the quality of your output, versus your taste, your gusto, right? There’s this gap for a certain period of time, for maybe 10 years: you really don’t like what you’re creating, and you have to stay within that painful state if you want to be a great creator. And once your dexterity, your hand skills, your painting skills or your prose, matches your gusto, then you become a really good artist. Now, maybe we are cutting that time down quite significantly, and the people with that gusto, that taste, that curator mentality, can come up with amazing things. And just like that, it makes me think the data we feed into these systems is so important, so maybe a similar approach would, or could, be useful there. What do you think? A dataset curator, or maybe there already is such a role, right? How do you curate the right set of data? What metrics do you look at in complex systems? Going with the analogy: I want to train AI on social policies, but what data do I put into that system to be able to get good conclusions? What do you think?

That’s exactly what we’re doing. Those are the exact problems that we are thinking about a lot in our company, and the solutions we’re building as well. And I think this taste-skill gap, as you say, which happens with anyone that’s trying to be creative, will diminish, and the human role won’t go away; it’ll just pivot. We’re going to be the editors, the therapists, and the educators of AI, rather than the generators of the content ourselves. And part of that is really understanding: well, if you’re doing AI therapy, how should you be doing it? If you’re curating data and the textbooks that these AI students are going to be learning from, how do you do that? What metrics are you looking at? What are the best practices for making these decisions? Those are the problems that we are working on at the moment, and that’s what I’m quite excited about as well within our company.

Amazing, Eric. Final question: what is the big milestone in front of the business, and you as the founder? What’s the big goal for the coming year?

Really, it’s just continuing to grow, keeping everything on track and on pace. The set of problems that you face when you’re a two-person company is different from when you’re a 10-person company, which is different from when you’re a 40-person company.

How many people are you?

Around 40.

Okay. And by the way, are you hiring for any role? Sometimes we use this platform as a way to make a quick announcement. If you’re hiring for anyone, do let the audience know.

Yeah, I think the best way to look at all the roles we’re hiring for is going to our website. We have a careers page, and you can see all of the different roles. We’re always looking for strong people: strong full-stack engineers, strong ML engineers. We’re hiring quite a bit in the go-to-market function. One of our goals is to really systematize our commercial function and get the engine roaring beyond just my co-founder and me being involved in it. Having that kind of growth engine is something which every company and startup aspires to, and which we are making very good progress on right now.

Well, Eric, what an insightful conversation this was. I feel like I’ve upgraded my understanding of AI by at least a meaningful 50 percent. So, thank you for taking the time and joining our chat. It was lovely to have this conversation with you.

Yes, and thank you for having me as well. I had a lot of fun.
