
Making Email Wonderful, with Superhuman’s Lorilyn McCue

Ep 32

Sep 18, 2024 • 46 min
ABOUT THIS EPISODE
Lorilyn McCue, head of AI product at Superhuman, joins to share what she’s learned building LLM-powered products. She talks about how Superhuman executes on shipping delightful software, how to pursue both attention to detail and rapid iteration, optimizing your roadmap for learning, questions that are surprisingly easy (and hard) for LLMs, evaluation and testing of AI-powered features, and ruthlessly prioritizing the user.
TRANSCRIPT

Allen Pike: Welcome to It Shipped That Way, where we talk to product leaders about the lessons they’ve learned helping build great products and teams. I’m Allen Pike, and today’s interview is with Lorilyn McCue. Lorilyn heads up AI product work at Superhuman, works as a coach for tech executives and founders, piloted Apache helicopters, and previously worked in product and growth positions at places like Impira, Google, and Slack. Welcome, Lorilyn.

Lorilyn McCue: Thank you so much for having me.

Allen Pike: I’m really excited to have this conversation. Even just from that micro LinkedIn resume summary, there are so many threads to pull on.

Lorilyn McCue: Yeah, I have a very nontraditional background, for sure.

Allen Pike: Yeah. That tends to, in my experience, be correlated with more interesting conversations and episodes, so-

Lorilyn McCue: Yeah, good.

Allen Pike: … I’m excited by that. I am excited to talk about Superhuman. I’m a fan. I’ve been a user of it. For folks that aren’t familiar, if you haven’t used Superhuman, it’s the fastest email client. It makes email sane for me, and so it’s something that I’m excited to talk to someone who’s actually working on it and building this product. Before we get into the now, though, I wanted to give you some space and some chance and an opportunity to learn a bit about this fun path that you’ve taken here, the nontraditional, and then maybe slightly more traditional parts of it. How do you like to tell the Lorilyn story so far?

Lorilyn McCue: Yeah, so it started as most tech journeys started, which is I flew Apache helicopters in the Army.

Allen Pike: Yeah, like we all did.

Lorilyn McCue: As one does. Yeah, so I went to West Point for undergrad. I studied computer science, and then once I saw a helicopter, I just absolutely fell in love. Went to flight school and I flew Apache helicopters in Korea and in Iraq and had an awesome time.

Allen Pike: Nice.

Lorilyn McCue: I also did a little stint teaching back at West Point, and then there’s the other very traditional background for tech product managers: I taught physical education at West Point.

Allen Pike: Oh yeah, so definitely I would have guessed that from like-

Lorilyn McCue: Totally.

Allen Pike: … a leader in technology taught at West Point. What would they have taught? I would think physical education probably.

Lorilyn McCue: Physical, obviously.

Allen Pike: Yeah.

Lorilyn McCue: Had a blast doing that, and then after my 10 years in the Army, I decided I should probably figure out what to do when I grow up, so I went to business school. I honestly had no idea what jobs were out there. I don’t think I even knew what consulting was when I started business school, so I spent a good two years trying to see what was there, and I kind of narrowed in on product management, with the leadership Army background and my computer science technical background and then the business school business stuff. It seemed like product management was kind of in the middle. I had a good friend, Jamal Eason, who was working at Google, and he urged me to apply for the Google product management internship position, so that’s what I did between my first and second year. Loved product management, had an absolutely fantastic time there. Then, I decided to see if I could join Slack. It was a little bit smaller company. It was moving really fast. I had met Merci, a previous guest on your show, at a dinner and just really loved the company, and so I ended up joining Slack. That was my transition into tech, a very natural one.

Allen Pike: What was the stage that Slack was at when you joined?

Lorilyn McCue: The curve was already bending pretty steeply. It was, what, maybe a few hundred employees-

Allen Pike: Right.

Lorilyn McCue: … and it wasn’t like a sure bet, but it was really a hot startup. It was pretty exciting to get to work at Slack for sure.

Allen Pike: Yeah, and Merci told a little bit of the story of getting together because you worked with her in growth for a while there, as I understand, and like-

Lorilyn McCue: Yes.

Allen Pike: … figuring out at that time because that was some years ago now, like, “What does it mean to be in growth? What are the various tools and approaches and things?” Obviously, Slack was growing pretty well, but you don’t get to that level of growth without intentionally actually building for it.

Lorilyn McCue: Yeah, and then after Slack, I did two years on the growth team, and then I did two years on the platform team. Really got into the ability of a technology to scale when you can link it to other products, and really enjoyed that portion as well.

Allen Pike: Yeah, and then, as I understand it, after Slack, you ended up at an AI-centric company just before it was cool.

Lorilyn McCue: Yeah, yeah. Well, I thought it was cool. I mean-

Allen Pike: Yeah. I mean, AI has always been cool to us-

Lorilyn McCue: Right, so cool.

Allen Pike: … but the median person out in the world as not necessarily talking a lot about AI in 2021 the way that-

Lorilyn McCue: Totally.

Allen Pike: … they are today.

Lorilyn McCue: My mom had no idea at all what was happening. Yes.

Allen Pike: Yes.

Lorilyn McCue: That’s true. Yes, I worked at Impira and absolutely loved it. It was a very small startup. It was a small team. We were fast. We had a blast. I learned a ton about AI, and that company was eventually acquired by Figma. I took a little break to get my coaching certification; I had been using my professional development funds to do coaching classes. It’s a fun side project, and so I took a little bit of time to do that. That’s the coaching portion of my bio.

Allen Pike: Yeah.

Lorilyn McCue: Yeah, and then got linked up with Superhuman and they said, “AI, it’s Superhuman,” I said, “Well, that sounds really fun.”

Allen Pike: Yeah.

Lorilyn McCue: I was really excited about using all of this. I’d seen like the very complex side of AI. We were using AI to turn a PDF into a spreadsheet at Impira, which sounds easy, but it’s very complicated. I was really interested and like, “Okay, what if we took a little bit easier use case of just email? How could AI make email better?” That was really, really exciting to me, something so practical and useful. I saw how much technology could make people’s working lives better, and every part of it, just the way we communicate, can bring so much joy to our lives if we make it better. It was really exciting to have a chance to tackle it in this really productive way.

Allen Pike: Email is something that almost every person with any sort of office knowledge work job has to interact with, and stereotypically isn’t something people are super keen on. If you say, “What are the most productive, delightful minutes of your day?” most people aren’t going to say, “Well, it’s the time I spend deleting sales emails I didn’t want,” or whatever it is. It’s probably, “Yeah, I’m finally done with my email.” That’s been true even prior to AI and all the ways it can help any tool that’s dealing with text. Superhuman has been a really great force for good in that. My emotional relationship with email has improved a lot-

Lorilyn McCue: Oh, I like that.

Allen Pike: … over the years, so I appreciate the work that Superhuman has done so far. I’d be curious to learn because the stereotypical B2B software product thinking is like, “Let’s check as many check boxes so we can get the biggest enterprise sales.” Then, I guess user experience matters and most B2B software companies will say that user experience matters, but there is not very much, almost no B2B software can I remember having used that the attention to user experience and speed and how do people feel when they’re using the product actually comes through when you use Superhuman. Every single thing has a keyboard shortcut and things animate within less then a 10th of a second, like milliseconds often. That’s something that most software that’s written for users doesn’t even come close to. I’d be curious, what have you seen and learned at Superhuman that do you think contributes to that? Obviously, it’s easy to say, “We care about that stuff,” but how is that actually flowed through into making sure it actually is that way?

Lorilyn McCue: Yeah, I think it definitely comes from the top. Rahul is incredibly detail-oriented as a CEO and founder. He will come and say, “Hey, this is a pixel off.”

Allen Pike: Hmm. Hmm.

Lorilyn McCue: When your CEO comes and tells you something’s a pixel off, you start checking to see if things are on baseline, if they’re aligned, if they’re pixel perfect. That level of quality is deeply embedded into every part of the organization, and when it’s a small startup, everyone has to take ownership of that. Just a couple of days ago for a feature we’re about to ship, Rahul noticed like, “Hey, there’s a hanging word in this tool tip.” We’re like, “Oh-

Allen Pike: Right.

Lorilyn McCue: … “man.” That level of quality, when your CEO tells you about that, then you’re like, “Well, I better freaking notice that.” You know?

Allen Pike: It’s interesting you say that, and I am biased to want to hear that, because I’m the sort of CEO that goes in and mentions things about tooltips. But there’s a long-standing tension here, and I don’t want to get all into Paul Graham and Founder Mode necessarily, but there is a bit of a thread. The standard advice about how to build a product org tends to include things like, “Okay, we’ll have a team, then delegate to them and focus on the things that you can uniquely do.” Is Rahul uniquely capable of noticing hanging words in tooltips, out of anyone? Is he literally the best person at noticing that? Probably not, but where the leadership’s attention goes indicates prioritization and sets that culture. I’m curious if you’ve observed anything about building an org that’s able to thrive in an environment where somebody is coming in and having opinions about pixels and stuff, rather than that just derailing everything. Are there any hiring or leadership or strategic mindsets or habits that help align an org so it jibes with that, rather than creating friction?

Lorilyn McCue: I don’t know if we’ve completely figured it out. I don’t know if anyone has figured it out. Please tell me all your ways if you have. I think Superhuman is in this really interesting state right now where we have a passionate, passionate user base. I thought people loved Slack; people love Superhuman. A passionate user base of people that love the attention to detail, love the craftsmanship. That’s where we were coming from. At the same time, the environment is very competitive right now. People are doing really, really cool things with AI and email, so how do we move fast, stay relevant, provide the same, hopefully more, value than the competitors, but also keep that same level of craft? Everything that you build, well, a new model is released and it’s, “Okay, let’s go back. Let’s change all the prompts. We’ve got to update this. We want to give everybody the best and greatest stuff.” I think there’s this really interesting moment of tension between what has made us successful and what’s going to make us successful going forward. What I think is important is to not lose that attention to detail, but still be able to move fast. One thing that I’ve noticed Rahul does pretty well is a spot check. He’ll be like, “Hey, this is off,” or, “This is off,” but he’ll often give us quite a bit of autonomy; the product managers have quite a bit of autonomy during the majority of the process. We’ll obviously have product reviews where we’ll look at the designs, look at the spec, and talk through some of this stuff, but there are key moments where he’ll come in and be like, “Hey, there’s this.” I think those little spot checks help keep us moving fast, but also keep that same level of craft that has made Superhuman so successful.

Allen Pike: People refer to it as like the head chef testing the soup.

Lorilyn McCue: Yeah, yeah. Yeah, he’s a good tester.

Allen Pike: They don’t need to micromanage how the carrots were chopped, but does the soup still taste good? It’s interesting you say it, and this is the kind of thing that’s come up in conversations with a few other product leaders I’ve talked to over the last two to six months: this tension between those of us who have gotten really good at delivering thoughtful, detail-oriented experiences and making sure things are excellent, and the teams that are doing amazing things with AI and moving fast and breaking things and learning faster. I worked at Apple back in the day. I have worked on lots of products with this real focus on detail and polish. Squaring that, I’m trying to grapple with: what are the things we learned over the last 10 years about doing things really nice and polished that were a response to the environment we’d been in, a local maximum where the technology we were working with was relatively well understood? Cloud apps and browser apps are relatively well understood, so a competitive edge would be to do that, but do it really, really, really well. Whereas if we had tried to do that in, like, 1992, when computers were still just being figured out, then meanwhile they’re four versions ahead. We were making a 286 thing, and then the 486 is out, and our product sucks because we took too long polishing it. How much of it is that we maybe learned skills that slightly overoptimized for the speed of the underlying platform? And how much of it is that we’ve learned durably useful skills and mindsets, and we just need to, like you say, find new ways of applying those to this underlying fast-moving world, making sure we’re polishing the things that matter and are durable, and not polishing the things that are completely outdated when GPT-4.5 comes out in, like, eight weeks or whatever?

Lorilyn McCue: The principle that I’ve been approaching our team’s roadmap with is optimize for learning; optimize our roadmap for learning. A good example of this is we recently launched our Ask AI feature, which is my absolute favorite AI feature inside of Superhuman. You can ask a question like, “What is Allen’s phone number?” Instead of having to find the email from you, it’s just: there’s the phone number. Oh, here’s the source. You can click on it and just double-check the answer if you need to. Or, I love this, each of the product managers sends a little weekly update, “This is what’s going on.” You can say, “Summarize all the weekly updates.” There you go. You get your summary right there. It’s both needle-in-a-haystack answers and summarizing over a handful of emails, which is super exciting. When we were first talking about this, we were like, “We’re building an agent,” and when you think about building an agent, you’re like, “It’s doing stuff. It’s going to create events. It’s going to draft emails. It’s going to automatically put things into splits.” We just got really pumped about this awesome agent that we were going to start building.

Allen Pike: It’s going to do everything.

Lorilyn McCue: It’s going to do everything. At some point, we were like, “We want to ship this very fast,” and what we know we can do well is answer questions inside of your inbox. That’s what we can uniquely do well. We can find the answers inside of your inbox and give them to you. We have prototyped this. This works very well. Let’s start there, release this right away, and then let’s see. We’ll tell people what’s possible, but then let’s see. People are always going to push the boundaries. They’re going to be like, “Okay, well, cool. It answered what’s Allen’s phone number. Now, send Allen an email about this.” Then, people would tell us, “Hey, I really wanted this to be able to do this. I wanted it to be able to create an event. I wanted it to be able to find times I was available and put that in the email.” What we were able to do is figure out, “Okay, this is what people are actually asking for. This is what they really, really want. This has the most requests. Great.” We handled the base case, which does provide a ton of value. Now, we know what to prioritize next. For example, what we found is that people wanted more sophisticated, almost research-based questions: “Look at all of the feedback from Ask AI. Group it into 10 different categories and give me an example quote from each.” Now, that’s going to take longer. You’re not retrieving a couple handfuls of emails. You’re going to have to look at hundreds of emails to answer that question.

Allen Pike: You probably need to make an intermediate spreadsheet and then be able to like maybe sort it and stuff like that. You’re kind of a little analyst now.

Lorilyn McCue: Yeah, it is like a little analyst, and what’s cool about that is, yeah, I probably saved you a couple of minutes by answering the phone number question or finding when your flight is or giving you your confirmation number, but I probably will save you half an hour to an hour of work by providing this deeper research mode. For us, we launched a feature that was nowhere near where we wanted it to be, like nowhere near, but we did know that it provides some amount of value. After that launch, we were able to learn what users really wanted more of, or wanted next, and then we could prioritize the follow-on actions. What’s interesting about that is that by launching sooner, sometimes the technology completely changes, you know?

Allen Pike: Mm-hmm.

Lorilyn McCue: There was technology involved that had only been available for a few months before our launch. We wouldn’t have been able to do it any earlier, really. There’s a real benefit to doing what you can now, learning what people want, and then tackling that. Then maybe the technology has given you a head start by the time you get to it.

Allen Pike: Yeah, it’s interesting. How long ago did you launch that Ask AI feature?

Lorilyn McCue: We announced it in May, then had people in beta, moving people through the beta in June and July, and then we just GA’d in August.

Allen Pike: Right, so what have been some of the big surprises? I assume, like everybody I talk to who launches a feature like this, you found it had some surprises, about user behavior or whatever. You touched on one, which is people asking it to do very complicated analysis. Is that the biggest surprise? Or are there other kinds of things people ask it to do that you were surprised by, delighted by, or horrified by?

Lorilyn McCue: I was super delighted by… It’s helpful to say, “What’s Allen’s phone number?”, but honestly, I could just use regular search for that and eventually I’d find your email. But sometimes you literally don’t remember the name of the person. You don’t remember what’s in the subject line, you don’t remember the date. People would ask things like, “Who is that guy from Belgium who called me Juan or whatever?”, and it would find it. To me, how would you possibly find that otherwise? The semantic possibilities were very, very exciting to us.

Allen Pike: Yes.

Lorilyn McCue: That was one thing we found through user research at the start is like, “Hey, sometimes I don’t even remember what I would search for. I don’t know what I should search for to find this.” Not only is it saving time, but sometimes it’s making something possible that wasn’t even possible beforehand. I guess we kind of knew it was coming, but I think I was tickled pink by how helpful it was.

Allen Pike: Yeah. On the flip side of that, have there been any cases or questions where, going into it, you thought, “Oh, this would be a pretty straightforward thing,” and then it turned out to explode in complexity when you tried to solve a certain case well?

Lorilyn McCue: One example that we definitely really wanted it to do well at because we all have had this problem is like, “Hey, when is my next flight?”

Allen Pike: Hmm. Hmm.

Lorilyn McCue: It does really, really well sometimes. “When is my next flight to Austin?” There are so many emails that talk about flights.

Allen Pike: Right.

Lorilyn McCue: We were almost like, “Should we develop a dedicated flight tool with its own prompt? This person has said the word ‘flight’ or ‘plane’ or something, so now let’s get better at it.” It’s so funny. People will give us positive and negative feedback about this particular question in almost equal measure. Some people are like, “This is great. It found it immediately.” Others are like, “It gave me a flight from a month ago,” and I’m like, “Oh gosh, I’m sorry.” That one use case, which we really wanted to nail, has been particularly challenging. I’m sorry if you’ve experienced that. We are working on it, okay?

Allen Pike: Well, you hit on something that product teams commonly struggle with with LLMs, and there’s a whole bunch of approaches, but it just doesn’t come for free out of the box. Maybe eventually you can imagine the LLMs themselves improving at this, but a lot of it ends up being product team work: when you’re asking about things that aren’t semantically in the content. You’re not asking, “Show me the emails about the flights in July.” You’re asking about something that’s relative, and the text of the email doesn’t say it. Well, maybe every single email says, “Your next flight is coming tomorrow,” but only the most recent one is actually your flight and not someone else’s. The question carries a bunch of implications about stuff that isn’t in the content. You end up needing to prep metadata, surface that metadata to the LLM, and do a whole bunch of stuff behind the scenes to answer this thing that a human being thinks is totally instinctual, because we bring a bunch of extra context in our brains when we go look at stuff.
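
To make that concrete, here’s a minimal sketch in Python of the kind of behind-the-scenes metadata prep being described. The data shapes, helper names, and prompt wording are hypothetical, not Superhuman’s actual pipeline:

```python
from datetime import datetime, timezone

# Hypothetical retrieved emails; a real pipeline would pull these from search.
retrieved_emails = [
    {"sender": "airline@example.com",
     "received": datetime(2024, 9, 10, tzinfo=timezone.utc),
     "snippet": "Your next flight is coming up tomorrow: AUS, Sep 11."},
    {"sender": "airline@example.com",
     "received": datetime(2024, 9, 16, tzinfo=timezone.utc),
     "snippet": "Your next flight is coming up tomorrow: AUS, Sep 17."},
]

def build_context(emails, today=None):
    """Prepend explicit dates and recency to each email, since the model
    can't infer 'next' or 'most recent' from the email bodies alone."""
    today = today or datetime.now(timezone.utc)
    lines = [f"Today's date: {today:%Y-%m-%d}"]
    for e in sorted(emails, key=lambda e: e["received"], reverse=True):
        age = (today - e["received"]).days
        lines.append(f"- Received {e['received']:%Y-%m-%d} ({age} days ago) "
                     f"from {e['sender']}: {e['snippet']}")
    return "\n".join(lines)

# The question is relative, so the answer depends on metadata
# the email text itself never states:
prompt = (build_context(retrieved_emails)
          + "\n\nUsing only the emails above, answer: when is my next flight? "
            "Ignore flights that have already departed.")
print(prompt)
```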

Lorilyn McCue: This is the bane of an email AI PM’s existence: you look at your inbox and you inherently know, “That’s junk. This is relevant. This is important. Look at this now.” People, obviously, want an AI-prioritized inbox: “I want you to tell me what’s important.”

Allen Pike: Yes.

Lorilyn McCue: I played with a lot of products that do this and it’s like, “You have 24 hours before the West Elm discount period ends. This is very urgent.” You’re like, “No, it’s not. It’s not urgent.” Like-

Allen Pike: I urgently have a billion dollars to send you from Nigeria. You must agree now otherwise you’re going to lose this opportunity.

Lorilyn McCue: “Wire this money immediately.” AI is just not good at that yet. It’s not good at saying like, “This is important, this is just not important yet.” We’ve been thinking a lot about this because it’s one of the next things that we’re thinking about is how to make your inbox… Should it just be a chronological list? Should it be something a little bit smarter than that? Ideally, the most important stuff is right there for you, but do we decide what’s important? Do we categorize your email and then let you decide like, “This is a category, oh, by the way, this is an important category?” It’s really hard to… AI, these LLMs are not as powerful as our brains at figuring out this kind of stuff and how to use this very powerful technology to give you what you want in an actually useful way is that’s really tough.

Allen Pike: Yeah. One of the things that LLMs don’t have is the ability to learn on their own. We have GPT-4o and it does what it does, and as product developers, we can add extra stuff. We can give it more context to learn on its behalf, but it’s not actually learning. Even if it prioritized something and you tell it, “Oh, that was a bad priority,” which I sometimes still have the instinct to do. I’ve mostly learned my way out of it, but sometimes I want to say, “Claude, come on. No, that was wrong.” It’s like-

Lorilyn McCue: No, be better.

Allen Pike: … I’m sorry, but it’s not going to next time. It’s like, “Now, maybe with GPT 4 or in ChatGPT, at least it’ll be like, “Oh, it can add to the memory,” but yeah, that’s definitely a huge challenge. Digging a little bit more in, and a lot of the folks in our audience are tactical people that are either founders or product leaders that are building this stuff. You’re in this world where you have these features and they’re getting huge value for a certain percentage of the people who use them, and then a certain percentage of the people are grumpy because it’s imperfect, which is the state of most AI-powered products. Then it’s just the percentages are different for each product and each feature. This really comes out of the fact that they’re non-deterministic functionalities that have subjective quality. It’s generally just like, “Okay, well, let’s write a test to make sure that it always does this if that, therefore, it’s always correct because everybody’s data is different and everybody’s idea of good is different.” My question is, what approaches have you taken as a product team? Obviously, this is partially an engineering thing, but from your product perspective, what have you been learning to do to support the process of getting those percentages up and getting visibility into like, “Are the changes we make or we’re making increasing that percentage of people that are happy? Are they having a positive impact and not maybe making some other thing worse by accident, the fact that it’s all kind of non-deterministic and subjective?”

Lorilyn McCue: Yeah, so I mean, there’s a couple levels to this. One level is I literally look at the positive-to-negative feedback ratio. I’m like, “The green line is going up, the red line is going down. That seems like a good sign.” That’s one. To get more exciting than that, when we internally give feedback, we’ll add those examples to a data set. We’ll then use that new data set to say, “Okay, we have a new prompt now. Let’s see if the new prompt regresses on the past examples, and let’s see if it better handles these new examples that we’re trying to improve on.” Right now, it’s more of a manual process, but we want to make it a little more automated, so that if I thumbs-down an answer and say, “This is wrong,” it automatically goes into a data set. Then we’re like, “Okay, let’s try a new prompt. Let’s run it. Let’s see. Okay, got it. I got the retrievals right this time. Fantastic.”
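
As a rough illustration of that loop, here’s a minimal sketch of a prompt regression check. The data set format and function names are hypothetical stand-ins, not Superhuman’s tooling:

```python
from typing import Callable

# Each thumbs-down becomes a labeled example: the question asked, plus an
# answer (or thread ID) a human confirmed was correct. Format is invented.
dataset = [
    {"question": "What is Allen's phone number?", "expected": "555-0100"},
    {"question": "When is my next flight to Austin?", "expected": "Sep 24"},
]

def evaluate(ask: Callable[[str], str]) -> float:
    """Score a candidate prompt (wrapped in `ask`) against past failures,
    so a "better" prompt can't silently regress on already-solved cases."""
    passed = sum(1 for ex in dataset if ex["expected"] in ask(ex["question"]))
    return passed / len(dataset)

# `ask` would call the LLM with the candidate prompt; faked here for the demo.
def ask_with_new_prompt(question: str) -> str:
    return "Allen's number is 555-0100." if "phone" in question else "Unsure."

print(f"New prompt passes {evaluate(ask_with_new_prompt):.0%} of the data set.")
# Ship only if the new prompt scores at least as well as the current one.
```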

Allen Pike: Right, and so is that something where you have yourself and other product leaders on the team feed that tooling or those processes with some sort of meta-evaluation? One of the questions that comes to my mind is, of course, you have a thumbs down and you can say, “Okay, this was a bad generation,” so you could automatically test that you have a better, or at least a different, generation. But is it better? Is there some loop, and of course I’m just creating work for you, where you can say, “Okay, Lorilyn didn’t like this answer. The new model has this answer. Does Lorilyn think it’s better?” Or does the data science team handle that? Or is it internal to engineering? Or, I don’t know, I’m kind of poking for more.

Lorilyn McCue: Yeah, so what we do in the moment, and I don’t expect this from the company at large, but from the AI team members: let’s say you asked a question like, “What’s Allen’s phone number?”, and I didn’t get the right email. Well, then I find the email, copy the thread ID, hit the thumbs-down button, and paste in the thread ID, so we have literally the right answer, and then type in the right answer like-

Allen Pike: Right. Yes.

Lorilyn McCue: … “His number is this.” We do that just to build the data set in the moment. We also have a very amazing QA engineer. Her name is Nico and she cleans this up so hard, so she’ll take a data set and then it’s perfect after she gets her hands on it.

Allen Pike: Nice.

Lorilyn McCue: A lot of times it’s Nico doing some work to help out. An AI QA engineer is such an amazing role to me now. She’s helped out with fine-tuning, she’s helped out with creating data sets for fine-tuning. She’s automated ways to get examples into the data set. I mean, yeah, it’s a really interesting… What even is an AI QA engineer right now?

Allen Pike: I mean, I love that you bring that up, because one of the things that’s been coming up in talking to all these other teams building AI-powered products is that, literally, to your point, the roles on the team change, right?

Lorilyn McCue: Yeah.

Allen Pike: Everyone is asking themselves, and if they’re not asking themselves, they probably have their head in the sand. Everyone’s asking, “What is the right balance? I used to think of these triads where it’s like, okay, there’s a designer, a developer, a product manager, whatever. Now it’s like, okay, where’s the data scientist? Do we have someone with the title data scientist or whatever? Data engineer? AI QA engineer? Are the AI engineers and the product engineers two separate people? Or is everyone an AI engineer?” Some teams-

Lorilyn McCue: Yeah.

Allen Pike: … say that and they try it and it works. Other teams say like, “Three-quarters of our engineers are totally weirded out by nondeterministic systems and we have to kind of shield them from the chaos of AI. We have AI engineers and non-AI engineers, at least as of September 2024, and it continues to evolve.” I guess maybe that brings a question. Are there any other team competition things that you’ve learned and have been helpful in this effort to move quickly and ship stuff on this new platforms?

Lorilyn McCue: Yeah, I think one thing that my engineering manager, Sachin, has spent a lot of time doing is making it so that nontechnical people like myself can do the things that don’t require technical expertise, like edit a prompt and make it so that change goes into code. We are using mustache templates for our prompts now, so that they’re so easy even a PM can do it. That makes it so that I can iterate on this stuff fast. He is working on building tooling to uplevel me and let the engineers do more engineering stuff.
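
For illustration, here’s roughly what mustache-style prompt templates buy you: the prompt lives as editable text with {{placeholders}}, and code only fills in the values. This sketch hand-rolls the substitution rather than using a real mustache library, and the template itself is invented:

```python
import re

# The prompt is plain text a PM can edit without touching application code;
# {{placeholders}} get filled in at request time.
SUMMARIZE_PROMPT = """\
You are an email assistant for {{user_name}}.
Summarize the thread below in one sentence, in a {{tone}} tone.

Thread:
{{thread_text}}
"""

def render(template: str, values: dict) -> str:
    """Minimal mustache-style substitution: replace {{key}} with values[key]."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(values[m.group(1)]), template)

print(render(SUMMARIZE_PROMPT, {
    "user_name": "Lorilyn",
    "tone": "friendly, concise",
    "thread_text": "Allen: Any chance we can move our call to 3pm? ...",
}))
```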

Allen Pike: Yeah, that’s a super… I’m glad you brought that up because it hasn’t come up on the show yet, I think, but it’s something I’ve heard a lot of teams getting a lot of mileage out of. That’s one of the wonderful things about LLMs and prompts: folks that really understand the product can sometimes have a more direct positive impact on the product without having to go, “Okay, file a ticket.” An engineer will write some-

Lorilyn McCue: Oh my gosh.

Allen Pike: … you know-

Lorilyn McCue: Totally.

Allen Pike: … Java code or whatever.

Lorilyn McCue: Yeah. Before this, it would be like: Lorilyn working late at night, trying a bunch of different prompts. We used this tool called Braintrust to do prompt iterations with a big data set. I’d be like, “Okay, I got a good prompt now. Hey, here is the new prompt.” Then our engineer’s like, “Okay, let me put this new prompt in.” Then I’d have to wait for us to run the evals, and then, like half a week later, we’d finally get to test the new prompt in staging. It’s like, “Oh my gosh, this should be faster,” and so-

Allen Pike: Yes.

Lorilyn McCue: … getting faster at that so we can… We have to be able to iterate really fast. That’s been really helpful.

Allen Pike: Yeah, I imagine, or if you don’t have it yet, it’s probably coming. One thing a lot of teams end up building is a place where a product leader can edit a prompt and then turn it on in the product just for themselves so they can play with it. I don’t know if you have that yet, but that’s one of the things people love having.

Lorilyn McCue: I’m about to.

Allen Pike: Yeah, excellent.

Lorilyn McCue: Sachin’s going to be like, “Darn it, come on, Allen.”

Allen Pike: I’m creating work for your team.

Lorilyn McCue: Yeah.

Allen Pike: Cool. Well, that’s fine. I could spend all day asking about that, but not everyone in our audience is in AI product engineering, so I’ll zoom out a little bit in terms of some of the stuff that you’ve been building. One of the questions I wanted to ask, we touched on it because we talked about summarization, which is one of the most common things: people see LLMs and go, “Of course, I want to be able to summarize this stuff.” One of the things we haven’t talked much about yet is composition, the ability to create; those are often the two high-level email AI features people first think of. Then, there’s this Twitter joke: “Okay, well, if everyone’s just summarizing long emails into short snippets and typing short snippets to generate long emails, can we just send each other the short snippets?”

Lorilyn McCue: Yes.

Allen Pike: I was curious. I’m sure this has come across your desk.

Lorilyn McCue: For sure. Yeah, we laugh about this constantly. We laugh, cry about this all the time.

Allen Pike: Yeah, so do you have any thoughts yet about ways that we, as a society, or as email users, or as product vendors creating stuff for email, can nudge ourselves a bit more into a future where we’re sending the short, meaningful snippets, and a little bit away from a world where 5,000-word generated emails that we only read the summaries of are going in every direction?

Lorilyn McCue: One thing that we’re doing internally is we currently have a voice and tone profile for you: this is your voice and tone. But one thing that we’ve been thinking about quite a bit is that my voice and tone to Rahul, my CEO, is quite different from my voice and tone to my engineering manager or my mom or my friend. We think we can get a little bit more sophisticated, with a you-plus-X-person voice and tone profile. Ideally, that means that even though my tone is maybe professional and friendly, and there are all these other factors that go into what my tone is, that’s just the general theme of it. Okay, I’m going to get long emails all the time when I use AI Compose, but if we can get a little bit more specific about how you particularly write, I think people will start being like, “Hey, okay, I can talk to this person using AI in the same way I’ve talked to them before.” I don’t need to always write with this very professional tone. I still think long professional emails are going to have a role. I think that’s just the nature of professional business courtesy, but ideally, once we get through those initial stages and we have a relationship, we can say, “Okay, this is how I actually write to this person. Now, let’s cut this a little bit shorter.” I think we can hopefully embrace the AI; the AI can get better at saying, “Okay, you talk to this person in a shorter manner,” and then maybe the long summary isn’t quite as necessary. I also personally always use Improve My Writing, which I do think tends to make it a little bit shorter and a little bit more concise. I wish there was a “Modify My Writing” that cut it in half.

Allen Pike: Yeah.

Lorilyn McCue: We do have a shorten command, but if I could just combine those, modify and shorten, I think that would be great.
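
Here’s a minimal sketch of that “you plus X” idea, purely illustrative: a tone profile keyed by (sender, recipient) that shapes the compose prompt. The fields, addresses, and function names are made up:

```python
from dataclasses import dataclass

@dataclass
class ToneProfile:
    formality: str       # e.g. "casual" or "professional"
    typical_length: str  # e.g. "2-3 sentences"
    habits: list         # observed quirks, e.g. ["leads with the ask"]

# Keyed by (sender, recipient): your tone to your CEO differs from your
# tone to your mom, so a single global profile isn't enough.
TONE_PROFILES = {
    ("lorilyn@example.com", "rahul@example.com"):
        ToneProfile("professional", "2-3 sentences", ["leads with the ask"]),
}

def compose_prompt(sender: str, recipient: str, intent: str) -> str:
    profile = TONE_PROFILES.get((sender, recipient))
    if profile:
        style = (f"Write in a {profile.formality} tone, about "
                 f"{profile.typical_length}, matching these habits: "
                 f"{', '.join(profile.habits)}.")
    else:
        style = "Write in the sender's general voice and tone."
    return f"Draft an email from {sender} to {recipient}. {style}\nIntent: {intent}"

print(compose_prompt("lorilyn@example.com", "rahul@example.com",
                     "Ask to move the product review to Thursday"))
```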

Allen Pike: It feels intuitive to me that right now, as an industry, we’re using the current generation of LLMs to try to help people save time, doing the things they might already do, trying to simulate roughly what the person would do on their own. My guess is that over the next 12 months, we’ll start to get better as product teams at helping people not just do the thing they would already do, which, to your point, there’s still more work to get them there, but to actually get them somewhere better, because we’re now learning as experts. If you’re writing AI to compose emails, to do that really well, you have to become an expert on what it means to write a great email, which, two years ago, Superhuman didn’t really need to know a lot about. You just facilitated whatever emails people were writing. But you can imagine that in 12 months, the combination of the models getting better and your prompts actually helps them. If shorter emails are better, which I personally feel is the case, then the prompts can encourage the direction of writing shorter. Even if you tend to write long emails to Rahul, which I sort of doubt, because most CEOs don’t tend to want long emails, but say you tend to write long emails to someone: it might write maybe two-thirds as long as that, to nudge people in a way that we know is good.

Lorilyn McCue: Yeah, I think we’re seeing this a little bit with some of the email sales coaches, where they have a perspective on, “Okay, you should write sentences of no more than this many words. The grammar should be at this grade level. The whole email should be no more than this many words.” It actually will grade your outbound email on this rubric, and it’s really interesting to think about. Is there a world where we say, “Hey, you’re busting this email. This is way too long. You could shorten this. You can get rid of this stuff. Would you like us to make those changes for you? Cool, we got it”? I think that’s a really cool idea.
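
In the spirit of those rubrics, here’s a toy grader; the thresholds and function name are invented, and a real coaching tool would be far more nuanced:

```python
import re

def grade_email(body: str, max_words: int = 120, max_sentence_words: int = 20):
    """Flag emails and sentences that run long. Thresholds are invented."""
    issues = []
    total = len(body.split())
    if total > max_words:
        issues.append(f"Whole email is {total} words; aim for under {max_words}.")
    for sentence in re.split(r"[.!?]+", body):
        n = len(sentence.split())
        if n > max_sentence_words:
            issues.append(f'{n}-word sentence: "{sentence.strip()[:40]}..."')
    return issues or ["Looks good: concise and readable."]

print(grade_email("Hey Allen. " + "This sentence just goes on and on " * 10))
```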

Allen Pike: Well, and that’s one of the things I’m really enjoying in the initial explorations of people applying LLMs to products where they’re doing something where it’s more of like a little angel on your shoulder that can help you rather than like, “Okay, it’s just going to do it all for you.” It’s like-

Lorilyn McCue: Yes, totally.

Allen Pike: … it’s superpowering you in this way where the little angel, because it’s been prompted well, model understands writing whatever, it’s encouraging you to learn to do the thing, to shorten it, to make it more succinct. To actually back up your argument, I saw a nice demo a few months ago now where it was like, “If you made an unsupported argument in your writing, it would be like little orange underline or purple underline or whatever. It’s just like, “You haven’t actually supported this.” This is meant for research, so it’s like if you’re writing a research argument, you should actually cite something that supports this claim, which I found joyful and a little bit of a level above other than being, “You have a typo.”

Lorilyn McCue: We’re absolutely going to turn on email grading for your emails. You’re just going to get B’s and C’s and D’s on your emails and be like, “You didn’t support that. Work on it.”

Allen Pike: Yeah, come on. I personally would enjoy that. I imagine not all users want that on by default, but I would turn it on for sure. We’ve gone through some specific features. We’ve gone through some really tactical approaches. At the high level, is there any big lesson or thing you’ve taken from now being a year fully immersed in building this stuff that would be useful to send to yourself a year ago, or to other product leaders and folks that are now really digging into building their own LLM-powered functionality or AI product work in general?

Lorilyn McCue: Yeah, I would probably tell myself two things. One is the optimize-for-learning piece, but I would also embed AI into the product as much as possible. First of all, build it in a way that lets you figure out the weird LLM edge cases, like, “Okay, got it. We understand this. We can play with it. It’s really lightweight.” But also go through the rigorous design process. For example, we have the auto-summarize feature, which puts the one-line summary at the top of your email. You don’t have to ask for that. It’s just there. The summary is there. If you want more details, you can click on it and get them. It’s the first AI feature in Superhuman that you didn’t have to remember to use-

Allen Pike: Right.

Lorilyn McCue: … which was super exciting. In a lot of other products, you could get an email summary, but you’d have to click a button or write “summarize.” For us, it was incredibly important that this was embedded into the product experience, seamlessly interwoven to the point that you don’t even have to think about it. I think about both of those things, like, “Okay, maybe first just make it possible.” Here’s a good example. We’re going to launch a feature where you can create an event with AI. Let’s say you send me an email saying, “Hey, let’s get coffee tomorrow at 2:00 PM.” I could say, “Create event with AI from this email.” There’s an incredible number of cool experiences that we could build to embed this into the product. There could be a little calendar event already there. You just click, “Yes, yeah, I like it. I dig it. Do it.”

Allen Pike: Yeah, yeah.

Lorilyn McCue: That’s where we want to go, but we’re probably going to start, first of all, with a Command-K command: Command-K, create event with AI, and then you can give it a try. We’ll probably release that to our beta users and let them play with it and tell us, “Hey, it really breaks in this case, and it really succeeds here,” because not every person at Superhuman uses email the way our customers do. So let’s get the learnings, and then let’s deeply embed this into the product, because it’s going to be really useful, and we don’t want people to have to do Command-K, create event with AI, anytime we can tell an event is probably going to be created.
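
Sketching the Command-K version of that flow: ask the model for a strictly structured event, validate it, and only then show the user a confirmation. The JSON schema, prompt, and helper names are hypothetical:

```python
import json

# What you'd send the model, with the email body substituted in. The
# schema and wording are invented for illustration.
EXTRACT_PROMPT = """\
From the email below, extract a calendar event as JSON with exactly these
keys: title, date (YYYY-MM-DD), start_time (HH:MM, 24-hour),
duration_minutes, attendees (list of email addresses). Use null if unknown.

Email:
{email_body}
"""

REQUIRED = {"title", "date", "start_time", "duration_minutes", "attendees"}

def parse_event(model_output: str):
    """Accept only valid, complete JSON. With a non-deterministic model,
    strict validation is what makes a 'click yes to confirm' flow safe."""
    try:
        event = json.loads(model_output)
    except json.JSONDecodeError:
        return None
    return event if isinstance(event, dict) and REQUIRED <= event.keys() else None

# Simulated model output for the "coffee tomorrow at 2:00 PM" email:
print(parse_event('{"title": "Coffee with Allen", "date": "2024-09-19", '
                  '"start_time": "14:00", "duration_minutes": 30, '
                  '"attendees": ["allen@example.com"]}'))
# The UI then shows this as a suggested event the user explicitly accepts.
```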

Allen Pike: Yeah, and generalizing that, that’s I think a really good pattern that a lot of teams seem to be having success with, which is to first build the thing where the workflow is really explicit-

Lorilyn McCue: Yeah.

Allen Pike: … and then people say they’ll do X with AI, which only a certain percentage of people will do, but when they do it, then there’s an opportunity to collect feedback and then iterate towards the future that we all know we want because there’s people, again, people on Twitter will be like, “I don’t want to do X with AI. I don’t want to do Y with AI. I just want the product to work.” It’s like, “That will be great in a year or two once we all figure out how to do that and the models get good enough and whatever. For now, if you just said, “Oh, I’m just going go insert your calendar event with AI without any opportunity for feedback, without any opt-in, then 20% of the time it’s wrong. Can you imagine?

Lorilyn McCue: Oh my gosh. People would hate us.

Allen Pike: Right, but there’s value in the feature, at least for some percentage of people. I personally find this stuff interesting, so I play with it, and I’m willing to try it with Superhuman. This week I was playing more with the AI features, knowing that we were going to be talking about this stuff, and I hit some edge case, and so I’m filling out the little form you’re talking about, like, “Oh, this is an example of good.”

Lorilyn McCue: Thank you. That’s so nice of you.

Allen Pike: It’s part of the fun as a product person. I’m like, “Oh, what does this flow look like?” I want to put positivity back into the world, and it benefits me if it gets better on this one case. Yeah, I love that, and I think of that as an arc, a flow of how we think about these things: from opt-in, to rating, towards it just being a seamless default thing that doesn’t even need to be labeled. That’s the goal, I guess, we all want to get to. It doesn’t even need to say “with AI”; it just works.

Lorilyn McCue: I will say, I think that is what I’m trying to do with the AI team. I don’t think this is the Superhuman way. The Superhuman way is absolutely to make it embedded and beautiful from the first get-go, but I feel that with the edge cases we find after a big exciting launch, we’re like, “Oh man, maybe we would’ve designed this differently if we had known some of these edge cases or if we had understood it differently.” I’m really trying to create this kind of rapid iteration culture as much as possible, at least with the AI team.

Allen Pike: Yeah, my instinct is that that’s the right balance. You’re in a similar tension to what we were talking about at the beginning of the show: you have a culture and a user base that really cares about polish, but you also have this huge opportunity where moving fast and learning rapidly is kind of necessary. You can’t just sit and watch. I mean, you could, but it’s probably not a great idea for the business to sit and watch and then, two or three years later, be like, “Ta-da, we built it all,” right?

Lorilyn McCue: Yeah.

Allen Pike: You see this in the Arc Browser, which is another one of those teams where they care a lot about user experience and polish, but they also want to explore these opportunities, and they do it in the exact same way you’re describing: “Beta, opt in.” Then folks like me go, “Oh, I’m curious.” I turn it on and I’m like, “Wow, this thing where it summarizes my tabs and chunks them into little sections and retitles them, I didn’t think it would be that useful, but it is super, super useful. Oh, now I love it.” But if it had just started doing that unprompted, I’m sure some percentage of users would be like, “What is going on?”

Lorilyn McCue: What?

Allen Pike: Cool. One last question before we run out of time that I think is on a lot of folks’ minds who are building with this stuff. How do you personally, and/or Superhuman as an org, if there are subtleties between those two, think about the cost of running all this stuff? We’re in a world where, on one hand, it’s all getting rapidly cheaper, and GPT-4o costs like a quarter of what it did six months ago, but it’s also still really expensive to run some of this stuff. Obviously, it’s still specific to the use case and your user base, but how does that factor into how you build this stuff right now?

Lorilyn McCue: Yeah. Our finance guy doesn’t love us. No, no, he’s great. I will say the principle that we use is always use the best model for the job. Again, we know the cost is going to halve in a few months, or quarter. We just ruthlessly prioritize the user. We also bring in a lot of ARR. Our new AI features are really exciting, and they bring in quite a few new users saying, “This sounds interesting. I want to try this.” We are also making the business money, and I think, obviously, people would probably think Superhuman wasn’t worth it if we weren’t keeping up with, or hopefully exceeding, the competitors on the AI features. I would say we don’t have a choice. We have to pay this amount of money to stay relevant and to provide value for our users. Fortunately, Superhuman is really well-funded and really well-run, so I think we have the luxury of doing that, but it is a big cost.

Allen Pike: Yeah, I know. I think having this SaaS business model helps, and like I’ve said, I’ve seen over the last few months that Superhuman now even has a more pro plan where you get additional AI features that are more expensive, like this Ask AI thing. Obviously, if you’re asking it to create complicated spreadsheets in the background, and it actually does do that, that’s going to be quite expensive.

Lorilyn McCue: By the way, it doesn’t do that yet, so please don’t try that.

Allen Pike: I know, but that’s the kind of thing you could consider. If people are paying hundreds of dollars a year more, I don’t know if that’s the actual right amount, but a substantial percentage more for the feature, then you could potentially, especially as the models keep getting cheaper, but just-

Lorilyn McCue: Yeah.

Allen Pike: … I don’t know if you want to tell your finance guy, even though I’m sure you probably know this. This is like the 80% of teams that I talk to are taking the same mindset where it’s like not everyone is as well-funded as Superhuman, but it’s that the cost of this stuff is going to keep going down as competitive risk to just being like, “Well, we’re going to use some tiny model that doesn’t really actually provide a very good user experience. The worse case scenario, I think that what a lot of teams adopt is, if it’s so expensive that we just can’t manage it, ten, we just turn it on for a certain cohort of users or beta users or opt-in or whatever. We still have to be experimenting with the cutting edge stuff and then it’s open AI and Anthropics’ job to keep making it cheaper, which so far, fingers crossed, they’ll keep it out, but they’ve been doing a pretty amazing job on that so far.

Lorilyn McCue: Good. I’ll tell my finance guy that. I’ll say, “Allen says it’s fine.”

Allen Pike: Yeah, just Allen Pike said so, so I’m sure he’ll be like, “Okay, it may be fine.”

Lorilyn McCue: Okay, cool. That’s good.

Allen Pike: Or he might just say, “Is he one of our investors? If not, awesome.” Well, this has been a wonderful conversation. I could talk about it for a whole nother hour, but for today, that’s our time. Thank you so much for making the time and sharing some of this stuff. It’s been-

Lorilyn McCue: Oh-

Allen Pike: … really cool.

Lorilyn McCue: … yeah. Thanks for having me on. I was very like, “Oh, same company as Merci and Johnny and everybody. Ooh, that’s exciting.” Yeah, what an honor.

Allen Pike: Yeah, I feel like you’ve held your own very well-

Lorilyn McCue: Oh, phew.

Allen Pike: … in the conversation to join our guest cohort. Where can people go to find more about you and the work that you do? I feel like you’ve held your own very well in the conversation to join our guest cohort. Where can people go to find more about you and the work that you do?

Lorilyn McCue: I am not really on social media, to be honest. I’m like a secret.

Allen Pike: I don’t blame you.

Lorilyn McCue: … stalker on social media. This is really boring, but probably the Superhuman blog; not an AI blog specifically, but the Superhuman blog has all the stuff about AI. That’s not really about me, but it’s about the work, which is more exciting than me.

Allen Pike: Hey, I mean, that’s what we’re talking about today, so that’s a great answer. I’ll link that up in the show notes. Thanks for being on the show. It Shipped That Way is brought to you by Steamclock Software. If you’re a growing business and your customers need a really nice mobile app, get in touch with Steamclock. That’s it for today. You can give us feedback, follow us on social media, or rate the show by going to itshipped.fm/contact. Until next time, keep shipping.
