Calendar apps are important for productivity, but it's hard to differentiate enough to sustain growth from the core usage alone. Y Combinator-backed Superpowered, an AI-powered notetaker for your meetings that doesn't involve recording bots, hit this roadblock and is now pivoting to become Vapi, an API provider that lets anyone easily create a natural-sounding, voice-based AI assistant.
Superpowered was founded in 2020 by Jordan Dearsley and Nikhil Gupta. But after three years of working on it, Dearsley said the team wanted to work on a more challenging product. The company is not shutting down the initial product: the startup said Superpowered is profitable, and it is in the process of bringing someone in to run it. Y Combinator said in June that more than 10,000 people were using the product weekly, but the company didn't provide any updated numbers.
To date, Superpowered/Vapi has raised $2.1 million in seed money from investors including Kleiner Perkins and Abstract Ventures.
Pivot to Vapi
The company offers Vapi as an API that lets developers create a bot using just prompts and then put it behind a phone number. It also offers an SDK integration so developers can embed the bot in websites and mobile apps.
Dearsley told TechCrunch over email that the idea to build Vapi stemmed from a personal problem. He had moved to San Francisco and started missing his friends and family, who were in a different time zone. He built an AI bot attached to a phone number so that he would have someone on the other end to talk to in order to sort out his thoughts.
“I liked it, but I was continually frustrated with how unnatural it was. It wasn’t like talking to a person. The voice sounded off, there would be long delays before it responded, and it would interrupt me while I was speaking,” he said.
“So I kept working on it and going for my walks with it. Eventually, we got interested in this conversation problem. It’s really hard to make something feel human. Voice assistants today are clunky and turn-based; we want to build something that feels human.”
Technically, Vapi currently strings together a number of third-party APIs to build a robust voice conversation platform. For instance, it uses Twilio for telephony, Deepgram for transcription, Daily for audio streaming, OpenAI for responses, and PlayHT for text-to-speech.
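To make the chained design concrete, here is a minimal sketch of a single conversational turn passing through three stages: transcription, response generation, and text-to-speech. It is not Vapi's code or API. The class name `VoiceTurnPipeline` and the three stub functions are hypothetical stand-ins for the third-party services, and telephony and audio streaming are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurnPipeline:
    """Hypothetical chain of the three per-turn stages (not Vapi's actual API)."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    respond: Callable[[str], str]        # language-model stage
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def handle_turn(self, caller_audio: bytes) -> bytes:
        # Each stage consumes the previous stage's output, so their
        # delays add up before the caller hears anything.
        text = self.transcribe(caller_audio)
        reply = self.respond(text)
        return self.synthesize(reply)


# Stub stages standing in for real services (assumptions for illustration only).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoiceTurnPipeline(fake_transcribe, fake_respond, fake_synthesize)
reply_audio = pipeline.handle_turn(b"hello")
```

Because the stages run one after another in this sketch, any one slow provider delays the whole reply, which is one reason a chained design can feel turn-based.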
ScaleConvo, a startup in YC's winter 2024 batch, is already using Vapi to launch conversational bots for sales teams and property management companies. However, Vapi didn't disclose its other clients. The company is opening up its API with the Vapi Phone and Vapi Web products today.
Challenges for Vapi
One of the biggest challenges the startup faces is reducing latency, according to Magnus Revan, an ex-Gartner analyst and chief product officer at multimodal conversation startup Openstream.ai.
“OpenAI models need between 2-10 seconds to generate an answer, while on the phone the gold standard is to have 700ms between the user finishing talking and then the ‘bot’ starting to talk. And getting to sub 1-second latency with capable models (high parameter count open-source models like LLaMA2 70B) is really hard,” Revan said.
Currently, Vapi has a latency of 1.2 to 2 seconds, depending on various factors. Dearsley expects to bring latency down to below one second in the next month through Vapi's own work and OpenAI's improvements.
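The arithmetic behind these figures can be sketched as a simple latency budget. Only the 700ms target comes from Revan's quote. The per-stage numbers below are hypothetical, chosen so that the total lands inside the 1.2 to 2 second range reported for Vapi, and they assume the stages run strictly one after another.

```python
from typing import Dict

TARGET_MS = 700.0  # "gold standard" gap on the phone, per Revan's quote


def total_latency_ms(stages: Dict[str, float]) -> float:
    """Sum per-stage delays, assuming the stages run sequentially."""
    return sum(stages.values())


def over_budget_ms(stages: Dict[str, float], target_ms: float = TARGET_MS) -> float:
    """How far the total exceeds the target (0.0 if it is within budget)."""
    return max(0.0, total_latency_ms(stages) - target_ms)


# Hypothetical per-stage delays in milliseconds (illustrative, not measured).
example_stages = {
    "transcription": 300.0,
    "model_response": 700.0,
    "text_to_speech": 300.0,
    "network": 100.0,
}

total = total_latency_ms(example_stages)   # 1400.0
excess = over_budget_ms(example_stages)    # 700.0
```

In this illustrative breakdown the model response alone uses up the entire 700ms target, which is why faster model responses matter so much to the overall figure.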
Mohamed Musbah, an angel investor in Vapi, also said that the startup's solution will improve with overall advances in APIs.
“As OpenAI and others improve their models, Vapi’s platform will become more powerful, equipped with better knowledge bases, code execution capabilities, and larger context windows. Vapi’s focus on solving the highest friction areas in voice communication will be its edge as user demand grows for voice assistants,” he said.
However, this puts the onus on the improvement of other solutions rather than on Vapi itself. Dearsley said that reliance on other APIs reduces Vapi's defensibility if big companies start moving into that area. Still, the team said it has an edge in having built infrastructure to handle thousands of calls simultaneously. Dearsley emphasized that with the public launch of Vapi's web and phone APIs, the team will also look to build its own models for audio-to-audio solutions.