Series B Closing Note
This is a copy of the email that was sent to all the investors after the closing of Series B.
We have received the wires from everyone — thank you!!
I was looking at the closing notes from our Seed and Series A rounds [see below]. I was about to send the same note today, but after spending some time thinking about it, I felt it made sense to share a more detailed one.
As more people get involved and the stakes get higher, there will be more voices and more opinions; there will perhaps also be some fear and uncertainty.
One way of starting this note could have been to assure you that we are on the path to greatness and that limitless opportunities lie in front of us, giving you reasons to believe that we’d be successful. I think that would be a partial and misleading note. There are many challenges ahead and there are some real risks.
Before we dive into the challenges and risks, let’s take a quick tour of our history — just to understand how things came to be, our good times and bad times, and perhaps at the very core, who we are.
It was Aug 2015, and we really had no idea what we wanted to build. Pete and Jerry had invested $3.75M at that time in a vague system architecture called DataHub, which argued for a Windows/Mac-like operating system design that would provide an interface for end-user data applications by abstracting away all the infrastructure and other complexities underneath [http://istc-bigdata.org/
We didn’t really know which apps would be useful or what would really work, so we started with two random apps: “Notebook”, for universities, and “Refiner”, an app for data scientists to help structure text data coming from sensors and other machine-generated sources. We also built a couple of visualization apps, “Lens” and “Viz”, primarily as eye candy for demos. That was the whole company in the beginning.
The first 12-18 months [until the beginning of 2017] were primarily focused on building the foundations of the operating system: a general-purpose abstraction for files [File Service], a general-purpose abstraction for tabular data [Table Service], and a general-purpose abstraction for processing tasks [Distributed Task Processing]. None of these had any real value or practical application on its own; they were foundations being built for something unknown.
It was late 2016 [we were a 3-person company]. In a random demo, a data scientist from an HR & Benefits company asked us: “I want to do some analysis over health insurance plans. Wouldn’t it be great if we could use Refiner over the insurance plan docs, which we get as PDFs from different insurance provider websites [Aetna, Anthem, etc.]?”
We didn’t have anything to handle PDFs, but we knew how to do extraction from text. We suggested an idea: copy the PDF text, pass the text file to Instabase, and we will take care of things from there. Here is a demo video of that use case:
This video shows how raw and experimental we were: one had to open the PDF in Chrome manually, copy the text, paste it into Instabase to create different programs, and then use Notebook to stitch them all together for end-to-end processing.
It was scrappy but it still captured their imagination — they wanted to do a deal. We never closed a deal with them because of a large re-org in that company [the entire leadership team (including CEO) got replaced] but it was our first encounter with what would later become one of the key use cases. This was the genesis of what would later become the “Flow” app.
It was early 2017. David was on a trip to Lake Tahoe with his friends [and friends of friends]. He was doing a demo of Instabase, and a person in that circle happened to be from a lending company. That person asked, out of curiosity: “My company too has to deal with a lot of PDFs, but those PDFs are scanned images and a lot of camera pictures. Would it be possible to add an OCR app that can read all the text from the images, so that we can then pass that text to the Refiner app to extract some key fields (Net Pay, Gross Pay, etc.) from them?”
We didn’t have anything to read text from images (OCR), but we decided to pursue it. We assembled some open-source components and cloud libraries to read text from images and then pass the text to Refiner to extract the structured fields as output. It was scrappy but still much more accurate than anything else out there. The mortgage company decided to engage with us. In the process, we realized that they didn’t even know what kinds of docs they got, so they also needed a way to classify those images before any fields could be extracted from them [for example, W-2s have different fields from paystubs, so being able to classify the document is an important step]. We added a new app called Classifier. To learn what different kinds of documents existed, they wanted functionality to group similar documents together; this would later become the “Cluster” app. Eventually, we also ended up building our own “OCR” app.
It was March 2017 [we were a 4-person company], and we had no revenue. We had just started a POC with the lending company. Martin dropped by our office one day, saw the demo, and offered to lead our Series A. We had a small startup paying us $2k/month; we tried to increase the price to $50k/year, and they decided not to continue with us. This happened during the Series A due diligence process, and we were back to zero revenue. Frank Chen from a16z made an important note on the due diligence call: we had three customers before the Series A, and zero after the Series A.
We started another POC with Standard Chartered Bank around June 2017. They needed OCR, Classifier, Refiner, Cluster, and Flow but they also needed something totally different — a way to do Fraud search. This was new territory, but we had a very flexible platform — we added a new app called “NLP Search” [this app was built in 6 hours].
Here is a demo of the “NLP Search” app at that time:
Over the next 8-10 months, we built 30+ new apps that served a number of use cases. We slowly started recognizing the logical white spaces we had no clue about in the beginning.
The reason for giving this historical perspective is to help understand who we are, and how things came to be. The core of Instabase isn’t about these apps and the use cases we serve. That’s not who we are. Beneath the surface, there is this architecture that represents us more than anything else — it represents the belief that we don’t really need to be good at predicting the future as long as we remain curious, fearlessly experimental, and have the ability to react faster than anybody else. It allows us to quickly recognize problems, push experimental ideas to our customers, test new ideas, and at times correct mistakes [completely replace an old app with a new app that takes an entirely different approach].
Today, when people see Instabase, it’s easy to take things for granted — these apps — OCR, Refiner, Flow, etc. — they see these as a result of how smart we are [I think that’s what made Sarah, Will, and Mike interested in us]. The story above tells you how things came to be, and presents the real picture: none of these were the ideas we came up with — we have always been uncertain, and clueless — today, we are as uncertain as when we started.
With this historical perspective, let’s turn our focus to the risks.
First, the technology risks: we should be aware that we don’t have all the answers. While it’s true that we have a very strong engineering team and a product that truly captures people’s imagination, the reality is that we still don’t have many answers.
Just take the area of document processing: there remain many open problems, for example understanding long-tail documents with unknown structure, understanding natural language text in an unseen domain, reading text from poor-quality images accurately, and reading handwritten text accurately. We are far, far away from being able to say that we can understand a document of any kind from any domain.
With the kind of engineering team we have, we might be tempted to believe that we can force any problem into submission with technology. We should be very careful in making that assumption. Many problems we are going after are very hard and we won’t get things right from the start; some problems we might never be able to fully solve [unless we find a way to solve AI-complete problems].
While technology is going to be an indispensable part of everything we do, we should be aware that it may not hold all the answers. We need to be equally creative in other areas such as our product strategy, market strategy, financial models, distribution models, and pricing.
This brings us to the second risk: market risks.
This risk is far greater than the first one — in fact many technology risks I highlighted earlier can be eliminated if we are very creative.
History teaches us that companies and products fade away because some newer technology or approach makes them irrelevant. What is unique about Instabase is its app-based approach to problems, which gives us a two-pronged advantage: first, as long as we can build a new state-of-the-art app that can replace our old/legacy app, we should be fine; and second, for certain areas where we aren’t best qualified to build the state-of-the-art apps, we must find ways to make those apps run exclusively on Instabase.
That’s why being creative in product strategy, market strategy, financial models, distribution models, and pricing is even more important than being creative in technology.
We haven’t built the core foundation for these areas yet, and there will be a sense of urgency to act fast to fill these gaps. As we begin to build this foundation, I worry about blind spots — the things that keep us from grasping the bigger picture. The scary part is: I don’t even have a clue where my blind spots begin and where they end.
I do believe that we have a huge opportunity ahead — if we are able to build the right foundation, we can shape how desktop computing applications are built, distributed, and consumed. If we get this right, we get to define the next few decades of computing!
What is necessary to bring every single organization (from large to small) onto Instabase’s platform? How do we enable developers and organizations across the world to create apps on Instabase? Do we focus on the platform aspect, or on apps? Do we focus on high value operational apps for large enterprises, or self-service productivity apps that can attract a large number of small and medium businesses? Do we optimize for distribution of the platform to a large number of accounts [regardless of their size], or do we optimize for key use cases in a certain industry?
At the very core, what is the foundation that enables us to instrument these?
We’re going to have our share of bad ideas, and we should allow for the possibility that we’ll make mistakes along the way. The few stories I presented earlier about how things came to be at Instabase highlight this fact: we were never certain, we were never sure about the future, and we never had a clue what would really work. But here is an important thing to note: at the core, we were able to build something that allowed us to engage with the world, not as a high-minded voyager, but as a curious explorer and adventurer, where we could take a crack at problems we recognized as worth pursuing.
I hope we are able to build something enduring, something we all can be proud of!