Transcribing patient notes is a very labor-intensive process. Hospital districts in Finland, where medtech firm Inscripta is based, hire hordes of professional transcriptionists to do the work manually. With advances in voice recognition, it’s becoming easier and cheaper to transcribe medical records, passing savings on to hospitals and patients alike. Our partner Inscripta is doing just that, and Espeo Software helped make it a reality. Below is an interview with Simo Sorsakivi, CEO and founder of Inscripta, conducted by Jacob Dunn.
I know that Inscripta is a transcription service using speech recognition. So what inspired you to get into this industry? Why did you decide to go into this field?
It’s a question that I often get asked. It’s not a very well-known business at all, but people generally understand when I tell them that we have a solution that helps medical professionals draft their patient notes. They generally don’t realize how big a business it is, because it involves every single medical professional, at least in Western-style multidisciplinary healthcare systems, where you usually have a legal obligation to type in patient notes whenever there’s any interaction between you and the patient.
In Finland, we have hospital districts that cover a wider range of secondary or tertiary health care. Even specialized healthcare centers have their own transcription departments that employ hundreds of medical transcriptionists. I happened to be one of those almost 10 years ago.
So I saw the pain firsthand: everything is still done in a very old-fashioned way, using separate dictation devices that cost more than a smartphone for some weird reason, even though they don’t do anything other than record voice. Then the processing is carried out entirely manually, so you have to employ hundreds of people in each hospital district. For example, the Helsinki Hospital District, the hospital district of the capital region, has 330 employees working day and night doing just manual medical transcription.
I came from cognitive science, which is the study of intelligent artificial systems. In our case, we chose to apply artificial intelligence to the transcription part. So it combines my studies with the side job I had while studying.
In a previous interview, you compared Inscripta to how a child might learn. So how does it look in practice?
We don’t teach the system any grammar or anything like that. We just feed it data and it learns by itself. In that sense, it’s like a child. You just put it into an environment where it is exposed to a huge amount of data, and from that data it can extrapolate its own understanding of the inner workings of the particular language in use. In that sense, it can adapt to any kind of language. For example, we are a Finnish startup, so it was natural to get our first customers from Finland, but the people working on the algorithm don’t speak Finnish. So that’s validation in itself: we’ve simply adapted the solution to the language we have over here, which people might even consider quite a difficult one.
Yes, and there are very few native speakers. How did you collect enough data for it to be such a self-sustaining machine learning system?
Actually, the data collection is part of our transcription process. The transcription service we provide has a double function: it also provides us with the optimal data for this particular use case, including the medical lingo. So we might enter a totally different language area where there wouldn’t be any open-source, or maybe even commercial, data sets available at first. We could then just start from scratch, having the medical professionals dictate the patient notes and having the customer’s local transcriptionists or other personnel, whom we still use, annotate and correct the possible errors in the automatic transcripts.
That person would begin entirely manually. But incrementally, within a short time, after we’ve gathered some tens of hours of data, the system starts to build up a primitive understanding of the local language. After a couple of hundred hours, it already starts to provide meaningful automatic transcripts that actually speed up your work, and after about 1,000 hours you are pretty much at human parity, if you compare our solution to a person who is not a professional transcriptionist. So those are pretty much the steps. The service production serves as a data collection utility in that sense as well.
Does Inscripta work in all languages?
Yes, we’re truly language agnostic. Whether it is Polish or whichever language our customers want us to provide services in, we can actually do that. And the funny thing is, for example, that we just did a project in Arabic speech recognition for MIT, and also for the University of Cairo. So it’s Arabic and Finnish and Polish and these kinds of languages.
I know that with medical records, privacy and security are major concerns. How do you protect the sensitive nature of medical records?
Absolutely. That’s already the key element in our business. We’re obliged to follow the same level of privacy as any private medical practitioner would. It’s built so that we don’t actually care about the contents of the data. We use it only for performing the task at hand, which is converting audio to text as accurately and as efficiently as possible. As for the actual contents, we really don’t even want to know, for example, who the patient in question is, and we don’t collect that kind of information.
By integrating our solution with the customer’s EHR (the electronic health record system, which houses the personally identifiable information), we are just acting as the processor. Our business logic is about processing efficiency in providing the service. After we’ve delivered the transcript to the customer, there is a 24-hour trigger after which we cannot even access it anymore.
You mentioned that the technology is continuing to learn and improve. Where do you see this in the next three to five years and what sort of applications do you foresee?
We often get asked if we provide an automatic front-end solution, where the medical professionals themselves communicate with the interface. It’s not the speech recognition part that’s difficult. It’s how you implement that speech-driven UI in the legacy systems. Let’s say you want to make a referral for some kind of imaging, or to another hospital or whatever, and you might have multiple different kinds of legacy systems that you would have to operate within. Naively, we tend to think that voice would be the best way to use the interface, but that’s still 5 to 10 years down the road at least.
Take the human element: we currently provide a hybrid model, where people correct and annotate the transcripts. Removing human proofreaders is likely the way that this business is going to develop. Of course, there have been different undertakings like this already, but they have pretty much all failed, or at least they haven’t brought any kind of cost efficiency to the actual workflows quite yet.
Some companies want to introduce a so-called Alexa for doctors, but then we hit these privacy issues. You have a machine that is continuously listening to the very private conversations that you have at the doctor’s office. It’s something that I would be really cautious about, especially under GDPR and given the mentality that we have overall about data privacy here in Europe. There is also the fact that some companies are unrealistically trying to come up with solutions that automatically create a patient note from a discussion between the patient and the doctor. But that, of course, would already be much more than a speech recognition solution. It would be a step towards a general AI if it were to actually understand the tone of the conversation and the hidden messages about what the actual symptoms are, for example in a psychiatric or psychological case. I really don’t see that happening within five to ten years at least. It’s impossible to predict, I think, but in the near future it’s more realistic that we will be seeing more of these actually functional front-end interfaces.
Is there anything that you are sort of worried about? Is there anything that you think could be used for ill?
Sure. Many people are already aware of how the global conglomerates that offer speech recognition solutions, such as Google and Amazon, access your data and how much they actually listen to you. People are generally surprised when they learn that their most private discussions might be exposed to other people.
Having your conversation recorded is something you should be aware of. That is, of course, not the case with us, because Inscripta is not voice-activated or anything like that. Instead, you have to actually press a button and then start recording.
Why did you decide to work with an external IT partner – with Espeo?
The main reason was definitely a lack of local resources. It was just so much easier to get started. It was clearly defined what we needed to do. For example, we had an iOS mobile application and we just needed to port it to Android, so that we would have the Android version done. It’s easier to have an external partner do that and get started on it immediately than for us to try to find a developer ourselves.
What are some of the key factors of successful collaboration with external providers?
I’d say that you need to know what you want. Projects need to be clearly defined. That’s something I guess I mentioned already, but if you don’t know what you actually want to do, what kind of functionalities you want in your application, or how you would like to do this or that, then it’s going to fail. But when everyone involved in the project communicates on a daily basis, it’s much easier to keep to a budget and meet clearly defined goals.
What advice would you give other tech entrepreneurs?
Get a shareholders’ agreement. In the beginning it’s all smooth sailing, but then the company becomes a multi-million-euro business, or for some perhaps a billion-dollar company. If you have any shortcomings in terms of who gets rewarded with what, you’re going to run into problems. Of course, we’re way before that. But you’re pretty much unfundable if you don’t do these kinds of things professionally.