From Memprotein Wiki
The skill of speech and the art of transcription have been blended to produce a new state-of-the-art technology known as automatic speech recognition (ASR) software. ASR programs are the talk of the town. Speech recognition has been a dream of ours since the good old days of The Exorcist and of science fiction movies and stories. Have our dreams come true? Today they have been partially fulfilled by the new arrivals on the market. Every company has joined the competition to offer the best speech recognition software on the world market.
What has happened to the race among them? It reminds me of the story of the hare and the tortoise. The slow and steady runner looks as if it has won the race, and yet it still has miles to go before it touches the finish line. And what exactly is the goal of the race? Whether it is getting to the top or reaching the people is, again, the million-dollar question. With the revenue pooled for speech recognition starting to drain, there is a need to analyze growth over time, which would show a flattened graph reflecting the stagnant state of the software's research and development.
Imagine a situation where you have invested a few thousand dollars a month in speech recognition software and find it worthless because it types your dictation wrongly: words are replaced and jumbled, and the context changes. What chaos that would create. The frustration at such times is truly unbearable. Flawless products and services are nowhere to be found, since everything on earth has its own benefits and drawbacks. This applies to speech-to-text software as well. It has its own flaws and demerits, which limit its use to a small community. The technology needs more attention and research before it can cope with languages that have developed over countless years.
The ethnologue of the world's languages seems long and unending. The languages that people speak today are the product of development over countless years and the efforts of countless generations. All animals communicate with one another, but only humans have organized their communication into a predefined set of signals known as language. The cortical speech center is likewise an evolutionary feature that only humans possess, and it differentiates the human brain from those of the other animals in the animal kingdom. Hence speech recognition software, which has a very recent history compared with language itself, has to travel, if not millions of years, then at least many years before it understands even a little about the speech and languages of different groups of people.
The drawbacks of voice recognition, or audio-to-text, software are:
Training: It cannot understand every word, even after hours spent training the program. Time matters; after all, we have only 24 hours in a day.
Punctuation: Every punctuation mark, such as a comma, full stop, semicolon, or hyphen, has to be dictated by the speaker wherever he or she wants one.
Context: Some words, especially in English, have many meanings and must be used in the correct context to give good results in the records. The program does not seem to understand the context in most places.
Homophones: Words that sound the same or nearly the same but differ in spelling and meaning, for example elicit-illicit, desert-dessert, there-their, flour-flower, and bowel-bowl, are used in different contexts and confuse the software, leading to bloopers and hilarious phrases and sentences.
Accents: The other major black mark against speech recognition is that it cannot understand the varied accents found within a single language. If the software has difficulty with words spoken in a neutral accent, how can it ever comprehend the different accents used by people around the world?
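The homophone problem above can be made concrete with a small Python sketch. It is only an illustration, not how any real speech recognition product works: the word groups are the pairs listed above, and the names `SOUND_ALIKE_GROUPS`, `candidates`, and `possible_transcripts` are hypothetical, made up for this example. It shows that a program working from sound alone is left with several equally plausible transcripts and needs context to choose among them.

```python
from itertools import product

# Hypothetical data for illustration: the sound-alike pairs listed above.
SOUND_ALIKE_GROUPS = [
    {"elicit", "illicit"},
    {"desert", "dessert"},
    {"there", "their"},
    {"flour", "flower"},
    {"bowel", "bowl"},
]


def candidates(word):
    """Return every spelling that could match the sound of a spoken word."""
    word = word.lower()
    for group in SOUND_ALIKE_GROUPS:
        if word in group:
            return sorted(group)
    # Words outside the groups have only one candidate spelling here.
    return [word]


def possible_transcripts(sentence):
    """List every transcript that sounds like the given sentence."""
    options = [candidates(word) for word in sentence.split()]
    return [" ".join(choice) for choice in product(*options)]


if __name__ == "__main__":
    for transcript in possible_transcripts("their flour"):
        print(transcript)
```

Two ambiguous words already give four candidate transcripts, and the count doubles with each further sound-alike pair in a sentence, which is why the software needs to understand context to pick the right one.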
In 1997, Bill Gates stated publicly: "In this 10-year time frame, I believe that we'll not only be using the keyboard and the mouse to interact, but during that time we will have perfected speech recognition and speech output well enough that those will become a standard part of the interface." It is now three years past that ten-year mark, and yet speech recognition is still at a primitive stage of use and development.
Hence, to conclude, the transcription industry still has the edge over audio-to-text software. Transcriptionists are not obsolete. They have their own place and are needed in the field for their integrity, caliber, and experience in the industry.