Pages in topic: [1 2 3 4 5 6] > | Is OpenAI’s Whisper better than Dragon? Thread poster: Hans Lenting
| Hans Lenting Netherlands Member (2006) German > Dutch | Dan Lucas United Kingdom Local time: 10:10 Member (2014) Japanese > English Interesting, but... | Mar 12, 2023 |
Hans Lenting wrote:
Hey Michael: did you read this/do you have plans for this Sunday?
I agree, let's get Mr Beijer to do the hard work.
But isn't this online only? To put it another way, isn't all content recorded...? I think I read something about privacy concerns.
Dan | | | Tom in London United Kingdom Local time: 10:10 Member (2008) Italian > English
Hans Lenting wrote:
Who will be the first to tell us?
https://youtu.be/8SQV-B83tPU
Hey Michael: did you read this/do you have plans for this Sunday?
That guy in the video is I N T O L E R A B L E | | | Baran Keki Turkey Local time: 13:10 Member English > Turkish Tolerable or not | Mar 12, 2023 |
Tom in London wrote:
That guy in the video is I N T O L E R A B L E
Apparently his 2 million plus subscribers find him tolerable enough... He must be making shit loads more money from YouTube than you'll ever make from translation, by annoying people like you. | |
| Tom in London United Kingdom Local time: 10:10 Member (2008) Italian > English
Baran Keki wrote:
Tom in London wrote:
That guy in the video is I N T O L E R A B L E
Apparently his 2 million plus subscribers find him tolerable enough... He must be making shit loads more money from YouTube than you'll ever make from translation, by annoying people like you.
I'm doing fine, thanks. | | | Michael Beijer United Kingdom Local time: 10:10 Member (2009) Dutch > English + ...
Hans Lenting wrote:
Who will be the first to tell us?
https://youtu.be/8SQV-B83tPU
Hey Michael: did you read this/do you have plans for this Sunday?
I have indeed already been investigating how to use Whisper to dictate text in Windows. Haven't gotten anywhere yet, but I might ask over at the knowbrainer forums, where someone might know more.
See e.g.:
• https://www.knowbrainer.com/forums/forum/index.cfm
• https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=36875&highlight_key=y&keyword1=whisper (‘Whisper vs Dragon’)
I am currently on the fence as to whether to upgrade to the new Dragon Professional v16, also because Voice Access in Win11 is just so good already. In my testing it seems to be pretty much as good as Dragon already at dictating flowing text.
I also recently discovered you can send Voice Access to sleep and wake it up with a keyboard shortcut (Alt+Shift+B, which I have changed to something easier with AutoHotkey). You can of course also do this by voice, by saying: ‘Voice access wake up’ & ‘Voice access sleep’.
Voice Access also works really well in memoQ. You can't select words, but you can dictate your target segment perfectly, and you can even do basic things like confirm segments (by saying ‘Press control enter’) or go to the top of the file (by saying ‘Press control home’).
Another thing about Windows 11's Voice Access that's good is that it runs entirely in the cloud. Temporarily ignoring the privacy concerns that this might raise with some people, this means that dictating text will not require any CPU/GPU cycles from your computer, like with Dragon and Whisper.
[Edited at 2023-03-12 14:16 GMT] | | | Michael Beijer United Kingdom Local time: 10:10 Member (2009) Dutch > English + ... interesting post @ knowbrainer.com/forums/… | Mar 12, 2023 |
very interesting post by a person called rjwilmsi @ https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=36875&highlight_key=y&keyword1=whisper
01/30/2023 07:06
rjwilmsi
Power Member
Posts: 77
Joined: 08/24/2008
I've been playing with whisper more or less since it was first released on GitHub https://github.com/openai/whisper/. It's freely available on GitHub and after installation and download of the models it runs entirely offline. The installation is easy if you're a Linux user used to fiddling with Python/utilities from GitHub, probably a bit challenging if you are new to that sort of setup. I haven't tested the Windows installation of whisper, though; maybe that is packaged up and so easier. I use it in conjunction with a "whisper_mic" utility to get live dictation: https://github.com/mallorbc/whisper_mic
Compared to the last versions of DNS that I was using regularly (DNS 12 and DNS 13), for general speech and dictation the accuracy of whisper is much better. I don't know if DNS 15 is much more accurate than 12 or 13; if it's just incrementally better then I'd say whisper would be clearly better. Whisper has 5 different models trading time versus accuracy; on the fastest two models its error rate is much lower than I got with DNS (for live dictation of general speech in English). On the slower models (which are too slow on CPU for real-time dictation) the accuracy on everything I've played with (YouTube video audio etc.) has been so good that differences versus a transcript I'd do myself are nearly all differences about punctuation (how do you split sentences etc. from a speaker speaking off the cuff).
Because whisper is done by machine learning on big datasets of various audio sources it supports multiple languages, accents etc. so there is no concept of training it for your voice or specifying your accent. It means that it is very good at all sorts of accents and you don't have to cultivate your own custom trained profile etc. Also there is no big focus on your mic quality like with DNS.
The downside of whisper is that it's much more resource-demanding than DNS 12: on CPU you need a modern 6+ core CPU and have to use the tiny or base model (the fastest two). For the larger 3 models you need a workstation CPU (16 threads etc.) or, better, a reasonable NVIDIA graphics card to run in CUDA mode (e.g. a GTX 1060 or better), and unless you have a high-end GPU (RTX etc.) transcription would still be slower than real time on those larger models. Also there is no "training" you can do and no custom vocabulary, so if you have specific terms that it hasn't got in its model it may not work so well there, though there is a "prompt" option where you can pre-feed it words likely to be in the audio (so a sort-of custom vocab option). And of course its core is just speech recognition, so it doesn't of itself provide any computer automation / macro functionality.
So I think if you use DNS for general dictation / transcription and find DNS's accuracy lacking then whisper is very much worth looking at. If you use DNS tied into the environment of automation, macros and specific software integration then whisper doesn't cover that. | |
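For anyone who wants to try the "prompt" option mentioned in the quoted post, here is a minimal Python sketch using the openai-whisper package. The `pick_model` rule of thumb and the `build_prompt` helper are my own hypothetical illustrations, not part of whisper; `load_model`, `transcribe` and its `initial_prompt` argument are the package's own calls. Whether the prompt really helps with specialist terminology is something you would have to test on your own audio.

```python
from typing import Iterable

# Whisper's five model sizes, fastest first.
MODEL_SIZES = ["tiny", "base", "small", "medium", "large"]


def pick_model(has_cuda_gpu: bool) -> str:
    """Hypothetical rule of thumb based on the post: CPU-only machines
    stay on one of the two fastest models, a CUDA GPU allows a larger one."""
    return "small" if has_cuda_gpu else "base"


def build_prompt(terms: Iterable[str]) -> str:
    """Join specialist terms into one string to pre-feed to whisper."""
    return ", ".join(t.strip() for t in terms if t.strip())


def transcribe(path: str, terms: Iterable[str], has_cuda_gpu: bool = False) -> str:
    # Imported lazily so the helpers above work without whisper installed
    # (install with: pip install openai-whisper).
    import whisper

    model = whisper.load_model(pick_model(has_cuda_gpu))
    result = model.transcribe(path, initial_prompt=build_prompt(terms))
    return result["text"]
```

Usage would be something like `transcribe("dictation.mp3", ["memoQ", "Trados"])`, where the file name and the term list are placeholders.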
Dan Lucas United Kingdom Local time: 10:10 Member (2014) Japanese > English
Michael Beijer wrote:
very interesting post by a person called rjwilmsi
Thank you Michael, this is the thread I read. It's interesting and the technology certainly sounds promising. Hopefully the "prompt" function will allow specialist terminology to be used.
Those comments have also persuaded me that my next PC should have a proper GPU. If the PC has Thunderbolt 4 ports one could use an eGPU, but that would not be a portable solution.
My current system is fairly puny and has only integrated graphics, which is why I haven't bothered trying Whisper.
Regards,
Dan | | | Gerard de Noord France Local time: 11:10 Member (2003) English > Dutch + ... Graphics processing unit | Mar 12, 2023 |
Dan Lucas wrote:
Those comments have also persuaded me that my next PC should have a proper GPU.
Dan
That's the first thing I thought too. I hadn't thought twice about GPU before. Geeks like Milan, Hans and Michael can really guide us to stay ahead of the wave.
Cheers,
Gerard | | | Samuel Murray Netherlands Local time: 11:10 Member (2006) English > Afrikaans + ...
I can't tell if it's better or not, but I used the online version (i.e. via Google Colab) to transcribe some Zoom meetings. I hit my free limit after transcribing about 7 hours of audio. Then I bought 100 credits for $12. Having paid gives me access to "Premium" GPU, and I can confirm that it is a bit faster than the free tier. One annoying thing is that there is a usage timeout (which can occur even while a transcription process is running), at which point all data is lost, so it's best not to transcribe more than 3 hours of audio in a single go. It eats about 13 credits per hour.
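To put those numbers in perspective, here is a small back-of-the-envelope calculation in Python. The figures are the ones quoted above (100 credits for $12, about 13 credits per hour, at most 3 hours per session); the function names are just illustrative, and the post does not say whether "per hour" means runtime or audio, so treat the result as a rough estimate.

```python
import math

CREDITS_PURCHASED = 100   # from the post
PRICE_USD = 12.0          # from the post
CREDITS_PER_HOUR = 13     # approximate, from the post
MAX_HOURS_PER_SESSION = 3 # suggested limit because of the usage timeout


def hours_covered(credits: float, credits_per_hour: float) -> float:
    """How many hours a given number of credits lasts."""
    return credits / credits_per_hour


def cost_per_hour(price: float, credits: float, credits_per_hour: float) -> float:
    """Price of one hour at the given burn rate."""
    return price / credits * credits_per_hour


def sessions_needed(total_hours: float, max_hours: float = MAX_HOURS_PER_SESSION) -> int:
    """Number of separate runs if each run is capped at max_hours."""
    return math.ceil(total_hours / max_hours)


if __name__ == "__main__":
    print(f"Hours covered: {hours_covered(CREDITS_PURCHASED, CREDITS_PER_HOUR):.1f}")
    print(f"Cost per hour: ${cost_per_hour(PRICE_USD, CREDITS_PURCHASED, CREDITS_PER_HOUR):.2f}")
    print(f"Sessions for 7 hours: {sessions_needed(7.0)}")
```

On these figures, 100 credits last a little under 8 hours, which works out to roughly $1.56 per hour.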
When transcribing Zoom meetings and webinars, obviously it's faster to upload only the audio, so I use Pazera's converter to extract MP3 from MP4.
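If you would rather script that step, the same extraction can be done with ffmpeg instead of Pazera's converter. This is a different tool from the one mentioned above, and the sketch assumes ffmpeg is installed and on your PATH. The helper names and the quality setting (`-q:a 4`) are my own choices for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_mp3_command(video: str) -> List[str]:
    """Build an ffmpeg command that drops the video stream (-vn)
    and encodes the audio as MP3 next to the source file."""
    output = str(Path(video).with_suffix(".mp3"))
    return ["ffmpeg", "-i", video, "-vn", "-codec:a", "libmp3lame", "-q:a", "4", output]


def extract_audio(video: str) -> None:
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_mp3_command(video), check=True)
```

Calling `extract_audio("meeting.mp4")` would write `meeting.mp3` in the same folder; the file name is a placeholder.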
I'm not sure how much better Whisper is at transcribing such files than e.g. YouTube. It's faster (well, with Whisper the results are available within an hour or so, compared to 3-4 days with YouTube).
[Edited at 2023-03-13 16:21 GMT] | | | Try Buzz - Free Transcription App | Mar 13, 2023 |
Samuel Murray wrote:
Hans Lenting wrote:
Then I bought 100 credits for $12. Having paid gives me access to "Premium" GPU, and I can confirm that it is a bit faster than the free tier. One annoying thing is that there is a usage timeout (which can occur even while a transcription process is running), at which point all data is lost, so it's best not to transcribe more than 3 hours of audio in a single go.
It eats about 13 credits per hour.
I'm not sure how much better Whisper is at transcribing such files than e.g. YouTube. It's faster (well, with Whisper the results are available within an hour or so, compared to 3-4 days with YouTube).
Try Buzz - it is free and runs on your own PC.
https://www.youtube.com/playlist?list=PLG8jlFKr-RtchdAg069DGCFTpVBae6-3R
Buzz - Free Transcription App Tutorials, 3:11 min
David Mbugua
6. 1. 2023
Batch Transcription Using Buzz - How to Automatically Transcribe Multiple Files for Free Using Buzz
--
I downloaded it, unzipped it and ran it on my new PC. For example, I open ten MP4 files at once.
I select the language manually and I always use the Large size of the Whisper model. Buzz converts MP4 to MP3 automatically.
The results are TXT, SRT (or VTT) files.
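For readers curious what an SRT file contains, here is a minimal Python sketch that turns timed segments into SRT text. It assumes segments shaped like the ones in whisper's transcription result (dictionaries with `start`, `end` and `text`); it is not Buzz's own export code, and the function names are only illustrative.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Number each segment and join the blocks with blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

A VTT file is similar in spirit, but uses a `WEBVTT` header line and a dot instead of a comma before the milliseconds.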
Milan | |
Samuel Murray Netherlands Local time: 11:10 Member (2006) English > Afrikaans + ... | Thanks for the info | Mar 13, 2023 |
Samuel Murray wrote:
Malwarebytes is not happy with Buzz:
Samuel, thank you for the information. I'll inform the author if you haven't already.
"My" Buzz is not running on my main working PC. I have been using it for two months. Any system that creates new files tends to be suspicious.
https://github.com/chidiwilliams/buzz/releases/tag/v0.7.2
Did you test the source code?
--
I replaced version 0.7.0 with version 0.7.2, again using "windows.tar.gz". After unzipping the file, Windows 11 reported "Unknown publisher". I still use Buzz.
Milan
[Edited at 2023-03-13 21:49 GMT] | | | Samuel Murray Netherlands Local time: 11:10 Member (2006) English > Afrikaans + ...
Milan Condak wrote:
After unzipping the file, Windows 11 reported "Unknown publisher".
Yes, that happened to me too, but that by itself is not concerning. I have such faith in my anti-virus tools that I would install such a program anyway. But when I ran the program the first time, Malwarebytes complained and quarantined the program. Malwarebytes isn't always right, though -- I have a folder on my computer that is excluded from scanning from where I run programs that Malwarebytes doesn't like. The explanation given by Malwarebytes on their website about this particular threat is so generic that I'm tempted not to take it seriously.