Replies: 9 comments 16 replies
-
Updated to reflect changes in the file structure. Included parameter files for audio and transcriptions to allow for state resumption.
-
Updated transcription/parameters.json -> transcription/transcription.json
-
Maintaining an internal database of files introduces complexity and requires significant storage, since all the audio files are effectively copied from their original locations. So what are the advantages of this approach? Is the idea that users will be able to easily see and manage transcription entries in the app, and potentially even play back the audio or edit the transcription text inside the app?
-
This might deserve some user testing/research. This is something Tabula does (and I'm betting Tabula has a similar audience), but if I'm honest, I find it more annoying than useful. It's rare that I'm trying to extract data from the same file more than once, so really I just end up going through and clearing out this list from time to time. On the other hand, transcripts are something someone might be more likely to revisit than a CSV. In combination with other interface niceties that couldn't be reproduced in a text editor (say, being able to click on a sentence in the transcription to hear it in the audio file), I could imagine users revisiting or spending significant time with transcripts in our interface.
-
The main reason for us to do this is to build stability into the app. We could instead just keep the path that the audio file is on (it would be trivial to change this), but we would then have to build in resilience against a user moving or editing the file. My thought was that by keeping the audio file inside the application we can ensure that we have it and know where it is. I could add a "space saver" setting which changes the import behaviour. The functionality benefit would be playing the exact audio file back and showing transcriptions underneath, with the option to edit the transcriptions inside the app. I guess the question is, do we want a wrapper for Whisper or a full app that can manage transcriptions? (Or one, then the other?)
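A rough sketch of the trade-off described above, for illustration only. This is not the app's actual import code: `importAudio`, `ImportedAudio`, and the `spaceSaver` flag are hypothetical names, and a Node.js environment is assumed.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical record of where an entry's audio lives after import.
interface ImportedAudio {
  audioPath: string; // path the app should read the audio from
  copied: boolean; // true if the file was copied into the library
}

// Import an audio file into the app's library directory.
// With spaceSaver off (the default here), the file is copied so the app
// owns a stable copy. With spaceSaver on, only the original location is
// recorded, so the entry breaks if the user later moves or edits the file.
function importAudio(
  sourcePath: string,
  libraryDir: string,
  spaceSaver = false
): ImportedAudio {
  if (!fs.existsSync(sourcePath)) {
    throw new Error(`Audio file not found: ${sourcePath}`);
  }
  if (spaceSaver) {
    return { audioPath: path.resolve(sourcePath), copied: false };
  }
  fs.mkdirSync(libraryDir, { recursive: true });
  const target = path.join(libraryDir, path.basename(sourcePath));
  fs.copyFileSync(sourcePath, target);
  return { audioPath: target, copied: true };
}
```

Note that this sketch overwrites a library file with the same base name; a real implementation would likely need unique names per entry.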
-
Many PKM systems involve importing files directly, but I feel that what's more important is not the files themselves but rather the transcripts. We can soft link to the files on the hard drive, I guess.
Space saver as a setting is a good compromise.
About the songs, I believe that a more useful use case would be interviews and things like that, which can span from about 5 minutes to over an hour.
I spoke to my English teacher this morning and he mentioned the immense utility something like this could have for quickly reviewing past papers and assessments, which are often at least 15 minutes long.
Plus, if the files are uncompressed (should we compress them first?), they can take up even larger amounts of space (well over a gigabyte per hour).
On 5 Oct 2022, 8:55 PM +0400, Adam Newton-Blows wrote:
As a note to this, I wasn't considering transcription audio to be that large. Aren't songs just a couple of megabytes each?
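For a rough sense of the file sizes discussed above: uncompressed PCM audio takes sample rate × bytes per sample × channels bytes per second. The sketch below is only that arithmetic. `pcmBytesPerHour` is a hypothetical helper, and the three formats are illustrative assumptions, not measurements of any particular recording.

```typescript
// Size of one hour of uncompressed PCM audio, in bytes.
function pcmBytesPerHour(
  sampleRateHz: number,
  bitDepth: number,
  channels: number
): number {
  const bytesPerSecond = sampleRateHz * (bitDepth / 8) * channels;
  return bytesPerSecond * 3600;
}

// Assumed example formats:
const cdQuality = pcmBytesPerHour(44100, 16, 2); // 635,040,000 bytes (~0.64 GB)
const studio = pcmBytesPerHour(48000, 24, 2); // 1,036,800,000 bytes (~1.04 GB)
const speechMono = pcmBytesPerHour(16000, 16, 1); // 115,200,000 bytes (~0.12 GB)
```

So whether an hour of audio exceeds a gigabyte depends heavily on the recording format, which is one argument for checking or converting the format at import time.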
-
Another advantage of maintaining files in our directory (see #25 (comment)) is being able to use our directory as an RSS output, meaning we could have jobs added automatically, with the option to run them through transcription.
-
Looking over this again, I think we can actually work around it. The main issue is the storage, and I think that with the proper UI, and as long as we plan for those interactions very well, it's not that big of a deal. We just need to give the user the ability to monitor the file size of the database, so to speak, and also to see which recordings are taking up the most space. I think Telegram is a useful app whose UI we can model ours on, with a few affordances for things like the fact that we should never automatically delete. I think a small banner, or something on the bottom left, to show when usage exceeds a predefined size limit is a good idea.
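The monitoring idea above could look something like the following minimal sketch. All names here (`Recording`, `StorageReport`, `summarizeStorage`) are hypothetical, and the size limit is assumed to come from a user setting. The function only reports; in line with the "never automatically delete" point, it removes nothing.

```typescript
// Hypothetical shape of a stored recording.
interface Recording {
  name: string;
  bytes: number;
}

// What a storage banner or settings page might display.
interface StorageReport {
  totalBytes: number;
  overLimit: boolean;
  largest: Recording[]; // biggest recordings first
}

// Summarize library usage against a size limit without modifying anything.
function summarizeStorage(
  recordings: Recording[],
  limitBytes: number,
  topN = 3
): StorageReport {
  const totalBytes = recordings.reduce((sum, r) => sum + r.bytes, 0);
  // Copy before sorting so the caller's array order is left untouched.
  const largest = [...recordings]
    .sort((a, b) => b.bytes - a.bytes)
    .slice(0, topN);
  return { totalBytes, overLimit: totalBytes > limitBytes, largest };
}
```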
-
@oenu Can you give us an update on how the filesystem currently works, what the advantages and disadvantages of the current approach are, and possible alternative approaches?
-
I'm working on handling files from the script. I think it would be best if we used internal file paths for our data and then offered an export. This (hopefully) keeps the chances of a user going in and messing with loosely typed data (strings in VTT, for example) low.
https://github.com/electron/electron/blob/main/docs/api/app.md#appgetpathname
There is a set of standard locations that can be used. I'm thinking we use the appData/StageWhisper directory, which resolves to:
%APPDATA%/StageWhisper on Windows
$XDG_CONFIG_HOME/StageWhisper or ~/.config/StageWhisper on Linux
~/Library/Application Support/StageWhisper on macOS
Let me know if you have any thoughts.
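In the app itself, Electron's `app.getPath("appData")` would return the per-user base directory, and we would join `StageWhisper` onto it. The sketch below mirrors the locations listed above as a plain function, so the logic is visible and can be run without Electron. `stageWhisperDataDir` and its parameters are hypothetical names, not existing code.

```typescript
import * as path from "node:path";

type Platform = "win32" | "linux" | "darwin";

// Resolve the StageWhisper data directory for a given platform.
// env and homeDir are passed in explicitly (rather than read from the
// process) so the function is deterministic.
function stageWhisperDataDir(
  platform: Platform,
  env: Record<string, string | undefined>,
  homeDir: string
): string {
  const appName = "StageWhisper";
  if (platform === "win32") {
    const appData = env["APPDATA"];
    if (!appData) throw new Error("APPDATA is not set");
    return path.win32.join(appData, appName);
  }
  if (platform === "linux") {
    // Fall back to ~/.config when XDG_CONFIG_HOME is unset or empty.
    const configHome =
      env["XDG_CONFIG_HOME"] || path.posix.join(homeDir, ".config");
    return path.posix.join(configHome, appName);
  }
  // macOS
  return path.posix.join(homeDir, "Library", "Application Support", appName);
}
```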
It is my view that keeping files in WebVTT is our best bet for offering stable audio-sync capabilities, since allowing direct text editing would be a nightmare to copy across to a VTT file.
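To illustrate the point about keeping WebVTT as the stored format, here is a minimal sketch of an in-app edit that changes one cue's text while leaving its timing line untouched, so audio sync is preserved. `editCueText` is a hypothetical helper, not existing code. It assumes cues are separated by blank lines and does not handle every part of the WebVTT format.

```typescript
// Replace the text of the cue at cueIndex (0-based), keeping its
// identifier (if any) and timing line exactly as they were.
function editCueText(vtt: string, cueIndex: number, newText: string): string {
  const blocks = vtt.split(/\r?\n\r?\n/);
  let seen = -1; // index of the most recent cue encountered
  const edited = blocks.map((block) => {
    const lines = block.split(/\r?\n/);
    const timingAt = lines.findIndex((line) => line.includes("-->"));
    if (timingAt === -1) return block; // header or other non-cue block
    seen += 1;
    if (seen !== cueIndex) return block;
    // Keep everything up to and including the timing line, swap the text.
    return [...lines.slice(0, timingAt + 1), newText].join("\n");
  });
  if (cueIndex < 0 || cueIndex > seen) {
    throw new RangeError(`No cue at index ${cueIndex}`);
  }
  return edited.join("\n\n");
}
```

A usage example, with made-up cue content:

```typescript
const sample =
  "WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nHelo world\n\n" +
  "00:00:02.000 --> 00:00:04.000\nSecond line";
const fixed = editCueText(sample, 0, "Hello world");
```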
When a user wants their transcript, they can export it to an output directory. This also makes more sense for queuing multiple files: a user may direct output at an external/mounted drive which, given the long transcription times, could be unplugged/unmounted before Whisper is done.
@Stage-Whisper/developers