This implementation is fine, but the text-to-speech is meh. Google’s uses SoundStorm (https://google-research.github.io/seanet/soundstorm/examples/), which is unfortunately not open source.
https://mastodon.social/@alexcannan
also sometimes posts as lfcbot
This is hilarious. You should do standup.
Sorry for the ambiguity, I intend to use this for audio applications (specifically, I want to shift 20 kHz to 100 kHz down to the human hearing range, tunable by a pot).
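(For anyone curious, that kind of shift is a plain heterodyne, not pitch scaling. A minimal digital sketch of the same math, assuming a capture device fast enough to sample the ultrasonic band, say 192 kHz; the function name and parameters here are illustrative, not from any particular library:)

```python
import numpy as np
from scipy.signal import hilbert

def frequency_shift(x: np.ndarray, fs: float, shift_hz: float) -> np.ndarray:
    """Shift every component of x down by shift_hz via the analytic signal.

    This is a linear shift (40 kHz -> 40 kHz - shift_hz), not pitch scaling,
    so harmonic ratios are not preserved.
    """
    t = np.arange(len(x)) / fs
    analytic = hilbert(x)  # suppress negative frequencies
    return (analytic * np.exp(-2j * np.pi * shift_hz * t)).real  # mix down

# Example: a 25 kHz tone sampled at 192 kHz, shifted down 20 kHz -> 5 kHz tone
fs = 192_000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 25_000 * t)
audible = frequency_shift(tone, fs, shift_hz=20_000)
```

(An analog build does the same thing with a mixer; the pot would just set shift_hz.)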
Given the price and low supply, I think I’ll go ahead and try to wind my own transformers. Thanks for the video, seems perfect!
For the majority of human history, we’ve eaten around wood (around a campfire, a hearth, etc.), so it makes sense that it would become intertwined with our palate.
99 woodcutting!
In my experience, mypy + pydantic is a recipe for success, especially for large python projects
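(Not from the original thread, but a minimal sketch of what that combination buys you, assuming pydantic v2 and Python 3.10+; the model and function names are hypothetical:)

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    # Validated at runtime by pydantic; the same annotations are checked
    # statically by mypy across the whole codebase.
    id: int
    name: str
    email: str | None = None
    tags: list[str] = Field(default_factory=list)

def greet(user: User) -> str:
    # mypy knows user.name is a str, so no defensive isinstance checks needed.
    return f"Hello, {user.name}!"

# Runtime validation: raw dicts from an API boundary are parsed and coerced.
user = User.model_validate({"id": "42", "name": "Ada"})  # id coerced to int
print(greet(user))

# greet("not a user")  # <- mypy flags this before the code ever runs
```

(The appeal for large projects: pydantic guards the boundaries where untyped data enters, and mypy guarantees the typed core stays consistent.)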
What are your thoughts on Javier Milei?
Cross-modality is what’s missing. We have models that can produce text, hear things, and see things really well; what we don’t have is communication between these individual models, and probably an executive model to coordinate them. We’re probably still a few years from beginning to touch general intelligence.
How can setup get any easier than apt install jellyfin and then going into a web UI to add a few folders?
I’ve been using Jellyfin every day for a few months on my (very tiny) Debian server and have never experienced a memory spike like that. It handles music, HD video, even network streams without a hitch.
Why does anyone use Plex when Jellyfin exists?
Just like the trillions of parameters that make up machine learning models that can speak or create images.
Amazing handheld attention sinks + low-quality education during the pandemic seems like a deadly combo. I wonder if reading rates will bounce back at all given a few years.