Items tagged with: ML


Meta releases Spirit LM, a multimodal (speech and text) model. #Multimodal #LLM #AI #ML ai.meta.com/blog/fair-news-seg…


F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching! The quality is pretty impressive for open source, and it even supports mps for Mac! I was able to get it going on my Mac with no problem. #TTS #ML #AI
github.com/SWivid/F5-TTS
@ZBennoui


😲 OMG! Mlx-whisper transcribed 12 minutes of speech in under 18 seconds with excellent accuracy, using the new OpenAI model, whisper-large-v3-turbo, on my MacBook Pro with the M3 Max! ⚡ #OpenAI #ML #AI #Transcription huggingface.co/mlx-community/w…


By the way, the @openhomefoundation would be glad to get short recordings of your voice. You only need to say "OK NABU", so it's much easier than, say, #commonvoice
As many different speakers as possible are welcome, so that this wake word is reliably recognized later on :clippy:
ohf-voice.github.io/wake-word-…

#stt #stimme #voice #nabu #crowdsourcing #commons #nlp #ml #homeassistant #hass #iot #smarthome


I fed NotebookLM the full transcript of the 2024 presidential debate and asked it to create an audio overview. Find out which political side the AI is on. ROFL #LLM #ML #AI #NotebookLM @vick21 abcnews.go.com/Politics/harris…


😲 OMG, the Audio Overview feature on NotebookLM is wild! It basically creates a podcast with two AI-generated voices based on the source documents you upload. Definitely try it if you haven't yet. #LLM #ML #AI blog.google/technology/ai/note…


New Mistral 22B model: Mistral-Small-Instruct-2409 #LLM #AI #ML huggingface.co/mistralai/Mistr…


😲 Kyle Kabasares, a physics PhD graduate working at NASA's Ames Research Center, gave the methods section of his research paper to ChatGPT o1-preview and asked it to generate code based on the description. After just six prompts, it produced a working version of the code that took him a year to develop during his PhD. #ChatGPT #LLM #ML #AI youtube.com/watch?v=M9YOO7N5jF…


Does anyone have a recommendation for a #LlamaCPP alternative to run recent vision language models on Apple Silicon? Llama.cpp doesn't support any of the recent #VLM releases such as Qwen2-VL, Phi-3.5-vision, Idefics3, InternVL2, Yi-VL, Chameleon, CogVLM2, GLM-4v, etc.
MiniCPM-V 2.6 is the only recent model that was added. Maybe it's time to move on. :( #LLM #multimodal #AppleSilicon #MacOS #ML #AI


After a long period of inactivity on vision language models, llama.cpp merged support for MiniCPM-V-2.5. Hopefully support for 2.6 is also on the way soon. #LLM #Multimodal #AI #ML
huggingface.co/openbmb/MiniCPM…
huggingface.co/openbmb/MiniCPM…
github.com/ggerganov/llama.cpp…


Exciting news on open-source neural voices!
Our first experiment is complete with fantastic results! Check out the audio sample attached to this post.
For this month, @pneumasolutions provided GPU resources for training. I really appreciate their contribution.
This is just the beginning. To keep training going, I'm still accepting donations. Any amount helps.
I'm happy to receive your donations via PayPal:
paypal.me/geotts
Please mention mush42/tts in the notes.
#SpeechSynthesis #AI #ML


Interesting thought: If AI takes all our jobs and reduces overall purchasing power, who's going to buy stuff? #AI #ML youtube.com/watch?v=MYB0SVTGRj…


AlphaProof and AlphaGeometry from Google DeepMind tried the 2024 International Mathematical Olympiad and performed at the level of a silver medalist! #Math #LLM #ML #AI deepmind.google/discover/blog/…


According to the commit for download.sh on the meta-llama GitHub repo, we're getting these updates: llama-3.1-405b, llama-3.1-70b, llama-3.1-8b. #LLM #ML #AI github.com/meta-llama/llama/co…


The Llama3-405b base model was leaked on 4chan as Miqu-2. Miqu-1 was a leaked Mistral 70b model, which was confirmed by Mistral's CEO. The download size is 764GB, and it was briefly on Hugging Face but was taken down. The torrent is apparently still working. #LLM #AI #ML reddit.com/r/LocalLLaMA/commen…


If the rumors are true, this could be another exciting week for open-source LLMs! Meta may release Llama-3-405b this Tuesday. There could also be updates to the 8b and 70b models, distilled from 405B. Joe Spisak, a product director at Meta, says they were initially going to call Llama 3 8b and 70b a prerelease or preview because these models didn't have all the things they planned to release.
Sources:
theinformation.com/briefings/m…
x.com/AlpinDale/status/1814814…
youtu.be/r3DC_gjFCSA?feature=s…
#LLM #AI #ML


Time to buy the Mac Studio with 192GB memory! lol Rumor: Meta plans to release the largest Llama 3 model with 405 billion parameters on July 23, according to a Meta employee. "it will be able to understand and generate images and text." #LLM #AI #ML theinformation.com/briefings/m…


ok ok ok, we covered #jupyter last week. Our #datascience adventure continues with checking into the #accessibility of #google Colab. Is it better than jupyter? Worse? Are there tools to make it better? How does a #blind person even do #ML and datascience anyway? Come see in just over 2 hours at twitch.tv/ic_null and youtube.com/@blindlycoding #python #programming #coLab #selfPromo


Microsoft released Phi-3 Small, Medium, and Vision! #LLM #AI #ML huggingface.co/collections/mic…


Edited to fix link. Please boost for reach if this kind of stuff interests you. Will post more on this later.

Once upon a time, there was a cool emulator frontend called RetroArch. This emulator wasn't accessible until I and a few other gamers went to the developers and asked about adding accessibility. An amazing person known as BarryR made it happen. Now, if you turn on accessibility mode in settings, or pass the "--accessibility" (or something like that) flag on the command line, you get spoken menus, including the emulator's pause menu, which is good for saving states and such. Then came a script that used PIL and other Python image processing utilities, ran a server, and hooked into RetroArch; it allowed players to move around the map, battle, talk to NPCs, etc. The only problem was that no one wanted to test it. The blind gaming community pretty much spoke, saying that we want new games. We want cool, new, easy accessibility. So that's what we have now: follow the beacon, or get sighted help in the case of Diablo and such. It's sad, but meh. It's what we wanted, I guess. No Zelda for us. So, this is about as far as he got:

To expand on what devinprater was saying: I am working on an accessibility pack/service for Final Fantasy 1 for the NES (this was what was shown in the latest RetroArch update). The idea is similar to how Pokemon Crystal access works, but it's using the RetroArch AI Service interface to do so.
Right now, the FF1 access service is mostly done, but I need more testers to try it out and give me feedback on how it's working. Right now, you can get up to the point where you get the ship, but there's no code to deal with how the ship moves, so that still needs to be done. Likewise with the airship later on.
The service works with the latest version of RetroArch on Linux and Mac, but not Windows. This is due to how NVDA reads out the text; Windows support will have to wait until the next major update to NVDA, which will have a feature to fix this. If you have Linux or a Mac, I (or maybe devinprater) can help you set it up to test it out. The package itself is available at (new link because the old one broke yesterday): dropbox.com/scl/fi/ggffl769fx6…
#accessibility #finalFantasy #RetroArch #blind #emulator #emulation #Python #ai #ML #MachineLearning


Start saving money for that M4 Ultra with 500GB! Maybe this could be the first open-source model to surpass GPT-4! AIatMeta: "Llama 3 8B & 70B models are just the beginning of what we’re working to release for Llama 3. Our largest models currently in the works are 400B+ parameters and while they’re still in active development, we’re excited about how this work is trending." #LLM #AI #ML twitter.com/AIatMeta/status/17…


Apparently Meta is planning to release two small variants of Llama-3 next week "as a precursor to the launch of the biggest version of Llama 3, expected this summer." Command-R-Plus, Mixtral 8x22b, Google CodeGemma... All of a sudden, companies are releasing LLMs like crazy! Where's Apple? Maybe at WWDC 2024? lol #LLM #AI #ML theinformation.com/articles/me…


Following xAI Grok-1 314B, Databricks DBRX 132B, and Cohere Command R+ 104B, here's another big model drop, this time from Mistral: Mixtral 8x22B! #LLM #AI #ML twitter.com/mistralai/status/1…


Claude 3 can summarize up to about 150,000 words (a length similar to Harry Potter and the Deathly Hallows). It also outperformed GPT-4 and Gemini Ultra on industry benchmark tests, such as undergraduate-level knowledge, graduate-level reasoning, and basic mathematics. It allows users to upload images and documents for the first time. #LLm #AI #ML cnbc.com/2024/03/04/google-bac…


Funny Reddit thread about Tim Cook's comment: "the Mac is the best computer for AI." Apple fanboys defend it with unified RAM, and Apple haters attack with Nvidia GPU speed. lol #ML #AI #Apple #Mac reddit.com/r/LocalLLaMA/commen…


Zuckerberg says Meta is training #LLaMa 3 on 600,000 H100s! Well, time to finetune and quantize everything again when it comes out. lol #ML #AI #LLM reddit.com/r/LocalLLaMA/commen…


Interesting, Apple released Ferret, an open-source multimodal model! It's based on LLaVA and Vicuna. #AI #LLM #ML github.com/apple/ml-ferret/


Apparently Arthur Mensch, CEO of #Mistral, declared on French national radio that Mistral will release an open-source model equivalent to #Gpt4 in 2024. I don't speak French, so I can't verify, but it would be interesting alongside Llama-3 and whatever OpenAI has planned for 2024. #AI #ML #LLM radiofrance.fr/franceinter/pod…


Hello Fediverse,

We are looking for Text-to-Speech (TTS) expertise to help or advise us on improving the default voice of the Linux desktop. :linux: 📣

Please reach out or boost :boost_love:

Thanks!

#Linux #tts #accessibility #a11y #GNOME #KDE #FreeSoftware #freedesktop #ml


New blog post: Enhancing Accessibility with AI and ML

We discuss how to specialize #AI for understanding UI for #a11y testing, the importance of algorithm and data integrity, and how #ML can be used to simplify #a11y testing. deque.com/blog/enhancing-acces…

#a11y #AI #ML


#AI #FutureOfWork #ML

Researchers discover a more flexible approach to machine learning.

quantamagazine.org/researchers…


"Especially in this moment in history, it is vital that we provide our students with the critical thinking skills that will allow them to recognise misleading claims made by tech companies and understand the limits and risks of hyped and harmful technology that is made mainstream at a dazzling speed and on a frightening scale."

Excellent call to action by @Iris

irisvanrooijcogsci.com/2023/01…

#AIhype #MathyMath #AI #ML #ChatGPT


Maybe this could help FOSS developers make something like this for Linux or Android. This is an article from Apple Machine Learning, on how they made mobile apps accessible with machine learning:

machinelearning.apple.com/rese…

#a11y #apple #ml