codevoid.de/1/hn/comments_48521236.gph

        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
       
       
       COMMENT PAGE FOR:
  HTML   Show HN: Trace - Offline Mac meeting transcripts you can flag mid-call
       
       
        lee_ars wrote 10 hours 0 min ago:
        It works well so far, but what's up with the weird non-standard menubar
        menu? It's very odd, and it doesn't respect system light/dark mode
        preferences.
       
        yilugurlu wrote 12 hours 46 min ago:
        Looks awesome, I just bought. I'd love to see Office/Teams Calendar
        integration.
        Good luck!
       
          AG342 wrote 12 hours 41 min ago:
          Thanks. Trace integrates with your Mac calendar, so if you've
          signed into your accounts with those it will pick them up. No support
          currently for direct Teams/Exchange integration but I'll add it to
          the list.
       
        shireboy wrote 13 hours 43 min ago:
        I tried this but it took over my mic and people couldn't hear me on a
        Teams call until I turned it off.  Nice idea but needs to share the mic
        w Teams to be useful for me.  Not sure if it's Teams' fault or Trace's
        fault but either way...
       
          AG342 wrote 12 hours 16 min ago:
          Looks like echo cancellation is to blame here. I'm working on a more
          permanent fix but for now try turning off "Cancel system audio from
          the microphone" to see if that helps.
       
          AG342 wrote 12 hours 47 min ago:
          Someone else has also reported this, which is not coincidental...
          Have you tried this with other applications too to see if it's a
          Teams-specific issue?
          
          I'll take a look as my top priority.
       
        iorinu wrote 14 hours 10 min ago:
        I thought it was really interesting
       
        haaz wrote 14 hours 32 min ago:
        Really nice, I will buy now and test out. How did you make the video on
        your website? And are you based in UK?
       
          AG342 wrote 14 hours 4 min ago:
          Thanks for the support. The videos are Vue components with CSS
          animations. As mentioned in a previous reply I'm happy to share the
          component if it's of interest.
          
          Yep, based in Sheffield, UK.
       
        zmmmmm wrote 17 hours 22 min ago:
        I've tried so many of these and paid for a lot of them and I still
        can't find what I want. It sounds like this is closer than most:
        
        - record and separate two sides of the conversation
        
        - save meetings in a simple transcription format in a local folder
        
        - connect with my calendar (Outlook, Google Calendar) and name meeting
        transcripts accordingly
        
        - for recurring meetings, append rather than create a new transcript
        
        - let me label speaker voices and recognise those voices across
        different meetings
        
        A tool that did all this and then ALSO built a knowledge base to let me
        RAG query my meetings would be the holy grail for me.
       
        ectoloph wrote 17 hours 42 min ago:
        I've just bought it based on the description.
        
        Minor complaint is that it steals Cmd-Shift-P (Firefox Private Browsing
        shortcut) by default.
        
        Easy to change in the UI though, so no big deal.
       
        chid wrote 20 hours 15 min ago:
        Will this be available for iOS?
       
        scimonk wrote 20 hours 51 min ago:
        The App looks really interesting and I'd love to try it out. How well
        does it work in other languages than English? For me, German would be
        important.
        
        Due to audio quality, transcription sometimes produces garbled output
        or understands something wrong. FluidVoice offers the option to use a
        LLM to "interpret" the text to rescue garbled audio through
        context. Do you also plan to support something like this? This would be
        a great feature!
       
          ahamez wrote 19 hours 31 min ago:
          From the website ( [1] ):
          
          > Which languages does Trace support?
          English only, for now. Both transcription models, Fast and Accurate,
          are built for English audio. A recording in another language will
          still produce a transcript, but it won't be accurate: the model
          maps whatever it hears onto English words, so the result comes out
          garbled rather than failing outright.
          
          > If transcribing other languages matters to you, get in touch (see
          Contact below).
          
  HTML    [1]: https://traceapp.info/support#preferences
       
            scimonk wrote 18 hours 55 min ago:
            Thanks! Somehow I missed that part. Didn't look for language
            support in the "preferences" section I guess.
       
        littlecranky67 wrote 21 hours 53 min ago:
        Would love to use this app, I recently thought about coding something
        similar myself. I would need to only record my own voice due to privacy
        laws (here in Germany, you can record yourself without consent). With an
        over-the-ear headset, the microphone only captures my voice. Would need
        to store the original audio plus the transcription. Ideally, you can
        configure it to start as soon as it detects a new window with a given
        title (i.e. Webex launches meetings in a new window named "Meeting
        ....").
       
        triyambakam wrote 22 hours 59 min ago:
        My stack has been QuickTime and Assembly AI lol
       
        fandorin wrote 1 day ago:
        looks great! and good that you decided on the one-time fee instead of a
        subscription. side question: what did you use to create these
        videos/gifs on the homepage that shows the app? They look really good!
       
          AG342 wrote 1 day ago:
          Appreciate it, thanks. They're just Vue components with CSS
          animations. Drop me an email and I'll send you the code.
          
          hello@traceapp.info
       
            fandorin wrote 9 hours 17 min ago:
            thanks! will do
       
        tillcarlos wrote 1 day ago:
        Had the same idea, but have to focus on my main business. This comes at
        the right time!
        
        I just purchased it. What's the best way to give you feedback? (Do you
        want any?)
        
        From the top of my head:
        - will the mic switch automatically when I am at my office? Or do I
        have to change settings every time? Maybe a preference of what's
        available + auto switch would be good.
        - I personally don't need the hot key. Menu bar icon would be fine.
        - Download the model is a long process. Put it into the installer, not
        into the bar on the bottom
        - Speaker correction would be amazing. If it could "Learn" the speakers
        based on voice.
        - Overall neat app. Good animations and UX
        
            **Speaker 1** [00:00] What if I fell to the floor?
            **Microphone** [00:02] Yes, this is Phil, I'm just speaking, this should be my voice, and there's music in the
            **Speaker 1** [00:05] Couldn't tell this anymore
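
The sample above suggests each transcript line has the shape `**Speaker** [MM:SS] text`. Here is a minimal Python sketch for reading such lines back into structured data. The line shape is inferred only from this one sample, not from Trace's documentation, and `Utterance` and `parse_line` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Utterance(NamedTuple):
    speaker: str
    seconds: int  # offset from the start of the recording
    text: str


# Assumed shape, inferred from the sample in the comment: **Speaker** [MM:SS] text
LINE_RE = re.compile(
    r"^\*\*(?P<speaker>.+?)\*\*\s+\[(?P<m>\d+):(?P<s>\d{2})\]\s*(?P<text>.*)$"
)


def parse_line(line: str) -> Optional[Utterance]:
    """Parse one transcript line; return None if it does not match the assumed shape."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    seconds = int(match["m"]) * 60 + int(match["s"])
    return Utterance(match["speaker"], seconds, match["text"])
```

Lines that do not match (headings, blank lines, wrapped text) come back as `None`, so a caller can skip them or append them to the previous utterance.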
       
          AG342 wrote 1 day ago:
          Thanks for the feedback. Feel free to drop an email to
          hello@traceapp.info if you want a chat. Happy to hop on a call too if
          you'd prefer.
          
          For the switching, do you mean if you hot-swap during a call? The mic
          should auto-switch if you've got System default selected, but feel
          free to give it a go and report back. If it doesn't do what we expect
          I can absolutely take a look at changing the behaviour.
          
          Learning speakers is also on the to-do list.
          
          P.S. Great choice in test audio. What a banger.
       
        nightpool wrote 1 day ago:
        Is this legal to use in 2 party consent states? Might vary from state
        to state, which is probably why both Zoom and Meet require users to
        click through consent screens when meetings are being transcribed.
        Might be useful to have that on the FAQ page
       
          AG342 wrote 23 hours 58 min ago:
          Recording and consent rules vary quite a bit by country and region,
          so it's on whoever is doing the recording to get the consent they
          need and to follow the law where they are. Still, I like the idea of
          putting a note on the FAQ to make that clear.
       
          Gigachad wrote 1 day ago:
          It would be if you tell the person you are recording first.
       
        scosman wrote 1 day ago:
        Those transcription times are fast fast. What model/library do you use?
       
          AG342 wrote 1 day ago:
          Trace has two engines that you can choose from. The fast one uses
          NVIDIA's Parakeet-TDT 0.6b v3 model run through FluidAudio, which
          surprised me with how fast it was. There is also an accurate engine,
          which uses Whisper large-v3-turbo via WhisperKit, which is slower but
          holds up better on accents and jargon.
       
            scosman wrote 11 hours 37 min ago:
            ahh yes. I'm using Whisper v3 turbo via WhisperKit as well. Will
            play with parakeet
       
        mrkn1 wrote 1 day ago:
        The key moments feat is neat. Been working on a free opensource offline
        transcriber that runs fast on CPU and does diarization too
        
  HTML  [1]: https://github.com/kouhxp/yapsnap
       
        addozhang wrote 1 day ago:
        This is an excellent product and exactly what I've been looking for.
        But most of my meetings are done on my company Mac, and they definitely
        won't let me install this kind of software, even though I'd be willing
        to pay for it myself.
       
          geniium wrote 1 day ago:
          And if it ran in the browser without an install, it probably would
          not be able to record audio from your other browser (or app)
       
            z3ugma wrote 15 hours 5 min ago:
            Maybe we ought to be making little hardware passthroughs that plug
            into the headphone/mic jack and control them with idk the Caps Lock
            signal from USB HID to start recording.
            
            It's very cyberpunk eventually...the human operator of the console
            needs to be able to see and hear the screen and sound, there will
            always be an interface that can be adapted to a machine, however
            low-fidelity
       
            sofixa wrote 18 hours 26 min ago:
            Why not? You can run Google Meet, Zoom, Teams in the browser,
            including screen sharing with audio sharing. The browser APIs are
            there (e.g. [1] ).
            
  HTML      [1]: https://developer.mozilla.org/en-US/docs/Web/API/Screen_Ca...
       
              dsl wrote 17 hours 27 min ago:
              A webpage cannot provide a system I/O device (camera, microphone,
              speaker, etc.). That requires a signed driver on macOS.
       
        usernametaken29 wrote 1 day ago:
        I don't have this particular use case right now but if anything it
        feels like LLMs and their distilled on prem models are starting to kill
        SaaS simply because it becomes more and more tenable to build a
        "complete software" in a short time frame. That's freaking
        awesome. Good idea and love the return of the good old you buy, you own
        it mentality
       
        Myrmornis wrote 1 day ago:
        I will be happy to spend £10 on this. One feature question though --
        does it continue transcribing the meeting even if I've turned my volume
        down / muted it?
       
          AG342 wrote 1 day ago:
          It does indeed. Trace will record your system audio regardless of
          your speaker volume. You do have the option to mute your own mic
          temporarily though, via a button on the "pill" or a global
          keyboard shortcut.
       
            Myrmornis wrote 10 hours 28 min ago:
            Thanks. I've bought it and started using it; it looks great. I was
            previously using Hyprnote which did work well, but yours appears to
            fit my "I just want markdown" case better and to generally be more
            polished.
            
            I'll be wanting to find a good workflow to get the markdown
            transcripts into a git repo with file names that define a suitable
            sort order and also indicate what the meeting was. So would welcome
            your suggestions there. Not blocked of course, you make it easy to
            copy from clipboard or from the disk location and rename, but might
            be nice to have more control about where and how the .md lands.
            
            I might email the support address on the off-chance that you're
            happy to have support/feature conversations like this. Thanks!
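
One way to approach the file-naming part of this workflow, as a rough Python sketch: put a zero-padded timestamp first so that a plain alphabetical listing is also chronological, then add a slug of the meeting title. The function name `transcript_filename` and the exact pattern are assumptions for illustration, not anything Trace provides.

```python
import re
from datetime import datetime


def transcript_filename(started: datetime, title: str) -> str:
    """Build a name like 2025-03-04-0930-weekly-sync.md (hypothetical pattern)."""
    # Lowercase the title and collapse every run of non-alphanumerics into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Year-month-day-hourminute sorts chronologically as a plain string.
    return f"{started:%Y-%m-%d-%H%M}-{slug or 'untitled'}.md"
```

A small script could apply this when copying each transcript into the repo, taking the start time and title from wherever they are available (for example the calendar event).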
       
              AG342 wrote 10 hours 11 min ago:
              Please do drop me a line on hello@traceapp.info and let's chat!
       
        robertkarl wrote 1 day ago:
        This looks sick. I was going to download it but for $10 I am more
        willing to attempt asking Claude to implement something like it, than
        to purchase.
        
        I would be more willing to purchase if it was open source and I could
        build from source to try it first.
       
          PufPufPuf wrote 19 hours 50 min ago:
          AI is great at getting you 80% there. But you have to finish the
          remaining 80% yourself.
       
          addozhang wrote 1 day ago:
          I don't really recommend it. If the software is a one-time purchase,
          there's no need to rewrite it with an LLM. Rewriting the tokens could
          cost more than just $10.
       
            anonymouse008 wrote 1 day ago:
            * full price tokens, yes
            
            Not the subsidized subs
       
              plaguuuuuu wrote 1 day ago:
              I'd much rather spend $10 than have to sit at a prompt every day
              babysitting the thing, after working all day sitting at a prompt
              babysitting other things
       
          satvikpendem wrote 1 day ago:
          It's kinda funny how frontier LLMs change the game when it comes to
          software. If it becomes so good to make whatever little utility you
          want, why would I pay 10 dollars when an AI subscription is 20 bucks
          and I can build way more in a month for that $20? Especially since
          it's very likely people on show HN have simply used AI anyway, so why
          would I pay for your prompts?
       
        blopker wrote 1 day ago:
        Nice! I really like how many variations on this idea are coming out.
        MacWhisper used to be great, but is kind of a buggy mess now.
        
        I'm making my own, for personal use. I did a survey of many and they
        all (that I could find) skip the fundamentals.
        
        The major issues that I've run into:
        
        - Crash recovery. Most of these apps are incredibly buggy and crash all
        the time, taking the recorded audio with them. MacWhisper is incredibly
        bad at this.
        
        - Disk space. Many of these apps save wav files to disk. After a few
        hours of meetings, you may end up with gigabytes eaten.
        
        - Microphone bleed. People don't always use headphones, the system mic
        will pick up the speaker sounds, causing duplicate (approximately)
        transcriptions.
        
        I've yet to find a solution that handles all these correctly, let alone
        having high quality transcriptions.
        
        Anyway, most of these apps are built around [1] , if anyone is curious.
        Their readme has a big list of similar apps as well.
        
  HTML  [1]: https://github.com/FluidInference/FluidAudio
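
To put rough numbers on the disk-space point, here is a back-of-the-envelope Python calculation. The WAV parameters (16-bit PCM at 48 kHz) are an assumption, since the comment does not say what format those apps write. The 64 kbps figure is the AAC setting suggested in a reply further down.

```python
def pcm_bytes_per_hour(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM size for one hour of audio (ignores the small WAV header)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * 3600


def bitrate_bytes_per_hour(kbps: int) -> int:
    """Size of one hour of a constant-bitrate stream (1 kbps = 1000 bits/s)."""
    return kbps * 1000 // 8 * 3600


if __name__ == "__main__":
    stereo_wav = pcm_bytes_per_hour(48_000, 16, 2)  # assumed WAV format
    mono_wav = pcm_bytes_per_hour(48_000, 16, 1)
    aac_64k = bitrate_bytes_per_hour(64)
    for label, size in [("WAV stereo", stereo_wav), ("WAV mono", mono_wav), ("64 kbps", aac_64k)]:
        print(f"{label}: {size / 1e6:.1f} MB per hour")
```

Under those assumptions, stereo WAV comes to about 691 MB per hour, so three hours of meetings exceeds 2 GB, while a 64 kbps stream is under 29 MB per hour.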
       
          victorbjorklund wrote 12 hours 56 min ago:
          Handy works good with crash recovery (mostly from me turning off the
          computer mid-recording because I forgot about the recording)
       
          AG342 wrote 1 day ago:
          Crash recovery is definitely something that I want to spend a bit
          more time on. I'm not entirely sure how Trace handles crashing right
          in the middle of a recording, so I'm going to put a bit of time aside
          in the next few days to properly explore this and see if I can come
          up with an elegant solution to it.
          
          I think I've got the other two bits covered. I pushed an update
          yesterday that adds active echo cancellation so that audio playing
          through the speakers (or leaky headphones) won't get transcribed
          twice if it is picked up by the microphone. It can be disabled in
          preferences, but it's on by default.
          
          The disk space issue is one that I considered as well. By default,
          Trace deletes the actual audio recordings as soon as transcription is
          successfully completed, so the idea is you keep just the markdown
          transcript rather than the gigabytes of raw audio. If you want,
          there's a preference to disable the auto-deletion. There's a bit more
          on the support page here [1] (search for "Auto-deletion of audio").
          
          FluidAudio is a big part of this and is actually used in two places
          during a session. It runs the Parakeet EOU model for the instant
          recap (which isn't hugely accurate, but it's good enough for the job)
          and after the call it's also used to transcribe the recording,
          depending on which engine you've selected (Trace offers a fast and an
          accurate one). If the fast engine is selected, we use FluidAudio with
          the Parakeet-TDT 0.6b v3 model for transcription, which then goes
          through Pyannote and WeSpeaker for diarization. If the accurate
          engine is selected, we use WhisperKit with the Whisper large-v3-turbo
          model for transcription, and SpeakerKit for diarization.
          
  HTML    [1]: https://traceapp.info/support
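
The two post-call pipelines described above can be summarised as a small lookup table. This Python sketch only restates the comment as data. It is not Trace's actual code or configuration, and the `Engine` class and `ENGINES` name are made up for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Engine:
    runtime: str                 # library that runs the transcription model
    model: str                   # transcription model
    diarization: Tuple[str, ...] # components named for speaker separation


# Values taken from the comment above; the structure itself is hypothetical.
ENGINES = {
    "fast": Engine("FluidAudio", "Parakeet-TDT 0.6b v3", ("Pyannote", "WeSpeaker")),
    "accurate": Engine("WhisperKit", "Whisper large-v3-turbo", ("SpeakerKit",)),
}


def describe(name: str) -> str:
    """One-line summary of the pipeline for the chosen engine."""
    engine = ENGINES[name]
    return f"{engine.model} via {engine.runtime}, diarization: {' + '.join(engine.diarization)}"
```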
       
            kstenerud wrote 23 hours 37 min ago:
            For crash resilient data, you have a few options:
            
            - Journaling file structures (telegraph what you're about to write,
            then write it, then signal completion)
            
            - memmap your important data structures to a file (they will be
            flushed to disk no matter how your app dies - short of a power
            loss)
            
            - post-crash dump (put last-minute writers in a crash handler to
            save it to disk)
            
            A journaling file structure is the most secure, because it's
            designed with the assumption that writing will eventually fail.
            memmapped structs are easy and cheap, and get you 99% of the way
            there (only power loss will lose your data). Crash-time writing is
            doable with a crash handler like KSCrash, but there are many ways
            an app can crash without triggering a crash handler (thermal kill,
            exceeding quota, memory jetsam, etc). You also need to write your
            data in a signal-safe manner.
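
A minimal Python sketch of the first option, written as an append-only journal. Each record announces its length up front, then carries the payload, then ends with a checksum that acts as the completion signal. On recovery, reading stops at the first record that is truncated or fails its checksum. The record layout and function names are assumptions for illustration. A real recorder would also need to choose how often to pay for `fsync`.

```python
import os
import struct
import zlib

HEADER = struct.Struct("<I")   # payload length, written before the payload
TRAILER = struct.Struct("<I")  # CRC32 of the payload, written last as the commit marker


def append_record(f, payload: bytes) -> None:
    """Append one record and push it to disk before returning."""
    f.write(HEADER.pack(len(payload)))
    f.write(payload)
    f.write(TRAILER.pack(zlib.crc32(payload)))
    f.flush()
    os.fsync(f.fileno())  # ask the OS to write through its cache


def recover(path: str) -> list:
    """Return every complete record; ignore a torn or corrupt tail."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        (length,) = HEADER.unpack_from(data, offset)
        end = offset + HEADER.size + length + TRAILER.size
        if end > len(data):
            break  # the writer died partway through this record
        payload = data[offset + HEADER.size:end - TRAILER.size]
        (crc,) = TRAILER.unpack_from(data, end - TRAILER.size)
        if crc != zlib.crc32(payload):
            break  # commit marker does not match, so treat as not written
        records.append(payload)
        offset = end
    return records
```

For a recorder, each payload could be a short chunk of encoded audio, so a crash loses at most the chunk that was in flight.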
       
          scosman wrote 1 day ago:
          I had the same experience so started building my own. All problems
          are solvable, just working on the polish.
          
          - crash recovery: part one is use ADTS aac (even if process crashes,
          audio is saved up until it does). Part two is isolating the
          transcription/summaries in separate XPC services.
          
          - disk space: AAC 64kbps mono solves it. Could use Opus for further
          reduction but both are small.
          
          - speaker bleed: macOS voice isolation processing solves this. It's
          a nightmare to get set up, but works great once done.
          
          - library: using argmax SDK - by a bunch of ex-Apple on device AI
          folks.
          
          If it wasn't for CoreAudio, I'd say it was easy to make. Argmax,
          Whisper, and llama.cpp - wrapped in the right architecture, mostly
          just work.
          
          I'm having fun nerding out on the details like custom vocabulary
          (get the names of the people in the meeting right), inferring
          speaker names from transcript, calendar integration, nice UI, etc.
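
The reason ADTS helps with crash recovery is that every AAC frame carries its own small header with a sync word and a frame length, so a file cut off mid-write is still readable up to the last complete frame. Here is a minimal Python sketch of finding that point, assuming a raw `.aac` ADTS stream that starts on a frame boundary. The function names are hypothetical and the header check is deliberately loose.

```python
def adts_frame_length(header: bytes) -> int:
    """Frame length in bytes (header included), or 0 if this is not an ADTS header."""
    # The 12-bit sync word is 0xFFF: all of byte 0 plus the top four bits of byte 1.
    if len(header) < 7 or header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        return 0
    # The 13-bit frame length spans the low 2 bits of byte 3, byte 4, and the top 3 bits of byte 5.
    return ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5)


def salvageable_prefix(data: bytes) -> int:
    """Number of leading bytes that form complete ADTS frames."""
    offset = 0
    while True:
        length = adts_frame_length(data[offset:offset + 7])
        # Stop at a missing sync word, an impossible length, or a frame cut short.
        if length < 7 or offset + length > len(data):
            return offset
        offset += length
```

After a crash, truncating the file to `salvageable_prefix(data)` bytes leaves a stream made only of whole frames.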
       
          Folcon wrote 1 day ago:
          > I've yet to find a solution that handles all these correctly, let
          alone having high quality transcriptions.
          
          Wait really? I honestly would have thought this was a solved problem
          by now, especially high quality transcriptions bit, just out of
          curiosity, is the problem that the quality isn't high enough?
       
            sofixa wrote 18 hours 28 min ago:
            > Wait really? I honestly would have thought this was a solved
            problem by now, especially high quality transcriptions bit, just
            out of curiosity, is the problem that the quality isn't high
            enough?
            
            If I had to guess, all of those apps are probably vibecoded, hence
            the variable quality.
       
            blopker wrote 1 day ago:
            There are still a few unsolved problems that require tuning for
            specific applications. Applications that own the video call have a
            much easier time, they have access to each individual audio stream.
            Applications like this, however, have to deal with overlapping
            voices from a single stream. If it's trying to attribute each
            utterance to an individual, separating the voices is tough, or can
            lead to confusing transcripts. There are many little problems like
            this which make it a tough problem in real world usage. Domain
            specific terms, or proper nouns is another source of inaccuracy.
       
          highmastdon wrote 1 day ago:
          I'm using MacParakeet these days. If your language is supported,
          definitely give it a try. It's much faster and lower footprint
       
          jv22222 wrote 1 day ago:
          Nice tip on FluidAudio that's the kind of thing I've been looking
          for. Thanks!
       
        satvikpendem wrote 1 day ago:
        I don't see how this is different to literally the dozens of other
        offline transcription apps, many open source even unlike this one.
       
          hmokiguess wrote 1 day ago:
          can you share them? I'm looking for a decent open source one
       
            nl wrote 1 day ago:
            Handy is the most common recommendation
            
  HTML      [1]: https://handy.computer/
       
              hmokiguess wrote 1 day ago:
              That doesn't seem to do transcription of meetings?
       
            satvikpendem wrote 1 day ago:
            Literally so many when searched: [1] Add "open source" if you wish
            as well.
            
  HTML      [1]: https://hn.algolia.com/?q=macOS+transcription
       
              hmokiguess wrote 1 day ago:
              Any that you have used and recommend comparable to the one from
              the post? Thank you!
       
            infl8ed wrote 1 day ago:
            I don't mind [1] however I do really want speaker recognition which
            it does have but I haven't been able to get it working.
            
  HTML      [1]: https://matthartman.github.io/ghost-pepper/
       
            jv22222 wrote 1 day ago:
            I'm seeing a lot right here:
            
  HTML      [1]: https://github.com/FluidInference/FluidAudio
       
              hmokiguess wrote 1 day ago:
              I went through the list but most feel subpar to me, and some
              aren't even open source (just claim they use FluidAudio I guess?)
       
              vermilingua wrote 1 day ago:
              I don't see any there that are as focused as this one, perhaps
              except Talat which is considerably more expensive.
       
                jv22222 wrote 1 day ago:
                Ah. My bad. I didn't review them I was just paying more
                attention to the op asking for a list of open source ones.
       
          jv22222 wrote 1 day ago:
          Classic HN. Thanks for keeping it real.
       
            satvikpendem wrote 1 day ago:
            There are so many I've seen on show HN, that's why.
            
  HTML      [1]: https://hn.algolia.com/?q=macOS+transcription
       
        nkmnz wrote 1 day ago:
        Which Speech-to-Text is used? Is it possible to configure it? This
        might be crucial for supporting languages other than English - the
        model that comes built-in with macOS fails completely for German.
       
        denbyc wrote 1 day ago:
        I'd love to have a purchase option not tied to the App Store if
        possible. I don't use an Apple account with my Mac, but I would love to
        try Trace.
       
          thenipper wrote 1 day ago:
          Also agreed, my work prohibits App Store apps so i have to skip
          things like this.
       
          AG342 wrote 1 day ago:
          This is definitely on the to-do list if there's enough demand for
          it. The payment/distribution/updates infra required is not
          insignificant, especially if nobody was that bothered, but by the
          sounds of it they are so I'll bump this up the priority list.
       
          addozhang wrote 1 day ago:
          Agreed, no need to tie it into Apple either.
       
            tillcarlos wrote 16 hours 18 min ago:
            try amore.computer - that might do the trick?
       
        nazca wrote 1 day ago:
        I've been looking for this exact thing!
       
        frabia wrote 1 day ago:
        Super interesting! How accurate is the local model to transcribe audio
        compared to other cloud services? E.g. Google Meet, Otter, Granola,
        etc.
       
          watchlight wrote 1 day ago:
          A lot of the available models are Whisper or Faster-Whisper derived
          and shared across multiple apps. The tier names are often funny...
          "Tiny" "base" "small" "medium" "large" "large-v2" "large-v3"
          "large-v3-turbo" -en only variants, etc.
          
          In my experience, medium is often the sweet spot for English accuracy
          vs speed, especially if following-up with a post-processing pass. The
          large options are all fine, but can severely slow it down. There are
          some speed checks on my website if you're curious (link not posted
          because I don't want to hijack another post's app).
       
        overflowy wrote 1 day ago:
        Does it support multiple languages?
       
        mushufasa wrote 1 day ago:
        This looks like a good approach, though I would expect this to be a
        native macOS feature within 12 months -- this seems totally like it
        fits into their product roadmap.
       
        watchlight wrote 1 day ago:
        Agreed with JohnBiz, the moment flagging is interesting and unusual,
        and a nice contrast to passive transcription. I only recently learned
        about MacWhisper (I'm Windows primarily) and was floored to learn how
        expensive the Pro option is. Nowadays it's not so hard to have
        some-level of DIY transcription, so crazy that it's priced with a
        premium.
        
        What's your diarization pipeline? Pyannote?
        
        I'd taken a different approach that used a LLM clean-up pass to
        summarize and progressively compress the transcript for ultra-long
        content, but I like the idea of targeted "pay attention here" flags.
       
       