codevoid.de/1/hn/comments_48506545.gph

        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
       
       
       COMMENT PAGE FOR:
  HTML   Launch HN: BitBoard (YC P25) - Analytics Workspace for Agents
       
       
        sails wrote 10 hours 39 min ago:
        > but customers kept pulling us toward their data analysis problems
        
        I hear this all the time, I still don't think it's a good
        justification to build a BI tool, but I hope this time it is different.
        
        Product looks cool! I'm hopeful that agents do actually unlock
        business analytics and we can move on from the BI concept.
        
        Edit: a rough explanation of why you get pulled towards data problems
        is that they are intractable symptoms of upstream process issues.
        Customer sees a capable startup and co-opts them into trying to solve
        their tarpit problems. Happens all the time!
       
          arcb wrote 2 hours 11 min ago:
          We hear you on getting pulled into tarpit problems, and on the
          pattern you're describing leading to them. The core product
          motivation we're excited about is letting humans and their agents act
          on data together, but we do think that requires thoughtful tooling to
          exist before that becomes desirable (more to come here). Our newer
          customers tend to be a little more technology forward, which helps us
          focus on the product we're offering them rather than internal
          politics or process issues.
       
        BoorishBears wrote 17 hours 46 min ago:
        How are you connecting to various data sources?
       
          arcb wrote 17 hours 24 min ago:
          We're offering secure connections to sources like SQL DBs,
          warehouses, file stores, and MCP/API sources like PostHog or
          Salesforce. Customers can choose to set up credentials in our key
          store. We also support directly dropping data into BitBoard (where we
          sync it to object storage).
       
        rancar2 wrote 19 hours 34 min ago:
        I do exactly this (and more since my role is much broader and so is my
        approach) as a fractional head of product, data, and operations for
        multiple companies all in healthcare (fast growing self-funded to
        series D/IPOing soon). I saw your initial launch and felt validated by
        you all working on it, and now I'm further validated by the pivot. I
        have more work than I can handle, so I'm happy to share tips. You can
        find me via a bit of googling my HN handle or just adding a dot com to
        the end.
       
          infakelife wrote 19 hours 21 min ago:
          Appreciate the validation. Would be great to connect and exchange
          notes - will reach out directly.
       
        dennis16384 wrote 19 hours 36 min ago:
        Nice, I recently did a similar but much simpler thing and open-sourced
        it under MIT; maybe some bits and pieces will be useful [1]. For
        example, the MIT-licensed sqlite vector search extension.
        
        Overall, I have an orchestrator - sql coder - js coder - dashboards, all
        without backend, running locally in the browser. It's mostly tested on
        small analysis and question answering with Gemini Flash Lite, and the
        overall target was speed from question to answer, including data
        sharing and waiting.
        
  HTML  [1]: https://github.com/eatmydata-org/eatmydata
       
          arcb wrote 19 hours 30 min ago:
          There are a lot of cool and useful things in there. What are you most
          excited about?
       
            dennis16384 wrote 19 hours 28 min ago:
            Fast response. I can upload Excel/csv and iterate under 10 seconds
            from question to result. Doing the same thing in Claude with 10x
            less data takes 5 minutes.
       
              arcb wrote 19 hours 22 min ago:
              I hear you on fast responses. One of the frustrations I've had
              using BI / data tools in the past was not being able to get local
              performance... which led to me exporting data to spreadsheets or
              local code. We're taking this to heart for BitBoard as well.
       
                dennis16384 wrote 18 hours 56 min ago:
                Totally. One thing that all major AI vendors are not doing
                currently is merging server AI with edge devices.
                
                For example, there is no way in either Claude or ChatGPT to
                run your own WASM or JS or whatever AI produces directly in
                the user's browser context as a tool/skill - there is no call
                site for that. The only option is remote server-side.
                
                My whole idea was that AI can perfectly write SQL and dashboard
                code knowing only the shape of your data and not its contents.
                With direct upload to a vendor, we're now forced to share the
                contents.
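
                A minimal sketch of the "shape, not contents" idea, assuming
                the data lives in a local SQLite database. The function name
                describe_shape and the example table are hypothetical and are
                not taken from the linked project; only table names, column
                names, and declared types are read, never row values.

```python
import sqlite3


def describe_shape(conn: sqlite3.Connection) -> str:
    """Return table names with column names and declared types only."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        col_desc = ", ".join(f"{name} {ctype}" for _, name, ctype, *_ in cols)
        lines.append(f"{table}({col_desc})")
    return "\n".join(lines)


# Hypothetical example data; the email is PII that should stay local.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_email TEXT, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'alice@example.com', 42.5)")

shape = describe_shape(conn)
print(shape)
```

                The resulting string is what would go into a prompt; the
                generated SQL can then run locally against the full data.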
       
                  arcb wrote 18 hours 42 min ago:
                  I suspect stronger edge performance will come as a
                  side-effect of local inference. Your point on edge tool calls
                  is interesting and I'll think about that. Features like
                  offline mode could be a great motivating reason. Re knowing
                  the shape vs not the internals - I'm mixed here. It feels
                  like there's always a sampling period where you have to look
                  at contents in order to understand what you want. But edge AI
                  (like antirez's work running DeepSeek on Mac) will let you
                  have both. I'm excited for that future!
       
                    dennis16384 wrote 12 hours 25 min ago:
                    Why would an LLM want to look into the contents, what for?
                    
                    We have low-cardinality data and yes this is safe to share
                    and required to build an actual query.
                    
                    Then we have high-cardinality and possibly PII - there's
                    absolutely no reason to share that data, there's nothing
                    for an LLM to analyse there. Also a semantic index (vector
                    search) will find relevant records much faster and more
                    accurately than any chain-of-thought, just with an
                    LLM-authored search fn call.
                    
                    Further, there are continuous numerical values and there's
                    not much an LLM needs to see in there either. We can say,
                    for example, if you look at data distributions when
                    building your analysis, it can drive your analysis logic,
                    but another point of view here is that it creates
                    unnecessary bias instead.
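
                    The low- vs high-cardinality split above could be sketched
                    roughly like this, assuming SQLite and an arbitrary
                    threshold. The helper shareable_values, the max_distinct
                    cutoff, and the users table are all hypothetical choices
                    for illustration.

```python
import sqlite3


def shareable_values(conn, table, column, max_distinct=20):
    """Return sorted distinct values for a low-cardinality column, else None."""
    # Fetch one row past the limit to tell "at the limit" from "over it".
    rows = conn.execute(
        f'SELECT DISTINCT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL LIMIT ?',
        (max_distinct + 1,),
    ).fetchall()
    if len(rows) > max_distinct:
        return None  # high-cardinality (possibly PII): keep values local
    return sorted(row[0] for row in rows)


# Hypothetical example data: 50 unique emails, 2 status values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(f"user{i}@example.com", "active" if i % 2 else "churned") for i in range(50)],
)

status_values = shareable_values(conn, "users", "status", max_distinct=5)
email_values = shareable_values(conn, "users", "email", max_distinct=5)
```

                    Here status_values holds the two status strings, which are
                    needed to write a correct WHERE clause, while email_values
                    is None, so no addresses leave the machine.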
       
                      arcb wrote 2 hours 8 min ago:
                      On re-read I think I might have overreached in my reply.
                      I think having local LLMs able to run tool loops to
                      _transform_ data, rather than just summary or analysis,
                      will become 1/ great for non-technical users, 2/ fast.
       
        baetylus wrote 21 hours 17 min ago:
        First, I love this concept and I think your demo is great!
        Collaboration with existing harnesses makes a ton of sense. Just had a
        conversation with some folks in the non-tech world raving about using
        Claude.
        
        A few questions:
        
        - How do you think about competing with ChatGPT Canvas or Anthropic's
        artifacts, when these are shareable, native experiences in their
        products where users already work?
        
        - Is a "dashboard" limited to analytics or are you trying to expand it
        to include written reports?
        
        Since teams are connecting MCPs like Granola, Slack, I imagine BitBoard
        would facilitate sharing demos, PRDs/briefs, or customer reports. This
        seems like a natural expansion and trivial functionally, so I'm
        wondering if that's part of the sell now or something you're looking at
        expanding into as you grow.
       
          infakelife wrote 20 hours 36 min ago:
          Thanks! Non-OP BitBoard cofounder here. Would love to hear your
          thoughts when you get a chance to check it out.
          
          > How do you think about competing with ChatGPT Canvas or Anthropic's
          artifacts, when these are shareable, native experiences in their
          products where users already work?
          
          The flexibility is amazing for static content and playing around with
          visuals, the experience is just more like a whiteboard than a
          dashboard. It's hard to do both well in the same place. For reporting
          I want live connections, consistent logic, the ability to trace
          provenance, and a more opinionated starting point for the UI.
          
          We started with an extremely flexible surface but there are just a
          ton of things you don't want to leave up to the agent to implement
          and we gradually layered those in. It's no fun having to prompt the
          agent to expose a "view source" affordance, "run" button, or working
          data labels. But it's a lot of fun building whatever visualization
          you want and generating a dashboard without a billion clicks in some
          SQL-abstraction UI.
          
          > Is a "dashboard" limited to analytics or are you trying to expand
          it to include written reports?
          
          We weakly support written reports today (technically possible with
          markdown blocks in dashboards for commentary) but will do more to
          support them in the future for exactly the reasons you called out.
          
          We actually built a more notebook-like artifact for this but cut it
          to focus on dashboards since they seemed to be a bigger pain point
          for users. One-off reports can be hit or miss with a chat or coding
          agent today but static reporting is at least supported with some
          effort. Live reporting with connection infrastructure, provenance,
          etc. is much harder to pull together.
       
        mritchie712 wrote 21 hours 25 min ago:
        Looks cool! It's a lot of work to get a full data stack set up and
        people are losing interest in stitching the pieces (ETL, warehouse, BI)
        together.
        
        > Agents made bad inferences because they had no context on the
        business
        
        We've been working on this since before the ChatGPT launch.
        
        We started with a semantic layer since there were already good open
        source options and LLMs at the time were good at writing the JSON
        (remember function calling?) to run a semantic query.
        
        But as LLMs have gotten smarter and people wanted to do more data work
        in agents, we found we needed something more flexible, so we built an
        "Ontology" that lets you store all the terms you use in your company
        and connect them to the data points (e.g. tables, columns, metrics)
        that matter.
        
  HTML  [1]: https://www.definite.app/blog/ontology-ai-analytics
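
        As a rough illustration of terms connected to data points, here is a
        sketch under assumptions. The Term class, the example entries, and the
        context_for helper are hypothetical and do not reflect the actual
        format described in the linked post.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A business term linked to the data points that define it."""
    name: str
    definition: str
    data_points: list[str] = field(default_factory=list)


# Hypothetical entries; table, column, and metric names are made up.
ONTOLOGY = [
    Term(
        name="active customer",
        definition="Customer with at least one order in the last 30 days",
        data_points=["customers.id", "orders.created_at"],
    ),
    Term(
        name="churn",
        definition="Share of customers who stopped ordering in a month",
        data_points=["metrics.monthly_churn_rate"],
    ),
]


def context_for(question: str, ontology: list[Term]) -> list[Term]:
    """Pick terms whose name appears in the question (naive substring match)."""
    q = question.lower()
    return [term for term in ontology if term.name.lower() in q]


matched = context_for("How many active customers did we add in May?", ONTOLOGY)
for term in matched:
    print(term.name, "->", ", ".join(term.data_points))
```

        The matched definitions and data points would be added to the agent's
        context before it writes a query, so business vocabulary resolves to
        concrete tables and columns.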
       
        htrp wrote 21 hours 57 min ago:
        Is there a way to sign up without going through Google OAuth?
       
          arcb wrote 21 hours 41 min ago:
          Not at the moment but it's in the queue. If there's a sign up method
          that works better for you feel free to DM me.
       
        straydusk wrote 23 hours 11 min ago:
        Great concept. Had this idea myself recently.
       
          arcb wrote 23 hours 8 min ago:
          Thank you! If you try it out let us know how it goes!
       
        spmartin823 wrote 23 hours 49 min ago:
        Highly rec going after a specific vertical - healthcare might be the
        right spot given your experience. Why did you use DuckDB instead of
        CockroachDB/Snowflake?
       
          arcb wrote 23 hours 42 min ago:
          Our outreach is vertical-specific, and healthcare is indeed on the
          list! But what we learned working a vertical is that the primitives
          underneath (shared queries, permissions, caching, refresh semantics)
          repeat across industries.
          
          We use DuckDB internally because we like its ergonomics - it's
          flexible, runs well in memory, manages a lot of file structures under
          the hood, but we do work with Snowflake (and Databricks and other
          warehouses) as well.
       
       