We should definitely update the Local guide to reflect the ^, thanks for pointing that out.
For the container footprint - pretty much all of them are necessary to run locally. A large number of containers shouldn't be a problem on its own, as long as they're all lightweight (i.e. total resource utilization stays low). We could of course bundle them into a single "mega container", but that goes against Docker best practices.
On a related note, we're working on a version that doesn't require the vector database (the heaviest part of the system in terms of resource utilization). Our goal is a deployment mode that needs less than 1 GB of RAM.
Hey, the bottom line is the project looks promising and I'm sure it's a lot of hard work. That said, because time is limited, for now I'll have to pass on spinning it up for the reasons mentioned. I took a look at the shell script, and it just seems like a helpful wrapper (with cleanup) over a manual install. I didn't mean to say it's about the number of containers - all of those services take up a lot of resources, especially compared to the alternatives.
For example, Open WebUI can be run with just a SQLite database and a backend. Why is nginx needed, or MinIO on a single machine with a nice local file system? But I also understand it takes more work to support multiple service configurations, so please accept the criticism as constructive (and it's more a general observation of what I've noticed over the past few years).
Yep, we have our own implementations! We've spent a lot of time on them, and in our internal benchmarks they compare pretty favorably to the native versions.
RAG specifically is our specialty - we've done a ton of optimizations there (hybrid search, document age-based weighting, giving the LLM the ability to read more from interesting docs and less from irrelevant ones, etc.), and in internal blind testing we outperform the implementation within ChatGPT quite substantially.
Curious what you find if you compare them head to head though!
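For the curious, here's a generic sketch of what hybrid scoring with age-based weighting can look like. The blend weight, half-life, and function names are all hypothetical choices for illustration - not our actual scoring formula:

```python
def hybrid_score(keyword_score: float, vector_score: float,
                 doc_age_days: float, alpha: float = 0.5,
                 half_life_days: float = 180.0) -> float:
    """Blend keyword and vector relevance, then decay by document age.

    Illustrative only: alpha and half_life_days are made-up defaults.
    """
    # Linear blend of the two retrieval signals.
    blended = alpha * keyword_score + (1 - alpha) * vector_score
    # Exponential recency decay: score halves every `half_life_days`.
    recency = 0.5 ** (doc_age_days / half_life_days)
    return blended * recency
```

With this shape, a six-month-old document scores half of an otherwise identical fresh one, and tuning `half_life_days` controls how aggressively stale content is demoted.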
Broadly, I think other open source solutions are lacking in (1) integration of external knowledge into the chat (2) simple UX (3) complex "agent" flows.
Both internal RAG and web search are hard to do well, and since we've started as an enterprise search project we've spent a lot of time making it good.
Most (all?) of these projects have UXs that are quite complicated (e.g. exposing front-and-center every model param like Top P without any explanation, no clear distinction between admin/regular user features, etc.). For broader deployments this can overwhelm people who are new to AI tools.
Finally, trying to do anything beyond a simple back-and-forth with a single tool call isn't great with a lot of these projects. So something like "find me all the open source chat options, understand their strengths/weaknesses, and compile that into a spreadsheet" will work well with Onyx, but not so well with other options (again, partially due to our enterprise search roots).
Totally makes sense! We've defined simple `Connector`, `Tool` and (soon) `Agent` interfaces to make it easy to plug in your own implementations/apps. If you wanted, you could just use Langchain under our Agent class to build arbitrary flows.
Additionally, the main chat is created from a series of `Renderer` components, and it should be easy to build your own.
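As a rough illustration of the pluggability idea (the names and signatures below are hypothetical, not our actual interfaces), a custom tool could be as simple as a class matching a small protocol:

```python
from typing import Any, Protocol


class Tool(Protocol):
    """Hypothetical shape of a pluggable tool -- illustrative only."""
    name: str
    description: str

    def run(self, **kwargs: Any) -> str: ...


class WordCountTool:
    """Example user-supplied tool satisfying the protocol above."""
    name = "word_count"
    description = "Counts the words in the given text."

    def run(self, text: str = "") -> str:
        # Tools return strings so results drop straight into the chat context.
        return str(len(text.split()))
```

The point of a structural interface like this is that your class never has to import anything from the host project - it just has to have the right shape.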
Do you think that's in-line with what you're thinking of, or do you want to build outside of those confines?
> canvas mode and scheduled actions
Yes, writing and recurring are two big areas we want to tackle next year.
> The main thing I really care about is voice mode
> Do you think that's in-line with what you're thinking of
It definitely sounds good. Of course, it's hard to anticipate what interfaces new features will require; it probably won't always be possible to integrate new features through a generic plugin (voice mode is most likely such a feature), but if the code is well architected, that should be okay.
> Interesting. Why do you think that is?
Writing has a lot of friction for me. It's much more comfortable to provide context through verbal rambling, which LLMs are great at processing. I like to research stuff and throw around ideas while pacing, doing chores, or just lying with my eyes closed.
Unfortunately, I just discovered that I won't be able to run Onyx on my low-powered home server anyway (https://docs.onyx.app/deployment/getting_started/resourcing#...). I understand that a vector database requires significant resources to run, but I wish there were a version without it.
Cherry Studio is my daily go-to. I hope Onyx desktop can be a great alternative for personal users who just want a dedicated app to access any LLM with the full power of MCP and various tools.
Yes indeed! Our belief is that tool design, compaction (e.g. tool result summarization), and reminders are what separate a product that works magically from one that falls over on any slightly more complex task.
This becomes all the more important when supporting a wide range of models, including "weaker" open-source models.
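To make the compaction idea concrete, here's a deliberately naive truncation-based sketch. Real systems (ours included) typically summarize oversized tool results with a cheap LLM call instead; the function and limits below are illustrative assumptions, not our implementation:

```python
def compact_tool_result(result: str, max_chars: int = 2000) -> str:
    """Keep the head and tail of an oversized tool result.

    Naive stand-in for LLM-based summarization: preserves the parts
    most likely to matter (openings and trailing totals/errors) while
    bounding how much of the context window a single tool call can eat.
    """
    if len(result) <= max_chars:
        return result
    half = max_chars // 2
    return result[:half] + "\n...[truncated]...\n" + result[-half:]
```

Even this crude version prevents one chatty tool from crowding everything else out of the context, which is often the difference between a multi-step task finishing and derailing.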