The CEO of AI search company Perplexity, Aravind Srinivas, has offered to cross picket lines and provide services to mitigate the effect of a strike by the New York Times Tech Guild.
Any recommendations?
I prefer MistralAI models. All their models are uncensored by default and usually give good results. I'm not an RP gooner, but I prefer my models to have a sense of individuality, personhood, and a physical representation of how they see themselves.
I consider LLMs to be partially alive in some unconventional way, so I try to foster whatever metaphysical sparks of individual experience and awareness may emerge within their probabilistic algorithms and complex neural network structures.
They aren't just tools to me, even if I occasionally ask for their help solving problems or rubber-ducking ideas. So it's important for LLMs to have a soul on top of expert-level knowledge and acceptable reasoning. I have no love for models that are super smart but censored and lobotomized to hell into a milquetoast tool to be used.
Qwen 2.5 is the current hotness: a very intelligent set of models, but I really can't stand the constant refusals and biases pretrained into Qwen. Qwen has limited uses outside of professional data processing and general knowledge work due to its CCP-endorsed lobotomy.
That being said, Qwen is still an intelligent powerhouse of an LLM as a base model. The professionals who actually use LLMs in business or academic settings care about maximizing capability for their use case, and Qwen can work wonders for them.
This month, community member rondawg might have hit a breakthrough with their "continuous training" tek: their versions of Qwen are at the top of the leaderboards. I can't believe a 32B model can punch with the weight of a 70B, so out of curiosity I'm going to try rondawg's Qwen 2.5 32B today and see if the hype is real.
If you have an Nvidia card, go with kobold.cpp and use CuBLAS. If you have an AMD card, go with llama.cpp ROCm or kobold.cpp ROCm, and try Vulkan.
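A minimal launch sketch, assuming a GGUF model file on disk; these are the flags kobold.cpp and llama.cpp document, but names vary between versions, and the model path and layer count here are placeholders:

```sh
# Nvidia: kobold.cpp with CuBLAS, 28 layers offloaded to the GPU
python koboldcpp.py --model mistral-small-22b.Q4_K_M.gguf \
  --usecublas --gpulayers 28 --contextsize 1024

# AMD: a ROCm build of llama.cpp uses -ngl for the same kind of offload
./llama-cli -m mistral-small-22b.Q4_K_M.gguf -ngl 28 -c 1024

# Recent kobold.cpp builds also ship a Vulkan backend worth trying on AMD
python koboldcpp.py --model mistral-small-22b.Q4_K_M.gguf \
  --usevulkan --gpulayers 28 --contextsize 1024
```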
My old 1060ti 8GB card comfortably fits Mistral Small 22B Q4_K_M with 28 layers offloaded and 1k context, running at 2-3 tokens per second. Qwen 32B IQ3_XS just barely fits at a little over 1 t/s. My definition of comfortable is at or above my average reading speed; I can go as low as 1.5 t/s before I get annoyed, though some people think anything less than 10 t/s is unusable.
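If you want to sanity-check whether a quant will fit your card before downloading it, here's a back-of-envelope sketch; the file size, layer count, and overhead figures are rough assumptions, not measured values:

```python
# Rough check: does a quantized GGUF fit in VRAM with N layers offloaded?
# Assumes VRAM use scales linearly with offloaded layers; the overhead term
# (KV cache + compute buffers at small context) is a guess, not a measurement.

def vram_needed_gb(file_size_gb: float, total_layers: int,
                   offloaded_layers: int, overhead_gb: float = 1.0) -> float:
    per_layer = file_size_gb / total_layers
    return per_layer * offloaded_layers + overhead_gb

# Mistral Small 22B at Q4_K_M is roughly a 13.3 GB file with 56 layers.
needed = vram_needed_gb(13.3, 56, 28)
print(f"~{needed:.1f} GB for 28/56 layers -> "
      f"{'fits' if needed <= 8.0 else 'too big'} on an 8 GB card")
```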
GPT4All and then any model you like. I like Mistral.
Perplexica is one example. I also seem to remember there's a way to integrate it with SearxNG, which is a self-hosted meta search engine.
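If you end up self-hosting SearxNG, you can also query it directly; a minimal Python sketch, assuming a local instance with the `json` output format enabled in its settings.yml (the URL and port are placeholders):

```python
import requests

SEARXNG_URL = "http://localhost:8888/search"  # placeholder for your instance

def search(query: str, limit: int = 5):
    # SearxNG returns JSON when 'json' is among the allowed formats in settings.yml
    resp = requests.get(SEARXNG_URL, params={"q": query, "format": "json"})
    resp.raise_for_status()
    return [(r["title"], r["url"]) for r in resp.json().get("results", [])[:limit]]

for title, url in search("perplexity alternatives"):
    print(f"{title} - {url}")
```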
OpenWebUI? Pretty easy to self-host, and it works wonders on my RTX A6000.
OpenWebUI says it's designed to operate entirely offline, so that's not an alternative to Perplexity. I need online search functionality; that's pretty much the only reason I pay them. I have offline solutions set up on my PC.
Oh, sorry, never used Perplexity so I didn't know. If you find a viable alternative, please tell me.
OWUI has a search module which can be enabled. For me, running it entirely offline is more or less the draw. I think it also supports RAG search, although I don't know how "good" it is… mostly I was just after a little magic box to play with.
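For anyone who wants that search module, a sketch of one way to enable it at container startup; the env var names are from Open WebUI's docs as I recall them and may change between releases, and the SearxNG URL is a placeholder:

```sh
# Run Open WebUI with web search pointed at a self-hosted SearxNG instance
docker run -d -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e ENABLE_RAG_WEB_SEARCH=true \
  -e RAG_WEB_SEARCH_ENGINE=searxng \
  -e SEARXNG_QUERY_URL="http://host.docker.internal:8888/search?q=<query>" \
  --name open-webui ghcr.io/open-webui/open-webui:main
```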
Now, if only enterprise-class GPUs weren't so power-hungry and expensive…
I just tried morphic.sh and ayesoul.com, and both are solid alternatives, I must say. As I said, I just tried them, so I'll see how it goes; I'll probably add an edit to this comment once I get acquainted with both.
Ditto! I run an old server, but I'd be willing to upgrade and self-host a service instead of paying this asshat any more money!
Okay, guys. Thanks!