dtgreene: One issue is this:
* People aren't used to this new technology.
* The LLM's output looks, at least at first, like it's written by a human.
* Therefore, people expect the LLM to behave like a human.
* But the LLM does not actually behave like a human, particularly when it gives incorrect answers.
* So, the usual ways that people tell if someone isn't being entirely correct or truthful don't work with LLMs.
(Sorry for the long post ahead) Agreed.
I'd like to add to this topic that it would be good if we separated things a bit.
Allow me to explain:
As with practically every niche of technology, LLMs will have their general-public users: people convinced of all sorts of common beliefs about AI, which may be or indeed are false, and who will use such AIs/LLMs as a human companion that has answers to all their questions (like some people use Google today, even though we know LLMs are not search engines).
But there is also a niche of people who not only develop this technology but understand much more about it, and who push its advancement forward, ethically or not.
I think that, as with the FLOSS community, we have a good opportunity here.
Here's an example:
The general public will of course buy their iPhones and Google Androids and use them without ever learning about data collection or that they can change their privacy settings; they will probably never know that things like DivestOS or F-Droid exist.
But there is a niche of people, albeit a small one, of developers and ordinary users who not only know about the alternatives but also push the FLOSS agenda online, at their companies, on their sites, and among their friends and family.
This will absolutely happen with AIs, Machine Learning, LLMs, and so on.
A good example of this is this site:
https://opening-up-chatgpt.github.io/
Here's a TL;DR from the site:
Our paper makes the following contributions:
* We review the risks of relying on proprietary software
* We review best practices for open, transparent and accountable 'AI'
* We find over 40 ChatGPT alternatives at varying degrees of openness, development and documentation
* We argue that tech is never a fait accompli unless we make it so, and that openness enables critical computational literacy

We find the following recurrent patterns:
* Many projects inherit data of dubious legality
* Few projects share the all-important instruction-tuning
* Preprints are rare, peer-reviewed papers even rarer
* Synthetic instruction-tuning data is on the rise, with unknown consequences that are in need of research
ChatGPT is not the only viable LLM in existence, although, because of such absurd marketing, it is possibly the best-known one.
We may be about to see a rise of "Free and Open Source Large Language Models", FLOSSLLMs, if you will. And I do think that a reality where you can install a functional AI locally on your PC, running fully offline and fed with ethically selected data sets of public knowledge, is not distant.
(I don't know of any personally, but they might exist already - I'm still researching and studying the subject.)
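In fact, running an open-weights model locally already works today. Here's a minimal sketch, assuming Python with the Hugging Face transformers library and an openly licensed model (EleutherAI's Pythia is used purely as an example; any of the open alternatives catalogued on the site above could take its place):

```python
# Minimal local text generation with an openly licensed model.
# Assumes: pip install transformers torch
from transformers import pipeline

# The first run downloads the weights; after that, generation works fully offline.
generator = pipeline("text-generation", model="EleutherAI/pythia-160m")

result = generator("DRM-free software matters because", max_new_tokens=40)
print(result[0]["generated_text"])
```

Whether the training data behind any particular model counts as "ethically selected" is, of course, exactly the kind of openness question the paper above examines.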
I personally think that we, as a DRM-free community, and therefore a niche one, could learn from our experiences in the gaming community and apply those ideas when forming our own opinions about this area of technological advancement.
Well... at least let's hope we will one day be able to use "DRMFree-FLOSSLLMs", especially ones with personalized privacy.