#largelanguagemodels


“Scaling has run out…; models still don’t reason reliably…; the financial bubble may be bursting; there still ain’t no GPT-5; Sam Altman can’t be trusted; an overreliance on unreliable LLMs…has indeed gotten the world into deep doodoo. … LLMs are not the way. We definitely need something better.”
—Gary Marcus
garymarcus.substack.com/p/scal
#llm #llms #largelanguagemodels

Marcus on AI · Scaling is over, the bubble may be deflating, LLMs still can’t reason, and you can’t trust Sam · By Gary Marcus

🔴 💻 **Are chatbots reliable text annotators? Sometimes**

“_Given the unreliable performance of ChatGPT and the significant challenges it poses to Open Science, we advise caution when using ChatGPT for substantive text annotation tasks._”

Ross Deans Kristensen-McLachlan, Miceal Canavan, Marton Kárdos, Mia Jacobsen, Lene Aarøe, Are chatbots reliable text annotators? Sometimes, PNAS Nexus, Volume 4, Issue 4, April 2025, pgaf069, doi.org/10.1093/pnasnexus/pgaf.

#OpenAccess #OA #Article #AI #ArtificialIntelligence #LargeLanguageModels #LLMS #Chatbots #Technology #Tech #Data #Annotation #Academia #Academics @ai

A) Accuracy (i.e. percentage of correct predictions) and B) F1 scores by model type, prompt type, and zero or few shot.
OUP Academic · Are chatbots reliable text annotators? Sometimes · Abstract. Recent research highlights the significant potential of ChatGPT for text annotation in social science research. However, ChatGPT is a closed-sour
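
The figure caption above mentions two metrics, accuracy and F1. As a rough illustration of how they differ, here is a minimal sketch for the binary case. The labels below are invented toy data, not results from the paper, and the paper may well use a different F1 variant (for example, macro-averaged over classes).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold annotations and model predictions (toy data).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(gold, pred)
f1 = f1_score(gold, pred)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

On imbalanced label sets the two numbers can diverge sharply, which is presumably why the figure reports both.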

The thing to keep in mind about Large Language Models (LLMs, which is what people currently refer to as AI) is that even though human knowledge in the form of language is fed into them for training, they store only statistical models of language, not the actual human knowledge. Their responses are constructed from statistical analysis of the context of prior language use.

Any appearance of knowledge is pure coincidence. Even on the most “advanced” models.

Language is how we convey knowledge, not the knowledge itself. This is why a language model can never actually know anything.

And this is why they’re so easy to manipulate into conveying objectively false information, in some cases, maliciously so. ChatGPT and all the other big vendors do manipulate their models, and yes, in part, with malice.

Much has been written about the #hype surrounding #LargeLanguageModels (so-called "AI") and their effect on the IT industry - both on its products and on its consumers.

But this post by @calpaterson explores a related aspect that I haven't seen discussed so far: will *making* LLMs be a profitable business in the long run?

"A lot of people think that (LLMs) are going to be The Future. Maybe they are — but that doesn't mean that building them is going to be a profitable business."

calpaterson.com/porter.html

Cal elegantly explains how the structure of the industry you're in (suppliers, buyers, competitors, etc.) influences your chances of success. Then he goes on to analyze the situation for #OpenAI and others.

Definitely recommended reading! No MBA required.

calpaterson.com · Building LLMs is probably not going to be a brilliant business · By Cal Paterson

🧵 …as mentioned above, I am critical of A.I., and I am not alone; its experts are too:

«Large language models not fit for real-world use, scientists warn — even slight changes cause their world models to collapse.
Large language model AIs might seem smart on a surface level but they struggle to actually understand the real world and model it accurately, a new study finds.»

👉 livescience.com/technology/art

Live Science · Large language models not fit for real-world use, scientists warn — even slight changes cause their world models to collapse · By Roland Moore-Colyer

How well do #LargeLanguageModels do arithmetic?

If you pad a word problem with irrelevant filler clauses, you throw the model completely off track: "Oliver picks 44 kiwis on Friday. On Saturday he picks 58 kiwis. On Sunday he picks twice as many kiwis as on Friday, but five of them were somewhat smaller than average. How many kiwis does Oliver have?"

The LLM tends to subtract the 5 smaller kiwis.
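
For reference, the arithmetic in the word problem quoted above can be worked through in a few lines. This is only a sketch of the calculation; the variable names are illustrative, and the "distracted" total simply reproduces the subtraction error the post describes.

```python
# Quantities stated in the word problem.
friday = 44
saturday = 58
sunday = 2 * friday  # "twice as many as on Friday"

# Irrelevant detail: smaller kiwis are still kiwis and count toward the total.
smaller_than_average = 5

correct_total = friday + saturday + sunday

# The mistake described in the post: subtracting the five smaller kiwis.
distracted_total = correct_total - smaller_than_average

print(correct_total)     # 190
print(distracted_total)  # 185
```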

dnip.ch/2024/10/29/wie-gut-ver

Das Netz ist politisch · How well do LLMs understand the world? · At some point in primary school, typically from fourth grade on, you encounter word problems for the first time. For some, this is a

How you know the wave is breaking:

Tonight on mainstream commercial TV, exactly the same TV ads for non-tech home products (heating, etc) have chirpy-comfort voiceovers saying ‘Now powered by AI!’ that 18 months ago were saying, ‘Now powered by Blockchain!’

(UK TV, Channel 4/ITV, mass audience, drama/documentary)


2/ The edited volume also includes the paper by Steven T. #Piantadosi: “Modern language models refute #Chomsky’s approach to language”.

zenodo.org/records/12665933/fi

This paper had already been available on #lingbuzz for a while. It tops the list of downloads there for the past six months. Almost 30,000 downloads. Now it has been officially published. With us! With @langscipress! It's a pity that those downloads don't count toward our numbers, but that is the point of the exercise: maximum dissemination of knowledge, not maximum profit for a publisher.

On the day of publication, the paper already has 140 citations:

scholar.google.de/citations?vi

That's how it should be!

#ChatGPT #LargeLanguageModels #LLMs

#SprachmodelleLLMs

🆕 Recommendations on the Use of AI in Scholarly Communication by our Peer Review Committee

Responsible/transparent use of #LargeLanguageModels/ #GenerativeAI with links to policies & practices for:
:orcid: Authorship
:doi: Citations & lit review
📊 Data collection, cleaning, analysis & interpretation
🧑‍💻 Data/code generation
🖼️ Tables, images & videos
📝 Language & style editing
⚒️ Editorial work
🎓 #PeerReview creation & editing
📄 Paper creation, editing & revision
ease.org.uk/communities/peer-r
#AItools