#nlp


We’ve redesigned mobile email replies - with & without AI. Tap sentences while reading to enter local responses (or get suggestions). Then connect them on a usual draft screen (or let AI do just that). Result: Flexible workflows with varying speed and control. #CHI2025 preprint in comments.

#HCI #AI #NLP

ReadMe2KG: GitHub ReadMe to Knowledge Graph #Challenge has been published as part of the Natural Scientific Language Processing and Research Knowledge Graphs #NSLP2025 workshop, co-located with #eswc2025. This #NER task aims to complement the NFDI4DataScience KG via information extraction from GitHub README files.

task description: nfdi4ds.github.io/nslp2025/doc
website: codabench.org/competitions/539

@eswc_conf @GenAsefa @shufan @NFDI4DS #NFDIrocks #knowledgegraphs #semanticweb #nlp #informationextraction


8/n

[2] Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, and Kelvin Guu. 2024. Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? arxiv.org/abs/2406.13121

arXiv.org: "Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?"
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-end modeling that minimizes cascading errors in complex pipelines, and allows for the application of sophisticated prompting techniques across the entire system. To assess this paradigm shift, we introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning. Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks. However, LCLMs still face challenges in areas like compositional reasoning that are required in SQL-like tasks. Notably, prompting strategies significantly influence performance, emphasizing the need for continued research as context lengths grow. Overall, LOFT provides a rigorous testing ground for LCLMs, showcasing their potential to supplant existing paradigms and tackle novel tasks as model capabilities scale.
#NLP #NLProc #RAG

7/

REFERENCES

[1] Yifu Qiu, Varun Embar, Yizhe Zhang, Navdeep Jaitly, Shay B. Cohen, and Benjamin Han. 2025. Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models. arxiv.org/abs/2501.08248

arXiv.org: "Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models"
Recent advancements in long-context language models (LCLMs) promise to transform Retrieval-Augmented Generation (RAG) by simplifying pipelines. With their expanded context windows, LCLMs can process entire knowledge bases and perform retrieval and reasoning directly -- a capability we define as In-Context Retrieval and Reasoning (ICR^2). However, existing benchmarks like LOFT often overestimate LCLM performance by providing overly simplified contexts. To address this, we introduce ICR^2, a benchmark that evaluates LCLMs in more realistic scenarios by including confounding passages retrieved with strong retrievers. We then propose three methods to enhance LCLM performance: (1) retrieve-then-generate fine-tuning, (2) retrieval-attention-probing, which uses attention heads to filter and de-noise long contexts during decoding, and (3) joint retrieval head training alongside the generation head. Our evaluation of five well-known LCLMs on LOFT and ICR^2 demonstrates significant gains with our best approach applied to Mistral-7B: +17 and +15 points by Exact Match on LOFT, and +13 and +2 points on ICR^2, compared to vanilla RAG and supervised fine-tuning, respectively. It even outperforms GPT-4-Turbo on most tasks despite being a much smaller model.
#NLP #NLProc #RAG

6/

In extensive experiments on five LCLMs using both the LOFT and ICR² benchmarks, our best approach on Mistral-7B with a 32K token limit outperformed the vanilla RAG and SFT baselines by an average of +17 and +15 points (Exact Match) on LOFT, and by +13 and +2 points on ICR², respectively (picture). It even achieved performance comparable to the state-of-the-art GPT-4-Turbo, despite having only 7B parameters.
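For context on the metric: Exact Match (EM) counts a prediction as correct only if it equals a gold answer after light normalization. A minimal sketch using the common SQuAD-style normalization (the paper may normalize differently):

```python
# Rough sketch of SQuAD-style Exact Match; the paper's exact normalization may differ.
import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold_answers: list[str]) -> int:
    """1 if the normalized prediction equals any normalized gold answer, else 0."""
    return int(any(normalize(prediction) == normalize(g) for g in gold_answers))

print(exact_match("The Eiffel Tower", ["Eiffel Tower"]))  # -> 1
```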

#NLP #NLProc #RAG

4/

With a more realistic benchmark in hand, we systematically explored three approaches to enhance model performance:

1. Retrieve-then-generate supervised fine-tuning (picture): we train LCLMs to first retrieve relevant information from the context and then generate the final responses.

2. Retrieval-attention-probing: During inference, we probe attention heads activated for in-context retrieval, and use their top predictions to filter out confounders.
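A rough sketch of approach 2 (mine, not the authors' code; it assumes we already have one retrieval-oriented attention head's weights over the context tokens and the token span of each candidate passage): score each passage by its summed attention mass and keep only the top-scoring ones for decoding.

```python
# Hedged sketch of retrieval-attention-probing (not the authors' implementation).
import numpy as np

def probe_passages(attn_over_context, passage_spans, top_k=3):
    """attn_over_context: 1-D array of attention weights over context tokens.
    passage_spans: list of (start, end) token indices, one per passage.
    Returns the indices of the top_k passages by summed attention mass."""
    scores = np.array([attn_over_context[s:e].sum() for s, e in passage_spans])
    return [int(i) for i in np.argsort(scores)[::-1][:top_k]]

# Toy example: 3 passages over a 12-token context; passage 1 receives the most attention.
attn = np.array([0.01, 0.02, 0.01, 0.02,   # passage 0
                 0.20, 0.25, 0.15, 0.10,   # passage 1
                 0.05, 0.09, 0.05, 0.05])  # passage 2
spans = [(0, 4), (4, 8), (8, 12)]
print(probe_passages(attn, spans, top_k=1))  # -> [1]
```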

#NLP #NLProc #RAG

3/

This limitation often leads to inflated results. To address this, we created a more realistic dataset, ICR². It uses five retrievers to generate challenging negative documents (picture 1). Our results show a significant performance drop with standard RAG setups. For example, with GPT-4-Turbo, accuracy on NQ dropped from 0.85 to 0.67, and on HPQA it fell from 0.78 to 0.64 (picture 2).
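As a toy illustration of retriever-mined confounders (a hedged sketch, not the ICR² construction, which combines five retrievers over real corpora): rank passages with BM25, drop the gold ones, and keep the top-scoring rest as confounders.

```python
# Hedged sketch of retriever-based confounder mining (BM25 only, one toy query).
# pip install rank-bm25
import re
from rank_bm25 import BM25Okapi

def tok(text: str) -> list[str]:
    """Tiny tokenizer: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

corpus = [
    "Paris is the capital of France.",                              # gold passage
    "Lyon is the second-largest urban area in France.",             # plausible confounder
    "The Eiffel Tower is located in Paris.",                        # plausible confounder
    "Photosynthesis converts light energy into chemical energy.",   # random negative
]
gold = {0}

bm25 = BM25Okapi([tok(doc) for doc in corpus])
scores = bm25.get_scores(tok("What is the capital of France?"))

# Rank all passages by BM25 score and keep the best-scoring non-gold ones.
ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
confounders = [i for i in ranked if i not in gold][:2]
print([corpus[i] for i in confounders])  # France-related distractors, not the photosynthesis passage
```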

#NLP #NLProc #RAG

2/

But are current LCLMs up to the task? If not, how can we improve their performance?

In our preprint [1], we evaluated five popular LCLMs using the LOFT benchmark [2], which involves answering questions paired with documents. However, LOFT relies on random sampling to create irrelevant (negative) documents for each query, failing to include confounding documents — those that are relevant but misleading — which are common in real-world scenarios.

#NLP #NLProc #RAG

1/

What if #LLMs had context windows so large that an entire knowledge base could fit into a single prompt? This would revolutionize Retrieval-Augmented Generation (RAG) applications by enabling retrieval, re-ranking, reasoning, and generation all in one step. With a Long-Context Language Model (LCLM), we could simplify RAG architecture by leveraging the model’s capability for In-Context Retrieval and Reasoning (ICR²).
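A minimal sketch of that single-prompt setup (mine, not from the preprint; the prompt wording and the generate() call are placeholders): the whole knowledge base goes into one prompt, and the model is asked to retrieve, cite, and answer in one pass.

```python
# Hedged sketch of in-context retrieval and reasoning with one giant prompt.
# `generate` stands in for whatever long-context model API you use.

def build_icr_prompt(corpus: list[str], question: str) -> str:
    """Put the entire (small) knowledge base into the context, then ask the model
    to pick supporting passages and answer in a single step."""
    passages = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(corpus))
    return (
        "You are given a knowledge base of numbered passages.\n"
        f"{passages}\n\n"
        f"Question: {question}\n"
        "First list the IDs of the passages that support your answer, "
        "then give a concise answer."
    )

corpus = [
    "Marie Curie won Nobel Prizes in Physics (1903) and Chemistry (1911).",
    "The Nobel Prize in Physics 1903 was shared with Pierre Curie and Henri Becquerel.",
]
prompt = build_icr_prompt(corpus, "In which fields did Marie Curie win Nobel Prizes?")
print(prompt)
# response = generate(prompt)   # placeholder: call your long-context model here
```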

#NLP #NLProc #RAG

If you care about theoretical computer science, you should watch this lovely talk by Jon Kleinberg on language generation being easier than language identification. He mentions an interesting tension between hallucination and mode collapse.

arxiv.org/abs/2404.06757
youtube.com/live/BrdaVSZBuyU?f

arXiv.org: "Language Generation in the Limit"
Although current large language models are complex, the most basic specifications of the underlying language generation problem itself are simple to state: given a finite set of training samples from an unknown language, produce valid new strings from the language that don't already appear in the training data. Here we ask what we can conclude about language generation using only this specification, without further assumptions. In particular, suppose that an adversary enumerates the strings of an unknown target language L that is known only to come from one of a possibly infinite list of candidates. A computational agent is trying to learn to generate from this language; we say that the agent generates from L in the limit if after some finite point in the enumeration of L, the agent is able to produce new elements that come exclusively from L and that have not yet been presented by the adversary. Our main result is that there is an agent that is able to generate in the limit for every countable list of candidate languages. This contrasts dramatically with negative results due to Gold and Angluin in a well-studied model of language learning where the goal is to identify an unknown language from samples; the difference between these results suggests that identifying a language is a fundamentally different problem than generating from it.
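For reference, a rough formalization of the central definition, paraphrased from the abstract (notation mine, not the paper's exact statement):

```latex
% An adversary enumerates the strings of an unknown target language $L$ drawn from a
% countable candidate list $L_1, L_2, \dots$; the agent $G$ sees the enumeration so far.
\[
G \text{ generates from } L \text{ in the limit if, for every enumeration }
w_1, w_2, \dots \text{ of } L,\
\exists\, t_0 \text{ such that } \forall t \ge t_0:\;
G(w_1, \dots, w_t) \in L \setminus \{w_1, \dots, w_t\}.
\]
```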