We’ve redesigned mobile email replies, with and without AI. Tap sentences while reading to respond to them in place (or get suggestions), then combine those snippets on a familiar draft screen (or let AI do exactly that). The result: flexible workflows with varying degrees of speed and control. #CHI2025 preprint in comments.
New blog post on the Vulnerability-Lookup blog:
LLMs + Vulnerability-Lookup: What We’re Testing and Where We’re Headed
https://www.vulnerability-lookup.org/2025/02/26/exploring-llm-in-vulnerability-lookup/
Transformers for Natural Language Processing and Computer Vision - Third Edition: Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 https://leanpub.com/transformersfornaturallanguageprocessingandcomputervision-thirdedition by Packt Publishing Ltd is the featured book on the Leanpub homepage! https://leanpub.com #Databases #NLP #ai #LLM #huggingface #DALLE #books #ebooks
Bridging Traditional Development using XAF and AI: Training Sessions in Cairo
Slavic corpora? Yes! Other interesting methodological or empirical studies are also welcome at this excellently run NLP and CL conference. Deadline: Friday 31 January:
https://korpus.sk/en/about-us/conferences/slovko-2025-en/
#NLP
#CorpusLinguistics
@linguistics
@corpuslinguistics
ReadMe2KG: GitHub README to Knowledge Graph #Challenge has been published as part of the Natural Scientific Language Processing and Research Knowledge Graphs #NSLP2025 workshop, co-located with #eswc2025. This #NER task aims to complement the NFDI4DataScience KG via information extraction from GitHub README files.
task description: https://nfdi4ds.github.io/nslp2025/docs/readme2kg_shared_task.html
website: https://www.codabench.org/competitions/5396/
@eswc_conf @GenAsefa @shufan @NFDI4DS #NFDIrocks #knowledgegraphs #semanticweb #nlp #informationextraction
I'll give a talk at @fau.de in Erlangen this Thursday (2025-01-30, 16:15; www.linguistik.phil.fau.de/2025/01/22/v...). I'll talk about studies in which we collected data for building #NLP #NLPproc models via psychology(-inspired) data acquisition methods (for emotions, coping, deception).
Talk: Roman Klinger (30.1.2...
#Introduction I work in #ComputationalLinguistics #NLP #SentimentAnalysis #Linguistics #CorpusLinguistics at the University of #Málaga (.es) @ the Tecnolengua lab.
My PhD thesis dealt with #SA and #CADS applied to the language of the #GreatRecession in the press, and with changes in #semantics #discourseanalysis
I'm into #technopolitics, #Autonomy, #PunkRock, #photography, #History, and the #RightToTheCity
Here's my Scholar profile: https://scholar.google.com/citations?user=EhEnu3wAAAAJ&hl=es
8/n
[2] Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, and Kelvin Guu. 2024. Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? https://arxiv.org/abs/2406.13121
7/
REFERENCES
[1] Yifu Qiu, Varun Embar, Yizhe Zhang, Navdeep Jaitly, Shay B. Cohen, and Benjamin Han. 2025. Eliciting In-Context Retrieval and Reasoning for Long-Context Large Language Models. https://arxiv.org/abs/2501.08248
6/
Through extensive experiments on five LCLMs using both the LOFT and ICR² benchmarks, our best approach on Mistral-7B with a 32K token limit outperformed Vanilla RAG and SFT baselines by an average of +17 and +15 points (Exact Match) on LOFT, and by +13 and +2 points on ICR², respectively (picture). It even achieved performance comparable to the state-of-the-art GPT-4, despite having only 7B parameters.
5/
3. Joint retrieval head training alongside the generation head (picture): We equip LCLMs with a dedicated retrieval head and optimize both the retrieval and generation heads jointly during training.
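For intuition, here is a toy sketch of what jointly training a retrieval head next to the generation head could look like; it is not the paper's implementation. Random tensors stand in for the LCLM backbone's hidden states, and the head design and the 0.5 loss weight are assumptions.

```python
# Toy sketch of joint retrieval-head + generation-head training.
# Random tensors stand in for the LCLM backbone's hidden states; the head
# design and loss weighting below are assumptions, not the paper's recipe.
import torch
import torch.nn as nn

hidden, vocab, n_docs, seq_len = 64, 100, 8, 32

gen_head = nn.Linear(hidden, vocab)     # existing LM (generation) head
ret_head = nn.Linear(hidden, 1)         # added retrieval head: scores each document

# Stand-ins for backbone outputs: per-token states and pooled per-document states.
token_states = torch.randn(1, seq_len, hidden)
doc_states = torch.randn(1, n_docs, hidden)

# Targets: next tokens for generation, gold-document indicators for retrieval.
next_tokens = torch.randint(0, vocab, (1, seq_len))
doc_labels = torch.zeros(1, n_docs)
doc_labels[0, 2] = 1.0                  # pretend document 2 is the gold passage

lm_loss = nn.functional.cross_entropy(
    gen_head(token_states).flatten(0, 1), next_tokens.flatten())
ret_loss = nn.functional.binary_cross_entropy_with_logits(
    ret_head(doc_states).squeeze(-1), doc_labels)

loss = lm_loss + 0.5 * ret_loss         # joint objective; 0.5 is an arbitrary weight
loss.backward()
```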
4/
With a more realistic benchmark in hand, we systematically explored three approaches to enhance model performance:
1. Retrieve-then-generate supervised fine-tuning (picture): we train LCLMs to first retrieve relevant information from the context and then generate the final responses.
2. Retrieval-attention-probing: During inference, we probe attention heads activated for in-context retrieval, and use their top predictions to filter out confounders.
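A rough sketch of how retrieval-attention-probing could work in practice, with loudly stated simplifications: the paper probes specific heads activated for in-context retrieval, while this toy version just averages attention over all layers and heads, and uses gpt2 only so the snippet runs anywhere.

```python
# Rough sketch of retrieval-attention-probing (a simplification: the paper
# probes specific heads, while this averages over all layers and heads).
# gpt2 is used only so the snippet runs; any causal LM with attentions works.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_attentions=True).eval()

docs = ["Paris is the capital of France.",
        "Paris Hilton is a media personality.",
        "Tokyo is the capital of Japan."]
question = "What is the capital of France?"

# Build the prompt and remember each document's token span.
spans, pieces, pos = [], [], 0
for d in docs:
    ids = tok(d + "\n").input_ids
    spans.append((pos, pos + len(ids)))
    pieces.extend(ids)
    pos += len(ids)
pieces.extend(tok("Question: " + question).input_ids)
input_ids = torch.tensor([pieces])

with torch.no_grad():
    out = model(input_ids)

# Average attention from the last (question) token to each document span.
att = torch.stack(out.attentions).mean(dim=(0, 2))[0, -1]   # shape: (seq_len,)
scores = [att[s:e].sum().item() for s, e in spans]
best = max(range(len(docs)), key=lambda i: scores[i])
print("Predicted relevant doc:", docs[best])
```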
3/
This limitation often leads to inflated results. To address this, we created a more realistic dataset, ICR². It uses five retrievers to generate challenging negative documents (picture 1). Our results show a significant performance drop with standard RAG setups. For example, with GPT-4-Turbo, accuracy on NQ dropped from 0.85 to 0.67, and on HPQA it fell from 0.78 to 0.64 (picture 2).
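As a hedged illustration of confounder mining (not the exact ICR² pipeline, which uses five retrievers), here is how a single BM25 retriever could select relevant-looking but misleading negatives instead of random ones; the corpus, query, and cutoff are made up.

```python
# Sketch of confounder mining with one lexical retriever (BM25), standing in
# for the five retrievers used to build ICR^2. Corpus, query, and the k=2
# cutoff are illustrative assumptions.
from rank_bm25 import BM25Okapi  # pip install rank-bm25

corpus = [
    "Paris is the capital of France.",                  # gold passage
    "Paris Hilton is an American media personality.",   # lexically similar confounder
    "Lyon is a city in France known for its cuisine.",
    "The capital of Japan is Tokyo.",
]
gold_ids = {0}
query = "What is the capital of France?"

bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
scores = bm25.get_scores(query.lower().split())

# Highest-scoring non-gold passages become "confounders": relevant-looking
# but misleading, unlike randomly sampled negatives.
ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
confounders = [i for i in ranked if i not in gold_ids][:2]
print([corpus[i] for i in confounders])
```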
2/
But are current LCLMs up to the task? If not, how can we improve their performance?
In our preprint [1], we evaluated five popular LCLMs using the LOFT benchmark [2], which involves answering questions paired with documents. However, LOFT relies on random sampling to create irrelevant (negative) documents for each query, failing to include confounding documents — those that are relevant but misleading — which are common in real-world scenarios.
1/
What if #LLMs had context windows so large that an entire knowledge base could fit into a single prompt? This would revolutionize Retrieval-Augmented Generation (RAG) applications by enabling retrieval, re-ranking, reasoning, and generation all in one step. With a Long-Context Language Model (LCLM), we could simplify RAG architecture by leveraging the model’s capability for In-Context Retrieval and Reasoning (ICR²).
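To make the single-prompt idea concrete, here is a minimal sketch, assuming a toy three-document knowledge base and illustrative instruction wording (not a format prescribed by the paper), of an ICR²-style prompt that asks the model to retrieve, reason, and answer in one pass.

```python
# Hypothetical sketch: packing a small knowledge base into a single
# ICR^2-style prompt that asks the model to retrieve, reason, then answer.
# The corpus, instruction text, and downstream model call are assumptions.
corpus = {
    "doc_1": "Mistral 7B was released by Mistral AI in September 2023.",
    "doc_2": "The Eiffel Tower is located in Paris, France.",
    "doc_3": "LOFT is a benchmark for long-context language models.",
}
question = "Which company released Mistral 7B?"

context_block = "\n".join(f"[{doc_id}] {text}" for doc_id, text in corpus.items())

prompt = (
    "You are given a knowledge base. First list the IDs of the passages "
    "relevant to the question, then reason over them, then give a final answer.\n\n"
    f"Knowledge base:\n{context_block}\n\n"
    f"Question: {question}\n"
    "Relevant passages:"
)

print(prompt)  # feed this to any long-context chat/completions endpoint
```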
If you care about theoretical computer science, you should watch this lovely talk by Jon Kleinberg on language generation being easier than language identification. He mentions an interesting tension between hallucination and mode collapse.
https://arxiv.org/abs/2404.06757
https://www.youtube.com/live/BrdaVSZBuyU?feature=shared