NLP (111)
[Paper Review] Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences. Paper title: Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences. Paper link: https://arxiv.org/abs/2210.11794 Abstract: Efficient Transformers have been developed for long sequence modeling, due to their subquadratic memory and time complexity. Sparse Transformer is a popular approach to improving t..
[Paper Review] Unmasked Teacher: Towards Training-Efficient Video Foundation Models. Paper title: Unmasked Teacher: Towards Training-Efficient Video Foundation Models. Paper link: https://arxiv.org/abs/2303.16058v1 Abstract: Video Foundation Models (VFMs) have received limited exploration due to high computational costs and data scarcity. Previous VFMs rely on Image Foundation Models (IFMs), which face challenges in transferring to the..
[Paper Review] GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher. Paper title: GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher. Paper link: https://arxiv.org/abs/2308.06463 Abstract: Safety lies at the core of the development of Large Language Models (LLMs). There is ample work on aligning LLMs with human ethics and preferences, including data filtering in pretraining, supervised fine-tuning, reinforce..
[Paper Review] Dataset Distillation with Attention Labels for Fine-tuning BERT. Paper title: Dataset Distillation with Attention Labels for Fine-tuning BERT. Paper link: https://aclanthology.org/2023.acl-short.12/ Citation: Aru Maekawa, Naoki Kobayashi, Kotaro Funakoshi, Manabu Okumura. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2023. This post summarizes only the core idea. Da..
[Paper Review] Query2doc: Query Expansion with Large Language Models. Paper title: Query2doc: Query Expansion with Large Language Models. Paper link: https://arxiv.org/abs/2303.07678 Abstract: This paper introduces a simple yet effective query expansion approach, denoted as query2doc, to improve both sparse and dense retrieval systems. The proposed method first generates pseudo-documents by few-shot prompting large language models (LLM..
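[huggingface🤗] Prompting? PEFT? A complete summary. Motivation: When studying NLP, the sheer number of terms (prompting, prompt tuning, soft prompt, p-tuning, prefix-tuning, In-Context Learning, and so on) gets confusing. This confusion is inevitable, though, because new fields and methods keep emerging and are still settling into place. Even so, I felt the terms needed sorting out at least once, so I put together my own summary. No matter how much I searched, I could not find an image that organizes them properly. Basis: I followed the huggingface documentation, which carries a fair amount of authority in NLP. I judged it to be the most objective source available, more so than blogs or papers that carry personal opinions. https://huggingface.co/docs/peft/conceptual_guides/prompting..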
[Paper Review] Block-Skim: Efficient Question Answering for Transformer. Paper title: Block-Skim: Efficient Question Answering for Transformer. Paper link: https://arxiv.org/abs/2112.08560 Abstract: Transformer models have achieved promising results on natural language processing (NLP) tasks including extractive question answering (QA). Common Transformer encoders used in NLP tasks process the hidden states of all input tokens in the c..
[Paper Review] What learning algorithm is in-context learning? Investigations with linear models. Paper title: What learning algorithm is in-context learning? Investigations with linear models. Paper link: https://arxiv.org/abs/2211.15661 Abstract: Neural sequence models, especially transformers, exhibit a remarkable capacity for in-context learning. They can construct new predictors from sequences of labeled examples $(x, f(x))$ prese..