본문 바로가기

NLP/논문이해

(65)

[논문이해] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning 논문명: Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning (ICLR 2023) 논문링크: https://arxiv.org/abs/2209.14610 Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-train..

[논문이해] BLEURT: Learning Robust Metrics for Text Generation 논문명: BLEURT: Learning Robust Metrics for Text Generation 논문 링크: https://arxiv.org/abs/2004.04696 BLEURT: Learning Robust Metrics for Text Generation Text generation has made significant advances in the last few years. Yet, evaluation metrics have lagged behind, as the most popular choices (e.g., BLEU and ROUGE) may correlate poorly with human judgments. We propose BLEURT, a learned evaluation me..

[논문이해] Active Retrieval Augmented Generation 논문명: Active Retrieval Augmented Generation 논문링크: https://arxiv.org/abs/2305.06983 Active Retrieval Augmented Generation Despite the remarkable ability of large language models (LMs) to comprehend and generate language, they have a tendency to hallucinate and create factually inaccurate output. Augmenting LMs by retrieving information from external knowledge resources is one arxiv.org 아이디어만 정리합니다..

[논문이해] Should You Mask 15% in Masked Language Modeling? 논문명: Should You Mask 15% in Masked Language Modeling? 논문링크: https://arxiv.org/abs/2202.08005 Should You Mask 15% in Masked Language Modeling? Masked language models (MLMs) conventionally mask 15% of tokens due to the belief that more masking would leave insufficient context to learn good representations; this masking rate has been widely used, regardless of model sizes or masking strategies. In ..

[논문이해] SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization 논문명: SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization 논문링크: https://arxiv.org/abs/2212.10465 SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization We present SODA: the first publicly available, million-scale high-quality social dialogue dataset. In contrast to most existing crowdsourced, small-scale dialogue corpora, we distill 1.5..

[논문이해] Pre-Training to Learn in Context 논문명: Pre-Training to Learn in Context 논문링크: https://arxiv.org/abs/2305.09137 Pre-Training to Learn in Context In-context learning, where pre-trained language models learn to perform tasks from task examples and instructions in their contexts, has attracted much attention in the NLP community. However, the ability of in-context learning is not fully exploited becau arxiv.org 아이디어만 정리합니다. 아이디어 기존 ..

[논문이해] Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences 논문명: Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences 논문링크: https://arxiv.org/abs/2210.11794 Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences Efficient Transformers have been developed for long sequence modeling, due to their subquadratic memory and time complexity. Sparse Transformer is a popular approach to improving t..

[논문이해] Unmasked Teacher: Towards Training-Efficient Video Foundation Models 논문명: Unmasked Teacher: Towards Training-Efficient Video Foundation Models 논문링크: https://arxiv.org/abs/2303.16058v1 Unmasked Teacher: Towards Training-Efficient Video Foundation Models Video Foundation Models (VFMs) have received limited exploration due to high computational costs and data scarcity. Previous VFMs rely on Image Foundation Models (IFMs), which face challenges in transferring to the..

이전 1 2 3 4 5 6 7 8 9 다음

티스토리툴바