
NLP

(111)
[Paper Review] SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval Paper title: SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval Paper link: https://arxiv.org/abs/2109.10086 SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval In neural Information Retrieval (IR), ongoing research is directed towards improving the first retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using efficient approximate nearest..
[Paper Review] Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? Paper title: Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? Paper link: https://arxiv.org/abs/2301.00184 Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? Most existing text-video retrieval methods focus on cross-modal matching between the visual content of videos and textual query sentences. However, in real-world scenarios, online videos are often accompanied by relevan..
[error] CUDA OUT OF MEMORY occurs even though the GPU has plenty of memory Using TensorFlow alone causes no problem, but the error reportedly appears when TensorFlow and PyTorch are used in the same process. TensorFlow reserves all GPU memory up front and then allocates from that pool, so this behavior has to be disabled in code. import tensorflow as tf gpus = tf.config.experimental.list_physical_devices('GPU') if gpus: try: # Currently, memory growth needs to be the same across GPUs for gpu in gpus: tf.config.experimental.set_memory_growth(gpu, True) logical_gpus = tf.config.experimental.list_logi..
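The excerpt above is cut off, so here is a minimal sketch of the same idea rather than the post's exact code. It asks TensorFlow to grow GPU memory on demand instead of reserving it all, which leaves room for PyTorch. The function name `enable_memory_growth` and its `tf_module` parameter are my own assumptions for illustration (the parameter only exists so the function can be tried without a GPU); `tf.config.list_physical_devices` and `tf.config.experimental.set_memory_growth` are the TensorFlow calls it relies on.

```python
def enable_memory_growth(tf_module=None):
    """Turn on on-demand GPU memory allocation for every visible GPU.

    Returns the number of GPUs configured (0 if none, or if it was too late).
    `tf_module` is a hypothetical injection point; leave it as None to use
    the real TensorFlow package.
    """
    if tf_module is None:
        import tensorflow as tf_module  # imported lazily, only when needed

    gpus = tf_module.config.list_physical_devices("GPU")
    try:
        # Memory growth needs to be set the same way on all GPUs.
        for gpu in gpus:
            tf_module.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as err:
        # TensorFlow raises this if the GPUs were already initialized.
        print(f"Could not enable memory growth: {err}")
        return 0
    return len(gpus)
```

Call `enable_memory_growth()` once at the very start of the program, before any TensorFlow op or model touches the GPU. Once TensorFlow has initialized the device, the setting can no longer be changed.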
[Paper Review] Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval Paper title: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval Paper link: https://arxiv.org/abs/2202.03384 Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval With the recent boom of video-based social platforms (e.g., YouTube and TikTok), video retrieval using sentence queries has become an important demand and attracts increasing research attention. Despite the d..
[Paper Review] LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models Paper title: LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models Paper link: https://arxiv.org/abs/2309.12307 LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models We present LongLoRA, an efficient fine-tuning approach that extends the context sizes of pre-trained large language models (LLMs), with limited computation cost. Typically, training LLMs with long context sizes is ..
[Paper Review] CODEFUSION: A Pre-trained Diffusion Model for Code Generation This paper drew a lot of attention because of the controversy over its disclosure of ChatGPT's parameter count, but the paper itself also looks good, so I am covering it here. Paper title: CODEFUSION: A Pre-trained Diffusion Model for Code Generation Paper link: https://arxiv.org/abs/2310.17680v1 CodeFusion: A Pre-trained Diffusion Model for Code Generation Imagine a developer who can only change their last line of code, how often would they have to start writing a function from scratch before it is correct? Auto-regress..
[Paper Review] Neural Text Generation with Unlikelihood Training Paper title: Neural Text Generation with Unlikelihood Training Paper link: https://openreview.net/forum?id=SJeYe0NtvH Neural Text Generation With Unlikelihood Training Neural text generation is a key tool in natural language applications, but it is well known there are major problems at its core. In particular, standard likelihood training and decoding leads to... openreview.net I only summarize the idea here. Until now, generation d..
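[Things I Realize While Developing] Googling, master's-student style In a computer science department, Googling skill matters. As I see it, it breaks down into the following stages. Keyword rewriting: Can you condense a long error message into a good query? Can you tweak the query bit by bit until it becomes the search you actually want? Do you keep searching, one query after another, based on what the results show? Language: Do you keep reading when only English material comes up? When only Chinese material comes up? Recently I added one more: digging through GitHub issues. If searching really turns up nothing, read through the PyTorch community forum and GitHub issues as well. Sometimes Google surfaces them, but in my experience I had to find them myself four times out of five. Neither Chinese nor English scares me anymore. I am grateful that there is something to search at all. The real despair comes when nothing turns up..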