
All posts (112)
[error] PeftModelForCausalLM.generate() takes 1 positional argument and 2 were given Solution: beam_output = model.generate( input_ids=input_ids, # you must name the input_ids argument explicitly do_sample=True, top_k=10, max_new_tokens=32, min_new_tokens=16, output_scores=True, num_beams=BEAM_SIZE, repetition_penalty=10.0, return_dict_in_generate=True, num_return_sequences=BEAM_SIZE, ) You have to spell out input_ids=input_ids. Problem: beam_output = model.generate( input_ids, # the plain (non-PEFT) model accepted this implicitly without the keyword do_sam..
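Below is a minimal sketch of the fix in context, assuming a LoRA adapter loaded with peft.PeftModel; the base model name ("gpt2"), the adapter path, and the prompt are placeholders, and BEAM_SIZE mirrors the snippet above:

```python
# Sketch: wrap a causal LM with a PEFT adapter and call generate()
# with input_ids passed as a keyword argument (placeholder names throughout).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BEAM_SIZE = 4

base_model = AutoModelForCausalLM.from_pretrained("gpt2")         # placeholder base model
model = PeftModel.from_pretrained(base_model, "path/to/adapter")  # placeholder adapter dir
tokenizer = AutoTokenizer.from_pretrained("gpt2")

input_ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids

# Passing the tensor positionally raises the TypeError above;
# naming it input_ids= routes it through to the base model correctly.
beam_output = model.generate(
    input_ids=input_ids,
    do_sample=True,
    top_k=10,
    max_new_tokens=32,
    min_new_tokens=16,
    output_scores=True,
    num_beams=BEAM_SIZE,
    repetition_penalty=10.0,
    return_dict_in_generate=True,
    num_return_sequences=BEAM_SIZE,
)

for seq in beam_output.sequences:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

The TypeError arises because the PEFT wrapper's generate() only accepts keyword arguments and forwards them to the base model's generate(), so the positional call that the plain model tolerated fails on the wrapped model.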
[error] Fixing the Python/Excel CSV encoding problem Solution: df.to_csv(FILE_NAME, encoding='utf-8-sig') — use the 'utf-8-sig' encoding. Problem: Python reads and writes CSV files with 'utf-8' encoding, but to open that file in Excel you have to convert it to ANSI or another encoding. That conversion is tedious, and I want the file to open cleanly in both. Every search result suggests the annoying workaround of opening the file in Notepad and re-saving it with a different encoding, so I'm writing this down.
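A minimal sketch of the round trip, assuming pandas; the file name and DataFrame contents are placeholders for illustration:

```python
# Sketch: write a DataFrame with Korean text to CSV using a UTF-8 BOM
# so that Excel detects the encoding correctly (placeholder file name/data).
import pandas as pd

FILE_NAME = "result.csv"  # placeholder path

df = pd.DataFrame({"이름": ["홍길동", "김철수"], "점수": [90, 85]})

# 'utf-8-sig' prepends a byte-order mark; Excel uses it to recognize UTF-8,
# so the file opens with readable Hangul instead of mojibake.
df.to_csv(FILE_NAME, index=False, encoding="utf-8-sig")

# Reading it back with 'utf-8-sig' strips the BOM again, so Python sees clean headers.
df_back = pd.read_csv(FILE_NAME, encoding="utf-8-sig")
print(df_back)
```

The BOM is just a few extra bytes at the start of the file; a plain 'utf-8' reader that does not strip it will see '\ufeff' glued to the first column name, which is why reading back with 'utf-8-sig' is the symmetric choice.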
[Paper Review] Active Retrieval Augmented Generation Paper: Active Retrieval Augmented Generation Link: https://arxiv.org/abs/2305.06983 Despite the remarkable ability of large language models (LMs) to comprehend and generate language, they have a tendency to hallucinate and create factually inaccurate output. Augmenting LMs by retrieving information from external knowledge resources is one.. I only summarize the idea here..
[Paper Review] Should You Mask 15% in Masked Language Modeling? Paper: Should You Mask 15% in Masked Language Modeling? Link: https://arxiv.org/abs/2202.08005 Masked language models (MLMs) conventionally mask 15% of tokens due to the belief that more masking would leave insufficient context to learn good representations; this masking rate has been widely used, regardless of model sizes or masking strategies. In..
[Paper Review] SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization Paper: SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization Link: https://arxiv.org/abs/2212.10465 We present SODA: the first publicly available, million-scale high-quality social dialogue dataset. In contrast to most existing crowdsourced, small-scale dialogue corpora, we distill 1.5..
[Paper Review] Pre-Training to Learn in Context Paper: Pre-Training to Learn in Context Link: https://arxiv.org/abs/2305.09137 In-context learning, where pre-trained language models learn to perform tasks from task examples and instructions in their contexts, has attracted much attention in the NLP community. However, the ability of in-context learning is not fully exploited becau.. I only summarize the idea here. Idea: The existing..
[Paper Review] Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences Paper: Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences Link: https://arxiv.org/abs/2210.11794 Efficient Transformers have been developed for long sequence modeling, due to their subquadratic memory and time complexity. Sparse Transformer is a popular approach to improving t..
[Paper Review] Unmasked Teacher: Towards Training-Efficient Video Foundation Models Paper: Unmasked Teacher: Towards Training-Efficient Video Foundation Models Link: https://arxiv.org/abs/2303.16058v1 Video Foundation Models (VFMs) have received limited exploration due to high computational costs and data scarcity. Previous VFMs rely on Image Foundation Models (IFMs), which face challenges in transferring to the..