[Paper Review] MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

If you only want the explanation of the paper itself, scroll down a bit; note that I cover it only up to just before the experiments. Typo corrections: although this paper was written by Apple, typos appear in several places, so I will correct them before starting. Of course, the most likely case is that my own understanding falls short, so please let me know if anything seems off. Page 5, in part: [Before] In particular, we create DataCompDR-1B and DataCompDR-12M by reinforcing DataComp-1B and DataCompDR-12M. [After] In particular, we create DataCompDR-1B and DataCompDR-12M by reinforcing DataComp-1B and DataComp-12M. ..
[Paper Review] The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper link: https://arxiv.org/abs/2402.17764
Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of t..
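As a rough illustration of the ternary scheme the abstract describes (every weight in {-1, 0, 1}), here is a minimal sketch of the absmean quantizer as I read it from the paper; the helper name absmean_ternary and the eps value are my own choices, not the paper's code.

import numpy as np

def absmean_ternary(W, eps=1e-5):
    # Scale by the mean absolute value of the tensor, then round and
    # clip each entry to the ternary set {-1, 0, 1}.
    gamma = np.mean(np.abs(W)) + eps
    return np.clip(np.round(W / gamma), -1, 1), gamma

W = np.random.randn(4, 4).astype(np.float32)
Wq, gamma = absmean_ternary(W)
print(np.unique(Wq))                   # subset of {-1, 0, 1}
print(np.abs(W - Wq * gamma).mean())   # average quantization error after rescaling

Since each weight takes one of three values, it carries log2(3) ≈ 1.58 bits of information, hence the "1.58 Bits" in the title.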
[Insight] Some intuitions about large language models

Blog: Some intuitions about large language models (Jason Wei)
Blog link: https://www.jasonwei.net/blog/some-intuitions-about-large-language-models
An open question these days is why large language models work so well. In this blog post I will discuss six basic intuitions about large language models. Many of them are inspired by manually examining data, wh..
[error] A conda environment picks up libraries installed in the local user site

Quick workaround: prepend PYTHONNOUSERSITE=1 when running.
# BEFORE
python main.py
# AFTER
PYTHONNOUSERSITE=1 python main.py
That post lays out the short-term, long-term, and experimental fixes in detail, so refer to it. The situation I was in: 1) I created a new conda environment for DeepSpeed. 2) The following error occurred: NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_6..
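To confirm this is actually the failure mode, a quick check (my own sketch, not from the post) is to ask Python which user-site paths are live; with PYTHONNOUSERSITE=1 set, ENABLE_USER_SITE should be False and no ~/.local paths should appear on sys.path.

import site
import sys

# Packages under ~/.local/lib/pythonX.Y/site-packages can shadow the conda
# environment's packages, e.g. a PyTorch build that lacks sm_86 kernels.
print("user site enabled:", site.ENABLE_USER_SITE)
print("user site dir:", site.getusersitepackages())
print("suspect paths:", [p for p in sys.path if ".local" in p])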
[Paper Review] SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures

Paper: SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures
Paper link: https://arxiv.org/abs/2402.03620
We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. Core to ..
[Paper Review] Unlearn What You Want to Forget: Efficient Unlearning for LLMs

Paper: Unlearn What You Want to Forget: Efficient Unlearning for LLMs
Paper link: https://arxiv.org/abs/2310.20150
Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data, however, this process might suffer from privacy issues and violations of data protection regulatio..
[Paper Review] Sinkhorn Transformations for Single-Query Postprocessing in Text-Video Retrieval

Paper: Sinkhorn Transformations for Single-Query Postprocessing in Text-Video Retrieval
Paper link: https://dl.acm.org/doi/10.1145/3539618.3592064
SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, July 2..
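For context on the core operation (a generic textbook Sinkhorn iteration, not the paper's single-query variant): it alternately normalizes the rows and columns of an exponentiated similarity matrix so the result approaches a doubly stochastic one; the temperature and iteration count below are my own toy choices.

import numpy as np

def sinkhorn(S, n_iters=10, temperature=0.05):
    # Exponentiate the similarity matrix, then alternate row and
    # column normalization.
    P = np.exp(S / temperature)
    for _ in range(n_iters):
        P = P / P.sum(axis=1, keepdims=True)  # rows sum to 1
        P = P / P.sum(axis=0, keepdims=True)  # columns sum to 1
    return P

S = np.random.randn(5, 5)    # toy text-video similarity scores
P = sinkhorn(S)
print(P.sum(axis=0))         # columns are ~1 after the final column step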
[Glossary] LogSumExp

https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/
The Log-Sum-Exp Trick: In statistical modeling and machine learning, we often work in a logarithmic scale. There are many good reasons for this. For example, when x and y are both small numbers, multiplying x times y may underflow. However, we can work in a logarithmic s..
Among Korean-language resources, one that explains the definition of LogSumExp and the reason for using it in a way that really clicks ..
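The trick itself fits in a few lines; this is my own minimal sketch of the standard max-shift identity log Σ exp(x_i) = m + log Σ exp(x_i - m) with m = max_i x_i, not code from the linked post.

import numpy as np

def logsumexp(x):
    # Subtract the max before exponentiating so exp() cannot overflow;
    # adding the max back afterwards leaves the result unchanged.
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

x = np.array([1000.0, 1000.0])
print(np.log(np.sum(np.exp(x))))  # naive version overflows to inf
print(logsumexp(x))               # 1000.693..., i.e. 1000 + log(2)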