Deep Learning

(3)

[huggingface🤗] Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA: This post is a loose translation of the Hugging Face blog post 'Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA'. https://huggingface.co/blog/4bit-transformers-bitsandbytes LLMs are known to be large, and running or training t..

[error] When a pretrained language model outputs only the same value. Who this is for: your BERT model emits the same encoding for every input, and the loss goes down but the accuracy does not improve. 3-line summary: when a pretrained language model such as BERT outputs only the same value, the cause is a learning rate that is too high. Lower the lr. Related posts: https://stackoverflow.com/questions/61855486/bert-encoding-layer-produces-same-output-for-all-inputs-during-evaluation-pytor BERT encoding layer produces same output for all inputs during evaluation (PyTorch) I don't understand why..

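[Terminology] Ablation Study. 3-line summary: What is an ablation study? An experiment designed and run to find out what changes when a specific part is removed. Meaning: "ablation" means excision, here in a sense close to the medical term: the process of removing part of a living organism, such as a particular organ or tissue. "study" means research. Origin: "ablation study" was originally a term used in biology. It is easy to recall that many past animal experiments compared animals that had a given part removed with animals that did not, in order to learn what function that part serves. Whoever first studied animals had to dissect a living creature, so they would have opened it up. Inside, there is a lump that contracts and expands. We know that it is the heart..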
