
NLP

(111)

[error] When a pretrained language model outputs the same value for every input. Who this is for: "My BERT model produces the same encoding for every input." "The loss goes down, but the accuracy does not improve." 3-line summary: When a pretrained language model such as BERT outputs the same value for every input, the cause is a learning rate that is too high. Lower the lr. Related links: https://stackoverflow.com/questions/61855486/bert-encoding-layer-produces-same-output-for-all-inputs-during-evaluation-pytor BERT encoding layer produces same output for all inputs during evaluation (PyTorch) I don't understand why..
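As a rough way to confirm the symptom described above before changing the learning rate, the sketch below checks whether a batch of encodings has collapsed to (almost) the same vector. It is plain Python on lists of floats. The function name `outputs_collapsed`, the tolerance `tol`, and the sample values are assumptions made for illustration; they do not come from the post or from any library.

```python
from typing import Sequence


def outputs_collapsed(vectors: Sequence[Sequence[float]], tol: float = 1e-6) -> bool:
    """Return True if every vector is within `tol` of the first one in each dimension."""
    if len(vectors) < 2:
        # A single output cannot tell us anything about collapse.
        return False
    first = vectors[0]
    for vec in vectors[1:]:
        if len(vec) != len(first):
            raise ValueError("all vectors must have the same length")
        if any(abs(a - b) > tol for a, b in zip(vec, first)):
            return False
    return True


# Hypothetical encodings for three different input sentences.
healthy = [[0.12, -0.40, 0.88], [0.57, 0.03, -0.21], [-0.33, 0.76, 0.10]]
collapsed = [[0.25, 0.25, 0.25], [0.25, 0.25, 0.25], [0.25, 0.25, 0.25]]

print(outputs_collapsed(healthy))    # False
print(outputs_collapsed(collapsed))  # True
```

If the check returns True for clearly different inputs, the post's advice applies: lower the learning rate and retrain.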

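[Paper Notes] Math Word Problem Dataset. When reading papers in the Math Word Problem area, the same datasets come up again and again. Too many papers build on them to ignore, so I am summarizing them here. Math Word Problem: one of the tasks in Natural Language Processing (NLP), in which a model is asked to solve math problems written as sentences. Problem: Dan has 2 pens and Jessica has 4 pens. How many pens do they have in total? Answer: 6. Equation: x = 4 + 2. Usually, rather than only predicting the answer, the model is made to generate the equation that represents the solution. Today I will briefly introduce four representative datasets. Math23k MathQA MAWPS SVA..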
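The pen example above can be sketched in code: given a generated equation such as `x = 4 + 2`, evaluate its right-hand side and compare it with the reference answer. This is a minimal sketch and not the evaluation script of any dataset named in the post. The helper names `solve_equation` and `is_correct`, the `x = ...` input format, and the tolerance are assumptions.

```python
import ast
import operator

# Binary operators allowed in a generated equation.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node: ast.AST) -> float:
    """Evaluate a parsed arithmetic expression made of numbers and + - * /."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def solve_equation(equation: str) -> float:
    """Solve an equation of the assumed form 'x = <arithmetic expression>'."""
    lhs, rhs = equation.split("=", 1)
    if lhs.strip() != "x":
        raise ValueError("expected the form 'x = ...'")
    return _eval(ast.parse(rhs.strip(), mode="eval").body)


def is_correct(generated_equation: str, answer: float, tol: float = 1e-6) -> bool:
    """Check whether the generated equation evaluates to the reference answer."""
    return abs(solve_equation(generated_equation) - answer) <= tol


print(solve_equation("x = 4 + 2"))   # 6
print(is_correct("x = 4 + 2", 6))    # True
print(is_correct("x = 4 * 2", 6))    # False
```

Scoring the evaluated equation rather than a bare number is one way to credit a model for producing the solution steps, which matches the post's point about generating equations instead of only answers.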

[PyTorch] Auto Mixed Precision. 3-line summary: A technique from NVIDIA and Baidu that exploits the properties of floating-point formats to reduce the amount of computation in deep learning and speed it up. (Note that 'precision' here does not refer to the evaluation metric. I confused the two at first, so I am writing this down in case others do the same.) Usage: PyTorch docs: https://pytorch.org/docs/stable/amp.html# In PyTorch it is called AMP: Automatic Mixed Precision. Automatic Mixed Precision package - torch.amp - PyTorch 1.12 documentation The following lists describe the behavior of eligibl..
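To make the note about 'precision' concrete: the word refers to how many bits a floating-point number uses, not to a metric. The sketch below uses only Python's standard `struct` module to round values to 16-bit and 32-bit floats. It is not PyTorch's AMP API; the helper names `to_half` and `to_single` and the sample values are assumptions chosen for illustration.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 0.1
print(to_single(value))  # very close to 0.1
print(to_half(value))    # noticeably coarser approximation of 0.1

tiny = 1e-8                  # hypothetical small gradient value
print(to_half(tiny))         # 0.0: too small to represent in half precision
print(to_half(tiny * 1024))  # non-zero once the value is scaled up
```

Half precision is cheaper to store and compute with, but small values can vanish, which is one reason mixed-precision training keeps some computations in 32-bit and scales the loss. For the actual `torch.amp` usage, see the PyTorch docs linked above.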

[Paper Notes] Generating Equation by Utilizing Operators : GEO Model. Paper: Generating Equation by Utilizing Operators : GEO Model. Korean paper link: https://s-space.snu.ac.kr/handle/10371/175890#export_btn English paper link: https://aclanthology.org/2020.coling-main.38.pdf SNU Open Repository and Archive: 템플릿 기반의 방법을 이용한 문장형 수학 문제 풀이 (Automatically solving math word problem using template-based methods) Issue Date 2021-02 Publisher Seoul National University Graduate School Keywords Natural language ..

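[Terminology] Ablation Study. 3-line summary: what is an ablation study? An experiment designed and run to find out what changes when a specific part is removed. Meaning: ablation: excision. Here the sense is close to the medical term, meaning the removal of part of an organism, such as an organ or tissue. study: research. Origin: in other words, 'ablation study' was originally a term from biology. It is easy to recall that many past animal experiments compared animals with a given part removed against animals with it intact, in order to learn what that part does. If you were studying animals for the first time, you would have had to dissect a living organism and open it up. Inside, there is a lump that contracts and expands. We know that this is the heart..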

[Paper Notes] Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction. Paper: Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction. Paper link: https://aclanthology.org/2022.acl-long.410.pdf Terminology: Math Word Problem (hereafter 'MWP'): a math problem expressed in sentences. For example: 'Cheolsu has 11 candies and Younghee has 23 candies. If Cheolsu gives all of his candies to Younghee, how many candies does Younghee have in total?' quantity: a number. In the candy example above, '11' and '23' are the quantities. Summary, 3-line summary: Existing seq2seq and seq2tree models perform well, but explicitly ..

[Paper Notes] TM-generation model: a template-based method for automatically solving mathematical word problems. Paper: TM-generation model: a template-based method for automatically solving mathematical word problems. Paper link: https://link.springer.com/article/10.1007/s11227-021-03855-9 Summary - Proposes 'TM-generation', a model that improves accuracy on the Math Word Problem (hereafter MWP) solving task - To do so, it defines two challenges and tries to address each. 1. filling in missing world knowledge required to solve the given MWP. Motivation: filling in the common sense (world knowledge) needed to solve an MWP ..
