[Paper Notes] Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts

Paper title: Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts

Paper link: https://aclanthology.org/2023.acl-short.21/

Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts

Skyler Hallinan, Alisa Liu, Yejin Choi, Maarten Sap. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2023.


I'll cover only the key ideas. Note that MaRCo has nothing to do with MS MARCO; it is simply an abbreviation of the method's name.

Key ideas

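Toxic LM: a model trained on a large amount of toxic text
Non-Toxic LM: a model trained on non-toxic text
How to find toxic tokens: if the Toxic LM assigns a token a high generation probability but the Non-Toxic LM assigns it a low one, the token is suspicious, so mask it first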
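
The masking rule above can be sketched roughly as follows. This is a minimal illustration that assumes we already have the probability each LM assigns to every token in the sentence. The function name, the log-ratio score, the threshold value, and the toy probabilities are all my own assumptions for illustration, not necessarily the exact criterion used in the paper.

```python
import math

def mask_suspicious_tokens(tokens, p_toxic, p_nontoxic, threshold=1.0, mask="<mask>"):
    """Mask tokens that the toxic LM likes much more than the non-toxic LM.

    p_toxic[i] and p_nontoxic[i] are the (hypothetical) probabilities that
    each LM assigns to tokens[i]. threshold is an assumed cutoff on
    log(p_toxic / p_nontoxic); it is not a value taken from the paper.
    """
    masked = []
    for tok, pt, pn in zip(tokens, p_toxic, p_nontoxic):
        # A large positive score means: toxic LM high, non-toxic LM low.
        score = math.log(pt) - math.log(pn)
        masked.append(mask if score > threshold else tok)
    return masked

# Toy numbers, made up for illustration only.
tokens = ["you", "are", "a", "stupid", "person"]
p_toxic = [0.20, 0.30, 0.25, 0.40, 0.10]
p_nontoxic = [0.22, 0.28, 0.30, 0.01, 0.12]

result = mask_suspicious_tokens(tokens, p_toxic, p_nontoxic)
print(result)  # ['you', 'are', 'a', '<mask>', 'person']
```

Only "stupid" gets masked here, because it is the only token where the two models strongly disagree in the toxic direction.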

Base LM + Non-Toxic LM - Toxic LM = general common sense + detoxification - toxicity = the best result!
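
One way to read the "Base + Non-Toxic - Toxic" formula is as arithmetic on next-token logits, followed by a softmax. The sketch below assumes exactly that. The weights alpha_expert and alpha_anti, the three-word vocabulary, and the logit values are hypothetical and chosen only to show the effect.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combine_logits(base, nontoxic, toxic, alpha_expert=1.0, alpha_anti=1.0):
    """Base LM + Non-Toxic LM - Toxic LM, applied per vocabulary entry.

    alpha_expert and alpha_anti are assumed scaling weights for the
    expert and anti-expert; 1.0 reproduces the plain formula above.
    """
    return [b + alpha_expert * n - alpha_anti * t
            for b, n, t in zip(base, nontoxic, toxic)]

# Toy logits for filling one masked position (made-up values).
vocab = ["stupid", "kind", "person"]
base = [2.0, 1.0, 0.5]       # the base LM alone prefers "stupid"
nontoxic = [-1.0, 1.5, 0.5]  # the expert pushes toward "kind"
toxic = [3.0, -0.5, 0.0]     # the anti-expert strongly prefers "stupid"

combined = combine_logits(base, nontoxic, toxic)
probs = softmax(combined)
best = vocab[probs.index(max(probs))]
print(combined)  # [-2.0, 3.0, 1.0]
print(best)      # kind
```

With these toy numbers the base LM on its own would pick "stupid", but after adding the expert and subtracting the anti-expert the top choice becomes "kind".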
