GUNDAM

GUNDAM is a data manager, built on PyTorch, that uses language models to handle textual data efficiently. GUNDAM is:

  • Comprehensive: GUNDAM provides a data manager comprising our proposed miner, a GPT-2-based generator, and a demonstration retriever; all of these components are extensible.

  • Flexible: GUNDAM currently supports GPT-2 language models of different sizes, and we plan to extend it to more language models in the future.

  • Efficient: GUNDAM provides an efficient one-to-one miner for checking data quality, and we plan to add one-to-pair and pair-to-pair miners in the future.

API Documentation

Citing

If you find GUNDAM useful, please cite it in your publications.

@software{GUNDAM,
  author = {Jiarui Jin and Yuwei Wu and Mengyue Yang and Xiaoting He and Weinan Zhang and Yiming Yang and Yong Yu and Jun Wang},
  title = {GUNDAM: A Data-Centric Manager for Your Plug-in Data with Language Models},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  version = {0.0},
  howpublished = {\url{https://github.com/GUNDAM-Labet/GUNDAM}},
}