Check out the new instruction-tuning resources:
InstructHumanEval: a variant of the HumanEval benchmark adapted for instruction-tuned models InstructHumanEval
Full Curated CoNaLa: we used UL2 to rewrite more than 590k uncurated intents in the CoNaLa dataset conala-mined-curated
Self-Instruct with StarCoder: we release a self-instruct dataset generated with StarCoder, as well as the code we used to build it self-instruct-starcoder
Models trained on CoNaLa and self-instruct StarCoder: we release the models we trained on the previous two datasets.
This organization is dedicated to language models for code generation. In particular, CodeParrot is a GPT-2 model trained to generate Python code. For more advanced code language models and pre-training datasets, we recommend checking out our work in the BigCode organization. Here you can find:
Interactive blog: where we compare different code models and explain how they are trained and evaluated Code generation with 🤗
Spaces: