StarCoder – A State-of-the-Art LLM for Code – Free alternative to GitHub Copilot

StarCoder is a new large language model (LLM) for code. In this article we’ll discuss StarCoder in detail and how we can use it with VS Code.

BigCode is an open-source collaboration, led by Hugging Face and ServiceNow, working on the responsible development of large language models for code. BigCode’s mission is to create code LLMs that complete and generate code across a wide range of domains, tasks, and programming languages.

StarCoder

StarCoder and StarCoderBase are large language models for code (Code LLMs):

  • trained on permissively licensed data from GitHub
  • 15B-parameter models trained on 1 trillion tokens
  • 80+ programming languages
  • training data also includes Git commits, GitHub issues, and Jupyter notebooks
  • context length of over 8,000 tokens

StarCoder is a version of the StarCoderBase model fine-tuned on 35B Python tokens.

According to the StarCoder documentation, StarCoder outperforms code-cushman-001, the closed-source Code LLM from OpenAI that was used in the early stages of GitHub Copilot.

One key feature is StarCoder’s context window of over 8,000 tokens. At the time of its release, this let it process more input than any other free, open-source code model.

Use cases

The following are some use cases of StarCoder:

  • act as a technical assistant when prompted with a series of dialogues
  • autocomplete code
  • make modifications to code via instructions
  • explain a code snippet in natural language

StarCoder is publicly available under an improved version of the OpenRAIL license.

StarCoder acting as assistant

To use StarCoder as a technical assistant, we can use the Tech Assistant Prompt.
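As a rough illustration of the idea, the sketch below assembles a dialogue-style prompt: a short preamble, a few example exchanges, and then the new question, ending with an open "Assistant:" turn for the model to complete. The preamble text, the "-----" separator, and the helper name `build_assistant_prompt` are assumptions made for this example; the real Tech Assistant Prompt is considerably longer and its exact wording is not reproduced here.

```python
from typing import List, Tuple

# Hypothetical preamble, written for this sketch only.
PREAMBLE = (
    "Below is a dialogue between a human and an AI technical assistant. "
    "The assistant tries to be helpful, polite, and honest."
)
# Assumed separator between dialogue blocks.
SEPARATOR = "-----"


def build_assistant_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Join a preamble, example exchanges, and a new question into one prompt."""
    parts = [PREAMBLE, SEPARATOR]
    for human, assistant in examples:
        parts.append(f"Human: {human}")
        parts.append(f"Assistant: {assistant}")
        parts.append(SEPARATOR)
    # Leave the last assistant turn open so the model continues from here.
    parts.append(f"Human: {question}")
    parts.append("Assistant:")
    return "\n\n".join(parts)


prompt = build_assistant_prompt(
    examples=[("What is a list in Python?", "An ordered, mutable collection.")],
    question="How do I reverse a string in Python?",
)
print(prompt)
```

The resulting string would be passed to the model as ordinary input text; the model is not chat-tuned, so the assistant behaviour comes entirely from the prompt.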

Quick Start

We can use the StarCoder Playground to test StarCoder’s code generation capabilities.
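For trying the model from a script rather than the browser, one option is a hosted inference endpoint. The sketch below only builds the HTTP request and does not send it. It assumes the model is served under the `bigcode/starcoder` model id on the Hugging Face Inference API, that the endpoint URL shown is still current, and that you have an access token; the token value `hf_xxx` and the helper name `build_completion_request` are placeholders for illustration.

```python
import json
import urllib.request

# Assumption: hosted Inference API route for the public StarCoder checkpoint.
API_URL = "https://api-inference.huggingface.co/models/bigcode/starcoder"


def build_completion_request(prompt: str, token: str, max_new_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the model to continue `prompt`."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "hf_xxx" is a placeholder; substitute a real access token before sending.
req = build_completion_request("def fibonacci(n):", token="hf_xxx")
print(req.get_method(), req.full_url)

# To actually send the request (requires network access and a valid token):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

Keeping request construction separate from sending makes it easy to inspect the payload first and to swap in a different endpoint if the hosted route changes.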

VS Code extension

We can use StarCoder in VS Code through the HF Code Autocomplete extension. Simply follow the steps described here.

Conclusion

StarCoder seems to be a promising code generation/completion large language model. Here are some useful links

You can also read How To Use Amazon CodeWhisperer with VS Code – Free alternative to GitHub Copilot.