adesso Blog

Methodology

29. November 2023 By Marc Mezger

Coding assistants: GitHub Copilot, Amazon CodeWhisperer or open source?

Artificial intelligence (AI) has become an indispensable tool in software development, helping developers work faster and more efficiently. AI-supported coding assistants are now advanced enough to handle simple coding tasks and to help solve complex programming problems. Three prime examples are GitHub Copilot, Amazon CodeWhisperer and various open-source alternatives. In this blog post, we will take a closer look at these three options and compare their respective strengths and weaknesses.

GitHub Copilot

GitHub Copilot is an AI-supported tool developed by GitHub and OpenAI that assists developers in code creation. It is a cloud-based service that is integrated into development environments (IDEs) and editors such as Visual Studio Code, Visual Studio, Neovim and the JetBrains IDEs. The tool features an autocomplete function for code and is currently available as a subscription service for both individuals and companies.

GitHub Copilot was first announced on 29 June 2021. A technology preview was launched in the Visual Studio Code development environment and later made available as a plug-in on the JetBrains marketplace and as a public repository for the Neovim plug-in. As the evolution of the Bing Code Search plug-in for Visual Studio 2013, GitHub Copilot provides a range of functions. It can generate solution code from natural language problem descriptions, translate code between different programming languages and convert code comments into executable code. It also offers autocomplete for code segments and for entire methods or functions.
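To make the comment-to-code workflow concrete, the sketch below shows a natural-language comment followed by the kind of completion an assistant might propose. The function name and body are hypothetical and were written by hand for illustration; they are not actual Copilot output.

```python
# Prompt written by the developer as a plain comment:
# "Return the n most frequent words in a text, ignoring case."

from collections import Counter
import re


def most_frequent_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    """Hand-written stand-in for a suggestion an assistant might generate."""
    # Lowercase the text and extract runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter.most_common returns (word, count) pairs, most frequent first.
    return Counter(words).most_common(n)


if __name__ == "__main__":
    print(most_frequent_words("The cat and the dog and the bird", n=2))
```

In practice the developer still has to review such a suggestion, for example to decide whether apostrophes should count as part of a word.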

GitHub Copilot is powered by OpenAI Codex, a modified, production version of GPT-3, a deep-learning language model that can generate human-like text. Codex is additionally trained on several terabytes of source code written in a variety of programming languages. Its successor, GitHub Copilot X, is powered by GPT-4, OpenAI’s most capable model at the time of writing. Despite the obvious strengths of GitHub Copilot, there are security concerns and controversies surrounding the licensing of the code it generates. These relate primarily to legal challenges and to privacy, since the service is cloud-based and requires constant communication with the GitHub Copilot servers.

Example of how Copilot is used in Visual Studio Code

GitHub Copilot Chat

Amazon CodeWhisperer

CodeWhisperer, a product from tech giant Amazon, is an AI-supported programming tool that generates code recommendations in real time within the integrated development environment (IDE), ranging from single lines to complete functions. Its aim is to significantly increase the efficiency and speed of the software development process. Like Copilot, CodeWhisperer accepts natural language comments that describe a specific task in English. Based on this information, it generates suitable code recommendations right in the IDE. As you might expect, it specialises in AWS libraries and components.

Amazon CodeWhisperer, Source: AWS

Open-source coding assistants

TabbyML is an excellent example of an open-source programming assistant. The tool is polished and, in my view, even better than Amazon CodeWhisperer in terms of integration. One great feature is support for Neovim and Vim, two terminal-based editors that are held in high regard among experienced developers.

TabbyML is easy to integrate. All you need to do is install the accompanying plug-in in Visual Studio Code, in a JetBrains IDE or in Neovim. The model is then either launched in a Docker container or installed directly on the PC. For the best results, TabbyML should run on a machine with a GPU or an Apple M-series chip, or on a dedicated GPU server. Setup takes roughly five minutes, which is slightly longer than with Copilot and CodeWhisperer, but TabbyML is not difficult to configure.

One of the main advantages of TabbyML is that your data and code never leave your device and are therefore not transferred to servers operated by US-based companies. This is particularly relevant in relation to the US CLOUD Act. This law allows American authorities to access data stored in the cloud, even if it is stored outside the US, which can raise a number of issues, especially if the data involved is confidential or sensitive. TabbyML, however, allows you to retain control over your data and code at all times. This is a key advantage and more than makes up for the few extra minutes you will need to set it up.

Another great feature is the ability to train the model underlying TabbyML (StarCoder or CodeLlama, for example) yourself, or to switch to a new version of the model when one is released.
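Because the model runs on your own machine, a completion request only ever travels to a local address. The sketch below builds such a request with the Python standard library. The port, the `/v1/completions` path and the `language`/`segments` payload fields are assumptions based on TabbyML's published HTTP API and may differ in the version you run; newer versions may also require an authentication token, which is not shown here.

```python
import json
import urllib.request

# Assumed address of a locally running TabbyML server; adjust to your setup.
TABBY_URL = "http://localhost:8080/v1/completions"


def build_completion_request(
    prefix: str, suffix: str = "", language: str = "python"
) -> urllib.request.Request:
    """Build (but do not send) a completion request for code around the cursor."""
    # Assumed payload shape: the code before and after the cursor position.
    payload = {"language": language, "segments": {"prefix": prefix, "suffix": suffix}}
    return urllib.request.Request(
        TABBY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_completion(prefix: str, suffix: str = "") -> dict:
    """Send the request; this only works while a server is listening on TABBY_URL."""
    request = build_completion_request(prefix, suffix)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `request_completion("def fib(n):\n    ")` against a running server should return the server's JSON response as a dictionary. The editor plug-ins handle this exchange for you, so the sketch is only meant to show that no third-party server is involved.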

Models currently available for TabbyML

Comparison of the copilot providers

In the table below, I compare four popular AI-supported copilots: GitHub Copilot, Amazon CodeWhisperer, TabbyML and Codeium. I assess them according to several criteria, including control over your own data, supported programming languages, licence type, price and integration options.

Feature	GitHub Copilot	Amazon CodeWhisperer	TabbyML	Codeium
Full control over personal data	No	No	Yes	No
Supported programming languages	According to GitHub, all languages in theory	Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, shell scripting, SQL and Scala	Rust, Python, JavaScript, TypeScript, Go, Ruby, Java	70+
Licence	Proprietary	Proprietary	MIT	Proprietary
Price/month (private use)	10 USD	Free of charge	Free of charge; fees for hardware	Free of charge
Price/month (commercial use)	19 USD	19 USD	Free of charge; fees for hardware	15 USD
Integrations	Azure, JetBrains IDEs, Vim/Neovim, Visual Studio, Visual Studio Code	Visual Studio (Code), JetBrains IDEs, AWS, JupyterLab, SageMaker	Visual Studio (Code), JetBrains IDEs, Neovim	Visual Studio (Code), JetBrains IDEs, Eclipse, Google Colab, Databricks, Emacs, Vim/Neovim, Xcode
Self-hosting	No	No	Yes	Yes
Fine-tuning	Planned	Planned	Yes	Yes
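To put the price rows into perspective, the following sketch converts the monthly commercial prices from the table into annual licence costs. The team size of ten developers is a hypothetical value chosen for illustration, and TabbyML's hardware costs are left out because the table gives no figure for them.

```python
# Commercial price per user and month in USD, taken from the table above.
MONTHLY_PRICE_USD = {
    "GitHub Copilot": 19,
    "Amazon CodeWhisperer": 19,
    "TabbyML": 0,  # licence is free; hardware costs are not modelled here
    "Codeium": 15,
}


def annual_licence_cost(team_size: int) -> dict[str, int]:
    """Return the yearly licence cost in USD per tool for a team of the given size."""
    return {tool: price * team_size * 12 for tool, price in MONTHLY_PRICE_USD.items()}


if __name__ == "__main__":
    # Hypothetical team of ten developers.
    for tool, cost in annual_licence_cost(team_size=10).items():
        print(f"{tool}: {cost} USD per year")
```

For the hypothetical ten-person team, this comes to 2,280 USD per year for GitHub Copilot or Amazon CodeWhisperer and 1,800 USD for Codeium, which is the amount against which the cost of a GPU machine for TabbyML would have to be weighed.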

Control over your own data

One of the key criteria for many users is the ability to have control over their own data. This means that you know exactly where your data is at all times and that only you have access to it. Of the copilots in this comparison, only TabbyML gives users full control over their own data. GitHub Copilot, Amazon CodeWhisperer and Codeium do not offer this option.

Number of supported programming languages

In terms of the number of supported programming languages, GitHub Copilot in theory supports all of them. Amazon CodeWhisperer supports a variety of languages (Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, shell scripting, SQL, Scala), and is also optimised for the use of AWS components and libraries. TabbyML supports Rust, Python, JavaScript, TypeScript, Go, Ruby and Java, while Codeium offers support for over 70 languages.

Licence

The licence for GitHub Copilot, Amazon CodeWhisperer and Codeium is proprietary, while TabbyML uses an MIT licence. In addition, you can integrate almost any model into TabbyML, since it works with GGUF-compatible models.

Integrations

All four copilots offer the option of integration into a variety of development environments. Each of them supports Visual Studio (Code) and JetBrains IDEs. TabbyML and Codeium also support self-hosting, while GitHub Copilot and Amazon CodeWhisperer do not.

Fine-tuning

Fine-tuning refers to the process of adapting an existing language model to specific data sets. The model is trained further on company-specific data to teach it the unique characteristics and requirements of that context. The main advantage of this approach is that the model is better able to understand and interpret the company’s specific data, concepts and code. This could include, for example, the company’s in-house libraries, code conventions or business terminology. By training on this data, the model acquires specialised knowledge that enables it to process internal company information and structures more effectively. As a result, a fine-tuned model can deliver better performance and accuracy on tasks that depend on the company’s specific data and requirements.
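Fine-tuning starts with collecting the company-specific data. The sketch below shows one possible preparation step: gathering source files from an in-house repository into a JSONL file with one record per file. The file extensions, the record fields `path` and `content`, and the JSONL format are illustrative assumptions; the format actually required depends on the training tool you use.

```python
import json
from pathlib import Path

# File extensions to include; an illustrative choice, not a requirement.
SOURCE_SUFFIXES = {".py", ".java", ".ts"}


def collect_training_records(repo_root: Path) -> list[dict[str, str]]:
    """Read every matching source file below repo_root into a list of records."""
    records = []
    for path in sorted(repo_root.rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            records.append({
                # Store the path relative to the repository root, with forward slashes.
                "path": path.relative_to(repo_root).as_posix(),
                "content": path.read_text(encoding="utf-8", errors="ignore"),
            })
    return records


def write_jsonl(records: list[dict[str, str]], output_file: Path) -> None:
    """Write one JSON object per line, a common input format for training scripts."""
    with output_file.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
```

A typical call would be `write_jsonl(collect_training_records(Path("my-repo")), Path("train.jsonl"))`, where both paths are placeholders. Before real training, such a data set would also need to be checked for secrets and licence-restricted code.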

Higher efficiency and productivity

It is important to know how an assistant can be deployed to increase efficiency and productivity. GitHub has conducted experiments in-house, with the primary goal of promoting their product. Nevertheless, the results are very interesting.

Source: https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/

Here is an excerpt from a survey conducted by GitHub on the benefits of using Copilot for developers. It is interesting to note that the main benefit from the perspective of developers is much the same as it is for healthcare professionals in relation to digitalisation and the streamlining of reporting and billing processes. In other words, it does away with monotonous, repetitive tasks, which presumably also require the user to search for the same information time and again. This would appear to increase the speed of work and allow users to focus fully on a highly efficient workflow. This makes it possible to devote their full attention to solving interesting and important problems.

Source: https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/

GitHub also ran an experiment with two groups of developers: one group used GitHub Copilot and a control group did not. The success rate for developers who used Copilot was only slightly higher (78 per cent) than for those who did not (70 per cent), but Copilot users solved their tasks 55 per cent faster. This is quite a stark contrast that highlights the potential added value that assistants like this can offer in software development.
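The two figures measure different things, which a short calculation makes clear. The sketch below assumes that "55 per cent faster" means 55 per cent less time needed for the task; the baseline of 160 minutes is a hypothetical value for illustration and is not taken from the text above.

```python
def time_with_assistant(baseline_minutes: float, reduction_percent: int = 55) -> float:
    """Time needed if the assistant cuts the baseline time by reduction_percent."""
    return baseline_minutes * (100 - reduction_percent) / 100


def success_rate_gain(with_assistant: int = 78, without: int = 70) -> int:
    """Difference between the two success rates, in percentage points."""
    return with_assistant - without


if __name__ == "__main__":
    # Hypothetical baseline: a task that takes 160 minutes without an assistant.
    print(time_with_assistant(160))  # 72.0
    print(success_rate_gain())       # 8
```

Under these assumptions, a task that takes 160 minutes without an assistant would take 72 minutes with one, while the success rate rises by only 8 percentage points.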

Outlook

The future of software development looks very promising thanks to the continuous improvement of AI-supported coding assistants. These tools are likely to keep gaining more advanced and more useful features that help developers work more efficiently and write better code. At the same time, it is important to pay close attention to issues of privacy and data protection. As cloud-based services gain popularity, having control over your own data is becoming more and more important, so tools such as TabbyML that support self-hosting could become more popular. You can also expect more open-source alternatives to emerge that offer greater control and flexibility. This could lead to healthy competition that promotes innovation and reduces costs for users. At the end of the day, which coding assistant is best for an individual developer depends on their specific needs and requirements. It will be exciting to see how this technology evolves and what influence it will have on software development going forward.

Author Marc Mezger

Marc Fabian Mezger is an AI Specialist Consultant specializing in Medical Deep Learning, Computer Vision and Drift. In his current role at the AI & Data Science Competence Center, he is responsible for advising customers on AI solutions and their implementation. He has extensive knowledge in Machine and Deep Learning.

Category:	Methodology
Tags:	Programming Software Development