The following page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features or functionality remain at the sole discretion of GitLab Inc.
| | |
| --- | --- |
| Stage | Create |
| Group | Code Creation |
| Maturity | viable |
| Content Last Reviewed | 2024-10-15 |
Thanks for visiting this category direction page on Code Suggestions in GitLab. This page belongs to the Code Creation group of the Create stage and is maintained by Jordan Janes (@jordanjanes).
The Code Creation team focuses on helping developers create & maintain code more efficiently. We strive to help GitLab customers deliver value to their customers more quickly through an accelerated software development lifecycle.
We help developers understand, write, test, fix, refactor, and document code.
We believe there’s an opportunity to help developers with both narrowly scoped tasks - such as generating unit tests - and with broadly scoped tasks - such as evaluating and updating all tests across a codebase.
GitLab Duo Code Suggestions helps teams accelerate code creation throughout their software development lifecycle, without sacrificing security, privacy, or enterprise control.
We plan to improve the quality and latency of code suggestions, expand the breadth of customer use cases we support, and ensure our customers have sufficient administrative controls. To make progress towards our vision, our investments are organized into these primary themes:
Code suggestions, and especially inline code completions, have to keep up with the pace of the user. Delays in presenting useful suggestions often result in the user continuing their workflow manually, which limits our opportunity to help customers accelerate their development.
Code generation is often less latency-sensitive, though we still strive to deliver generated code to the user quickly. We often stream responses so the developer can start assessing the results as they arrive, which also improves perceived latency when generating large blocks of code.
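As an illustration of how streaming improves perceived latency, the sketch below (in TypeScript) consumes a streamed generation response chunk by chunk; the endpoint URL and the `appendToEditor` callback are hypothetical placeholders, not part of a documented GitLab API:

```typescript
// Minimal sketch: consume a streamed code-generation response incrementally.
// The endpoint URL and the appendToEditor callback are hypothetical placeholders.
async function streamGeneratedCode(
  prompt: string,
  appendToEditor: (chunk: string) => void,
): Promise<void> {
  const response = await fetch("https://example.invalid/code-generation", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  if (!response.ok || !response.body) {
    throw new Error(`Generation request failed: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  // Append each chunk as soon as it arrives, so large generated blocks
  // become visible (and reviewable) before the full response completes.
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    appendToEditor(decoder.decode(value, { stream: true }));
  }
}
```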
GitLab has customers across the globe, and we’re committed to optimizing latency for all customers. We’ve invested in globally distributed infrastructure, and we prioritize model providers who can mitigate network latency with globally deployed models.
We consider low latency a table-stakes expectation from our customers, and we carefully manage latency tradeoffs when considering larger, more capable models.
Improving the quality of code suggestions is a primary focus. We think of quality as providing useful suggestions, which our users accept to accelerate their workflow. For code completion, this could be 1 line of code that perfectly matches what the user wanted. For code generation, this could be dozens of lines of code, and the user may sometimes edit a few specifics before moving forward. Over time, we want to provide exactly what the user needs.
We plan to make progress on this goal by leveraging context throughout the customer’s codebase, continuing to invest in our internal evaluation suite, and continually assessing new AI models.
Context
Managing and providing relevant context - from dependencies, files, and other content throughout a codebase - is our main tactic for improving code suggestion quality. We can greatly improve the quality of code suggestions by ensuring responses are aware of key dependencies, libraries, and systems. We plan to find the right portions of content, provide them as context to our AI models, and generate a better response. We’ll use a combination of implicit context sources - which require no action from the user - and user-selected context sources.
We’ve made initial progress in this space and plan to broaden the aperture of local context, then iterate towards remote context sources. As we broaden the sources, we’ll improve our logic to rank the most relevant portions of content to be used as context.
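To illustrate the ranking idea, the sketch below scores candidate context chunks by identifier overlap with the code around the cursor and keeps the best matches within a size budget; the chunk shape, scoring function, and character budget are assumptions for illustration, not a description of GitLab's implementation:

```typescript
// Illustrative sketch only: rank candidate context chunks by identifier overlap
// with the code surrounding the cursor, then keep the top matches within a budget.
interface ContextChunk {
  path: string;
  content: string;
}

function identifiers(code: string): Set<string> {
  return new Set(code.match(/[A-Za-z_][A-Za-z0-9_]*/g) ?? []);
}

function jaccard(a: Set<string>, b: Set<string>): number {
  const intersection = [...a].filter((token) => b.has(token)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : intersection / union;
}

function rankContext(
  cursorContext: string,
  candidates: ContextChunk[],
  maxChars: number,
): ContextChunk[] {
  const cursorTokens = identifiers(cursorContext);
  const scored = candidates
    .map((chunk) => ({ chunk, score: jaccard(cursorTokens, identifiers(chunk.content)) }))
    .sort((a, b) => b.score - a.score);

  // Greedily fill the context window with the highest-scoring chunks.
  const selected: ContextChunk[] = [];
  let used = 0;
  for (const { chunk } of scored) {
    if (used + chunk.content.length > maxChars) continue;
    selected.push(chunk);
    used += chunk.content.length;
  }
  return selected;
}
```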
Ensuring we provide quality responses is ultimately a table-stakes expectation from our customers. We also need to ensure quality improvements don’t cause latency penalties as we broaden context.
Internal evaluations
We must be confident that we’re improving quality as we broaden Code Suggestions to include more context and support more customer use cases. We use a broad evaluation dataset to internally quantify quality before rolling out changes to our customers. We’ll continue to extend our evaluation dataset by curating and creating test scenarios that will help us confidently assess quality.
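To make this concrete, here is a heavily simplified sketch of gating a change on an offline evaluation dataset; the dataset shape, the `suggest` function, the similarity metric, and the threshold are all illustrative assumptions rather than a description of GitLab's internal evaluation suite:

```typescript
// Simplified sketch of gating a change on an offline evaluation dataset.
// The dataset shape, scoring metric, and threshold are illustrative assumptions.
interface EvalCase {
  prompt: string;   // code leading up to the cursor
  expected: string; // reference completion a reviewer considered correct
}

// Normalized longest-common-prefix similarity: 1.0 means the suggestion starts
// exactly like the reference; 0.0 means no shared prefix at all.
function prefixSimilarity(suggestion: string, expected: string): number {
  const limit = Math.min(suggestion.length, expected.length);
  let shared = 0;
  while (shared < limit && suggestion[shared] === expected[shared]) shared++;
  return expected.length === 0 ? 0 : shared / expected.length;
}

async function evaluate(
  dataset: EvalCase[],
  suggest: (prompt: string) => Promise<string>,
  passingScore = 0.75,
): Promise<boolean> {
  if (dataset.length === 0) return false;
  let total = 0;
  for (const testCase of dataset) {
    const suggestion = await suggest(testCase.prompt);
    total += prefixSimilarity(suggestion, testCase.expected);
  }
  const averageScore = total / dataset.length;
  console.log(`Average similarity: ${averageScore.toFixed(3)}`);
  return averageScore >= passingScore; // only roll out if quality holds
}
```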
Models
We continuously monitor and evaluate the latest AI models for opportunities to improve quality and latency. We provide full transparency to our customers on the models used within GitLab Duo, and we gladly take responsibility for keeping up to date with the latest breakthroughs in AI models and technology. With Duo Code Suggestions, our customers can focus on accelerating their software development lifecycle rather than spending time reviewing the latest AI models.
With the breadth of the GitLab DevSecOps platform, expanding to support more customer goals is a long-term opportunity for differentiation.
Broadly scoped tasks
Today, most of our user interactions center around narrowly scoped tasks. As an example, a user can select portions of code and use Duo Code Suggestions to create tests or document the code. We want to support more broadly scoped tasks to further accelerate our users' workflows. This might include searching an entire codebase to find code that needs improvement, or continuously scanning for bugs and surfacing them to our users. Another example might be helping a developer add a field to an existing API, then updating all queries to match the new schema. These broadly scoped tasks require more context and reasoning, plus the ability to make edits across multiple files and across locations within those files.
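To make the multi-file aspect concrete, one possible representation is a list of line-ranged edits applied across files, as sketched below; the `FileEdit` shape and `applyEdits` helper are hypothetical illustrations of the data such tasks require, not a description of how GitLab will implement them:

```typescript
// Hypothetical sketch: represent a broadly scoped change as edits across files.
interface FileEdit {
  path: string;        // repository-relative file path
  startLine: number;   // 1-based, inclusive
  endLine: number;     // 1-based, inclusive
  replacement: string; // new text for the line range
}

// Apply edits to an in-memory snapshot of file contents.
// Edits within a file are applied bottom-up so earlier line numbers stay valid.
function applyEdits(files: Map<string, string>, edits: FileEdit[]): Map<string, string> {
  const updated = new Map(files);
  const byFile = new Map<string, FileEdit[]>();
  for (const edit of edits) {
    byFile.set(edit.path, [...(byFile.get(edit.path) ?? []), edit]);
  }
  for (const [path, fileEdits] of byFile) {
    const lines = (updated.get(path) ?? "").split("\n");
    for (const edit of [...fileEdits].sort((a, b) => b.startLine - a.startLine)) {
      lines.splice(edit.startLine - 1, edit.endLine - edit.startLine + 1, edit.replacement);
    }
    updated.set(path, lines.join("\n"));
  }
  return updated;
}
```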
We'll also iterate towards automating more end-to-end code creation workflows. This will allow users to describe their goal, review a GitLab-generated implementation plan, then guide the execution and implementation.
Expanding beyond the IDE
Our focus has been helping developers create & maintain code more efficiently within the IDE, and we’ll continue this investment. Broadly, we want to consider opportunities within and outside of the IDE to help customers further accelerate their code creation workflows and overall development lifecycle. This could include code review workflows within the web UI, where we can help a code reviewer understand committed code changes, and suggest further changes or improvements.
Enterprise customers often have greater needs for administrative controls and auditing. We want to ensure Duo Code Suggestions meets our customers’ goals for compliance and administration. This section summarizes a few areas we’ve heard about from customers and is not fully comprehensive.
GitLab Duo has a strict data privacy and data retention policy to ensure customers can be confident in our data protection agreements. Customers don’t need to configure administrative controls to ensure data privacy.
Context and indexing controls
As we broaden context sources, customers may want to exclude specific sources so that Duo Code Suggestions doesn't use them when generating a response. We'll need to extend these controls as we add new context sources.
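Purely as an illustration of what such a control could look like, the sketch below filters candidate context paths against an exclusion list before anything is sent to a model; the pattern format and function names are assumptions, not a documented GitLab setting:

```typescript
// Illustrative sketch: drop excluded paths from candidate context before a request.
// The pattern format (directory prefixes and "*.ext" suffixes) is a simplification,
// not a documented GitLab configuration format.
function isExcluded(path: string, patterns: string[]): boolean {
  return patterns.some((pattern) =>
    pattern.startsWith("*.")
      ? path.endsWith(pattern.slice(1))                       // extension rule, e.g. "*.pem"
      : path === pattern || path.startsWith(`${pattern}/`),   // directory or file rule
  );
}

function filterContextPaths(candidatePaths: string[], exclusions: string[]): string[] {
  return candidatePaths.filter((path) => !isExcluded(path, exclusions));
}

// Example: keep source files, drop secrets and vendored code.
const allowed = filterContextPaths(
  ["src/app.ts", "config/secrets.pem", "vendor/lib/util.ts"],
  ["vendor", "*.pem"],
);
console.log(allowed); // ["src/app.ts"]
```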
Public code attribution & licensing risks
Code suggestion models are trained on large volumes of public code and may produce responses that exactly match public code. There can be legal risks when using public code from a source that isn’t permissively licensed. We want to help customers manage these risks by identifying and surfacing code suggestions that match sources that aren’t permissively licensed.
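One simplified way to detect exact matches is to fingerprint normalized suggestions against an index of known public snippets, as sketched below; the index shape, license metadata, and hashing approach are illustrative assumptions, not a description of GitLab's implementation:

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch: flag suggestions whose normalized content matches an index
// of known public code. The index and license metadata here are hypothetical.
interface PublicCodeMatch {
  sourceUrl: string;
  license: string; // e.g. "GPL-3.0", "MIT"
}

function fingerprint(code: string): string {
  // Normalize whitespace so trivial formatting differences still match.
  const normalized = code
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean)
    .join("\n");
  return createHash("sha256").update(normalized).digest("hex");
}

function checkAttribution(
  suggestion: string,
  publicCodeIndex: Map<string, PublicCodeMatch>,
): PublicCodeMatch | undefined {
  // A hit means the suggestion exactly matches indexed public code, so the user
  // should see the source and its license before accepting the suggestion.
  return publicCodeIndex.get(fingerprint(suggestion));
}
```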
Audit logs
This is an area where we’re gathering more customer input. We want to provide detailed visibility into how and where Duo Code Suggestions are used, while balancing data privacy.
Our main focus will be improving suggestion quality through expanded context. We'll work to expand both system-managed AI context and user-defined AI context.
We plan to make progress by using local context - including specific code chunks within local files, entire files, and entire local repositories. We'll then extend to remote context sources, including entire remote repositories.
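As a minimal sketch of the local-context chunking step, the example below splits a file into overlapping line-based chunks that could later be ranked and packed into a context window; the chunk size and overlap are arbitrary illustrative values, not GitLab defaults:

```typescript
// Minimal sketch: split a local file into overlapping line-based chunks that can
// later be ranked and packed into the model's context window.
interface LocalChunk {
  path: string;
  startLine: number; // 1-based
  content: string;
}

function chunkFile(
  path: string,
  content: string,
  chunkLines = 40,   // illustrative values, not GitLab defaults
  overlapLines = 10,
): LocalChunk[] {
  const lines = content.split("\n");
  const chunks: LocalChunk[] = [];
  const step = chunkLines - overlapLines;
  for (let start = 0; start < lines.length; start += step) {
    chunks.push({
      path,
      startLine: start + 1,
      content: lines.slice(start, start + chunkLines).join("\n"),
    });
    if (start + chunkLines >= lines.length) break; // last window reached the end
  }
  return chunks;
}
```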
What we recently completed
What we are currently working on
What is next for us
What is not planned right now
We're not currently focused on supporting context from remote repositories. Our focus is on expanding context from local content first, then extending to remote content.
People who code:
Success metrics & signals are organized around 2 broad goals:
Increasing developer productivity
Signal: reduced time from task start to task finish
Signal: more coding tasks are automated or accelerated
Improve developer satisfaction
Signal: Developers consistently use code suggestions
Signal: Developers report higher satisfaction and less frustration
Please see the content in our internal handbook.