Get AI code generation tools to create correct Elixir code, or else

josevalim · February 23, 2023, 9:30am

I believe this was written in a confusing way. The following chart from the article is a bit clearer: those numbers are suggestion acceptance rates. So 46% of the suggestions were accepted, but this is a purely quantitative measure, and it makes sense that the rate is higher for Java, as it has more boilerplate. But here is what I would love to know:

If 54% of the suggestions are rejected, does that mean I need to parse each of those suggestions only to discard it? That would mean that, most of the time, suggestions could be slowing me down.
What is the time taken to accept or reject a suggestion?
Does Copilot track what happens with a suggestion? Maybe it is accepted and then immediately changed or removed because it was wrong?
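
To make the first two questions concrete, here is a rough back-of-the-envelope sketch. The 46% acceptance rate comes from the article; the timings and function names are made-up assumptions for illustration, not measurements. The model also assumes every suggestion costs the same review time whether accepted or rejected, and that only accepted suggestions save typing time.

```python
def net_seconds_saved(accept_rate, review_seconds, typing_seconds_saved):
    """Expected seconds saved per suggestion shown (negative means a net slowdown).

    Every suggestion costs review time; only accepted ones save typing time.
    """
    return accept_rate * typing_seconds_saved - review_seconds


def break_even_accept_rate(review_seconds, typing_seconds_saved):
    """Acceptance rate at which suggestions neither help nor hurt."""
    return review_seconds / typing_seconds_saved


# Hypothetical timings, chosen only for illustration.
print(net_seconds_saved(0.46, review_seconds=3.0, typing_seconds_saved=10.0))
print(break_even_accept_rate(review_seconds=3.0, typing_seconds_saved=10.0))
```

Under this simple model, whether a 46% acceptance rate is a net win depends entirely on the ratio of review time to time saved, which is exactly the number we do not have. It also ignores suggestions that are accepted and then rewritten, which would push the result further down.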

In any case, I believe there are two separate discussions here, and they are getting mixed.

Are the AI tools in a state where we can consider them trustworthy or generally acceptable? To me the answer is no. Besides a huge potential copyright issue with tools like Copilot, which has made some organizations ban certain AI tools altogether, there is still a lot to improve. For example, researchers have found that code generated by OpenAI’s Codex contained security vulnerabilities 40% of the time. However, the tools will improve, as there is a large number of techniques and ideas that still have to be explored and potentially adopted (such as reinforcement learning with static and security analysis!).
That brings us to the second discussion: should we do what is necessary to get Elixir working with more AI tools? To me, the answer is a 100% yes, because the tools will only get better and this is not only about code completion.