Abstract: Large Language Models (LLMs) are increasingly used to automate software development, yet most prior evaluations focus on functional correctness or high-level languages such as Python. As one ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results