Data Collection Research Software Engineer – GitHub
Software Developer
GitHub (501+ Employees, 90% 2 Yr Employee Growth Rate)
29% 1-Year Employee Growth Rate | 90% 2-Year Employee Growth Rate | $350M Venture Funding
Location: Remote – Global
Data Collection Research/Software Engineer
GitHub is seeking a research/software engineer with expertise in programming languages and/or software engineering to join the Copilot team as part of GitHub Next. GitHub Next aims to be a meeting place within GitHub for experimentation with new ideas, and for setting the agenda for GitHub’s product several years in advance. Next has a small number of permanent research staff, and this position is part of the core Next team.
This engineer will collaborate with OpenAI and Microsoft Research to collect and process large-scale data sets to improve the OpenAI Codex model that powers Copilot. The ideal candidate will have experience with mining software repositories, large-scale program analysis, and/or creating benchmark sets for machine learning for code tasks.
Responsibilities:
- Collaborate with OpenAI to improve the quality of the Codex code synthesis model.
- Undertake short- and medium-term research projects in the area of code synthesis, and ship improvements to the production model.
- Participate in all activities of GitHub Next: organizing webinar series, evaluating project proposals, and disseminating research results.
Minimum Qualifications:
- Ability to do innovative research on one of the following topics: mining software repositories, program analysis (static or dynamic), program synthesis, machine learning for code.
- 3+ years of experience building developer tools in production.
- Inclination to prototype quickly and to decide fast when an experiment has failed.
- A creative mindset and good practical skills are more important than formal experience.
Preferred Qualifications:
- PhD in computer science or related field, or other evidence of the ability to do independent research.
- Knowledge of Python or JavaScript and their ecosystems, or the ability to acquire such knowledge quickly.
- Experience analyzing and/or mining large software repositories.
- Ability to communicate complex ideas clearly, both in spoken and written form, for expert as well as novice audiences.
- Interest in modern AI technologies and program synthesis in particular.