Our Core Values:
• Users come first
• Build a better product, not just different
• Do less but get more done
• Always be learning
High-quality data preparation adds enormous value to machine learning-based products. Improving the quality of existing data is usually more effective than tweaking the machine learning models. In this position, you will contribute to a variety of initiatives and technologies related to data preparation management. You will also work with our data annotators and vendors.
What you will achieve in this role
- Own the data labeling, data annotation, and crowdsourcing pipelines
- Scope and define strategies for robust data collection and annotation projects at scale
- Train teams from vendors or external platforms such as Amazon Mechanical Turk, Appen, and Scale AI
- Implement and execute quality assurance processes
- Ensure the annotators meet their obligations under the service level agreement
- Track and report progress on the data collection and annotation projects on a recurring basis
- Continuously improve tooling and implement scalable solutions
- Share your knowledge and experience with the rest of the team
What you need to be successful
- Excellent written and verbal communication skills, organizational skills, and highly developed interpersonal skills, with fluency in English and Mandarin
- Capable of leveraging technology to improve and scale daily workflow
- Strong passion for data quality and operational efficiency
- Strong computer skills, including Linux command-line basics; you should be the person your relatives and friends go to when they have computer problems (e.g. Android, iPad, and Mac)
- Experience managing vendors (e.g. negotiating with vendors, preparing contracts)
What else would help you, but is not required
- Know at least one of the data analysis languages: SQL, Python, Java, or R
- Perform quality checks on TB-scale data sets (e.g. using cloud services and Spark)
- Manage data sets in a systematic manner (e.g. data versioning)
- Store the data sets effectively for retrieval (e.g. in a columnar store or files)
This will be a 6-month contract with the possibility of conversion to a permanent role.
• You’ll receive competitive compensation and meaningful equity, along with a chance to make a significant contribution to a product people already love.
• Most of our positions are eligible for remote work, provided you have at least 3 hours of overlap with the team in the office every weekday between 10 AM and 6 PM. Please indicate your preference in your application form.
• You’re also welcome to join us in our Hong Kong or London office; we sponsor visas and relocation.
• We take care of you and your loved ones with medical insurance and flexible working hours, including two optional work-from-home days!
• Join our best company tradition, the annual off-site. Check out our pictures from team outings and more on our Instagram.
Before you apply, please check if any restrictions apply in terms of time zone or country.
This job has a geo-restriction in place: European/Asian time zones.