Job Title: Linguist III (Indonesian)
Location: Remote
Duration: 7 Months
Job Description:
- 0-3 years of experience.
- Must have a graduate degree in Linguistics
- Must be a native speaker of a non-English language (preferably Indonesian) with a high level of proficiency in another Austronesian language, plus broad knowledge of other languages in the same family.
Top 3 must-have HARD skills:
- Must be a native speaker of a non-English language (preferably Indonesian) with a high level of proficiency in another Austronesian language, plus broad knowledge of other languages in the same family.
- Perform linguistic error analysis of machine translations and identify the most frequent and severe error categories.
- Strong skills in pattern recognition, cross-functional communication, and multitasking
- Experience with Python
Good to have skills:
- Experience creating and/or maintaining specialized lexical resources (e.g., profanity dictionaries).
Soft Skills:
- Ability to work independently through ambiguous requests, based on priorities established by CWAM, and to perform under pressure. Able to work cross-functionally.
Main Duties:
- Perform linguistic analyses on large datasets.
- Perform linguistic error analysis of AI model outputs, determining what the most frequent and severe error categories are.
- Write and revise guidelines for human annotation and other AI projects, including but not limited to translation tasks.
- Conduct typological and sociolinguistic research on a large number of languages, highlighting their similarities and differences.
- Perform linguistic analyses for Responsible AI (toxic language, hate speech, gender bias and other cultural biases) in massively multilingual settings.
- Conduct linguistic literature reviews on various NLP-adjacent topics, and summarize findings.
- Compare the quality of deliveries between vendors, identify error patterns, and provide actionable feedback.
- Provide information or guidance on any aspect of linguistic knowledge (typology, morpho-syntax, sociolinguistics, classification, phonetics/phonology, pragmatics, etc.).
- Reach out to and collaborate with native speakers in various languages.
- Communicate results of linguistic analyses to engineers and research scientists.
Skills:
- Must have strong written and spoken communication skills, especially business and research communication.
- Must be a native speaker of a non-English language (preferably Indonesian) with a high level of proficiency in another Austronesian language, plus broad knowledge of other languages in the same family.
- Working knowledge of other languages is a plus. Proficiency in a low-resource language is valued.
- Must be able to code in Python (required) and query databases using SQL; other coding languages used for data analysis are a plus.
- Must be able to independently work through complex requests and perform under pressure.
- Strong ability to work independently, prioritize, plan, and track work, as well as report progress.
- Education or training in the basics of project management is a plus.
- Self-motivation is a must.
- Working knowledge of international language-classification standards is valued.
Education:
- Graduate degree in Linguistics or related field is a must; PhD is a plus
- A background or specialization in corpus linguistics is a plus
- Experience with fieldwork is a plus
- A graduate degree in Literature or English is not an appropriate substitute
- A degree in Computer Science with a specialization in NLP is not an appropriate substitute
- Must have a very firm grasp of the following linguistic fields: language typology, syntax, morphology, sociolinguistics (especially dialectology and discourse analysis), corpus linguistics, writing systems, pragmatics, phonology.
- Must have some experience with applying basic Natural Language Processing techniques.
Experience:
- Years of experience: 0-3
- Experience working cross-functionally
- Experience collaborating with machine learning, NLP, or software engineers, or data scientists
- Experience contributing to research papers
- Important: Preferably no known conflicts of interest in the fields of machine translation, ASR, TTS, or LLM research (as FAIR Linguists need to contribute to research papers)
Story Behind the Need - Business Group & Key Projects:
- The FAIR C&L Linguistics team provides linguistic expertise related to catastrophic translations, error analysis, and knowledge base.
Compelling Story & Candidate Value Proposition:
- FAIR's mission could be summarized for the candidates as:
- Research whatever the "next big AI thing" is
- Try to open source as much of it as possible
Performance Measurement:
- We have rough KPIs for recurring tasks (such as standard guideline writing or spot-checking of translation deliveries), but much of what we do in research is bespoke, so performance is harder to measure with certainty.