Data Science and Generative AI Specialist

Ford Motor

Chennai

Not disclosed

Work from Office

Full Time

Min. 2 years

Job Details

Job Description

Data Science/Gen AI Specialist

You'll be working alongside leading technical experts from around the world on a variety of products involving sequence/token classification, QA/chatbots, translation, semantic search, and summarization, among others.

  • Design NLP/LLM/GenAI applications/products by following robust coding practices.
  • Explore SoTA models/techniques so that they can be applied to automotive industry use cases.
  • Conduct ML experiments to train models and run inference; where needed, build models that meet memory and latency constraints.
  • Deploy REST APIs or a minimal UI for NLP applications using Docker and Kubernetes tools.
  • Showcase NLP/LLM/GenAI applications to users in the best way possible through web frameworks (Dash, Plotly, Streamlit, etc.).
  • Consolidate multiple bots into super apps using multimodal LLMs.
  • Develop agentic workflows using AutoGen, Agentbuilder, and LangGraph.
  • Build modular AI/ML products that can be consumed at scale.

Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Mathematics, or Science.

Completion of modern NLP/LLM courses or participation in open competitions is also welcome.

Technical Requirements:

Soft Skills

  • Strong communication skills and excellent teamwork with team members across geographies through Git, Slack, email, and calls.

GenAI Skills

  • Experience with LLMs such as PaLM and GPT-4, and with open-source models such as Mistral.
  • Experience working through the complete lifecycle of Gen AI model development, from training and testing to deployment and performance monitoring.
  • Experience developing and maintaining AI pipelines across multiple modalities such as text, image, and audio.
  • Have implemented real-world chatbots or conversational agents at scale that handle different data sources.
  • Experience in developing image generation/translation tools using latent diffusion models such as Stable Diffusion or InstructPix2Pix.
  • Expertise in handling large-scale structured and unstructured data.
  • Experience efficiently handling large-scale generative AI datasets and outputs.

ML/DL Skills:

  • Strong familiarity with DL theory and practice in NLP applications.
  • Comfortable coding with Hugging Face, LangChain, Chainlit, TensorFlow and/or PyTorch, scikit-learn, NumPy, and pandas.
  • Comfortable using two or more open-source NLP modules such as spaCy, TorchText, fastai.text, farm-haystack, and others.

NLP Skills

  • Knowledge of fundamental text data processing (such as use of regex, token/word analysis, spelling correction/noise reduction in text, segmenting noisy or unfamiliar sentences/phrases at the right places, deriving insights from clustering, etc.).
  • Have implemented real-world fine-tuned BERT or other transformer models (sequence classification, NER, or QA), from data preparation, model creation, and inference through deployment.

Python Project Management Skills

  • Familiarity with Docker tools and pipenv/conda/poetry environments.
  • Comfortable following Python project management best practices (use of setup.py, logging, pytest, relative module imports, Sphinx docs, etc.).
  • Familiarity with GitHub (clone, fetch, pull/push, raising issues and PRs, etc.).

Cloud Skills and Computing:

  • Use of GCP services such as BigQuery, Cloud Functions, Cloud Run, Cloud Build, and Vertex AI.
  • Good working knowledge of other open-source packages for benchmarking and deriving summaries.
  • Experience using GPUs/CPUs on cloud and on-prem infrastructure.
  • Ability to leverage cloud platforms for data engineering, big data, and ML needs.

Deployment Skills:

  • Use of Docker (experience with experimental Docker features, docker-compose, etc.).
  • Familiarity with orchestration tools such as Airflow and Kubeflow.
  • Experience with CI/CD and infrastructure-as-code tools such as Terraform.
  • Kubernetes or any other container orchestration tool, with experience in Helm, Argo Workflows, etc.
  • Ability to develop APIs using compliant, ethical, secure, and safe AI tools.

UI

  • Good UI skills to visualize and build better applications using Gradio, Dash, Streamlit, React, Django, etc.
  • Deeper understanding of JavaScript, CSS, Angular, HTML, etc. is a plus.

Miscellaneous Skills:

Data Engineering: 

  • Ability to perform distributed computing (specifically parallelism and scalability in data processing, modeling, and inference through Spark, Dask, RAPIDS AI, or RAPIDS cuDF).
  • Ability to build Python-based APIs (e.g., using FastAPI, Flask, or Django).
  • Experience with Elasticsearch, Apache Solr, and vector databases is a plus.

Experience Level

Mid Level

Job role

Work location

Chennai, Tamil Nadu, India

Department

Data Science & Analytics

Role / Category

Data Science & Machine Learning

Employment type

Full Time

Shift

Day Shift

Job requirements

Experience

Min. 2 years

About company

Name

Ford Motor

Job posted by Ford Motor
