Data Scientist
Kotak Mahindra Bank Limited
Job Description
Data Science I-HO & SUPPORT-CVM CoE - Corporate Centre of Excellence
Job Description – Data Scientist I
Role Overview
We are seeking a highly motivated Data Scientist I with strong foundational knowledge in machine learning, modern AI techniques, and emerging Large Language Model (LLM) capabilities. The role requires hands‑on experience with model development, fine‑tuning, evaluation, and adherence to Responsible AI and regulatory guidelines (RBI/MeitY). You will collaborate with cross‑functional teams to build scalable, secure, and explainable AI systems that drive business value.
Key Responsibilities
1. Machine Learning & Statistical Modeling
- Develop and maintain ML models including propensity models, classification, regression, and clustering.
- Perform data cleaning, feature engineering, and exploratory data analysis.
- Build models using Python, SQL, and leading ML frameworks (TensorFlow, PyTorch, scikit‑learn).
2. Generative AI & LLMs
- Work with Large Language Models (LLMs) and Small Language Models (SLMs) for enterprise use cases.
- Apply fine‑tuning, distillation, and model optimization techniques to adapt models to business needs.
- Create and manage synthetic data pipelines for training and evaluation.
3. AI Agents & Workflows
- Assist in designing AI agents and agentic workflows to automate decision-making processes.
- Contribute to building AI-driven orchestration systems across business workflows.
4. Model Evaluation & Guardrails
- Implement LLM-as-a-Judge, evaluation frameworks, prompt tests, and model benchmarking.
- Apply model risk assessment and mitigation strategies as per enterprise AI governance.
- Implement security guardrails, including DLP controls and content safety filters.
5. Responsible AI & Compliance
- Ensure all models comply with:
  - RBI Framework for Responsible and Ethical Enablement of Artificial Intelligence (FREE-AI) guidelines
  - MeitY AI & Data Governance Guidelines
- Integrate Privacy Preservation, Explainable AI (XAI), and Responsible AI techniques into model workflows.
6. Engineering & MLOps
- Participate in AIOps/MLOps processes: model deployment, monitoring, versioning, CI/CD.
- Document experiments, track model performance, and support reproducible ML pipelines.
7. Data Engineering & Domain Collaboration
- Work with structured and unstructured datasets; experience with geospatial datasets is a plus.
- Collaborate closely with product, engineering, analytics, and compliance teams to translate business problems into ML solutions.
Required Skills
- Strong proficiency in Python, ML libraries (scikit‑learn, pandas, NumPy), and deep learning frameworks.
- Knowledge of LLMs, SLMs, prompt engineering, and RAG concepts.
- Familiarity with fine-tuning, quantization, pruning, and distillation methods.
- Understanding of model risks, adversarial ML, and mitigation strategies.
- Experience with AI/ML security, guardrails, and DLP principles.
- Understanding of XAI tools (SHAP, LIME, Integrated Gradients).
- Sound knowledge of Responsible AI and privacy techniques (differential privacy, k-anonymity).
- Basic familiarity with AIOps/MLOps, Docker, Git, MLflow, and Airflow (preferred).
- Exposure to geospatial analytics (nice to have).
Educational Background
- Bachelor’s/Master’s in Computer Science, Data Science, Mathematics, Statistics, or related fields.
Experience Required
- 1–2 years of hands-on experience in ML/AI projects, internships, research, or capstone projects.
Nice-to-Have
- Experience with LangChain, LlamaIndex, or other agent frameworks.
- Participation in AI/ML competitions (Kaggle, hackathons).
- Knowledge of BFSI domain analytics (an advantage, but not mandatory).
Experience Level
Senior Level