Mario Jiménez Gutiérrez

Senior Data Scientist & AI & Machine Learning Engineer

I am a Senior Data Scientist & AI & Machine Learning Engineer with a strong academic background and hands-on experience delivering high-impact AI solutions in industry. I graduated from the elite dual degree program in Mathematics and Computer Science at Complutense University of Madrid and later completed a Master's in Artificial Intelligence Research at UNED.

Throughout my career, I have worked at leading organizations such as Boston Consulting Group and TomTom, designing and deploying end-to-end AI systems, from data engineering and experimentation to model development, evaluation, and production. My work spans deep learning, generative and agentic AI, optimization, forecasting, and intelligent data systems, always with a clear focus on creating measurable business impact.

I enjoy tackling complex problems and translating real-world challenges into data-driven solutions. I am known for my analytical rigor, technical depth, and adaptability, as well as my ability to collaborate effectively in diverse teams and communicate technical concepts clearly to non-technical stakeholders.

I am passionate about applied research and cutting-edge AI technologies, and I am eager to keep contributing to areas such as GenAI, agentic AI, MLOps, deep learning, and cloud-based AI systems.

Experience

Senior Data Scientist & AI & Machine Learning Engineer

TomTom

Nov 2024 — Present

Architected benchmarking frameworks using Kedro, Spark, and Azure to assess internal and competitor map quality across POIs, addresses, and ADAS data, supporting strategic client initiatives with multi-million-dollar impact.
Built a confidence scoring model (PySpark, XGBoost) that reduced POI superfluousness metrics by 90%.
Fine-tuned PyTorch LLMs for POI categorization using Hugging Face and MLflow on Databricks.
Improved entity-matching F1 score by 20% through graph-based clustering and advanced feature engineering.
Developed multimodal agentic AI systems for ground-truth data generation and validation.
Contributed to team learning initiatives and to the team’s software development guidelines.

Data Scientist & Machine Learning Engineer

Boston Consulting Group (BCG)

Jul 2022 — Nov 2024

Developed an optimization model for a petrochemical company in the Middle East, optimizing the end-to-end value chain and performing what-if analyses that generated an annual economic impact of +$10M.
Implemented demand forecasting models with XGBoost across multiple clients, improving existing model performance by 20%.
Built pricing forecasting models using regression and Transformer architectures, improving prediction accuracy by 12%, and applied reinforcement learning to optimize pricing strategy, yielding an 8% margin uplift.
Optimized shipping schedules using evolutionary algorithms, reducing delivery delays by 30% and improving fleet utilization by 22%.
Developed a RAG-based industrial chatbot that answers internal users' queries by retrieving relevant company documentation, reducing manual query handling by 50%.
Delivered data-driven insights to senior client stakeholders, informing strategic decisions across pricing, operations, and customer management.

Education

Master's in Artificial Intelligence Research

UNED

2023 — 2025

GPA: 91.2%, Thesis: 90%, 66 ECTS

Completed 66 ECTS (vs. 60 standard), covering advanced topics in Deep Learning, NLP, Computer Vision, Reinforcement Learning, Generative Models, Metaheuristics, and Graph/Probabilistic Models.

Thesis: RL for hyperparameter control in population-based metaheuristics (SIMDA Research Group).

Elite Dual Degree Program — 360 ECTS (vs. 240 standard)

National admission rank: Top 2 in Spain (GPA 13.66/14)

Bachelor of Mathematics

Complutense University of Madrid

Sept 2017 — Jun 2022

GPA: 83.1%, Thesis: 98%

Relevant coursework: Problem Solving, AI, Advanced Analysis, Algebra, Geometry, Statistics, Optimization

Bachelor of Computer Science

Complutense University of Madrid

Sept 2017 — Jun 2022

GPA: 83.1%, Thesis: 90%

Relevant coursework: Object-Oriented Programming, AI, Database Management, Concurrency, Operating Systems

Articles

Feb 2026

Agentic RAG and GraphRAG (2026)

Advanced retrieval-augmented generation techniques focused on agentic retrieval and graph-based approaches to improve reasoning and structured knowledge use in agents.

Feb 2026

Retrieval-Augmented Generation (RAG) (2026)

How RAG combines external information retrieval with generative models to ground outputs in real data for better accuracy and relevance.

Jan 2026

AI Agent Observability

Techniques for observing AI agents in production, including logging, metrics, traces, and debugging frameworks.

Jan 2026

Building a Production MCP Server: Real-World Lessons from Exposing Spanish Government Data

Real-world insights and challenges from building a production MCP server to expose government data reliably.

Jan 2026

How to Evaluate AI Agents: End-to-End Metrics, Tool Correctness, and Failure Modes

A comprehensive guide to evaluating AI agents, including full-pipeline performance metrics and detection of common failure modes.

Jan 2026

Feedback Loops for AI Agents

How to design feedback mechanisms so AI agents improve over time with automated and human feedback signals.

Jan 2026

Context Engineering for AI Agents (2026)

How to structure and optimize context for agent reasoning to improve quality and cost control.

Jan 2026

Memory Engineering for AI Agents: How to Build Real Long-Term Memory

A systems-oriented exploration of real long-term memory design for agents that goes beyond raw context windows.

Jan 2026

AI Agent Tools (2026)

Overview of tools and frameworks used for building and orchestrating AI agents in 2026.

Jan 2026

Claude Code Tool Best Practices: CLI, Slash Commands, and MCP

A comparison of the different ways to interact with Claude Code tools and when to choose each interface.

Jan 2026

Claude Code Best Practices

General best practices for working with Claude Code for reliability and efficiency.

Jan 2026

Multi-Agent System Patterns: Designing Agentic Architectures

An architectural guide to multi-agent systems, covering coordination, execution, interaction, and deeper system design.

Jan 2026

Prompt Engineering Basics (2026): A Practical Guide

Practical techniques for writing reliable prompts, treating prompts as clear specifications with structured constraints.

Jan 2026

Advanced Prompt Engineering (2026)

Advanced prompt engineering techniques beyond the basics, covering complex patterns for reliable and high-quality LLM outputs.

Jan 2026

Control Loops for Agentic AI: HITL & AITL Design Patterns

Patterns for control loops such as Human-in-the-Loop (HITL) and AI-in-the-Loop (AITL) in production-grade agent systems.

Dec 2025

Hyperparameter Optimization with Optuna

Technical guide on using Optuna for hyperparameter tuning, including samplers and key design trade-offs.

Dec 2025

Building a Practical Framework for Supervised Tabular ML

A practical end-to-end framework for supervised machine learning on tabular data, covering preprocessing, model selection, and evaluation.

Certifications

53 professional certifications from world-leading institutions

Stanford University

Machine Learning Specialization

Comprehensive specialization covering supervised learning, unsupervised learning, recommender systems, and reinforcement learning fundamentals.

Stanford University

Supervised Machine Learning: Regression and Classification

Core foundations of supervised learning including linear regression, logistic regression, and gradient descent optimization.

Stanford University

Advanced Learning Algorithms

Neural networks, decision trees, ensemble methods, and best practices for training and evaluating ML models.

Stanford University

Unsupervised Learning, Recommenders, Reinforcement Learning

Clustering, anomaly detection, collaborative filtering, and introduction to reinforcement learning concepts.

Stanford University

Probabilistic Graphical Models Specialization

Complete specialization in PGMs covering representation, inference, and learning of Bayesian and Markov networks.

Stanford University

Probabilistic Graphical Models 1: Representation

Bayesian networks, Markov random fields, and template models for representing complex probability distributions.

Stanford University

Probabilistic Graphical Models 2: Inference

Exact and approximate inference algorithms for probabilistic graphical models, including message passing and sampling.

Stanford University

Probabilistic Graphical Models 3: Learning

Parameter and structure learning in probabilistic graphical models from observed and partially observed data.

DeepLearning.AI

Deep Learning Specialization

Comprehensive deep learning specialization covering neural networks, optimization, CNNs, sequence models, and structuring ML projects.

DeepLearning.AI

Neural Networks and Deep Learning

Foundations of neural networks, forward/backward propagation, and building deep neural network architectures.

DeepLearning.AI

Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization

Techniques for improving neural network performance including batch normalization, dropout, and advanced optimizers.

DeepLearning.AI

Structuring Machine Learning Projects

Best practices for structuring ML projects, error analysis, and strategies for handling mismatched data distributions.

DeepLearning.AI

Convolutional Neural Networks

CNN architectures, object detection, face recognition, and neural style transfer using convolutional neural networks.

DeepLearning.AI

Sequence Models

RNNs, LSTMs, GRUs, attention mechanisms, and Transformer architectures for sequence-to-sequence modeling.

DeepLearning.AI

TensorFlow: Advanced Techniques Specialization

Advanced TensorFlow specialization covering custom models, distributed training, computer vision, and generative deep learning.

DeepLearning.AI

Custom Models, Layers, and Loss Functions with TensorFlow

Building custom layers, loss functions, and model architectures using TensorFlow's flexible APIs.

DeepLearning.AI

Custom and Distributed Training with TensorFlow

Custom training loops, gradient tape, and distributed training strategies for large-scale deep learning.

DeepLearning.AI

Advanced Computer Vision with TensorFlow

Object detection, image segmentation, and model interpretability using advanced TensorFlow computer vision techniques.

DeepLearning.AI

Generative Deep Learning with TensorFlow

Variational autoencoders, GANs, and neural style transfer for generating new content with deep learning.

DeepLearning.AI

Introduction to Data Engineering

Foundations of data engineering including data lifecycle, architecture patterns, and modern data stack components.

DeepLearning.AI

Source Systems, Data Ingestion, and Pipelines

Designing data ingestion pipelines, working with source systems, and building reliable data workflows.

DeepLearning.AI

Data Storage and Queries

Data storage solutions, query optimization, and database technologies for modern data engineering.

Duke University

MLOps | Machine Learning Operations Specialization

End-to-end MLOps specialization covering DevOps, DataOps, MLOps tools, and cloud ML platforms for production ML systems.

Duke University

DevOps, DataOps, MLOps

Principles and practices of DevOps, DataOps, and MLOps for building robust and automated data and ML pipelines.

Duke University

MLOps Tools: MLflow and Hugging Face

Hands-on MLOps with MLflow for experiment tracking and model registry, and Hugging Face for model deployment.

Duke University

MLOps Platforms: Amazon SageMaker and Azure ML

Deploying and managing ML models on AWS SageMaker and Azure ML cloud platforms for production use.

Duke University

Virtualization, Docker, and Kubernetes for Data Engineering

Container orchestration with Docker and Kubernetes for scalable data engineering infrastructure.

Duke University

Advanced Data Engineering

Advanced techniques for building scalable, reliable, and efficient data engineering systems.

AWS

AWS Fundamentals Specialization

Core AWS services and cloud architecture fundamentals including compute, storage, networking, and security.

AWS

AWS Cloud Technical Essentials

Foundational AWS cloud services, infrastructure, and best practices for cloud-based application deployment.

AWS

Migrating to the AWS Cloud

Cloud migration strategies and best practices for moving workloads to AWS infrastructure.

AWS

Architecting Solutions on AWS

Designing scalable, resilient, and cost-efficient solutions using AWS services and architectural patterns.

AWS

AWS Cloud Solutions Architect

Advanced cloud architecture design principles for highly available and fault-tolerant systems on AWS.

AWS

Introduction to Designing Data Lakes on AWS

Data lake design patterns using AWS services for scalable and cost-effective data storage and analytics.

AWS

Generative AI with Large Language Models

LLM lifecycle from pre-training through fine-tuning and deployment, including RLHF and prompt engineering.

IBM

Generative AI for Software Developers Specialization

Specialization on applying generative AI in software development, from prompt engineering to building AI-powered applications.

IBM

Generative AI: Introduction and Applications

Introduction to generative AI concepts, models, and real-world applications across industries.

IBM

Generative AI: Prompt Engineering Basics

Fundamentals of prompt engineering for effective interaction with large language models.

IBM

Generative AI: Elevate your Software Development Career

Leveraging generative AI tools and techniques to enhance software development productivity and code quality.

IBM

Introduction to Big Data with Spark and Hadoop

Big data fundamentals using Apache Spark and Hadoop for distributed data processing and analytics.

IBM

Machine Learning with Apache Spark

Building scalable machine learning pipelines using Apache Spark MLlib for classification, regression, and clustering.

IBM

Introduction to NoSQL Databases

NoSQL database concepts, types (document, key-value, graph, column), and use cases for modern applications.

Amii / University of Alberta

Reinforcement Learning Specialization

Complete RL specialization from fundamentals through function approximation to building a complete RL system.

University of Alberta

Fundamentals of Reinforcement Learning

Core RL concepts including MDPs, dynamic programming, and the exploration-exploitation trade-off.

University of Alberta

Sample-based Learning Methods

Monte Carlo methods, temporal difference learning, and planning with sample-based approaches in RL.

Amii

Prediction and Control with Function Approximation

Function approximation methods in RL including linear and neural network-based value function approximation.

Amii

A Complete Reinforcement Learning System (Capstone)

Capstone project building a complete RL system integrating all concepts from the specialization.

Vanderbilt University

AI Agents and Agentic AI with Python & Generative AI

Building AI agents with Python, covering agentic AI patterns, tool use, and generative AI integration.

Google

The Path to Insights: Data Models and Pipelines

Data modeling and pipeline design for transforming raw data into actionable business insights.

Coursera

Selecting the Right LLM with Hugging Face

Evaluating and selecting the right large language model for specific use cases using Hugging Face tools.

Coursera

Advanced Relational Database and SQL

Advanced SQL queries, database optimization, and relational database design for data engineering.

Coursera

Spark, Hadoop, and Snowflake for Data Engineering

Distributed data processing and modern data warehousing with Spark, Hadoop, and Snowflake.

Coursera

Databricks to Local LLMs

Deploying and running LLMs from Databricks to local environments for inference and fine-tuning.

Skills

AI & Machine Learning

Deep Learning, Reinforcement Learning, Generative AI (LLMs, RAGs, DSPy), Agentic AI, Optimization & Metaheuristics, NLP, Computer Vision, Forecasting / Time Series, Clustering & Unsupervised Learning, Active Learning, Hyperparameter Optimization (Optuna), Model Explainability, Graph RAG / Knowledge Graphs, Multi-Agent Orchestration, Tool Calling & Persistent Memory, LLM Evaluation & Observability, Prompt & Reasoning Optimization

Data Engineering & Frameworks

Kedro, Spark / PySpark, Databricks, Airflow, Azure Data Factory, Databricks Asset Bundles, Snowflake, Hadoop, ETL Workflows, PySpark MLlib

Cloud & MLOps

AWS (SageMaker, ECR, Data Lakes), Azure, MLflow, Docker, Kubernetes, CI/CD, GitHub Actions, MLOps Pipelines

Programming & Libraries

Python, Pandas, SciPy, Scikit-learn, PyTorch, TensorFlow / Keras, XGBoost, Hugging Face, LangChain / LangGraph, DSPy, Google OR-Tools, Gurobi, Java, C++, SQL, PySpark

Tools & Platforms

Power BI, Databricks, Jupyter Notebooks, Git, APIs & Databases Integration

Soft Skills & Languages

Strategic Thinking, Hypothesis-Based Problem Solving, First-Principles Reasoning, Value Identification & Quantification, Technical Communication, Mentoring & Mentorship, Best Practices Development, International Team Collaboration, Attention to Detail

Languages

Spanish (Native), English (C1), French (A2)