This term came to mind after I had numerous workloads running in production and could no longer guarantee the SLA of my machine learning workloads. I felt I had eventually run into a pipeline jungle problem.

What is a pipeline jungle?

Machine learning workloads are powered by multiple real-time or batch ingestion jobs and feature engineering jobs, most of which are likely developed and owned by various departments and multiple engineering teams. The data in these jobs is generally sourced from a variety of data sources: data lakes, data warehouses and events. The responsibility and ownership of the correctness and completeness of…


There are not many articles online that help with preparing for the CCDAK exam. I have put together a quick article on my preparation journey and the various materials I studied to clear the exam.

Exam Preparation Guide

1. Video tutorials by Stephane Maarek

There are 6 video tutorials in total; the link is below. The basic concepts are explained very well, with examples.

2. CCDAK sample exams by Stephane Maarek

There are 3 exams in total, with 50 questions each. The sample exams are very useful, and the real exam covers many concepts from them. Please study…

Suteja Kanuri

I am all about data.
