Mastering Data in Commodities: In Conversation with Petar Todorov

November 20, 2024 | By Petar Todorov and Eva Clarke

Today’s commodities markets are more unpredictable than ever. Geopolitical tensions, rising inflation, and the energy transition are all contributing to the volatility. To manage risks and find new opportunities in this competitive environment, industry players are turning to advanced data and analytics to provide clarity in uncertain times. In this series, we’re talking to leaders in the commodities sector to get their take on how to excel in this new data-driven landscape.

We recently had the pleasure of speaking with Petar Todorov, Director of Data Science at Kpler, a global trade intelligence platform. After completing a PhD in physics and transitioning into data science, Petar entered the maritime and commodities space. Initially hired by Kpler to lead its remote sensing team on the strength of his geospatial expertise, he soon saw his role expand to overseeing the entire data science department.

Petar’s team works on advanced techniques such as computer vision on satellite and drone imagery, time-series modelling, geospatial predictive analytics on ship movements, and anomaly detection. Reflecting on what drives his passion for commodities data, Petar said, “When I look at Kpler’s data – the movement of all ships and commodities on water, which account for around 80% of global trade – I really feel like I’m in the engine room of the world economy.”

As data becomes more democratised in the commodities industry, what should be considered from a data science perspective?


The democratisation of data makes it challenging to consume and understand data at scale. Data quality is a big issue, and this is where data science algorithms kick in – particularly anomaly detection. But you can’t naively run anomaly detection; you need to truly understand the data. When you spot an outlier, the key is determining whether it’s a true outlier, indicative of a meaningful event, or just an error in the pipeline.
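To make the idea concrete, here is a minimal sketch of robust outlier flagging on a time series, assuming pandas; the function name, window, and threshold are illustrative choices, not Kpler’s actual pipeline.

```python
import pandas as pd

def flag_outliers(series: pd.Series, window: int = 30, threshold: float = 3.5) -> pd.Series:
    """Flag points that sit far from a rolling median, on a robust MAD scale."""
    rolling_median = series.rolling(window, min_periods=5).median()
    abs_dev = (series - rolling_median).abs()
    mad = abs_dev.rolling(window, min_periods=5).median()
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    robust_z = abs_dev / (1.4826 * mad + 1e-9)
    return robust_z > threshold

# A flagged point is only a candidate: deciding whether it marks a real
# market event or a pipeline error still needs an independent check,
# e.g. against a second data source or the ingestion logs.
```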

At Kpler, we use different algorithms to address various data quality issues. For instance, when it comes to ship position data, we need to determine if an anomaly is due to an error in transmission, an issue on the ship’s end, or a deliberate manipulation of the automatic identification system (AIS) position. One of my teams focuses specifically on risk and compliance, detecting spoofing through pattern recognition of ship movements. In contrast, when we look at trades, say, commodity import data for a given country, we apply time series modelling to detect deviations from the norm.
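As an illustration of the kind of physical plausibility check that underlies spoofing detection (a simplified sketch under our own assumptions, not Kpler’s actual method), consecutive AIS fixes can be tested for the speed they imply:

```python
import math

def haversine_nm(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in nautical miles."""
    r_nm = 3440.065  # mean Earth radius in nautical miles
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r_nm * math.asin(math.sqrt(a))

def implied_speed_knots(fix_a: tuple, fix_b: tuple) -> float:
    """Each fix is (timestamp_seconds, lat, lon); returns the speed the pair implies."""
    t_a, lat_a, lon_a = fix_a
    t_b, lat_b, lon_b = fix_b
    hours = (t_b - t_a) / 3600.0
    if hours <= 0:
        return float("inf")  # duplicate or out-of-order timestamps are themselves suspect
    return haversine_nm(lat_a, lon_a, lat_b, lon_b) / hours

# A laden tanker rarely exceeds ~20 knots, so an implied speed of
# hundreds of knots between consecutive fixes points to a transmission
# error or manipulated AIS data rather than real movement.
```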

So, as you can see, even within anomaly detection, we use entirely different techniques depending on the use case. Our main objective at Kpler is to always provide the highest quality data by leveraging the full range of data science tools.

It’s often said that data engineering can take up 90% of a data scientist’s time. Is this something you’ve witnessed?


While the exact proportion may vary between organisations, I do agree with this statement, and it holds true at Kpler. This is especially the case when you’re a data provider, focused on delivering high-quality data. The data we provide isn’t delivered based on ad hoc requests; it’s delivered in real-time through a SaaS platform. Because of this, there isn’t much room for data scientists who don’t have at least a basic mastery of the data engineering stack. When I say ‘basic,’ I mean a level of proficiency where they can operate autonomously.

In previous roles, I’ve worked in environments where ad hoc analysis was the norm. In those situations, it was fine to run a model in a notebook, evaluate the forecast, and move on. But when you’re delivering data at scale – tracking tens of thousands of ships, with around 25,000 vessels in the commodity space, including tankers and bulkers – you need a different approach. We have to build models, understand the data, and deploy them in a cloud environment, all while ensuring data scientists comply with engineering best practices.
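One concrete habit that separates notebook work from production code is shipping tests alongside the model. A toy illustration, with an invented naive forecaster standing in for a real model:

```python
import numpy as np

def forecast_next(values: np.ndarray, horizon: int = 1) -> np.ndarray:
    """Toy stand-in for a real model: a naive last-value forecast."""
    if values.size == 0:
        raise ValueError("cannot forecast from an empty series")
    return np.repeat(values[-1], horizon)

def test_forecast_shape_and_values():
    # A test like this runs in CI on every change, not just in a notebook
    values = np.array([1.0, 2.0, 3.0])
    out = forecast_next(values, horizon=2)
    assert out.shape == (2,)
    assert np.all(out == 3.0)
```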

Aside from data engineering, are there any other skills or qualities that you prioritise when hiring for your teams?


When I hire, beyond the technical stack, a solid understanding of statistics is crucial, and it’s something every trained data scientist should have. After confirming they have sound statistical knowledge (and can handle basic concepts like ensuring probabilities don’t exceed 100% – you’d be surprised how often that happens), I focus on whether they truly care about their models and what they produce.
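The probability check Petar mentions is cheap to automate. A minimal sketch with NumPy (the helper name and tolerance are ours):

```python
import numpy as np

def check_probabilities(probs: np.ndarray, atol: float = 1e-6) -> None:
    """Basic sanity checks on class probabilities (each row should sum to 1)."""
    if np.any(probs < -atol) or np.any(probs > 1 + atol):
        raise ValueError("probabilities must lie in [0, 1]")
    if not np.allclose(probs.sum(axis=-1), 1.0, atol=atol):
        raise ValueError("class probabilities must sum to 1")
```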

I want to know if candidates monitor their models in production, devise thorough tests to evaluate performance, and continuously iterate based on those results. How is the model performing on day zero? On day 100? How do they adjust and improve it over time? It’s also crucial that a data scientist cares about how the end user consumes the data, as this helps them devise relevant monitoring metrics. In interviews, I specifically ask candidates how they handled their models hitting production for the first time. What happened? How did they manage that situation? For me, deployment isn’t the end of the process – it’s just the beginning of the development cycle. I’m looking for candidates who truly understand this.
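A common way to quantify the ‘day zero versus day 100’ question is a drift metric such as the population stability index. The interview doesn’t say which metrics Kpler uses; this is just one standard option, sketched below:

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between a baseline ('day zero') sample and a live ('day 100') sample."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch values outside the baseline range
    e_frac = np.histogram(expected, edges)[0] / len(expected)
    a_frac = np.histogram(actual, edges)[0] / len(actual)
    e_frac = np.clip(e_frac, 1e-6, None)  # avoid log(0)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

# Rule of thumb (a convention, not a universal standard):
# PSI < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 investigate.
```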

Is previous experience in commodities necessary for data scientists in your team?


I’m open to hiring people from diverse backgrounds because it brings fresh ideas and new perspectives (although I may be biased, since I had no commodities experience before joining Kpler).

That said, knowledge of the commodities industry is beneficial. Candidates with that background have often worked closely with physical and financial traders, so they understand trader use cases and can communicate them to the rest of the team. The value I place on this knowledge varies depending on the current composition of the team. If many team members lack experience in the commodity space, I would prioritise hiring someone with prior commodities knowledge.

There are also different subfields within data science that transfer more or less readily to the commodities space. For example, professionals with previous experience in geospatial or mobility data can easily adapt to Kpler’s use cases, as can those who have worked in anomaly or fraud detection when addressing data quality issues. So, while prior experience in commodities is nice to have, I also see value in different perspectives.

What advice would you give to someone leading or aspiring to lead a data science team, based on your experiences?


One of the first lessons for anyone entering the data science field is the importance of critical thinking. This includes admitting when there are negative results or when an experiment has failed, and acting accordingly. It’s essential to maintain this mindset, even at head-of-department level.

On a practical note, I believe every leader should define a clear mission statement for their department. It’s important to clarify the purpose of the data science function within the organisation. Is it to facilitate innovative developments, or could engineers also handle data science tasks? Should data scientists and engineers operate separately, or can they share reporting lines? Every head of data science should reflect on these questions. For instance, at Kpler, I have a clear answer regarding our mission, but I encourage all leaders at my level to find their own.

If you had to bet on one significant development for the future of data in the commodities industry, what would it be?


At Kpler, we’re becoming increasingly AI-centric, and this will differentiate us from our competitors. Developing and implementing machine learning models requires a deep understanding of our clients’ needs. The future for companies like Kpler therefore involves a greater emphasis on understanding client use cases, while our clients themselves need to become more technologically adept. This means moving away from relying solely on user interfaces and instead integrating solutions through APIs. This shift is already happening: we see our clients hiring more technical personnel to incorporate these solutions into their tech stacks.
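For clients moving from user interfaces to APIs, integration typically boils down to a small authenticated client. The endpoint, parameters, and response shape below are hypothetical placeholders for illustration, not Kpler’s real API; consult the provider’s documentation for actual routes and authentication.

```python
import requests

# Hypothetical base URL: a stand-in, not a real service
BASE_URL = "https://api.example-tradedata.com/v1"

def fetch_flows(api_key: str, commodity: str, origin: str) -> list[dict]:
    """Fetch trade flows for a commodity and origin from a hypothetical endpoint."""
    resp = requests.get(
        f"{BASE_URL}/flows",
        params={"commodity": commodity, "origin": origin},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["data"]  # assumed response shape, for illustration only
```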

What advice would you give leaders to secure internal buy-in for their data science teams?


This is an excellent question. To ensure data is recognised as an asset, we have to circle back to what I said about the importance of data quality and proactively detecting issues. Building confidence in the data is essential, along with employing data science forecasting and solid market analysis to make it more consumable. Standardisation, data quality, and a clear understanding of the data are key elements in this process.

What attracts talent to Kpler?


What attracts data scientists to Kpler is that we combine commodity and ship tracking expertise, making us a truly unique place to work. Occasionally, some team members develop a stronger interest in pure commodity trading and transition to our clients. That said, many who leave Kpler for trading desks or large companies in the commodity space eventually return, and they liken returning to Kpler from a physical trading company to switching from a Nokia 3310 to a modern smartphone.

Where we really do well is in attracting people who are keen to use cutting-edge technologies, which makes Kpler a natural home for those eager to stay at the forefront of the industry.

Looking for more insights?


Get exclusive insights from industry leaders, stay up-to-date with the latest news, and explore the cutting-edge tech shaping the sector by subscribing to our newsletter, Commodities Tech Insider. Interested in being featured in a future spotlight? Reach out to [email protected].

About the authors

Petar Todorov

Petar Todorov is the Director of Data Science at Kpler, a global trade intelligence platform. After completing a PhD in physics and transitioning to data science, Petar entered the maritime and commodities space. He now oversees the entire data science department at Kpler.

Eva Clarke

I'm the Marketing Manager at Cititec Talent, where I get to combine my love for commodities and fintech with my passion for storytelling. I’m all about creating meaningful brand stories that connect with people, whether it’s through internal comms or reaching out to our broader audience.