Date: 2021-09-28

At REA Group, we've had a lot of great success in the data science space in recent years. And excitingly, we've only just scratched the surface in terms of the value data science can bring to the business! This post covers the four key principles that have made us successful to date and, I have no doubt, will keep us successful for many years to come.

I have the privilege of leading a fantastic, high-performing data science team at REA Group. Our purpose is to generate a very rich understanding of our users, our customers, and the property market, through data and data science, so that we can provide relevant and highly personalised experiences to our users. We also use this data to deliver unique and powerful products and insights to our customers, including real estate agents, developers, and banks.

Our purpose is clear, and importantly, it strongly aligns with REA Group's strategy. We don't do data science just because we have lots of data and think we should; we do it because it helps us achieve our business objectives.

My favourite saying is 'garbage in means garbage out'. Data science is meaningless without quality data. This is our team's #1 principle: focus on data over tech and algorithms. Time and again we have reaped more value by investing time in improving data quality, training data, richer features, and new data sources than by trying different algorithms or different tech.

For example, in training our latest deep learning models on images, we invested much of our time ensuring our labelled training data was of high quality before looking at different models. And when developing new audience classification models, we spend much more time creating rich features than we do trying different classification techniques.

You need to put all the right ingredients into a cake before you worry about icing it!

Luckily at REA, thanks to our wonderful data engineers, we have all the data we need right at our fingertips, as well as a magnificent data platform to transform it into the many rich features we need. This ranges from the huge volume of clickstream events generated on our sites and apps by millions of users every day, to all the property and listings data (including images and text), product data, customer data, and market data.

The key to high-performing teams? Hire great people, then empower them to do their thing, and be accountable for it. Provide direction, encouragement, guidance, and support. Help them develop and learn - data science is a fast-changing space after all. Pretty simple, right?

But it is more than that. It is not only important to hire great people, but to hire different people. Diversity is crucial! And we're not just talking demographics. Different skills, backgrounds, strengths, experiences, and ways of thinking are all vital. This way 1+1+1=5 rather than 1+1+1=2; the difference is huge! You don't need one person who can do it all; you need a team of people who can. Build a team of diverse thinkers, and set up an environment where they can work collaboratively, so you get the benefit of that diversity.

At REA we kick off projects as a team, do lots of pairing when it makes sense, QA each other's work, and document with others in mind.

Your data science function (or functions) can be centralised or distributed. Provided those distributed teams are large enough (in my experience at least 6 people), either structure will work. More important is that those teams are diverse, have easy access to all the data they need, and are embedded where they can best impact the business.

Finally, always build with the future in mind. It's amazing just how much leverage you can get from a data science product beyond what it was built for. For example, a model we developed years ago solely to help generate property suggestions is now used to power more than 10 different products across the company, and that list is growing. Pre-existing features and feature sets that we have developed over time are consistently used in new models we build.

It is also important to fail fast. Data science products often take time to deliver, so the last thing anyone wants is to get to the end of that time before realising you haven't hit the mark. Iterate, test, and learn, and check in with stakeholders regularly.

So, in summary, ensure your purpose is very clear; focus on the data over the science; build a diverse team and promote collaboration; and build with scale in mind, create building blocks, fail fast and learn quickly. Follow these principles and you're well on the path to success.