For each challenge below we set out the Machine Learning solution, the Editorial solution, and what happens when we collaborate.
How do we ensure curation is a good experience for users?

Machine Learning solution: We consider many different measures of success: accuracy, diversity, recency, impartiality and editorial priority.

Editorial solution: Traditionally on an editorial team, a journalist would research a story, discuss how it might be covered, and compose the story itself to make it compelling.

When we collaborate: Data scientists gain a rich understanding from editorial of the trade-offs between these measures of success, drawing on editorial's deep domain knowledge.
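One hedged sketch of how several measures of success might be balanced is a weighted combination of per-item scores. The measure names and weights below are illustrative assumptions, not the BBC's actual formula.

```python
# Illustrative sketch: combining several measures of success into one
# ranking score. The weights here are hypothetical, not the BBC's.

def combined_score(item_scores: dict, weights: dict) -> float:
    """Weighted sum of per-measure scores, each assumed to be in [0, 1]."""
    return sum(weights[measure] * item_scores.get(measure, 0.0)
               for measure in weights)

# Hypothetical weighting, skewed towards accuracy:
weights = {"accuracy": 0.4, "diversity": 0.2, "recency": 0.2, "editorial_priority": 0.2}
item = {"accuracy": 0.9, "diversity": 0.5, "recency": 0.8, "editorial_priority": 1.0}
score = combined_score(item, weights)  # 0.4*0.9 + 0.2*0.5 + 0.2*0.8 + 0.2*1.0 = 0.82
```

Making the weights explicit is one way the trade-offs discussed with editorial can be recorded and revisited.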
How does recency impact curation of content?

Machine Learning solution: We include publication date as a feature in our models. We sometimes optimise for recency, showing people more current content in some situations.

Editorial solution: One of the challenges is that once a story has been published it is fairly hard to bring that editorial creation back to life, especially for evergreen content. This is one of many examples where ML recommendations could help, by surfacing this content at the most relevant time according to the user's experience or history.

When we collaborate: By working together we're able to decide which pieces of content are evergreen and suitable for recommendation, and which have a limited shelf-life and shouldn't be presented to users beyond a certain point.
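A minimal sketch of one way a publication-date feature could work: an exponential time-decay weight on a relevance score. The half-life values are assumptions; an evergreen item could be given a long half-life so it keeps surfacing, while time-limited content decays quickly.

```python
from datetime import datetime, timezone

# Hypothetical recency feature: exponential decay of a relevance score
# with the age of the content.

def recency_weight(published: datetime, now: datetime, half_life_days: float) -> float:
    """Multiplier in (0, 1]; equals 0.5 when content is exactly one half-life old."""
    age_days = (now - published).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)

now = datetime(2020, 6, 1, tzinfo=timezone.utc)
published = datetime(2020, 5, 25, tzinfo=timezone.utc)  # 7 days old

news_weight = recency_weight(published, now, half_life_days=7)       # 0.5
evergreen_weight = recency_weight(published, now, half_life_days=365)  # close to 1
```

The same mechanism supports a hard shelf-life: below some weight threshold, an item is simply no longer recommended.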
How does the BBC ensure impartiality?

Machine Learning solution: We use measures of statistical fairness to understand whether our models are giving unbiased results. Good practice in machine learning makes sure that we're using unbiased training data.

Editorial solution: Editors, journalists and content creators make a concerted effort to ensure that a range of views and perspectives are shown within a piece of content or across several pieces of content (within a series, for example).

When we collaborate: We combine our good practices with domain knowledge from editorial. We use techniques like human-in-the-loop machine learning or semi-supervised learning to make editorial's lives easier and apply their knowledge at massive scale. ML helps editorial identify those pieces of content that show a breadth of views.
How do we ensure variety within content serving?

Machine Learning solution: We construct mathematical measures of novelty and diversity, and include these in our machine learning optimisations.

Editorial solution: Editorial staff responsible for curation ensure a breadth and depth of content on indexes, within collections, and so on.

When we collaborate: We learn about the differences between our pieces of content. Working together, we're able to determine whether our recommendations offer an interesting, relevant and useful journey for the user. The BBC's audio networks feature different output and tones of voice: Radio 4 has a very different 'flavour' to 6 Music, for example. Consequently, network can be used as a signal to ensure variety in results.
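One common mathematical measure of diversity, sketched here as an assumption about what such a measure might look like, is intra-list diversity: the average pairwise dissimilarity between items in a recommendation slate. Items are represented by sets of hypothetical tags (network, genre), with dissimilarity as 1 minus Jaccard similarity.

```python
from itertools import combinations

# Sketch of an intra-list diversity measure over a recommendation slate.
# Tags such as "radio4" / "6music" are illustrative metadata.

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: overlap of two tag sets relative to their union."""
    return len(a & b) / len(a | b)

def intra_list_diversity(items: list) -> float:
    """Mean pairwise dissimilarity (1 - Jaccard) across the slate."""
    pairs = list(combinations(items, 2))
    return sum(1 - jaccard(a, b) for a, b in pairs) / len(pairs)

slate = [{"radio4", "news"}, {"6music", "music"}, {"radio4", "drama"}]
diversity = intra_list_diversity(slate)  # higher = more varied slate
```

Because network appears in the tag sets, a slate drawn entirely from one network scores lower, which matches the idea of using network to ensure variety.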
How do we avoid legal issues?

Machine Learning solution: We are given a checklist and we check the items off. We are told that there are things "we can't do for opaque legal reasons" but never really understand why, and so we limit the functionality of our solution.

Editorial solution: Editors, journalists and content creators attend a mandatory course on media law, so that they have full knowledge of issues such as contempt of court, defamation and privacy. An editor will sign off content to ensure that it is compliant with legal requirements.

When we collaborate: By talking to legal advisers we can build business rules that minimise the risk of legal infractions. Close collaboration with editorial means we gain a deep understanding of potential problems at an early stage. We build with awareness of these concerns, and with that awareness deliver a solution that is high quality from both a technical and an editorial point of view.
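A hedged sketch of what such a business rule might look like in practice: a filter that removes legally sensitive items from the candidate pool before ranking. The flag names and item structure are entirely hypothetical, not actual BBC metadata fields.

```python
# Hypothetical business-rule filter: drop items flagged by editorial or
# legal before they reach the recommender's output. Flag names are
# illustrative assumptions.

LEGALLY_SENSITIVE_FLAGS = {"active_court_case", "under_legal_review"}

def apply_legal_rules(candidates: list) -> list:
    """Keep only items carrying no legally sensitive flags."""
    return [item for item in candidates
            if not (set(item.get("flags", [])) & LEGALLY_SENSITIVE_FLAGS)]

candidates = [
    {"id": "a1", "flags": []},
    {"id": "b2", "flags": ["active_court_case"]},
    {"id": "c3", "flags": ["evergreen"]},
]
safe = apply_legal_rules(candidates)  # "b2" is filtered out
```

Encoding the rules in code makes the "opaque legal reasons" explicit, reviewable, and testable.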
How do we handle editorial quality?

Machine Learning solution: We build and refine a model using data science good practices, and then turn it over to our editorial colleagues, who decide whether the results are good.

Editorial solution: When editors curate, they can choose content that is relevant, interesting and of good quality. Recommendations present a specific editorial challenge, in that recommenders can surface content that is not the best of our output.

When we collaborate: In BBC+ we prioritised content that we knew would suit the environment in which it appeared: standalone, short-form videos, appearing in a feed, from digital-first areas such as Radio 1, The Social, BBC Ideas and so on. Including editorial throughout the process means that they teach us what is important in the results, so that data science understands the real problems we're trying to solve. We fail quickly and learn quickly, getting to a better-quality result.
How do we learn from our audiences? Accuracy / user-generated content?

Machine Learning solution: We measure activity within our products and construct measurements of engagement, building implicit and explicit feedback loops. An explicit loop is having a "like" button; an implicit loop is finding a way to measure when something has gone wrong, such as bounce rate or churn.

Editorial solution: We monitor and analyse stats to build a picture of how our audiences engage with our content.

When we collaborate: We work with editorial to understand the insights we get from data. They help rationalise the behaviours we see in the data, and teach us what else to look for in it.
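As a sketch of one implicit feedback signal, here is a bounce-rate calculation: the share of sessions in which a user opened a recommendation but left almost immediately. The session structure and the 10-second threshold are assumptions for illustration.

```python
# Sketch of an implicit feedback loop: bounce rate as a proxy for
# "something went wrong" with a recommendation.

def bounce_rate(sessions: list, threshold_seconds: float = 10.0) -> float:
    """Fraction of sessions whose dwell time fell below the threshold."""
    bounces = sum(1 for s in sessions if s["dwell_seconds"] < threshold_seconds)
    return bounces / len(sessions)

sessions = [
    {"dwell_seconds": 4.0},    # bounced
    {"dwell_seconds": 120.0},  # engaged
    {"dwell_seconds": 7.5},    # bounced
    {"dwell_seconds": 45.0},   # engaged
]
rate = bounce_rate(sessions)  # 2 of 4 sessions bounced -> 0.5
```

An explicit loop, by contrast, would simply count "like" button presses; the implicit signal is valuable precisely because most users never press anything.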
How do we test recommendations?

Machine Learning solution: A mixture of offline evaluation metrics (e.g. testing against a known test set of data) and online evaluation metrics (e.g. A/B testing).

Editorial solution: Traditionally, we monitor and analyse stats to build a picture of how our audiences engage with our content.

When we collaborate: The editorial lead works with data scientists on the composition of the recommender. The results are reviewed by the editorial lead and, to obtain a variety of opinions, by further editorial colleagues. This rich editorial feedback lets us understand where our model could be better and make improvements.
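One widely used offline evaluation metric, sketched here with invented interaction data, is precision@k: the share of the top-k recommendations that appear in a held-out set of items the user actually engaged with.

```python
# Sketch of offline evaluation against a known test set: precision@k.
# The recommended list and relevant set below are invented examples.

def precision_at_k(recommended: list, relevant: set, k: int) -> float:
    """Share of the top-k recommendations present in the relevant set."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

recommended = ["a", "b", "c", "d", "e"]  # model output, best first
relevant = {"a", "c", "f"}               # held-out engagements
p_at_3 = precision_at_k(recommended, relevant, k=3)  # "a" and "c" hit -> 2/3
```

Offline metrics like this are cheap to run on every model iteration; online A/B tests then confirm whether the offline gains translate into real audience engagement.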