16 Oct Artificial Intelligence, Robot Journalism and NFTs
There are years in which the media industry is abuzz with talk of new tech and new solutions, and others in which older mediums have suddenly been imbued with new possibilities.
Artificial Intelligence: How is it being used in newsrooms today and what for? Can AI predict user and subscriber behaviours and help automate mundane tasks so that journalists have more time to generate quality content?
Robot Journalism: An offshoot of AI that is both cheered and feared in newsrooms. Can some types of articles and content be created by machines? And what does that mean for newsroom workflows?
Non-Fungible Tokens (NFTs): We’ll be honest in saying we have no idea how long this is going to last. But while the bandwagon is still gaining steam, it’s worth looking at how publishers can capitalise on the demand for NFTs to generate some revenue.
Let’s get started:
ARTIFICIAL INTELLIGENCE FOR THE SMART NEWSROOM
Data from multiple surveys shows that publishers across the world are betting big on AI. There’s an interesting dynamic here: a recent McKinsey report found that, across industries, 50% of companies use AI in one way or another. However, compared to other digital media outlets like video and audio platforms, news publishers are still at a very early stage of AI adoption.
Why is this the case? Anecdotally, we’ve found that despite the many use cases for AI in newsroom operations and other aspects of media management, there is still some stigma or fear around the elephant in the room: can AI get so good that it will make journalists redundant? Can we get to a stage where machines will churn out news articles? These propositions are still very far-fetched, though there have been significant advances in ‘robot journalism’, which we’ll cover later. For now, let’s pan out and look at the bigger picture.
Journalism, media and technology trends and predictions 2022, a report written by Nic Newman for the Reuters Institute for the Study of Journalism, surveyed 246 news leaders in 52 countries to explore the latest developments in the field and the priorities for the year ahead. Newman writes that AI technologies are fast being seen as core to publishers’ businesses.
“Artificial intelligence technologies such as Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP), and Natural Language Generation (NLG) have become more embedded in every aspect of publishers’ businesses over the last few years. Indeed, these can no longer be regarded as ‘next generation’ technologies but are fast becoming a core part of a modern news operation at every level – from newsgathering and production right through to distribution,” Newman writes.
A quick breakdown of priorities:
– More than eight in ten (85%) say that AI will be very or somewhat important this year in delivering better personalisation and content recommendations for consumers.
– A similar proportion (81%) see AI as important for automating and speeding up newsroom workflows, such as the tagging of content, assisted subbing, and interview transcription.
– About 70% see AI as playing a key part in helping find or investigate stories using data.
– 60% see it as being helpful in identifying and targeting prospective customers most likely to pay for a subscription.
– Lastly, using AI to automatically write stories – robot-journalism – is less of a priority at this stage (40%) but is where many of the most future-focused publishers are spending their time.
Newman also lists out some excellent examples from 2021 on how AI has had an impact on both story production and distribution.
1 In June 2021, The Boston Globe won an investigative journalism Pulitzer Prize for Blind Spot, a story about preventable road accidents in the US. Reporters used Pinpoint, an AI tool developed by Google to help investigative journalists identify patterns in their data.
Despite nearly 50 years of warnings by federal safety officials, the United States has no effective national system to keep tabs on drivers who commit serious offenses in another state, Brendan McCarthy, Deputy Projects Editor at The Globe, writes in a Google blog.
“When we launched the investigation, we hadn’t fully gotten acquainted with Pinpoint, a new Google tool where you can upload documents to easily search for names, places and more for patterns. But midway through our reporting process, we were dumping troves of files – court documents, photos, handwritten files, spreadsheets and more – into the tool,” he writes.
“A couple of helpful aspects of Pinpoint are its ability to recognize text in images and organizational capabilities, like the opportunity to quickly see, and search documents for, the most mentioned names or places and connections between people. So often in journalism — especially when you are dealing with mass troves of data — you are looking for outliers. Pinpoint let us figure out what was NOT there as much as what was there.”
2 Sky News used AI to extract and clean public health data from PDFs and other previously inaccessible formats, which it then used to constantly update web pages and TV graphics across its output.
3 The Washington Post has extended its synthetic voice audio versions across all of its output, using software called Amazon Polly, following a successful trial period within its apps.
The Post was previously using standard iOS and Android text-to-voice conversion technology, but Polly “is a better voice – it’s a smoother, more human-sounding voice,” Kat Downs Mulder, the publication’s managing editor for digital, said in an interview with Press Gazette. “It still sounds like a mechanised voice, an automated voice. But I think it’s a bit smoother and more natural.”
In the 2021 edition of the Journalism, media and technology trends and predictions report, Newman refers to the Journalism AI project from the POLIS think tank at the London School of Economics, which has been documenting best practice case studies in this field since the beginning of 2020. These include:
– The Peruvian news outlet Ojo Público has created a tool to spot potential patterns of corruption in government procurement contracts.
– The BBC has been testing an AI-powered chatbot tool to answer questions about coronavirus using its own trusted reporting and information summarised from official sources.
– The South China Morning Post is using AI to identify look-alike audiences to help it better target new subscribers.
– The Reuters news agency used speech-to-text technology to add time-coded transcripts to its entire archive of historic videos dating back to 1896 – making key moments easier to find in 11 different languages.
– The Globe and Mail in Canada has delegated many of the editorial choices on its homepage and other landing pages to an AI-based tool called Sophi.
– A number of publications are using AI tools to monitor issues of gender and racial bias in output and flag results to editors. New ideas have been surfaced by AIJO, the AI in Journalism project, which is a collaboration across eight publishers and has proposed ways to understand and mitigate newsroom biases.
THE CURRENT STATE OF AI ADOPTION
Many of Newman’s examples could be seen as more advanced uses of AI. A survey published in February 2022 by the World Association of News Publishers (WAN-IFRA), in collaboration with the Germany-based consulting firm Schickler, looked instead at the current state of AI adoption in newsrooms, narrowing the use cases to reader revenue and editing.
In general, the survey, conducted across 2021, found that more than 75 percent of publishers say AI will play a crucial role in the success of their business within the next three years. A more striking insight was that the use of AI was not restricted to larger newsrooms with deeper pockets, or to the oft-cited publishers in Europe or the U.S. that pop up in discussions about innovation.
“In fact, as we saw in this year’s World Press Trends survey the gulf in the importance (and planned investment) attributed to Automation, AI and Machine Learning is more striking, and perhaps, surprising, i.e. developing countries place a huge importance compared to developed countries,” the report states.
This is a trend confirmed by other research as well. In July 2021, a report jointly prepared by International Media Support, The Fix and El Clip, aimed to specifically address this gap in our understanding of AI application in newsrooms around the world. It focuses on media houses using varying degrees of AI, ML and Data Processing as part of their core business operations in 20 countries in Latin America and Central and Eastern Europe. The data was collected via deep dive case studies of 44 media outlets and over 33 hours of interviews with experts.
The report finds that the management of paywalls and subscriptions is the most widely used AI application in both regions. But while the use of AI/ML is not limited to large, corporate media, budgets and market realities can make stark differences.
The report finds that attracting specialist talent and skills is a barrier to further adoption and growth and more collaborative approaches between media, research institutions, and third-party solutions need to be encouraged. Still, as the WAN-IFRA report finds, the exciting news is that the near taboo that was once associated with AI as it pertains to journalism is clearly fading.
“It sounds like a cliche, but it is true: the advantage of automating mundane processes allows journalists to focus more on their core principles of creating quality content. But increasingly as the surveys showed, the intelligence side of AI is also winning over editors and journalists as it pertains to reader revenue strategies,” the report states.
“A lot of analytics tools inform the newsroom about what stories are trending and when, but many of the AI tools already on the market and in practice today help to predict more accurately which stories will be read, and more importantly, which stories will convert and retain based on historical data around subscriptions and engagement.”
WAN-IFRA also highlights some recent case studies, giving more insight into publishers’ specific use cases for AI.
AN ALGORITHM TO DEFINE READER ENGAGEMENT
The Norwegian publisher Amedia, the country’s largest publisher of local media titles, reaches 2.4 million readers daily across all platforms. It wanted to understand the kind of user behaviour that characterises a happy and satisfied customer. Getting an answer to this question is easier said than done, of course, but it can be key to designing new products and can act as a guide to what kinds of content to focus on.
Amedia handed over this decision to a machine learning algorithm, letting it select the best combination of metrics. The algorithm takes up to 70 reader behaviour statistics as an input and distills them into a single number that best predicts how likely a reader is to stay loyal to the product. At Amedia they call this number the “Engagement Index.”
A particular benefit of the Engagement Index, according to the WAN-IFRA report, is that it separates the product’s and content’s influence on reader loyalty from other external factors such as personal preferences or subscriber tenure. In other words, it captures those factors that are currently affecting reader loyalty — and that are under the publisher’s control.
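To make the idea concrete, here is a minimal sketch of how several behaviour metrics could be distilled into one loyalty score. It is not Amedia’s actual model: the three metrics, the toy data, the 0 to 100 scale and the choice of logistic regression are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_engagement_model(rows, stayed, epochs=2000, lr=0.1):
    """Fit a logistic regression with plain batch gradient descent.

    rows   -- list of feature lists (behaviour metrics scaled to 0..1)
    stayed -- list of 1 (reader stayed loyal) or 0 (reader churned)
    """
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, stayed):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for i, v in enumerate(x):
                grad_w[i] += error * v
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def engagement_index(x, weights, bias):
    """Collapse a reader's metrics into a single 0..100 score."""
    return round(100 * sigmoid(bias + sum(w * v for w, v in zip(weights, x))))

# Hypothetical metrics per reader: share of days active, reading depth,
# share of local stories read. Invented numbers, not real reader data.
rows = [
    [0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.7, 0.9, 0.6],  # stayed loyal
    [0.2, 0.1, 0.3], [0.1, 0.3, 0.2], [0.3, 0.2, 0.1],  # churned
]
stayed = [1, 1, 1, 0, 0, 0]

weights, bias = fit_engagement_model(rows, stayed)
engaged = engagement_index([0.9, 0.9, 0.9], weights, bias)
casual = engagement_index([0.1, 0.1, 0.1], weights, bias)
```

In this toy version the model, not an editor, decides how much each metric counts: the learned weights reflect how strongly each behaviour went along with staying loyal in the historical data.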
A PAYWALL THAT PREDICTS USER BEHAVIOUR
We’ve mentioned this innovation from Canada’s The Globe and Mail before. In recent years the publisher has committed itself to AI, developing a range of tools and products under the name Sophi. Perhaps the most cutting edge is a fully dynamic, personalised, real-time system that decides when, or even if, to show a paywall. The unique thing about this paywall is that it knows when to give up rather than alienate visitors. Here’s how Sonali Verma, senior project manager at The Globe and Mail, described it at the Online News Association conference last year:
A reader who reads mostly general news and recipes might be less likely to subscribe than one reading a lot of business-related content. Still, Sophi might present this general news reader with a paywall.
If they don’t reach for their wallet, the model won’t hit them with the same message again. Instead, Sophi might pivot, and try asking the reader to register with an email instead.
Sophi uses analytics to make decisions that balance the potential for ad revenue against the potential for subscriber revenue. Some readers might never encounter a paywall (Verma mentioned a hypothetical visitor who primarily reads car reviews — a strong source of ad revenue) while others might see one every time they visit the site.
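The trade-off described above can be sketched as a simple decision rule. This is only an illustration of the logic as Verma describes it, not Sophi’s implementation: the function name, its inputs and the revenue figures are all hypothetical.

```python
def choose_prompt(subscribe_propensity, subscription_value, ad_value, declined_paywall):
    """Pick what to show a visitor (hypothetical rule, not Sophi's model).

    subscribe_propensity -- predicted probability (0..1) that the reader subscribes
    subscription_value   -- revenue expected from a new subscription
    ad_value             -- ad revenue expected if the reader keeps browsing freely
    declined_paywall     -- True if the reader already ignored a paywall
    """
    expected_subscription_revenue = subscribe_propensity * subscription_value
    if ad_value >= expected_subscription_revenue:
        return "no_wall"            # worth more as an ad-supported reader
    if declined_paywall:
        return "registration_wall"  # pivot: ask for an email instead
    return "paywall"

# Invented numbers for three kinds of visitor.
car_review_reader = choose_prompt(0.01, 100.0, 5.0, declined_paywall=False)
business_reader = choose_prompt(0.20, 100.0, 2.0, declined_paywall=False)
reader_who_declined = choose_prompt(0.20, 100.0, 2.0, declined_paywall=True)
```

A production system would presumably learn these values from reader data rather than take them as fixed inputs, but the shape of the decision is the same: compare the two revenue paths, and change tack when the first ask fails.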
The Globe and Mail has credited Sophi with helping it achieve a 51% increase in subscriptions compared with its old paywall, which used a hybrid approach. In April 2021, the company reported that it had 170,000 digital-only subscribers.
The paper also uses AI in many other ways that indirectly drive subscriptions and registrations. For instance, natural language processing is used to select stories to put outside the paywall. In fact, a Sophi tool decides where to place 99% of all digital content across The Globe and Mail’s properties, Gordon Edall, Vice President, Sophi, told WAN-IFRA. “Our work on Sophi Site Automation… is probably where we have had the biggest impact, as we have been able to let the newsroom continue to do its work but use the machine to more effectively drive performance of that content without putting the brand at risk.”
Sophi is also helping publishers around the world, who can purchase the AI tools from The Globe and Mail. The South African news website News24, which launched a freemium paywall in August 2020, decided to use the tech to get directly involved in ranking news stories, prioritising times to publish, and conversion predictions. With its help, News24 reached 45,000 subscriptions in just over one year’s time, Adrian Basson, Editor-in-Chief of News24, told WAN-IFRA.
A PREDICTIVE SCORE TO HELP RESURFACE AND REPUBLISH ARCHIVE CONTENT
A collaborative research project between Ouest-France, France’s largest newspaper, and the companies Twipe and Syllabs resulted in the creation of an internal search engine based on a predictive score that helps the newspaper surface, republish, and monetise archival content.
The collaboration resulted in the creation of a “content monetisation predictive score,” assigned to each of the more than 30 million articles in the Ouest-France content archive and designed to assess an article’s potential to generate page views.
“Initially, we had three goals in mind,” Jean-Pierre Besnard, Project and Incubation Manager at Ouest-France, told WAN-IFRA. “The first was to consider it from a page views angle, the next was engagement, and the last was subscriptions. We could imagine three algorithms on three of these points of view, but we only had time for the first one.”
The partners also built an internal search engine tool based on the predictive score, which surfaces articles from the archive, allowing journalists to select and re-publish them with just a few clicks.
A DATA-DRIVEN PLAN TO PREDICT CHURN
Mediahuis, headquartered in Belgium, set itself the goal of implementing a fully automated data-driven customer journey with the aim of increasing revenue and reducing churn. The company wanted to target its potential and existing customers with the best registration offer, the best sales offer, and the best retention journey in an automated, efficient, and personalised manner.
To achieve this, the company adopted a data-driven approach, using a churn propensity prediction model developed in-house. With this machine learning model, Mediahuis is targeting a selection of existing customers who seem likely to churn but are worthwhile to keep, and who could potentially be persuaded to stay on with the right offers.
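As a rough sketch of that selection step, assume a churn model has already produced a probability for each subscriber. The class, field names, thresholds and figures below are hypothetical and are not taken from Mediahuis.

```python
from dataclasses import dataclass

@dataclass
class Subscriber:
    name: str
    churn_probability: float  # output of a churn model, between 0 and 1
    annual_value: float       # yearly revenue from this subscriber

def retention_targets(subscribers, min_churn=0.5, min_value=100.0):
    """Keep subscribers who are both likely to churn and worth keeping,
    ordered by the revenue expected to be lost if nothing is done."""
    at_risk = [
        s for s in subscribers
        if s.churn_probability >= min_churn and s.annual_value >= min_value
    ]
    return sorted(
        at_risk,
        key=lambda s: s.churn_probability * s.annual_value,
        reverse=True,
    )

# Invented subscribers for illustration.
subscribers = [
    Subscriber("A", 0.9, 60.0),   # likely to churn, but low value
    Subscriber("B", 0.7, 240.0),  # at risk and valuable
    Subscriber("C", 0.1, 300.0),  # valuable, but not at risk
    Subscriber("D", 0.6, 400.0),  # at risk and most valuable
]
targets = retention_targets(subscribers)
```

The sketch leaves out the third condition in the text, whether a subscriber can actually be persuaded to stay, which would need a separate estimate of how each person responds to an offer.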
In addition to its churn prediction model, the WAN-IFRA report notes that Mediahuis started working on a propensity to buy model a few months ago to be able to predict whether someone will buy a subscription and what type of subscription they will buy in the next month.
In terms of initial results, Jessica Bulthé, Data Science Business Partner at Mediahuis, told WAN-IFRA they looked promising for both logged-in and anonymous users, with the model correctly predicting in eight out of 10 cases whether somebody will buy a subscription in the next month.
“That’s a good first step towards adding propensity to buy and the next best action modeling for propensity to buy,” Bulthé said.
DO ROBOTS DREAM OF DIGITAL TEXT?
We’ll cover this last case study from WAN-IFRA before moving on to a larger discussion about robot journalism or automated content generation.
The case study concerns the Dutch regional media group NDC, which publishes three daily titles and more than 40 weeklies. NDC ramped up its coverage plans for the first post-COVID football season with a reader promise to cover every single local match. This extended sports coverage is part of the publisher’s strategy to build out the value of its local journalism, thereby driving reader revenues. It’s all about providing coverage that’s not available anywhere else, to drive engagement in local communities.
How many matches does that amount to? 60,000! That is far beyond the capacity of the newsroom. NDC’s solution is to have robots write the match reports, while photos and comments from coaches are collected through a crowdsourcing platform. “For a regional publisher like us, being able to cover all matches of all divisions is engagement gold,” Ard Boer, Sports Project Manager at NDC, told What’s New in Publishing.
We mentioned earlier in this chapter that publishers are interested in automated content generation, but it is not a priority area for most right now. This is largely due to the fact that at present AI can only produce texts for certain niche topics where structured data is available.
Sports is one; what could the others be? Schibsted’s Norwegian regional site Bergens Tidende is another publisher using automated content to drive revenue, writing stories about real estate. Local stories about the housing market are relatable for most readers, but churning them out is incredibly time-consuming, involving mapping neighbourhoods, properties and sales. Bergens Tidende experimented in 2019 with the ‘Boligrobot’, a piece of tech created by a company called United Robots that finds information about real estate, like current prices, addresses, areas, changes in prices over time, the price of a square metre, and so on. This data comes from agencies and is gathered by the Norwegian authorities. Then, it finds great aerial and panorama photos from a supplier, supplemented by images from Google Street View.
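Why structured data matters becomes clearer with a small example. The sketch below turns one sale record into a sentence using a fixed template. It is a deliberately simple illustration, not how the Boligrobot or United Robots’ software works; the field names, the address and the figures are invented.

```python
def describe_sale(sale):
    """Turn one structured sale record into a short piece of text.

    `sale` is a dict with hypothetical keys: address, neighbourhood,
    price, area_sqm and, optionally, previous_price.
    """
    price_per_sqm = round(sale["price"] / sale["area_sqm"])
    text = (
        f"The home at {sale['address']} in {sale['neighbourhood']} sold for "
        f"{sale['price']:,} kroner, or about {price_per_sqm:,} kroner "
        f"per square metre."
    )
    previous = sale.get("previous_price")
    if previous:
        change = (sale["price"] - previous) / previous * 100
        direction = "up" if change >= 0 else "down"
        text += f" That is {direction} {abs(change):.0f}% from the previous sale."
    return text

# An invented record, standing in for a row from a property register.
sale = {
    "address": "Example Street 1",
    "neighbourhood": "Example District",
    "price": 4_500_000,
    "area_sqm": 90,
    "previous_price": 3_600_000,
}
story = describe_sale(sale)
```

Every fact in the sentence comes straight from a field in the record, which is why this kind of automation works for sales, scores and prices but not for topics where no such data exists.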
One of the main reasons to work with a robot was to engage more readers and sell more subscriptions. “Since launching in mid-July 2020, we have sold 500 subscriptions from nearly 6,000 automated articles. We anticipate roughly 1,000 new subscriptions per year. That’s five percent of all the article conversions in BT, overnight establishing the robot as the most popular service to our readers,” the publication reported in 2021. WNIP reported that these automated articles generate about 3,000–4,000 pageviews a day.
Other publishers are catching on to this idea. Swedish national tabloid site Nyheter24 publishes automated real estate top lists and celebrity real estate articles. The articles generate tens of thousands of page views per week. The publisher is looking at growing ad revenues with cost-effective robot-generated pageviews.
In March this year, What’s New in Publishing reported that US local media group McClatchy has signed an agreement with United Robots for a pilot partnership to deliver home sales news in Sacramento County, CA, and nine other locations.
“The agreement includes automatically generated stories about individual sales as well as aggregated articles on aspects such as median or average price, and top lists of most expensive homes sold during given periods for given areas or neighborhoods,” WNIP reported.
The articles are based on real estate data from industry data providers, and also include images of properties as well as satellite maps, through a global agreement United Robots holds with Google.
Swedish publisher Gota Media is another that publishes automated stories about sports, real estate, company registrations, and traffic in order to provide regular updates in all its local communities. Its automated real estate content has a conversion rate of 2% which is the best across the group’s ten news sites.
“For a small newsroom, automation is necessary,” Helena Tell, Editor-in-Chief, Bärgslagsbladet (Bonnier News) in Sweden, told What’s New In Publishing. “We know where to deploy our resources in order to make our readers happy. And if we can use technology and automation to perform tasks as well as we reporters would, there’s no doubt that’s what we should do.”
Another example is Crosstown, a non-profit community project that uses machines to cover hyperlocal news in neighbourhoods of Los Angeles. The title is published by Gabriel Kahn of the Annenberg School for Journalism.
Speaking at Journalism.co.uk’s Newsrewired conference in 2021, Kahn said he uses AI-powered tools to scrape public datasets and store content in the cloud. Humans can then turn that data into narratives that address people’s concerns around topics as varied as crime, traffic, air pollution or coronavirus.
The idea is that by having a location tag on each piece of data, every story can be turned into neighbourhood news. For example, Kahn said a large database on crime in LA can be dissected into stories about crime on each street which gives residents information that is relevant to their lives.
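A minimal sketch of the location-tag idea: if every record carries a neighbourhood, one citywide dataset can be split into a count, and a headline, for each area. The records and neighbourhood names below are invented, and the code is our illustration rather than Crosstown’s pipeline.

```python
from collections import Counter

def counts_by_neighbourhood(records):
    """Count records per neighbourhood using each record's location tag."""
    return Counter(record["neighbourhood"] for record in records)

def neighbourhood_headlines(records, topic):
    """Build one short headline per neighbourhood from a citywide dataset."""
    counts = counts_by_neighbourhood(records)
    return {
        area: f"{area}: {count} {topic} reports in the latest data"
        for area, count in counts.items()
    }

# Invented records standing in for rows of a public dataset.
records = [
    {"neighbourhood": "Neighbourhood A", "type": "burglary"},
    {"neighbourhood": "Neighbourhood B", "type": "theft"},
    {"neighbourhood": "Neighbourhood A", "type": "theft"},
]
headlines = neighbourhood_headlines(records, "crime")
```

As Kahn describes it, the machine’s job ends at this kind of sourcing and sorting; turning the numbers into a narrative is left to the humans.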
By outsourcing the most labour-intensive tasks to the machines, Crosstown is trying to address one of the biggest pain points of local news: sustainability. It is simply not possible to have a reporter in every neighbourhood to monitor everything from traffic to pollution to public spending, then produce personalised news relevant to the residents. By using machines to source and analyse data, Kahn said the editorial team can extend its reach and serve LA inhabitants on a hyperlocal level.
AI IS GROWING INCREASINGLY FLUENT
What other types of stories can AI be applied to, and can machines do it alone? Take the example of the Wall Street Journal, which has begun to generate AI narratives on the biggest markets in the U.S. and Europe. For this, it partners with a company called Narrativa, which has an AI system called Gabriele.
“We couldn’t be more satisfied with the result; news articles generated by Narrativa are featured on the front page of the American website,” a post from the company said.
Narrativa generates two types of news articles for The Wall Street Journal:
1 Information on the state of financial markets in the United States, Europe and Asia.
2 Consumer price index and producer price index (CPI and PPI).
“Our artificial intelligence system serves up the information quickly and then WSJ professionals then analyze it in terms of the impacts that a particular fall or rise in the stock market will have for the financial world.”
“But this isn’t a tool to replace editors,” the post adds. “On the contrary, artificial intelligence is the perfect ally to support their work, using a combination of technology and research.”
More and more companies, Narrativa says, are waking up to the potential of artificial intelligence and natural language generation (NLG) as superior support tools that improve results.
The company claims that the number of readers of news generated by NLG has risen from zero to 7% in the last two or three years. “With this in mind, it is more than likely that we have read an article generated by artificial intelligence at some point without even realizing it.”
There are more examples of AI growing increasingly fluent and proficient at generating text. In the Journalism, media and technology trends and predictions report for 2022, Newman writes that the BBC is planning to extend its 2019 experiment with election results, which allows hundreds of constituency pages to be automatically written and rewritten by computer as the numbers change – all in a BBC style. “Local elections in May 2022 will provide the next test of what will become a permanent system that could be adapted to work with many other types of publicly available data from health to sports and business,” he notes.
SHOULD WE BE WORRIED ABOUT ROBOTS STEALING JOURNALISTS’ JOBS?
“I think saying that robots steal journalists’ jobs is the wrong way to look at it. Because it’s not about stealing jobs,” Cecilia Campbell, Chief Marketing Officer of United Robots, tells The Media Voices Podcast. “It’s actually about news publishers improving their offer, and actually securing the business. If you can offer better journalism, obviously you have a better chance at a future.”
“I’m not going to say that there has never been a single freelancer who hasn’t lost a gig because of a robot… But for the majority of publishers that we work with, the robots are part of the strategy to grow the business and improve the business,” she adds.
We’ll leave it to you to decide whether that sounds reassuring or not, but there’s no doubt that robot journalism is a space we will keep watching. As the technology improves, it will be fascinating to see how far the possibilities extend.
Speaking about what the newsroom workflow is like when robot journalism is incorporated, NDC’s Ard Boer, who joined Campbell on the podcast, said most of the effort goes into checking the messages and the articles, and seeing if they need improvement. “Because it’s used with AI technology, if there’s an error, if there’s a sentence not written really well or you want some word not used or used differently, you can put it in a feedback loop. And then the robot knows, ‘Oh, I need to do this differently the next time.’”
“That will never happen with a human – they will probably do it 10 more times! But actually, training the robots to write the articles the way you really want, that’s where the effort goes in.”
“You have to make sure that the connections are still working, but that’s a relatively small amount of time and effort, and is more of an effort from the digital department than it is from the newsroom,” Boer says.
WHAT’S NEXT FOR AI
The next frontier, Newman writes, is using AI for images and video. In particular, there is much excitement about DALL-E, a new AI model from OpenAI, revealed in January 2021, that automates original image creation from instructions you provide in text.
“This could open up a range of new possibilities, from simple story illustration to entirely new forms of semi-automated visual journalism,” Newman writes.
“The big challenge for many large media companies is serving audiences with very different needs using a monolithic website or app. AI offers the possibility of personalising the experience without diluting the integrity of the newsroom agenda by offering different versions of a story – long articles, short articles, summaries, image or video-led treatments – with much greater efficiency.”
Summarisation and smart brevity are trends that Newman expects to see more of in 2022, with news organisations experimenting with content formats for under-served audiences.
Research shows that these audiences prefer:
– Increased use of bullet-points in news articles,
– Visual stories over text,
– Mixed media story formats popularised by social media.
As an example already in operation, Newman writes of the BBC’s latest Modus prototype which uses two different NLP approaches to generate bullet point-led stories and automated captions for images in picture galleries.
“Enabling this,” he adds, “will be a new generation of modular content management systems, such as Arc from the Washington Post and Optimo from the BBC that do not base authoring around a ‘story’ but instead around ‘nested blocks’ that allow better connections across stories, making it easier to reassemble content in potentially limitless ways.”
Up until now the best models for Natural Language Processing and Generation have been focused on English. While this has been a challenge for languages like Arabic and Spanish where extra training is often needed to get the required quality, Newman expects to see faster progress over the year with publications like La Nación in Argentina and Inkyfada in Tunisia refining their own models in collaboration with academics. Programmes to share best practice such as the Journalism AI collaboration programme from LSE’s Polis and INMA’s AI webinars and showcases are also helping spread knowledge.
The Innovation in News Media World Report is published every year by INNOVATION Media Consulting, in association with FIPP. The report is co-edited by INNOVATION President Juan Señor and senior consultants Jayant Sriram and Inês Bravo.