Big data applied to books. The Catalan startup Tekstum analyzes thousands of readers' tweets and reviews using an algorithm. As explained by one of its founders, Marc Santandreu, the goal is to analyze everything that is said about a book: "We carry out a scientific analysis of comments on social media, blogs, literary platforms... Our algorithm is based on big data and artificial intelligence in their facet of natural language processing. The algorithm determines the feeling, the emotion conveyed by the book to its readers."
What do readers think of the book The Children Act by Ian McEwan? If we take a look at the startup's website (with 3,000 users), we can see how readers raved about the latest book by the British author: "good, original, perfect."
La guerra civil contada a los jóvenes (The Civil War Told to Young People) by Arturo Pérez-Reverte fares slightly worse: even though there are more positive than negative comments, some readers describe it as "shallow" or "weak." Bestsellers such as The Girl on the Train by Paula Hawkins are punished on social media and in book reviews. Excellent sales figures are not the same as satisfied readers.
According to Santandreu, the sentiment analysis tool establishes "the polarity of opinions about a book (whether they are positive, negative or neutral) and identifies key words that readers use to define their reading experience; the key word that defines the book's emotional impact." This is not easy to do, because a phrase such as "demanding read" can be negative or positive depending on the reader. By contrast, "addictive" is clearly positive. The cofounder of Tekstum explains that the tool identifies more than 10,000 terms associated with the world of books.
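To make the idea of polarity and key words concrete, here is a minimal sketch of a lexicon-based approach in Python. It is not Tekstum's algorithm, which the article does not describe in detail; the tiny word lists and the function names `polarity` and `top_keywords` are assumptions made for illustration, using terms quoted in this article.

```python
import re
from collections import Counter

# Hypothetical toy lexicon built from words quoted in the article.
# The real tool is said to recognize more than 10,000 book-related terms.
POSITIVE = {"good", "original", "perfect", "addictive"}
NEGATIVE = {"shallow", "weak"}
LEXICON = POSITIVE | NEGATIVE


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(review):
    """Label a review positive, negative or neutral by counting lexicon hits."""
    words = tokenize(review)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def top_keywords(reviews, n=3):
    """Return the n lexicon terms that readers use most often."""
    counts = Counter(w for r in reviews for w in tokenize(r) if w in LEXICON)
    return [word for word, _ in counts.most_common(n)]


reviews = [
    "Good, original, perfect.",
    "A bit shallow and weak in places.",
    "A demanding read.",
]
labels = [polarity(r) for r in reviews]
```

In this sketch "A demanding read." comes out as neutral simply because "demanding" is not in the toy lexicon. That is exactly the kind of ambiguous term a word list cannot settle on its own, and which a production system would need context to resolve.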
Language and sentiment are among the main focuses of the analysis, but the tool also measures the impact of a specific author: for example, it looks at reactions to a trilogy and identifies the book that was most successful among readers.
Several publishing houses are already working with the Tekstum API "to follow up on their books and the books of competing publishers and target new releases; and to add to online sales platforms," explained Santandreu.
Tekstum's team includes Lauren Romeo (PhD in Computational Linguistics and Science Director) and Juanjo Fernández (Head of Development). The tool is also aimed at bookshops.
Santandreu cited Amazon's physical bookshops in Chicago and Seattle as an example of why data matters: "To offer a different experience to their customers, these bookshops use all of the information contained in thousands of reviews. When you go into the store, in addition to the books, you see a quote from the most useful review as well as Amazon's rating. All of this information is used to help readers choose a book. We use this raw material but we go one step further, since we analyze this information," he explained.
Are they hoping to become Netflix? "I hope we get there," said the CEO. He praised the online series and movie platform for its fine-grained categorization: "It contains more than 70,000 micro-segments while there are only 3,000 categorizations for books."
Santandreu believes that his company's tool makes it possible to cope with the sheer volume of books published today. "Faced with so many books, readers need tools to help them discover their next choices as well as recommendation tools that are different from the existing solutions, which are based on user searches or previous purchases," he said. But he stressed that big data cannot know everything. "Who would have thought that Leicester would win the UK's Premier League? Big data helps us a lot with forecasts but sometimes you need to know how to interpret things." He concluded that "there are always people behind algorithms."