Sunday, 25th August 2019

Democratising science through Big Data

A decade ago, biomedical scientist Tim El-Sheikh found himself spending countless hours searching through research papers and scientific journals for sources and citations.

The constant hunt was so time-consuming that he began to wonder why there wasn’t a more efficient way to get the information he needed.

“I wanted to know how scientists could gain access to the most critical and up-to-date information without spending large amounts of time finding it,” El-Sheikh said.

He eventually decided to build the solution himself. Launched in 2009, Scicasts is a single, searchable resource for the massive stream of information the scientific community needs access to. For a sector like medical science, which has been cautious about using the internet as an information-sharing tool and has relied on traditional ways of working, this was a disruptive approach.

El-Sheikh started out with a simple blog, where he began indexing a network of research papers, journals and scientific news. By applying cognitive computing, he began saving users hours of research, while offering them a platform to share and network with peers and decision makers.

“My goal is to democratise science using the power of the cloud. Never before has this type of information been available so conveniently, anywhere across the globe, for anyone who needs it,” El-Sheikh said. Within a few short years, Scicasts has begun to realise this vision, attracting more than 300,000 users across 135 countries.

Scicasts processes a large percentage of the 1.5 million research papers published every year, which makes it a challenge to keep up with the sheer amount of data the site needs to compile for its users. El-Sheikh, founder and CEO, realised he needed to tap into big data analytics if he was going to take the company to the next level.

Rather than manage this time-intensive task internally and risk shifting focus away from growing the firm, Scicasts has approached its cloud vendor, Rackspace, to tap into its expertise in harnessing the power of Apache Hadoop in Q1 2016. This will allow the company to make the most of its data by indexing it more efficiently, making information even easier for users to navigate and consume. The majority of the company’s revenue comes through its marketing operations, whether that be website ads or market research produced from the industry data Scicasts collects. Implementing Hadoop will enhance both of these offerings.

When it comes to the market research, Hadoop offers Scicasts the ability to identify emerging scientific trends across the globe and areas where additional research is needed — something El-Sheikh says would have never been possible on the same level before.
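As a rough illustration of the kind of counting job Hadoop distributes across many machines, the sketch below tallies how often each topic appears per year in a map step and a reduce step, then compares two years to spot a rising topic. It runs on a single machine, and the sample records, the topic names and the helpers `map_paper`, `reduce_counts` and `growth` are all hypothetical; nothing here is taken from Scicasts’ actual implementation.

```python
from collections import Counter
from itertools import chain

# Hypothetical sample records: (publication year, topic keywords).
PAPERS = [
    (2014, ["genomics", "crispr"]),
    (2015, ["crispr", "immunotherapy"]),
    (2015, ["crispr", "genomics"]),
    (2015, ["immunotherapy"]),
]


def map_paper(paper):
    """Map step: emit one ((topic, year), 1) pair per keyword."""
    year, topics = paper
    return [((topic, year), 1) for topic in topics]


def reduce_counts(pairs):
    """Reduce step: sum the counts for each (topic, year) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def growth(totals, topic, earlier, later):
    """Change in paper count for a topic between two years."""
    return totals[(topic, later)] - totals[(topic, earlier)]


totals = reduce_counts(chain.from_iterable(map_paper(p) for p in PAPERS))
```

Calling `growth(totals, "immunotherapy", 2014, 2015)` compares the two yearly counts for that topic. On a real cluster the same map and reduce logic would be spread across many nodes rather than run in one process.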

Another benefit that will come from the successful management of Scicasts’ data will be additional revenue generated by being able to offer marketers more tailored ads on the website. For instance, if a user is looking at a research paper about a particular topic, a related product will be advertised.
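The matching idea in that example can be sketched in a few lines. This is a minimal illustration only: the ad inventory, the topic labels and the `pick_ad` helper are assumptions made for the sketch, not a description of how Scicasts serves ads.

```python
# Hypothetical ad inventory, keyed by research topic.
ADS = {
    "genomics": "DNA sequencing kit",
    "microscopy": "Confocal microscope",
}


def pick_ad(paper_topics, ads=ADS):
    """Return the ad for the first paper topic that has one, else None."""
    for topic in paper_topics:
        if topic in ads:
            return ads[topic]
    return None
```

For a paper tagged with the topics `["crispr", "genomics"]`, `pick_ad` returns the genomics product; a paper with no matching topic gets no targeted ad.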

To provide the data mining manpower needed to make this all work, El-Sheikh has recruited several mathematics experts from the University of Leicester, a leading public research university in the UK, to ensure the Hadoop solution runs smoothly and its potential benefits are realised.

Much of Scicasts’ success is owed to its reliable web presence. As it was getting started, El-Sheikh was feeling the pressure of having to keep up with the site’s rapid increase in web traffic and user growth.

“When someone visits Scicasts for the first time, I want that experience to be flawless. To achieve this we needed IT infrastructure behind the site that would be resilient whilst offering scalability as demand grew within a short space of time,” El-Sheikh said.

One problem was that the firm didn’t have a great deal of in-house IT expertise to put this solution in place and maintain it. El-Sheikh and his colleagues are scientists, and that’s where they wanted their focus to remain, rather than on managing servers. That’s when Scicasts turned to Rackspace, where a dedicated team, which El-Sheikh has dubbed his “virtual CTO,” migrated the site to a public cloud powered by OpenStack, the open source platform Rackspace founded with NASA in 2010.

Darren Norfolk, MD of Rackspace UK, said: “Despite having a relatively lean in-house team, Scicasts has lofty ambitions which can only be achieved with an effective IT infrastructure in place.”

“Our managed cloud has served the company well because it can rely on us to maintain the IT infrastructure so it can retain a focus on growing the business and launching its exciting new big data capabilities. The Hadoop implementation will take Scicasts’ offering to a new level by giving users an easy way to find the scientific information they need while opening up new revenue streams for the company.”

After migrating to the Rackspace solution, the difference was immediately apparent.

“In terms of support, Rackspace has been a one stop shop for us,” El-Sheikh said. “I can call them for answers to all of my technical questions and they’ve always provided the answers I need. The technology itself has been a crucial part of our development too, as it’s given us access to crazy compute power and scalability.”

Scicasts’ rapid growth will continue in 2016, when it sets up a physical presence in Asia, engaging with the local scientific community and tapping into new bodies of research, some of which will be made available in certain regions for the first time.

And although it was not his original aim, El-Sheikh also foresees the Scicasts model he created being used in other industries.

“I’ve already received enquiries from people asking if a similar product could be created for sectors like economics and HR. These are opportunities for expansion which I’m keen to explore, safe in the knowledge that our IT infrastructure won’t fall over. There is a vast amount of knowledge and information out there, beyond scientific research, which needs to be democratised so I’m optimistic about the future.”
