MapR Technologies to educate and demonstrate innovation at Hadoop Summit 2014 in Amsterdam

Ted Dunning and Michael Hausenblas to present data science educational sessions on commercially important algorithms and predictive analytics with Hadoop.

MapR Technologies, Inc. will be presenting at the upcoming Hadoop Summit, the leading conference for the Apache Hadoop community, in Amsterdam this April.
MapR Technologies is a platinum sponsor of the two-day event, which runs from the 2nd to the 3rd of April and features many of the Apache Hadoop thought leaders showcasing successful Hadoop use cases, sharing development and administration tips and tricks, and educating organisations about how best to use Apache Hadoop as a key component within their enterprise data architecture.


On Wednesday 2nd April, Ted Dunning, Chief Application Architect at MapR, will run a session on “How to Tell Which Algorithms Really Matter,” as part of the data science educational track. Ted contributes to several Apache open source projects including Hadoop, ZooKeeper, Mahout, Drill and Storm, and is a committer and PMC member for Mahout, Drill and ZooKeeper.


Ted’s session looks at why the set of algorithms that matter theoretically is different from the set that matters commercially. Commercial importance often hinges on ease of deployment, robustness against perverse data and conceptual simplicity. Often, even accuracy can be sacrificed for these other goals. Commercial systems also often live in a highly interacting environment, so off-line evaluations may have only limited applicability. In the session, Ted describes several commercially important algorithms, such as Thompson sampling (aka Bayesian Bandits), result dithering, on-line clustering and distribution sketches, and explains what makes these algorithms important in industrial settings.


Also on Wednesday, as part of the data science track, Michael Hausenblas, chief data engineer within EMEA for MapR, will present two sessions. The first, “Applying the Lambda Architecture,” aims to answer questions including: Which Apache Hadoop ecosystem components are useful for which layer in the Lambda Architecture? What is the impact on human fault tolerance? Are there good practices available for using certain Apache Hadoop ecosystem components in the three-layered Lambda Architecture?


To round off the Wednesday data science track, Michael will also present “Predictive analytics with Hadoop”. As a contributor to Apache Drill, a distributed system for interactive, ad-hoc analysis and query of large-scale datasets, Michael will provide real-world use cases from several different industries, including financial services, online advertising and retail. The session will also discuss the open source technologies and best practices for implementing predictive analytics with Hadoop, ranging from data preparation and feature engineering to learning and making real-time predictions. The topics and technologies covered will include a deep dive on recommender systems that showcases the use of Mahout and Solr to deliver both batch and real-time solutions.


In early February, MapR announced the availability of the MapR Sandbox for Hadoop, a virtualized environment containing the MapR Distribution for Apache Hadoop that enables users to begin exploring and experimenting with Hadoop in less than five minutes. The MapR Sandbox provides a complete and fully configured virtual machine installation of the MapR Distribution along with several point-and-click tutorials for developers, analysts, and administrators.


MapR also announced its support for Hadoop 2.0 with YARN, combining YARN's flexible resource management with the reliability and real-time capability of the MapR data platform. The upgrade enables organizations to run the Hadoop MapReduce 1.x and YARN schedulers on the same nodes in the cluster simultaneously, giving MapReduce 1.x users an easy, low-risk path to the new Hadoop scheduler.
 
