
Deploy your Hadoop big data cluster in just a few clicks

Deploying a big data cluster is usually a long, demanding process. OVHcloud Big Data Cluster is designed to simplify it. In under an hour, we deliver a preconfigured, ready-to-use Apache Hadoop stack.

Starting from a standard open-source Hadoop distribution, we preconfigure all the services you need to process data and secure your data flows.

Deploy the OVHcloud Big Data Cluster solution for a range of uses, including market analysis, business intelligence, IoT and even preventative maintenance. The power is yours.

Services available

Once you have deployed your cluster, you will have full access to all of the services listed below.

We base our solution on the open-source Apache Hadoop ecosystem, along with an additional security and management layer that includes:

  • A network pathway and a gateway host, to secure traffic between your cluster and the public network.
  • An identity management service (Kerberos and LDAP), WebSSH and Apache Ambari, to secure your operations.
  • Ambari software, to simplify management via a web interface.

Data flow: Sqoop, Flume
Security: Ranger, Kerberos, Knox, FreeIPA IdM
Storage: HDFS, HBase
Monitoring: Ambari, Log Search, Ambari Infra
Scheduling: Oozie
Messaging: Kafka
Processing: YARN, MapReduce2, Tez, Pig, Slider, Hive, Spark 2, Presto

 

Architecture

OVHcloud Data Analytics Platform

Usage

1. Deploy a Public Cloud project

Your cluster is based on a flexible, high-performance Public Cloud infrastructure, which is available across multiple regions. Your service is billed to match your usage, at the listed rates for the instances you use.

2. Deploy a cluster

In just a few clicks, you can deploy a full big data cluster. OVHcloud configures it all for you.

3. Log in to the interface

In less than an hour, your cluster is ready to use. You can then log in to the graphical interface and harness the power of big data.

Ready to get started?

Create an account and launch your services in minutes


Big Data Cluster billing

The Hadoop cluster is delivered preconfigured in about one hour. Billing is based on the instances and volumes you use, plus an extra charge that covers the operation of the cluster.
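
The billing rule above (instances, plus volumes, plus an operating charge) can be sketched as a small calculation. This is only an illustration: the function name, every rate, the hourly granularity and the flat form of the cluster charge are assumptions, not OVHcloud prices or billing logic. Only the instance names B2-60, B2-15 and B2-30 come from this page.

```python
from decimal import Decimal


def estimate_hourly_cost(instances, hourly_rates, volume_gb,
                         volume_rate_per_gb_hour, cluster_fee_per_hour):
    """Hypothetical estimate: instances + volumes + cluster operating charge."""
    # Cost of all instances: rate of each flavor times the number deployed.
    instance_cost = sum(hourly_rates[flavor] * count
                        for flavor, count in instances.items())
    # Cost of the attached volumes, assumed here to scale with their size.
    volume_cost = volume_rate_per_gb_hour * volume_gb
    # The extra charge is modelled as a flat hourly fee (an assumption).
    return instance_cost + volume_cost + cluster_fee_per_hour


# Placeholder rates for illustration only; check the price list for real ones.
placeholder_rates = {
    "B2-60": Decimal("1.00"),
    "B2-15": Decimal("0.25"),
    "B2-30": Decimal("0.50"),
}

# Worker, edge and master nodes of a minimal cluster. Utility and bastion
# nodes are left out because this page does not name their instance types.
minimal_instances = {"B2-60": 4, "B2-15": 1, "B2-30": 3}

estimate = estimate_hourly_cost(
    minimal_instances,
    placeholder_rates,
    volume_gb=1000,
    volume_rate_per_gb_hour=Decimal("0.001"),
    cluster_fee_per_hour=Decimal("2.00"),
)
print(f"Estimated hourly cost (placeholder rates): {estimate}")
```

Using Decimal rather than floating-point numbers avoids rounding surprises when amounts are added together.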

Need a fully-managed big data solution?

Cloudera

Are you looking for comprehensive support and a fully-managed solution, based on a Dedicated Cloud infrastructure? Discover our Managed Cloudera solution, which is suited to even the most critical uses.

 

What SLA does OVHcloud offer for accessing the Big Data Cluster service?

Although we maintain a high quality of service, this solution is not managed, and OVHcloud does not guarantee availability. For further information, please refer to the Terms & conditions.

What guarantees are offered for the resources (compute, storage or other) used by the Big Data Cluster service?

The Big Data Cluster service is built on other cloud resources that have their own SLAs. You can consult these SLAs on the respective pages of those resources.

What version of the software is deployed?

The Big Data Cluster service deploys version 2.6.2 of the Hortonworks Data Platform software suite.

What locations are available for the Big Data Cluster service?

The Big Data Cluster service is available in the following locations: France (Gravelines, Strasbourg), Germany (Frankfurt), United Kingdom (London), Poland (Warsaw), and Canada (Beauharnois).

What is the minimum cluster size?

A cluster consists of at least 12 servers, distributed as follows: 4 worker nodes, 1 edge node, 3 master nodes, 3 utility nodes, 1 bastion node. The smallest instances are B2-60 for worker nodes, B2-15 for edge nodes, and B2-30 for master nodes.

What is the maximum cluster size?

The cluster size can be up to 107 servers, including 50 worker nodes, 50 edge nodes, 3 master nodes, 3 utility nodes and 1 bastion node. The largest instances are R2-240 for edge nodes and worker nodes, and B2-120 for master nodes.
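
As a quick check of the two size limits, the arithmetic can be sketched in a few lines of Python. The node counts come from the answers above. The class and its names are hypothetical, and the sketch assumes that worker and edge counts can be chosen independently within their ranges.

```python
from dataclasses import dataclass

# Limits taken from the minimum and maximum cluster sizes quoted above.
MIN_WORKERS, MAX_WORKERS = 4, 50
MIN_EDGES, MAX_EDGES = 1, 50
# Master, utility and bastion counts are the same in both answers.
MASTERS, UTILITIES, BASTIONS = 3, 3, 1


@dataclass(frozen=True)
class ClusterLayout:
    """Hypothetical helper describing the variable node counts of a cluster."""
    workers: int
    edges: int

    def validate(self) -> None:
        # Reject layouts outside the documented ranges.
        if not MIN_WORKERS <= self.workers <= MAX_WORKERS:
            raise ValueError(f"workers must be {MIN_WORKERS}-{MAX_WORKERS}")
        if not MIN_EDGES <= self.edges <= MAX_EDGES:
            raise ValueError(f"edges must be {MIN_EDGES}-{MAX_EDGES}")

    def total_servers(self) -> int:
        # Variable nodes plus the fixed master, utility and bastion nodes.
        self.validate()
        return self.workers + self.edges + MASTERS + UTILITIES + BASTIONS


smallest = ClusterLayout(workers=4, edges=1)
largest = ClusterLayout(workers=50, edges=50)
print(smallest.total_servers())  # 12
print(largest.total_servers())   # 107
```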

What is big data?

Big data is more a concept than a technology. It involves collecting large amounts of data from multiple sources in order to analyse it.

What is Hadoop?

Apache Hadoop is a collection of software utilities that enables users to analyse large volumes of data. It is designed for large-scale deployment, and to keep data and services highly available.

What is the Hortonworks Data Platform?

Hortonworks Data Platform is a distribution from Cloudera that integrates Apache Hadoop with other components. It provides an enhanced user experience.

Do I have access to the nodes after deployment?

Yes. Your SSH key is added to every deployed server, so you can log in to each node if required.
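
As an illustration, the sketch below builds an OpenSSH command for reaching a node. It assumes that nodes are reached through the bastion node using OpenSSH's `-J` (jump host) option, which this page does not confirm. The user name and host names are placeholders, and the helper function is hypothetical.

```python
import shlex


def build_ssh_command(node, user, bastion=None):
    """Return the argument list for an ssh call to `node` (hypothetical helper)."""
    command = ["ssh"]
    if bastion:
        # -J tells OpenSSH to connect through a jump host first.
        command += ["-J", f"{user}@{bastion}"]
    command.append(f"{user}@{node}")
    return command


# Placeholder user and hosts; replace them with the values of your cluster.
command = build_ssh_command(
    node="worker-1.example.com",
    user="admin",
    bastion="bastion.example.com",
)
# Print the command in a form that can be pasted into a shell.
print(shlex.join(command))
```

The function only builds the command. To run it, pass the list to `subprocess.run` on a machine that holds the private key matching the SSH key added to the servers.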