Technical blog

Codex Datalake Engine:

on the path to Cloudera CDP Private Cloud certification

Through years of collaboration, Cloudera and Atos have built a strong partnership. It first led to Atos Codex Datalake Engine, a fully integrated and certified Cloudera PVC Base (mostly known as Cloudera Distributed Hadoop, or CDH) on Atos BullSequana S servers, bare metal or virtualized.

For many years, Atos and Cloudera have worked together to build a solid and trustworthy data architecture and to deliver the most complete, secure, industrial, and qualified Datalake solution on the market.

This has enabled the two companies, following joint studies and pre-sales, to carry out successful projects.

Cloudera continues to provide cutting-edge technologies with its new products. Private Cloud introduced the Experiences Compute Service (ECS), which bridges the Hadoop ecosystem and the Kubernetes world. ECS is an alternative to OpenShift, which requires you to deploy and manage the Kubernetes infrastructure yourself.

Cloudera ECS offers the following functions:

  • Resource management / scheduling
  • Platform monitoring – utilization, health
  • Fault-tolerance – both infra & processes
  • Compute-side caching and storage
  • Quota Management (coming soon)
  • Log collection

Atos Datalake teams could not let this pass unnoticed, so we re-engineered our Atos Datalake Labs to address this new challenge (and more).

The new extended lab must fulfill the following requirements:

  • Easy integration of new servers and/or new networking hardware (starting with the full range of Atos BullSequana SA servers)
  • Multiple isolated environments with different software versions
  • Automated hardware provisioning
  • Easy automation of every new integration
  • Dedicated or shared AD/Kerberos or LDAP/Kerberos per environment
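
As an illustration of the isolation requirement, here is a minimal Python sketch of how lab environments could be described and checked for overlaps. It is not the Atos automation itself: the LabEnvironment class, its field names, the host names, VLAN ids and version strings are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabEnvironment:
    """One isolated lab environment (all field names are illustrative)."""
    name: str
    software_version: str   # version under test in this environment
    vlan_ids: frozenset     # VLANs reserved for this environment
    nodes: frozenset        # host names provisioned for this environment
    directory: str          # "dedicated" or "shared" AD/LDAP + Kerberos


def isolation_conflicts(environments):
    """Return a description of every VLAN or node shared by two environments.

    Sharing either one is treated here as breaking isolation; this rule
    is an assumption made for the sketch.
    """
    conflicts = []
    envs = list(environments)
    for i, first in enumerate(envs):
        for second in envs[i + 1:]:
            shared_vlans = first.vlan_ids & second.vlan_ids
            shared_nodes = first.nodes & second.nodes
            if shared_vlans:
                conflicts.append(
                    f"{first.name}/{second.name}: shared VLANs {sorted(shared_vlans)}"
                )
            if shared_nodes:
                conflicts.append(
                    f"{first.name}/{second.name}: shared nodes {sorted(shared_nodes)}"
                )
    return conflicts
```

An empty result means that no two environments share a VLAN or a node. A check of this kind could run before provisioning, so that a new environment is rejected before it touches hardware already in use.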

After setting up the rack in the lab, we integrated all nodes and networks. For the software and logical prerequisites, we reused existing, proven automation scripts and modernized them to integrate CDP PVC ECS. Each new Cloudera product integration (including new releases) leads to a new or updated environment and test plan with non-regression reports.

With this architecture, Atos can certify the entire Atos Codex Datalake Engine stack from hardware to software. In terms of maintenance and support, the Atos labs enable fast issue reproduction and handling without impacting production clusters.
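
The idea behind a non-regression report can be sketched in a few lines of Python. This is only an assumed shape for such a report, not the format Atos uses: the function name and the test names in the usage note are hypothetical, and results are simplified to pass/fail booleans.

```python
def find_regressions(baseline, candidate):
    """Return the tests that passed in `baseline` but not in `candidate`.

    Both arguments map a test name to True (passed) or False (failed).
    A test missing from `candidate` is counted as a regression, which is
    a deliberately strict assumption of this sketch.
    """
    regressions = []
    for test_name, passed_before in sorted(baseline.items()):
        if passed_before and not candidate.get(test_name, False):
            regressions.append(test_name)
    return regressions
```

For example, comparing {"tls_enabled": True, "job_submit": True} from a previous release with {"tls_enabled": True, "job_submit": False} from a new one reports job_submit as the only regression. Tests that already failed in the baseline are not reported, since they are not new breakages.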

Network topologies

  • Dedicated OOB network
  • Multiple VLANs for environment isolation and to simulate access or customer networks

Hardware

  • Atos servers
    • BullSequana SA10
    • BullSequana SA10EL
    • BullSequana SA20
    • BullSequana SA20G

Networking

  • Mellanox 10/25/40/50/100Gbps switches
  • Dell 10/25Gbps switches
  • Juniper 10Gbps switches

Automation stack

  • Provisioning servers (PXE)
  • Ansible + in-house development

Security

  • HDFS TDE encryption on disks
  • Authentication with external AD/LDAP/Kerberos
  • Full stream encryption with SSL/TLS
  • Data services/environments isolation with containers

BullSequana SA server range
Specification and usage

Cloudera ECS comes with new paradigms brought by Kubernetes, so Atos had to define a new reference architecture for Codex Datalake Engine based on BullSequana SA.

Control plane

  • 1 BullSequana SA10 for POC / Dev
  • 3 BullSequana SA10 for production with high availability

Characteristics

  • 1 x AMD CPU 16 cores
  • 128 GB of memory
  • 10 SSD disks
  • 2 x 25 Gbps network links

Worker

  • BullSequana SA20 with SSD
  • BullSequana SA20G for GPU-ready workers
  • Number depends on the number of data services

Characteristics

  • 2 x AMD CPU up to 64 cores
  • Up to 2TB of memory
  • Up to 24 SSD disks
  • 2 x 25 Gbps network links
  • Up to 3 GPUs (e.g. A100 80GB)
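
The sizing rules above can be captured as a small validation sketch in Python. The control plane counts and the worker maxima come from the lists above; everything else (the names, the "dev"/"production" profile labels, and reading 2TB as 2048 GB) is an assumption made for the example, not part of the Atos reference architecture.

```python
from dataclasses import dataclass

# Control plane node counts from the reference architecture above.
CONTROL_PLANE_NODES = {"dev": 1, "production": 3}

# Worker maxima from the characteristics above.
MAX_WORKER_MEMORY_GB = 2048   # "up to 2TB", read here as 2048 GB (assumption)
MAX_WORKER_SSD_DISKS = 24
MAX_WORKER_GPUS = 3


@dataclass(frozen=True)
class WorkerSpec:
    """Hypothetical description of one worker node."""
    memory_gb: int
    ssd_disks: int
    gpus: int = 0


def check_layout(profile, control_plane_count, workers):
    """Return a list of problems; an empty list means the layout fits."""
    problems = []
    expected = CONTROL_PLANE_NODES[profile]
    if control_plane_count != expected:
        problems.append(
            f"{profile} expects {expected} control plane node(s), "
            f"got {control_plane_count}"
        )
    for index, worker in enumerate(workers):
        if worker.memory_gb > MAX_WORKER_MEMORY_GB:
            problems.append(f"worker {index}: memory above {MAX_WORKER_MEMORY_GB} GB")
        if worker.ssd_disks > MAX_WORKER_SSD_DISKS:
            problems.append(f"worker {index}: more than {MAX_WORKER_SSD_DISKS} SSD disks")
        if worker.gpus > MAX_WORKER_GPUS:
            problems.append(f"worker {index}: more than {MAX_WORKER_GPUS} GPUs")
    return problems
```

A production layout with three control plane nodes and workers within the limits returns an empty list. The number of workers is deliberately not checked, since it depends on the number of data services.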

The new Atos Datalake Labs were used to validate the early, first and latest releases of Cloudera PVC ECS. We decided early on to fully share the environment with Cloudera so that we could work as a single team.

All Cloudera PVC ECS Data Services have been tested with various setups, including security-specific ones (hardware or software), with the support of Cloudera teams who were eager to learn about issues and difficulties and to help us.

As a result, Cloudera got early insights into their new product in a simulated customer environment in our labs. This includes:

  • Regular feedback from Atos Datalake teams to Data Services leaders/experts about user and admin experience
  • Direct access to the labs during validation/certification phases
  • Testing use cases delivered for Data Services validation
  • Automation examples provided

Here is a subset of the test plan used during the CDP PVC ECS validation.

Overall tests and technical product feedback

Installation-wise, everything went fine except for one bug related to hardware with more than 32 physical cores, which was acknowledged and taken into account by Cloudera engineering teams.

Regarding the security setup, we noticed that it required some manual configuration, mainly due to the Kubernetes security architecture (the security setup will be improved in future releases).

Another topic was container logs: some are not exposed to the user or administrator, so it can be difficult to search for information during investigations. This has been communicated to Cloudera, and log exposure will be adjusted.
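
A lab preflight step could flag such hardware before installation. The following Python sketch is only an illustration under stated assumptions: it does not reproduce the bug itself, whose details are not described here, and the function names are hypothetical. It counts physical cores from the text printed by the Linux command lscpu -p=CORE,SOCKET, which lists one line per logical CPU.

```python
def count_physical_cores(lscpu_parse_output):
    """Count distinct physical cores in `lscpu -p=CORE,SOCKET` output.

    Each data line is "core_id,socket_id" for one logical CPU, so hardware
    threads of the same core appear as duplicate pairs and are counted once.
    Lines starting with "#" are header comments and are skipped.
    """
    cores = set()
    for line in lscpu_parse_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        core_id, socket_id = line.split(",")[:2]
        cores.add((socket_id, core_id))
    return len(cores)


def exceeds_core_threshold(lscpu_parse_output, threshold=32):
    """Return True when the node has more than `threshold` physical cores."""
    return count_physical_cores(lscpu_parse_output) > threshold
```

On a node, the input text could be obtained with subprocess.run(["lscpu", "-p=CORE,SOCKET"], capture_output=True, text=True, check=True).stdout. Counting (socket, core) pairs rather than logical CPUs keeps the check independent of whether simultaneous multithreading is enabled.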

On the other hand, using the Data Services was very intuitive, especially for the topics below:

  • CML: project creation, user management, and collaboration between users
  • CDE: creating and scheduling workload jobs, managing Python environments
  • DW: managing data warehouses
  • Monitoring: global overview of the K8s cluster

While there is always room to improve a product, Cloudera PVC ECS is very promising for a first release.

Conclusion

To conclude, this certification has been a joint effort with Cloudera engineering teams.

The overall process took some time and led to stronger collaboration between multiple engineering teams at Atos and Cloudera (a lot of very different and pleasant people 😊).

As before, we built a very good synergy together, which helped us get an early overview and quickly learn the new Cloudera PVC ECS concepts. It was a full, in-depth expertise onboarding on the new Cloudera components and their use of Kubernetes, with bidirectional communication as we provided almost daily feedback.

This means that we are now able to certify any new version of CDP Private Cloud (Base & ECS) in a matter of days and to provide direct support to customers.

Our strategic partnership with Cloudera was key to accomplishing Cloudera CDP Private Cloud certification in less than one month. It allows Atos to deploy Cloudera CDP Private Cloud in customer environments in a few days, without worrying about integration and testing.