With a vision to transform both its overall business and its data practice, Majid Al Futtaim (MAF), in collaboration with Publicis Sapient, set out to design and build a core platform that drives real-time intelligence across all its business ecosystems through rapid data integration. This involved collecting data from various sources, managing it, and exposing it to business units so that they could use it for hyper-personalization and customer-centric promotions.
To achieve this transformation, a centralized data lake platform was built on AWS: a future-ready, scalable, decoupled, modular, and agile solution designed to drive great consumer experiences.
The solution leveraged AWS cloud services and DevOps automation tools to create new infrastructure for test purposes and tear it down on demand. An EKS cluster hosts the microservices built for the platform and scales on demand. Elasticsearch and a graph database store the metadata for tables, schemas, APIs, and so on; this metadata can be configured through the Meta Manager to discover and orchestrate how data is ingested, stored, and consumed. Kafka streams the incoming data into AWS S3 and EMR, across hundreds of topics and around 60 TB of data. Once consumed, the data is stored in RDS or Vertica data stores via APIs, OLTPs, or other mechanisms for consumption by data analysts and data scientists, who use it for analysis, reporting dashboards, and machine-learning-based prediction and forecasting.
DevOps practices and methodologies were used to automate infrastructure provisioning, microservice builds and deployments, and the security, logging, and monitoring of the product. This automation played an important role in the overall solution and in achieving its outcomes.
Git is the source of truth for all Kubernetes deployments. The core idea of GitOps is to keep a Git repository that always contains declarative descriptions of the infrastructure currently desired in the production environment, together with an automated process that makes the production environment match the state described in the repository. To deploy a new application or update an existing one, you only need to update the repository; the automated process handles everything else.
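The reconciliation idea behind GitOps can be illustrated with a minimal sketch. This is not the tooling used on the MAF platform; it is a toy model, assuming the desired state (from the repository) and the live state (from the cluster) can each be represented as a mapping from a resource name to its declared spec. The function names `plan` and `reconcile`, and the service names and image tags in the example data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str  # "create", "update", or "delete"
    name: str  # resource the action applies to


def plan(desired: dict[str, str], live: dict[str, str]) -> list[Action]:
    """Compare the declared state with the live state and list the changes needed."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(Action("create", name))
        elif live[name] != spec:
            actions.append(Action("update", name))
    # Anything running that is no longer declared in the repository is removed.
    for name in sorted(live):
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


def reconcile(desired: dict[str, str], live: dict[str, str]) -> dict[str, str]:
    """Apply the plan to a copy of the live state and return the result."""
    result = dict(live)
    for action in plan(desired, live):
        if action.verb == "delete":
            del result[action.name]
        else:
            result[action.name] = desired[action.name]
    return result


# Hypothetical example data: what the repository declares vs. what is running.
desired = {
    "ingest-api": "image: ingest-api:1.4",
    "meta-manager": "image: meta-manager:2.0",
}
live = {
    "ingest-api": "image: ingest-api:1.3",
    "legacy-job": "image: legacy-job:0.9",
}

for action in plan(desired, live):
    print(action.verb, action.name)

converged = reconcile(desired, live)
```

After one pass the live state equals the declared state, and a second call to `plan` returns no actions. That idempotence is what lets the automated process run continuously: a commit to the repository is the only trigger needed for a change.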
The stack used for GitOps is as below:
To learn more about our work with Majid Al Futtaim, please see "How Majid Al Futtaim Transformed Their Data Practice into an Engine for Business Growth" on the Publicis Sapient website.