
Announcing the GA of Cloudera DataFlow for the Public Cloud on Microsoft Azure
After the launch of Cloudera DataFlow for the Public Cloud (CDF-PC) on AWS just a few months ago, we're thrilled to announce that CDF-PC is now generally available on Microsoft Azure, allowing NiFi users on Azure to run their data flows in a cloud-native runtime.
With CDF-PC, NiFi users can import their existing data flows into a central catalog, from where they can be deployed to a Kubernetes-based runtime through a simple flow deployment wizard or with a single CLI command. CDF-PC provides a central monitoring dashboard for flow deployments and offers custom KPI monitoring and alerting, allowing customers to stay on top of what matters to them.

Figure 1: CDF-PC allows organizations to deploy their NiFi data flows to a cloud-native runtime while providing central monitoring and cataloging capabilities
The need for a cloud-native Apache NiFi service on Microsoft Azure
Without a cloud-native service to run NiFi flows on Microsoft Azure, organizations resorted to building and operating NiFi clusters on either virtual machines or their own container-based infrastructure. While Azure services like Virtual Machines, Managed Disks, Virtual Networks and Azure Kubernetes Service (AKS) make infrastructure provisioning and management easier, organizations were still responsible for configuring, securing and operating NiFi. This ultimately forced NiFi teams to spend a lot of time managing the cluster infrastructure, preventing them from building new data flows and onboarding new use cases.
As we saw a growing number of organizations wanting to run NiFi data flows on Azure but struggling with the operational challenges, it became clear that there was a need for a cloud service that takes care of infrastructure management and NiFi configuration, allowing NiFi users to focus on what matters most to them: building new data flows and ensuring that these data flows meet the business SLAs.
Solving Common Data Integration Use Cases with CDF-PC on Azure
CDF-PC helps Azure customers implement key data integration use cases that require data movement, filtering and transformation at scale. Apache NiFi's rich processor library provides Azure-focused processors for ADLS Gen2, Event Hub, Blob Storage and Cosmos DB out of the box. Additional Azure services can be easily integrated through their APIs using customizable NiFi processors like InvokeHTTP.
SIEM Optimization

Figure 2: Moving application log data from Azure Event Hub to ADLS Gen2 and SIEM systems
A common use case on Azure is SIEM optimization (SIEM = security information and event management) for analyzing application log data. Cloud applications can be configured to send their logs to a central Azure Event Hub, from where CDF-PC flow deployments pick up the log data to curate the events for the SIEM system. At the same time, the events can be stored in ADLS Gen2 storage for custom analysis outside of the SIEM application. Using NiFi for this use case helps reduce the costs of the SIEM system and establishes a standard tool that can take any application log data and prepare it for the SIEM system.
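To make the curation idea concrete, here is a minimal Python sketch of the routing logic, not a NiFi flow definition. It assumes JSON log events with a hypothetical "severity" field and a hypothetical severity ranking; every event is kept for the archive (standing in for ADLS Gen2), while only events at or above a threshold are forwarded to the SIEM system.

```python
import json

# Hypothetical severity ranking; real applications may use different labels.
SEVERITY_RANK = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def route_events(raw_events, siem_min_severity="WARN"):
    """Split JSON log events into an archive list and a SIEM list.

    Every event goes to the archive; only events whose severity is at or
    above the threshold are forwarded to the SIEM system.
    """
    threshold = SEVERITY_RANK[siem_min_severity]
    archive, siem = [], []
    for raw in raw_events:
        event = json.loads(raw)
        archive.append(event)
        # Events without a known severity are treated as INFO (an assumption).
        rank = SEVERITY_RANK.get(event.get("severity", "INFO"), SEVERITY_RANK["INFO"])
        if rank >= threshold:
            siem.append(event)
    return archive, siem


sample = [
    '{"app": "checkout", "severity": "INFO", "msg": "order placed"}',
    '{"app": "checkout", "severity": "ERROR", "msg": "payment failed"}',
    '{"app": "search", "severity": "DEBUG", "msg": "cache hit"}',
]
archive, siem = route_events(sample)
```

Forwarding fewer, more relevant events is one way such a flow could lower SIEM ingest volume; the actual filtering criteria depend on the organization.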
Processing Streaming Data

Figure 3: Moving data from Azure Event Hub to ADLS Gen2
Modern applications often provide streaming interfaces to send transaction data in real time to external systems for analysis. Apache Kafka deployments are commonly used to buffer these messages for downstream consumption. Customers can use Streams Messaging clusters in CDP Public Cloud to create enterprise-grade Kafka deployments on Microsoft Azure. Since not every downstream application is able to read directly from Kafka topics, CDF-PC flow deployments are often used to read and curate the events for analysis by downstream systems. A common integration point for Azure services is ADLS Gen2, for which NiFi provides out-of-the-box connectivity options. In this use case, NiFi deployments on CDF-PC are the bridge between streaming data and services relying on data being available in ADLS Gen2.
Data Ingest for Microsoft Sentinel

Figure 4: Moving data from network infrastructure devices to Microsoft Sentinel
Microsoft Sentinel is an Azure-native SIEM solution that organizations use for attack detection, threat visibility, proactive hunting, and threat response. While Microsoft Sentinel provides point integrations for many source systems, not every vendor or product is supported and can be directly connected. CDF-PC flow deployments can help bridge the gap for unsupported devices and applications by turning the raw device log data into a format that Microsoft Sentinel understands and ingesting it through its HTTP API.
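The transformation step can be illustrated with a small Python sketch. It assumes a hypothetical space-separated device log format (timestamp, device name, action, then key=value pairs) and produces a JSON array that could serve as the body of an HTTP request. The field names are illustrative, and the call to the Sentinel API itself (endpoint, authentication, headers) is deliberately left out.

```python
import json


def device_log_to_record(line):
    """Parse one raw log line in the assumed format into a flat dictionary.

    Assumed (hypothetical) format:
        <timestamp> <device> <action> key=value key=value ...
    """
    timestamp, device, action, *pairs = line.split()
    record = {"timestamp": timestamp, "device": device, "action": action}
    for pair in pairs:
        # Skip tokens that do not look like key=value.
        if "=" in pair:
            key, value = pair.split("=", 1)
            record[key] = value
    return record


def to_json_payload(lines):
    """Convert raw log lines into a JSON array suitable as an HTTP body."""
    return json.dumps([device_log_to_record(line) for line in lines])


raw_line = "2022-03-01T12:00:00Z fw01 DENY src=10.0.0.5 dst=8.8.8.8"
payload = to_json_payload([raw_line])
```

In a NiFi flow, the same parsing and conversion would typically be handled by record-oriented processors before a processor such as InvokeHTTP sends the result.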
Getting a head start with ReadyFlows
To help organizations that aren't as experienced with NiFi, CDF-PC comes with an integrated ReadyFlow Gallery which makes flow deployments for common use cases easy. Once users have identified their ReadyFlow of choice, all they need to do is start the Deployment Wizard to provide connection parameters for source and destination systems, and the first flow deployment will be up and running within minutes. Today, CDF-PC supports Azure-optimized ReadyFlows to move data from Kafka to ADLS and between two different ADLS locations. In the future we'll provide more Azure-optimized ReadyFlows to cover the use cases mentioned above.

Figure 5: Discover Azure-focused data flows in the integrated ReadyFlow Gallery
Leveraging key Microsoft Azure technologies to provide elastic, auto-scaling data flows
CDF-PC is powered by Microsoft Azure services to provide a scalable infrastructure for NiFi data flows. CDF-PC manages the lifecycle of these infrastructure services, freeing NiFi administrators from infrastructure maintenance tasks such as performing upgrades or applying hotfixes for security issues.

Figure 6: CDF-PC high-level architecture on Microsoft Azure
As Figure 6 shows, CDF-PC creates and manages an AKS cluster in a virtual network that consists of two node pools: one for running Cloudera infrastructure services and one for running CDF-PC and the actual NiFi flow deployments. Each NiFi flow deployment is created in its own Kubernetes namespace for resource isolation purposes. The NiFi flow deployments can scale up and down based on CPU utilization, while AKS auto-scales the node pools based on the resource utilization of scheduled pods within the cluster. CDF-PC also relies on ADLS Gen2 for storing application and flow deployment log data and an Azure Postgres database to store application data.
When CDF-PC is first enabled, users can configure the minimum and maximum number of nodes in the CDF-PC node pool, which will scale up and down within those boundaries as required.
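As a rough illustration of utilization-based scaling within configured boundaries, the following Python sketch applies the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler and clamps the result to a minimum and maximum. Whether CDF-PC or AKS uses exactly this rule is an assumption here; the function and parameter names are hypothetical.

```python
import math


def desired_size(current_size, cpu_utilization, target_utilization,
                 min_size, max_size):
    """Return a new size proportional to observed CPU load, kept within bounds.

    Uses the proportional rule
        desired = ceil(current * observed / target)
    and then clamps the result to [min_size, max_size].
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    raw = math.ceil(current_size * cpu_utilization / target_utilization)
    return max(min_size, min(max_size, raw))


# 3 nodes at 90% CPU with a 60% target: scale out.
scaled_out = desired_size(3, 90, 60, min_size=1, max_size=10)
# 3 nodes at 20% CPU: scale in, but never below the configured minimum.
scaled_in = desired_size(3, 20, 60, min_size=2, max_size=10)
# Heavy load: scale out, but never above the configured maximum.
capped = desired_size(8, 120, 60, min_size=1, max_size=10)
```

The clamping step is what the minimum and maximum settings control: load determines the direction and size of the change, and the boundaries cap it.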
CDF-PC supports different networking setups and allows users to configure which of the available subnets in a virtual network should be used for the AKS cluster, whether users should be able to access CDF-PC through a public endpoint, and whether access to CDF-PC should be limited to a list of CIDR ranges.
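For readers less familiar with CIDR-based access restrictions, here is a minimal Python sketch of what such a check amounts to, using only the standard library's ipaddress module. The function name and the example ranges are hypothetical and do not reflect how CDF-PC enforces the setting internally.

```python
import ipaddress


def is_allowed(client_ip, allowed_cidrs):
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    return any(address in network for network in networks)


# Example allow list: an internal range and a documentation-only public range.
allow_list = ["10.0.0.0/16", "203.0.113.0/24"]

inside = is_allowed("10.0.5.7", allow_list)
outside = is_allowed("192.168.1.1", allow_list)
```

An address is admitted as soon as it matches one range, so the allow list behaves as a union of the configured networks.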

Figure 8: Networking settings when enabling CDF-PC for an Azure environment
CDF-PC's architecture and configurable options during service enablement make it flexible enough to work in any Azure setup while abstracting the complexity of the underlying infrastructure through simple wizards.
Summary & Getting Started
With the General Availability of Cloudera DataFlow for the Public Cloud on Azure, we're entering a new era of running Apache NiFi data flows in multi-cloud environments. For the first time ever, Apache NiFi users can manage and monitor data flows running on Microsoft Azure or AWS from a single management console. CDF-PC takes care of infrastructure management, abstracts the differences between cloud providers and allows NiFi users to truly focus on developing and running their data flows.
Take our interactive product tour to get an impression of CDF-PC in action or sign up for a free trial.