Azure Event Hubs – Gaining the Azure Data Engineer Associate Certification

Azure Event Hubs is an endpoint for ingesting data at high volume and high frequency. There are numerous ways to move data into a datastore, such as opening a connection to a database from code running on a website or within a client application. The problem with that approach is that databases typically limit the number of concurrent connections that can be open. When that limit is hit, clients that need a connection must wait, and the wait is bounded by a timeout. If the timeout expires, the connection attempt fails and the data may be lost. The scenario in which a client connects directly to a database to store data is called a coupled, or real-time, solution. The alternative is a decoupled solution.
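The connection-limit failure can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real database driver: `ConnectionPool`, `try_write`, the limit of 2, and the 0.05-second timeout are hypothetical names and values chosen for illustration, and a semaphore stands in for the server's connection cap.

```python
import threading


class ConnectionPool:
    """Toy stand-in for a database that caps concurrent connections."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def connect(self, timeout):
        # Wait up to `timeout` seconds for a free slot.
        # Returns False if the wait timed out.
        return self._slots.acquire(timeout=timeout)

    def close(self):
        self._slots.release()


def try_write(pool, record, stored, timeout=0.05):
    """Coupled write: the client itself must obtain a connection."""
    if not pool.connect(timeout):
        return False              # timed out; the record is not stored
    try:
        stored.append(record)     # stands in for the INSERT statement
        return True
    finally:
        pool.close()


pool = ConnectionPool(limit=2)
stored = []

pool.connect(timeout=0.05)        # client A holds a connection
pool.connect(timeout=0.05)        # client B holds a connection

# Client C arrives while both slots are taken and times out.
while_full = try_write(pool, {"trade": 1}, stored)

pool.close()                      # client A disconnects
# The same write succeeds once a slot is free.
after_release = try_write(pool, {"trade": 1}, stored)
```

In this model the first attempt returns `False` and nothing is stored, which is the data-loss risk of the coupled approach unless the client adds its own retry logic.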

Placing a service such as Azure Event Hubs between a client and the database decouples the transaction between those two entities. Think of Azure Event Hubs as a messaging queue: the information that the client would normally send directly to a database is placed into a queue instead. Once the message is in the queue, the client no longer holds a connection and can continue with other work. The insertion into the queue acts as a trigger that notifies another resource, such as an Azure Function, to pull that data from the queue, process it, and insert it into the database. This decoupling reduces the chance of losing data and lowers the chance of overloading the datastore. Azure Event Hubs is designed primarily for Big Data streaming scenarios that handle billions of requests per day, for example logging every stock trade happening globally. Event Hubs is also commonly used as a decoupler in front of Azure Databricks, Stream Analytics, Azure Data Lake Storage, or HDInsight, products that are used to process the data and gather intelligence.
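The decoupled flow can be illustrated with a minimal sketch that uses only the Python standard library. It does not call the Event Hubs SDK: an in-memory `queue.Queue` stands in for the event hub, an in-memory SQLite table stands in for the database, and `send`, `drain`, and the `trades` table are hypothetical names introduced for this example.

```python
import queue
import sqlite3

hub = queue.Queue()  # in-memory stand-in for the event hub


def send(event):
    """Client side: hand the event to the queue and return.

    No database connection is opened here.
    """
    hub.put(event)


def drain(db):
    """Consumer side: pull queued events and insert them over one connection."""
    count = 0
    while True:
        try:
            symbol, price = hub.get_nowait()
        except queue.Empty:
            break
        db.execute(
            "INSERT INTO trades (symbol, price) VALUES (?, ?)",
            (symbol, price),
        )
        count += 1
    db.commit()
    return count


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE trades (symbol TEXT, price REAL)")

# A burst of 1,000 client sends; none of them touches the database.
for i in range(1000):
    send((f"SYM{i % 10}", 100.0 + i))

# The consumer processes the backlog at its own pace.
inserted = drain(db)
row_count = db.execute("SELECT COUNT(*) FROM trades").fetchone()[0]
```

The point of the design is that the burst of senders and the single database connection never meet. In a real deployment the queue would be an event hub and the consumer could be an Azure Function, as described above.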

IoT Hub

IoT Hub is like Event Hubs in that it is used to ingest large amounts of data with high reliability and low latency, but the target data producers are Internet of Things (IoT) devices, such as weather readers, automobile trackers, brain–computer interfaces (BCIs), or streetlamps, to name a few. One capability of IoT Hub is sending notifications to the IoT device in addition to receiving data from it. Consider a streetlamp that needs to be turned on at a certain time of day. IoT Hub can be configured to send that signal to the lamp at that time each day. Cloud-to-device messages travel over the protocols IoT Hub supports, such as MQTT and AMQP, both of which can also run over WebSockets. The primary difference between Event Hubs and IoT Hub has to do with the number of messages received within a given time frame and the kind of message producer sending the message.
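The two message directions in the streetlamp scenario can be modeled with a small in-memory sketch. It is not the IoT Hub SDK and involves no networking: `ToyHub`, `schedule_lamp`, the device ID `lamp-42`, and the 18:00 switch-on time are all hypothetical and exist only to show device-to-cloud and cloud-to-device traffic side by side.

```python
from collections import defaultdict, deque
from datetime import time


class ToyHub:
    """In-memory model of a hub that carries messages in both directions."""

    def __init__(self):
        self.telemetry = deque()            # device-to-cloud messages
        self.commands = defaultdict(deque)  # cloud-to-device, one queue per device

    def receive_from_device(self, device_id, payload):
        self.telemetry.append((device_id, payload))

    def send_to_device(self, device_id, command):
        self.commands[device_id].append(command)

    def next_command(self, device_id):
        # Return the oldest pending command, or None if nothing is queued.
        pending = self.commands[device_id]
        return pending.popleft() if pending else None


def schedule_lamp(hub, device_id, now, on_at=time(18, 0)):
    """Queue an 'on' command once the clock reaches the configured time."""
    if now >= on_at:
        hub.send_to_device(device_id, "on")
        return True
    return False


hub = ToyHub()
hub.receive_from_device("lamp-42", {"lumens": 0})   # the lamp reports in

sent_at_noon = schedule_lamp(hub, "lamp-42", now=time(12, 0))
sent_at_dusk = schedule_lamp(hub, "lamp-42", now=time(18, 30))
command = hub.next_command("lamp-42")               # the lamp polls for work
```

Keeping a separate command queue per device is what distinguishes this from the one-way ingestion model: the hub has to know which device each outbound message is for.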

