Azure Stream Analytics – Gaining the Azure Data Engineer Associate Certification

Azure Stream Analytics (ASA) can analyze data flowing in from IoT devices or from numerous other kinds of message producers. Event Hubs and IoT Hub are what you use to ingest those messages at scale. Those messages are placed into a queue and then processed by a consumer that has subscribed to the hub. In other words, a binding exists between the queue where the messages arrive and the consumer that will process them. When a message arrives at the queue, a notification that includes metadata is sent to the consumer, which then takes an action on that message. As you can see in Figure 1.22, a binding can exist between either an Event Hub or an IoT Hub and Azure Stream Analytics.

FIGURE 1.22 Azure Stream Analytics data flow
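The queue-and-consumer relationship described above can be illustrated with a toy, in-memory stand-in. This is only a sketch of the idea, not the Event Hubs or IoT Hub SDK: the `run_consumer` function, the `hub` queue, and the sample readings are all hypothetical names invented for illustration, and the consumer here pulls messages rather than receiving notifications.

```python
import queue


def run_consumer(hub: queue.Queue, handler) -> int:
    """Drain the queue, pass each message to the handler, and return the count processed."""
    processed = 0
    while True:
        try:
            message = hub.get_nowait()
        except queue.Empty:
            break  # nothing left to process
        handler(message)
        processed += 1
    return processed


# Hypothetical device readings standing in for IoT messages (assumed shape).
hub = queue.Queue()
for reading in (
    {"device": "headset-1", "AF3Alpha": 9.8},
    {"device": "headset-1", "AF3Alpha": 12.4},
):
    hub.put(reading)  # the producer side: messages arrive at the queue

received = []
count = run_consumer(hub, received.append)  # the consumer side: act on each message
```

In a real deployment the hub is a managed service and the consumer is the Stream Analytics job itself, so none of this plumbing is written by hand.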

Azure Stream Analytics uses a SQL‐like query syntax to analyze the streaming data, something like the following:

SELECT *
INTO Output
FROM Input
WHERE AF3Alpha BETWEEN 11 AND 15
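To make the query's effect concrete, here is a rough Python equivalent of what it does to each event. It is a sketch under assumptions: the function name `select_into_output` and the sample events are made up for illustration, and each event is assumed to be a dictionary with an `AF3Alpha` field. The one behavior worth noticing is that `BETWEEN` is inclusive at both ends.

```python
def select_into_output(input_events, low=11, high=15):
    """Yield every event whose AF3Alpha lies in [low, high], mirroring an inclusive BETWEEN."""
    for event in input_events:
        value = event.get("AF3Alpha")
        # Events without the field never satisfy the WHERE clause.
        if value is not None and low <= value <= high:
            yield event


# Hypothetical sample events (assumed shape, not real recordings).
sample_input = [
    {"AF3Alpha": 9.5},
    {"AF3Alpha": 11},
    {"AF3Alpha": 13.2},
    {"AF3Alpha": 15},
    {"AF3Alpha": 15.1},
]

output = list(select_into_output(sample_input))
```

Here `output` keeps the events with values 11, 13.2, and 15 and drops the other two. The real job runs continuously against the stream instead of over a fixed list.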


Once the data is transformed, it can be streamed in real time to a Power BI dashboard or written to an Azure SQL database for offline or near‐real‐time reporting. Remember that ASA has the word “stream” in its name. If your solution requires a product to perform analytics on a stream of data in real time, this is the product you would choose. Alternatively, if your solution is event‐driven or timer‐based, then consider choosing Azure Functions or Logic Apps instead.

Other Products

There are other products related to data analytics that are available on the Azure platform. It is possible that the exam will ask a question about these products, but don’t expect many. Regardless, if your desire is to be the best Azure data engineer possible, then the more you know, the better.

Anomaly Detector

Anomaly Detector is an application programming interface (API) used to find irregularities in data. Consider that a normal reading of an alpha brain wave is typically between 5.311 and 12.541 Hz. That range is determined by running data analytics on very large sets of recordings. If there is a reading outside of that range, it can be highlighted as an anomaly. Highlighting it ensures that the person who monitors such activities will be notified and take appropriate actions. Anomaly Detector can be used in real time, offline, or near real time.
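The range check described above can be sketched in a few lines of Python. This is not the Anomaly Detector API, which derives expected behavior from the data itself. It only illustrates flagging readings that fall outside a known range, using the alpha-wave bounds quoted above. The function name `flag_anomalies` and the sample readings are hypothetical.

```python
# Bounds taken from the text; treated here as inclusive (an assumption).
ALPHA_LOW_HZ = 5.311
ALPHA_HIGH_HZ = 12.541


def flag_anomalies(readings_hz, low=ALPHA_LOW_HZ, high=ALPHA_HIGH_HZ):
    """Return (index, value) pairs for readings that fall outside [low, high]."""
    return [
        (index, value)
        for index, value in enumerate(readings_hz)
        if not (low <= value <= high)
    ]


# Hypothetical readings in Hz, invented for illustration.
readings = [8.2, 10.1, 4.9, 12.541, 13.7]
anomalies = flag_anomalies(readings)
```

With these sample values, the third and fifth readings (4.9 and 13.7) are flagged, while 12.541 sits exactly on the upper bound and is treated as normal.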

Data Science Virtual Machines

A Data Science Virtual Machine (DSVM) is an Azure VM that comes preconfigured with many standard data science–related products. Unlike a default Azure VM, which has nothing but the operating system on it, a DSVM has the following, and more, installed by default:

  • CUDA, Horovod, PyTorch, TensorFlow
  • Python, R, C#, Node, Java
  • Visual Studio, RStudio, PyCharm
  • H2O, LightGBM, Rattle
  • Apache Spark, SQL Server
  • Azure CLI, AzCopy, Azure Storage Explorer
