ACL and POSIX Security Model – Gaining the Azure Data Engineer Associate Certification

When you’re securing files and folders, the access permission level (aka Access ACL) is critical. The access permission level has to do with what actions a person or entity can perform on those files or folders. Table 1.5 describes those permissions.

TABLE 1.5 File/folder access permission levels

PermissionFile permissionDirectory permission
Execute (X)No meaning in ADLS Gen2Required to navigate child items in directory
Read (R)Read contents of a fileR and X required to list directory contents
Write (W)Write or append to a fileW and X required to create directory items

There is another kind of access control that pertains to ACLs and ADLS Gen2 called Default ACLs. An important distinction between Access ACLs and Default ACLs is that Default ACLs are associated with directories, not files. When you set an ACL on a directory, new files created within it take on the permission granted to the directory. A very important point to note is that if you change the Default ACL, it does not change the permissions on the files within it. The new ACL is only applied to files and directories created after the change. An additional step is required to apply those changes recursively.

Some tools you can use to set ACL permissions on files and directories hosted in ADLS Gen2 are

  • Azure Storage Explorer
  • Azure Portal
  • Azure CLI, PowerShell
  • Python, Java, .NET SDKs

Most distributions of Linux support portable operating system interface (POSIX) ACLs, which is a means for setting permissions on files and directories on that OS type. That activity is typically performed through a command window; the following code snippet shows a command and the result:

$ mkdir brainjammer
$ ls -la brainjammer
$ drwxr-xr-x 3 csharpguitar root 60 2022-05-02 16:19 ..
$ drwxr-xr-x 2 csharpguitar user 80 2022-05-02 16:52 brainjammer
$ -rw-r–r– 1 csharpguitar user 131 2022-05-02 18:07 playingguitar.json

A directory named brainjammer is created using the mkdir command. The ls command lists the attributes of the directory and any other object within it. Notice that user csharpguitar has full access to the directory brainjammer and the groups root and user have read, execute, and read, respectively. Access can be further clarified by reading the contents in this list. We will break down the fourth line of the snippet drwxr‐xr‐x into its parts:

  • d stands for directory.
  • rwx means the user and group have read (r), write (w), and execute (x) permissions on that directory.
  • r‐x means the user and group have read (r) and execute (x) permissions.
  • r‐‐ means “other” entities that are neither that user nor a user in that group.

Hierarchical Namespace

Hierarchical namespaces organize directories and blob files the same way the filesystem does on your personal computer. There exists a nested hierarchy of directories, each of which can include a file of some type. Hierarchical namespace keeps your data logically structured and by doing so renders better storage and retrieval performance. The following is an example of a hierarchical namespace on a filesystem:

brainjammer\
brainjammer\sessionjson\
brainjammer\sessionjson\metalmusic\
brainjammer\sessionjson\metalmusic\pow\
brainjammer\sessionjson\metalmusic\pow\brainjammer-pow-0904.json
brainjammer\sessionjson\worknoemail\pow\brainjammer-pow-1855.json

Object stores are typically flat, which results in less efficient behavior when moving, renaming, or deleting directories. Stating that a flat file structure is less efficient is understating a bit; it is actually quite dramatic. When it comes to performing data analytics, it is common for part of the analysis process to copy, move, create, and delete files. In many cases, the time required to achieve such actions using flat structures can take longer than the analysis process itself. Running on Azure, you are charged by the amount of time you consume the product, so if this time can be reduced because your files are stored more optimally, then you save money. Using the hierarchical namespaces provided via ADLS Gen2 reduces the latency and therefore reduces costs.

When you’re securing files and folders, the access permission level (aka Access ACL) is critical. The access permission level has to do with what actions a person or entity can perform on those files or folders. Table 1.5 describes those permissions. TABLE 1.5 File/folder access permission levels Permission File permission Directory permission Execute (X) No…

Leave a Reply

Your email address will not be published. Required fields are marked *