Before I get into the topic at hand, I’d like to bring you up to date on some Microsoft news you may not have heard. For years now, Amazon Web Services (AWS) has been the gold standard for the Infrastructure as a Service (IaaS) industry. In fact, most experts have declared it the dominant winner that nobody will catch any time soon.
On July 20th, Gartner released their report on cloud IaaS services offered by AWS, Azure, and Google. In the Overall Required Category, AWS scored 92%, Azure scored 88%, and Google Cloud scored 70%. Microsoft surprised more than a few experts as the come-from-behind underdog. With this new reality in the IaaS industry, there is now talk of Microsoft becoming No. 1 very soon.
This is a testament to Microsoft’s commitment to the Azure cloud.
Now, off to the lake!
The Azure Data Lake Store (DLS) is an Apache Hadoop file system, compatible with the Hadoop Distributed File System (HDFS), that allows you to store your data in the cloud. (I’ll bet you were expecting a different meaning for the word “store”.) The DLS is where you put your lake.
The Azure DLS will ingest virtually any size of data stream that is pointed at it. This becomes the repository for your big data. Just like Amazon S3, you can have virtually unlimited storage space. Unlike S3, Azure DLS allows you to optimize that data for use with integrated analytics.
The DLS includes the enterprise features you will need to manage your lake: security, tools to manage the data, and Azure’s reliability and access.
Any type of data streamed into the DLS can be stored in its native format. If you require specific types of data to be stored in a single file, that can be handled as well. Azure DLS does not have limits on file sizes.
Of course, you are not just storing data for fun. You will want to make some use of it. That is why Microsoft designed the Azure DLS to optimize your ability to analyze enormous amounts of data. In addition to being integrated with a variety of Azure services, the DLS works with open source Hadoop tools.
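To make the Hadoop angle a little more concrete: the store can be reached through WebHDFS-style REST calls, the same style of interface that many Hadoop tools use. Here is a minimal sketch in Python that only builds such a URL and sends nothing over the network. The account name `mylake`, the path `/raw/clickstream`, and the helper `webhdfs_url` are made up for illustration, and the host pattern is an assumption based on how Data Lake Store accounts are typically addressed. A real request would also need an Azure Active Directory bearer token, which is left out here.

```python
from urllib.parse import urlencode, quote


def webhdfs_url(account, path, op, **params):
    """Build a WebHDFS-style REST URL for a path in a Data Lake Store account.

    `account` and `path` are hypothetical values supplied by the caller;
    the host pattern below is an assumption, not taken from the article.
    """
    host = f"{account}.azuredatalakestore.net"  # assumed host naming pattern
    query = urlencode({"op": op, **params})     # e.g. op=LISTSTATUS
    # quote() keeps "/" intact but escapes spaces and other unsafe characters
    return f"https://{host}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


# List the contents of a (hypothetical) folder in the lake
print(webhdfs_url("mylake", "/raw/clickstream", "LISTSTATUS"))
```

The same helper could build an upload URL by passing `op="CREATE"` with extra query parameters such as `overwrite="true"`. Treat it as a starting point for experimenting, not as the official client.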
To get started using the Azure DLS, you can choose to start reading here: https://azure.microsoft.com/en-us/documentation/articles/data-lake-store-get-started-portal.
Or you can start watching here: https://mix.office.com/watch/1k1cycy4l4gen.