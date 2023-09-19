Microsoft’s AI research team accidentally exposed 38 terabytes of private data through a Shared Access Signature (SAS) link it published on a GitHub repository, according to a report by Wiz Research, which also outlined how CISOs can minimize the chances of a similar exposure.

Dubbed “robust-models-transfer,” the repository was meant to provide open-source code and AI models for image recognition, and readers of the repository were given a link to download the models from an Azure storage URL.

This URL allowed access to more than just open-source models, according to a Wiz blog post. It was configured to grant permissions to the entire storage account, exposing additional private data by mistake.

“The Azure storage account contained 38TB of additional data, including Microsoft employees’ personal computer backups,” Wiz said. “The backups contained sensitive personal data including passwords to Microsoft’s services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees.”

The slipup, a misconfigured SAS link that allowed access to sensitive information, is easy to avoid once one understands exactly what went wrong.

Misconfigured SAS tokens created risks

The Microsoft repository, which was meant to provide AI models for use in training code, instructed users to download a model data file through a SAS link and feed it into their scripts, Wiz noted.
To do this, Microsoft developers used an Azure mechanism called “SAS tokens,” which allow you to create a shareable link to grant access to data in an Azure Storage account that, upon inspection, would still seem completely private.

The token used by Microsoft not only exposed additional storage through its overly wide access scope, but was also misconfigured to grant “full control” permissions instead of read-only, enabling a possible attacker not just to view the private files but to delete or overwrite existing files as well.

In Azure, a SAS token is a signed URL granting customizable access to Azure Storage data, with permissions ranging from read-only to full control. It can cover a single file, a container, or an entire storage account, and the user can set an optional expiration time, even setting it to never expire.

The full-access configuration “is particularly interesting considering the repository’s original purpose: providing AI models for use in training code,” Wiz said. The format of the model data file intended for downloading is ckpt, a format produced by the TensorFlow library. “It’s formatted using Python’s Pickle formatter, which is prone to arbitrary code execution by design. Meaning, an attacker could have (also) injected malicious code into all the AI models in this storage account,” Wiz added.

SAS tokens are difficult to manage

The granularity of SAS tokens opens up risks of granting too much access.
In the Microsoft GitHub case, the token granted full-control permissions on the entire account, forever.

Microsoft’s repository used an Account SAS token, one of three types of SAS tokens. The other two, Service SAS and User Delegation SAS, allow service (application) and user access, respectively.

Account SAS tokens are extremely risky as they are vulnerable in terms of permissions, hygiene, management, and monitoring, Wiz noted. SAS tokens can grant high-level access to storage accounts either through excessive permissions or through wide access scopes.

Hygiene issues center on expiry: organizations often use tokens with a very long (sometimes lifetime) expiry by default.

Account SAS tokens are also extremely hard to manage and revoke. “SAS tokens are created on the client side; therefore, it is not an Azure tracked activity, and the generated token is not an Azure object,” Wiz said. “There isn’t any official way to keep track of these tokens within Azure, nor to monitor their issuance, which makes it difficult to know how many tokens have been issued and are in active use.”

Recommendations include configuration hacks and monitoring

Wiz recommends avoiding external sharing of Account SAS tokens, given their lack of security and governance. If external sharing can’t be avoided, a Service SAS should be used instead, together with a stored access policy so that policies and revocation can be managed centrally.

For sharing content in a time-limited manner, the expiry of a User Delegation SAS should be capped at seven days. Creating dedicated storage accounts can be a good practice too, in cases where external sharing is inevitable.

Wiz also recommended tracking active SAS token usage by “enabling storage analytics logs” on storage accounts.

“The resulting logs will contain details of SAS token access, including the signing key and the permissions assigned,” Wiz said. “However, it should be noted that only actively used tokens will appear in the logs, and that enabling logging comes with extra charges, which might be costly for accounts with extensive activity.”

Azure Metrics can also be used to monitor SAS token usage in storage accounts for events up to 93 days. Additionally, secrets-scanning tools can help detect leaked or over-privileged SAS tokens in artifacts and publicly exposed assets, according to Wiz.
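
To make the scanning idea concrete, here is a minimal sketch of the kind of check such a tool might run on a URL found in a repository. It is an illustration, not how Wiz's or any vendor's scanner works. It relies on the documented SAS query parameters (`sig` for the signature, `sp` for permissions, `se` for expiry, and `ss`/`srt`, which appear on Account SAS tokens). The function name `audit_sas_url`, the read/list allow-list, the seven-day threshold applied to every token type, and the sample URLs are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Assumed policy for this sketch: anything beyond read/list is flagged,
# and so is any token that stays valid for more than seven days.
READ_ONLY = {"r", "l"}
MAX_LIFETIME = timedelta(days=7)


def audit_sas_url(url: str, now: datetime) -> list[str]:
    """Return human-readable findings for one URL (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)
    params = {key: values[0] for key, values in query.items()}

    # Without a signature parameter this is not a SAS URL, so skip it.
    if "sig" not in params:
        return []

    findings = []

    # 'ss' (services) and 'srt' (resource types) are Account SAS fields.
    if "ss" in params or "srt" in params:
        findings.append("account-level SAS")

    # 'sp' is a string of one-letter permissions, e.g. "rl" or "rwdl".
    extra = set(params.get("sp", "")) - READ_ONLY
    if extra:
        findings.append("permissions beyond read/list: " + "".join(sorted(extra)))

    # 'se' is an ISO 8601 UTC timestamp; it can be absent when a stored
    # access policy supplies the expiry, in which case nothing is checked.
    expiry_text = params.get("se")
    if expiry_text:
        expiry = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > MAX_LIFETIME:
            findings.append(f"expires more than {MAX_LIFETIME.days} days from now")

    return findings


if __name__ == "__main__":
    # Hypothetical URL with a redacted signature, for illustration only.
    sample = (
        "https://example.blob.core.windows.net/models"
        "?sv=2020-08-04&ss=b&srt=sco&sp=rwdlacx"
        "&se=2051-10-06T07:09:59Z&sig=REDACTED"
    )
    for finding in audit_sas_url(sample, datetime.now(timezone.utc)):
        print(finding)
```

A check like this only inspects what the URL itself declares. It cannot tell whether the token has been revoked, and it says nothing about tokens that were never committed anywhere, which is why Wiz pairs scanning with logging and metrics.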