Microsoft Accidentally Leaks 38TB of Confidential Data on GitHub
Microsoft researchers inadvertently exposed 38TB of confidential information on the company’s GitHub page, potentially making it accessible to anyone. The data leak included a backup of two former employees’ workstations, which contained sensitive information such as keys, passwords, and over 30,000 private Teams messages. The leak was reported by cloud security firm Wiz, which discovered that the data was mistakenly published on Microsoft’s AI GitHub repository as part of open-source training data.
This embarrassing incident shows that data breaches can arise from various sources, including internal ones. The leak occurred because Microsoft shared the data using Shared Access Signature (SAS) tokens, which enable users to share data through Azure Storage accounts. Visitors to the repository were instructed to download the training data from a provided URL. However, this URL also granted access to other files and folders that were not meant to be publicly accessible.
What made matters worse was that the access token associated with the leak was misconfigured to provide full control permissions instead of read-only permissions. This meant that anyone who visited the URL could delete and overwrite the files they found, potentially injecting malicious code into the AI models stored in the repository.
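To illustrate the difference between read-only and full-control access, here is a minimal Python sketch that inspects the `sp` (signed permissions) query parameter of a SAS URL and lists any permissions that allow data to be changed or removed. The function name `risky_permissions`, the example account, and the URL are hypothetical and are not taken from the incident; only the single-letter permission codes follow the Azure Storage SAS format.

```python
from typing import List
from urllib.parse import urlsplit, parse_qs

# Permission letters in the SAS "sp" field that allow modifying or removing data.
MUTATING = {"w": "write", "d": "delete", "a": "add", "c": "create"}


def risky_permissions(sas_url: str) -> List[str]:
    """Return the names of mutating permissions granted by a SAS URL."""
    query = parse_qs(urlsplit(sas_url).query)
    # "sp" holds the granted permissions as a string of letters, e.g. "rl".
    perms = query.get("sp", [""])[0]
    return [name for letter, name in MUTATING.items() if letter in perms]


# Hypothetical URLs for illustration only; the signatures are placeholders.
full_control = (
    "https://example.blob.core.windows.net/models"
    "?sv=2020-08-04&sp=rwdlac&se=2051-10-06T00:00:00Z&sig=REDACTED"
)
read_only = "https://example.blob.core.windows.net/models?sp=rl&sig=REDACTED"

print(risky_permissions(full_control))  # ['write', 'delete', 'add', 'create']
print(risky_permissions(read_only))     # []
```

A check like this could be run on a link before it is published, so that a URL intended only for downloads is flagged if it also permits writes or deletes.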
The report from Wiz also highlighted the lack of visibility around the creation and circulation of SAS tokens, as they do not leave any trace. This absence of a paper trail makes it difficult for administrators to track and monitor token usage. Fortunately, Wiz reported the issue to Microsoft in June, and the leaky SAS token was replaced in July. Microsoft conducted an internal investigation and has now made the incident public to allow for a complete fix.
This incident serves as a reminder that even seemingly innocent actions can lead to data breaches. While the exposed token has been replaced, it is unclear whether anyone accessed the sensitive data before it was secured.
Sources:
– Cloud security firm Wiz
– Microsoft