-- revision 1, 20210904
This is a sample of the blob accesses in Microsoft's Azure Functions, collected between November 23rd and December 6th, 2020. The dataset is described and analyzed in the SoCC 2021 paper 'Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications'.
Functions in Azure Functions are grouped into Applications. This trace includes only data for a random sample of Azure Functions applications. Sampling is done per application, so if an application appears in the trace, all of its functions are included. The sampling rate is unspecified for confidentiality reasons.
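To illustrate what per-application sampling means for anyone subsampling the trace further, here is a minimal sketch. It is not the procedure used to produce this dataset: the function name `sample_by_application`, the `rate` and `seed` parameters, and the use of a seeded pseudo-random generator are all assumptions made for illustration. Only the `AnonAppName` field comes from the trace schema.

```python
import random
from typing import Dict, Iterable, List


def sample_by_application(
    records: Iterable[Dict[str, str]], rate: float, seed: int = 0
) -> List[Dict[str, str]]:
    """Keep every record of a randomly chosen subset of applications.

    Hypothetical helper: `rate` is the fraction of applications to keep,
    not the (undisclosed) rate used for the published trace.
    """
    records = list(records)
    rng = random.Random(seed)
    # Decide once per application, in a stable order, whether to keep it.
    apps = sorted({record["AnonAppName"] for record in records})
    kept_apps = {app for app in apps if rng.random() < rate}
    # An application is either fully present or fully absent.
    return [record for record in records if record["AnonAppName"] in kept_apps]
```

Because the decision is made per application rather than per record, per-application statistics computed on the sample remain complete for every application that survives.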
The dataset comprises this description and a Jupyter Notebook with the plots in the SoCC paper.
The data is made available and licensed under a CC-BY Attribution License. By downloading or using it, you agree to the terms of this license.
If you use this data for a publication or project, please cite the accompanying paper:
Francisco Romero, Gohar Irfan Chaudhry, Íñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J. Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, Ricardo Bianchini. "Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications", in Proceedings of the ACM Symposium on Cloud Computing 2021 (SoCC 21). ACM, Seattle, WA, 2021.
Lastly, if you have any questions, comments, or concerns, or if you would like to share tools for working with the traces, please contact us at [email protected]
You can download the dataset here: https://azurepublicdatasettraces.blob.core.windows.net/azurepublicdatasetv2/azurefunctions_dataset2020/azurefunctions-accesses-2020.csv.bz2
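Once downloaded, the file can be read without decompressing it to disk first. The sketch below uses only the Python standard library. It assumes the CSV is comma-separated and that its first line is a header naming the fields; neither is stated in this description, so adjust if the file differs. The helper name `iter_rows` is ours.

```python
import bz2
import csv
from typing import Dict, Iterator


def iter_rows(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per access record from a bz2-compressed CSV file.

    Assumption: comma-separated values with a header row; every value is
    returned as a string and left for the caller to convert.
    """
    # Text-mode bz2.open decompresses lazily, so the whole trace never
    # has to fit in memory at once.
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)
```

For example, `sum(1 for _ in iter_rows("azurefunctions-accesses-2020.csv.bz2"))` counts the records in a single streaming pass.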
Field | Description |
---|---|
Timestamp | Access time in milliseconds since the Unix epoch (January 1, 1970) |
AnonRegion | Unique id for the region [1] |
AnonUserId | Unique id for the user [1] |
AnonAppName | Unique id for the application [1] |
AnonFunctionInvocationId | Unique id for the invocation [1] |
AnonBlobName | Unique id for the blob accessed [1] |
BlobType | Type of the blob accessed |
AnonBlobETag | Version of the blob accessed [1] |
BlobBytes | Size of the blob in bytes |
Read | True if the access is a read |
Write | True if the access is a write |

[1] Ids are hashed using HMAC-SHA512 with secret salts and then cropped.
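The field descriptions above translate into a typed record fairly directly. The following is a minimal sketch, assuming each row arrives as a mapping from the field names above to strings, that booleans are written as `True`/`False`, and that `BlobBytes` may carry a trailing `.0`, as in the sample rows in this description. The class and function names are ours.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Mapping

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class BlobAccess:
    timestamp: datetime
    region: str
    user_id: str
    app_name: str
    invocation_id: str
    blob_name: str
    blob_type: str
    blob_etag: str
    blob_bytes: int
    is_read: bool
    is_write: bool


def parse_access(row: Mapping[str, str]) -> BlobAccess:
    """Convert one raw row (all strings) into a typed BlobAccess."""
    return BlobAccess(
        # Integer milliseconds added to the epoch avoids float rounding.
        timestamp=EPOCH + timedelta(milliseconds=int(row["Timestamp"])),
        region=row["AnonRegion"],
        user_id=row["AnonUserId"],
        app_name=row["AnonAppName"],
        invocation_id=row["AnonFunctionInvocationId"],
        blob_name=row["AnonBlobName"],
        blob_type=row["BlobType"],
        blob_etag=row["AnonBlobETag"],
        # Assumed format: sizes may be written as "30.0".
        blob_bytes=int(float(row["BlobBytes"])),
        # Assumed format: booleans written as the text "True" / "False".
        is_read=row["Read"] == "True",
        is_write=row["Write"] == "True",
    )
```

The anonymized ids are kept as strings on purpose: they are cropped hashes, so only equality comparisons between them are meaningful. As a sanity check, a `Timestamp` of `1606092900138` parses to 2020-11-23 00:55:00.138 UTC, which falls on the first day of the collection window.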
Timestamp | AnonRegion | AnonUserId | AnonAppName | AnonFunctionInvocationId | AnonBlobName | BlobType | AnonBlobETag | BlobBytes | Read | Write |
---|---|---|---|---|---|---|---|---|---|---|
1606092900138 | 6ex | 775920313 | 9gti3olh | 1565080819 | jfvf7k9kwiiq7gdx | BlockBlob/application/octet-stream | kq2su6bhi0 | 30.0 | True | False |
1606928903185 | 6ex | 1252244298 | 7c51my6n | 1191849141 | 1fjxqoqi2nc5njpg | BlockBlob/application/zip | ibd6a5v5pv | 1938488.0 | True | False |
1606355700058 | iic | 1495523193 | uf2u84b0 | 1302383289 | tp783etybrgxap8x | BlockBlob/ | 6mreka6qhr | 36.0 | False | True |
1606924856178 | iic | 705112778 | 1jgfqbn6 | 1869133266 | 80lssrlkciitddx9 | BlockBlob/ | if8foq3a81 | 2204780.0 | False | True |
1606658957997 | 6ex | 1252244298 | 15dp5na6 | 1468781831 | juijw2ldiogyem3c | BlockBlob/application/zip | 414fgngli4 | 359512.0 | True | False |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1607270691764 | ayi | 1003538042 | 766ofcie | 1080821259 | sfocyrxcksjgri5t | BlockBlob/application/json | tanw2860j5 | 164.0 | True | False |
1607270691884 | ayi | 1003538042 | 766ofcie | 1530317863 | aat6cv8j2cofwj1a | BlockBlob/application/json | gf05emgb6t | 164.0 | True | False |
1607270692007 | ayi | 1003538042 | 766ofcie | 358892311 | u7p02pymm07pa7bg | BlockBlob/application/json | kl2uv31e7y | 164.0 | True | False |
1607270692134 | ayi | 1003538042 | 766ofcie | 1978924507 | 9qeai70lggcku3c5 | BlockBlob/application/json | 3xa1dkrq7m | 164.0 | True | False |
1607270692284 | ayi | 1003538042 | 766ofcie | 1142206120 | t8e88ksd6fiy2dx0 | BlockBlob/application/json | bp4ynk65sl | 164.0 | True | False |
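In the sample rows, `BlobType` appears to combine a blob kind with a content type (for example `BlockBlob/application/zip`), and the content type is sometimes empty (`BlockBlob/`). That reading is inferred from the rows shown here, not documented, so treat the sketch below as an assumption. It splits the field at the first slash and totals bytes read and written per content type, using the first five sample rows as input. It also treats any row whose `Read` flag is false as a write, which holds for these rows but is not guaranteed for the full trace.

```python
from collections import Counter
from typing import Iterable, Tuple


def split_blob_type(blob_type: str) -> Tuple[str, str]:
    """Split at the first '/' into (blob kind, content type)."""
    kind, _, content_type = blob_type.partition("/")
    return kind, content_type


def bytes_by_content_type(
    rows: Iterable[Tuple[str, float, bool]]
) -> Tuple[Counter, Counter]:
    """Total bytes per content type, separately for reads and writes.

    Each row is (BlobType, BlobBytes, Read).
    """
    read_bytes: Counter = Counter()
    written_bytes: Counter = Counter()
    for blob_type, blob_bytes, is_read in rows:
        _, content_type = split_blob_type(blob_type)
        target = read_bytes if is_read else written_bytes
        # Label accesses whose content type is empty.
        target[content_type or "(unspecified)"] += int(blob_bytes)
    return read_bytes, written_bytes


# (BlobType, BlobBytes, Read) taken from the first five sample rows.
SAMPLE = [
    ("BlockBlob/application/octet-stream", 30.0, True),
    ("BlockBlob/application/zip", 1938488.0, True),
    ("BlockBlob/", 36.0, False),
    ("BlockBlob/", 2204780.0, False),
    ("BlockBlob/application/zip", 359512.0, True),
]

read_bytes, written_bytes = bytes_by_content_type(SAMPLE)
```

The same function works on the full trace if each parsed record is mapped to a `(BlobType, BlobBytes, Read)` tuple first.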
This is the sample data used in the SoCC paper mentioned above. To verify the data, we reproduce the paper's characterization graphs from the released trace in the accompanying Jupyter Notebook.