Revealing the Inner Structure of AWS Session Tokens

Tal Be'ery


TL;DR: A world-first reverse engineering analysis of AWS Session Tokens. Prior to our research these tokens were a complete black box. Today, we are making them more of a glass box by sharing code and tools to programmatically analyze and modify AWS Session Tokens. Using this code we were able to take a first deep look into the contents of AWS Session Tokens, expose previously unknown facts about AWS cryptography and authentication protocols, and finally test their resilience against forging attacks (Spoiler alert: surprisingly good!)

Amazon AWS is the most popular cloud computing service in the world. However, little is publicly known about its internal Authentication (AuthN) and Authorization (AuthZ) protocols.

Attackers’ main Modus Operandi (MO) for moving from their initially infected machine to their target (“lateral movement”) is abusing the environment’s Authentication and Authorization protocols with valid credentials. These credentials are obtained by attackers via guessing, theft or forging. While the technical details depend on whether the environment is on-premises or cloud, attackers’ MO remains unchanged, as their motivations and goals remain the same.

However, while in the on-premises environment the relevant authentication and authorization protocols are open standards (e.g. the Kerberos protocol), in the AWS environment these details are mostly undisclosed by the vendor and remain unknown.

Since we believe that it is critically important for defenders and builders to understand their environment’s Authentication and Authorization protocols and the resulting credentials’ structure, we need to reverse engineer the parts missing from AWS’ official documentation.

Previously, we explored the inner structure of the AWS Key ID credentials.

Today we take on the much richer Session Token!

AWS Session Tokens intro

Originally, users would use their assigned long-term credentials, consisting of an ID and a Secret, to sign each of their requests to AWS services.

Signing an AWS request with the user’s SecretAccessKey (source: aws.amazon.com)

To validate the request, AWS would retrieve the user’s Secret from its DB (via the IAM / ARS services), sign the request with it, and verify that the signatures match.
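
For illustration, here is a minimal sketch of the Signature Version 4 signing-key derivation and signature step (the construction of the canonical request and “string to sign” is omitted; see the AWS SigV4 documentation for the full procedure):

import hashlib, hmac

def hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signature(secret_key: str, date: str, region: str, service: str,
                    string_to_sign: str) -> str:
    # Derive the signing key from the long-term SecretAccessKey (SigV4 key derivation).
    k_date = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)  # e.g. "20210928"
    k_region = hmac_sha256(k_date, region)                             # e.g. "us-east-2"
    k_service = hmac_sha256(k_region, service)                         # e.g. "sts"
    k_signing = hmac_sha256(k_service, "aws4_request")
    # Sign the canonical "string to sign" built from the request.
    return hmac.new(k_signing, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

Both the client and AWS can compute this value independently; if the signatures match, the request is authentic.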

Excellent (and rare) AWS talk on their internal authN + authZ solutions

However, this method was not enough to support all kinds of advanced scenarios:

  • Advanced authentication schemes: such as federated login (e.g. Okta) and Multi-Factor-Authentication (e.g. Yubikey)
  • Assuming a role: sometimes users want to connect with a different set of privileges to perform a specific task for a limited time, e.g. to sign in as admin

To address these issues AWS introduced the Security Token Service (STS). When users authenticate to the STS service, they get a short-term ID and Secret as they did before with their long-term credentials, but with an additional Session Token.

As before, users need to use their assigned ID and Secret to sign each of their requests to AWS services, but now they must include the Session Token too.

The Session Token includes the user’s ID and Secret in encrypted form, so that AWS can decrypt the Session Token and extract them. Once the Secret is extracted, the AWS service can use it to validate the request as before.

A decrypted Session Token, includes the user’s ID and Secret (source: youtube)

Additionally, the session token contains a validity period that can range between 15 minutes and 36 hours, to limit the exposure of stolen credentials.
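
To see all these pieces in one place, here is a minimal sketch of requesting temporary credentials yourself with boto3 (it assumes long-term credentials are already configured locally):

import boto3

sts = boto3.client("sts")
# DurationSeconds can range from 900 (15 minutes) to 129600 (36 hours) for IAM users.
resp = sts.get_session_token(DurationSeconds=900)
creds = resp["Credentials"]
print(creds["AccessKeyId"])      # short-term ID
print(creds["SecretAccessKey"])  # short-term Secret
print(creds["SessionToken"])     # the opaque Session Token analyzed in this post
print(creds["Expiration"])       # end of the validity period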

As a result, AWS STS Session Tokens play a vital role in AWS’s security model by enabling the use of temporary, limited-privilege credentials for accessing AWS resources, thereby enhancing AWS’ overall security and access control.

Research motivation

Why should we care about the inner structure of the opaque Session Token?

  • Privacy: The Session Token seems to contain some interesting and sensitive data. Can we safely share an expired token, or must we redact it first? And if so, how should it be redacted to make sure it no longer contains such data? Is it safe to save tokens in logs?
  • Integrity: As described above, attackers’ holy grail is the golden credentials attack, in which they forge a valid token that contains an attacker-chosen role (e.g. admin) or edit the expiry time to create an everlasting token. What prevents attackers from creating a Golden Token?
Golden tickets are attackers’ holy grail
  • Philosophy: A deeper understanding of the Session Token structure will probably lead us to a better understanding of the entire AWS auth system
  • Curiosity: Because it is there!

Research

For readability, the research process is described below as a linear sequence of phases. In reality the process was more cyclic in nature: new insights and hypotheses led to changes in tooling and sample gathering.

Phase 1: From raw token to parsed fields

Enough meta, let’s get to business!

This is what a typical raw Session Token looks like: a long string consisting of printable characters

"IQoJb3JpZ2luX2VjEDoaCXVzLWVhc3QtMiJIMEYCIQDQh4gelDqno96q39RwiPT5x7K7SyVOSmeDpUMd9SthWAIhAP5tT81Cb+Rb2zN85delmYB4KECmW1uL7Tr36C/M2GaJKr0DCKP//////////wEQARoMNjY2MzU5NzY0NTI4Igyu9F2yAqZN3dG0q9YqkQMVrg/4mCJjDxg0QmplU581Z2P8LGhGfr9vgei6SaONhhfks5Kt9Ikbh61G9UiQ3SXgPLbHjOfTUueaIIcBz1Y3LcW+WajtfsGfB8CqT76lkJLtkvl+1KjSCVn6k+/K/iWgr3Zc1Ej+qT2djTH4x1OWFNS6i6iCtlUy/Z6i3P2fziHGsEmafkH3ict+07dFb3DA2aRnUhnaCHfQDNd/5ub70oILwB4UgtgGNkbM9SE/NxKgPZY9qIktYifqcgfDyYMYHlvY9XEc0UT2jfaQKDYVgMCdsdsW5mkoBYzLRisQhKxjfwaBpkRtdW8dEHFAG04eV4JSAbOSat3bgUwahATGizOdsMz/qhnS9qzShQGgSR6OU6pDDUtuHCGh0sgwrjsZ+bGDfzkw5Sy3JhjQpozfinCsAmDZ1t3nX6llw9OR9B2mdDHCeccsWGwjIvmprs21FtgjDuKGzaAET6HgQAR+pkFUgxBWVmZArtck1ziG21FEN8pFR75rOgxSkQ3yEZeDZkIIZ/aJnABGvbC3Fbq9ATD6ycuKBjqlAaGPeFKzdCR1dBh4sHQVHejXNegWWZV72n4MLyZx2FE9wLUfPGXXW+pYZg4SySvN0Z4OnGoYdlO/pjKvdRa507mSD8N8EhkwgpJMatFobJb0hsz7GY5flutVSkDfBDYkU91vpl7YCJ5rlvuR0I6iWe+K7smYj5hzm16YokWsRQ4EeWHo0peEJuqTZrZt/U4gHVsFpG44V8Yb6iRdZL78E+5xcgjeFw=="

Obviously, it is base64 encoded; decoding it yields the following binary buffer (hex encoded):

21 0a 09 6f 72 69 67 69 6e 5f 65 63 10 3a 1a 09 75 73 2d 65 61 73 74 2d 32 22 48 30 46 02 21 00 d0 87 88 1e 94 3a a7 a3 de aa df d4 70 88 f4 f9 c7 b2 bb 4b 25 4e 4a 67 83 a5 43 1d f5 2b 61 58 02 21 00 fe 6d 4f cd 42 6f e4 5b db 33 7c e5 d7 a5 99 80 78 28 40 a6 5b 5b 8b ed 3a f7 e8 2f cc d8 66 89 2a bd 03 08 a3 ff ff ff ff ff ff ff ff 01 10 01 1a 0c 36 36 36 33 35 39 37 36 34 35 32 38 22 0c ae f4 5d b2 02 a6 4d dd d1 b4 ab d6 2a 91 03 15 ae 0f f8 98 22 63 0f 18 34 42 6a 65 53 9f 35 67 63 fc 2c 68 46 7e bf 6f 81 e8 ba 49 a3 8d 86 17 e4 b3 92 ad f4 89 1b 87 ad 46 f5 48 90 dd 25 e0 3c b6 c7 8c e7 d3 52 e7 9a 20 87 01 cf 56 37 2d c5 be 59 a8 ed 7e c1 9f 07 c0 aa 4f be a5 90 92 ed 92 f9 7e d4 a8 d2 09 59 fa 93 ef ca fe 25 a0 af 76 5c d4 48 fe a9 3d 9d 8d 31 f8 c7 53 96 14 d4 ba 8b a8 82 b6 55 32 fd 9e a2 dc fd 9f ce 21 c6 b0 49 9a 7e 41 f7 89 cb 7e d3 b7 45 6f 70 c0 d9 a4 67 52 19 da 08 77 d0 0c d7 7f e6 e6 fb d2 82 0b c0 1e 14 82 d8 06 36 46 cc f5 21 3f 37 12 a0 3d 96 3d a8 89 2d 62 27 ea 72 07 c3 c9 83 18 1e 5b d8 f5 71 1c d1 44 f6 8d f6 90 28 36 15 80 c0 9d b1 db 16 e6 69 28 05 8c cb 46 2b 10 84 ac 63 7f 06 81 a6 44 6d 75 6f 1d 10 71 40 1b 4e 1e 57 82 52 01 b3 92 6a dd db 81 4c 1a 84 04 c6 8b 33 9d b0 cc ff aa 19 d2 f6 ac d2 85 01 a0 49 1e 8e 53 aa 43 0d 4b 6e 1c 21 a1 d2 c8 30 ae 3b 19 f9 b1 83 7f 39 30 e5 2c b7 26 18 d0 a6 8c df 8a 70 ac 02 60 d9 d6 dd e7 5f a9 65 c3 d3 91 f4 1d a6 74 31 c2 79 c7 2c 58 6c 23 22 f9 a9 ae cd b5 16 d8 23 0e e2 86 cd a0 04 4f a1 e0 40 04 7e a6 41 54 83 10 56 56 66 40 ae d7 24 d7 38 86 db 51 44 37 ca 45 47 be 6b 3a 0c 52 91 0d f2 11 97 83 66 42 08 67 f6 89 9c 00 46 bd b0 b7 15 ba bd 01 30 fa c9 cb 8a 06 3a a5 01 a1 8f 78 52 b3 74 24 75 74 18 78 b0 74 15 1d e8 d7 35 e8 16 59 95 7b da 7e 0c 2f 26 71 d8 51 3d c0 b5 1f 3c 65 d7 5b ea 58 66 0e 12 c9 2b cd d1 9e 0e 9c 6a 18 76 53 bf a6 32 af 75 16 b9 d3 b9 92 0f c3 7c 12 19 30 82 92 4c 6a d1 68 6c 96 f4 86 cc fb 19 8e 5f 96 eb 55 4a 40 df 04 36 24 53 dd 6f a6 5e d8 08 9e 6b 96 fb 91 d0 8e a2 59 ef 8a ee c9 98 8f 98 73 9b 5e 98 a2 45 ac 45 0e 04 79 61 e8 d2 97 84 26 ea 93 66 b6 6d fd 4e 20 1d 5b 05 a4 6e 38 57 c6 1b ea 24 5d 64 be fc 13 ee 71 72 08 de 17

Looking into the binary reveals a structure that looks like Type-Length-Value (TLV) tuples, but with some non-trivial encoding.

After many attempts came the revelation: what if the first byte marks the type of the message and is not encoded in the same manner as the rest of the message?

Once we had removed the first byte (which represents the message version, see below), the buffer decodes well as a protobuf structure!

By applying a protobuf decoder, we can see for the first time a structured view of AWS Session Token parsed into fields.

Protobuf decoding of the token using the protobuf-decoder

Summing up this phase’s findings:

  • The message first byte represents its version
  • The rest of the message is protobuf encoded

Now that the Token is parsed into fields we can “divide and conquer”: instead of inspecting the buffer as a whole, we can dive into each field individually and analyze it independently.
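
To reproduce this step yourself, here is a minimal sketch (assuming the protoc compiler is installed, using its generic --decode_raw mode; the token string is the one shown above):

import base64, subprocess

SESSION_TOKEN = "IQoJb3JpZ2luX2VjEDoa..."    # the full base64 string shown above

raw = base64.b64decode(SESSION_TOKEN)
version = raw[0]                             # first byte: message version (here 0x21 == 33)
print("version byte:", version)

# The remainder of the buffer parses as a generic protobuf message.
decoded = subprocess.run(["protoc", "--decode_raw"], input=raw[1:],
                         capture_output=True, check=True)
print(decoded.stdout.decode())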

Phase 2: building the research tools

To help us identify, analyze and manipulate each of the token’s fields in an efficient manner, we created two new open-source tools:

  • The AWS Token Decoder web app: Based on the protobuf-decoder, we created a specialization of it for AWS tokens that strips the first byte and then tries to parse the buffer as protobuf.
  • The STS-token-decoder: Using the output of the above-mentioned AWS Token Decoder’s lax protobuf parser, we can take the types of the token fields (e.g. string, number, struct) and create strict protobuf token schemas (.proto files).

These .proto schemas can then be compiled into classes (Python in our case) to programmatically access the parsed buffer’s field contents. We did that, wrapped these classes with our own functionality, and shared the STS-token-decoder as open source.

With this we obtain the following capabilities:

  1. Programmatic token analysis: we can analyze many tokens swiftly.
  2. Programmatic token synthesis: we can edit an existing token, and the protobuf-generated classes will take care of properly encoding and adjusting the buffer accordingly (including size updates).

In addition to our newly developed tools, we took advantage of the pre-existing awscurl to send AWS-signed requests with arbitrary tokens, including synthesized and edited tokens created with the STS-token-decoder.
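
Putting these pieces together, here is a hedged sketch of the analysis/synthesis flow. The module and message/field names (sts_token_pb2, SessionToken, creationUnixtime) are illustrative placeholders for whatever your compiled .proto schema generates; see the STS-token-decoder repository for the real ones.

import base64
# Hypothetical module compiled from our .proto schema with:
#   protoc --python_out=. sts_token.proto
import sts_token_pb2

raw = base64.b64decode(original_token_b64)   # original_token_b64: a raw Session Token string
version, body = raw[0], raw[1:]

token = sts_token_pb2.SessionToken()         # message/field names here are illustrative
token.ParseFromString(body)                  # programmatic token analysis
print(token)                                 # dump all parsed fields

token.creationUnixtime = 1700000000          # programmatic token synthesis: edit a field
# Re-serialization takes care of proper encoding and size updates automatically.
edited = base64.b64encode(bytes([version]) + token.SerializeToString()).decode()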

Phase 3: building the research corpus

To correctly understand the role and meaning of these fields we needed to create a diverse research corpus consisting of many samples from multiple accounts and environments.

We had two main methods of obtaining such data:

  • First-party generated tokens: By utilizing AWS accounts we control, we can generate tokens using the AWS CLI’s get-session-token:
aws sts get-session-token

The fact that we control the account allows us to change the environment parameters and observe their impact on the token. Such parameters can include the AWS region, username, user permissions, requested token duration, etc.

  • Collecting publicly disclosed tokens from the Internet: searching the Internet for such tokens, mostly posted as part of a support case, yielded a few samples.

While for such tokens we cannot know or control their generation parameters, they provide us a more diverse picture of the token landscape, not just in space (different accounts’ tokens), but also in time (old tokens), revealing some older versions of the token format.

A brief note on Session Tokens history

We identified two main variants of the Session Token:

  • Version 1 (“Global”) tokens: Until 2015, the STS service was only available as a global service. This is still the default when interacting with the STS service.
  • Version 2 (“Regional”) tokens: AWS recommends using these tokens obtained from the regional STS service, as they provide several advantages over version 1 tokens. It should be noted that regionality does not negatively impact token validity, but on the contrary:

Increase session token validity — Session tokens from Regional AWS STS endpoints are valid in all AWS Regions. Session tokens from the global STS endpoint are valid only in AWS Regions that are enabled by default.

However, they are not the default, probably because their larger size may break systems that assume a shorter maximal token size, based on version 1 tokens’ size.

Backward compatibility is a bitch

Within these versions we identified older variants, marked with a different, lower type value. In total we discovered 5 different variants of STS Session Tokens, spanning from 2011 until today.

Versions can be easily identified by their first base64-decoded byte, or even by a prefix of the base64-encoded token, as the first fields are fixed per version (see more below).

Samples table: 5 different variants.

Older version samples help our research as they usually represent simpler versions, which cover the core capabilities and help us focus on them.
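
A hedged sketch of version identification from the first decoded byte; only the type values appearing in our samples are listed here (the full mapping is in the samples table above):

import base64

KNOWN_TYPES = {   # type values observed in our sample corpus (partial list)
    33: "version 2 ('regional') token",
    23: "version 1 ('global') token, newer variant",
    21: "version 1 ('global') token, older variant",
}

def token_variant(token_b64: str) -> str:
    first_byte = base64.b64decode(token_b64)[0]
    return KNOWN_TYPES.get(first_byte, f"unknown type {first_byte}")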

Phase 4: Applying our tools on the research corpus

Now let’s apply our new strict protobuf token parser and our gathered insights (explained below) to the raw token above:

 % python3 STS-session.py "IQoJb3JpZ2luX2VjEDoaCXVzLWVhc3QtMiJIMEYCIQDQh4gelDqno96q39RwiPT5x7K7SyVOSmeDpUMd9SthWAIhAP5tT81Cb+Rb2zN85delmYB4KECmW1uL7Tr36C/M2GaJKr0DCKP//////////wEQARoMNjY2MzU5NzY0NTI4Igyu9F2yAqZN3dG0q9YqkQMVrg/4mCJjDxg0QmplU581Z2P8LGhGfr9vgei6SaONhhfks5Kt9Ikbh61G9UiQ3SXgPLbHjOfTUueaIIcBz1Y3LcW+WajtfsGfB8CqT76lkJLtkvl+1KjSCVn6k+/K/iWgr3Zc1Ej+qT2djTH4x1OWFNS6i6iCtlUy/Z6i3P2fziHGsEmafkH3ict+07dFb3DA2aRnUhnaCHfQDNd/5ub70oILwB4UgtgGNkbM9SE/NxKgPZY9qIktYifqcgfDyYMYHlvY9XEc0UT2jfaQKDYVgMCdsdsW5mkoBYzLRisQhKxjfwaBpkRtdW8dEHFAG04eV4JSAbOSat3bgUwahATGizOdsMz/qhnS9qzShQGgSR6OU6pDDUtuHCGh0sgwrjsZ+bGDfzkw5Sy3JhjQpozfinCsAmDZ1t3nX6llw9OR9B2mdDHCeccsWGwjIvmprs21FtgjDuKGzaAET6HgQAR+pkFUgxBWVmZArtck1ziG21FEN8pFR75rOgxSkQ3yEZeDZkIIZ/aJnABGvbC3Fbq9ATD6ycuKBjqlAaGPeFKzdCR1dBh4sHQVHejXNegWWZV72n4MLyZx2FE9wLUfPGXXW+pYZg4SySvN0Z4OnGoYdlO/pjKvdRa507mSD8N8EhkwgpJMatFobJb0hsz7GY5flutVSkDfBDYkU91vpl7YCJ5rlvuR0I6iWe+K7smYj5hzm16YokWsRQ4EeWHo0peEJuqTZrZt/U4gHVsFpG44V8Yb6iRdZL78E+5xcgjeFw=="

type: 33
{
"name": "origin_ec",
"signKeyId": "58",
"region": "us-east-2",
"DERSig": "MEYCIQDQh4gelDqno96q39RwiPT5x7K7SyVOSmeDpUMd9SthWAIhAP5tT81Cb+Rb2zN85delmYB4KECmW1uL7Tr36C/M2GaJ",
"user": {
"encryptKeyId": "-93",
"someId": "1",
"accountId": "666359764528",
"IV": "rvRdsgKmTd3RtKvW",
"userEncryptedData": "Fa4P+JgiYw8YNEJqZVOfNWdj/CxoRn6/b4HoukmjjYYX5LOSrfSJG4etRvVIkN0l4Dy2x4zn01LnmiCHAc9WNy3Fvlmo7X7BnwfAqk++pZCS7ZL5ftSo0glZ+pPvyv4loK92XNRI/qk9nY0x+MdTlhTUuouogrZVMv2eotz9n84hxrBJmn5B94nLftO3RW9wwNmkZ1IZ2gh30AzXf+bm+9KCC8AeFILYBjZGzPUhPzcSoD2WPaiJLWIn6nIHw8mDGB5b2PVxHNFE9o32kCg2FYDAnbHbFuZpKAWMy0YrEISsY38GgaZEbXVvHRBxQBtOHleCUgGzkmrd24FMGoQExosznbDM/6oZ0vas0oUBoEkejlOqQw1LbhwhodLIMK47Gfmxg385MOUstyYY0KaM34pwrAJg2dbd51+pZcPTkfQdpnQxwnnHLFhsIyL5qa7NtRbYIw7ihs2gBE+h4EAEfqZBVIMQVlZmQK7XJNc4httRRDfKRUe+azoMUpEN8hGXg2ZCCGf2iZwARr2wtxW6vQE="
},
"creationUnixtime": 1632822522,
"auxData": "oY94UrN0JHV0GHiwdBUd6Nc16BZZlXvafgwvJnHYUT3AtR88Zddb6lhmDhLJK83Rng6cahh2U7+mMq91FrnTuZIPw3wSGTCCkkxq0WhslvSGzPsZjl+W61VKQN8ENiRT3W+mXtgInmuW+5HQjqJZ74ruyZiPmHObXpiiRaxFDgR5YejSl4Qm6pNmtm39TiAdWwWkbjhXxhvqJF1kvvwT7nFyCN4X"
}
creation time 2021-09-28 12:48:42
r: 94320536320976402567647841358238654142521339401152461683428052318097085522264 s: 115080600641957610719224864062596448685741805574002579554501035948511143159433

The types of the fields were obtained from the lax protobuf parser’s output, but the field names and their meanings were assigned by us as a result of the research below (and might be wrong!).

Explaining each field

  1. type: 33. A version 2 token, see above.
  2. name: A printable string. It helps generate a long base64 prefix that uniquely identifies this token version with no need to decode it (see above). “ec” may stand for elliptic curve (see DERSig below).
  3. signKeyId: The ID of the signing key for DERSig (see below). This ID is incremented every hour.
  4. region: The AWS region of the STS service generating this version 2 token.
  5. DERSig: A DER-encoded ECDSA signature; its r and s parameters are also printed. It signs the user part below using the NIST P-256 curve and the SHA-256 hash. More on that in the “Token signature in depth” section below.
  6. user: This field contains the following sub-fields:

6.1. encryptKeyId: The ID of the encryption key for userEncryptedData. This ID is incremented every hour.

6.2. someId: We were not able to find out what that means. Takes relatively low values in our samples (0–5).

6.3. accountId: The AWS account ID of the token issuer. In this case, 666359764528.

6.4. IV: In all of our samples, of all versions, this field was 12 bytes long and appears to be random.

According to AWS docs

By default, the AWS Encryption SDK uses an algorithm suite with AES-GCM

and

The length of the initialization vector (IV) is always 12 bytes

Therefore, we assume it is the Initialization Vector (IV) for the AES-GCM-encrypted userEncryptedData.
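
For reference, this is how AES-GCM with a 12-byte nonce looks with a generic library. This is purely an illustration of the primitive we believe is in use, with a random key; it is not the AWS Encryption SDK message format.

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)
nonce = os.urandom(12)                                 # 12-byte IV, as in the token's IV field
ciphertext = aesgcm.encrypt(nonce, b"user id + secret ...", None)
plaintext = aesgcm.decrypt(nonce, ciphertext, None)    # only possible with the key and the IV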

6.5. userEncryptedData: This is the crux of the session token.

  • This field seems random but maintains the same length even for subsequent requests of the same user; therefore it is likely encrypted, probably with AES-GCM (see above)
  • Its size may change between different users, suggesting it is user-aware.
  • When requesting a minimal session duration (900 seconds) vs. the maximal duration (129600 seconds), this field expands by a single byte (the inflection point is at 16384), strongly suggesting the expiry period is encoded within it.

Therefore we assume the user’s ID and Secret are encrypted there.
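
As a side note on the single-byte expansion above: 16384 (2^14) is exactly where a protobuf varint grows from two to three bytes, which is consistent with the duration being varint-encoded in the plaintext. A small illustration:

def varint_len(n: int) -> int:
    # Protobuf varints carry 7 payload bits per byte.
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length

print(varint_len(900))     # 2 bytes  (minimal session duration)
print(varint_len(16383))   # 2 bytes
print(varint_len(16384))   # 3 bytes  (the inflection point we observed)
print(varint_len(129600))  # 3 bytes  (maximal session duration)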

Please note the striking resemblance between AWS’ description of the information within the token and our analysis of the “user” part, specifically the “userEncryptedData” field (source: youtube).

7. creationUnixtime: The creation date, expressed as epoch time; in this case 1632822522, which means 2021-09-28 12:48:42 as printed below. This field is probably included for logging purposes only, as changing it does not invalidate the token and does not make an expired token valid again. As written above, we assume the actual critical data for token validity and expiry is encoded within the userEncryptedData field.

8. auxData: Like creationUnixtime, this field is not protected by the signature and can be edited. Therefore, we assume it does not include mission-critical data. We assume it is some encrypted user-related data, as its length remains the same for the same user but may vary between different users.

For brevity, we presented here the analysis of the most recent version of the version 2 tokens, as older versions contain a subset of these fields.

There is one exception, though, with the last field of version 1 (type 23):

% python3 STS-session.py "FwoGZXIvYXdzEBAaDLHxhjed4A6ABQplMyKBAd0Jzohb7hRtcvWvjWSNw5bVcn5al0jGu9Cl7W2ijDztOnmLZICjbsFBYgO7mt2J1AM9CO0nrL9qBatm9+ytKde5MXuKyzMGY6J8YDLoXU625FQKpnGXelSQxA1mYI/VOjaSa2MP4gPZsgOBjyOuiRxUKmkgYglbzl8sGYco9KWSNyjK5/aKBjIoKnYXwjdTkOt7/Bw6HMETrjPUPyHStdSfCjt4IwGvu2ox5Xo8VHAp5g=="

type: 23
{
"name": "er/aws",
"encryptKeyId": "16",
"IV": "sfGGN53gDoAFCmUz",
"userEncryptedData": "3QnOiFvuFG1y9a+NZI3DltVyflqXSMa70KXtbaKMPO06eYtkgKNuwUFiA7ua3YnUAz0I7Sesv2oFq2b37K0p17kxe4rLMwZjonxgMuhdTrbkVAqmcZd6VJDEDWZgj9U6NpJrYw/iA9myA4GPI66JHFQqaSBiCVvOXywZhyj0pZI3",
"creationUnixtime": 1633530826,
"unknown3": "KnYXwjdTkOt7/Bw6HMETrjPUPyHStdSfCjt4IwGvu2ox5Xo8VHAp5g=="
}
creation time 2021-10-06 17:33:46

unknown3: In all of our samples it was 40 bytes long and seems random. Therefore we assume it is some encryption-related parameter that is agnostic to the specific user. Since this field was absent in the previous variant of version 1 tokens (i.e. type 21), it is probably not at the core of the protocol. Our best guess is that it is related to the optional AES-GCM key commitment, which AWS made mandatory in version 2.0.x of the AWS Encryption SDK:

The AWS Encryption SDK provides full support for encrypting and decrypting with key commitment beginning in version 2.0.x. By default, all of your messages are encrypted and decrypted with key commitment. Version 1.7.x of the AWS Encryption SDK can decrypt ciphertexts with key commitment. It is designed to help users of earlier versions deploy version 2.0.x successfully.

Support for key commitment includes new algorithm suites and a new message format that produces a ciphertext only 30 bytes larger than a ciphertext without key commitment.

Token signature in depth

The DERSig field was initially wrongly classified by our lax parser as a protobuf struct, while it is actually an ASN.1 DER-encoded signature.

The raw DERSig field, wrongfully parsed as protobuf struct

We made this educated guess because the buffer’s value adheres to the relevant characteristics of such a signature, as described by Pieter Wuille:

A correct DER-encoded signature has the following form:

0x30: a header byte indicating a compound structure.

A 1-byte length descriptor for all what follows.

0x02: a header byte indicating an integer.

A 1-byte length descriptor for the R value

The R coordinate, as a big-endian integer.

0x02: a header byte indicating an integer.

A 1-byte length descriptor for the S value.

The S coordinate, as a big-endian integer.

Where initial 0x00 bytes for R and S are not allowed, except when their first byte would otherwise be above 0x7F (in which case a single 0x00 in front is required).
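
As a quick check, here is a minimal sketch of splitting the DERSig value from the token above into its r and s integers with the python-ecdsa helper (the curve-order argument anticipates the P-256 guess confirmed below; it is required by the API but not used for the decoding itself):

import base64
from ecdsa.curves import NIST256p
from ecdsa.util import sigdecode_der

der_sig = base64.b64decode(
    "MEYCIQDQh4gelDqno96q39RwiPT5x7K7SyVOSmeDpUMd9SthWAIhAP5tT81Cb+Rb2zN8"
    "5delmYB4KECmW1uL7Tr36C/M2GaJ")
r, s = sigdecode_der(der_sig, NIST256p.order)
print("r:", r)
print("s:", s)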

One of the nice features of ECDSA signatures is that given the right parameters:

  • Text to sign
  • Hashing function
  • Elliptic curve

one can recover the signer’s public key (in practice, two possible public keys).

Therefore, if we can get two tokens signed by the same key, we can verify the correctness of our parameter guesses.

Obviously the number of different possibilities is very large, so educated guesses are required again.

We guessed that the elliptic curve is NIST P-256 and the hashing function is SHA-256, as these parameters are used elsewhere in AWS authentication:

With AWS Signature Version 4A, the signature does not include Region-specific information and is calculated using the AWS4-ECDSA-P256-SHA256 algorithm.

We then guessed that the “user” part, in its raw protobuf encoding, is the part protected by the signature:

vks = ecdsa.VerifyingKey.from_public_key_recovery(self.session_pb.DER_Sig, self.session_pb.user.SerializeToString(), curve, sha256, sigdecode_der )

Luckily (actually after much trial and error) we were right!

We were able to identify the signing public key for a specific region at a specific time, and to verify that it is the same across different users:

-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEjzuplh/vDM621Y4qNPmaVUM8TfyMstLGlu/9wT3M
izt8SCDxslIHbNYu36khLM7mxqocy7jU3tJfNZKg+X2p3g==
-----END PUBLIC KEY-----
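
A hedged sketch of how two tokens from the same region and hour let us confirm these parameters and extract the shared key (the attribute names DERSig and user follow our illustrative schema; ECDSA recovery yields up to two candidates per signature, and intersecting the candidate sets pins down the real key):

from hashlib import sha256
import ecdsa
from ecdsa.curves import NIST256p
from ecdsa.util import sigdecode_der

def candidate_keys(der_sig: bytes, signed_bytes: bytes) -> set:
    # ECDSA public key recovery yields up to two candidate keys per signature.
    keys = ecdsa.VerifyingKey.from_public_key_recovery(
        der_sig, signed_bytes, curve=NIST256p, hashfunc=sha256, sigdecode=sigdecode_der)
    return {vk.to_pem() for vk in keys}

# token_a, token_b: two parsed tokens sharing the same region and signKeyId.
# The signed data is the raw protobuf encoding of the "user" sub-message.
common = candidate_keys(token_a.DERSig, token_a.user.SerializeToString()) & \
         candidate_keys(token_b.DERSig, token_b.user.SerializeToString())
print(common.pop().decode())   # the shared hourly STS signing key, in PEM form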

Following this revelation, we observed that these keys change on an hourly basis and that when they do, signKeyId is incremented. Furthermore, using this insight we were able to verify that signKeyId indeed represents this key’s ID, as it had to be the same in the same region for both tokens to allow a successful public key extraction.

Security implications

Forging attacks:

Using our tools we were able to edit tokens and change their values. However, requests carrying, and signed with, such modified tokens were either:

  • Rejected by AWS services (as the change invalidates the signature) when changing fields that may have an effect
  • Accepted by AWS services but with no visible effect (e.g. a modified creationUnixtime, as above), which is probably the reason these fields are not sealed with a signature by AWS to begin with.
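
A hedged sketch of why edits to the signed part fail, using the hourly public key extracted above (field and attribute names again follow our illustrative schema):

from hashlib import sha256
import ecdsa
from ecdsa.util import sigdecode_der

def signature_ok(vk: ecdsa.VerifyingKey, token) -> bool:
    # DERSig covers the raw protobuf encoding of the "user" sub-message.
    try:
        return vk.verify(token.DERSig, token.user.SerializeToString(),
                         hashfunc=sha256, sigdecode=sigdecode_der)
    except ecdsa.BadSignatureError:
        return False

# vk: the hourly STS public key recovered in the previous section.
print(signature_ok(vk, token))             # True for an untouched token

tampered = type(token)()
tampered.CopyFrom(token)
tampered.user.accountId = "111111111111"   # attacker-chosen value (field name illustrative)
print(signature_ok(vk, tampered))          # False: AWS rejects requests carrying such a token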

Golden credentials attacks:

AWS’ auth system seems relatively resilient against such attacks. AWS encryption and signing keys appear to be rotated at a rapid (1 hour) pace. Therefore, it seems likely that even if attackers get hold of such keys, they do not gain indefinite access, only access for a limited time until the key becomes obsolete.

Zombie tokens:

Since version 2 (AKA “Regional”) tokens are “self-contained” as described by AWS:

Session tokens from Regional AWS STS endpoints are valid in all AWS Regions. Session tokens from the global STS endpoint are valid only in AWS Regions that are enabled by default. If you intend to enable a new Region for your account, you can use session tokens from Regional AWS STS endpoints.

Therefore, it might be possible that for a user removed from the SSO provider, their session tokens will remain effective until their expiry date.

Privacy and redaction:

Some tokens contain at least the account ID and creation time. If you consider this information secret, be sure to redact it before you share the token.

Contributions and takeaways

Auth systems and protocols are the basis of our information security, and AWS Session Tokens play an important part in the AWS auth system.

Prior to our research the inner workings of AWS Session Tokens were a complete black box; this research makes them more of a glass box.

Specifically our main contributions were:

  • Finding the inner structure of such tokens and breaking them into fields.
  • Explaining the meaning of almost all of these fields.
  • Identifying the cryptographic primitives used by AWS for such tokens (e.g. curves, hash function, algorithms).
  • Discovering at least 5 variants of such session tokens in the wild.
  • Creating and sharing two open source tools to allow users and researchers to view and manipulate the contents of such tokens.
  • Exposing AWS’ internal cryptographic key management of hourly updates and providing a way to determine the relevant public keys.
  • Testing the resilience of such tokens against forging attacks.

We would like to highlight the following takeaways for the different potential readers of this write-up:

  • Builders: We hope you learned something new and can at least properly redact your tokens when you share them publicly.
  • Amazon AWS: AWS auth protocols seem to be highly secure and take advantage of lessons learned from the abuse of other protocols (e.g. Kerberos, SAML). We hope that AWS will make these protocols an open standard so that security researchers can assess their security directly, instead of investing time in reverse engineering them.
  • Fellow Researchers: As you can see, we have some gaps left in our understanding. We hope that others will use this research and these tools as a springboard for further research. If you find something new or an error in this write-up, please let us know!

Special thanks to @ace__pace for reviews and other assistance.
