Description
This is a question
Some Context
On my project we had to change the workflow of a fairly large step function.
At first it looked something like this:
- Task A
- Task B
- Parallel tasks
- Task C
- Task D
- Task E
- Task F
- Task G
- Task C
- Task H
- Task I
Then we moved to:
- Task A
- Task B
- Task C' -> calls a new Step Function containing Task C, Task D and Task E
- Task F' -> calls a new Step Function containing Task F and Task G
- Task H
- Task I
What went wrong?
In the first step function definition, the plugin generates the IAM policies that allow the step function to invoke a Lambda function, push a message to SQS, and so on.
The problem is that when we transformed the step function into the second version, those resources moved into other step functions.
The plugin therefore generated a new set of policies for the new step functions and no longer generated policies for the invocations that had moved away.
Then, when we deployed to production, the executions that were already running, which were based on the first definition, started to fail one by one with the following type of error:
User: arn:aws:sts::364593438022:assumed-role/service-MyStepFunctionRole/JikZyqUWAaDsnoaqSNVNFtLIImCpcPga is not authorized to perform: lambda:InvokeFunction on resource: arn:aws:lambda:eu-west-3:123456789123:function:service-myFunction
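The failure mode described above can be pictured as a set difference between the resources referenced by the two definitions. The following is only a rough sketch, not the plugin's actual policy generation: it assumes Task states reference Lambda ARNs directly in `Resource`, it only descends into Parallel branches, and the function names (`service-taskA` and so on) and the helper `task_resources` are hypothetical.

```python
def task_resources(definition):
    """Return the Resource ARN of every Task state, descending into Parallel branches."""
    resources = set()
    for state in definition.get("States", {}).values():
        if state.get("Type") == "Task":
            resources.add(state["Resource"])
        elif state.get("Type") == "Parallel":
            for branch in state.get("Branches", []):
                resources |= task_resources(branch)
    return resources


# Hypothetical ARNs, reduced to three tasks for brevity.
PREFIX = "arn:aws:lambda:eu-west-3:123456789123:function:service-"

old_definition = {
    "States": {
        "TaskA": {"Type": "Task", "Resource": PREFIX + "taskA"},
        "ParallelTasks": {
            "Type": "Parallel",
            "Branches": [
                {"States": {"TaskC": {"Type": "Task", "Resource": PREFIX + "taskC"}}},
                {"States": {"TaskF": {"Type": "Task", "Resource": PREFIX + "taskF"}}},
            ],
        },
    }
}

new_definition = {
    "States": {
        "TaskA": {"Type": "Task", "Resource": PREFIX + "taskA"},
        # Task C' starts a child state machine instead of invoking the Lambda directly.
        "TaskCPrime": {
            "Type": "Task",
            "Resource": "arn:aws:states:::states:startExecution.sync",
        },
    }
}

# Resources that executions of the old definition still call,
# but that a policy derived from the new definition no longer covers.
orphaned = task_resources(old_definition) - task_resources(new_definition)
for arn in sorted(orphaned):
    print(arn)
```

Any ARN in `orphaned` is one that an in-flight execution of the old definition may still try to invoke after the role's policy has been regenerated from the new definition.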
What to do?
We followed @theburningmonk's guide on Blue/Green deployment, which is great and allowed us to upgrade our functions easily without breaking things.
I would love for the same thing to exist for policies.
The fix in our situation was to stop the step function executions and restart them all. Fortunately, only a few hundred executions were affected by the problem. Unfortunately, the step function sends emails to end users, so they will receive an email they already received a few days ago.
I then started looking for a solution. I was thinking about creating a role for the step function and adding all the needed policies, such as lambda:InvokeFunction for the 37 functions and all of their versions. The downside would be having to manually add to the policy every resource we add to the step function.
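To make the hand-maintained role idea a bit more concrete, here is a minimal sketch of how such a policy document could be generated from a list of function ARNs instead of being edited by hand. It assumes the policy only needs `lambda:InvokeFunction`; the helper name `invoke_policy` and the function names are hypothetical. The `:*` suffix is there so that qualified ARNs (versions and aliases) match as well as the unqualified one.

```python
import json


def invoke_policy(function_arns):
    """Build an IAM policy document allowing invocation of each function and any of its versions."""
    resources = []
    for arn in sorted(function_arns):
        resources.append(arn)         # unqualified function ARN
        resources.append(arn + ":*")  # any version or alias of that function
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "lambda:InvokeFunction",
                "Resource": resources,
            }
        ],
    }


# Hypothetical function ARNs; in practice this list would hold all 37 functions.
function_arns = [
    "arn:aws:lambda:eu-west-3:123456789123:function:service-taskC",
    "arn:aws:lambda:eu-west-3:123456789123:function:service-taskA",
]

policy = invoke_policy(function_arns)
print(json.dumps(policy, indent=2))
```

If all the functions share a naming prefix, a single wildcard resource such as `arn:aws:lambda:eu-west-3:123456789123:function:service-*` would be shorter and would not need updating when a function is added, at the cost of a broader grant.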
I was wondering how you would manage this situation if you were me?
Additional Data
- Serverless Framework Core Version you're using: 1.51.0
- The Plugin Version you're using: ^2.17.4