Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: divide subscribeToConfigPod method to manage GRPC server connection separately inside the modules #69

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

gatici
Copy link
Contributor

@gatici gatici commented Oct 2, 2024

The aim of this change is managing the GRPC server connection separately inside the modules.
This PR refactors ConfClient interface methods which is used to subscribe to Config pod NMS.
refactor:

  • add ConnectToGrpcServer method to ConfClient interface
  • change signatures of PublishOnConfigChange and subsribeToConfigPod methods

After getting updated this library, the necessary changes will be done in every module to solve the GRPC client re-connection issue.
These changes are done to solve the following problem:

If the Webconsole is restarted, other modules loses the GRPC connection to Webconsole
They tries to connect again for hours with no success.

2024-09-26T11:32:10.014Z [smf] 2024-09-26T11:32:10Z [ERRO][Config5g][GRPC] Connectivity status idle, trying to connect again.

…nnection separately through the modules

refactor:
- add ConnectToGrpcServer method to ConfClient interface
- change signatures of PublishOnConfigChange and subsribeToConfigPod methods

Signed-off-by: gatici <[email protected]>
@gatici gatici force-pushed the fix-grpc-client-reconnect-after-nms-restart branch from 37117a3 to 1352a38 Compare October 3, 2024 09:01
@thakurajayL thakurajayL self-assigned this Oct 5, 2024
@thakurajayL
Copy link
Contributor

Could you please suggest one example how NFs are going to use these updated APIs? This will help in doing effective code review.

// On Receiving Configuration from ConfigServer, this api publishes
// on created channel and returns the channel
PublishOnConfigChange(bool) chan *protos.NetworkSliceResponse
PublishOnConfigChange(metadataRequested bool, stream protos.ConfigService_NetworkSliceSubscribeClient) chan *protos.NetworkSliceResponse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 2 set of NFs

  1. One set of NFs use PublishOnConfigChange. SMF, AMF, PCF falls in this category
  2. Other set of NFs use subscribeToConfigPod - UDR, ausf, nssf,udm, nrf

We should try to see if we can bring consistency here.

Copy link
Contributor Author

@gatici gatici Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @thakurajayL ,

There are 2 different implementations in the NFs, but the difference does not come from pulling or pushing the config. Because both PublishOnConfigChange and ConfigWatcher calls the same subscribeToConfigPod which pulls the messages. The main difference is how we create the GRPC client.

These PRs trying to solve the GRPC client reconnection issue, so we need an observer in the NF side which checks the GRPC Client status and re-creates it if necessary. In the second implementation, it is difficult to observe the client status. Hence, We are not planning to use ConfigWatcher method any more.

In the following PR's in the NFs, all the NF's will implement PublishOnConfigChange method. Does it make sense ?Please check this sample PR omec-project/nrf#132.
I will remove ConfigWatcher method from the config5g library.

  1. In SMF, AMF, PCF :

     client := gClient.ConnectToConfigServer(factory.AmfConfig.Configuration.WebuiUri)
     configChannel := client.PublishOnConfigChange(true)
              // PublishOnConfigChange calls the subscribeToConfigPod
         go ausf.updateConfig(configChannel)
    
  2. In UDR, ausf, nssf,udm, nrf:

     commChannel := client.ConfigWatcher(factory.AusfConfig.Configuration.WebuiUri)
             // ConfigWatcher calls the subscribeToConfigPod
     go ausf.updateConfig(commChannel)
    


logger.GrpcLog.Infoln("stream msg received")
logger.GrpcLog.Debugf("network slices %d, RC of config pod %d", len(rsp.NetworkSlice), rsp.RestartCounter)
if configPodRestartCounter == 0 || (configPodRestartCounter == rsp.RestartCounter) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally pulling the config from webserver would be lot more simple approach. Current way of pushing config from webconsole is bit error prone. This may need more changes and something can be done later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am planning to open PR's in each NF to unify the implementation as described above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it could be a good moment to revisit the approach. Having the NF get the config from webconsole would be better in a lot of ways, but we would also have to make sure that a configuration change properly triggers a reconfiguration on the NFs.

We could keep the GRPC connection to have webconsole notify subscribers that the configuration has changed, and then the NFs can get the current up to date configuration from webconsole. We would however need to take into account that for a large deployment with many subscribers, the whole config could be large.

I think this would warrant a more in-depth redesign and it might make more sense outside of this PR that is meant to fix a current severe bug.

@gatici
Copy link
Contributor Author

gatici commented Oct 7, 2024

Could you please suggest one example how NFs are going to use these updated APIs? This will help in doing effective code review.

Hello @thakurajayL ,
This PR shows how to use updated config5G library: omec-project/nrf#132

ConfigWather method is not necessary anymore as all the NFs will call PublishOnConfigChange

Signed-off-by: gatici <[email protected]>
Comment on lines +148 to +150
// ConnectToGrpcServer connects to a GRPC server with a Client ID
// and returns a stream if connection is successful else returns nil
func (confClient *ConfigClient) ConnectToGrpcServer() (stream protos.ConfigService_NetworkSliceSubscribeClient) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function does not really connect to anything. It checks the state of the connection and then subscribes to a stream of NetworkSlice. I would rename the function and docstring to reflect that.

}
return stream
} else if status == connectivity.Idle {
logger.GrpcLog.Errorf("connectivity status idle, trying to connect again")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log line in particular is not really coherent, as this function does not attempt any reconnection. I think the API of the function could be changed also and return an error instead of logging. That would let the caller decide what to log and if it wants to try to connect again.

// ConnectToGrpcServer connects to a GRPC server with a Client ID
// and returns a stream if connection is successful else returns nil
func (confClient *ConfigClient) ConnectToGrpcServer() (stream protos.ConfigService_NetworkSliceSubscribeClient) {
logger.GrpcLog.Infoln("connectToGrpcServer")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be at the debug level.

Suggested change
logger.GrpcLog.Infoln("connectToGrpcServer")
logger.GrpcLog.Debugln("connectToGrpcServer")


logger.GrpcLog.Infoln("stream msg received")
logger.GrpcLog.Debugf("network slices %d, RC of config pod %d", len(rsp.NetworkSlice), rsp.RestartCounter)
if configPodRestartCounter == 0 || (configPodRestartCounter == rsp.RestartCounter) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it could be a good moment to revisit the approach. Having the NF get the config from webconsole would be better in a lot of ways, but we would also have to make sure that a configuration change properly triggers a reconfiguration on the NFs.

We could keep the GRPC connection to have webconsole notify subscribers that the configuration has changed, and then the NFs can get the current up to date configuration from webconsole. We would however need to take into account that for a large deployment with many subscribers, the whole config could be large.

I think this would warrant a more in-depth redesign and it might make more sense outside of this PR that is meant to fix a current severe bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants