You are managing a Kubernetes cluster running AI training jobs using TensorFlow. The jobs require access to multiple GPUs across different nodes, but inter-node communication seems slow, impacting performance.
What is a potential networking configuration you would implement to optimize inter-node communication for distributed training?
Submit