Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VPP memif #2005

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

VPP memif #2005

wants to merge 15 commits into from

Conversation

lebauce
Copy link
Member

@lebauce lebauce commented Sep 17, 2019

No description provided.

@lebauce
Copy link
Member Author

lebauce commented Sep 26, 2019

run skydive-cdd-overview-tests
run skydive-scale-tests

@lebauce
Copy link
Member Author

lebauce commented Sep 26, 2019

run skydive-functional-tests-backend-orientdb

@lebauce
Copy link
Member Author

lebauce commented Sep 26, 2019

run skydive-cdd-overview-tests
run skydive-functional-tests-backend-orientdb

@lebauce
Copy link
Member Author

lebauce commented Oct 4, 2019

run skydive-go-fmt

1 similar comment
@lebauce
Copy link
Member Author

lebauce commented Oct 4, 2019

run skydive-go-fmt

@lebauce
Copy link
Member Author

lebauce commented Oct 4, 2019

run skydive-cdd-overview-tests

@lebauce
Copy link
Member Author

lebauce commented Oct 4, 2019

run skydive-functional-hw-tests

@lebauce
Copy link
Member Author

lebauce commented Oct 7, 2019

run skydive-scale-tests

1 similar comment
@lebauce
Copy link
Member Author

lebauce commented Oct 7, 2019

run skydive-scale-tests

@lebauce
Copy link
Member Author

lebauce commented Oct 14, 2019

@fgschwan We recently merged the support for seeds in Skydive : it's a single Skydive probe executed as a standalone process in the Docker namespace.
This PR modifies the VPP existing probe to be able to run inside the namespace, using the socket interface. I integrated the memifs retrieval that you implemented and the links between them. I also cherry-picked the tests from your PR.
It would be very nice if you could give it a try.

@fgschwan
Copy link

@fgschwan We recently merged the support for seeds in Skydive : it's a single Skydive probe executed as a standalone process in the Docker namespace.
This PR modifies the VPP existing probe to be able to run inside the namespace, using the socket interface. I integrated the memifs retrieval that you implemented and the links between them. I also cherry-picked the tests from your PR.
It would be very nice if you could give it a try.

@lebauce
Hi, i'm currently quite busy with other things, but my colleague Milan Lenco could take a look at this PR and try to use it with running VPPs inside docker containers.

@@ -374,7 +374,7 @@
builders:
- skydive-cleanup
- skydive-test:
test: BACKEND=elasticsearch ARGS="-ovs.oflow.native" scripts/ci/run-functional-tests.sh
test: BACKEND=elasticsearch WITH_DOCKER_VPP=true ARGS="-ovs.oflow.native" scripts/ci/run-functional-tests.sh

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WITH_DOCKER_VPP=true can be removed, not used in this PR

@@ -2,6 +2,7 @@

//go:generate go run git.fd.io/govpp.git/cmd/binapi-generator --input-file=/usr/share/vpp/api/interface.api.json --output-dir=./bin_api
//go:generate go run git.fd.io/govpp.git/cmd/binapi-generator --input-file=/usr/share/vpp/api/vpe.api.json --output-dir=./bin_api
//go:generate go run git.fd.io/govpp.git/cmd/binapi-generator --input-file=/usr/share/vpp/api/memif.api.json --output-dir=./bin_api

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The major issue with the design of VPP binary APIs, is that practically any change breaks backward compatibility. The binary API files generated in this PR are I suppose for 19.04 VPP. I tried to run skydive against container with VPP 19.08 and got some expected errors, such as:

skydive-agent_1     | 2019-10-17T10:09:29.960Z	WARN	vpp/vpp.go:649 (*Probe).Fire	vpp-93026bb9-358b-4bff-414b-6c852edd130b: time="2019-10-17T10:09:29Z" level=warning msg="Unknown message received, ID: 829"
skydive-agent_1     | 2019-10-17T10:09:27.753Z	ERROR	probes/wrapper.go:52 (*serviceManager).Start.func1	vpp-a31cbfb5-a85f-4503-5ad3-9b5225f85436: no reply received within the timeout period 1s

This is because any change in binary API message changes its CRC which will cause it to be refused and dropped on either VPP or GoVPP side. Similary, a newly added or removed message can shift sequence IDs of existing messages, once again breaking compatibility between client and VPP.
Although you have added the .vppbinapi makefile target that allows to build skydive for a custom VPP version, a single skydive binary will always be limited to only one version of VPP and might not interpret network containing multiple different containerized VPPs properly. Unfortunately, the interface.api that you use to dump interfaces changes pretty much with every release. For example, SwInterfaceDump structure has been edited (interface name is string, not array of bytes) in 19.08 and probes/vpp/vpp.go needs to be edited to reflect the change for it to compile. Actually, between 19.04 and 19.08 even the API language changed a bit (string type was introduced) and govpp had to be adjusted. Therefore to generate APIs for 19.08 you would also need to update the govpp dependency first. To make this even worse, in 19.08 the path to installed *.api.json files has changed - core APIs and plugin APIs are further separated to core and plugins sub-directories.

What we do in ligato/vpp-agent to mitigate this issue, is that we hide interaction with VPP behind an interface. Then we provide implementation of that interface for the latest VPP version (top of the master branch, frequently updated) and for the last 2 releases (for example: https://github.com/ligato/vpp-agent/tree/master/plugins/vpp/ifplugin/vppcalls). Each of these implementations has its own generated binary APIs corresponding to its version. For example in this file we define APIs for managing VPP interfaces. Then every implementation of this API registers itself into this map. When connection to a VPP instance is made, CheckCompatibility method is used to find and select that implementation which is compatible with the given VPP. In your case you could define API with methods like FetchMemifs(), FetchInterfaces(), SubscribeInterfaceEvents and define your own structures for interface metadata with fields that you actually use.

If support of multiple VPP versions is not a priority right now, then maybe at least document which version is supported out-of-the-box right now and perhaps allow to switch the .vppbinapi makefile target off, so that skydive can be compiled with binary APIs already generated and included in the repo, without having to have VPP installed in the host.

func (p *Probe) interfacesEvents() {
defer p.wg.Done()
type memifSocket struct {
*memif.MemifDetails

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example where interface configuration is defined be directly re-using structures from the VPP binary API. Since that changes quite often, it might not compile if skydive is build against a different VPP version.

return nil, err
}

sharedMemifSockets[msg2.SocketID].MemifDetails = msg2

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there can be multiple (listener) memifs (on single VPP) that use the same socket. Master-Slave memif connection is created not only when the same socket file is used, but also when IDs (field ID) are the same.
In other words there can be multiple listeners with different IDs using the same socket and based on the ID of the connecting memif the match is made.

func (p *Probe) GetABLinks(node *graph.Node) (edges []*graph.Edge) {
socketFilename, _ := node.GetFieldString("VPP.SocketFilename")
socketFilename, _ = p.getHostSocketFilename(socketFilename, node)
nodes, _ := p.slaveMemifs.Get(socketFilename)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use also IDs to determine the links between memifs.

conn, err := govpp.Connect(p.shm)
// Do starts the VPP probe and get all interfaces
func (p *Probe) Do(ctx context.Context, wg *sync.WaitGroup) error {
conn, err := govpp.Connect(p.addr)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting lot's of these error messages for the host and containers that do not run any VPP:

skydive-agent_1     | 2019-10-17T12:23:27.844Z	ERROR	probes/wrapper.go:52 (*serviceManager).Start.func1	dev-VirtualBox: VPP connection error: VPP API socket file /run/vpp/api.sock does not exist
skydive-agent_1     | 2019-10-17T12:23:28.016Z	ERROR	probes/wrapper.go:52 (*serviceManager).Start.func1	vpp-9d93ad24-9cc6-4d85-584e-c333eb1777ae: VPP connection error: VPP API socket file /run/vpp/api.sock does not exist

Perhaps Do() method could start by checking if the socket files exists or if a process named vpp exists and continue only if it is the case to avoid polluting the log.

msg2 := &memif.MemifDetails{}
reqCtx = ch.SendMultiRequest(req2)
for {
if lastReply, err := reqCtx.ReceiveReply(msg2); lastReply {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a bug: msg2 := &memif.MemifDetails{} should be inside the for loop just before ReceiveReply(). Otherwise it is the same struct instance that is being rewritten with each iteration and all the memifs will end up in sharedMemifSockets with the same configuration of the last memif. (i.e. this issue occurs if VPP has multiple memifs)

}
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, the original Filip's PR also contained a support for AF-PACKET interfaces. If it is of an interest for you, this is how you can tell if the given interface is AF_PACKET and this is how to obtain the name of the Linux interface to which it is attached.

@milanlenco
Copy link

I was able to get it running and overall it looks good, only added few comments.

Also one general comment: If it is not already done/planned, I would also welcome if it was documented how to build and use vpp probe. For example mention that WITH_VPP=true has to be defined, that vpp probe is not loaded by default and needs to be listed in agent.topology.probes config and also document which version of VPP is supported out-of-the-box (for which version the binary API files have been generated and included in the repo).

@lebauce
Copy link
Member Author

lebauce commented Oct 18, 2019

@milanlenco Thanks a lot for taking the time to test and for the review. This is really useful, I will try to address the issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants