SPIKE - Initial Agent comms API server design #23395
Comments
API

Note: This is a work in progress.

The Agent comms API will be exposed through an HTTP server over TLS. It will contain the endpoints below.

Authentication

POST /login
Log in to the server.
Body: UUID, password

Events

POST /events/stateless
Send events that are not necessarily processed by the engine.
Body: Event

POST /events/stateful
Send events that must be processed and persisted.
Body: Event

Commands

GET /commands
Subscribe to obtain commands sent by the server. The connection is hijacked by the server and converted into a WebSocket or SSE connection. It is kept alive during the whole agent session and only the server can send events.
Parameters: UUID (so the server knows which agent to send specific events to)

POST /commands
Send commands to all or a specific set of agents.
Parameters: command, UUIDs

Management

GET /configuration
Get information about the group configuration.
Parameters: none

PUT /configuration
Update the group configuration.
Body: New configuration

GET /upgrade
Download WPK files to upgrade agents.
Parameters: File name

SSE vs WebSockets

The two alternatives I would consider for server-side events are Server-sent events (SSE) and WebSockets. Here is an image that explains their differences pretty well.

Server-sent events are simpler and run over plain HTTP. Messages can flow in one direction only (server to client), and because the format is so simple, an external library may not be required. Another advantage is that enterprise firewalls can inspect the packets without the issues they have with WebSockets. I would only choose WebSockets if the SSE message structure is a limitation and we want to use a specific encoding for the commands.

API framework

The frameworks that would require the fewest dependency changes are Connexion 3.0 and FastAPI, so only these two are taken into consideration. Both are based on Starlette and Uvicorn, and they are pretty similar in terms of dependencies and performance. Connexion does not offer built-in support for WebSockets or SSE, whereas FastAPI does not require any external dependencies for it.

FastAPI has broader community and maintainer support, but it does not support a spec-first approach like Connexion. Since we already have Connexion 3.0 running in the Server management API, it may be appropriate to continue using it. This is something we must decide with the team.
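To make the SSE option more concrete, here is a minimal sketch of how a command could be framed on the GET /commands stream and read back on the agent side, using only the standard library. The helper names `format_sse` and `parse_sse`, the `command` event name, and the JSON payload shape are assumptions for illustration; nothing in this issue fixes the message format yet.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one message as a Server-Sent Events frame (hypothetical helper)."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Compact JSON has no raw newlines, so it fits in a single "data:" line.
    lines.append(f"data: {json.dumps(data, separators=(',', ':'))}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def parse_sse(frame: str) -> dict:
    """Parse a single-frame string back into its fields (agent side)."""
    fields = {}
    for line in frame.strip("\n").split("\n"):
        name, _, value = line.partition(":")
        # One leading space after the colon is not part of the value.
        fields[name] = value[1:] if value.startswith(" ") else value
    return fields


frame = format_sse({"command": "restart"}, event="command", event_id="1")
fields = parse_sse(frame)
command = json.loads(fields["data"])
```

Because the frame is plain text over HTTP, this is the part that would need no external library on either side. A binary or custom encoding for commands is the case where WebSockets would become the better fit.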
API-Indexer communication
The communication with the indexer will be performed through the API it exposes, using the opensearch-py library as an SDK. For example, if a new agent wants to log in, we craft and send an HTTP POST request to the indexer with the identifiers of the agent so we can validate the authorization token.
flowchart TD
    subgraph Agents
        Endpoints
        Clouds
        Other
    end
    subgraph Server["Server cluster"]
        subgraph Wazuh1["Server node n"]
            api1["Agent comms API"]
        end
        subgraph Wazuh2[" Server node 2"]
            api2["Agent comms API"]
        end
    end
    subgraph Indexer
        subgraph Data_states["Data states"]
            agents_list["Agents list"]
            states["States"]
        end
    end
    subgraph lb["Load Balancer"]
        lb_node["Per request"]
    end
    Agents -- /login --> lb
    lb -- /login --> Wazuh1
    lb -- /login --> Wazuh2
    Wazuh1 -- Read credentials --> agents_list
    Wazuh2 -- Read credentials --> agents_list
    style Wazuh1 fill:#abc2eb
    style Wazuh2 fill:#abc2eb
    style Data_states fill:#abc2eb
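The /login flow in the diagram above could look roughly like the following sketch. Everything here is an assumption made for illustration: a plain dict stands in for the indexer's agents list (no opensearch-py calls are shown), credentials are assumed to be stored as salted PBKDF2 hashes, and the token is a simple HMAC-signed payload rather than whatever format the team ends up choosing. The names `hash_password`, `issue_token`, `verify_token`, `login`, and `SECRET_KEY` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import os
import time
from typing import Optional

# Hypothetical signing key. Since the load balancer routes per request,
# every server node would have to share the same key.
SECRET_KEY = os.urandom(32)


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash; the storage format is an assumption."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def issue_token(uuid: str, ttl: int = 900) -> str:
    """Build a signed token carrying the agent UUID and an expiry time."""
    claims = {"sub": uuid, "exp": int(time.time()) + ttl}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the agent UUID if the token is authentic and not expired."""
    now = time.time() if now is None else now
    payload, _, signature = token.partition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Check the signature before trusting anything inside the payload.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < now:
        return None
    return claims["sub"]


def login(agents: dict, uuid: str, password: str) -> Optional[str]:
    """Check the credentials against the agents list and issue a token."""
    record = agents.get(uuid)  # stand-in for "Read credentials" from the indexer
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return None
    return issue_token(uuid)


# Usage: one registered agent in the stand-in agents list.
salt = b"0123456789abcdef"
agents = {"agent-1": {"salt": salt, "hash": hash_password("s3cret", salt)}}
token = login(agents, "agent-1", "s3cret")
```

A signed, self-contained token fits the per-request load balancing shown in the diagram, because any node can validate it without shared session state. Whether validation should instead go through the indexer on every request is left open here.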
API-Engine communication
For the communication with the Engine, we will use Unix sockets, like the Server management API does. Since we will have an Engine instance running on each of the nodes of a cluster, there's no need to go over the network to communicate. The API receives the request from an agent, builds a request to the Engine, and sends it through the socket. In the case of stateless events, it's not necessary to wait for the response.
flowchart TD
    subgraph Agents
        Endpoints
        Clouds
        Other
    end
    subgraph Server["Server cluster"]
        subgraph Wazuh1["Server node n"]
            api1["Agent comms API"]
            server1["Server <br/> management API"]
            Engine1["Engine"]
            VD1["VD"]
        end
        subgraph Wazuh2[" Server node 2"]
            api2["Agent comms API"]
            server2["Server <br/> management API"]
            Engine2["Engine"]
            VD2["VD"]
        end
    end
    subgraph lb["Load Balancer"]
        lb_node["Per request"]
    end
    Agents -- /events/stateless --> lb
    lb -- /events/stateless --> Wazuh1
    lb -- /events/stateless --> Wazuh2
    api1 -- Unix socket --> Engine1
    api2 -- Unix socket --> Engine2
    style Wazuh1 fill:#abc2eb
    style Wazuh2 fill:#abc2eb
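The difference between stateless and stateful events on the Unix socket can be sketched as below. The wire format (a 4-byte little-endian length prefix followed by a JSON body) and the names `send_to_engine`, `recv_message`, and `recv_exact` are assumptions for illustration; the Engine's actual protocol is not described in this issue. A `socket.socketpair()` stands in for the real Engine socket so the sketch can run on its own.

```python
import json
import socket
import struct
from typing import Optional


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes or fail if the peer closes early."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock: socket.socket) -> dict:
    """Read one length-prefixed JSON message (assumed framing)."""
    (length,) = struct.unpack("<I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))


def send_to_engine(sock: socket.socket, request: dict,
                   wait_for_response: bool) -> Optional[dict]:
    """Send a request to the Engine; only block for a reply when asked to."""
    body = json.dumps(request).encode()
    sock.sendall(struct.pack("<I", len(body)) + body)
    if not wait_for_response:
        # Stateless events: fire and forget.
        return None
    # Stateful events: wait until the Engine confirms processing.
    return recv_message(sock)


# Usage with an in-process socket pair standing in for the Engine socket.
api_side, engine_side = socket.socketpair()

stateless = {"type": "stateless", "event": {"id": 1}}
result = send_to_engine(api_side, stateless, wait_for_response=False)
seen_by_engine = recv_message(engine_side)

# Queue the Engine's reply first so this single-threaded demo does not block.
reply = json.dumps({"status": "persisted"}).encode()
engine_side.sendall(struct.pack("<I", len(reply)) + reply)
stateful = {"type": "stateful", "event": {"id": 2}}
ack = send_to_engine(api_side, stateful, wait_for_response=True)

api_side.close()
engine_side.close()
```

In a real deployment the API would connect a `socket.AF_UNIX` stream socket to the Engine's socket path on the same node instead of using a socket pair.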
Description
As part of #22677, we want to replace the current wazuh-remoted and wazuh-agentd services. Instead, we intend to develop a service that uses a standard protocol such as HTTP and request-driven communication, where different events can be forwarded to any of the Wazuh servers, unlike the current session-oriented approach where an agent sends all its messages to the server it is connected to.
However, we will also need to maintain a session-oriented connection so that the server can send commands to the agents on demand. Some proposals for this other mode of communication could include the use of WebSockets or gRPC.
The preliminary design of the server must have a /login endpoint so that agents/clients can authenticate and obtain a token from their credentials. Additionally, requests of three different types must be handled:
The API must be versioned.
This is a research issue.
Implementation restrictions
Agent team to align on communication protocols and API integration.
Plan
devel-agent team for this and their research issue.